Setup
To access the CouchbaseSearchVectorStore, you first need to install the langchain-couchbase partner package, for example with pip install langchain-couchbase.
Credentials
Head over to the Couchbase website and create a new connection, making sure to save your database username and password.
Initialization
Before instantiating the vector store, we need to create a connection.
Create Couchbase Connection Object
We first create a connection to the Couchbase cluster and then pass the cluster object to the vector store. Here, we connect using the username and password from above, but you can also connect to your cluster in any other supported way. For more information on connecting to the Couchbase cluster, please check the Couchbase documentation.
Simple Instantiation
The vector store object is created with the cluster information and the search index name.
Specify the Text & Embeddings Field
You can optionally specify the text and embeddings fields for the document using the text_key and embedding_key parameters.
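The connection and instantiation steps above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper names (connect_to_cluster, vector_store_settings, create_vector_store) are hypothetical, the key defaults of "text" and "embedding" are assumptions, and the Couchbase SDK and langchain-couchbase are imported lazily so the helpers can be defined without either package installed.

```python
from datetime import timedelta


def connect_to_cluster(connection_string, username, password, timeout_seconds=5):
    """Connect with username/password and wait until the cluster is ready."""
    # Lazy imports: these require the Couchbase Python SDK.
    from couchbase.auth import PasswordAuthenticator
    from couchbase.cluster import Cluster
    from couchbase.options import ClusterOptions

    auth = PasswordAuthenticator(username, password)
    cluster = Cluster(connection_string, ClusterOptions(auth))
    # Block until the cluster is usable or the timeout elapses.
    cluster.wait_until_ready(timedelta(seconds=timeout_seconds))
    return cluster


def vector_store_settings(bucket, scope, collection, index_name,
                          text_key="text", embedding_key="embedding"):
    """Collect the keyword arguments describing where documents live.

    The text_key/embedding_key defaults here are assumptions for illustration.
    """
    return {
        "bucket_name": bucket,
        "scope_name": scope,
        "collection_name": collection,
        "index_name": index_name,
        "text_key": text_key,
        "embedding_key": embedding_key,
    }


def create_vector_store(cluster, embeddings, settings):
    """Build the vector store from a connected cluster and an embeddings model."""
    # Lazy import: requires the langchain-couchbase package.
    from langchain_couchbase.vectorstores import CouchbaseSearchVectorStore

    return CouchbaseSearchVectorStore(cluster=cluster, embedding=embeddings, **settings)
```

To store the text under a different document field, pass for example text_key="body" to vector_store_settings; the Search index must index the same field names.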
Manage vector store
Once you have created your vector store, you can interact with it by adding and deleting items.
Add items to vector store
We can add items to our vector store by using the add_documents function.
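As a minimal sketch of adding documents, the helper below generates one UUID string per document and passes both to add_documents. The helper name add_with_ids is hypothetical; vector_store is assumed to be an already-created store.

```python
from uuid import uuid4


def add_with_ids(vector_store, documents):
    """Add documents under freshly generated string IDs and return those IDs."""
    ids = [str(uuid4()) for _ in documents]
    vector_store.add_documents(documents=documents, ids=ids)
    return ids


# Usage sketch (needs langchain-core and a configured vector_store):
# from langchain_core.documents import Document
# docs = [Document(page_content="Building an exciting new project!",
#                  metadata={"source": "tweet"})]
# ids = add_with_ids(vector_store, docs)
```

Keeping the returned IDs makes it possible to delete or update the same documents later.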
Delete items from vector store
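Items can be removed by ID. The sketch below assumes the store exposes the standard LangChain delete method taking an ids list; the helper name delete_by_ids is hypothetical.

```python
def delete_by_ids(vector_store, ids):
    """Delete the documents with the given IDs and return the store's result."""
    ids = list(ids)
    if not ids:
        return None  # nothing to delete, so skip the round trip
    return vector_store.delete(ids=ids)


# Usage sketch: remove the last document that was added earlier.
# delete_by_ids(vector_store, [ids[-1]])
```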
Query vector store
Once your vector store has been created and the relevant documents have been added, you will most likely wish to query it while running your chain or agent.
Query directly
Similarity search
A simple similarity search is performed by calling the similarity_search method with the query text.
Similarity search with Score
You can also fetch the scores for the results by calling the similarity_search_with_score method.
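A minimal sketch of both calls, assuming vector_store is an existing store. The helper names top_documents and top_matches are hypothetical; similarity_search_with_score returns (Document, score) pairs, which the second helper flattens for printing.

```python
def top_documents(vector_store, query, k=2):
    """Return the k most similar documents for the query."""
    return vector_store.similarity_search(query, k=k)


def top_matches(vector_store, query, k=2):
    """Return (page_content, score) pairs for the k most similar documents."""
    results = vector_store.similarity_search_with_score(query, k=k)
    return [(doc.page_content, score) for doc, score in results]


# Usage sketch:
# for text, score in top_matches(vector_store, "Will it be hot tomorrow?", k=1):
#     print(f"[SIM={score:.3f}] {text}")
```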
Filtering Results
You can filter the search results by specifying any filter on the text or metadata in the document that is supported by the Couchbase Search service. The filter can be any valid SearchQuery supported by the Couchbase Python SDK. These filters are applied before the Vector Search is performed.
If you want to filter on one of the fields in the metadata, you need to specify it using dot (.) notation. For example, to filter on the source field in the metadata, you need to specify metadata.source.
Note that the filter needs to be supported by the Search Index.
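A minimal sketch of a metadata filter, assuming the Search index covers metadata.source. The helper names source_filter and filtered_search are hypothetical; MatchQuery comes from the Couchbase Python SDK and is imported lazily so the helpers can be defined without the SDK installed.

```python
def source_filter(source):
    """Build a SearchQuery matching documents whose metadata.source is `source`."""
    from couchbase import search  # lazy import: requires the Couchbase Python SDK

    return search.MatchQuery(source, field="metadata.source")


def filtered_search(vector_store, query, search_filter, k=2):
    """Run a similarity search restricted by a pre-filter."""
    return vector_store.similarity_search(query, k=k, filter=search_filter)


# Usage sketch:
# results = filtered_search(vector_store, "Are there any concerning financials?",
#                           source_filter("news"), k=5)
```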
Specifying Fields to Return
You can specify the fields to return from the document using the fields parameter in the searches. These fields are returned as part of the metadata object in the returned Document. You can fetch any field that is stored in the Search index. The text_key of the document is returned as part of the document’s page_content.
If you do not specify any fields to be fetched, all the fields stored in the index are returned.
If you want to fetch one of the fields in the metadata, you need to specify it using dot (.) notation. For example, to fetch the source field in the metadata, you need to specify metadata.source.
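A minimal sketch of requesting specific metadata fields. The helper names metadata_field and search_returning are hypothetical; they only add the metadata. prefix and forward the list through the fields parameter.

```python
def metadata_field(name):
    """Return the dotted path for a field nested under metadata."""
    return f"metadata.{name}"


def search_returning(vector_store, query, metadata_fields, k=2):
    """Similarity search that returns only the named metadata fields."""
    fields = [metadata_field(name) for name in metadata_fields]
    return vector_store.similarity_search(query, k=k, fields=fields)


# Usage sketch: each result's metadata then carries only the source field.
# results = search_returning(vector_store, "financial results", ["source"])
```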
Query by turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains. To do so, call as_retriever on the vector store and then invoke the retriever with a simple query and filter.
Hybrid Queries
Couchbase allows you to do hybrid searches by combining Vector Search results with searches on non-vector fields of the document, like the metadata object.
The results will be based on the combination of the results from both Vector Search and the searches supported by Search Service. The scores of each of the component searches are added up to get the total score of the result.
To perform hybrid searches, pass the optional search_options parameter, which is accepted by all the similarity searches.
The different search/query possibilities for search_options are described in the Couchbase Search documentation.
Create Diverse Metadata for Hybrid Search
In order to simulate hybrid search, let us create some random metadata from the existing documents. We uniformly add three fields to the metadata: date between 2010 & 2020, rating between 1 & 5, and author set to either John Doe or Jane Doe.
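A minimal sketch of generating such metadata. As a simplifying assumption it cycles through the values deterministically rather than sampling at random, and it writes dates as January 1 of each year; the helper name add_demo_metadata and the date format are illustrative choices, not taken from the original.

```python
def add_demo_metadata(metadatas):
    """Fill each metadata dict with a date, rating and author, cycling evenly."""
    authors = ["John Doe", "Jane Doe"]
    for i, meta in enumerate(metadatas):
        meta["date"] = f"{2010 + i % 11}-01-01"  # years 2010 through 2020
        meta["rating"] = 1 + i % 5               # ratings 1 through 5
        meta["author"] = authors[i % 2]          # alternate the two authors
    return metadatas


# Usage sketch: Document.metadata is a dict, so it can be updated in place.
# add_demo_metadata([doc.metadata for doc in docs])
```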
Query by Exact Value
We can search for exact matches on a textual field like the author in the metadata object.
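Since hybrid queries are passed as dictionaries, a match query can be sketched as a small builder. The helper names match_query and hybrid_search are hypothetical; the optional fuzziness argument covers the partial-match case described in the next section.

```python
def match_query(field, text, fuzziness=None):
    """Build search_options for a match query on one field."""
    query = {"match": text, "field": field}
    if fuzziness is not None:
        query["fuzziness"] = fuzziness  # tolerate this many character edits
    return {"query": query}


def hybrid_search(vector_store, query, search_options, k=3):
    """Similarity search whose score also includes the search_options query."""
    return vector_store.similarity_search(query, k=k, search_options=search_options)


exact_author = match_query("metadata.author", "John Doe")
fuzzy_author = match_query("metadata.author", "Jae", fuzziness=1)
```

Either dictionary can then be passed as hybrid_search(vector_store, "some query text", exact_author).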
Query by Partial Match
We can search for partial matches by specifying a fuzziness for the search. This is useful when you want to search for slight variations or misspellings of a search query. Here, “Jae” is close (fuzziness of 1) to “Jane”.
Query by Date Range Query
We can search for documents that are within a date range query on a date field like metadata.date.
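A date range query can be sketched as the dictionary below. The helper name date_range_query is hypothetical and the example dates are arbitrary; the start/end/inclusive_* keys follow the Couchbase Search date range query form.

```python
def date_range_query(field, start, end, inclusive_start=True, inclusive_end=False):
    """Build search_options selecting documents whose date lies in [start, end)."""
    return {
        "query": {
            "start": start,
            "end": end,
            "inclusive_start": inclusive_start,
            "inclusive_end": inclusive_end,
            "field": field,
        }
    }


# Example: documents dated from 2016-12-31 up to, but not including, 2017-01-02.
date_options = date_range_query("metadata.date", "2016-12-31", "2017-01-02")
```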
Query by Numeric Range Query
We can search for documents that are within a range for a numeric field like metadata.rating.
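A numeric range query has the same shape with min/max bounds. The helper name numeric_range_query is hypothetical; the keys follow the Couchbase Search numeric range query form.

```python
def numeric_range_query(field, minimum, maximum, inclusive_min=True, inclusive_max=True):
    """Build search_options selecting documents whose value lies in the range."""
    return {
        "query": {
            "min": minimum,
            "max": maximum,
            "inclusive_min": inclusive_min,
            "inclusive_max": inclusive_max,
            "field": field,
        }
    }


# Example: ratings from 3 to 5, both bounds included.
rating_options = numeric_range_query("metadata.rating", 3, 5)
```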
Combining Multiple Search Queries
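Different search queries can be combined using AND (conjuncts) or OR (disjuncts) operators. In this example, we are checking for documents with a rating between 3 & 4 and dated between 2015 & 2018.
Because hybrid search adds the component scores together, the results may include documents that do not satisfy every query. To restrict results strictly to matching documents, use the filter parameter instead of hybrid search.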
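The combination described above can be sketched by nesting the individual clauses under conjuncts (AND) or disjuncts (OR). The helper names all_of and any_of are hypothetical, and the exact date bounds are an assumption chosen to cover 2015 through 2018.

```python
def all_of(*clauses):
    """Combine query clauses so that every one must match (AND)."""
    return {"query": {"conjuncts": list(clauses)}}


def any_of(*clauses):
    """Combine query clauses so that at least one must match (OR)."""
    return {"query": {"disjuncts": list(clauses)}}


rating_clause = {"min": 3, "max": 4, "inclusive_min": True,
                 "inclusive_max": True, "field": "metadata.rating"}
date_clause = {"start": "2015-01-01", "end": "2018-12-31",
               "inclusive_start": True, "inclusive_end": True,
               "field": "metadata.date"}

# Rating between 3 and 4 AND dated between 2015 and 2018.
combined_options = all_of(rating_clause, date_clause)
```

Note that the clauses are the inner query dictionaries, without their own "query" wrapper.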
Combining Hybrid Search Query with Filters
Hybrid Search can be combined with filters to get the best of both hybrid search and the filters for results matching the requirements. In this example, we are checking for documents with a rating between 3 & 5 and matching the string “independence” in the text field.
Other Queries
Similarly, you can use any of the supported query methods, such as Geo Distance, Polygon Search, Wildcard, and Regular Expressions, in the search_options parameter. Please refer to the documentation for more details on the available query methods and their syntax.
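As one illustration, wildcard and regular expression queries can be sketched with the builder below. The helper name pattern_query is hypothetical, the field name text assumes the default text_key, and the dictionary keys are assumed to follow the Couchbase Search wildcard and regexp query forms; check the Couchbase documentation for the authoritative syntax.

```python
def pattern_query(kind, pattern, field):
    """Build search_options for a wildcard or regexp query on one field."""
    if kind not in ("wildcard", "regexp"):
        raise ValueError(f"unsupported pattern query kind: {kind!r}")
    return {"query": {kind: pattern, "field": field}}


# Terms starting with "ind", expressed both ways.
wildcard_options = pattern_query("wildcard", "ind*", "text")
regexp_options = pattern_query("regexp", "ind.+", "text")
```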
Usage for retrieval-augmented generation
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the RAG tutorials and how-to guides in the LangChain documentation.
Frequently Asked Questions
Question: Should I create the Search index before creating the CouchbaseSearchVectorStore object?
Yes, currently you need to create the Search index before creating the CouchbaseSearchVectorStore object.
Question: I am not seeing all the fields that I specified in my search results
In Couchbase, we can only return the fields stored in the Search index. Please ensure that the field that you are trying to access in the search results is part of the Search index. One way to handle this is to index and store a document’s fields dynamically in the index.
- In Capella, you need to go to “Advanced Mode” then under the chevron “General Settings” you can check “[X] Store Dynamic Fields” or “[X] Index Dynamic Fields”
- In Couchbase Server, in the Index Editor (not Quick Editor) under the chevron “Advanced” you can check “[X] Store Dynamic Fields” or “[X] Index Dynamic Fields”
Question: I am unable to see the metadata object in my search results
This is most likely due to the metadata field in the document not being indexed and/or stored by the Couchbase Search index. In order to index the metadata field in the document, you need to add it to the index as a child mapping.
If you select to map all the fields in the mapping, you will be able to search by all metadata fields. Alternatively, to optimize the index, you can select the specific fields inside the metadata object to be indexed. You can refer to the docs to learn more about indexing child mappings.
Creating Child Mappings
Question: What is the difference between filter and search_options / hybrid queries?
Filters are pre-filters that restrict the documents searched in a Search index. They are available in Couchbase Server 7.6.4 and higher. Hybrid queries are additional search queries that can be used to tune the results returned from the search index. Both filters and hybrid search queries have the same capabilities with slightly different syntax: filters are SearchQuery objects, while hybrid search queries are dictionaries.
API reference
For detailed documentation of all CouchbaseSearchVectorStore features and configurations, head to the API reference.