Vespa is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and search in structured data, all in the same query. This notebook shows how to use Vespa.ai as a LangChain vector store.
You’ll need to install langchain-community with pip install -qU langchain-community to use this integration.
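For the examples below you will also want the pyvespa package and, if you follow along with the local embedding model, sentence-transformers (these extra package names are an assumption based on the rest of this guide):

```python
%pip install -qU langchain-community pyvespa sentence-transformers
```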
In order to create the vector store, we use pyvespa to create a connection to a Vespa service. With the pyvespa package, you can either connect to a Vespa Cloud instance or a local Docker instance.
Here, we will create a new Vespa application and deploy that using Docker.
Creating a Vespa application
First, we need to create an application package:
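A sketch of such a package using pyvespa's ApplicationPackage API; the application name testapp and the query tensor name query_embedding are arbitrary choices, while the text and embedding fields and the default rank profile match the description below:

```python
from vespa.package import ApplicationPackage, Field, RankProfile

app_package = ApplicationPackage(name="testapp")
app_package.schema.add_fields(
    # Document text, indexed with BM25 for lexical retrieval
    Field(
        name="text",
        type="string",
        indexing=["index", "summary"],
        index="enable-bm25",
    ),
    # 384-dimensional embedding vector, compared with the angular (cosine) distance metric
    Field(
        name="embedding",
        type="tensor<float>(x[384])",
        indexing=["attribute", "summary"],
        attribute=["distance-metric: angular"],
    ),
)
# Order documents by closeness to the query embedding (nearest neighbor search)
app_package.schema.add_rank_profile(
    RankProfile(
        name="default",
        first_phase="closeness(field, embedding)",
        inputs=[("query(query_embedding)", "tensor<float>(x[384])")],
    )
)
```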
This sets up a schema with two fields: text for holding the document text and embedding for holding the embedding vector. The text field is set up to use a BM25 index for efficient text retrieval, and we’ll see how to use this and hybrid search a bit later.
The embedding
field is set up with a vector of length 384 to hold the
embedding representation of the text. See
Vespa’s Tensor Guide
for more on tensors in Vespa.
Lastly, we add a rank profile to
instruct Vespa how to order documents. Here we set this up with a
nearest neighbor search.
Now we can deploy this application locally:
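A sketch of a local deployment with pyvespa's VespaDocker (this assumes Docker is running on your machine):

```python
from vespa.deployment import VespaDocker

vespa_docker = VespaDocker()
vespa_app = vespa_docker.deploy(application_package=app_package)
```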
This deploys and creates a connection to the Vespa service. In case you already have a Vespa application running, for instance in the cloud, please refer to the pyvespa documentation for how to connect.
Creating a Vespa vector store
Now, let’s load some documents:
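For instance, something along these lines, assuming the state_of_the_union.txt sample text used elsewhere in the LangChain documentation and the "all-MiniLM-L6-v2" sentence-transformers model, which produces 384-dimensional vectors:

```python
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain_text_splitters import CharacterTextSplitter

# Load and split the example document into chunks
loader = TextLoader("state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Local embedding model producing 384-dimensional vectors
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
```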
Here, we also set up a local sentence embedder to transform the text to embedding vectors. One could also use OpenAI embeddings, but then the vector length in the application package needs to be updated to 1536 to reflect the larger size of that embedding.
To feed these to Vespa, we need to configure how the vector store should map to
fields in the Vespa application. Then we create the vector store directly from
this set of documents:
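A sketch of that mapping, using the field names from the application package above:

```python
from langchain_community.vectorstores import VespaStore

vespa_config = dict(
    page_content_field="text",      # Vespa field holding the document text
    embedding_field="embedding",    # Vespa field holding the embedding vector
    input_field="query_embedding",  # query tensor name used by the rank profile
)

db = VespaStore.from_documents(
    docs, embedding_function, app=vespa_app, vespa_config=vespa_config
)
```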
We can now query for results. This uses the embedding function to create a vector representation of the query and searches Vespa with it. Note that this uses the default ranking function, which we set up in the application package above. You can use the ranking argument to similarity_search to specify which ranking function to use.
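For example (the query text is just an illustration):

```python
query = "What did the president say about Ketanji Brown Jackson"
results = db.similarity_search(query)
print(results[0].page_content)
```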
Please refer to the pyvespa documentation
for more information.
This covers the basic usage of the Vespa store in LangChain.
Now you can return the results and continue using these in LangChain.
Updating documents
As an alternative to calling from_documents, you can create the vector store directly and call add_texts from it. This can also be used to update documents:
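A sketch of this pattern; the document id doc-economy and the texts are purely illustrative, the direct constructor call assumes the store accepts the same configuration keys as from_documents, and add_texts is assumed to accept an ids argument:

```python
from langchain_community.vectorstores import VespaStore

# Create the store directly against the running Vespa application
db = VespaStore(app=vespa_app, embedding_function=embedding_function, **vespa_config)

# Feed a text under an explicit id
db.add_texts(texts=["The president spoke about the economy."], ids=["doc-economy"])

# Re-feeding a text with the same id overwrites (updates) that document in Vespa
db.add_texts(
    texts=["UPDATED: The president spoke at length about the economy."],
    ids=["doc-economy"],
)
```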
However, the pyvespa library contains methods to manipulate content on Vespa which you can use directly.
Deleting documents
You can delete documents using the delete function:
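A sketch, assuming documents were fed with explicit ids as in the update example above:

```python
# Remove the document with the given id from Vespa
db.delete(["doc-economy"])
```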
The pyvespa connection contains methods to delete documents as well.
Returning with scores
The similarity_search method only returns the documents in order of relevancy. To retrieve the actual scores, use similarity_search_with_score, as shown below. The scores here come from using the "all-MiniLM-L6-v2" embedding model with the cosine distance function (as given by the angular distance metric in the application package). Different embedding functions need different distance functions, and Vespa needs to know which distance function to use when ordering documents.
Please refer to the
documentation on distance functions
for more information.
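A minimal sketch of retrieving scores with similarity_search_with_score:

```python
query = "What did the president say about Ketanji Brown Jackson"
results = db.similarity_search_with_score(query)
result, score = results[0]
# result is the Document, score is the relevance computed by the rank profile
print(score, result.page_content)
```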
As retriever
To use this vector store as a LangChain retriever, simply call the as_retriever function, which is a standard vector store method:
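For example:

```python
retriever = db.as_retriever()
query = "What did the president say about Ketanji Brown Jackson"
results = retriever.invoke(query)
print(results[0].page_content)
```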
Metadata
In the example so far, we’ve only used the text and the embedding for that text. Documents usually contain additional information, which in LangChain is referred to as metadata. Vespa can contain many fields with different types by adding them to the application package:
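A sketch of this, using hypothetical date, rating and author fields; after changing the schema the application must be redeployed, the fields are listed under metadata_fields in the vector store configuration, and it is assumed that similarity_search accepts a filter argument containing a Vespa condition:

```python
# Add metadata fields to the schema (field names are just examples) and redeploy
app_package.schema.add_fields(
    Field(name="date", type="string", indexing=["attribute", "summary"]),
    Field(name="rating", type="int", indexing=["attribute", "summary"]),
    Field(name="author", type="string", indexing=["attribute", "summary"]),
)
vespa_app = vespa_docker.deploy(application_package=app_package)

# Attach some example metadata to the documents
for i, doc in enumerate(docs):
    doc.metadata["date"] = f"2023-{(i % 12) + 1}-{(i % 28) + 1}"
    doc.metadata["rating"] = (i % 5) + 1
    doc.metadata["author"] = ["Joe Biden", "Unknown"][min(i, 1)]

# Tell the vector store which metadata fields to feed and return
vespa_config.update(dict(metadata_fields=["date", "rating", "author"]))
db = VespaStore.from_documents(
    docs, embedding_function, app=vespa_app, vespa_config=vespa_config
)

# Metadata fields can also be used to filter results
query = "What did the president say about Ketanji Brown Jackson"
results = db.similarity_search(query, filter="rating > 3")
```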
Custom query
If the default behavior of the similarity search does not fit your requirements, you can always provide your own query. Thus, you don’t need to provide all of the configuration to the vector store, but rather just write this yourself. First, let’s add a BM25 ranking function to our application:
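A sketch: add a bm25 rank profile, redeploy, and pass a custom query body (here using Vespa's weakAnd lexical matching) through the custom_query argument, which is assumed to be forwarded directly to Vespa:

```python
# Add a BM25 rank profile over the text field and redeploy
app_package.schema.add_rank_profile(RankProfile(name="bm25", first_phase="bm25(text)"))
vespa_app = vespa_docker.deploy(application_package=app_package)
db = VespaStore.from_documents(
    docs, embedding_function, app=vespa_app, vespa_config=vespa_config
)

# A custom Vespa query using lexical (weakAnd) matching and the bm25 rank profile
query = "What did the president say about Ketanji Brown Jackson"
custom_query = {
    "yql": "select * from sources * where userQuery()",
    "query": query,
    "type": "weakAnd",
    "ranking": "bm25",
    "hits": 4,
}
results = db.similarity_search_with_score(query, custom_query=custom_query)
```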
Hybrid search
Hybrid search means using both a classic term-based search such as BM25 and a vector search, and combining the results. We need to create a new rank profile for hybrid search on Vespa:
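A sketch of a hybrid setup; the exact ranking expression (a log-weighted combination of BM25 and vector closeness) and the query body are only examples:

```python
# A rank profile combining BM25 with vector closeness
app_package.schema.add_rank_profile(
    RankProfile(
        name="hybrid",
        first_phase="log(bm25(text)) + 0.5 * closeness(field, embedding)",
        inputs=[("query(query_embedding)", "tensor<float>(x[384])")],
    )
)
vespa_app = vespa_docker.deploy(application_package=app_package)
db = VespaStore.from_documents(
    docs, embedding_function, app=vespa_app, vespa_config=vespa_config
)

# Query with both a nearestNeighbor operator and lexical matching, ranked by the hybrid profile
query = "What did the president say about Ketanji Brown Jackson"
query_embedding = embedding_function.embed_query(query)
nearest_neighbor_expression = (
    "{targetHits: 4}nearestNeighbor(embedding, query_embedding)"
)
custom_query = {
    "yql": f"select * from sources * where {nearest_neighbor_expression} and userQuery()",
    "query": query,
    "type": "weakAnd",
    "input.query(query_embedding)": query_embedding,
    "ranking": "hybrid",
    "hits": 4,
}
results = db.similarity_search_with_score(query, custom_query=custom_query)
```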
Native embedders in Vespa
Up until this point, we’ve used an embedding function in Python to provide embeddings for the texts. Vespa supports embedding functions natively, so you can defer this calculation to Vespa. One benefit is the ability to use GPUs when embedding documents if you have a large collection. Please refer to Vespa embeddings for more information. First, we need to modify our application package:
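A sketch of the required changes; the model and tokenizer locations are left as placeholders, and the rank profile name hf_similarity is an arbitrary choice:

```python
from vespa.package import Component, Parameter

# Add a Hugging Face embedder component (model and tokenizer locations left as placeholders)
app_package.components = [
    Component(
        id="hf-embedder",
        type="hugging-face-embedder",
        parameters=[
            Parameter("transformer-model", {"path": "..."}),
            Parameter("tokenizer-model", {"url": "..."}),
        ],
    )
]

# A synthetic field whose value is computed inside Vespa by embedding the text field
app_package.schema.add_fields(
    Field(
        name="hfembedding",
        type="tensor<float>(x[384])",
        is_document_field=False,
        indexing=["input text", "embed hf-embedder", "attribute", "summary"],
        attribute=["distance-metric: angular"],
    )
)

# Rank documents by closeness to the Vespa-computed embedding
app_package.schema.add_rank_profile(
    RankProfile(
        name="hf_similarity",
        first_phase="closeness(field, hfembedding)",
        inputs=[("query(query_embedding)", "tensor<float>(x[384])")],
    )
)
vespa_app = vespa_docker.deploy(application_package=app_package)
```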
Note that the hfembedding field includes instructions for embedding using the hf-embedder.
Now we can query with a custom query:
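A sketch, reusing the hf-embedder component and hf_similarity rank profile from above:

```python
query = "What did the president say about Ketanji Brown Jackson"
nearest_neighbor_expression = (
    "{targetHits: 4}nearestNeighbor(hfembedding, query_embedding)"
)
custom_query = {
    "yql": f"select * from sources * where {nearest_neighbor_expression}",
    # Let Vespa embed the query string with the same hf-embedder component
    "input.query(query_embedding)": f'embed(hf-embedder, "{query}")',
    "ranking": "hf_similarity",
    "hits": 4,
}
results = db.similarity_search_with_score(query, custom_query=custom_query)
```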
Here, we use the embed instruction to embed the query using the same model as for the documents.
Approximate nearest neighbor
In all of the above examples, we’ve used exact nearest neighbor to find results. However, for large collections of documents this is not feasible, as one has to scan through all documents to find the best matches. To avoid this, we can use approximate nearest neighbors. First, we can change the embedding field to create an HNSW index:
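A sketch; the HNSW parameters shown are typical example values:

```python
from vespa.package import HNSW

# Recreate the embedding field with an HNSW index for approximate nearest neighbor search
app_package.schema.add_fields(
    Field(
        name="embedding",
        type="tensor<float>(x[384])",
        indexing=["attribute", "index", "summary"],
        ann=HNSW(
            distance_metric="angular",
            max_links_per_node=16,
            neighbors_to_explore_at_insert=200,
        ),
    )
)
vespa_app = vespa_docker.deploy(application_package=app_package)
db = VespaStore.from_documents(
    docs, embedding_function, app=vespa_app, vespa_config=vespa_config
)
```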
This creates an HNSW index on the embedding data, which allows for efficient searching. With this in place, we can search using approximate nearest neighbors by setting the approximate argument to True:
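For example (assuming the approximate keyword is forwarded to the Vespa query):

```python
query = "What did the president say about Ketanji Brown Jackson"
results = db.similarity_search(query, approximate=True)
print(results[0].page_content)
```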