Milvus is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models. This notebook shows how to use functionality related to the Milvus vector database.
Setup
You’ll need to install langchain-milvus with pip install -qU langchain-milvus to use this integration.
Credentials
No credentials are needed to use the Milvus vector store.
Initialization
Milvus Lite
The easiest way to prototype is to use Milvus Lite, where everything is stored in a local vector database file. Only the Flat index can be used.

Milvus Server

If you have a large amount of data (e.g., more than a million vectors), we recommend setting up a more performant Milvus server on Docker or Kubernetes. The Milvus server supports a variety of indexes; leveraging them can significantly enhance retrieval quality and speed, tailored to your specific requirements. As an illustration, consider Milvus Standalone, which can be started as a Docker container (see the Milvus installation guide for the command). If you want to use Zilliz Cloud, the fully managed cloud service for Milvus, adjust the uri and token, which correspond to the Public Endpoint and API key in Zilliz Cloud.
Compartmentalize the data with Milvus Collections
You can store unrelated documents in different collections within the same Milvus instance. Here’s how you can create a new collection:
Manage vector store

Once you have created your vector store, we can interact with it by adding and deleting different items.

Add items to vector store
We can add items to our vector store by using the add_documents function.
Delete items from vector store
Query vector store
Once your vector store has been created and the relevant documents have been added, you will most likely wish to query it during the running of your chain or agent.

Query directly
Similarity search
Performing a simple similarity search with filtering on metadata can be done as follows:
Similarity search with score

You can also search with score:
For more information about the Milvus vector store, you can visit the API reference.
Query by turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains.
Hybrid Search

The most common hybrid search scenario is the dense + sparse hybrid search, where candidates are retrieved using both semantic vector similarity and precise keyword matching. Results from these methods are merged, reranked, and passed to an LLM to generate the final answer. This approach balances precision and semantic understanding, making it highly effective for diverse query scenarios.

Full-text search
Since Milvus 2.5, full-text search is natively supported through the Sparse-BM25 approach, which represents the BM25 algorithm as sparse vectors. Milvus accepts raw text as input and automatically converts it into sparse vectors stored in a specified field, eliminating the need for manual sparse embedding generation. For full-text search, the Milvus VectorStore accepts a builtin_function parameter, through which you can pass in an instance of BM25BuiltInFunction. This is different from semantic search, which usually passes dense embeddings to the VectorStore.
Here is a simple example of hybrid search in Milvus with OpenAI dense embedding for semantic search and BM25 for full-text search:
In the code above, we define an instance of BM25BuiltInFunction and pass it to the Milvus object. BM25BuiltInFunction is a lightweight wrapper class for Function in Milvus. We can use it together with OpenAIEmbeddings to initialize a dense + sparse hybrid search Milvus vector store instance.

When you use BM25BuiltInFunction, please note that full-text search is available in Milvus Standalone and Milvus Distributed, but not in Milvus Lite, although it is on the roadmap for future inclusion. It will also be available in Zilliz Cloud (fully managed Milvus) soon. Please reach out to support@zilliz.com for more information.

BM25BuiltInFunction does not require the client to pass a corpus or perform training; everything is processed automatically on the Milvus server’s end, so users do not need to care about any vocabulary or corpus. In addition, users can customize the analyzer to implement custom text processing for BM25.
Rerank the candidates
After the first stage of retrieval, we need to rerank the candidates to get a better result. You can refer to the Reranking documentation for more information. Here is an example of weighted reranking:
Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:

Per-User Retrieval
When building a retrieval app, you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see each other’s data. Milvus recommends using partition_key to implement multi-tenancy. Note that the partition key feature is not available in Milvus Lite; to use it, you need to start a Milvus server, as mentioned above. Here is an example:
To conduct a search using the partition key, include either of the following in the search request:

search_kwargs={"expr": '<partition_key> == "xxxx"'}

search_kwargs={"expr": '<partition_key> in ["xxx", "xxx"]'}
Do replace <partition_key> with the name of the field that is designated as the partition key.
Milvus switches to the partition that matches the specified partition key, filters entities according to that key, and searches among the filtered entities.