Intel’s Visual Data Management System (VDMS) is a storage solution for efficient access to big “visual” data. It aims to achieve cloud scale by searching for relevant visual data via visual metadata stored as a graph, and by enabling machine-friendly enhancements to visual data for faster access. VDMS is licensed under MIT. For more information on VDMS, visit this page, and find the LangChain API reference here.
VDMS supports:
- K nearest neighbor search
- Euclidean distance (L2) and inner product (IP)
- Libraries for indexing and computing distances: FaissFlat (Default), FaissHNSWFlat, FaissIVFFlat, Flinng, TileDBDense, TileDBSparse
- Embeddings for text, images, and video
- Vector and metadata searches
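To make the two distance metrics in the list above concrete, here is a minimal brute-force sketch of K nearest neighbor search in plain Python. It is not how VDMS or its indexing libraries implement search; the function names (l2_distance, inner_product, k_nearest) are hypothetical and exist only for this illustration. The point it shows is that L2 ranks by smallest distance while IP ranks by largest product, so the same data can yield different neighbors.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product (IP): larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def k_nearest(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    k: int,
    metric: str = "L2",
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k best matches under the metric."""
    if metric == "L2":
        scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
        scored.sort(key=lambda pair: pair[1])  # ascending distance
    elif metric == "IP":
        scored = [(i, inner_product(query, v)) for i, v in enumerate(vectors)]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # descending product
    else:
        raise ValueError(f"unsupported metric: {metric}")
    return scored[:k]


vectors = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
query = [1.0, 0.0]
l2_hits = k_nearest(query, vectors, k=2, metric="L2")
ip_hits = k_nearest(query, vectors, k=2, metric="IP")
```

With these toy vectors, L2 prefers the identical vector at index 0, whereas IP prefers the longer vector at index 2 because inner product rewards magnitude as well as direction.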
Setup
To access VDMS vector stores you’ll need to install the langchain-vdms integration package and deploy a VDMS server via the publicly available Docker image.
For simplicity, this notebook will deploy a VDMS server on localhost using port 55555.
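Before connecting, it can help to confirm that something is listening on the chosen port. The sketch below uses only the Python standard library; the helper name is_server_reachable is hypothetical, and a successful TCP connection only shows that the port is open, not that the process behind it is a healthy VDMS server.

```python
import socket


def is_server_reachable(
    host: str = "localhost", port: int = 55555, timeout: float = 2.0
) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling is_server_reachable() with no arguments checks the host and port used in this notebook.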
Credentials
You can use VDMS without any credentials.
To enable automated tracing of your model calls, set your LangSmith API key:
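A minimal sketch of that step, assuming the LANGSMITH_API_KEY and LANGSMITH_TRACING environment variables used across LangChain guides. The wrapper function enable_langsmith_tracing is hypothetical; it prompts for the key only when none is passed in.

```python
import getpass
import os
from typing import Optional


def enable_langsmith_tracing(api_key: Optional[str] = None) -> None:
    """Set the environment variables that switch on LangSmith tracing."""
    if api_key is None:
        # Prompt interactively so the key is not stored in the notebook.
        api_key = getpass.getpass("Enter your LangSmith API key: ")
    os.environ["LANGSMITH_API_KEY"] = api_key
    os.environ["LANGSMITH_TRACING"] = "true"
```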
Initialization
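The following is a hedged sketch of a default connection, not a verified transcript of the package API. It assumes that langchain_vdms.vectorstores exposes VDMS and VDMS_Client, and that VDMS accepts client, embedding, collection_name, engine, and distance_strategy keyword arguments; check these against the API reference before relying on them. The collection name and the helper connect_vector_store are hypothetical. The import is deferred into the function so the snippet loads even when the package is not installed.

```python
DEFAULT_ENGINE = "FaissFlat"  # FAISS IndexFlat, the documented default
DEFAULT_DISTANCE = "L2"       # Euclidean distance, the documented default


def connect_vector_store(
    embeddings,
    collection_name: str = "test_collection_faiss_L2",  # hypothetical name
    host: str = "localhost",
    port: int = 55555,
):
    """Connect to a running VDMS server and return a vector store handle."""
    # Assumed import path and signatures; requires the langchain-vdms package.
    from langchain_vdms.vectorstores import VDMS, VDMS_Client

    client = VDMS_Client(host=host, port=port)
    return VDMS(
        client=client,
        embedding=embeddings,
        collection_name=collection_name,
        engine=DEFAULT_ENGINE,
        distance_strategy=DEFAULT_DISTANCE,
    )
```

Any LangChain embeddings object can be passed as embeddings; the server must already be running on the given host and port.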
Use the VDMS Client to connect to a VDMS vectorstore using FAISS IndexFlat indexing (default) and Euclidean distance (default) as the distance metric for similarity search.
Manage vector store
Add items to vector store
add_documents does not check whether the ids are unique. For this reason, use upsert to delete existing id entries prior to adding.
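The difference between the two calls can be illustrated with a toy in-memory store. This is a conceptual model only: ToyStore is hypothetical and says nothing about how VDMS stores entries internally. It shows why adding the same id twice leaves duplicates, while deleting first and then adding leaves a single current entry.

```python
from typing import List, Sequence, Tuple


class ToyStore:
    """Toy stand-in for a vector store, keeping (id, text) pairs in a list."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str]] = []

    def add_documents(self, texts: Sequence[str], ids: Sequence[str]) -> None:
        # No uniqueness check: an existing id is simply stored again.
        for id_, text in zip(ids, texts):
            self.entries.append((id_, text))

    def delete(self, ids: Sequence[str]) -> None:
        drop = set(ids)
        self.entries = [entry for entry in self.entries if entry[0] not in drop]

    def upsert(self, texts: Sequence[str], ids: Sequence[str]) -> None:
        # Remove any entries with these ids first, then add the new versions.
        self.delete(ids)
        self.add_documents(texts, ids)


store = ToyStore()
store.add_documents(["first draft"], ids=["1"])
store.add_documents(["second draft"], ids=["1"])
duplicates = len(store.entries)  # both copies are kept

store.upsert(["final"], ids=["1"])
```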
Update items in vector store
Delete items from vector store
Query vector store
Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent.
Query directly
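A direct query goes through the similarity_search method that LangChain vector stores share. The helper below, show_similar, is a hypothetical convenience wrapper: it takes any object with that method, requests the k closest documents, and formats each one with its metadata.

```python
from typing import Any, List


def show_similar(vector_store: Any, query: str, k: int = 2) -> List[str]:
    """Run a similarity search and return one formatted line per hit."""
    results = vector_store.similarity_search(query, k=k)
    return [f"* {doc.page_content} [{doc.metadata}]" for doc in results]
```

Most LangChain vector stores also accept a filter argument on this call to restrict results by metadata; the accepted filter format is store-specific.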
A simple similarity search returns the documents most similar to a query string.
Query by turning into retriever
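A retriever exposes the store through the standard invoke interface, which is what chains expect. The sketch below relies on the as_retriever method of LangChain vector stores with its search_type and search_kwargs parameters; the wrapper name retrieve is hypothetical.

```python
from typing import Any, List


def retrieve(vector_store: Any, query: str, k: int = 3) -> List[Any]:
    """Wrap the store in a retriever and fetch the k most similar documents."""
    retriever = vector_store.as_retriever(
        search_type="similarity",
        search_kwargs={"k": k},
    )
    return retriever.invoke(query)
```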
You can also transform the vector store into a retriever for easier usage in your chains.
Delete collection
Previously, we removed documents based on their id. Here, all documents are removed since no ID is provided.
Usage for retrieval-augmented generation
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:
Similarity Search using other engines
VDMS supports various libraries for indexing and computing distances: FaissFlat (Default), FaissHNSWFlat, FaissIVFFlat, Flinng, TileDBDense, and TileDBSparse. By default, the vectorstore uses FaissFlat. Below we show a few examples using the other engines.
Similarity Search using Faiss HNSWFlat and Euclidean Distance
Here, we add the documents to VDMS using Faiss IndexHNSWFlat indexing and L2 as the distance metric for similarity search. We search for three documents (k=3) related to a query and also return the score along with the document.
Similarity Search using Faiss IVFFlat and Inner Product (IP) Distance
We add the documents to VDMS using Faiss IndexIVFFlat indexing and IP as the distance metric for similarity search. We search for three documents (k=3) related to a query and also return the score along with the document.
Similarity Search using FLINNG and IP Distance
In this section, we add the documents to VDMS using Filters to Identify Near-Neighbor Groups (FLINNG) indexing and IP as the distance metric for similarity search. We search for three documents (k=3) related to a query and also return the score along with the document.
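The three engine examples above differ only in the engine and distance metric, so they can be sketched with one parameterized helper. This is an illustration under stated assumptions: it assumes VDMS.from_documents accepts client, embedding, collection_name, engine, and distance_strategy keyword arguments, and it uses the similarity_search_with_score method that LangChain vector stores commonly provide. The helper names and the collection naming scheme are hypothetical. Engine names come from the list in this guide.

```python
from typing import Any, Dict, List

SUPPORTED_ENGINES = (
    "FaissFlat", "FaissHNSWFlat", "FaissIVFFlat",
    "Flinng", "TileDBDense", "TileDBSparse",
)
SUPPORTED_METRICS = ("L2", "IP")


def store_kwargs(engine: str, distance_strategy: str) -> Dict[str, str]:
    """Validate the engine/metric pair and build constructor keyword arguments."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported engine: {engine}")
    if distance_strategy not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported distance strategy: {distance_strategy}")
    return {
        "engine": engine,
        "distance_strategy": distance_strategy,
        # Hypothetical naming scheme: one collection per engine/metric pair.
        "collection_name": f"my_collection_{engine}_{distance_strategy}",
    }


def search_with_engine(
    documents: List[Any], embeddings: Any, client: Any,
    query: str, engine: str, distance_strategy: str, k: int = 3,
):
    """Index documents with the given engine, then return (document, score) pairs."""
    # Assumed import path and signature; requires the langchain-vdms package.
    from langchain_vdms.vectorstores import VDMS

    store = VDMS.from_documents(
        documents, client=client, embedding=embeddings,
        **store_kwargs(engine, distance_strategy),
    )
    return store.similarity_search_with_score(query, k=k)
```

For example, search_with_engine(docs, embeddings, client, query, "FaissHNSWFlat", "L2") would correspond to the first example, and ("FaissIVFFlat", "IP") and ("Flinng", "IP") to the other two. Keep in mind when reading scores that a lower L2 distance means a closer match, while a higher inner product does.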
Filtering on metadata
It can be helpful to narrow down the collection before working with it. For example, collections can be filtered on metadata using the get_by_constraints method. A dictionary is used to filter metadata. Here we retrieve the document where langchain_id = "2" and remove it from the vector store.
NOTE: id was generated as additional metadata as an integer while langchain_id (the internal ID) is a unique string for each entry.
You can also use id to filter for a range of IDs since it is an integer.
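To show why the integer id supports range filters while the string langchain_id is matched by equality, here is a small pure-Python evaluator for constraint dictionaries. It assumes the constraint format is a flat list of operator and operand pairs per key, such as ["==", "2"] or [">", 1, "<=", 3], modeled on the VDMS query syntax; treat that format and the helper name matches as assumptions, and note that the real filtering happens on the server.

```python
import operator
from typing import Any, Dict, List

_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def matches(metadata: Dict[str, Any], constraints: Dict[str, List[Any]]) -> bool:
    """Return True if the metadata satisfies every constraint."""
    for key, spec in constraints.items():
        if key not in metadata:
            return False
        value = metadata[key]
        # spec alternates operator, operand, operator, operand, ...
        for op, operand in zip(spec[0::2], spec[1::2]):
            if not _OPS[op](value, operand):
                return False
    return True


# Toy metadata: integer "id" plus string "langchain_id", as in the note above.
docs = [{"id": i, "langchain_id": str(i)} for i in range(1, 5)]

by_internal_id = [d for d in docs if matches(d, {"langchain_id": ["==", "2"]})]
by_id_range = [d for d in docs if matches(d, {"id": [">", 1, "<=", 3]})]
```

The equality filter selects the single entry whose langchain_id is "2", and the range filter selects the entries with id 2 and 3.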