Vector search introduction and LangChain integration guide.
What is Amazon MemoryDB?
MemoryDB is compatible with Redis OSS, a popular open source data store, enabling you to quickly build applications using the same flexible and friendly Redis OSS data structures, APIs, and commands that you already use today. With MemoryDB, all of your data is stored in memory, which enables you to achieve microsecond read and single-digit millisecond write latency and high throughput. MemoryDB also stores data durably across multiple Availability Zones (AZs) using a Multi-AZ transactional log to enable fast failover, database recovery, and node restarts.
Vector search for MemoryDB
Vector search for MemoryDB extends the functionality of MemoryDB. Vector search can be used in conjunction with existing MemoryDB functionality. Applications that do not use vector search are unaffected by its presence. Vector search is available in all Regions where MemoryDB is available. You can use your existing MemoryDB data or Redis OSS API to build machine learning and generative AI use cases, such as retrieval-augmented generation, anomaly detection, document retrieval, and real-time recommendations. Its capabilities include:
- Indexing of multiple fields in Redis hashes and JSON
- Vector similarity search (with HNSW (ANN) or FLAT (KNN))
- Vector range search (e.g. find all vectors within a radius of a query vector)
- Incremental indexing without performance loss
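To make the difference between a KNN query and a range query concrete, here is a minimal pure-Python sketch of what a FLAT (exact) index computes. It is only an illustration of the idea, not how MemoryDB implements it; the helper names, the sample vectors, and the choice of cosine distance are assumptions made for this example. HNSW answers the same kind of query approximately and is not shown.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def flat_knn(query, vectors, k):
    """Exact KNN: score every stored vector and keep the k closest."""
    scored = [(key, cosine_distance(query, vec)) for key, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


def range_search(query, vectors, radius):
    """Return every stored vector whose distance to the query is <= radius."""
    scored = [(key, cosine_distance(query, vec)) for key, vec in vectors.items()]
    return sorted((p for p in scored if p[1] <= radius), key=lambda p: p[1])


# Hypothetical two-dimensional embeddings keyed by document id.
vectors = {
    "doc:1": [1.0, 0.0],
    "doc:2": [0.9, 0.1],
    "doc:3": [0.0, 1.0],
}
query = [1.0, 0.0]

nearest = flat_knn(query, vectors, k=2)             # always returns k items
within = range_search(query, vectors, radius=0.05)  # returns however many fall inside the radius
```

A KNN query fixes the number of results, while a range query fixes the maximum distance and lets the number of results vary.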
Setting up
Install Redis Python client
redis-py is a Python client that can be used to connect to MemoryDB.
MemoryDB Connection
Valid Redis URL schemes are:
- redis:// - Connection to Redis cluster, unencrypted
- rediss:// - Connection to Redis cluster, with TLS encryption
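As a small illustration of the two schemes, the sketch below parses a connection URL with the standard library and reports whether it implies TLS. The helper name and the host name are hypothetical, and falling back to port 6379 (the conventional Redis port) when none is given is an assumption of this example.

```python
from urllib.parse import urlparse

# Maps each accepted URL scheme to whether it implies TLS.
VALID_SCHEMES = {"redis": False, "rediss": True}


def describe_redis_url(url):
    """Parse a Redis-style URL and return its host, port, and TLS flag."""
    parsed = urlparse(url)
    if parsed.scheme not in VALID_SCHEMES:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    return {
        "host": parsed.hostname,
        "port": parsed.port or 6379,  # assumed default when the URL omits a port
        "tls": VALID_SCHEMES[parsed.scheme],
    }


# Hypothetical cluster endpoint, used only for illustration.
info = describe_redis_url("rediss://my-cluster.example.com:6379")
```

A URL of this form is what you would typically hand to the client, for example through redis-py's `redis.from_url`.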
Sample data
First we will describe some sample data so that the various attributes of the Redis vector store can be demonstrated.
Create MemoryDB vector store
The InMemoryVectorStore instance can be initialized using the methods below:
- InMemoryVectorStore.__init__ - Initialize directly
- InMemoryVectorStore.from_documents - Initialize from a list of LangChain.docstore.Document objects
- InMemoryVectorStore.from_texts - Initialize from a list of texts (optionally with metadata)
- InMemoryVectorStore.from_existing_index - Initialize from an existing MemoryDB index
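The sketch below shows one way the sample data and a from_texts initialization might fit together. The texts and metadata are hypothetical. The import path `langchain_aws.vectorstores.inmemorydb`, the keyword arguments `metadatas`, `index_name`, and `redis_url`, and the helper name `build_vector_store` are assumptions based on the langchain-aws package and should be checked against the version you have installed. The function is not called here because it needs a reachable MemoryDB cluster and an embeddings object.

```python
# Hypothetical sample data: five short texts, each paired with a metadata dict.
texts = ["foo", "foo", "foo", "bar", "bar"]
metadata = [
    {"user": "john", "age": 18, "job": "engineer"},
    {"user": "derrick", "age": 45, "job": "doctor"},
    {"user": "nancy", "age": 94, "job": "doctor"},
    {"user": "tyler", "age": 100, "job": "engineer"},
    {"user": "joe", "age": 35, "job": "dentist"},
]


def build_vector_store(embeddings, redis_url, index_name="users"):
    """Sketch only: create a vector store from the sample texts.

    Assumes langchain-aws is installed, `embeddings` is a LangChain
    embeddings object, and `redis_url` points at a MemoryDB cluster.
    """
    # Assumed import path; imported lazily so the sample data above
    # can be inspected without the package installed.
    from langchain_aws.vectorstores.inmemorydb import InMemoryVectorStore

    return InMemoryVectorStore.from_texts(
        texts,
        embeddings,
        metadatas=metadata,
        index_name=index_name,
        redis_url=redis_url,
    )
```

Each text lines up with the metadata dict at the same position, so the two lists must have the same length.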
Querying
There are multiple ways to query the InMemoryVectorStore implementation, depending on your use case:
- similarity_search: Find the most similar vectors to a given vector.
- similarity_search_with_score: Find the most similar vectors to a given vector and return the vector distance.
- similarity_search_limit_score: Find the most similar vectors to a given vector and limit the results to those that satisfy the score_threshold.
- similarity_search_with_relevance_scores: Find the most similar vectors to a given vector and return the vector similarities.
- max_marginal_relevance_search: Find the most similar vectors to a given vector while also optimizing for diversity.
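The differences between these query styles can be sketched in plain Python. This is an illustration of the ideas only, not the library's implementation. It assumes cosine distance, assumes a relevance score of 1 minus distance, and assumes the threshold is applied to distance. The vectors and helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def ranked_by_distance(query, docs):
    """Return (name, cosine distance) pairs, closest first."""
    pairs = [(name, 1.0 - cosine_similarity(query, vec)) for name, vec in docs.items()]
    return sorted(pairs, key=lambda p: p[1])


def mmr(query, docs, k, lambda_mult=0.5):
    """Greedy maximal marginal relevance: trade relevance against redundancy."""
    selected = []
    remaining = dict(docs)
    while remaining and len(selected) < k:

        def score(name):
            relevance = cosine_similarity(query, remaining[name])
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine_similarity(remaining[name], docs[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical embeddings: "a" and "b" are near-duplicates, "c" is different.
docs = {"a": [1.0, 0.0], "b": [1.0, 0.05], "c": [0.6, 0.8]}
query = [1.0, 0.1]

with_distance = ranked_by_distance(query, docs)                      # distances, lower is closer
with_relevance = [(name, 1.0 - dist) for name, dist in with_distance]  # similarities, higher is closer
thresholded = [name for name, dist in with_distance if dist <= 0.1]    # keep only close matches
diverse = mmr(query, docs, k=2)                                        # skips the near-duplicate
```

With this data, plain similarity ranks the two near-duplicates first, while the MMR selection keeps one of them and then prefers the dissimilar vector.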
MemoryDB as Retriever
Here we go over different options for using the vector store as a retriever. There are three different search methods we can use for retrieval. By default, the retriever uses semantic similarity; the other two are:
- similarity_distance_threshold: allows the user to specify the maximum vector distance
- similarity_score_threshold: allows the user to define the minimum score for similar documents
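The retriever options can be expressed as keyword arguments for LangChain's `as_retriever`, which takes a `search_type` and a `search_kwargs` dict. In the sketch below, the helper name, the threshold values, and the `distance_threshold` and `score_threshold` keys are assumptions for illustration. No vector store is created, so the final call is shown only as a comment.

```python
# Search types described in this section; "similarity" is the default.
SEARCH_TYPES = {
    "similarity",
    "similarity_distance_threshold",
    "similarity_score_threshold",
}


def retriever_kwargs(search_type="similarity", **search_kwargs):
    """Build the keyword arguments for a vector store's as_retriever call."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search_type: {search_type!r}")
    return {"search_type": search_type, "search_kwargs": search_kwargs}


# Keep documents within an assumed maximum distance of 0.1.
distance_cfg = retriever_kwargs(
    "similarity_distance_threshold", k=4, distance_threshold=0.1
)

# Keep documents with an assumed minimum relevance score of 0.9.
score_cfg = retriever_kwargs(
    "similarity_score_threshold", k=10, score_threshold=0.9
)

# With an existing vector store object `vds` (not created here), usage
# would look like:
#   retriever = vds.as_retriever(**distance_cfg)
#   docs = retriever.invoke("foo")
```

The distance threshold is an upper bound (smaller means stricter), while the score threshold is a lower bound (larger means stricter).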