Neo4j is an open-source graph database with integrated support for vector similarity search. It supports:
- approximate nearest neighbor search
- Euclidean similarity and cosine similarity
- hybrid search combining vector and keyword searches
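To make the two similarity measures concrete, here is a minimal pure-Python sketch of cosine similarity and Euclidean distance. The function names are illustrative, and the code does not reproduce how Neo4j turns these values into index scores.

```python
import math
from typing import Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b (1.0 means same direction)."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between a and b (0.0 means identical)."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine similarity ignores vector length and compares direction only, while Euclidean distance is sensitive to both.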
This guide shows how to use the Neo4j vector index (Neo4jVector).
See the installation instructions.
We want to use OpenAIEmbeddings, so we need an OpenAI API key.
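One common way to supply the key is through the OPENAI_API_KEY environment variable, prompting for it only when it is missing. This is a minimal sketch; the helper name ensure_openai_api_key is hypothetical.

```python
import getpass
import os


def ensure_openai_api_key() -> str:
    """Return the OpenAI API key, prompting once if it is not already set."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        # getpass hides the typed value so the key is not echoed to the terminal.
        key = getpass.getpass("OpenAI API Key: ")
        os.environ["OPENAI_API_KEY"] = key
    return key
```

Call ensure_openai_api_key() once before creating the embeddings object, so that the key is available in the process environment.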
Similarity Search with Cosine Distance (Default)
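A minimal sketch of this default flow, assuming the langchain-neo4j, langchain-openai and langchain-core packages are installed and a Neo4j instance is reachable. The function name index_and_search is hypothetical, and the connection values are supplied by the caller. No distance strategy is passed, so the default (cosine, per the heading above) applies.

```python
from typing import Any, List, Tuple


def index_and_search(
    texts: List[str],
    query: str,
    *,
    url: str,
    username: str,
    password: str,
    k: int = 2,
) -> List[Tuple[Any, float]]:
    """Embed texts into a new Neo4j vector index and return scored matches."""
    # Imported lazily so this sketch can be loaded without the optional packages.
    from langchain_core.documents import Document
    from langchain_neo4j import Neo4jVector
    from langchain_openai import OpenAIEmbeddings

    docs = [Document(page_content=text) for text in texts]
    db = Neo4jVector.from_documents(
        docs,
        OpenAIEmbeddings(),
        url=url,
        username=username,
        password=password,
    )
    # Each result is a (document, similarity score) pair.
    return db.similarity_search_with_score(query, k=k)
```

In older LangChain releases the same class is importable from langchain_community.vectorstores instead.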
Working with vectorstore
Above, we created a vectorstore from scratch. However, we often want to work with an existing vectorstore. To do that, we can initialize it directly.
We can also initialize a vectorstore from an existing graph using the from_existing_graph method. This method pulls the relevant text information from the database, then calculates the text embeddings and stores them back in the database.
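As a rough sketch of that call, assuming the langchain-neo4j package and a graph that contains Person nodes with name and location properties. The label, property names, index name, and the helper name store_from_existing_graph are illustrative assumptions.

```python
from typing import Any, Dict

# Illustrative settings (assumptions): which nodes to read, which text
# properties to embed, and which property receives the stored embedding.
GRAPH_INDEX_SETTINGS: Dict[str, Any] = {
    "index_name": "person_index",
    "node_label": "Person",
    "text_node_properties": ["name", "location"],
    "embedding_node_property": "embedding",
}


def store_from_existing_graph(
    embedding: Any, url: str, username: str, password: str
) -> Any:
    """Build a vector store on top of nodes that already exist in the graph."""
    # Imported lazily so this sketch can be loaded without the optional package.
    from langchain_neo4j import Neo4jVector

    return Neo4jVector.from_existing_graph(
        embedding=embedding,
        url=url,
        username=username,
        password=password,
        **GRAPH_INDEX_SETTINGS,
    )
```

The embedding argument is any LangChain embeddings object, for example an OpenAIEmbeddings instance.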
Metadata filtering
The Neo4j vector store also supports metadata filtering by combining parallel runtime and exact nearest neighbor search. This requires Neo4j 5.18 or later. Equality filtering maps a property name to the value it must equal.
Metadata filtering also supports the following operators:
- $eq: Equal
- $ne: Not equal
- $lt: Less than
- $lte: Less than or equal
- $gt: Greater than
- $gte: Greater than or equal
- $in: In a list of values
- $nin: Not in a list of values
- $between: Between two values
- $like: Text contains value
- $ilike: Lowercased text contains value
You can also use the OR operator between filters.
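To illustrate what these operators mean, here is a small pure-Python evaluator that applies a filter dictionary to a metadata dictionary. It is a sketch of the semantics only, not the Cypher that the vector store generates. It assumes that several properties in one filter are combined with AND, that $between is inclusive, that OR is written as a $or key holding a list of filters, and that a missing property never matches.

```python
from typing import Any, Callable, Dict, Mapping

_MISSING = object()

# One predicate per operator: (stored value, operand) -> bool.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": lambda v, x: v == x,
    "$ne": lambda v, x: v != x,
    "$lt": lambda v, x: v < x,
    "$lte": lambda v, x: v <= x,
    "$gt": lambda v, x: v > x,
    "$gte": lambda v, x: v >= x,
    "$in": lambda v, x: v in x,
    "$nin": lambda v, x: v not in x,
    "$between": lambda v, x: x[0] <= v <= x[1],  # assumed inclusive
    "$like": lambda v, x: x in v,
    "$ilike": lambda v, x: x.lower() in v.lower(),
}


def matches(metadata: Mapping[str, Any], flt: Mapping[str, Any]) -> bool:
    """Return True if metadata satisfies every condition in flt."""
    for key, condition in flt.items():
        if key == "$or":
            # At least one of the nested filters has to match.
            if not any(matches(metadata, sub) for sub in condition):
                return False
            continue
        value = metadata.get(key, _MISSING)
        if value is _MISSING:
            return False
        if not isinstance(condition, Mapping):
            condition = {"$eq": condition}  # bare value means equality
        for op, operand in condition.items():
            if not OPERATORS[op](value, operand):
                return False
    return True


people = [
    {"id": 0, "name": "Alice"},
    {"id": 1, "name": "bob"},
    {"id": 5, "name": "Carol"},
]
at_least_one = [p["id"] for p in people if matches(p, {"id": {"$gte": 1}})]
```

Dictionaries of this shape are what one would pass as the filter argument of similarity_search.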
Add documents
We can add documents to the existing vectorstore.
Customize response with retrieval query
You can also customize responses by using a custom Cypher snippet that can fetch other information from the graph. Under the hood, the final Cypher statement is constructed by appending your retrieval query to the vector index lookup.
The retrieval query must return the following three columns:
- text: Union[str, Dict] = Value used to populate page_content of a document
- score: Float = Similarity score
- metadata: Dict = Additional metadata of a document
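The construction described above can be sketched as plain string concatenation. The exact statement from the original page is not preserved here, so this assumes the lookup uses Neo4j's db.index.vector.queryNodes procedure and that the retrieval query is appended to it. The helper name build_read_query and the column check are illustrative; the check is a naive word search, not a Cypher parser.

```python
import re

# Assumed prefix: the vector index lookup that yields each node and its score.
VECTOR_SEARCH_PREFIX = (
    "CALL db.index.vector.queryNodes($index, $k, $embedding) "
    "YIELD node, score "
)

REQUIRED_COLUMNS = ("text", "score", "metadata")


def build_read_query(retrieval_query: str) -> str:
    """Append a custom retrieval query to the vector search prefix."""
    missing = [
        column
        for column in REQUIRED_COLUMNS
        if not re.search(rf"\b{re.escape(column)}\b", retrieval_query)
    ]
    if missing:
        raise ValueError(f"retrieval query does not mention: {missing}")
    return VECTOR_SEARCH_PREFIX + retrieval_query


# Example retrieval query returning the three required columns.
retrieval_query = (
    'RETURN "Name: " + node.name AS text, score, {foo: "bar"} AS metadata'
)
read_query = build_read_query(retrieval_query)
```

In Neo4jVector, a snippet like this is supplied through the retrieval_query parameter when the store is created.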
For example, you can pass all node properties except embedding as a dictionary to the text column.
Hybrid search (vector + keyword)
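Hybrid search runs the same query against a vector index and a keyword index and merges the two result lists. The sketch below shows one plausible merging strategy in pure Python: scale each list by its own top score, then keep the best score per node. The strategy and the function names are assumptions for illustration, not a description confirmed by this page.

```python
from typing import Dict, List, Tuple


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores so that the best hit in the list gets 1.0."""
    if not scores:
        return {}
    top = max(scores.values())
    if top <= 0:
        return {node_id: 0.0 for node_id in scores}
    return {node_id: score / top for node_id, score in scores.items()}


def hybrid_merge(
    vector_hits: Dict[str, float],
    keyword_hits: Dict[str, float],
    k: int,
) -> List[Tuple[str, float]]:
    """Merge two scored result sets and return the k best nodes."""
    merged: Dict[str, float] = {}
    for hits in (normalize(vector_hits), normalize(keyword_hits)):
        for node_id, score in hits.items():
            # A node found by both indexes keeps its higher normalized score.
            merged[node_id] = max(score, merged.get(node_id, 0.0))
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Vector scores and keyword scores live on different scales before merging.
top_two = hybrid_merge({"a": 0.9, "b": 0.45}, {"b": 4.0, "c": 1.0}, k=2)
```

Normalizing first matters because keyword scores are unbounded while vector similarity scores are not. With Neo4jVector, hybrid mode is requested by passing search_type="hybrid" when creating the store.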
Neo4j integrates both vector and keyword indexes, which allows you to use a hybrid search approach.
Retriever options
This section shows how to use Neo4jVector as a retriever.
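A minimal sketch of the retriever pattern, relying on the as_retriever and invoke methods that LangChain vector stores and retrievers expose. The helper name retrieve and the stub classes are hypothetical; the stubs exist only so the example runs without a database.

```python
from typing import Any, List


def retrieve(store: Any, query: str, k: int = 4) -> List[Any]:
    """Wrap a vector store as a retriever and fetch the top-k documents."""
    retriever = store.as_retriever(search_kwargs={"k": k})
    return retriever.invoke(query)


# Hypothetical stand-ins that mimic the two methods used above.
class StubRetriever:
    def __init__(self, k: int) -> None:
        self.k = k

    def invoke(self, query: str) -> List[str]:
        return [f"{query} (hit {i})" for i in range(self.k)]


class StubStore:
    def as_retriever(self, search_kwargs: dict) -> StubRetriever:
        return StubRetriever(search_kwargs["k"])


hits = retrieve(StubStore(), "graph databases", k=2)
```

With a real Neo4jVector instance in place of StubStore, the same function returns Document objects.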
Question Answering with Sources
This section goes over how to do question-answering with sources over an index. It does this by using the RetrievalQAWithSourcesChain, which does the lookup of the documents from an index.
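A hedged sketch of wiring this together, assuming a LangChain release that still ships RetrievalQAWithSourcesChain under langchain.chains. The helper name build_qa_with_sources is hypothetical, and the retriever and chat model are supplied by the caller.

```python
from typing import Any


def build_qa_with_sources(llm: Any, retriever: Any) -> Any:
    """Create a question-answering chain that also reports its sources."""
    # Imported lazily so this sketch can be loaded without the optional package.
    from langchain.chains import RetrievalQAWithSourcesChain

    # "stuff" places all retrieved documents into a single prompt.
    return RetrievalQAWithSourcesChain.from_chain_type(
        llm,
        chain_type="stuff",
        retriever=retriever,
    )
```

The chain is then called with a dictionary that has a question key, for example chain.invoke({"question": "..."}), and returns a dictionary containing the answer and its sources.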