Elasticsearch is a distributed, RESTful search and analytics engine. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. It supports keyword search, vector search, hybrid search and complex filtering.
The ElasticsearchRetriever is a generic wrapper to enable flexible access to all Elasticsearch features through the Query DSL. For most use cases the other classes (ElasticsearchStore, ElasticsearchEmbeddings, etc.) should suffice, but if they don’t you can use ElasticsearchRetriever.
This guide will help you get started with the Elasticsearch retriever. For detailed documentation of all ElasticsearchRetriever features and configurations head to the API reference.
Integration details
Setup
There are two main ways to set up an Elasticsearch instance:
- Elastic Cloud: Elastic Cloud is a managed Elasticsearch service. Sign up for a free trial.
- Local Install Elasticsearch: Get started with Elasticsearch by running it locally. The easiest way is to use the official Elasticsearch Docker image. See the Elasticsearch Docker documentation for more information.
To connect to an Elasticsearch instance that does not require login credentials (for example, a Docker instance started with security disabled), pass the Elasticsearch URL and index name when constructing the retriever.
Installation
This retriever lives in the langchain-elasticsearch package. For demonstration purposes, we will also install langchain-community to generate text embeddings.
Configure
Here we define the connection to Elasticsearch. In this example we use a locally running instance. Alternatively, you can make an account in Elastic Cloud and start a free trial.
Define example data
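The connection settings and example data could look like the following sketch. It assumes a local instance at http://localhost:9200 with security disabled; the index name, field names, sample texts and the `make_client` helper are illustrative choices, not fixed by the library.

```python
# Connection settings (assumed: local instance, security disabled).
ES_URL = "http://localhost:9200"

# Index and field names; all of these are illustrative choices.
INDEX_NAME = "test-langchain-retriever"
TEXT_FIELD = "text"
DENSE_VECTOR_FIELD = "fake_embedding"
NUM_CHARACTERS_FIELD = "num_characters"

# A handful of short example documents.
TEXTS = ["foo", "bar", "world", "hello world", "hello", "foo bar", "bar foo"]


def make_client(url: str = ES_URL):
    """Return an Elasticsearch client for the given URL.

    The import is deferred so the constants above can be used even where
    the `elasticsearch` package is not installed.
    """
    from elasticsearch import Elasticsearch

    return Elasticsearch(hosts=[url])
```

With the `elasticsearch` client installed and the instance running, `make_client().info()` should return basic cluster metadata, which is a quick way to check the connection.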
Index data
Typically, users make use of ElasticsearchRetriever when they already have data in an Elasticsearch index. Here we index some example text documents. If you created an index, for example using ElasticsearchStore.from_documents, that’s also fine.
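One way to index the example documents is sketched below, assuming the `elasticsearch` Python client and precomputed embedding vectors (for instance from an embedding model's `embed_documents` method). The index name, field names and helper names are assumptions made for illustration.

```python
from typing import Any, Dict, Iterator, List

INDEX_NAME = "test-langchain-retriever"  # assumed index name
TEXT_FIELD = "text"                      # assumed text field
DENSE_VECTOR_FIELD = "fake_embedding"    # assumed dense_vector field
NUM_CHARACTERS_FIELD = "num_characters"  # assumed integer field


def index_mappings() -> Dict[str, Any]:
    """Mappings for a text field, a dense vector field and an integer field."""
    return {
        "properties": {
            TEXT_FIELD: {"type": "text"},
            DENSE_VECTOR_FIELD: {"type": "dense_vector"},
            NUM_CHARACTERS_FIELD: {"type": "integer"},
        }
    }


def bulk_actions(
    texts: List[str], vectors: List[List[float]]
) -> Iterator[Dict[str, Any]]:
    """Yield one bulk 'index' action per (text, vector) pair."""
    for i, (text, vector) in enumerate(zip(texts, vectors)):
        yield {
            "_op_type": "index",
            "_index": INDEX_NAME,
            "_id": i,
            TEXT_FIELD: text,
            DENSE_VECTOR_FIELD: vector,
            NUM_CHARACTERS_FIELD: len(text),
        }


def index_data(es_client: Any, texts: List[str], vectors: List[List[float]]) -> None:
    """Create the index, bulk-index the documents and refresh (import deferred)."""
    from elasticsearch.helpers import bulk

    es_client.indices.create(index=INDEX_NAME, mappings=index_mappings())
    bulk(es_client, bulk_actions(texts, vectors))
    # Refresh so the documents are searchable immediately.
    es_client.indices.refresh(index=INDEX_NAME)
```

Storing the character count alongside the text gives the later filtering examples a numeric field to work with.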
Instantiation
Vector search
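A minimal sketch of instantiating a retriever for vector search. The retriever is given a body function that turns the search string into a Query DSL request body; here that body is a kNN search. The index name, field names, URL, the `k` and `num_candidates` values, and the `embed_query` callable (any function that returns the query's embedding) are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List

INDEX_NAME = "test-langchain-retriever"  # assumed index name
TEXT_FIELD = "text"                      # assumed field holding the document text
DENSE_VECTOR_FIELD = "fake_embedding"    # assumed dense_vector field


def make_vector_query(
    embed_query: Callable[[str], List[float]],
) -> Callable[[str], Dict[str, Any]]:
    """Build a body function that runs a kNN search on the query's embedding."""

    def vector_query(search_query: str) -> Dict[str, Any]:
        return {
            "knn": {
                "field": DENSE_VECTOR_FIELD,
                "query_vector": embed_query(search_query),
                "k": 5,
                "num_candidates": 10,
            }
        }

    return vector_query


def make_vector_retriever(embed_query, url: str = "http://localhost:9200"):
    """Wrap the body function in an ElasticsearchRetriever (import deferred)."""
    from langchain_elasticsearch import ElasticsearchRetriever

    return ElasticsearchRetriever.from_es_params(
        index_name=INDEX_NAME,
        body_func=make_vector_query(embed_query),
        content_field=TEXT_FIELD,
        url=url,
    )
```

The query must be embedded with the same model that produced the indexed vectors. With langchain-community installed, `DeterministicFakeEmbedding(size=3).embed_query` could serve as `embed_query` for a demonstration.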
Dense vector retrieval using fake embeddings in this example.
BM25
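For BM25, the body function can be a plain `match` query on the text field, which Elasticsearch scores with BM25 by default. The field name below is an assumption for illustration.

```python
from typing import Any, Dict

TEXT_FIELD = "text"  # assumed field holding the document text


def bm25_query(search_query: str) -> Dict[str, Any]:
    """Body function: full-text match on the text field."""
    return {
        "query": {
            "match": {
                TEXT_FIELD: search_query,
            }
        }
    }
```

A function like this would be passed as `body_func` when constructing the retriever.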
Traditional keyword matching.
Hybrid search
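A sketch of a hybrid body function, assuming an Elasticsearch version that supports the `retriever` element of the search API with `rrf`. It fuses a standard `match` query with a kNN search. Field names, the `k` and `num_candidates` values, and the `embed_query` callable are illustrative assumptions.

```python
from typing import Any, Callable, Dict, List

TEXT_FIELD = "text"                    # assumed text field
DENSE_VECTOR_FIELD = "fake_embedding"  # assumed dense_vector field


def make_hybrid_query(
    embed_query: Callable[[str], List[float]],
) -> Callable[[str], Dict[str, Any]]:
    """Build a body function that fuses keyword and kNN results with RRF."""

    def hybrid_query(search_query: str) -> Dict[str, Any]:
        return {
            "retriever": {
                "rrf": {
                    "retrievers": [
                        # Keyword (BM25) leg.
                        {"standard": {"query": {"match": {TEXT_FIELD: search_query}}}},
                        # Dense vector leg.
                        {
                            "knn": {
                                "field": DENSE_VECTOR_FIELD,
                                "query_vector": embed_query(search_query),
                                "k": 5,
                                "num_candidates": 10,
                            }
                        },
                    ]
                }
            }
        }

    return hybrid_query
```

RRF ranks by position in each result list rather than by raw score, so the keyword and vector scores do not need to be on the same scale.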
The combination of vector search and BM25 search using Reciprocal Rank Fusion (RRF) to combine the result sets.
Fuzzy matching
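Fuzzy matching can be sketched as a `match` query with the `fuzziness` option set. `"AUTO"` lets Elasticsearch pick the allowed edit distance from the term length. The field name is an assumption for illustration.

```python
from typing import Any, Dict

TEXT_FIELD = "text"  # assumed field holding the document text


def fuzzy_query(search_query: str) -> Dict[str, Any]:
    """Body function: match query that tolerates small typos."""
    return {
        "query": {
            "match": {
                TEXT_FIELD: {
                    "query": search_query,
                    "fuzziness": "AUTO",
                }
            }
        }
    }
```

With this body, a misspelled query such as "fox" may still match an indexed term like "foo", since the two differ by a single edit.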
Keyword matching with typo tolerance.
Complex filtering
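A sketch of a body function that combines conditions on several fields with a `bool` query. The field names, the length threshold and the excluded prefix are hypothetical values chosen for illustration.

```python
from typing import Any, Dict

TEXT_FIELD = "text"                      # assumed text field
NUM_CHARACTERS_FIELD = "num_characters"  # assumed integer field


def filter_query(search_query: str) -> Dict[str, Any]:
    """Body function: bool query mixing range, prefix and match clauses."""
    return {
        "query": {
            "bool": {
                # Required: documents with at least 5 characters.
                "must": [
                    {"range": {NUM_CHARACTERS_FIELD: {"gte": 5}}},
                ],
                # Excluded: documents whose text starts with "bla".
                "must_not": [
                    {"prefix": {TEXT_FIELD: "bla"}},
                ],
                # Optional: matching the search text raises the score.
                "should": [
                    {"match": {TEXT_FIELD: search_query}},
                ],
            }
        }
    }
```

Because a `must` clause is present, the `should` clause does not restrict which documents match; it only affects their ranking.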
Combination of filters on different fields.
Custom document mapper
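A sketch of a custom mapper, assuming each hit's `_source` contains the text and character-count fields named below (both names are illustrative). The pure helper `describe_hit` is a hypothetical function that builds the document's content and metadata; the mapper wraps the result in a LangChain `Document`.

```python
from typing import Any, Dict, Tuple

TEXT_FIELD = "text"                      # assumed text field
NUM_CHARACTERS_FIELD = "num_characters"  # assumed integer field


def describe_hit(hit: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Derive page content and metadata from a raw Elasticsearch hit."""
    source = hit["_source"]
    page_content = f"This document has {source[NUM_CHARACTERS_FIELD]} characters"
    metadata = {"text_content": source[TEXT_FIELD]}
    return page_content, metadata


def num_characters_mapper(hit: Dict[str, Any]):
    """Map a hit to a LangChain Document (import deferred)."""
    from langchain_core.documents import Document

    page_content, metadata = describe_hit(hit)
    return Document(page_content=page_content, metadata=metadata)
```

When constructing the retriever, the mapper would be passed as `document_mapper` in place of `content_field`.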
It is possible to customize the function that maps an Elasticsearch result (hit) to a LangChain document.
Usage
Following the above examples, we use .invoke to issue a single query. Because retrievers are Runnables, we can use any method in the Runnable interface, such as .batch, as well.
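As a small sketch of both calls, the helper below (a hypothetical name) takes any retriever and issues one query with `.invoke` and several with `.batch`.

```python
from typing import Any, List, Tuple


def run_queries(retriever: Any, single: str, many: List[str]) -> Tuple[Any, Any]:
    """Issue one query with .invoke and several with .batch."""
    # .invoke returns the documents retrieved for a single query string.
    one_result = retriever.invoke(single)
    # .batch takes a list of query strings and returns one result per query.
    batch_results = retriever.batch(many)
    return one_result, batch_results
```

For example, `run_queries(retriever, "foo", ["foo", "bar"])` would return the documents for "foo", followed by one list of documents for each of the two batched queries.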
Use within a chain
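A minimal sketch of a simple RAG chain built around a retriever, assuming `langchain-core` is installed. The prompt wording and the `build_rag_chain` helper name are illustrative; `llm` can be any chat model.

```python
from typing import Any, Iterable


def format_docs(docs: Iterable[Any]) -> str:
    """Join the page_content of retrieved documents into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_rag_chain(retriever: Any, llm: Any):
    """Compose retriever -> prompt -> chat model -> string output (imports deferred)."""
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough

    prompt = ChatPromptTemplate.from_template(
        "Answer the question based only on the context provided.\n\n"
        "Context: {context}\n\n"
        "Question: {question}"
    )
    return (
        # The retriever fills in the context; the question passes through unchanged.
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
```

With `langchain-openai` installed and an API key configured, the chain could be run as `build_rag_chain(retriever, ChatOpenAI(model="gpt-4o-mini")).invoke("what is foo?")`; the model name here is only an example.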
We can also incorporate retrievers into chains to build larger applications, such as a simple RAG application. For demonstration purposes, we instantiate an OpenAI chat model as well.
API reference
For detailed documentation of all ElasticsearchRetriever features and configurations head to the API reference.