- A way to extract text from files (PDF, PPT, DOCX, etc.)
- ML-based chunking that provides state-of-the-art performance.
- The Boomerang embeddings model.
- Its own internal vector database where text chunks and embedding vectors are stored.
- A query service that automatically encodes the query into an embedding and retrieves the most relevant text segments, with support for hybrid search as well as multiple reranking options such as the multilingual relevance reranker, MMR, and the UDF reranker.
- An LLM for creating a generative summary based on the retrieved documents (context), including citations.
You can use Vectara as a LangChain vector store via `similarity_search` and `similarity_search_with_score`, as well as through the LangChain `as_retriever` functionality.
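For example, a minimal sketch of the vector-store interface, assuming a `vectara` store configured as in the Setup section below, with documents already ingested (the query strings are hypothetical):

```python
# A minimal sketch; assumes `vectara` is an already-configured Vectara
# vector store (see the Setup section below) with documents ingested.
docs = vectara.similarity_search("What is RAG?", k=5)  # hypothetical query
for doc in docs:
    print(doc.page_content)

# similarity_search_with_score also returns a relevance score per result.
docs_and_scores = vectara.similarity_search_with_score("What is RAG?", k=5)
```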
## Setup
To use the `Vectara` vector store, you first need to install the partner package.
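Assuming the package is published as `langchain-vectara`:

```bash
pip install -U langchain-vectara
```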
## Getting Started
To get started, use the following steps:

- If you don’t already have one, sign up for your free Vectara trial.
- Within your account you can create one or more corpora. Each corpus represents an area that stores text data ingested from input documents. To create a corpus, use the “Create Corpus” button, then provide a name and description for your corpus. Optionally, you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus key right at the top.
- Next you’ll need to create API keys to access the corpus. Click on the “Access Control” tab in the corpus view, then the “Create API Key” button. Give your key a name and choose whether you want query-only or query+index permissions. Click “Create”, and you now have an active API key. Keep this key confidential.
To use LangChain with Vectara, you’ll need these two values: `corpus_key` and `api_key`. You can provide `VECTARA_API_KEY` to LangChain in two ways:
- Include the `VECTARA_API_KEY` variable in your environment. For example, you can set it using `os.environ` and `getpass` as follows:
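```python
import getpass
import os

# A minimal sketch of the environment setup described above.
if "VECTARA_API_KEY" not in os.environ:
    os.environ["VECTARA_API_KEY"] = getpass.getpass("Vectara API Key:")
```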
- Pass the API key directly to the `Vectara` vectorstore constructor (see the sketch below).
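A minimal sketch of the constructor route; the `vectara_api_key` parameter name and the `langchain_vectara` import path follow the LangChain Vectara integration, and the key value is a placeholder:

```python
from langchain_vectara import Vectara

# Placeholder value for illustration; use your own API key.
vectara = Vectara(vectara_api_key="<YOUR_VECTARA_API_KEY>")
```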
## Vectara RAG (retrieval-augmented generation)
We now create a `VectaraQueryConfig` object to control the retrieval and summarization options:

- We enable summarization, specifying that we would like the LLM to pick the top 7 matching chunks and respond in English (see the sketch after this list).
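A minimal sketch, using the `SummaryConfig` and `VectaraQueryConfig` classes as exposed by the community integration (`langchain_community.vectorstores.vectara`); the partner package may expose them under a different import path:

```python
from langchain_community.vectorstores.vectara import SummaryConfig, VectaraQueryConfig

# Enable summarization: the LLM summarizes the top 7 matching chunks,
# responding in English.
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang="eng")

config = VectaraQueryConfig(
    k=10,  # number of matching chunks to retrieve (illustrative value)
    summary_config=summary_config,
)
```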
We then create a `Runnable` object that encapsulates the full Vectara RAG pipeline, using the `as_rag` method, as sketched below.
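A minimal sketch, assuming the `vectara` store and `config` from above; the query string is hypothetical:

```python
query_str = "What does Vectara offer for RAG applications?"  # hypothetical query

# as_rag returns a Runnable encapsulating the full Vectara RAG pipeline.
rag = vectara.as_rag(config)
response = rag.invoke(query_str)
print(response["answer"])  # the generated summary, with citations
```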
## Vectara Chat
In most uses of LangChain to create chatbots, one must integrate a special `memory` component that maintains the history of chat sessions and then uses that history to ensure the chatbot is aware of the conversation history. With Vectara Chat, all of that is performed in the backend by Vectara automatically.
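A minimal sketch using the `as_chat` method from the LangChain Vectara integration; the queries are hypothetical:

```python
# as_chat returns a Runnable whose chat history is kept by Vectara's backend.
bot = vectara.as_chat(config)

print(bot.invoke("What does Vectara offer for RAG applications?")["answer"])
# A follow-up question; no client-side memory component is needed.
print(bot.invoke("Does it support reranking?")["answer"])
```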