Using Yellowbrick as the Vector Store for ChatGPT
This tutorial demonstrates how to create a simple chatbot backed by ChatGPT that uses Yellowbrick as a vector store to support Retrieval Augmented Generation (RAG). What you’ll need:
- An account on the Yellowbrick sandbox
- An API key from OpenAI
Setup: Enter the information used to connect to Yellowbrick and the OpenAI API
Our chatbot integrates with ChatGPT via the LangChain library, so you’ll need an API key from OpenAI first. To get an API key for OpenAI:
- Register at platform.openai.com/
- Add a payment method - you’re unlikely to exceed the free quota
- Create an API key
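The connection details can be collected up front. Below is a minimal sketch; the environment-variable names and the sandbox host are assumptions, and since Yellowbrick is PostgreSQL-compatible, a standard `postgres://` URL works with common clients:

```python
import os
from urllib.parse import quote

def yellowbrick_connection_string(user: str, password: str, host: str, database: str) -> str:
    # Yellowbrick speaks the PostgreSQL wire protocol, so a standard
    # postgres:// URL works with psycopg2/SQLAlchemy-style clients.
    # quote() protects special characters in the credentials.
    return f"postgres://{quote(user)}:{quote(password)}@{host}:5432/{database}"

# Hypothetical values -- replace with your Yellowbrick sandbox credentials.
yellowbrick_connection = yellowbrick_connection_string(
    user=os.environ.get("YBUSER", "demo_user"),
    password=os.environ.get("YBPASSWORD", "demo_password"),
    host="trialsandbox.sandbox.aws.yellowbrickcloud.com",  # example host, an assumption
    database=os.environ.get("YBDATABASE", "demo_db"),
)

# LangChain's OpenAI integration reads the key from this environment variable.
os.environ.setdefault("OPENAI_API_KEY", "sk-your-key-here")
```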
Part 1: Creating a baseline chatbot backed by ChatGPT without a Vector Store
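A minimal baseline query might look like the sketch below. The prompt wording and model name are assumptions, and only the guarded `ChatOpenAI` call at the bottom needs a live API key:

```python
def build_messages(question: str) -> list:
    # Plain (role, content) tuples -- the message format LangChain chat models accept.
    system = (
        "You are an expert on Yellowbrick. Answer the question as best you can; "
        "if you do not know the answer, say so."
    )
    return [("system", system), ("human", question)]

if __name__ == "__main__":
    # Requires OPENAI_API_KEY to be set; the model name is an assumption.
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    answer = llm.invoke(build_messages("How do I create a database in Yellowbrick?"))
    print(answer.content)
```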
We will use LangChain to query ChatGPT. As there is no vector store, ChatGPT has no context in which to answer the question.

Part 2: Connect to Yellowbrick and create the embedding tables
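One way to lay out the embeddings table is sketched below: one row per (chunk, embedding component). The column names and types are illustrative assumptions, not the exact schema the tutorial used:

```python
def create_embeddings_table_sql(table: str) -> str:
    # Column names/types are illustrative assumptions; each row holds one
    # component of one chunk's embedding vector.
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        "    doc_id UUID NOT NULL,            -- one id per document chunk\n"
        "    embedding_id SMALLINT NOT NULL,  -- component index within the vector\n"
        "    embedding FLOAT NOT NULL,        -- component value\n"
        "    text VARCHAR(60000),             -- the chunk's raw text\n"
        "    metadata VARCHAR(1024)           -- e.g. the source document path\n"
        ") DISTRIBUTE ON (doc_id)"
    )

if __name__ == "__main__":
    import psycopg2  # Yellowbrick is PostgreSQL-compatible

    # Use your Yellowbrick connection details from the Setup section;
    # the table must live in a UTF-8 encoded database.
    with psycopg2.connect("postgres://user:password@host:5432/db") as conn:
        with conn.cursor() as cur:
            cur.execute(create_embeddings_table_sql("my_embeddings"))
```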
To load your document embeddings into Yellowbrick, you should create your own table for storing them. Note that the Yellowbrick database containing the table must be UTF-8 encoded. Create a table for them in a UTF-8 database, providing a table name of your choice.

Part 3: Extract the documents to index from an existing table in Yellowbrick
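This extraction step can be sketched as a simple SELECT over the existing table; the source table name and its `path`/`document` columns are assumptions:

```python
def extract_documents_sql(source_table: str) -> str:
    # "path" and "document" are assumed column names in the existing table.
    return f"SELECT path, document FROM {source_table}"

def fetch_documents(conn, source_table: str) -> list:
    # Returns a list of (path, document_text) pairs.
    with conn.cursor() as cur:
        cur.execute(extract_documents_sql(source_table))
        return cur.fetchall()

if __name__ == "__main__":
    import psycopg2

    # Use your Yellowbrick connection details from the Setup section.
    conn = psycopg2.connect("postgres://user:password@host:5432/db")
    docs = fetch_documents(conn, "yellowbrick_docs")  # source table name is an assumption
    print(f"extracted {len(docs)} documents")
```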
Extract document paths and contents from an existing Yellowbrick table. We’ll use these documents to create embeddings in the next step.

Part 4: Load the Yellowbrick Vector Store with Documents
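The load step (chunk, embed, insert) might look like the sketch below. The chunker is a simple fixed-size splitter (LangChain’s `RecursiveCharacterTextSplitter` is a more robust alternative), and the `Yellowbrick` constructor arguments follow the LangChain integration but should be treated as assumptions:

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    # Simple fixed-size character chunking with overlap between neighbours.
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

if __name__ == "__main__":
    # Class and argument names follow LangChain's Yellowbrick integration,
    # but check the current docs -- treat them as assumptions.
    from langchain_community.vectorstores import Yellowbrick
    from langchain_openai import OpenAIEmbeddings

    # Use your Yellowbrick connection details from the Setup section.
    store = Yellowbrick(OpenAIEmbeddings(), "postgres://user:password@host:5432/db",
                        "my_embeddings")

    documents = [("docs/admin.md", "Yellowbrick administration overview text")]  # pairs from Part 3
    for path, text in documents:
        chunks = split_into_chunks(text)
        store.add_texts(chunks, metadatas=[{"path": path}] * len(chunks))
```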
Go through the documents, split them into digestible chunks, create the embeddings, and insert them into the Yellowbrick table. This takes around 5 minutes.

Part 5: Creating a chatbot that uses Yellowbrick as the vector store
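With the store populated, retrieval-augmented answering stitches the top-matching chunks into the prompt. A sketch: `similarity_search` is part of the common LangChain vector-store interface, while the prompt wording, model name, and table name are assumptions:

```python
def build_rag_prompt(question: str, contexts: list) -> str:
    # Standard RAG prompt shape: retrieved excerpts first, then the question.
    joined = "\n\n".join(contexts)
    return (
        "Use the following Yellowbrick documentation excerpts to answer the "
        "question. If the answer is not in the excerpts, say you do not know.\n\n"
        f"{joined}\n\nQuestion: {question}"
    )

if __name__ == "__main__":
    from langchain_community.vectorstores import Yellowbrick
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings

    # Use your Yellowbrick connection details from the Setup section.
    store = Yellowbrick(OpenAIEmbeddings(), "postgres://user:password@host:5432/db",
                        "my_embeddings")

    question = "How do I create a user in Yellowbrick?"
    hits = store.similarity_search(question, k=5)  # top-5 matching chunks
    prompt = build_rag_prompt(question, [d.page_content for d in hits])
    print(ChatOpenAI(model="gpt-4o-mini").invoke(prompt).content)
```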
Next, we add Yellowbrick as a vector store. The vector store has been populated with embeddings representing the administrative chapter of the Yellowbrick product documentation. We’ll send the same queries as above to see the improved responses.

Part 6: Introducing an Index to Increase Performance
Yellowbrick also supports indexing using the Locality-Sensitive Hashing (LSH) approach. This is an approximate nearest-neighbor search technique that trades accuracy for faster similarity search. The index introduces two new tunable parameters:
- The number of hyperplanes, which is provided as an argument to create_lsh_index(num_hyperplanes). The more documents, the more hyperplanes are needed. LSH is a form of dimensionality reduction: the original embeddings are transformed into lower-dimensional vectors where the number of components equals the number of hyperplanes.
- The Hamming distance, an integer representing the breadth of the search. Smaller Hamming distances result in faster retrieval but lower accuracy.
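To see what the hyperplane and Hamming-distance parameters do, here is a tiny self-contained illustration of the LSH idea. This is a conceptual sketch, not Yellowbrick’s implementation; on the vector store itself you would call create_lsh_index(num_hyperplanes) as described above:

```python
def lsh_signature(vec, hyperplanes):
    # One bit per hyperplane: which side of the plane the vector lies on.
    # The embedding is thus reduced to len(hyperplanes) binary components.
    return tuple(
        int(sum(v * h for v, h in zip(vec, plane)) >= 0) for plane in hyperplanes
    )

def hamming(a, b):
    # Number of differing bits; the search breadth is bounded by this distance.
    return sum(x != y for x, y in zip(a, b))

# Three fixed hyperplane normals in 2-D, chosen for illustration.
planes = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

query = lsh_signature((1.0, 0.1), planes)
near = lsh_signature((0.9, 0.2), planes)    # similar direction to the query
far = lsh_signature((-1.0, -0.5), planes)   # roughly opposite direction

print(hamming(query, near), hamming(query, far))  # prints "0 3"
```

Vectors pointing in similar directions land on the same side of most hyperplanes, so their signatures are a small Hamming distance apart; at query time, only candidates within the configured Hamming distance are compared exactly, which is why a smaller distance is faster but can miss true neighbors.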