Timescale Vector is PostgreSQL++ for AI applications. It enables you to efficiently store and query billions of vector embeddings in PostgreSQL.
PostgreSQL, also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance.
This notebook shows how to use the Postgres vector database (TimescaleVector) to perform self-querying. In the notebook we’ll demo the SelfQueryRetriever wrapped around a TimescaleVector vector store.
What is Timescale Vector?
Timescale Vector is PostgreSQL++ for AI applications. Timescale Vector enables you to efficiently store and query millions of vector embeddings in PostgreSQL.
- Enhances pgvector with faster and more accurate similarity search on 1B+ vectors via a DiskANN-inspired indexing algorithm.
- Enables fast time-based vector search via automatic time-based partitioning and indexing.
- Provides a familiar SQL interface for querying vector embeddings and relational data.
- Simplifies operations by enabling you to store relational metadata, vector embeddings, and time-series data in a single database.
- Benefits from a rock-solid PostgreSQL foundation with enterprise-grade features like streaming backups and replication, high availability, and row-level security.
- Enables a worry-free experience with enterprise-grade security and compliance.
How to access Timescale Vector
Timescale Vector is available on Timescale, the cloud PostgreSQL platform. (There is no self-hosted version at this time.) LangChain users get a 90-day free trial for Timescale Vector.
- To get started, sign up for Timescale, create a new database, and follow this notebook!
- See the Timescale Vector explainer blog for more details and performance benchmarks.
- See the installation instructions for more details on using Timescale Vector in Python.
Creating a TimescaleVector vectorstore
First we’ll want to create a Timescale Vector vectorstore and seed it with some data. We’ve created a small demo set of documents that contain summaries of movies.
NOTE: The self-query retriever requires you to have lark installed (pip install lark). We also need the timescale-vector package.
In this example we’ll use OpenAIEmbeddings, so let’s load your OpenAI API key.
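One way to load the key is sketched below. It assumes the key lives in the OPENAI_API_KEY environment variable (the variable the OpenAI client reads by default) and falls back to an interactive prompt when it is missing. The helper name load_openai_api_key is hypothetical and not part of any library.

```python
import getpass
import os


def load_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, prompting only if it is missing.

    `load_openai_api_key` is an illustrative helper, not a LangChain function.
    """
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value so the key does not appear in notebook output.
        key = getpass.getpass(f"{env_var}: ")
        # Store it so libraries that read the environment variable can find it.
        os.environ[env_var] = key
    return key
```

Calling load_openai_api_key() once at the top of the notebook, before any embedding model is constructed, should be enough for the rest of the cells.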
To connect to your database you’ll need its service URI, which can be found in the .env file you downloaded after creating a new database.
If you haven’t already, sign up for Timescale and create a new database.
The URI will look something like this: postgres://tsdbadmin:<password>@<id>.tsdb.cloud.timescale.com:<port>/tsdb?sslmode=require
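As a rough sketch, the service URI can be sanity-checked before it is handed to the vector store, and the store can then be seeded from a handful of documents. The helper names (check_service_url, build_vectorstore), the collection name, and the two sample movies are illustrative assumptions. The LangChain import paths shown are the ones used by recent langchain-community releases and may differ in your installed version, so they are deferred until the function is actually called.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder movie summaries for illustration only; substitute your own data.
DEMO_MOVIES = [
    ("A bunch of scientists bring back dinosaurs and mayhem breaks loose",
     {"year": 1993, "rating": 7.7, "genre": "science fiction"}),
    ("Toys come alive and have a blast doing so",
     {"year": 1995, "genre": "animated"}),
]


def check_service_url(url: str) -> dict:
    """Split a Postgres service URI into its parts, rejecting obviously wrong ones."""
    parts = urlsplit(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"expected a postgres:// URI, got scheme {parts.scheme!r}")
    database = parts.path.lstrip("/")
    if not parts.hostname or not database:
        raise ValueError("service URI must include a host and a database name")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": database,
        "sslmode": parse_qs(parts.query).get("sslmode", [None])[0],
    }


def build_vectorstore(service_url: str, collection_name: str = "self_query_demo"):
    """Embed DEMO_MOVIES and store them in Timescale Vector (needs network access)."""
    # Deferred imports: these packages are only required when the store is built.
    from langchain_community.vectorstores.timescalevector import TimescaleVector
    from langchain_core.documents import Document
    from langchain_openai import OpenAIEmbeddings

    check_service_url(service_url)
    docs = [Document(page_content=text, metadata=meta) for text, meta in DEMO_MOVIES]
    return TimescaleVector.from_documents(
        embedding=OpenAIEmbeddings(),
        documents=docs,
        collection_name=collection_name,
        service_url=service_url,
    )
```

Validating the URI first tends to give a clearer error than a failed database connection when, for example, the value was pasted without its scheme.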
Creating our self-querying retriever
Now we can instantiate our retriever. To do this we’ll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents.
Self Querying Retrieval with Timescale Vector
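Constructing the retriever from metadata field descriptions and a content description might look like the sketch below. The field names (genre, year, rating) and the helper build_retriever are assumptions chosen to match a movie-summary demo set. AttributeInfo and SelfQueryRetriever.from_llm are imported from the paths used by the classic langchain package, which may have moved in your version. The llm argument is any LangChain language model you have already created.

```python
# Illustrative field descriptions; use the metadata keys your own documents carry.
MOVIE_FIELDS = [
    {"name": "genre", "description": "The genre of the movie", "type": "string"},
    {"name": "year", "description": "The year the movie was released", "type": "integer"},
    {"name": "rating", "description": "A 1-10 rating for the movie", "type": "float"},
]

# Short description of what each document's text contains.
DOCUMENT_CONTENTS = "Brief summary of a movie"


def build_retriever(llm, vectorstore, **kwargs):
    """Wrap an existing vector store in a SelfQueryRetriever.

    `build_retriever` is a hypothetical helper; extra keyword arguments are
    passed straight through to SelfQueryRetriever.from_llm.
    """
    # Deferred imports so the field definitions can be inspected without LangChain.
    from langchain.chains.query_constructor.schema import AttributeInfo
    from langchain.retrievers.self_query.base import SelfQueryRetriever

    field_info = [AttributeInfo(**field) for field in MOVIE_FIELDS]
    return SelfQueryRetriever.from_llm(
        llm, vectorstore, DOCUMENT_CONTENTS, field_info, **kwargs
    )
```

The descriptions are what the language model sees when deciding how to turn a question into a filter, so specific wording (units, value ranges) generally helps.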
And now we can try actually using our retriever! Run a few queries and note how you can specify a query, a filter, or a composite filter (filters combined with AND, OR) in natural language; the self-query retriever will translate that query into SQL and perform the search on the Timescale Vector (Postgres) vectorstore. This illustrates the power of the self-query retriever: you can use it to perform complex searches over your vectorstore without you or your users having to write any SQL directly!
Filter k
We can also use the self-query retriever to specify k: the number of documents to fetch. We can do this by passing enable_limit=True to the constructor.
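A minimal sketch of enabling the limit and running natural-language queries follows. The helper names (build_limited_retriever, run_queries) and the example queries are illustrative assumptions. The import path for SelfQueryRetriever is the one used by the classic langchain package, and invoke is the standard retriever entry point in recent LangChain versions (older releases used get_relevant_documents instead).

```python
# Example questions: one with a limit, one with a filter, one with a composite filter.
EXAMPLE_QUERIES = [
    "What are two movies about dinosaurs",
    "I want to watch a movie rated higher than 8.5",
    "What's a highly rated (above 8.5) science fiction film?",
]


def build_limited_retriever(llm, vectorstore, document_contents, metadata_field_info):
    """Build a SelfQueryRetriever that honors a document count stated in the query."""
    # Deferred import so this module loads even when LangChain is not installed.
    from langchain.retrievers.self_query.base import SelfQueryRetriever

    return SelfQueryRetriever.from_llm(
        llm,
        vectorstore,
        document_contents,
        metadata_field_info,
        enable_limit=True,  # lets "two movies" become a limit of 2
    )


def run_queries(retriever, queries):
    """Run each query through the retriever and collect results keyed by query."""
    return {query: retriever.invoke(query) for query in queries}
```

With a retriever built this way, run_queries(retriever, EXAMPLE_QUERIES) returns a dictionary mapping each question to the documents retrieved for it, which makes it easy to compare how the limit and filters were applied.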