Spanner is a database that combines unlimited scalability with relational semantics, such as secondary indexes, strong consistency, schemas, and SQL, providing 99.999% availability in one easy solution.
This notebook goes over how to use Spanner for vector search with the SpannerVectorStore class.
Learn more about the package on GitHub.
Before You Begin
To run this notebook, you will need to do the following:
- Create a Google Cloud Project
- Enable the Cloud Spanner API
- Create a Spanner instance
- Create a Spanner database
🦜🔗 Library Installation
The integration lives in its own langchain-google-spanner package, so we need to install it.
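As a rough sketch, the install step can be driven from Python. The helper name install_package is hypothetical and introduced only for illustration; in a notebook cell, the equivalent is usually the one-liner %pip install --upgrade --quiet langchain-google-spanner.

```python
import subprocess
import sys

PACKAGE = "langchain-google-spanner"


def install_package(package: str = PACKAGE) -> None:
    """Install or upgrade a package into the current Python environment.

    Hypothetical helper: it runs pip through the interpreter that is
    executing this code, so the package lands in the active environment.
    """
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "--upgrade", "--quiet", package]
    )
```

Call install_package() once, then restart the kernel if the notebook had already imported an older version of the package.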
🔐 Authentication
Authenticate to Google Cloud as the IAM user logged into this notebook in order to access your Google Cloud Project.
- If you are using Colab to run this notebook, use the cell below and continue.
- If you are using Vertex AI Workbench, check out the setup instructions here.
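For the Colab case, the authentication cell might look like the sketch below. The wrapper authenticate_in_colab is a hypothetical name; it assumes the google.colab module, which is only available inside Colab, and returns False elsewhere instead of failing.

```python
def authenticate_in_colab() -> bool:
    """Authenticate the current Colab user; return False outside Colab.

    Hypothetical wrapper around the Colab auth helper. The import is
    deferred because google.colab only exists in the Colab runtime.
    """
    try:
        from google.colab import auth
    except ImportError:
        # Not running in Colab: rely on another credential setup instead.
        return False
    auth.authenticate_user()
    return True
```

Outside Colab, credentials typically come from the environment's own setup, as covered by the Vertex AI Workbench instructions mentioned above.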
☁ Set Your Google Cloud Project
Set your Google Cloud project so that you can leverage Google Cloud resources within this notebook. If you don’t know your project ID, try the following:
- Run gcloud config list.
- Run gcloud projects list.
- See the support page: Locate the project ID.
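Once you know the project ID, setting it could be sketched as follows. PROJECT_ID is a placeholder and set_project is a hypothetical helper; the sketch assumes the gcloud CLI is installed and on the PATH.

```python
import os
import subprocess

PROJECT_ID = "my-project-id"  # placeholder: replace with your own project ID


def set_project(project_id: str) -> None:
    """Make project_id the default for client libraries and for gcloud."""
    # Google Cloud client libraries read this variable for a default project.
    os.environ["GOOGLE_CLOUD_PROJECT"] = project_id
    # Assumes the gcloud CLI is available in this environment.
    subprocess.run(["gcloud", "config", "set", "project", project_id], check=True)
```

Calling set_project(PROJECT_ID) at the top of the notebook keeps later cells from having to repeat the project ID.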
💡 API Enablement
The langchain-google-spanner package requires that you enable the Spanner API in your Google Cloud Project.
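One way to enable the API from Python is to shell out to gcloud, as in this sketch. The helper enable_spanner_api is a hypothetical name, and the code assumes the gcloud CLI is installed and already pointed at your project.

```python
import subprocess

SPANNER_API = "spanner.googleapis.com"


def enable_spanner_api() -> None:
    """Enable the Spanner API for the currently configured gcloud project."""
    # Assumes gcloud is on the PATH and authenticated for the target project.
    subprocess.run(["gcloud", "services", "enable", SPANNER_API], check=True)
```

The API can also be enabled from the Google Cloud console if you prefer not to use the CLI.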
Basic Usage
Set Spanner database values
Find your database values in the Spanner Instances page.
Initialize a table
The SpannerVectorStore class instance requires a database table with id, content and embeddings columns.
The helper method init_vector_store_table() can be used to create a table with the proper schema for you.
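Putting the database values and the helper method together might look like the sketch below. The instance, database, and table names are placeholders, and create_vector_table is a hypothetical wrapper; the import is deferred so the snippet can be loaded before the package is installed.

```python
# Placeholders: replace with the values from the Spanner Instances page.
INSTANCE = "my-instance"
DATABASE = "my-database"
TABLE_NAME = "vectors_search_data"


def create_vector_table(instance_id: str, database_id: str, table_name: str) -> None:
    """Create a table with the schema SpannerVectorStore expects."""
    # Deferred import: requires the langchain-google-spanner package.
    from langchain_google_spanner import SpannerVectorStore

    SpannerVectorStore.init_vector_store_table(
        instance_id=instance_id,
        database_id=database_id,
        table_name=table_name,
    )
```

Calling create_vector_table(INSTANCE, DATABASE, TABLE_NAME) needs to happen only once per table, before the vector store is first used.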
Create an embedding class instance
You can use any LangChain embeddings model. You may need to enable the Vertex AI API to use VertexAIEmbeddings. We recommend setting the embedding model’s version for production; learn more about the Text embeddings models.
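As an illustration, creating a Vertex AI embeddings instance with an explicit model version could look like this. It assumes the langchain-google-vertexai package is installed; PROJECT_ID is a placeholder, make_embeddings is a hypothetical helper, and the model name is only an example, so check which versions are currently available before relying on it.

```python
PROJECT_ID = "my-project-id"  # placeholder: replace with your own project ID
EMBEDDING_MODEL = "text-embedding-005"  # example version only; an assumption


def make_embeddings(project_id: str = PROJECT_ID, model_name: str = EMBEDDING_MODEL):
    """Build a VertexAIEmbeddings instance for an explicitly named model."""
    # Deferred import: requires the langchain-google-vertexai package.
    from langchain_google_vertexai import VertexAIEmbeddings

    return VertexAIEmbeddings(model_name=model_name, project=project_id)
```

Naming the model version explicitly means a change in the service's default model does not silently alter the embeddings you have already stored.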
SpannerVectorStore
To initialize the SpannerVectorStore class, you need to provide 4 required arguments. The other arguments are optional and only need to be passed if they differ from the defaults.
- instance_id - The name of the Spanner instance.
- database_id - The name of the Spanner database.
- table_name - The name of the table within the database to store the documents & their embeddings.
- embedding_service - The Embeddings implementation which is used to generate the embeddings.
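With only the four required arguments, constructing the store could be sketched as below. The wrapper make_vector_store is a hypothetical name, and the sketch assumes the table already exists and that the caller supplies an embeddings object.

```python
def make_vector_store(instance_id, database_id, table_name, embedding_service):
    """Build a SpannerVectorStore from the four required arguments."""
    # Deferred import: requires the langchain-google-spanner package.
    from langchain_google_spanner import SpannerVectorStore

    return SpannerVectorStore(
        instance_id=instance_id,
        database_id=database_id,
        table_name=table_name,
        embedding_service=embedding_service,
    )
```

The optional arguments are left at their defaults here, which fits a table created with the default schema.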