Zilliz Cloud Pipelines transform your unstructured data into a searchable vector collection, chaining up the embedding, ingestion, search, and deletion of your data. Zilliz Cloud Pipelines are available in the Zilliz Cloud Console and via RESTful APIs.

This notebook demonstrates how to prepare Zilliz Cloud Pipelines and use them via a LangChain Retriever.
Prepare Zilliz Cloud Pipelines
To get pipelines ready for LangChain Retriever, you need to create and configure the services in Zilliz Cloud:

1. Set up Database
2. Create Pipelines

Use LangChain Retriever
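Once the database and pipelines exist, wrap them in the LangChain retriever. The sketch below is a minimal example with placeholder pipeline IDs and API key; check the `ZillizCloudPipelineRetriever` signature in your installed `langchain-community` version for the exact constructor arguments.

```python
# Minimal sketch: construct the retriever from the pipelines created above.
# The pipeline IDs and token below are placeholders.
from langchain_community.retrievers import ZillizCloudPipelineRetriever

retriever = ZillizCloudPipelineRetriever(
    pipeline_ids={
        "ingestion": "<YOUR_INGESTION_PIPELINE_ID>",
        "search": "<YOUR_SEARCH_PIPELINE_ID>",
        "deletion": "<YOUR_DELETION_PIPELINE_ID>",
    },
    token="<YOUR_ZILLIZ_CLOUD_API_KEY>",
)
```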
Add documents
To add documents, you can use the method `add_texts` or `add_doc_url`, which inserts documents from a list of texts or a presigned/public URL with the corresponding metadata into the store; a usage sketch follows the list below.
- If using a text ingestion pipeline, you can use the method `add_texts`, which inserts a batch of texts with the corresponding metadata into the Zilliz Cloud storage. Arguments:
  - `texts`: A list of text strings.
  - `metadata`: A key-value dictionary of metadata to be inserted as preserved fields required by the ingestion pipeline. Defaults to None.
- If using a document ingestion pipeline, you can use the method `add_doc_url`, which inserts a document from a URL with the corresponding metadata into the Zilliz Cloud storage. Arguments:
  - `doc_url`: A document URL.
  - `metadata`: A key-value dictionary of metadata to be inserted as preserved fields required by the ingestion pipeline. Defaults to None.
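The following sketch shows both ingestion paths. The texts, the document URL, and the `user_id` metadata field are placeholders; use whatever preserved fields your ingestion pipeline was configured with.

```python
# Text ingestion pipeline: insert a batch of texts with metadata.
retriever.add_texts(
    texts=[
        "Zilliz Cloud Pipelines convert unstructured data into searchable vectors.",
        "LangChain retrievers return documents relevant to a query.",
    ],
    metadata={"user_id": "example_user"},  # placeholder preserved field
)

# Document ingestion pipeline: insert a document from a presigned/public URL.
retriever.add_doc_url(
    doc_url="https://example.com/path/to/document.pdf",  # placeholder URL
    metadata={"user_id": "example_user"},
)
```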
Get relevant documents
To query the retriever, you can use the method `get_relevant_documents`, which returns a list of LangChain Document objects.

Arguments:

- `query`: String to find relevant documents for.
- `top_k`: The number of results. Defaults to 10.
- `offset`: The number of records to skip in the search result. Defaults to 0.
- `output_fields`: The extra fields to present in the output.
- `filter`: The Milvus expression to filter search results. Defaults to "".
- `run_manager`: The callbacks handler to use.
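A minimal query sketch follows. The filter expression and the `user_id` field are illustrative and depend on the fields your pipelines define, and the keyword arguments are assumed to be forwarded to the search pipeline as described above.

```python
# Query the retriever and print the returned LangChain Documents.
docs = retriever.get_relevant_documents(
    "What do Zilliz Cloud Pipelines do?",
    top_k=3,
    filter="user_id == 'example_user'",  # illustrative Milvus filter expression
)

for doc in docs:
    print(doc.page_content, doc.metadata)
```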