Fleet AI Context is a dataset of high-quality embeddings of the 1200 most popular, permissively licensed Python libraries and their documentation.
The Fleet AI team is on a mission to embed the world’s most important data. They’ve started by embedding the top 1200 Python libraries to enable code generation with up-to-date knowledge. They’ve been kind enough to share their embeddings of the LangChain docs and API reference.
Let’s take a look at how we can use these embeddings to power a docs retrieval system and ultimately a simple code-generating chain!
Retriever chunks
As part of their embedding process, the Fleet AI team first chunked long documents before embedding them. This means the vectors correspond to sections of pages in the LangChain docs, not entire pages. By default, when we spin up a retriever from these embeddings, we’ll be retrieving these embedded chunks. We will be using Fleet Context’s download_embeddings() to grab LangChain’s documentation embeddings. You can view all supported libraries’ documentation at fleet.so/context.
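To make the chunk-level behavior concrete, here is a minimal, dependency-free sketch of retrieving embedded chunks by cosine similarity. It does not call Fleet Context or LangChain: the Chunk fields, the retrieve_chunks helper, the toy three-dimensional vectors, and the page paths are all hypothetical stand-ins, and the real dataset's schema and vector size may differ.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """One embedded section of a docs page (hypothetical schema for illustration)."""
    chunk_id: str
    parent_page: str      # the page this section was cut from
    text: str
    vector: List[float]   # toy embedding; real embeddings are much longer


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_chunks(query_vector: List[float], chunks: List[Chunk], k: int = 2) -> List[Chunk]:
    """Return the k chunks whose vectors are most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vector, c.vector),
        reverse=True,
    )
    return ranked[:k]


# Toy data: two sections of one page and one section of another page.
CHUNKS = [
    Chunk("lcel-1", "docs/expression_language",
          "Compose runnables with the pipe operator.", [0.9, 0.1, 0.0]),
    Chunk("lcel-2", "docs/expression_language",
          "Stream output from a chain.", [0.6, 0.4, 0.1]),
    Chunk("faiss-1", "docs/integrations/faiss",
          "Build a FAISS index from documents.", [0.0, 0.2, 0.9]),
]

# A made-up query embedding that points toward the first page's sections.
query = [1.0, 0.0, 0.0]
top = retrieve_chunks(query, CHUNKS, k=2)
for chunk in top:
    print(chunk.chunk_id, chunk.parent_page)
```

Both results here are sections of the same page, which is what chunk-level retrieval looks like in practice: the retriever returns the individual sections that match, not the whole page they came from.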