Overview
The BoxRetriever class helps you get your unstructured content from Box in LangChain's Document format. You can do this by searching for files with a full-text search, or by using Box AI to retrieve a Document containing the result of an AI query against files. The Box AI option requires a List[str] containing Box file ids, e.g. ["12345","67890"].
Box AI requires an Enterprise Plus license.
Integration details
Bring-your-own data (i.e., index and search a custom corpus of documents):
Retriever | Self-host | Cloud offering | Package |
---|---|---|---|
BoxRetriever | ❌ | ✅ | langchain-box |
Setup
In order to use the Box package, you will need a few things:
- A Box account: If you are not a current Box customer or want to test outside of your production Box instance, you can use a free developer account.
- A Box app: This is configured in the developer console and, for Box AI, must have the Manage AI scope enabled. Here you will also select your authentication method.
- The app must be enabled by the administrator. For free developer accounts, this is whoever signed up for the account.
Credentials
For these examples, we will use token authentication. A token works with any authentication method, so it does not matter how you obtain it. If you want to learn more about how to use other authentication types with langchain-box, visit the Box provider document.
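One way to supply the token without hard-coding it is sketched below. The environment variable name BOX_DEVELOPER_TOKEN and the helper get_box_token are assumptions made for illustration; they are not defined by langchain-box.

```python
import getpass
import os


def get_box_token(env_var: str = "BOX_DEVELOPER_TOKEN") -> str:
    """Return a Box token from the environment, prompting only if it is unset.

    The variable name is a hypothetical convention, not part of langchain-box.
    """
    token = os.environ.get(env_var)
    if not token:
        # getpass hides the token while it is typed or pasted.
        token = getpass.getpass("Enter your Box developer token: ")
    return token
```

Calling box_developer_token = get_box_token() then gives a value that can be passed to the retriever.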
Installation
This retriever lives in the langchain-box package, which can be installed with pip (pip install langchain-box).
Instantiation
Now we can instantiate our retriever:
Search
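A basic full-text search can be sketched as follows. The wrapper function search_box is a hypothetical helper for illustration; it assumes BoxRetriever accepts a box_developer_token argument and, like other LangChain retrievers, is queried with invoke.

```python
from typing import Any, List


def search_box(box_developer_token: str, query: str) -> List[Any]:
    """Run a full-text search against Box and return LangChain Documents."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from langchain_box import BoxRetriever

    # No file ids are passed, so the retriever searches rather than using Box AI.
    retriever = BoxRetriever(box_developer_token=box_developer_token)
    return retriever.invoke(query)
```

Running search_box(token, "quarterly report") needs a valid token and network access to Box.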
For more granular search, use langchain_box.utilities.SearchOptions in conjunction with the langchain_box.utilities.SearchTypeFilter and langchain_box.utilities.DocumentFiles enums to filter on things like created date, which part of the file to search, and even to limit the search scope to a specific folder.
For more information, check out the API reference.
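A filtered search might look like the sketch below. It rests on several assumptions that should be checked against the API reference: that the options class is exported as BoxSearchOptions (the text above calls it SearchOptions), that it accepts the ancestor_folder_ids, search_type_filter, file_extensions, and k arguments, and that the retriever takes it as box_search_options. The helper build_filtered_retriever is hypothetical.

```python
def build_filtered_retriever(box_developer_token: str, box_folder_id: str):
    """Build a retriever that searches file content of DOCX/PDF files in one folder."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from langchain_box import BoxRetriever
    from langchain_box.utilities import (
        BoxSearchOptions,  # assumed export name; verify in the API reference
        DocumentFiles,
        SearchTypeFilter,
    )

    options = BoxSearchOptions(
        ancestor_folder_ids=[box_folder_id],  # limit scope to one folder
        search_type_filter=[SearchTypeFilter.FILE_CONTENT],  # search file bodies
        file_extensions=[DocumentFiles.DOCX, DocumentFiles.PDF],
        k=20,  # illustrative cap on the number of results
    )
    return BoxRetriever(
        box_developer_token=box_developer_token,
        box_search_options=options,
    )
```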
Box AI
Usage
Citations
With Box AI and the BoxRetriever, you can return the answer to your prompt, return the citations used by Box to get that answer, or both. No matter how you choose to use Box AI, the retriever returns a List[Document] object. We offer this flexibility with two bool arguments, answer and citations. answer defaults to True and citations defaults to False, so you can omit both if you just want the answer. If you want both, include citations=True. If you only want citations, include answer=False and citations=True.
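The three combinations can be summarized in a small sketch. The mode names ("answer", "both", "citations") and the helpers box_ai_flags and ask_box_ai are hypothetical conveniences; only the answer and citations arguments, their defaults, and the file-id list come from the text above. The box_file_ids argument name is an assumption.

```python
from typing import Any, Dict, List


def box_ai_flags(mode: str = "answer") -> Dict[str, bool]:
    """Map an illustrative mode name to the answer/citations arguments."""
    flags: Dict[str, Dict[str, bool]] = {
        "answer": {},  # rely on the defaults: answer=True, citations=False
        "both": {"citations": True},
        "citations": {"answer": False, "citations": True},
    }
    if mode not in flags:
        raise ValueError(f"unknown mode: {mode!r}")
    return flags[mode]


def ask_box_ai(
    box_developer_token: str,
    box_file_ids: List[str],
    prompt: str,
    mode: str = "answer",
) -> List[Any]:
    """Ask Box AI about the given files; the result is always a list of Documents."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from langchain_box import BoxRetriever

    retriever = BoxRetriever(
        box_developer_token=box_developer_token,
        box_file_ids=box_file_ids,  # e.g. ["12345", "67890"]
        **box_ai_flags(mode),
    )
    return retriever.invoke(prompt)
```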
Get both
Citations only
Use within a chain
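As a minimal sketch of the chain pattern, the retriever's documents can be joined into a context string and passed through a prompt and a model. The helper names format_docs and build_chain and the prompt wording are illustrative; llm stands for any LangChain chat model you have configured.

```python
from typing import Any, Iterable


def format_docs(docs: Iterable[Any]) -> str:
    """Join the page_content of each retrieved Document into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_chain(retriever: Any, llm: Any) -> Any:
    """Compose retriever -> prompt -> model -> string output."""
    # Imported lazily so this sketch can be loaded without langchain-core installed.
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough

    prompt = ChatPromptTemplate.from_template(
        "Answer the question based only on the context provided.\n\n"
        "Context: {context}\n\n"
        "Question: {question}"
    )
    return (
        # The question is sent to the retriever and also passed through unchanged.
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
```

The resulting chain is called with the question as a plain string, for example build_chain(retriever, llm).invoke("...").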
Like other retrievers, BoxRetriever can be incorporated into LLM applications via chains. This requires an LLM or chat model.
Use as an agent tool
Like other retrievers, BoxRetriever can also be added to a LangGraph agent as a tool.
Extra fields
All Box connectors offer the ability to select additional fields from the Box FileFull object to return as custom LangChain metadata. Each object accepts an optional List[str] called extra_fields containing the JSON key from the return object, like extra_fields=["shared_link"].
The connector will add this field to the list of fields the integration needs to function and then add the results to the metadata returned in the Document or Blob, like "metadata" : { "source" : "source", "shared_link" : "shared_link" }. If the field is unavailable for that file, it will be returned as an empty string, like "shared_link" : "".
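The behaviour described above can be modelled with a short, purely illustrative function. It is not the connector's implementation: build_metadata and the file_info dictionary are hypothetical stand-ins for the Box response and the connector's internals.

```python
from typing import Any, Dict, List


def build_metadata(file_info: Dict[str, Any], extra_fields: List[str]) -> Dict[str, Any]:
    """Model how requested extra fields are merged into Document metadata."""
    metadata: Dict[str, Any] = {"source": file_info.get("source", "")}
    for field in extra_fields:
        # A field that is unavailable for the file falls back to an empty string.
        metadata[field] = file_info.get(field, "")
    return metadata
```

With the real connector, the same effect is requested by passing extra_fields=["shared_link"] when constructing the retriever.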