## Overview
### Integration details
| Class | Package | Local | Serializable | JS support | Downloads | Version |
| --- | --- | --- | --- | --- | --- | --- |
| ChatXinference | langchain-xinference | ✅ | ❌ | ✅ | ✅ | ✅ |
### Model features
| Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |
## Setup
Install Xinference through PyPI:
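A minimal install command (the `all` extra pulls in every supported inference backend; a narrower extra such as `transformers` or `vllm` can be used if you only need one):

```bash
pip install "xinference[all]"
```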
### Deploy Xinference Locally or in a Distributed Cluster
For local deployment, run `xinference`.
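For example (a sketch; depending on your Xinference version the local entry point may be `xinference` or `xinference-local`, and 9997 is the default port):

```bash
# Start a local Xinference server on all interfaces at the default port.
xinference-local --host 0.0.0.0 --port 9997
```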
To deploy Xinference in a cluster, first start an Xinference supervisor using the `xinference-supervisor` command. You can also use the option `-p` to specify the port and `-H` to specify the host. The default port is 9997 and the default host is 0.0.0.0.
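For instance (the address below is a placeholder for the machine that should run the supervisor):

```bash
# Start the supervisor, binding it to this machine's address on the default port.
xinference-supervisor -H 192.168.1.10 -p 9997
```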
Then, start the Xinference workers using `xinference-worker` on each server you want to run them on.
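A sketch of a worker invocation (the endpoint passed with `-e` must point at the supervisor's host and port; both addresses are placeholders):

```bash
# Run on each worker machine, pointing back at the supervisor.
xinference-worker -e http://192.168.1.10:9997 -H 192.168.1.11
```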
You can consult the README file from Xinference for more information.
### Wrapper
To use Xinference with LangChain, you first need to launch a model. You can use the command line interface (CLI) to do so, for example:
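A sketch of launching a model through the CLI (the model name, format, and quantization are illustrative; substitute whichever model you intend to serve — the command prints a model UID that you can then reference from the LangChain wrapper):

```bash
xinference launch -n vicuna-v1.3 -f ggmlv3 -q q4_0
```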
### Installation

The LangChain Xinference integration lives in the `langchain-xinference` package:
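A typical install command (add `-U` to upgrade an existing installation):

```bash
pip install -U langchain-xinference
```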