Overview
This guide covers how to use the LangChain RunPod LLM class to interact with text generation models hosted on RunPod Serverless.
Setup
- Install the package (see the setup sketch after this list).
- Deploy an LLM Endpoint: Follow the setup steps in the RunPod Provider Guide to deploy a compatible text generation endpoint on RunPod Serverless and get its Endpoint ID.
- Set Environment Variables: Make sure RUNPOD_API_KEY and RUNPOD_ENDPOINT_ID are set, as shown below.
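A minimal setup sketch, assuming the package name langchain-runpod (matching the source repository linked below) and the two environment variable names listed above:

```python
# Assumed install command: pip install -U langchain-runpod
import getpass
import os

# Prompt for credentials only if they are not already set in the environment.
if not os.environ.get("RUNPOD_API_KEY"):
    os.environ["RUNPOD_API_KEY"] = getpass.getpass("RunPod API key: ")
if not os.environ.get("RUNPOD_ENDPOINT_ID"):
    os.environ["RUNPOD_ENDPOINT_ID"] = input("RunPod endpoint ID: ")
```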
Instantiation
Initialize the RunPod class. You can pass model-specific parameters via model_kwargs and configure polling behavior.
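An instantiation sketch, assuming the import path langchain_runpod and that polling is tuned via keyword arguments such as poll_interval and max_polling_attempts (verify the exact names against the class signature):

```python
from langchain_runpod import RunPod

llm = RunPod(
    endpoint_id="your-endpoint-id",  # placeholder; may also be read from RUNPOD_ENDPOINT_ID
    model_kwargs={
        "max_new_tokens": 256,
        "temperature": 0.7,
    },
    # Polling configuration; these parameter names are assumptions.
    poll_interval=0.5,
    max_polling_attempts=120,
)
```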
Invocation
Use the standard LangChain .invoke() and .ainvoke() methods to call the model. Streaming is also supported via .stream() and .astream() (simulated by polling the RunPod /stream endpoint).
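A usage sketch with the llm instance created above; the prompt text is illustrative:

```python
prompt = "Explain what RunPod Serverless is in one sentence."

# Synchronous, blocking call.
response = llm.invoke(prompt)
print(response)

# Simulated streaming: chunks arrive as the /stream endpoint is polled.
for chunk in llm.stream(prompt):
    print(chunk, end="", flush=True)
```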
Async Usage
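The async variants mirror the synchronous API; a sketch using the same llm instance:

```python
import asyncio

async def main() -> None:
    # Asynchronous single call.
    response = await llm.ainvoke("Name three workloads that suit serverless GPUs.")
    print(response)

    # Asynchronous streaming, also simulated by polling.
    async for chunk in llm.astream("Write a haiku about GPUs."):
        print(chunk, end="", flush=True)

asyncio.run(main())
```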
Chaining
The LLM integrates seamlessly with LangChain Expression Language (LCEL) chains.
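A small LCEL sketch composing a prompt template with the RunPod LLM; the template text is illustrative:

```python
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "Summarize the following text in one sentence:\n\n{text}"
)

# LCEL composition: prompt template piped into the RunPod LLM.
chain = template | llm

print(chain.invoke({"text": "RunPod Serverless runs GPU workers that scale to zero when idle."}))
```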
Endpoint Considerations
- Input: The endpoint handler should expect the prompt string within {"input": {"prompt": "...", ...}}.
- Output: The handler should return the generated text within the "output" key of the final status response (e.g., {"output": "Generated text..."} or {"output": {"text": "..."}}).
- Streaming: For simulated streaming via the /stream endpoint, the handler must populate the "stream" key in the status response with a list of chunk dictionaries, like [{"output": "token1"}, {"output": "token2"}]. A handler sketch follows this list.
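A minimal handler sketch satisfying the input/output contract above, assuming the standard runpod serverless Python SDK; the echo logic stands in for a real model call:

```python
import runpod  # RunPod serverless SDK

def handler(job):
    # The prompt arrives under job["input"]["prompt"], per the contract above.
    prompt = job["input"]["prompt"]

    # Stand-in for real text generation.
    generated = f"Echo: {prompt}"

    # The returned value is surfaced under the "output" key of the final status response.
    return {"text": generated}

runpod.serverless.start({"handler": handler})
```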
API reference
For detailed documentation of the RunPod LLM class, parameters, and methods, refer to the source code or the generated API reference (if available).
Link to source code: https://github.com/runpod/langchain-runpod/blob/main/langchain_runpod/llms.py