This will help you get started with CloudflareWorkersAI chat models. For detailed documentation of all ChatCloudflareWorkersAI features and configurations, head to the API reference.

Overview

Integration details

Class: ChatCloudflareWorkersAI
Package: langchain-cloudflare (downloads and latest version available on PyPI)

Model features

Tool calling, structured output, JSON mode, image input, audio input, video input, token-level streaming, native async, token usage, logprobs

Setup

To access CloudflareWorkersAI models you’ll need to create a Cloudflare account, get an API key, and install the langchain-cloudflare integration package.

Credentials

Head to www.cloudflare.com/developer-platform/products/workers-ai/ to sign up for CloudflareWorkersAI and generate an API key. Once you’ve done this, set the CF_AI_API_KEY and CF_ACCOUNT_ID environment variables:
import getpass
import os

if not os.getenv("CF_AI_API_KEY"):
    os.environ["CF_AI_API_KEY"] = getpass.getpass(
        "Enter your CloudflareWorkersAI API key: "
    )

if not os.getenv("CF_ACCOUNT_ID"):
    os.environ["CF_ACCOUNT_ID"] = getpass.getpass(
        "Enter your CloudflareWorkersAI account ID: "
    )
If you want to get automated tracing of your model calls you can also set your LangSmith API key by uncommenting below:
# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")

Installation

The LangChain CloudflareWorkersAI integration lives in the langchain-cloudflare package:
%pip install -qU langchain-cloudflare

Instantiation

Now we can instantiate our model object and generate chat completions:
from langchain_cloudflare.chat_models import ChatCloudflareWorkersAI

llm = ChatCloudflareWorkersAI(
    model="@cf/meta/llama-3.3-70b-instruct-fp8-fast",
    temperature=0,
    max_tokens=1024,
    # other params...
)

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", additional_kwargs={}, response_metadata={'token_usage': {'prompt_tokens': 37, 'completion_tokens': 9, 'total_tokens': 46}, 'model_name': '@cf/meta/llama-3.3-70b-instruct-fp8-fast'}, id='run-995d1970-b6be-49f3-99ae-af4cdba02304-0', usage_metadata={'input_tokens': 37, 'output_tokens': 9, 'total_tokens': 46})
print(ai_msg.content)
J'adore la programmation.
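
Token-level streaming and native async both appear in the model features list above, and both go through the standard LangChain Runnable interface rather than anything integration-specific. A minimal sketch that streams the same translation request token by token (each chunk is an AIMessageChunk):
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
The same interface also provides async counterparts (ainvoke, astream, abatch).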

Chaining

We can chain our model with a prompt template like so:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)
AIMessage(content='Ich liebe das Programmieren.', additional_kwargs={}, response_metadata={'token_usage': {'prompt_tokens': 32, 'completion_tokens': 7, 'total_tokens': 39}, 'model_name': '@cf/meta/llama-3.3-70b-instruct-fp8-fast'}, id='run-d1b677bc-194e-4473-90f1-aa65e8e46d50-0', usage_metadata={'input_tokens': 32, 'output_tokens': 7, 'total_tokens': 39})
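
To get back a plain string instead of an AIMessage, the chain can be extended with StrOutputParser from langchain_core; this is standard LangChain composition, not specific to this integration. A minimal sketch reusing the prompt above:
from langchain_core.output_parsers import StrOutputParser

string_chain = prompt | llm | StrOutputParser()
string_chain.invoke(
    {
        "input_language": "English",
        "output_language": "Spanish",
        "input": "I love programming.",
    }
)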

Structured outputs

json_schema = {
    "title": "joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {
            "type": "string",
            "description": "The setup of the joke",
        },
        "punchline": {
            "type": "string",
            "description": "The punchline to the joke",
        },
        "rating": {
            "type": "integer",
            "description": "How funny the joke is, from 1 to 10",
            "default": None,
        },
    },
    "required": ["setup", "punchline"],
}
structured_llm = llm.with_structured_output(json_schema)

structured_llm.invoke("Tell me a joke about cats")
{'setup': 'Why did the cat join a band?',
 'punchline': 'Because it wanted to be the purr-cussionist',
 'rating': '8'}
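
In the standard LangChain chat model interface, with_structured_output also accepts a Pydantic class; assuming this integration follows that convention, the same schema can be written as a model, and the result comes back as a validated Joke instance instead of a dict. A minimal sketch:
from typing import Optional

from pydantic import BaseModel, Field


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )


structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")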

Bind tools

from typing import List

from langchain_core.tools import tool


@tool
def validate_user(user_id: int, addresses: List[str]) -> bool:
    """Validate user using historical addresses.

    Args:
        user_id (int): the user ID.
        addresses (List[str]): Previous addresses as a list of strings.
    """
    return True


llm_with_tools = llm.bind_tools([validate_user])

result = llm_with_tools.invoke(
    "Could you validate user 123? They previously lived at "
    "123 Fake St in Boston MA and 234 Pretend Boulevard in "
    "Houston TX."
)
result.tool_calls
[{'name': 'validate_user',
  'args': {'user_id': '123',
   'addresses': '["123 Fake St in Boston MA", "234 Pretend Boulevard in Houston TX"]'},
  'id': '31ec7d6a-9ce5-471b-be64-8ea0492d1387',
  'type': 'tool_call'}]
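
To close the loop, each tool call can be executed and its result handed back to the model as a ToolMessage. In recent langchain-core versions, invoking a @tool with a tool-call dict returns a ToolMessage directly; a minimal sketch assuming that behavior (and that the model emits well-typed arguments):
from langchain_core.messages import HumanMessage

query = (
    "Could you validate user 123? They previously lived at "
    "123 Fake St in Boston MA and 234 Pretend Boulevard in "
    "Houston TX."
)
messages = [HumanMessage(query)]
ai_msg = llm_with_tools.invoke(messages)
messages.append(ai_msg)

for tool_call in ai_msg.tool_calls:
    # Executing the tool with the tool-call dict yields a ToolMessage
    # carrying the matching tool_call_id.
    tool_msg = validate_user.invoke(tool_call)
    messages.append(tool_msg)

# The model can now produce a final answer grounded in the tool result.
final_response = llm_with_tools.invoke(messages)
print(final_response.content)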

API reference

developers.cloudflare.com/workers-ai/
developers.cloudflare.com/agents/