- Unified API Access: Connect to 250+ LLMs (OpenAI, Claude, Gemini, Groq, Mistral) through one API
- Low Latency: Sub-3ms internal latency with intelligent routing and load balancing
- Enterprise Security: SOC 2, HIPAA, GDPR compliance with RBAC and audit logging
- Quota and cost management: Token-based quotas, rate limiting, and comprehensive usage tracking
- Observability: Full request/response logging, metrics, and traces with customizable retention
Prerequisites
Before integrating LangChain with TrueFoundry, ensure you have:
- TrueFoundry Account: A TrueFoundry account with at least one model provider configured. See the TrueFoundry quick start guide
- Personal Access Token: Generate a token by following the TrueFoundry token generation guide
Quickstart
You can connect to TrueFoundry's unified LLM gateway through the `ChatOpenAI` interface:
- Set the `base_url` to your TrueFoundry endpoint (explained below)
- Set the `api_key` to your TrueFoundry PAT (Personal Access Token)
- Use the same model name as shown in the unified code snippet

Installation
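Install LangChain together with its OpenAI integration package (package names assumed from the standard LangChain distribution):

```shell
pip install -U langchain langchain-openai
```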
Basic Setup
Connect to TrueFoundry by updating the `ChatOpenAI` model in LangChain:
LangGraph Integration
Observability and Governance

- Performance Metrics: Track key latency metrics like Request Latency, Time to First Token (TTFT), and Inter-Token Latency (ITL) with P99, P90, and P50 percentiles
- Cost and Token Usage: Gain visibility into your application’s costs with detailed breakdowns of input/output tokens and the associated expenses for each model
- Usage Patterns: Understand how your application is being used with detailed analytics on user activity, model distribution, and team-based usage
- Rate Limiting & Load Balancing: Configure limits, distribute traffic across models, and set up fallbacks
Support
For questions, issues, or support:
- Email: support@truefoundry.com
- Documentation: https://docs.truefoundry.com/