- 🔬 Built for fast and production usage
- 🚂 Supports llama3, qwen2, gemma, and more, plus many quantized versions (full list)
- ⛓️ OpenAI-compatible API
- 💬 Built-in ChatGPT-like UI
- 🔥 Accelerated LLM decoding with state-of-the-art inference backends
- 🌥️ Ready for enterprise-grade cloud deployment (Kubernetes, Docker, and BentoCloud)
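Because the server exposes an OpenAI-compatible API, any OpenAI-style client can talk to it. The sketch below builds a request body for the `/v1/chat/completions` endpoint; the model tag `llama3:8b` and the default port `3000` are assumptions, so match them to whatever your running server reports.

```python
import json

# Sketch of an OpenAI-compatible chat request for a local OpenLLM server.
# The model tag is an example; use the tag of the model you actually serve.
payload = {
    "model": "llama3:8b",
    "messages": [{"role": "user", "content": "Hello!"}],
}
body = json.dumps(payload)
print(body)
# POST `body` to http://localhost:3000/v1/chat/completions (e.g. with curl
# or the official openai client) to receive a chat completion in response.
```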
## Installation
Install `openllm` through PyPI:
## Launch OpenLLM server locally
To start an LLM server, use the `openllm hello` command:
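The command is interactive, so the sketch below only shows the invocation; it walks you through selecting and launching a model.

```shell
openllm hello
```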