- `LlamaEdgeChatService` provides developers with an OpenAI-API-compatible service to chat with LLMs via HTTP requests (see the HTTP sketch below).
- `LlamaEdgeChatLocal` enables developers to chat with LLMs locally (coming soon).
Both `LlamaEdgeChatService` and `LlamaEdgeChatLocal` run on infrastructure driven by the WasmEdge Runtime, which provides a lightweight and portable WebAssembly container environment for LLM inference tasks.
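To make the "OpenAI-API-compatible" claim concrete, a raw HTTP request against a running `llama-api-server` might look like the sketch below. The host, port, and model name are placeholders, and this assumes the server exposes the standard OpenAI-style `/v1/chat/completions` route:

```python
import requests

# placeholder endpoint for a running llama-api-server instance
url = "http://localhost:8080/v1/chat/completions"

payload = {
    # model name is a placeholder; use whichever GGUF model your server hosts
    "model": "llama-2-7b-chat",
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
}

# the response follows the OpenAI chat-completion schema
response = requests.post(url, json=payload, timeout=60)
print(response.json()["choices"][0]["message"]["content"])
```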
## Chat via API Service
`LlamaEdgeChatService` works on top of the `llama-api-server`. Following the steps in the llama-api-server quick-start, you can host your own API service and chat with any model you like, from any device, anywhere, as long as you have an internet connection.
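Once the service is up, a minimal sketch of chatting with it through LangChain might look like the following. The service URL is a placeholder for your own deployment, and this assumes the `langchain_community` integration package is installed:

```python
from langchain_community.chat_models.llama_edge import LlamaEdgeChatService
from langchain_core.messages import HumanMessage, SystemMessage

# placeholder URL of your self-hosted llama-api-server instance
service_url = "https://your-llama-api-server.example.com"

# create a chat service instance pointing at the API server
chat = LlamaEdgeChatService(service_url=service_url)

# build the message sequence
messages = [
    SystemMessage(content="You are a helpful AI assistant."),
    HumanMessage(content="What is the capital of France?"),
]

# send the messages and print the model's reply
response = chat.invoke(messages)
print(f"[Bot] {response.content}")
```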