📄️ Integrate as a Model Provider
This guide focuses on how to set up the classes and configuration necessary to act as a chat provider.
🗃️ OpenAI
4 items
📄️ OpenAI (Text Completion)
LiteLLM supports OpenAI text completion models
📄️ OpenAI-Compatible Endpoints
Selecting openai as the provider routes your request to an OpenAI-compatible endpoint using the upstream official OpenAI Python API library.
🗃️ Azure OpenAI
5 items
🗃️ Azure AI
7 items
🗃️ Vertex AI
7 items
🗃️ Google AI Studio
5 items
📄️ Anthropic
LiteLLM supports all Anthropic models.
📄️ AWS Sagemaker
LiteLLM supports all SageMaker Hugging Face JumpStart models.
🗃️ Bedrock
8 items
📄️ Milvus - Vector Store
Use Milvus as a vector store for RAG.
📄️ LiteLLM Proxy (LLM Gateway)
| Property | Details |
📄️ Meta Llama
| Property | Details |
📄️ Mistral AI API
https://docs.mistral.ai/api/
📄️ Codestral API [Mistral AI]
Codestral is available in select code-completion plugins but can also be queried directly. See the documentation for more details.
📄️ Cohere
API KEYS
📄️ Anyscale
https://app.endpoints.anyscale.com/
🗃️ HuggingFace
2 items
📄️ Hyperbolic
Overview
📄️ Databricks
LiteLLM supports all models on Databricks
📄️ Deepgram
LiteLLM supports Deepgram's /listen endpoint.
📄️ IBM watsonx.ai
LiteLLM supports all IBM watsonx.ai foundational models and embeddings.
📄️ Predibase
LiteLLM supports all models on Predibase
🗃️ Nvidia NIM
2 items
📄️ Nscale (EU Sovereign)
https://docs.nscale.com/docs/inference/chat
📄️ xAI
https://docs.x.ai/docs
📄️ Moonshot AI
Overview
📄️ LM Studio
https://lmstudio.ai/docs/basics/server
📄️ Cerebras
https://inference-docs.cerebras.ai/api-reference/chat-completions
📄️ Volcano Engine (Volcengine)
https://www.volcengine.com/docs/82379/1263482
📄️ Triton Inference Server
LiteLLM supports embedding models on Triton Inference Servers.
📄️ Ollama
LiteLLM supports all models from Ollama
📄️ Perplexity AI (pplx-api)
https://www.perplexity.ai
📄️ FriendliAI
We support all FriendliAI models; just set friendliai/ as the prefix when sending completion requests.
📄️ Galadriel
https://docs.galadriel.com/api-reference/chat-completion-API
📄️ Topaz
| Property | Details |
📄️ Groq
https://groq.com/
📄️ Deepseek
https://deepseek.com/
📄️ ElevenLabs
ElevenLabs provides high-quality AI voice technology, including speech-to-text capabilities through their transcription API.
📄️ Fal AI
Fal AI provides fast, scalable access to state-of-the-art image generation models including FLUX, Stable Diffusion, Imagen, and more.
📄️ Fireworks AI
We support all Fireworks AI models; just set fireworks_ai/ as the prefix when sending completion requests.
📄️ Clarifai
Anthropic, OpenAI, Qwen, xAI, Gemini, and most open-source LLMs are supported on Clarifai.
📄️ CompactifAI
https://docs.compactif.ai/
📄️ Lemonade
Lemonade Server is an OpenAI-compatible local language model inference provider optimized for AMD GPUs and NPUs. The lemonade LiteLLM provider supports standard chat completions with full OpenAI API compatibility.
📄️ VLLM
LiteLLM supports all models on VLLM.
📄️ Llamafile
LiteLLM supports all models on Llamafile.
📄️ Infinity
| Property | Details |
📄️ Xinference [Xorbits Inference]
https://inference.readthedocs.io/en/latest/index.html
📄️ AI/ML API
https://aimlapi.com/
📄️ Cloudflare Workers AI
https://developers.cloudflare.com/workers-ai/models/text-generation/
📄️ DeepInfra
https://deepinfra.com/
📄️ GitHub
https://github.com/marketplace/models
📄️ GitHub Copilot
https://docs.github.com/en/copilot
📄️ AI21
LiteLLM supports the following AI21 models:
📄️ NLP Cloud
LiteLLM supports all LLMs on NLP Cloud.
📄️ Recraft
https://www.recraft.ai/
📄️ Replicate
LiteLLM supports all models on Replicate
🗃️ RunwayML
2 items
📄️ Together AI
LiteLLM supports all models on Together AI.
📄️ v0
Overview
📄️ Vercel AI Gateway
Overview
📄️ Morph
LiteLLM supports all models on Morph
📄️ Lambda AI
Overview
📄️ Novita AI
| Property | Details |
📄️ Voyage AI
https://docs.voyageai.com/embeddings/
📄️ Jina AI
https://jina.ai/embeddings/
📄️ Aleph Alpha
LiteLLM supports all models from Aleph Alpha.
📄️ Baseten
LiteLLM supports both Baseten Model APIs and dedicated deployments with automatic routing.
📄️ OpenRouter
LiteLLM supports all the text / chat / vision models from OpenRouter
📄️ SambaNova
https://cloud.sambanova.ai/
📄️ Custom API Server (Custom Format)
Call your custom TorchServe / internal LLM APIs via LiteLLM.
📄️ Petals
https://github.com/bigscience-workshop/petals
📄️ Snowflake
| Property | Details |
📄️ GradientAI
https://digitalocean.com/products/gradientai
📄️ Featherless AI
https://featherless.ai/
📄️ Nebius AI Studio
https://docs.nebius.com/studio/inference/quickstart
📄️ Dashscope (Qwen API)
https://dashscope.console.aliyun.com/
📄️ Bytez
LiteLLM supports all chat models on Bytez!
📄️ Heroku
Provision a Model
📄️ Oracle Cloud Infrastructure (OCI)
LiteLLM supports the following models for OCI on-demand GenAI API.
📄️ DataRobot
LiteLLM supports all models from DataRobot. Select datarobot as the provider to route your request through the DataRobot OpenAI-compatible endpoint using the upstream official OpenAI Python API library.
📄️ 🆕 OVHCloud AI Endpoints
A leading French cloud provider in Europe, offering data sovereignty and privacy.
📄️ Weights & Biases Inference
https://weave-docs.wandb.ai/quickstart-inference
📄️ CometAPI
LiteLLM supports all AI models from CometAPI. CometAPI provides access to 500+ AI models through a unified API interface, including cutting-edge models like GPT-5, Claude Opus 4.1, and various other state-of-the-art language models.