📄️ Integrate as a Model Provider
This guide focuses on how to set up the classes and configuration necessary to act as a chat provider.
🗃️ OpenAI
4 items
📄️ OpenAI (Text Completion)
LiteLLM supports OpenAI text completion models
📄️ OpenAI-Compatible Endpoints
Selecting openai as the provider routes your request to an OpenAI-compatible endpoint using the upstream official OpenAI Python API library.
🗃️ Azure OpenAI
5 items
🗃️ Azure AI
7 items
🗃️ Vertex AI
7 items
🗃️ Google AI Studio
5 items
📄️ Anthropic
LiteLLM supports all Anthropic models.
📄️ AWS Sagemaker
LiteLLM supports all SageMaker Hugging Face JumpStart models.
🗃️ Bedrock
8 items
📄️ Milvus - Vector Store
Use Milvus as a vector store for RAG.
📄️ LiteLLM Proxy (LLM Gateway)
| Property | Details |
📄️ Meta Llama
| Property | Details |
📄️ Mistral AI API
https://docs.mistral.ai/api/
📄️ Codestral API [Mistral AI]
Codestral is available in select code-completion plugins but can also be queried directly. See the documentation for more details.
📄️ Cohere
API KEYS
📄️ Anyscale
https://app.endpoints.anyscale.com/
🗃️ HuggingFace
2 items
📄️ Hyperbolic
Overview
📄️ Databricks
LiteLLM supports all models on Databricks
📄️ Deepgram
LiteLLM supports Deepgram's /listen endpoint.
📄️ IBM watsonx.ai
LiteLLM supports all IBM watsonx.ai foundational models and embeddings.
📄️ Predibase
LiteLLM supports all models on Predibase
🗃️ Nvidia NIM
2 items
📄️ Nscale (EU Sovereign)
https://docs.nscale.com/docs/inference/chat
📄️ xAI
https://docs.x.ai/docs
📄️ Moonshot AI
Overview
📄️ LM Studio
https://lmstudio.ai/docs/basics/server
📄️ Cerebras
https://inference-docs.cerebras.ai/api-reference/chat-completions
📄️ Volcano Engine (Volcengine)
https://www.volcengine.com/docs/82379/1263482
📄️ Triton Inference Server
LiteLLM supports embedding models on Triton Inference Servers.
📄️ Ollama
LiteLLM supports all models from Ollama
📄️ Perplexity AI (pplx-api)
https://www.perplexity.ai
📄️ FriendliAI
We support all FriendliAI models; just set friendliai/ as the prefix when sending completion requests.
📄️ Galadriel
https://docs.galadriel.com/api-reference/chat-completion-API
📄️ Topaz
| Property | Details |
📄️ Groq
https://groq.com/
📄️ Deepseek
https://deepseek.com/
📄️ ElevenLabs
ElevenLabs provides high-quality AI voice technology, including speech-to-text capabilities through their transcription API.
📄️ Fal AI
Fal AI provides fast, scalable access to state-of-the-art image generation models including FLUX, Stable Diffusion, Imagen, and more.
📄️ Fireworks AI
We support all Fireworks AI models; just set fireworks_ai/ as the prefix when sending completion requests.
📄️ Clarifai
Anthropic, OpenAI, Qwen, xAI, Gemini, and most open-source LLMs are supported on Clarifai.
📄️ CompactifAI
https://docs.compactif.ai/
📄️ Lemonade
Lemonade Server is an OpenAI-compatible local language model inference provider optimized for AMD GPUs and NPUs. The lemonade LiteLLM provider supports standard chat completions with full OpenAI API compatibility.
📄️ VLLM
LiteLLM supports all models on VLLM.
📄️ Llamafile
LiteLLM supports all models on Llamafile.
📄️ Infinity
| Property | Details |
📄️ Xinference [Xorbits Inference]
https://inference.readthedocs.io/en/latest/index.html
📄️ AI/ML API
https://aimlapi.com/
📄️ Cloudflare Workers AI
https://developers.cloudflare.com/workers-ai/models/text-generation/
📄️ DeepInfra
https://deepinfra.com/
📄️ GitHub
https://github.com/marketplace/models
📄️ GitHub Copilot
https://docs.github.com/en/copilot
📄️ AI21
LiteLLM supports the following AI21 models:
📄️ NLP Cloud
LiteLLM supports all LLMs on NLP Cloud.
📄️ Recraft
https://www.recraft.ai/
📄️ Replicate
LiteLLM supports all models on Replicate
🗃️ RunwayML
2 items
📄️ Together AI
LiteLLM supports all models on Together AI.
📄️ v0
Overview
📄️ Vercel AI Gateway
Overview
📄️ Morph
LiteLLM supports all models on Morph
📄️ Lambda AI
Overview
📄️ Novita AI
| Property | Details |
📄️ Voyage AI
https://docs.voyageai.com/embeddings/
📄️ Jina AI
https://jina.ai/embeddings/
📄️ Aleph Alpha
LiteLLM supports all models from Aleph Alpha.
📄️ Baseten
LiteLLM supports both Baseten Model APIs and dedicated deployments with automatic routing.
📄️ OpenRouter
LiteLLM supports all the text / chat / vision models from OpenRouter
📄️ SambaNova
https://cloud.sambanova.ai/
📄️ Custom API Server (Custom Format)
Call your custom TorchServe / internal LLM APIs via LiteLLM.
📄️ Petals
https://github.com/bigscience-workshop/petals
📄️ Snowflake
| Property | Details |
📄️ GradientAI
https://digitalocean.com/products/gradientai
📄️ Featherless AI
https://featherless.ai/
📄️ Nebius AI Studio
https://docs.nebius.com/studio/inference/quickstart
📄️ Dashscope (Qwen API)
https://dashscope.console.aliyun.com/
📄️ Bytez
LiteLLM supports all chat models on Bytez!
📄️ Heroku
Provision a Model
📄️ Oracle Cloud Infrastructure (OCI)
LiteLLM supports the following models for OCI on-demand GenAI API.
📄️ DataRobot
LiteLLM supports all models from DataRobot. Select datarobot as the provider to route your request through the DataRobot OpenAI-compatible endpoint using the upstream official OpenAI Python API library.
📄️ 🆕 OVHCloud AI Endpoints
A leading French cloud provider in Europe, offering data sovereignty and privacy.
📄️ Weights & Biases Inference
https://weave-docs.wandb.ai/quickstart-inference
📄️ CometAPI
LiteLLM supports all AI models from CometAPI. CometAPI provides access to 500+ AI models through a unified API interface, including cutting-edge models like GPT-5, Claude Opus 4.1, and various other state-of-the-art language models.