
OpenAI-compatible API

An OpenAI-compatible API is an inference endpoint that mirrors the HTTP shape of OpenAI's API (the `/v1/chat/completions` route, the `messages` request format, and streaming over server-sent events), so client code written for OpenAI works after changing only the base URL and API key. Groq, Fireworks, Together, OpenRouter, vLLM, and llama.cpp all ship OpenAI-compatible endpoints.
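To make the "HTTP shape" concrete, here is a minimal sketch using only the Python standard library. It builds (but does not send) a chat completions request and reads text out of a streamed response. The function names `build_chat_request` and `iter_stream_text` are hypothetical helpers invented for illustration, and the sketch assumes the endpoint follows the common convention of `data: {json}` lines terminated by `data: [DONE]`; individual servers may deviate.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, messages, stream=False):
    """Build a POST to <base_url>/chat/completions (hypothetical helper).

    base_url is expected to already include the version prefix, e.g. ".../v1".
    """
    body = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # conventional end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Passing the request to `urllib.request.urlopen` and feeding the response object (which iterates over byte lines) to `iter_stream_text` would complete the round trip. In practice most projects use an SDK instead; the point is only that the wire format is small enough to read in full.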

OpenAI's API shape became the de facto standard in 2023-2024; by 2026 almost every hosted inference platform offers an OpenAI-compatible endpoint.

Benefits: switching vendors comes down to changing a base URL and an API key; existing SDK code, libraries, and tools (Vercel AI SDK, LiteLLM, OpenRouter, Cursor, Cline) keep working; and new models drop into existing eval pipelines without code changes.

Trade-offs: vendor-specific features (Anthropic prompt caching, Google grounding, native tool formats) are typically exposed only through each vendor's native, non-compatible API, and compatibility shims sometimes lag the latest OpenAI API additions (parallel tool calls, strict mode for structured outputs).

For production stacks that need portability, prioritize vendors with OpenAI-compatible endpoints: switching then becomes a configuration change rather than a client rewrite.
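The vendor-switching benefit can be sketched as a small lookup table, where the provider is just data. The `Provider` class, the `client_settings` helper, and the environment variable names are hypothetical; the base URLs are the ones these vendors commonly document at the time of writing and should be checked against current vendor docs before use.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    base_url: str
    api_key_env: str  # name of the environment variable holding the key


# Assumed base URLs; verify against each vendor's documentation.
PROVIDERS = {
    "openai": Provider("https://api.openai.com/v1", "OPENAI_API_KEY"),
    "groq": Provider("https://api.groq.com/openai/v1", "GROQ_API_KEY"),
    "openrouter": Provider("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY"),
    # Self-hosted vLLM server on its usual default port (assumption).
    "vllm-local": Provider("http://localhost:8000/v1", "VLLM_API_KEY"),
}


def client_settings(name, env=None):
    """Return the two values that differ between vendors."""
    env = os.environ if env is None else env
    provider = PROVIDERS[name]
    return {
        "base_url": provider.base_url,
        "api_key": env.get(provider.api_key_env, ""),
    }
```

With the official `openai` Python package installed, these settings could be passed straight through as `OpenAI(**client_settings("groq"))`, since that client accepts `base_url` and `api_key` keyword arguments. The model name still has to change per vendor, which is why it is left out of the table.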

When to use an OpenAI-compatible API

Common mistakes

FAQ

What is an OpenAI-compatible API?

It is an inference endpoint that accepts the same requests as OpenAI's API (`/v1/chat/completions`, the `messages` format, streaming over SSE), so clients written for OpenAI can target it by changing only the base URL and API key. Groq, Fireworks, Together, OpenRouter, vLLM, and llama.cpp all provide one.

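When should I use an OpenAI-compatible API?

In any production stack that needs vendor portability, for example to switch providers or to run new models through an existing eval pipeline without code changes.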

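What are the most common mistakes with an OpenAI-compatible API?

Assuming full feature parity. Some advanced features do not translate: compatibility shims can lag newer OpenAI additions such as parallel tool calls and strict structured outputs, and vendor-specific features usually require the vendor's native API.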
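One way to guard against the parity assumption is to filter optional request fields against what a given endpoint is known to accept, and to report what was dropped instead of failing silently. This is a sketch under stated assumptions: `strip_unsupported` is a hypothetical helper, and the `supported` set is something you would maintain yourself from each vendor's documentation; it is not a feature of any SDK. `parallel_tool_calls` and `response_format` are real OpenAI chat completions parameters, used here as examples of fields a shim might not yet implement.

```python
# Optional fields that compatible endpoints may not implement yet.
OPTIONAL_FIELDS = ("parallel_tool_calls", "response_format")


def strip_unsupported(payload, supported):
    """Return (filtered_payload, dropped_field_names).

    Core fields always pass through; optional fields pass only if the
    caller lists them in `supported`.
    """
    kept = {
        key: value
        for key, value in payload.items()
        if key not in OPTIONAL_FIELDS or key in supported
    }
    dropped = sorted(set(payload) - set(kept))
    return kept, dropped
```

Logging the `dropped` list per request makes a silent capability gap visible in monitoring, which is usually preferable to discovering it through degraded outputs.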

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/openai-compatible-api.md.