Vercel AI Gateway
"AI Gateway provides a unified API to access hundreds of AI models through a single endpoint, with built-in budgets, usage monitoring, and fallbacks." [1]
Vercel AI Gateway is a unified REST API that routes requests to hundreds of AI models across multiple providers through a single endpoint, with built-in automatic fallback, load balancing, spend limits, and observability logging. It launched in May 2025 and reached general availability in August 2025, and is used by teams at v0.app, Browserbase, and Perplexity. Pricing is usage-based per token with a free tier offering $5 of credits every 30 days on a rate-limited subset of models; paid tiers unlock BYOK, zero data retention routing, and higher rate limits. The service holds SOC 2 Type 2, ISO 27001, HIPAA, GDPR, and PCI DSS certifications, and exposes OpenAI-compatible, Anthropic-compatible, and native SDKs for TypeScript and Python.
Best for / Avoid if
Best for: Prototypes and side projects - free to start, no sales call; Regulated or enterprise workloads - compliance attestations and an enterprise plan; AI agents and automation - an agent-ready surface (MCP / llms.txt)
Pricing & procurement
- Pricing model
- Usage-based [2]
- Published pricing
- ✓ Yes [3]
- Free tier
- ✓ Yes [4]
- Free tier details
- $5 of AI Gateway Credits every 30 days for teams that have not purchased credits. Free tier covers a subset of models (not the full catalog) and is rate-limited per model. Monthly free credit stops applying once the team makes a first purchase of AI Gateway Credits. [5]
- Self-serve signup
- ✓ Yes [6]
- Requires sales call
- ✗ No
- Enterprise plan
- ✓ Yes [7]
| Plan | Item | Per | Amount | Source |
|---|---|---|---|---|
| Free | Gateway usage fee — no markup, free-tier models only, rate-limited | request | 0% + $0 | source |
| Paid | Gateway markup on model tokens (PAYG, zero markup) | % of spend | 0% | source |
| Paid | Gateway markup on model tokens with BYOK (zero markup) | % of spend | 0% | source |
| Pro and Enterprise | Custom Reporting — tag/user ID/quota entity ID write | 1,000 writes | $0.075 | source |
| Pro and Enterprise | Custom Reporting — reporting endpoint query | 1,000 queries | $5 | source |
| Pro and Enterprise | Team-wide provider allowlist surcharge | 1,000 successful requests | $0.1 | source |
| Pro and Enterprise | Team-wide Zero Data Retention (ZDR) surcharge | 1,000 requests | $0.1 | source |
Capabilities
- Supported actions
- unified_chat_completions, openai_compatible_api, anthropic_messages_api_compatible, openai_responses_api_compatible, model_routing, automatic_fallback, load_balancing, prompt_caching, spend_limits, budgets, rate_limiting, observability_logging, tracing, byo_provider_keys, virtual_keys, provider_allowlist, zero_data_retention, disallow_prompt_training, routing_rules, model_rewrite_rules, model_deny_rules, custom_reporting, spend_reporting_by_user_model_tag, embeddings, reranking, image_generation, video_generation, speech_tts_stt, realtime_speech_to_speech, web_search_augmentation, dynamic_model_discovery, oidc_authentication, provider_timeout_config, service_tier_selection [8]
- Input types
- chat completions, text completions, embeddings, reranking, image generation, video generation, audio/speech, realtime speech-to-speech [9]
- Output types
- streaming (SSE), JSON, OpenAI-compatible response, Anthropic-compatible response, OpenAI Responses API-compatible response
- Webhooks
- ✓ Yes [10]
- Sandbox / test mode
- ✗ No
- SDK languages
- TypeScript/JavaScript, TypeScript/JavaScript (OpenAI-compatible drop-in), Python (OpenAI-compatible drop-in), TypeScript (Anthropic-compatible drop-in) [11]
- MCP server
- ✓ Yes [12]
Trust & compliance
- SOC 2
- SOC 2 Type II [13]
- HIPAA
- ✓ Yes [14]
- GDPR
- ✓ Yes [15]
- ISO 27001
- ✓ Yes [16]
- PCI DSS
- ✓ Yes [17]
- Published SLA
- ✓ Yes [18]
- Rate limits
- Free tier requests are rate-limited per model with lower limits than the paid tier; a 429 is returned when exceeded. Purchasing AI Gateway Credits moves the team to the paid tier with higher rate limits. Vercel states it places no rate limits of its own on paid-tier queries - upstream provider limits may apply. Specific numeric RPM/TPM figures are not publicly documented. [19]
- Known restrictions
- BYOK is only available on the paid tier (requires purchased AI Gateway Credits), Per-request zero data retention (ZDR) and provider allowlist are Pro and Enterprise only, Team-wide provider allowlist incurs $0.10 per 1,000 successful requests surcharge (Pro and Enterprise), Team-wide Zero Data Retention incurs $0.10 per 1,000 requests surcharge (Pro and Enterprise), Custom Reporting API endpoint (/v1/report) is not available on Hobby or Pro-trial plans, Custom Reporting incurs $0.075 per 1,000 tag/user ID/quota entity ID writes and $5 per 1,000 queries, Routing rules feature is in beta and may change before GA, Free tier limited to subset of models, not full catalog, Free monthly credit ($5) ceases once the first credit purchase is made [20]
Developer surface
Integration
- API style
- rest
- Base URL
- https://ai-gateway.vercel.sh/v1
- Version
- v1
- Versioning
- url
- Stability
- ga
- Auth methods
- api_key, jwt
- Error format
- openai-compatible
Adoption & maturity
- Launched
- 2025-05-20
- GA
- 2025-08-21
- Notable customers
- v0.app, Browserbase, Perplexity
Other AI Gateway & LLM Routing APIs
Portkey
"Production Stack for Gen AI Builders"
Bifrost (Maxim AI)
"The fastest, most resilient, enterprise-grade LLM, MCP, and agent gateway."
Cloudflare AI Gateway
"Connect to any model, dynamically route requests, and manage usage, billing, and logs from one unified gateway."
TrueFoundry AI Gateway
"A unified AI gateway to securely manage and govern AI across 1600+ models with policy control, real-time monitoring, and up to 30% cost reduction."
Helicone
"Open-source LLM observability and monitoring platform for developers" - routes, debugs, and analyzes AI applications with access to 100+ models through one API with built-in observability, automatic fallbacks, and zero markup pricing.
OpenRouter
"The Unified Interface For LLMs" - OpenRouter scouts for the best prices, the lowest latencies, and the highest throughput across dozens of providers, offering a single OpenAI-compatible API with automatic fallback, model routing, and unified billing.
References
- ↑Description: vercel.com
- ↑Pricing model: vercel.com · vercel.com
- ↑Published pricing: vercel.com
- ↑Free tier: vercel.com · vercel.com
- ↑Free tier details: vercel.com · vercel.com
- ↑Self-serve signup: vercel.com
- ↑Enterprise plan: vercel.com
- ↑Supported actions: vercel.com · vercel.com · vercel.com
- ↑Input types: vercel.com · vercel.com
- ↑Webhooks: vercel.com
- ↑SDK languages: vercel.com · vercel.com
- ↑MCP server: vercel.com
- ↑SOC 2: vercel.com
- ↑HIPAA: vercel.com
- ↑GDPR: vercel.com
- ↑ISO 27001: vercel.com
- ↑PCI DSS: vercel.com
- ↑Published SLA: vercel.com
- ↑Rate limits: vercel.com · vercel.com
- ↑Known restrictions: vercel.com · vercel.com · vercel.com
Change history
- 2026-06-21 Capabilities: {} → {"observability":true,"spend_controls":true,"fallback_routing":true}
- 2026-06-21 Summary Md: (none) → Vercel AI Gateway is a unified REST API that routes requests to hundreds of AI …
- 2026-06-21 Score Agent Friendliness: (none) → 55
- 2026-06-21 Score Pricing Transparency: (none) → 75
- 2026-06-21 Score Setup Speed: (none) → 85
- 2026-06-21 Score Docs Quality: (none) → 55
- 2026-06-21 Score Procurement Friction: (none) → 90
- 2026-06-21 Score Trust Readiness: (none) → 100
- 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Llms Txt URL: (none) → https://vercel.com/llms.txt
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 Has Structured Data: (none) → No
- 2026-06-21 Robots Allows Agents: (none) → Yes
- 2026-06-21 API Reference URL: (none) → https://vercel.com/login?next=%2Freference
- 2026-06-21 Status Page URL: (none) → https://status.vercel.com
- 2026-06-21 Changelog URL: (none) → https://vercel.com/changelog
- 2026-06-21 Docs URL: (none) → https://vercel.com/docs
- 2026-06-21 Llms Txt Present: (none) → Yes
- 2026-06-21 Free Tier Details: set to $5 of AI Gateway Credits every 30 days for teams that have not purchased credit…
- 2026-06-21 Self Serve Signup: set to Yes
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to Yes
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 ISO 27001: set to Yes
- 2026-06-21 PCI DSS: set to Yes
- 2026-06-21 SLA Published: set to Yes
- 2026-06-21 SLA URL: set to https://vercel.com/legal/sla
- 2026-06-21 Data Retention Policy URL: set to https://vercel.com/docs/ai-gateway/security-and-compliance/zdr
- 2026-06-21 Documented Rate Limits: set to Free tier requests are rate-limited per model with lower limits than the paid t…
- 2026-06-21 Known Restrictions: set to BYOK is only available on the paid tier (requires purchased AI Gateway Credits)…
- 2026-06-21 Auth Methods: set to api_key, jwt
- 2026-06-21 Auth Docs URL: set to https://vercel.com/docs/ai-gateway/authentication-and-byok/authentication
- 2026-06-21 API Style: set to rest
- 2026-06-21 Base URL: set to https://ai-gateway.vercel.sh/v1
- 2026-06-21 API Version: set to v1
- 2026-06-21 Versioning Scheme: set to url
- 2026-06-21 Stability: set to ga
- 2026-06-21 Deprecation Policy URL: set to https://vercel.com/docs/release-phases
- 2026-06-21 MCP URL: set to https://mcp.vercel.com
- 2026-06-21 Quickstart URL: set to https://vercel.com/docs/ai-gateway/getting-started/text
- 2026-06-21 Error Format: set to openai-compatible
- 2026-06-21 Requires Verification: set to No
- 2026-06-21 Price Basis: set to token
- 2026-06-21 Slug: set to vercel-ai-gateway
- 2026-06-21 Launched At: set to 2025-05-20
- 2026-06-21 GA Date: set to 2025-08-21
- 2026-06-21 Notable Customers: set to v0.app, Browserbase, Perplexity
Suggest an edit / leave a review
Leave a review or comment
curl -X POST https://apio.sh/api/feedback/vercel-ai-gateway \
-H 'Content-Type: application/json' \
-d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'Suggest a correction to a field (cite a source)
curl -X POST https://apio.sh/api/suggest/vercel-ai-gateway/FIELD \
-H 'Content-Type: application/json' \
-d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'