OpenAI Realtime API (gpt-realtime)
"The Realtime API enables low-latency, bidirectional audio communication for building voice agents and audio applications." [1]
The OpenAI Realtime API is a WebSocket-based service for low-latency, bidirectional speech-to-speech communication, aimed at developers building voice agents, real-time translation, live transcription, and call center automation. Pricing is usage-based, starting at $0.0192 per minute of audio input, with self-serve signup and no sales call required. The API supports function calling, voice activity detection, interruption handling, and inbound SIP telephony through a third-party carrier. The profile lists SOC 2 Type II, ISO 27001, and PCI DSS attestations along with HIPAA and GDPR support, and SDKs are available for Python, Node.js, Go, Java, Ruby, and .NET.
Best for / Avoid if
Best for: regulated or enterprise workloads - compliance attestations and an enterprise plan; AI agents and automation - an agent-ready surface (MCP server available); teams needing broad API coverage out of the box
Avoid if: You want to try it free before paying
Pricing & procurement
- Pricing model: Usage-based [2]
- Published pricing: ✓ Yes [3]
- Free tier: ✗ No
- Self-serve signup: ✓ Yes
- Requires sales call: ✗ No
- Enterprise plan: ✓ Yes [4]
| Plan | Item | Per | Amount |
|---|---|---|---|
| gpt-realtime-2 | Audio input tokens | 1M tokens | $32 |
| gpt-realtime-2 | Audio input tokens (cached) | 1M tokens | $0.4 |
| gpt-realtime-2 | Audio output tokens | 1M tokens | $64 |
| gpt-realtime-2 | Text input tokens | 1M tokens | $4 |
| gpt-realtime-2 | Text input tokens (cached) | 1M tokens | $0.4 |
| gpt-realtime-2 | Text output tokens | 1M tokens | $24 |
| gpt-realtime-2 | Image input tokens | 1M tokens | $5 |
| gpt-realtime-2 | Image input tokens (cached) | 1M tokens | $0.5 |
| gpt-realtime-mini | Text input tokens | 1M tokens | $0.6 |
| gpt-realtime-mini | Text input tokens (cached) | 1M tokens | $0.06 |
| gpt-realtime-mini | Text output tokens | 1M tokens | $2.4 |
| gpt-realtime-translate | Audio translation output | minute | $0.034 |
| gpt-realtime-whisper | Audio transcription output | minute | $0.017 |
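To make the per-token rows concrete, here is a minimal sketch that totals a session's cost from token counts, using only the gpt-realtime-2 prices in the table. The category keys (`audio_input`, `text_input_cached`, and so on), the `estimate_cost` helper, and the token counts in the sample session are all illustrative assumptions, not part of any OpenAI SDK or billing report.

```python
from decimal import Decimal

# USD per 1M tokens for gpt-realtime-2, copied from the pricing table.
# The dictionary keys are hypothetical labels chosen for this sketch.
PRICE_PER_MILLION = {
    "audio_input": Decimal("32"),
    "audio_input_cached": Decimal("0.4"),
    "audio_output": Decimal("64"),
    "text_input": Decimal("4"),
    "text_input_cached": Decimal("0.4"),
    "text_output": Decimal("24"),
    "image_input": Decimal("5"),
    "image_input_cached": Decimal("0.5"),
}


def estimate_cost(usage: dict[str, int]) -> Decimal:
    """Return the USD cost for a mapping of token category -> token count."""
    total = Decimal("0")
    for category, tokens in usage.items():
        if category not in PRICE_PER_MILLION:
            raise KeyError(f"no listed price for {category!r}")
        total += PRICE_PER_MILLION[category] * tokens / Decimal(1_000_000)
    return total


# Hypothetical session; the token counts are made up for illustration.
session = {
    "audio_input": 6_000,
    "audio_output": 12_000,
    "text_input": 2_000,
    "text_input_cached": 8_000,
}
print(f"Estimated cost: ${estimate_cost(session)}")
```

`Decimal` is used instead of floats so that small per-token amounts add up without rounding drift. Actual invoices may differ if the published prices change or if billing categories are split differently than assumed here.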
Capabilities
- Supported actions: realtime_conversation, speech_to_speech, inbound_telephony, sip_trunking, call_transfer, function_calling, tool_calling, interruption_handling, voice_activity_detection, live_transcription, realtime_translation, mcp_server_integration, websocket_streaming, webrtc_streaming, call_recording, webhook_events, reasoning_effort_control [5]
- Regions: North Europe, South Central US, East US 2, West US, EU Data Residency [6]
- Languages: English, Mandarin, Spanish, French, German, Japanese, Korean, Portuguese, Arabic, Hindi, Russian, Italian, Indonesian; 70+ additional input languages via Whisper-based transcription (gpt-realtime-whisper); 70+ input languages for translation via gpt-realtime-translate, with output limited to the 13 languages named at the start of this entry [7]
- Input types: audio stream (WebRTC), audio stream (WebSocket), SIP phone call, text [8]
- Output types: audio stream, text transcript, tool call results, webhook events, session events
- Webhooks: ✓ Yes [9]
- Sandbox / test mode: ✗ No
- SDK languages: Python, Node.js, .NET, Java, Go, Ruby; Agents SDK for Python and Node.js [10]
- MCP server: ✓ Yes [11]
Trust & compliance
- SOC 2: SOC 2 Type II [12]
- HIPAA: ✓ Yes [13]
- GDPR: ✓ Yes [14]
- ISO 27001: ✓ Yes [15]
- PCI DSS: ✓ Yes [16]
- Published SLA: ✓ Yes [17]
- Rate limits [18]
  - gpt-realtime-2: Tiers 1–5 scale from 200 to 20,000 RPM and from 40,000 to 15,000,000 TPM; audio is billed at 1 token per 100 ms of input and 1 token per 50 ms of output.
  - gpt-realtime-translate: 50–850 audio-minutes per minute, depending on tier.
- Known restrictions [19]
  - SIP support is inbound calls only (no outbound SIP calling).
  - SIP requires a third-party SIP trunking provider (e.g. Twilio); telephony is BYO carrier, not included.
  - gpt-realtime-translate and gpt-realtime-whisper support only translation/transcription session types, not full voice-agent conversations.
  - No BYO-LLM or BYO-voice support; only OpenAI models are available.
  - No sandbox/test environment; live API keys are required from the start.
  - The SLA (99.9% uptime) is available only to Scale Tier / enterprise customers, not standard pay-as-you-go users.
  - The beta interface was deprecated on September 15, 2025; the GA interface requires updated headers and event shapes.
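The audio billing rates in the rate-limits entry (1 token per 100 ms of input, 1 token per 50 ms of output) can be combined with the gpt-realtime-2 per-token prices to get a per-minute figure. The sketch below works through that arithmetic; the helper names are hypothetical, and it assumes uncached audio tokens at the listed $32 and $64 per 1M tokens.

```python
from decimal import Decimal

MS_PER_MINUTE = 60_000


def tokens_per_minute(ms_per_token: int) -> int:
    """Number of audio tokens billed for one minute of audio."""
    return MS_PER_MINUTE // ms_per_token


def usd_per_minute(ms_per_token: int, usd_per_million_tokens: str) -> Decimal:
    """Per-minute audio price derived from a per-1M-token price."""
    tokens = tokens_per_minute(ms_per_token)
    return Decimal(usd_per_million_tokens) * tokens / Decimal(1_000_000)


# Input: 1 token per 100 ms -> 600 tokens per minute at $32 per 1M tokens.
input_rate = usd_per_minute(100, "32")
# Output: 1 token per 50 ms -> 1,200 tokens per minute at $64 per 1M tokens.
output_rate = usd_per_minute(50, "64")

print(input_rate)   # 0.0192
print(output_rate)  # 0.0768
```

The input figure reproduces the $0.0192 per minute starting price quoted in the summary. Output audio works out to four times that rate per minute under the same assumptions.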
Developer surface
Integration
- API style: websocket
- Base URL: wss://api.openai.com/v1/realtime
- Version: gpt-realtime-2025-08-28
- Versioning: url
- Stability: ga
- Auth methods: api_key
- Idempotency keys: ✗ No
- Error format: vendor-specific
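As a rough illustration of how the base URL, URL-based versioning, and API-key auth listed above fit together, the sketch below builds the WebSocket URL and request headers without opening a connection. This profile does not spell out the exact parameter or header names, so the `model` query parameter and the `Authorization: Bearer` header are assumptions to verify against the official documentation, and `build_connection` is a hypothetical helper.

```python
import os
from urllib.parse import urlencode

BASE_URL = "wss://api.openai.com/v1/realtime"


def build_connection(model: str, api_key: str) -> tuple[str, dict[str, str]]:
    """Return the WebSocket URL and headers for a Realtime session.

    Assumption: the model version is selected with a `model` query
    parameter and the API key is sent as a Bearer token.
    """
    url = f"{BASE_URL}?{urlencode({'model': model})}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers


# The fallback key is a placeholder so the sketch runs without credentials.
api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
url, headers = build_connection("gpt-realtime-2025-08-28", api_key)
print(url)
```

The returned pair can be handed to whichever WebSocket client you use. Because no sandbox is listed, any real connection made this way would run against a live, billable API key.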
Adoption & maturity
- Launched: 2024-10-01
- GA: 2025-08-28
- Notable customers: Perplexity, Healthify, Speak, Zillow, Genspark
Other Realtime Voice Agent APIs
ElevenLabs Conversational AI (ElevenAgents)
"Deploy human-like Conversational AI in minutes. ElevenLabs delivers low latency interactions in dozens of languages with enterprise-grade security."
Telnyx Voice AI Agents
"Carrier-grade Voice AI from Telnyx. Sub-200ms latency, 80+ languages, A-level STIR/SHAKEN attestation, and one platform that replaces 5 vendors."
Twilio ConversationRelay
"Twilio's Conversation Relay empowers you to build powerful AI voice experiences for your customers. Let Twilio handle the heavy lifting of speech recognition, text-to-speech, and voice synthesis."
Vapi
"Build and deploy voice agents that deliver the outcomes you want at the scale your customers need."
Cartesia Line
"Turn any text agent into a world-class conversational agent, deployed anywhere."
Pipecat Cloud (Daily)
"The fastest path to production voice AI" - a managed hosting platform for deploying and scaling Pipecat agents in production with built-in infrastructure.
References
- [1] Description: developers.openai.com
- [2] Pricing model: developers.openai.com · developers.openai.com
- [3] Published pricing: developers.openai.com
- [4] Enterprise plan: openai.com
- [5] Supported actions: developers.openai.com · developers.openai.com
- [6] Regions: developers.openai.com
- [7] Languages: developers.openai.com · community.openai.com
- [8] Input types: developers.openai.com
- [9] Webhooks: developers.openai.com
- [10] SDK languages: developers.openai.com
- [11] MCP server: developers.openai.com
- [12] SOC 2: trust.openai.com
- [13] HIPAA: help.openai.com
- [14] GDPR: trust.openai.com
- [15] ISO 27001: trust.openai.com
- [16] PCI DSS: trust.openai.com
- [17] Published SLA: openai.com
- [18] Rate limits: developers.openai.com · developers.openai.com
- [19] Known restrictions: openai.com · developers.openai.com
Change history
- 2026-06-21 Capabilities: {} → {"multilingual":true}
- 2026-06-21 Summary Md: (none) → OpenAI Realtime API is a WebSocket-based service for low-latency, bidirectional…
- 2026-06-21 Avoid If: (none) → You want to try it free before paying
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Score Pricing Transparency: (none) → 85
- 2026-06-21 Score Setup Speed: (none) → 60
- 2026-06-21 Score Docs Quality: (none) → 50
- 2026-06-21 Score Agent Friendliness: (none) → 50
- 2026-06-21 Score Procurement Friction: (none) → 85
- 2026-06-21 Score Trust Readiness: (none) → 100
- 2026-06-21 Best For: (none) → Regulated or enterprise workloads - compliance attestations and an enterprise p…
- 2026-06-21 Llms Txt Present: (none) → No
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 Has Structured Data: (none) → No
- 2026-06-21 Robots Allows Agents: (none) → Yes
- 2026-06-21 API Reference URL: (none) → https://platform.openai.com/api/reference/overview
- 2026-06-21 Markdown Docs Served: (none) → Yes
- 2026-06-21 Markdown Docs URL: (none) → https://platform.openai.com/docs/guides/realtime.md
- 2026-06-21 Docs URL: (none) → https://developers.openai.com/api/docs
- 2026-06-21 MCP Server Available: set to Yes
- 2026-06-21 Pricing Model: set to usage_based
- 2026-06-21 Has Published Pricing: set to Yes
- 2026-06-21 Free Tier Available: set to No
- 2026-06-21 Self Serve Signup: set to Yes
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to Yes
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 ISO 27001: set to Yes
- 2026-06-21 PCI DSS: set to Yes
- 2026-06-21 SLA Published: set to Yes
- 2026-06-21 SLA URL: set to https://openai.com/api-scale-tier/
- 2026-06-21 Data Retention Policy URL: set to https://developers.openai.com/api/docs/guides/your-data
- 2026-06-21 Documented Rate Limits: set to gpt-realtime-2: Tier 1–5 scaling from 200–20,000 RPM and 40,000–15,000,000 TPM;…
- 2026-06-21 Known Restrictions: set to SIP support is inbound calls only (no outbound SIP calling), SIP requires a thi…
- 2026-06-21 Auth Methods: set to api_key
- 2026-06-21 API Version: set to gpt-realtime-2025-08-28
- 2026-06-21 Versioning Scheme: set to url
- 2026-06-21 Stability: set to ga
- 2026-06-21 Deprecation Policy URL: set to https://developers.openai.com/api/docs/deprecations
- 2026-06-21 MCP URL: set to https://developers.openai.com/mcp
- 2026-06-21 Quickstart URL: set to https://developers.openai.com/api/docs/guides/voice-agents
- 2026-06-21 Idempotency Supported: set to No
- 2026-06-21 Error Format: set to vendor-specific
- 2026-06-21 Requires Verification: set to No
- 2026-06-21 Starting Price Usd: set to 0.0192
- 2026-06-21 Price Basis: set to minute (audio input)
- 2026-06-21 Launched At: set to 2024-10-01
- 2026-06-21 GA Date: set to 2025-08-28
Suggest an edit / leave a review
Leave a review or comment
curl -X POST https://apio.sh/api/feedback/openai-realtime \
-H 'Content-Type: application/json' \
-d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'
Suggest a correction to a field (cite a source)
curl -X POST https://apio.sh/api/suggest/openai-realtime/FIELD \
-H 'Content-Type: application/json' \
-d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'