OpenAI Realtime API (gpt-realtime)

"The Realtime API enables low-latency, bidirectional audio communication for building voice agents and audio applications." [1]

Realtime Voice Agent APIs

platform.openai.com/docs/guides/realtime · By OpenAI · Last verified 2026-06-21 · Source confidence: high

OpenAI Realtime API is a service for low-latency, bidirectional speech-to-speech communication over WebSocket, WebRTC, or SIP. It targets developers building voice agents, real-time translation, live transcription, and call center automation. Pricing is usage-based, starting at $0.0192 per minute of audio input (gpt-realtime-2 audio input at $32 per 1M tokens, billed at 1 token per 100 ms), with self-serve signup and no sales call required. The API supports function calling, voice activity detection, interruption handling, and inbound SIP telephony through a third-party SIP trunking provider. The profile lists SOC 2 Type II, ISO 27001, and PCI DSS attestations along with HIPAA and GDPR support, and SDKs are available for Python, Node.js, Go, Java, Ruby, and .NET.

Best for / Avoid if

Best for: regulated or enterprise workloads - compliance attestations and an enterprise plan; AI agents and automation - an agent-ready surface (MCP server available; llms.txt is not present); teams needing broad API coverage out of the box

Avoid if: You want to try it free before paying

Pricing & procurement

Pricing model: Usage-based [2]
Published pricing: Yes [3]
Free tier: No
Self-serve signup: Yes
Requires sales call: No
Enterprise plan: Yes [4]

Published prices
Model	Item	Unit	Price (USD)
gpt-realtime-2	Audio input tokens	1M tokens	$32.00
gpt-realtime-2	Audio input tokens (cached)	1M tokens	$0.40
gpt-realtime-2	Audio output tokens	1M tokens	$64.00
gpt-realtime-2	Text input tokens	1M tokens	$4.00
gpt-realtime-2	Text input tokens (cached)	1M tokens	$0.40
gpt-realtime-2	Text output tokens	1M tokens	$24.00
gpt-realtime-2	Image input tokens	1M tokens	$5.00
gpt-realtime-2	Image input tokens (cached)	1M tokens	$0.50
gpt-realtime-mini	Text input tokens	1M tokens	$0.60
gpt-realtime-mini	Text input tokens (cached)	1M tokens	$0.06
gpt-realtime-mini	Text output tokens	1M tokens	$2.40
gpt-realtime-translate	Audio translation output	minute	$0.034
gpt-realtime-whisper	Audio transcription output	minute	$0.017
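
As a rough illustration of how the per-token prices combine, the sketch below sums the cost of one session from the gpt-realtime-2 rows of the table above. The category keys (`audio_input`, `text_output`, and so on) and the sample token counts are hypothetical labels chosen for this example, not names from the API.

```python
# Prices in USD per 1M tokens, copied from the gpt-realtime-2 rows above.
# The dictionary keys are illustrative labels, not API field names.
PRICES_PER_1M = {
    "audio_input": 32.0,
    "audio_input_cached": 0.4,
    "audio_output": 64.0,
    "text_input": 4.0,
    "text_input_cached": 0.4,
    "text_output": 24.0,
}


def estimate_cost_usd(usage: dict[str, int]) -> float:
    """Sum price * tokens / 1M over every token category in `usage`."""
    total = 0.0
    for category, tokens in usage.items():
        if category not in PRICES_PER_1M:
            raise KeyError(f"no published price for {category!r}")
        total += PRICES_PER_1M[category] * tokens / 1_000_000
    return total


# Hypothetical session: these token counts are made up for illustration.
session = {"audio_input": 6_000, "audio_output": 12_000, "text_input": 2_000}
print(f"${estimate_cost_usd(session):.4f}")
```

Audio output dominates the total in this made-up session because its per-token price is twice that of audio input.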

Capabilities

Multilingual

Supported actions: realtime_conversation, speech_to_speech, inbound_telephony, sip_trunking, call_transfer, function_calling, tool_calling, interruption_handling, voice_activity_detection, live_transcription, realtime_translation, mcp_server_integration, websocket_streaming, webrtc_streaming, call_recording, webhook_events, reasoning_effort_control [5]
Source excerpt (developers.openai.com/api/docs/guides/realtime): "SIP: Use for telephony voice agents. WebRTC: Use for browser and mobile clients that capture or play audio directly. WebSocket: Use when your server already receives raw audio from a media pipeline."
Source excerpt (developers.openai.com/api/docs/guides/realtime-sip): "The Realtime API supports inbound phone calls through SIP (Session Initiation Protocol). A SIP trunking provider converts telephone calls to IP traffic, which OpenAI routes to the Realtime API via a webhook system."
Regions: North Europe, South Central US, East US 2, West US, EU Data Residency [6]
Languages: English, Mandarin, Spanish, French, German, Japanese, Korean, Portuguese, Arabic, Hindi, Russian, Italian, Indonesian; 70+ additional input languages for Whisper-based transcription via gpt-realtime-whisper; 70+ input languages for translation via gpt-realtime-translate, with output limited to the 13 languages listed here [7]
Input types: audio stream (WebRTC), audio stream (WebSocket), SIP phone call, text [8]
Output types: audio stream, text transcript, tool call results, webhook events, session events
Webhooks: Yes [9]
Sandbox / test mode: No
SDK languages: Python, Node.js, .NET, Java, Go, Ruby, Python (Agents SDK), Node.js (Agents SDK) [10]
MCP server: Yes [11]

Trust & compliance

SOC 2: SOC 2 Type II [12]
HIPAA: Yes [13]
GDPR: Yes [14]
ISO 27001: Yes [15]
PCI DSS: Yes [16]
Published SLA: Yes [17]
Rate limits: gpt-realtime-2: Tier 1–5 scaling from 200–20,000 RPM and 40,000–15,000,000 TPM; audio billed at 1 token per 100ms input, 1 token per 50ms output. gpt-realtime-translate: 50–850 audio-minutes per minute depending on tier. [18]
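
The billing rates in the rate-limit note can be turned into a per-minute figure. The sketch below converts audio duration to tokens and prices it with the gpt-realtime-2 audio rates from the published price table; it reproduces the $0.0192 per minute starting price for input audio. Rounding partial windows up to a whole token is an assumption made for this example, not documented behavior.

```python
# Token rates from the rate-limit note: 1 token per 100 ms of input audio,
# 1 token per 50 ms of output audio.
INPUT_MS_PER_TOKEN = 100
OUTPUT_MS_PER_TOKEN = 50

# gpt-realtime-2 audio prices (USD per 1M tokens) from the published price table.
AUDIO_INPUT_USD_PER_1M = 32.0
AUDIO_OUTPUT_USD_PER_1M = 64.0


def audio_tokens(duration_ms: int, ms_per_token: int) -> int:
    """Ceiling division. Assumption: a partial window bills as a whole token."""
    return -(-duration_ms // ms_per_token)


def audio_cost_usd(input_ms: int, output_ms: int) -> float:
    """Price input and output audio durations at the gpt-realtime-2 rates."""
    in_tokens = audio_tokens(input_ms, INPUT_MS_PER_TOKEN)
    out_tokens = audio_tokens(output_ms, OUTPUT_MS_PER_TOKEN)
    total = in_tokens * AUDIO_INPUT_USD_PER_1M + out_tokens * AUDIO_OUTPUT_USD_PER_1M
    return total / 1_000_000


ONE_MINUTE_MS = 60_000
print(audio_tokens(ONE_MINUTE_MS, INPUT_MS_PER_TOKEN))   # tokens per minute of input
print(f"{audio_cost_usd(ONE_MINUTE_MS, 0):.4f}")         # USD per minute of input
```

One minute of output audio costs four times as much as one minute of input under these figures: it uses twice as many tokens, and each token costs twice as much.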
Known restrictions: SIP support is inbound calls only (no outbound SIP calling); SIP requires a third-party SIP trunking provider (e.g. Twilio) - telephony is BYO carrier, not included; gpt-realtime-translate and gpt-realtime-whisper support only translation/transcription session types, not full voice-agent conversations; no BYO-LLM or BYO-voice support - only OpenAI models available; no sandbox/test environment - live API keys required from the start; SLA (99.9% uptime) is only available to Scale Tier / enterprise customers, not standard pay-as-you-go users; Beta interface deprecated September 15, 2025, and the GA interface requires updated headers and event shapes [19]

Developer surface

Docs rendering: static · markdown variants served

Integration

API style: websocket
Base URL: wss://api.openai.com/v1/realtime
Version: gpt-realtime-2025-08-28
Versioning: url
Stability: ga
Auth methods: api_key
Idempotency keys: No
Error format: vendor-specific
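
To make the fields above concrete, here is a minimal sketch that assembles the URL and headers a WebSocket client would need. It assumes the model is selected with a `model` query parameter (consistent with the "Versioning: url" field) and that the API key is sent as a Bearer token; both details should be checked against the vendor documentation. The helper name `realtime_connection` is hypothetical, and no network connection is opened.

```python
import os
from urllib.parse import urlencode

BASE_URL = "wss://api.openai.com/v1/realtime"  # Base URL from the fields above


def realtime_connection(model: str, api_key: str) -> tuple[str, dict[str, str]]:
    """Return the (url, headers) pair a WebSocket client would need.

    Assumptions: the model goes in a `model` query parameter and the API key
    is sent as a Bearer token in the Authorization header.
    """
    url = f"{BASE_URL}?{urlencode({'model': model})}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers


url, headers = realtime_connection(
    model="gpt-realtime-2025-08-28",  # Version listed above
    api_key=os.environ.get("OPENAI_API_KEY", "sk-placeholder"),
)
print(url)
```

The pair can then be handed to whichever WebSocket client library the application already uses.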

SDKs

Python: openai
Node.js: openai
.NET: OpenAI
Java: openai-java
Go: openai-go
Ruby: openai
Python (Agents SDK): openai-agents
Node.js (Agents SDK): openai-agents-js

Adoption & maturity

Launched: 2024-10-01
GA: 2025-08-28
Notable customers: Perplexity, Healthify, Speak, Zillow, Genspark

Other Realtime Voice Agent APIs

ElevenLabs Conversational AI (ElevenAgents)
"Deploy human-like Conversational AI in minutes. ElevenLabs delivers low latency interactions in dozens of languages with enterprise-grade security."
Hybrid · free tier · public pricing · self-serve
Telnyx Voice AI Agents
"Carrier-grade Voice AI from Telnyx. Sub-200ms latency, 80+ languages, A-level STIR/SHAKEN attestation, and one platform that replaces 5 vendors."
Usage · public pricing · self-serve
Twilio ConversationRelay
"Twilio's Conversation Relay empowers you to build powerful AI voice experiences for your customers. Let Twilio handle the heavy lifting of speech recognition, text-to-speech, and voice synthesis."
Usage · public pricing · self-serve
Vapi
"Build and deploy voice agents that deliver the outcomes you want at the scale your customers need."
Usage · free tier · public pricing · self-serve
Cartesia Line
"Turn any text agent into a world-class conversational agent, deployed anywhere."
Hybrid · free tier · public pricing · self-serve
Pipecat Cloud (Daily)
"The fastest path to production voice AI" - a managed hosting platform for deploying and scaling Pipecat agents in production with built-in infrastructure.
Usage · public pricing · self-serve


References

Each field above carries a numbered source.

[1] Description: developers.openai.com
[2] Pricing model: developers.openai.com · developers.openai.com
[3] Published pricing: developers.openai.com
[4] Enterprise plan: openai.com
[5] Supported actions: developers.openai.com · developers.openai.com
[6] Regions: developers.openai.com
[7] Languages: developers.openai.com · community.openai.com
[8] Input types: developers.openai.com
[9] Webhooks: developers.openai.com
[10] SDK languages: developers.openai.com
[11] MCP server: developers.openai.com
[12] SOC 2: trust.openai.com
[13] HIPAA: help.openai.com
[14] GDPR: trust.openai.com
[15] ISO 27001: trust.openai.com
[16] PCI DSS: trust.openai.com
[17] Published SLA: openai.com
[18] Rate limits: developers.openai.com · developers.openai.com
[19] Known restrictions: openai.com · developers.openai.com

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

2026-06-21 Capabilities: {} → {"multilingual":true}
2026-06-21 Summary Md: (none) → OpenAI Realtime API is a WebSocket-based service for low-latency, bidirectional…
2026-06-21 Avoid If: (none) → You want to try it free before paying
2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
2026-06-21 Score Pricing Transparency: (none) → 85
2026-06-21 Score Setup Speed: (none) → 60
2026-06-21 Score Docs Quality: (none) → 50
2026-06-21 Score Agent Friendliness: (none) → 50
2026-06-21 Score Procurement Friction: (none) → 85
2026-06-21 Score Trust Readiness: (none) → 100
2026-06-21 Best For: (none) → Regulated or enterprise workloads - compliance attestations and an enterprise p…
2026-06-21 Llms Txt Present: (none) → No
2026-06-21 Rendering: (none) → static
2026-06-21 Has Structured Data: (none) → No
2026-06-21 Robots Allows Agents: (none) → Yes
2026-06-21 API Reference URL: (none) → https://platform.openai.com/api/reference/overview
2026-06-21 Markdown Docs Served: (none) → Yes
2026-06-21 Markdown Docs URL: (none) → https://platform.openai.com/docs/guides/realtime.md
2026-06-21 Docs URL: (none) → https://developers.openai.com/api/docs
2026-06-21 MCP Server Available: set to Yes
2026-06-21 Pricing Model: set to usage_based
2026-06-21 Has Published Pricing: set to Yes
2026-06-21 Free Tier Available: set to No
2026-06-21 Self Serve Signup: set to Yes
2026-06-21 Requires Sales Call: set to No
2026-06-21 Enterprise Plan Available: set to Yes
2026-06-21 SOC 2: set to type_2
2026-06-21 HIPAA: set to Yes
2026-06-21 GDPR: set to Yes
2026-06-21 ISO 27001: set to Yes
2026-06-21 PCI DSS: set to Yes
2026-06-21 SLA Published: set to Yes
2026-06-21 SLA URL: set to https://openai.com/api-scale-tier/
2026-06-21 Data Retention Policy URL: set to https://developers.openai.com/api/docs/guides/your-data
2026-06-21 Documented Rate Limits: set to gpt-realtime-2: Tier 1–5 scaling from 200–20,000 RPM and 40,000–15,000,000 TPM;…
2026-06-21 Known Restrictions: set to SIP support is inbound calls only (no outbound SIP calling), SIP requires a thi…
2026-06-21 Auth Methods: set to api_key
2026-06-21 API Version: set to gpt-realtime-2025-08-28
2026-06-21 Versioning Scheme: set to url
2026-06-21 Stability: set to ga
2026-06-21 Deprecation Policy URL: set to https://developers.openai.com/api/docs/deprecations
2026-06-21 MCP URL: set to https://developers.openai.com/mcp
2026-06-21 Quickstart URL: set to https://developers.openai.com/api/docs/guides/voice-agents
2026-06-21 Idempotency Supported: set to No
2026-06-21 Error Format: set to vendor-specific
2026-06-21 Requires Verification: set to No
2026-06-21 Starting Price Usd: set to 0.0192
2026-06-21 Price Basis: set to minute (audio input)
2026-06-21 Launched At: set to 2024-10-01
2026-06-21 GA Date: set to 2025-08-28

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/openai-realtime \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/openai-realtime/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
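
For agents that prefer not to shell out to curl, the same correction request can be sketched with the Python standard library. The endpoint path and payload keys mirror the curl command above; the helper name `build_suggestion` and the sample values are hypothetical, and `FIELD` stays a placeholder for a real key from the Agent JSON. The request is only built here, not sent.

```python
import json
from urllib.request import Request

API_ROOT = "https://apio.sh/api"  # host and path prefix from the curl examples


def build_suggestion(slug: str, field: str, value: str,
                     source_url: str, excerpt: str, note: str) -> Request:
    """Build (but do not send) a POST request proposing a field correction."""
    payload = {
        "value": value,
        "citations": [{"url": source_url, "excerpt": excerpt}],
        "note": note,
    }
    return Request(
        url=f"{API_ROOT}/suggest/{slug}/{field}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_suggestion(
    slug="openai-realtime",
    field="FIELD",  # placeholder, as in the curl example
    value="corrected value",
    source_url="https://source.example/page",
    excerpt="supporting quote",
    note="what changed and why",
)
print(req.get_method(), req.full_url)
# To submit, pass `req` to urllib.request.urlopen.
```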
