ElevenLabs Scribe (Speech to Text)

"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages." [1]

Speech-to-Text & Transcription APIs

elevenlabs.io/speech-to-text · By ElevenLabs · Last verified 2026-06-21 · Source confidence: high

ElevenLabs Scribe is a REST speech-to-text API supporting batch and real-time transcription across 90+ languages, with sub-150 ms latency for streaming use cases. It covers speaker diarization, word and character timestamps, entity detection and redaction, multichannel processing, and keyterm prompting, making it suitable for podcasts, video captioning, meeting documentation, and AI agent integrations. Pay-as-you-go pricing starts at $0.22 per hour of audio for batch transcription; the free tier includes 4.5 hours of batch transcription per month, signup is self-serve, and an enterprise plan is available. The profile lists SOC 2 Type II, HIPAA, GDPR, ISO 27001, and PCI DSS under trust and compliance, and the service ships SDKs for Python, Node.js, Swift, Kotlin, and Flutter.

Best for / Avoid if

Best for: Prototypes and side projects - free to start, no sales call; Regulated or enterprise workloads - compliance attestations and an enterprise plan; AI agents and automation - an agent-ready surface (MCP / llms.txt)

Pricing & procurement

Pricing model: Hybrid (base + usage) [2]
Published pricing: Yes [3]
Free tier: Yes [4]
Free tier details: Free plan includes 4 hours 30 minutes/month of Scribe v1/v2 transcription and 2 hours 30 minutes/month of Scribe v2 Realtime transcription at no cost (recurring monthly allowance, shared with other platform features via the 10,000 credit pool).
Self-serve signup: Yes
Requires sales call: No
Enterprise plan: Yes [5]

Published prices
Plan	Item	Per / included allowance	Amount
Pay As You Go	Scribe v1/v2 batch transcription	hour of audio	$0.22
Pay As You Go	Scribe v2 Realtime transcription	hour of audio	$0.39
Pay As You Go	Entity detection add-on	hour of audio	$0.07
Pay As You Go	Keyterm prompting add-on	hour of audio	$0.05
Free	Monthly plan fee	month	$0
Free	Scribe v1/v2 included transcription	4.5 hours/month included	$0
Free	Scribe v2 Realtime included transcription	2.5 hours/month included	$0
Starter	Monthly plan fee	month	$6
Starter	Scribe v1/v2 included transcription	4.5 hours/month included	$0
Starter	Scribe v2 Realtime included transcription	2.5 hours/month included	$0
Creator	Monthly plan fee	month	$22
Creator	Scribe v1/v2 included transcription	27 hours/month included	$0
Creator	Scribe v2 Realtime included transcription	15 hours/month included	$0
Pro	Monthly plan fee	month	$99
Pro	Scribe v1/v2 included transcription	100 hours/month included	$0
Pro	Scribe v2 Realtime included transcription	56 hours/month included	$0
Scale	Monthly plan fee	month	$299
Scale	Scribe v1/v2 included transcription	450 hours/month included	$0
Scale	Scribe v2 Realtime included transcription	254 hours/month included	$0
Business	Monthly plan fee	month	$990
Business	Scribe v1/v2 included transcription	1359 hours/month included	$0
Business	Scribe v2 Realtime included transcription	767 hours/month included	$0
Enterprise	Scribe v1/v2 included transcription	4500 hours/month included (example volume)	$0
Enterprise	Scribe v2 Realtime included transcription	2538 hours/month included (example volume)	$0
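
The Pay As You Go rows above can be turned into a rough cost estimate. The sketch below is illustrative only: the function name estimate_cost is hypothetical, it assumes add-on rates simply stack on top of the base hourly rate, and it ignores any hours included in a monthly plan. Whether each add-on is available in realtime mode is not stated in this profile.

```python
from decimal import Decimal

# Pay As You Go rates in USD per hour of audio, copied from the table above.
RATES = {
    "batch": Decimal("0.22"),     # Scribe v1/v2 batch transcription
    "realtime": Decimal("0.39"),  # Scribe v2 Realtime transcription
}
ADDONS = {
    "entity_detection": Decimal("0.07"),
    "keyterm_prompting": Decimal("0.05"),
}


def estimate_cost(hours, mode="batch", addons=()):
    """Estimate USD cost for `hours` of audio.

    Assumption: add-on rates are added to the base hourly rate.
    Plan allowances (included hours) are not taken into account.
    """
    per_hour = RATES[mode] + sum((ADDONS[a] for a in addons), Decimal("0"))
    return (Decimal(str(hours)) * per_hour).quantize(Decimal("0.01"))


print(estimate_cost(100))              # 100 h of batch audio
print(estimate_cost(100, "realtime"))  # 100 h of realtime audio
print(estimate_cost(10, "batch", ["entity_detection", "keyterm_prompting"]))
```

Decimal is used instead of float so that per-hour prices such as 0.22 multiply out to exact cent values.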

Capabilities

Real-time streaming
Speaker diarization
PII redaction

Supported actions: transcribe_batch, transcribe_streaming, speaker_diarization, language_detection, word_timestamps, character_timestamps, entity_detection, entity_redaction, keyterm_prompting, dynamic_audio_tagging, no_verbatim_mode, multichannel_processing, webhook_delivery, voice_activity_detection, manual_commit_control
Regions: US, EU, India, Singapore [6]
Languages: Afrikaans, Amharic, Arabic, Armenian, Assamese, Asturian, Azerbaijani, Belarusian, Bengali, Bosnian, Bulgarian, Burmese, Cantonese, Catalan, Central Kurdish, Chichewa, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, Fulah, Galician, Ganda, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kabuverdianu, Kannada, Kazakh, Khmer, Korean, Kyrgyz, Lao, Latvian, Lingala, Lithuanian, Luo, Luxembourgish, Macedonian, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Northern Sotho, Norwegian, Occitan, Oriya, Pashto, Pedi, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Shona, Sindhi, Slovak, Slovenian, Somali, Spanish, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Umbundu, Urdu, Uzbek, Vietnamese, Welsh, Wolof, Xhosa, Zulu [7]
Input types: audio/aac, audio/aiff, audio/ogg, audio/mpeg (MP3), audio/opus, audio/wav, audio/flac, audio/m4a, audio/webm, video/mp4, video/avi, video/mkv, video/quicktime (MOV), video/wmv, video/x-flv, video/mpeg, video/3gpp, file upload (up to 3 GB), cloud storage URL (up to 2 GB), YouTube URL, TikTok URL, WebSocket PCM stream (8–48 kHz), WebSocket μ-law stream (ulaw_8000) [8]
Output types: JSON (word-level timestamps, speaker IDs, confidence scores), plain text, SRT, DOCX, HTML, PDF, segmented JSON, partial_transcript (streaming), committed_transcript (streaming), committed_transcript_with_timestamps (streaming)
Webhooks: Yes [9]
Sandbox / test mode: No [10]
SDK languages: Python, Node.js, Swift, Kotlin, Flutter [11]
MCP server: Yes [12]

Trust & compliance

SOC 2: SOC 2 Type II [13]
HIPAA: Yes [14]
GDPR: Yes [15]
ISO 27001: Yes [16]
PCI DSS: Yes [17]
Published SLA: No [18]
Rate limits: Concurrency for Scribe v1/v2 batch: min(4, round_up(audio_duration_secs/480)). Files over 8 minutes are chunked into 4 parallel segments. Scribe v2 Realtime: 30+ concurrent streams on Business plans; enterprise plans include elevated limits. Response headers expose current-concurrent-requests and maximum-concurrent-requests. HTTP 429 is returned on rate_limit_exceeded or concurrent_limit_exceeded. [19]
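
As a worked example of the batch concurrency formula, the sketch below evaluates min(4, round_up(audio_duration_secs/480)) and reads the two concurrency headers named above. Treating the formula's result as the number of concurrency slots one request occupies is an interpretation, not something this profile states outright. The function names are hypothetical, and the headers are passed as a plain dict with lowercase keys for simplicity.

```python
import math


def batch_concurrency(audio_duration_secs):
    """Evaluate min(4, round_up(audio_duration_secs / 480)) from the rate-limit field."""
    return min(4, math.ceil(audio_duration_secs / 480))


def remaining_slots(headers):
    """Free concurrency slots, computed from the documented response headers.

    `headers` is assumed to be a dict with lowercase header names.
    """
    current = int(headers["current-concurrent-requests"])
    maximum = int(headers["maximum-concurrent-requests"])
    return max(0, maximum - current)


for secs in (60, 480, 481, 3600):
    print(secs, "seconds ->", batch_concurrency(secs))

print(remaining_slots({
    "current-concurrent-requests": "3",
    "maximum-concurrent-requests": "10",
}))
```

A client could check remaining_slots before submitting more work and back off when it reaches zero or when the API returns HTTP 429.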
Known restrictions: [20]
Maximum file size: 3 GB (file upload) or 2 GB (cloud storage URL)
Maximum audio duration: 10 hours (standard), 1 hour (multichannel)
Minimum audio duration: 100 ms
Maximum channels in multichannel mode: 5
Maximum speakers for diarization: 32
Maximum keyterms: 1,000 per request (batch); 50 (realtime)
Keyterm maximum length: under 50 characters and at most 5 words (batch); up to 20 characters (realtime)
Scribe v1 is deprecated, with removal scheduled for July 9, 2026
Zero Retention Mode (enable_logging=false) is enterprise-only
Data residency (EU, India, Singapore) is an enterprise-only feature
HIPAA support requires a BAA with ElevenLabs Sales and Zero Retention Mode enabled
Speaker diarization is not available on Scribe v2 Realtime
Dual channel is not supported on Scribe v2 Realtime
Entity detection and redaction incur additional cost, as does speaker role detection
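
The keyterm limits in the known restrictions can be checked client-side before a request is sent. This is a minimal sketch, not part of any ElevenLabs SDK: validate_keyterms and the LIMITS table are hypothetical names, "under 50 characters" is read as at most 49, and words are counted by splitting on whitespace.

```python
# Keyterm limits taken from the known restrictions in this profile.
LIMITS = {
    "batch": {"max_terms": 1000, "max_chars": 49, "max_words": 5},
    "realtime": {"max_terms": 50, "max_chars": 20, "max_words": None},
}


def validate_keyterms(keyterms, mode="batch"):
    """Return a list of problems; an empty list means the keyterms pass."""
    limits = LIMITS[mode]
    problems = []
    if len(keyterms) > limits["max_terms"]:
        problems.append(
            f"too many keyterms: {len(keyterms)} > {limits['max_terms']}"
        )
    for term in keyterms:
        if len(term) > limits["max_chars"]:
            problems.append(f"{term!r} exceeds {limits['max_chars']} characters")
        # No word limit is listed for realtime, so it is skipped there.
        if limits["max_words"] is not None and len(term.split()) > limits["max_words"]:
            problems.append(f"{term!r} has more than {limits['max_words']} words")
    return problems


print(validate_keyterms(["ElevenLabs", "Scribe v2"]))
print(validate_keyterms(["one two three four five six"]))
print(validate_keyterms(["x" * 21], mode="realtime"))
```

Returning a list of problems rather than raising on the first one lets a caller report every offending keyterm in a single pass.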

Developer surface

Docs rendering: static · llms.txt present

Integration

API style: rest
Base URL: https://api.elevenlabs.io
Version: v1
Versioning: url
Stability: ga
Auth methods: api_key
Idempotency keys: No
Error format: vendor-specific
Webhook signing: hmac
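
Since webhook signing is listed as hmac, a receiver would typically recompute the signature over the request body and compare it in constant time. The sketch below shows that general pattern only. The digest (SHA-256), the hex encoding, and signing the raw body on its own are all assumptions; this profile does not specify the header name or the signed payload format, so check the vendor's webhook documentation before relying on it.

```python
import hashlib
import hmac


def sign(secret: bytes, payload: bytes) -> str:
    """Hex HMAC of the payload. SHA-256 is an assumption, not confirmed by this profile."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(secret: bytes, payload: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign(secret, payload)
    return hmac.compare_digest(expected, received_signature)


# Hypothetical secret and body, for illustration only.
secret = b"example-webhook-secret"
body = b'{"status": "done"}'
signature = sign(secret, body)

print(verify(secret, body, signature))
print(verify(secret, b'{"status": "tampered"}', signature))
```

hmac.compare_digest is used instead of == so that the comparison time does not leak how many leading characters of the signature matched.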

SDKs

Python elevenlabs
Node.js @elevenlabs/elevenlabs-js
Swift
Kotlin io.elevenlabs:elevenlabs-android
Flutter elevenlabs_agents

Adoption & maturity

Launched: 2025-02-26
GA: 2025-02-26
Notable customers: Revolut, Klarna, Washington Post, Deutsche Telekom, HarperCollins

Other Speech-to-Text & Transcription APIs

Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Usage · free tier · public pricing · self-serve
Amazon Transcribe
"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."
Usage · free tier · public pricing · self-serve
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
Usage · free tier · public pricing · self-serve
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
Usage · free tier · public pricing · self-serve
AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."
Usage · public pricing · self-serve
Speechmatics
"Low-latency speech-to-text for multilingual, multi-speaker conversations."
Usage · free tier · public pricing · self-serve


References

Each field above carries a numbered source.

[1] Description: elevenlabs.io · elevenlabs.io
[2] Pricing model: elevenlabs.io · elevenlabs.io
[3] Published pricing: elevenlabs.io
[4] Free tier: elevenlabs.io · elevenlabs.io
[5] Enterprise plan: elevenlabs.io
[6] Regions: elevenlabs.io
[7] Languages: elevenlabs.io · elevenlabs.io
[8] Input types: elevenlabs.io
[9] Webhooks: elevenlabs.io
[10] Sandbox: elevenlabs.io
[11] SDK languages: elevenlabs.io
[12] MCP server: elevenlabs.io
[13] SOC 2: compliance.elevenlabs.io · elevenlabs.io
[14] HIPAA: elevenlabs.io · elevenlabs.io
[15] GDPR: elevenlabs.io
[16] ISO 27001: compliance.elevenlabs.io · elevenlabs.io
[17] PCI DSS: compliance.elevenlabs.io · elevenlabs.io
[18] Published SLA: elevenlabs.io
[19] Rate limits: elevenlabs.io
[20] Known restrictions: elevenlabs.io · elevenlabs.io

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

2026-06-21 Capabilities: {} → {"pii_redaction":true,"real_time_streaming":true,"speaker_diarization":true}
2026-06-21 Summary Md: (none) → ElevenLabs Scribe is a REST speech-to-text API supporting batch and real-time t…
2026-06-21 Score Agent Friendliness: (none) → 65
2026-06-21 Score Pricing Transparency: (none) → 100
2026-06-21 Score Setup Speed: (none) → 85
2026-06-21 Score Docs Quality: (none) → 55
2026-06-21 Score Procurement Friction: (none) → 100
2026-06-21 Score Trust Readiness: (none) → 80
2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
2026-06-21 Rendering: (none) → static
2026-06-21 Llms Txt URL: (none) → https://elevenlabs.io/llms.txt
2026-06-21 Has Structured Data: (none) → Yes
2026-06-21 Robots Allows Agents: (none) → Yes
2026-06-21 API Reference URL: (none) → https://elevenlabs.io/api
2026-06-21 Status Page URL: (none) → https://status.elevenlabs.io
2026-06-21 Changelog URL: (none) → https://elevenlabs.io/changelog
2026-06-21 Docs URL: (none) → https://elevenlabs.io/docs/overview/intro
2026-06-21 Llms Txt Present: (none) → Yes
2026-06-21 Free Tier Details: set to Free plan includes 4 hours 30 minutes/month of Scribe v1/v2 transcription and 2…
2026-06-21 Self Serve Signup: set to Yes
2026-06-21 Requires Sales Call: set to No
2026-06-21 Enterprise Plan Available: set to Yes
2026-06-21 SOC 2: set to type_2
2026-06-21 HIPAA: set to Yes
2026-06-21 GDPR: set to Yes
2026-06-21 ISO 27001: set to Yes
2026-06-21 PCI DSS: set to Yes
2026-06-21 SLA Published: set to No
2026-06-21 Data Retention Policy URL: set to https://elevenlabs.io/dpa
2026-06-21 Documented Rate Limits: set to Concurrency for Scribe v1/v2 batch: min(4, round_up(audio_duration_secs/480)). …
2026-06-21 Source Confidence: set to high
2026-06-21 Extractor: set to claude-subagent:sonnet
2026-06-21 Last Verified At: set to 2026-06-21T00:00:00.000Z
2026-06-21 Known Restrictions: set to Maximum file size: 3 GB (file upload) or 2 GB (cloud storage URL), Maximum audi…
2026-06-21 Auth Methods: set to api_key
2026-06-21 Auth Docs URL: set to https://elevenlabs.io/docs/api-reference/introduction
2026-06-21 API Style: set to rest
2026-06-21 Base URL: set to https://api.elevenlabs.io
2026-06-21 API Version: set to v1
2026-06-21 Versioning Scheme: set to url
2026-06-21 Stability: set to ga
2026-06-21 Deprecation Policy URL: set to https://elevenlabs.io/docs/developers/best-practices/breaking-changes-policy
2026-06-21 Quickstart URL: set to https://elevenlabs.io/docs/eleven-api/guides/cookbooks/speech-to-text
2026-06-21 Idempotency Supported: set to No
2026-06-21 Error Format: set to vendor-specific
2026-06-21 Webhook Signing: set to hmac
2026-06-21 Slug: set to elevenlabs-scribe
2026-06-21 Requires Verification: set to No
2026-06-21 Starting Price Usd: set to 0.22

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/elevenlabs-scribe \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/elevenlabs-scribe/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
