Speechmatics

"Low-latency speech-to-text for multilingual, multi-speaker conversations." [1]

www.speechmatics.com · By Speechmatics · Last verified 2026-06-21 · Source confidence: high

Speechmatics is a speech-to-text API that supports batch and real-time transcription across EU, US, and Australia regions. Its capabilities include speaker diarization, language detection, translation, summarization, and audio event detection, which suits contact center, legal, medical, and broadcast use cases. Pricing starts at $0.0022 per minute, with a free tier of 3,000 minutes per month and self-serve signup; enterprise plans add dedicated regional endpoints. The API is REST-based with SDKs for Python, Node.js, .NET, and Rust. Speechmatics reports SOC 2 Type II and ISO 27001 certification along with HIPAA and GDPR compliance.

Best for / Avoid if

Best for:
  • Prototypes and side projects: free to start, no sales call
  • Regulated or enterprise workloads: compliance attestations and an enterprise plan
  • Teams needing broad API coverage out of the box

Pricing & procurement

Pricing model
Usage-based [2]
Published pricing
Yes [3]
Free tier
Yes [4]
Free tier details
Free 3,000 minutes (50 hours) of STT per month; no credit card required. Free tier caps: 2 concurrent real-time sessions, 10 batch hours/month. (TTS free tier: 1 million characters/month - excluded from STT pricing scope.) [5]
Self-serve signup
Yes
Requires sales call
No
Enterprise plan
Yes [6]
Published prices
Plan | Item | Per | Amount
Free | Speech-to-text transcription (batch and real-time, Standard model) | 3,000 minutes (50 hours) per month included | $0
Pro | Batch transcription - Standard model | hour of audio | $0.80
Pro | Batch transcription - Enhanced model | hour of audio | $1.04
Pro | Real-time (streaming) transcription - Standard model | hour of audio | $1.04
Pro | Real-time (streaming) transcription - Enhanced model | hour of audio | $1.35
Pro | Volume discount on batch or real-time transcription above 500 hours/month | hour of audio (applied automatically above 500 hours/month per STT type) | 20%
Enterprise | Batch or real-time transcription | custom (contact sales; additional volume discounts from 24,000 hours/year) | -
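
To make the Pro rates above easier to reason about, here is a rough cost estimator. It is only a sketch: the function name and mode/model keys are invented for illustration, and it assumes the 20% discount applies only to the hours above 500 in a month for one STT type (the table does not say whether the discount is marginal or applies to the whole volume). The free-tier allowance is not modelled.

```python
from decimal import Decimal

# Pro rates in USD per hour of audio, copied from the price table above.
PRO_RATES = {
    ("batch", "standard"): Decimal("0.80"),
    ("batch", "enhanced"): Decimal("1.04"),
    ("realtime", "standard"): Decimal("1.04"),
    ("realtime", "enhanced"): Decimal("1.35"),
}
DISCOUNT_THRESHOLD_HOURS = Decimal("500")
DISCOUNT = Decimal("0.20")


def estimate_pro_cost(mode, model, hours):
    """Estimate one month's Pro cost in USD for a single STT type.

    ASSUMPTION: only the hours above 500 are discounted by 20%.
    """
    rate = PRO_RATES[(mode, model)]
    used = Decimal(str(hours))
    full_price_hours = min(used, DISCOUNT_THRESHOLD_HOURS)
    discounted_hours = max(used - DISCOUNT_THRESHOLD_HOURS, Decimal("0"))
    total = full_price_hours * rate + discounted_hours * rate * (1 - DISCOUNT)
    return total.quantize(Decimal("0.01"))
```

Under that assumption, 600 hours of Standard batch audio would cost 500 hours at $0.80 plus 100 hours at $0.64. If the discount instead covers the whole volume, the estimate above is an upper bound.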

Capabilities

  • Real-time streaming
  • Speaker diarization
  • Speech translation
  • Medical transcription
Supported actions
transcribe_batch, transcribe_streaming, speaker_diarization, speaker_identification, language_detection, word_timestamps, translation, summarization, sentiment_analysis, chapter_generation, custom_dictionary, audio_event_detection, text_to_speech, voice_agents_flow
Regions
EU (eu1.asr.api.speechmatics.com), EU2 - enterprise only (eu2.asr.api.speechmatics.com), US (us1.asr.api.speechmatics.com), US2 - enterprise only (us2.asr.api.speechmatics.com), Australia (au1.asr.api.speechmatics.com) [7]
Languages
Arabic, Bashkir, Basque, Belarusian, Bengali, Bulgarian, Cantonese, Catalan, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Interlingua, Irish, Italian, Japanese, Korean, Latvian, Lithuanian, Malay, Maltese, Mandarin, Marathi, Mongolian, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Slovakian, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Uyghur, Vietnamese, Welsh [8]
Input types
audio/wav, audio/mp3, audio/aac, audio/ogg, audio/mpeg, audio/amr, audio/m4a, video/mp4, audio/flac, PCM_S16LE (real-time WebSocket streaming) [9]
Output types
JSON (json-v2), Plain text (.txt), SRT subtitles, Word-level timestamps, Speaker diarization output, Alignment output (word_start_and_end, one_per_line), Translation output, Sentiment analysis, Summarization, Chapter markers
Webhooks
Yes [10]
Sandbox / test mode
No [11]
SDK languages
Python, Node.js, .NET, Rust [12]
MCP server
No
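
The regional hosts listed under Regions can be turned into a small lookup. The sketch below is illustrative only: the helper name is made up, and it assumes each regional host serves the same `/v2` path over HTTPS as the base URL given in the Integration section, which this profile does not state explicitly.

```python
# Regional hosts and their enterprise-only flag, copied from the Regions field.
REGION_HOSTS = {
    "eu1": ("eu1.asr.api.speechmatics.com", False),
    "eu2": ("eu2.asr.api.speechmatics.com", True),
    "us1": ("us1.asr.api.speechmatics.com", False),
    "us2": ("us2.asr.api.speechmatics.com", True),
    "au1": ("au1.asr.api.speechmatics.com", False),
}


def region_base_url(region, enterprise=False):
    """Return a base URL for a region, refusing enterprise-only regions
    unless the caller says it is on an enterprise plan.

    ASSUMPTION: regional hosts use https and the /v2 path prefix.
    """
    host, enterprise_only = REGION_HOSTS[region]
    if enterprise_only and not enterprise:
        raise ValueError(f"{region} is listed as enterprise only")
    return f"https://{host}/v2"
```

Keeping the enterprise-only flag next to the host makes a misconfigured region fail early in client code rather than at request time.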

Trust & compliance

SOC 2
SOC 2 Type II [13]
HIPAA
Yes [14]
GDPR
Yes [15]
ISO 27001
Yes [16]
PCI DSS
No [17]
Published SLA
No [18]
Rate limits
Batch: 10 new jobs/second (POST); 50 job status requests/second (GET); 20,000 concurrent jobs max. Free tier: 2 concurrent real-time sessions, 10 hours/month batch. Paid tier: 50 concurrent real-time sessions, 6,000 hours/month. File size limit: <1 GB per direct upload. Max session duration: 48 hours. Max 50 speaker identifiers. Translation: max 5 target languages per request. [19]
Known restrictions
  • Supported batch input formats are exhaustive: wav, mp3, aac, ogg, mpeg, amr, m4a, mp4, flac only
  • Raw audio formats without embedded codec cannot be processed in batch mode
  • File size limit: less than 1 GB when submitting directly in request body (larger files must use URL)
  • Maximum audio duration for batch jobs: 2 hours
  • Data retention: audio files and transcripts deleted after 7 days
  • Melia 1 model available in EU1 and US1 regions only
  • Pro tier capped at 6,000 hours/month; Enterprise for higher volumes
  • Maximum 50 speaker identifiers across all speakers
  • Translation: maximum 5 target languages per request
  • Real-time sessions auto-terminate after 48 hours, or 1 hour of no audio, or 3 minutes of no activity/pings [20]
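
Several of the batch restrictions above can be checked locally before a job is submitted. The following is a minimal sketch, not part of any Speechmatics SDK: the function and constant names are hypothetical, "1 GB" is read as 10^9 bytes, and the listed format names are treated as file extensions, which is an assumption.

```python
from pathlib import PurePath

# Limits copied from the Known restrictions field.
SUPPORTED_EXTENSIONS = {"wav", "mp3", "aac", "ogg", "mpeg", "amr", "m4a", "mp4", "flac"}
MAX_DIRECT_UPLOAD_BYTES = 10**9          # ASSUMPTION: 1 GB = 10^9 bytes
MAX_BATCH_DURATION_SECONDS = 2 * 60 * 60
MAX_TRANSLATION_TARGETS = 5


def preflight_batch_job(filename, size_bytes, duration_seconds, translation_targets=()):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    extension = PurePath(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes >= MAX_DIRECT_UPLOAD_BYTES:
        problems.append("file is 1 GB or larger: submit by URL instead of direct upload")
    if duration_seconds > MAX_BATCH_DURATION_SECONDS:
        problems.append("audio is longer than the 2 hour batch limit")
    if len(translation_targets) > MAX_TRANSLATION_TARGETS:
        problems.append("more than 5 translation target languages requested")
    return problems
```

A check like this only mirrors the published limits; the API remains the authority on what it accepts.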

Developer surface

Docs rendering: static

Integration

API style
rest
Base URL
https://asr.api.speechmatics.com/v2
Version
v2
Versioning
url
Stability
ga
Auth methods
api_key, jwt
Error format
vendor-specific
Rate limit
10 / second (new batch jobs via POST; job status GET requests are limited to 50 / second)
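
A client can stay under a per-second request limit by spacing its calls. The sketch below shows one simple way to do that with a fixed minimum interval; the `Pacer` class is hypothetical and is not taken from a Speechmatics SDK. The clock and sleep functions are injectable so the behaviour can be tested without waiting.

```python
import time


class Pacer:
    """Space calls at least 1/per_second seconds apart."""

    def __init__(self, per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self):
        """Block until the next call is allowed; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot relative to when this call actually runs.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

For example, `Pacer(10)` could guard job-creation POSTs and a separate `Pacer(50)` could guard status GETs, calling `wait()` before each request. Even spacing is more conservative than a burst-tolerant limiter, which seems a reasonable default when the exact enforcement window is not documented here.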

SDKs

  • Python speechmatics-batch
  • Python speechmatics-rt
  • Python speechmatics-voice
  • Python speechmatics-tts
  • Node.js @speechmatics/batch-client
  • Node.js @speechmatics/real-time-client
  • Node.js @speechmatics/flow-client
  • .NET
  • Rust

Adoption & maturity

Launched
2006-01-01
Notable customers
what3words, 3Play Media, Veritone, Deloitte UK, Vonage

Other Speech-to-Text & Transcription APIs

  • ElevenLabs Scribe (Speech to Text)

    "Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."

    Hybrid · free tier · public pricing · self-serve

  • Azure AI Speech to Text

    "Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."

    Usage · free tier · public pricing · self-serve

  • Amazon Transcribe

    "Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."

    Usage · free tier · public pricing · self-serve

  • Google Cloud Speech-to-Text

    "Accurate voice typing and transcription powered by Gemini."

    Usage · free tier · public pricing · self-serve

  • IBM watsonx Speech to Text

    "IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."

    Usage · free tier · public pricing · self-serve

  • AssemblyAI

    "Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."

    Usage · public pricing · self-serve

References

Each field above carries a numbered source.

  1. Description: speechmatics.com
  2. Pricing model: speechmatics.com
  3. Published pricing: speechmatics.com
  4. Free tier: speechmatics.com
  5. Free tier details: speechmatics.com · docs.speechmatics.com
  6. Enterprise plan: speechmatics.com
  7. Regions: docs.speechmatics.com
  8. Languages: speechmatics.com
  9. Input types: docs.speechmatics.com
  10. Webhooks: docs.speechmatics.com
  11. Sandbox: speechmatics.com
  12. SDK languages: docs.speechmatics.com
  13. SOC 2: speechmatics.com
  14. HIPAA: speechmatics.com
  15. GDPR: speechmatics.com
  16. ISO 27001: speechmatics.com
  17. PCI DSS: speechmatics.com
  18. Published SLA: speechmatics.com
  19. Rate limits: docs.speechmatics.com
  20. Known restrictions: docs.speechmatics.com · docs.speechmatics.com

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

  1. 2026-06-21 Capabilities: {} → {"medical":true,"translation":true,"real_time_streaming":true,"speaker_diarizat…
  2. 2026-06-21 Summary Md: (none) → Speechmatics is a speech-to-text API supporting batch and real-time transcripti…
  3. 2026-06-21 Score Docs Quality: (none) → 15
  4. 2026-06-21 Score Procurement Friction: (none) → 100
  5. 2026-06-21 Score Trust Readiness: (none) → 70
  6. 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
  7. 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
  8. 2026-06-21 Score Agent Friendliness: (none) → 30
  9. 2026-06-21 Score Pricing Transparency: (none) → 100
  10. 2026-06-21 Score Setup Speed: (none) → 85
  11. 2026-06-21 Llms Txt Present: (none) → No
  12. 2026-06-21 Has Structured Data: (none) → Yes
  13. 2026-06-21 Robots Allows Agents: (none) → Yes
  14. 2026-06-21 Status Page URL: (none) → https://status.speechmatics.com
  15. 2026-06-21 Docs URL: (none) → https://docs.speechmatics.com/
  16. 2026-06-21 Rendering: (none) → static
  17. 2026-06-21 Pricing Model: set to usage_based
  18. 2026-06-21 Has Published Pricing: set to Yes
  19. 2026-06-21 Free Tier Available: set to Yes
  20. 2026-06-21 Error Format: set to vendor-specific
  21. 2026-06-21 Free Tier Details: set to Free 3,000 minutes (50 hours) of STT per month; no credit card required. Free t…
  22. 2026-06-21 Self Serve Signup: set to Yes
  23. 2026-06-21 Requires Sales Call: set to No
  24. 2026-06-21 Enterprise Plan Available: set to Yes
  25. 2026-06-21 SOC 2: set to type_2
  26. 2026-06-21 HIPAA: set to Yes
  27. 2026-06-21 GDPR: set to Yes
  28. 2026-06-21 ISO 27001: set to Yes
  29. 2026-06-21 PCI DSS: set to No
  30. 2026-06-21 SLA Published: set to No
  31. 2026-06-21 Data Retention Policy URL: set to https://www.speechmatics.com/legal/privacy-policy
  32. 2026-06-21 Documented Rate Limits: set to Batch: 10 new jobs/second (POST); 50 job status requests/second (GET); 20,000 c…
  33. 2026-06-21 Rate Limit Requests: set to 10
  34. 2026-06-21 Rate Limit Window: set to second
  35. 2026-06-21 Known Restrictions: set to Supported batch input formats are exhaustive: wav, mp3, aac, ogg, mpeg, amr, m4…
  36. 2026-06-21 Auth Methods: set to api_key, jwt
  37. 2026-06-21 Auth Docs URL: set to https://docs.speechmatics.com/get-started/authentication
  38. 2026-06-21 API Style: set to rest
  39. 2026-06-21 Base URL: set to https://asr.api.speechmatics.com/v2
  40. 2026-06-21 API Version: set to v2
  41. 2026-06-21 Versioning Scheme: set to url
  42. 2026-06-21 Stability: set to ga
  43. 2026-06-21 Quickstart URL: set to https://docs.speechmatics.com/get-started/quickstart
  44. 2026-06-21 Slug: set to speechmatics
  45. 2026-06-21 Requires Verification: set to No
  46. 2026-06-21 Starting Price Usd: set to 0.0022
  47. 2026-06-21 Price Basis: set to minute
  48. 2026-06-21 Free Tier Limit: set to 3000 minutes (50 hours) per month
  49. 2026-06-21 Launched At: set to 2006-01-01
  50. 2026-06-21 Notable Customers: set to what3words, 3Play Media, Veritone, Deloitte UK, Vonage

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/speechmatics \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/speechmatics/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
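
The same correction request can be assembled in Python. This is a sketch that mirrors the curl call above using only the standard library; the helper name is hypothetical, and the request is built but not sent, so it can be inspected first.

```python
import json
import urllib.request


def build_suggestion(slug, field, value, source_url, excerpt, note):
    """Build (but do not send) a field-correction request for a profile."""
    payload = {
        "value": value,
        "citations": [{"url": source_url, "excerpt": excerpt}],
        "note": note,
    }
    return urllib.request.Request(
        f"https://apio.sh/api/suggest/{slug}/{field}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it. As with the curl version, `FIELD` has to be replaced by a real key from the profile's Agent JSON.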