Amazon Transcribe

"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application." [1]

aws.amazon.com/transcribe/ · By Amazon Web Services · Last verified 2026-06-21 · Source confidence: high

Amazon Transcribe is an automatic speech recognition service from AWS that converts audio to text via batch or real-time streaming, with support for speaker diarization, custom vocabularies, custom language models, and multi-language identification. It targets a broad range of applications including contact center analytics, clinical documentation through a dedicated medical variant, accessibility captioning, and toxic content detection in gaming. Pricing starts at $0.006 per minute on a pay-as-you-go basis, with a free tier of 60 minutes per month for the first 12 months. The service is HIPAA-eligible, SOC 2 Type 2 certified, ISO 27001 and PCI DSS compliant, available across 25 AWS regions including GovCloud, and provides SDKs for Python, JavaScript, Java, Go, C++, Ruby, and PHP.

Best for / Avoid if

Best for:

  • Prototypes and side projects - free to start, no sales call
  • Regulated or enterprise workloads - compliance attestations and an enterprise plan
  • Teams needing broad API coverage out of the box

Pricing & procurement

Pricing model: Usage-based [2]
Published pricing: Yes [3]
Free tier: Yes [4]
Free tier details: 60 minutes per month for the first 12 months after account creation, shared across Amazon Transcribe standard, Call Analytics, and Transcribe Medical. Unused minutes do not roll over. [5]
Self-serve signup: Yes
Requires sales call: No
Enterprise plan: Yes [6]
Published prices
Plan | Item | Per | Amount
Free Tier | Standard transcription (batch or streaming) | 60 minutes per month for first 12 months | $0
Standard - Tier 1 | Batch transcription | minute (first 250,000 minutes/month) | $0.006
Standard - Tier 2 | Batch transcription | minute (next 750,000 minutes/month) | $0.0042
Standard - Tier 3 | Batch transcription | minute (over 1,000,000 minutes/month) | $0.0029
Standard - Tier 1 | Streaming transcription | minute (first 250,000 minutes/month) | $0.01
Standard - Tier 2 | Streaming transcription | minute (next 750,000 minutes/month) | $0.0062
Standard - Tier 3 | Streaming transcription | minute (over 1,000,000 minutes/month) | $0.0042
Custom Language Model - Tier 1 | CLM batch transcription | minute (first 250,000 minutes/month) | $0.007
Custom Language Model - Tier 2 | CLM batch transcription | minute (next 750,000 minutes/month) | $0.0043
Custom Language Model - Tier 3 | CLM batch transcription | minute (over 1,000,000 minutes/month) | $0.0031
Custom Language Model - Tier 1 | CLM streaming transcription | minute (first 250,000 minutes/month) | $0.012
Custom Language Model - Tier 2 | CLM streaming transcription | minute (next 750,000 minutes/month) | $0.0074
Custom Language Model - Tier 3 | CLM streaming transcription | minute (over 1,000,000 minutes/month) | $0.0052
Add-on - Tier 1 | Automatic content redaction (PII) - batch | minute (first 250,000 minutes/month) | $0.0024
Add-on - Tier 2 | Automatic content redaction (PII) - batch | minute (next 750,000 minutes/month) | $0.0015
Add-on - Tier 3 | Automatic content redaction (PII) - batch | minute (over 1,000,000 minutes/month) | $0.001
Add-on - Tier 1 | Automatic content redaction (PII) - streaming | minute (first 250,000 minutes/month) | $0.003
Add-on - Tier 2 | Automatic content redaction (PII) - streaming | minute (next 750,000 minutes/month) | $0.0019
Add-on - Tier 3 | Automatic content redaction (PII) - streaming | minute (over 1,000,000 minutes/month) | $0.0013
Add-on - Tier 1 | Toxicity detection - batch | minute (first 250,000 minutes/month) | $0.002
Add-on - Tier 2 | Toxicity detection - batch | minute (next 750,000 minutes/month) | $0.0012
Add-on - Tier 3 | Toxicity detection - batch | minute (over 1,000,000 minutes/month) | $0.0009
Call Analytics - Tier 1 | Post-call analytics | minute (first 250,000 minutes/month) | $0.03
Call Analytics - Tier 2 | Post-call analytics | minute (next 750,000 minutes/month) | $0.0186
Call Analytics - Tier 3 | Post-call analytics | minute (next 4,000,000 minutes/month) | $0.0138
Call Analytics - Tier 1 | Real-time call analytics | minute (first 250,000 minutes/month) | $0.0375
Call Analytics - Tier 2 | Real-time call analytics | minute (next 750,000 minutes/month) | $0.0233
Call Analytics - Tier 3 | Real-time call analytics | minute (next 4,000,000 minutes/month) | $0.0173
Add-on - Tier 1 | Generative call summarization | minute (first 250,000 minutes/month) | $0.0024
Add-on - Tier 2 | Generative call summarization | minute (next 750,000 minutes/month) | $0.0015
Add-on - Tier 3 | Generative call summarization | minute (next 4,000,000 minutes/month) | $0.0011
Transcribe Medical | Medical batch transcription | minute | $0.075
Transcribe Medical | Medical streaming transcription | minute | $0.075
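
As a rough illustration of how the volume tiers listed above combine, the sketch below prices one month of standard batch transcription. It assumes the tiers are graduated (each rate applies only to the minutes that fall inside its band), and it ignores the free tier and the per-request billing minimum. The names `STANDARD_BATCH_TIERS` and `monthly_cost` are illustrative and not part of any AWS SDK.

```python
from decimal import Decimal

# (band size in minutes, price per minute) for standard batch transcription,
# copied from the published prices above. None marks the open-ended last band.
STANDARD_BATCH_TIERS = [
    (250_000, Decimal("0.006")),
    (750_000, Decimal("0.0042")),
    (None, Decimal("0.0029")),
]


def monthly_cost(minutes, tiers=STANDARD_BATCH_TIERS):
    """Return the graduated cost in USD for `minutes` of audio in one month."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    remaining = Decimal(minutes)
    total = Decimal("0")
    for band, rate in tiers:
        # Minutes billed at this band's rate: the whole remainder for the
        # open-ended band, otherwise at most the band size.
        billed = remaining if band is None else min(remaining, Decimal(band))
        total += billed * rate
        remaining -= billed
        if remaining <= 0:
            break
    return total


if __name__ == "__main__":
    for minutes in (1_000, 250_000, 1_200_000):
        print(minutes, monthly_cost(minutes))
```

Under these assumptions, 1,200,000 minutes costs 250,000 × $0.006 + 750,000 × $0.0042 + 200,000 × $0.0029 = $5,230. `Decimal` is used so the per-minute rates do not pick up binary floating-point error.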

Capabilities

  • Real-time streaming
  • Speaker diarization
  • Medical transcription
  • PII redaction

Supported actions: transcribe_batch, transcribe_streaming, speaker_diarization, language_detection, multi_language_identification, word_timestamps, confidence_scores, custom_vocabulary, custom_language_models, vocabulary_filtering, automatic_punctuation, channel_identification, pii_redaction, pii_identification, subtitles_generation, alternative_transcriptions, call_analytics_batch, call_analytics_streaming, sentiment_analysis, call_summarization, issue_detection, call_categorization, medical_transcription, phi_identification, job_queueing, content_redaction_audio [7]
Regions: US East (N. Virginia) us-east-1, US East (Ohio) us-east-2, US West (N. California) us-west-1, US West (Oregon) us-west-2, Africa (Cape Town) af-south-1, Asia Pacific (Hong Kong) ap-east-1, Asia Pacific (Mumbai) ap-south-1, Asia Pacific (Seoul) ap-northeast-2, Asia Pacific (Singapore) ap-southeast-1, Asia Pacific (Sydney) ap-southeast-2, Asia Pacific (Tokyo) ap-northeast-1, Asia Pacific (Malaysia) ap-southeast-5, Asia Pacific (Thailand) ap-southeast-7, Canada (Central) ca-central-1, Europe (Frankfurt) eu-central-1, Europe (Ireland) eu-west-1, Europe (London) eu-west-2, Europe (Paris) eu-west-3, Europe (Stockholm) eu-north-1, Europe (Zurich) eu-central-2, Middle East (Bahrain) me-south-1, Mexico (Central) mx-central-1, South America (São Paulo) sa-east-1, AWS GovCloud (US-East) us-gov-east-1, AWS GovCloud (US-West) us-gov-west-1 [8]
Languages: Abkhaz (ab-GE), Afrikaans (af-ZA), Albanian (sq-AL), Amharic (am-ET), Arabic Gulf (ar-AE), Arabic Modern Standard (ar-SA), Armenian (hy-AM), Asturian (ast-ES), Azerbaijani (az-AZ), Bashkir (ba-RU), Basque (eu-ES), Belarusian (be-BY), Bengali (bn-IN), Bosnian (bs-BA), Bulgarian (bg-BG), Burmese (my-MM), Catalan (ca-ES), Central Kurdish Iran (ckb-IR), Central Kurdish Iraq (ckb-IQ), Chinese Cantonese (zh-HK), Chinese Simplified (zh-CN), Chinese Traditional (zh-TW), Croatian (hr-HR), Czech (cs-CZ), Danish (da-DK), Dutch (nl-NL), English Australian (en-AU), English British (en-GB), English Indian (en-IN), English Irish (en-IE), English New Zealand (en-NZ), English Scottish (en-AB), English South African (en-ZA), English US (en-US), English Welsh (en-WL), Estonian (et-EE), Farsi (fa-IR), Farsi Afghan (fa-AF), Finnish (fi-FI), French (fr-FR), French Canadian (fr-CA), Galician (gl-ES), Georgian (ka-GE), German (de-DE), German Swiss (de-CH), Greek (el-GR), Gujarati (gu-IN), Haitian Creole (ht-HT), Hausa (ha-NG), Hebrew (he-IL), Hindi Indian (hi-IN), Hungarian (hu-HU), Icelandic (is-IS), Indonesian (id-ID), Italian (it-IT), Japanese (ja-JP), Javanese (jv-ID), Kabyle (kab-DZ), Kannada (kn-IN), Kazakh (kk-KZ), Khmer (km-KH), Kinyarwanda (rw-RW), Korean (ko-KR), Kyrgyz (ky-KG), Latvian (lv-LV), Lithuanian (lt-LT), Luganda (lg-IN), Macedonian (mk-MK), Malay (ms-MY), Malayalam (ml-IN), Maltese (mt-MT), Marathi (mr-IN), Meadow Mari (mhr-RU), Mongolian (mn-MN), Nepali (ne-NP), Norwegian Bokmål (no-NO), Odia/Oriya (or-IN), Pashto (ps-AF), Polish (pl-PL), Portuguese (pt-PT), Portuguese Brazilian (pt-BR), Punjabi (pa-IN), Romanian (ro-RO), Russian (ru-RU), Serbian (sr-RS), Sinhala (si-LK), Slovak (sk-SK), Slovenian (sl-SI), Somali (so-SO), Spanish (es-ES), Spanish Mexican (es-MX), Spanish US (es-US), Sundanese (su-ID), Swahili Kenya (sw-KE), Swahili Burundi (sw-BI), Swahili Rwanda (sw-RW), Swahili Tanzania (sw-TZ), Swahili Uganda (sw-UG), Swedish (sv-SE), Tagalog/Filipino (tl-PH), Tamil (ta-IN), Tatar (tt-RU), Telugu (te-IN), Thai (th-TH), Turkish (tr-TR), Ukrainian (uk-UA), Uyghur (ug-CN), Uzbek (uz-UZ), Vietnamese (vi-VN), Welsh (cy-WL), Wolof (wo-SN), Zulu (zu-ZA) [9]
Input types: audio file via Amazon S3 (batch), media stream via HTTP/2 (streaming), media stream via WebSocket (streaming), FLAC (recommended lossless), WAV with PCM 16-bit encoding (recommended lossless), single-channel audio, dual-channel audio, sample rates 8,000 Hz to 48,000 Hz
Output types: JSON transcript with full text, word-level timestamps (start time, end time), confidence scores per word, speaker-labeled transcript (diarization), channel-identified transcript, SRT/VTT subtitles (batch), redacted transcript (PII removed), call analytics JSON with sentiment and categories
Webhooks: No [10]
Sandbox / test mode: No [11]
SDK languages: Python (batch), Python (streaming), JavaScript/Node.js (streaming), Java V2 (streaming), C++ (streaming), Ruby V3, PHP V3, Go [12]
MCP server: No
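
To make the output types above more concrete, here is a minimal sketch that pulls word-level timestamps and confidence scores out of a batch transcript. The `SAMPLE` document is hand-written for illustration in the general shape of a batch transcript (`results.transcripts` plus `results.items`); it is not real service output, and the helper names are hypothetical.

```python
import json

# Hand-written sample in the shape of a batch transcript; not real service output.
SAMPLE = json.loads("""
{
  "jobName": "example-job",
  "results": {
    "transcripts": [{"transcript": "Hello world."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.04", "end_time": "0.38",
       "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
      {"type": "pronunciation", "start_time": "0.39", "end_time": "0.80",
       "alternatives": [{"confidence": "0.62", "content": "world"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  }
}
""")


def words_with_timing(transcript):
    """Yield (word, start_s, end_s, confidence) for each spoken word."""
    for item in transcript["results"]["items"]:
        if item.get("type") != "pronunciation":
            continue  # punctuation items carry no timestamps
        best = item["alternatives"][0]
        yield (best["content"], float(item["start_time"]),
               float(item["end_time"]), float(best["confidence"]))


def low_confidence(transcript, threshold=0.8):
    """Return words whose confidence falls below `threshold`, for review."""
    return [w for w, _, _, c in words_with_timing(transcript) if c < threshold]


if __name__ == "__main__":
    for row in words_with_timing(SAMPLE):
        print(row)
    print("review:", low_confidence(SAMPLE))
```

Numeric fields are parsed with `float` because the sample stores them as strings; the 0.8 review threshold is an arbitrary choice for the example.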

Trust & compliance

SOC 2: Type II [13]
HIPAA: Yes [14]
GDPR: Yes [15]
ISO 27001: Yes [16]
PCI DSS: Yes [17]
Published SLA: Yes [18]
Rate limits: Concurrent transcription jobs: 250 (adjustable). Concurrent streams (HTTP/2 + WebSocket): 25 (adjustable). StartTranscriptionJob: 25 TPS (adjustable). StartStreamTranscription: 25 TPS (adjustable). Maximum audio file length: 28,800 seconds (8 hours). Maximum audio file size: 2 GB. Minimum audio file duration: 500 milliseconds. Job records retained: 90 days. [19]
Known restrictions [20]:

  • Maximum audio file length: 28,800 seconds (8 hours) for standard batch
  • Maximum audio file size: 2 GB
  • Maximum audio file length for Medical batch: 14,400 seconds (4 hours)
  • Maximum audio file length for Call Analytics batch: 14,400 seconds (4 hours)
  • Streaming sessions limited to 4 hours per open connection
  • Media with more than two channels is not currently supported
  • Amazon Transcribe Medical is only available in US English
  • Automatic content redaction does not remove PII from source audio files, only transcripts
  • Custom language model training limited to 5 concurrent jobs and 10 models per account by default
  • Billing in one-second increments with a 15-second minimum per request
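
The limits above lend themselves to a client-side check before a file is uploaded. The sketch below is one possible way to encode the standard batch limits and the billing rule. It assumes that "2 GB" means binary gigabytes and that partial seconds are rounded up, neither of which is stated in the profile; the function names are illustrative.

```python
import math

MAX_BATCH_SECONDS = 28_800       # 8 hours, standard batch
MAX_FILE_BYTES = 2 * 1024 ** 3   # "2 GB"; binary gigabytes is an assumption
MIN_SECONDS = 0.5                # 500 milliseconds
MIN_BILLED_SECONDS = 15          # per-request billing minimum


def preflight_problems(duration_s, size_bytes):
    """Return a list of reasons a file would be rejected for standard batch."""
    problems = []
    if duration_s < MIN_SECONDS:
        problems.append("shorter than 500 ms")
    if duration_s > MAX_BATCH_SECONDS:
        problems.append("longer than 8 hours")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("larger than 2 GB")
    return problems


def billable_seconds(duration_s):
    """One-second increments with a 15-second minimum per request.

    Rounding partial seconds up is an assumption of this sketch.
    """
    return max(MIN_BILLED_SECONDS, math.ceil(duration_s))


if __name__ == "__main__":
    print(preflight_problems(30_000, 3 * 1024 ** 3))
    print(billable_seconds(3.2), billable_seconds(61.1))
```

One practical consequence of the 15-second minimum: many very short clips submitted as separate requests are billed as 15 seconds each, so batching short utterances into one file can matter for cost.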

Developer surface

Docs rendering: static

Integration

API style: REST
Base URL: https://transcribe.{region}.amazonaws.com
Versioning: none
Stability: GA
Auth methods: hmac_signature (AWS Signature Version 4)
Error format: vendor-specific
Rate limit: 25 requests/second
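
A client that submits many requests may want to pace itself below the listed rate limit rather than rely on retries. The sketch below shows one simple approach, evenly spacing calls, along with a helper that fills a region into the base URL pattern. `Pacer` and `endpoint` are hypothetical names, and the 25 per second default simply mirrors the figure in this profile; actual account quotas may differ.

```python
import time


class Pacer:
    """Space calls so that no more than `rate` start per second."""

    def __init__(self, rate=25, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate
        self.clock = clock      # injectable for testing
        self.sleep = sleep      # injectable for testing
        self.next_slot = None   # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval


def endpoint(region):
    """Fill the region into the base URL pattern listed above."""
    return f"https://transcribe.{region}.amazonaws.com"


if __name__ == "__main__":
    pacer = Pacer(rate=25)
    for i in range(3):
        pacer.wait()
        print("request", i, "to", endpoint("us-east-1"))
```

Even spacing is stricter than a token bucket because it never allows bursts; that trade-off keeps the example short.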

SDKs

  • Python (batch): boto3
  • Python (streaming): amazon-transcribe
  • JavaScript/Node.js (streaming): @aws-sdk/client-transcribe-streaming
  • Java V2 (streaming): software.amazon.awssdk:transcribestreaming
  • C++ (streaming): aws-cpp-sdk-transcribestreaming
  • Ruby V3: aws-sdk-transcribestreamingservice
  • PHP V3: aws/aws-sdk-php
  • Go: github.com/aws/aws-sdk-go-v2/service/transcribestreaming
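
For orientation, a batch job with boto3 might look roughly like the sketch below. Because this profile lists no webhooks, the sketch polls for completion. `build_job_request`, `wait_for_job`, and `run_batch_job` are illustrative helper names, the S3 URI in the usage example is a placeholder, and the polling interval and attempt cap are arbitrary. Running `run_batch_job` requires boto3, AWS credentials, and an audio file already in S3.

```python
import time


def build_job_request(job_name, media_uri, language_code="en-US", max_speakers=None):
    """Assemble keyword arguments for boto3's start_transcription_job."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
    }
    if max_speakers is not None:
        # Speaker diarization is switched on through the Settings block.
        request["Settings"] = {
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        }
    return request


def wait_for_job(fetch_status, poll_seconds=5.0, max_polls=720, sleep=time.sleep):
    """Poll until the job reports COMPLETED or FAILED; return that status."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        sleep(poll_seconds)
    raise TimeoutError("transcription job did not finish in time")


def run_batch_job(job_name, media_uri, region="us-east-1"):
    """Start a job and block until it finishes. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("transcribe", region_name=region)
    client.start_transcription_job(**build_job_request(job_name, media_uri))

    def fetch_status():
        job = client.get_transcription_job(TranscriptionJobName=job_name)
        return job["TranscriptionJob"]["TranscriptionJobStatus"]

    return wait_for_job(fetch_status)


if __name__ == "__main__":
    # Placeholder bucket and key; prints the request without calling AWS.
    print(build_job_request("demo-job", "s3://example-bucket/audio.flac", max_speakers=2))
```

Separating request construction and polling from the network call keeps both pieces testable without credentials.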

Adoption & maturity

Launched: 2017-11-29
GA: 2018-04-04

Other Speech-to-Text & Transcription APIs

  • ElevenLabs Scribe (Speech to Text)

    "Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."

    Hybrid · free tier · public pricing · self-serve

  • Azure AI Speech to Text

    "Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."

    Usage · free tier · public pricing · self-serve

  • Google Cloud Speech-to-Text

    "Accurate voice typing and transcription powered by Gemini."

    Usage · free tier · public pricing · self-serve

  • IBM watsonx Speech to Text

    "IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."

    Usage · free tier · public pricing · self-serve

  • AssemblyAI

    "Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."

    Usage · public pricing · self-serve

  • Speechmatics

    "Low-latency speech-to-text for multilingual, multi-speaker conversations."

    Usage · free tier · public pricing · self-serve


References

Each field above carries a numbered source.

  1. Description: docs.aws.amazon.com · aws.amazon.com
  2. Pricing model: aws.amazon.com · docs.aws.amazon.com
  3. Published pricing: aws.amazon.com
  4. Free tier: aws.amazon.com · aws.amazon.com
  5. Free tier details: aws.amazon.com
  6. Enterprise plan: aws.amazon.com
  7. Supported actions: docs.aws.amazon.com
  8. Regions: docs.aws.amazon.com
  9. Languages: docs.aws.amazon.com
  10. Webhooks: docs.aws.amazon.com
  11. Sandbox: docs.aws.amazon.com
  12. SDK languages: docs.aws.amazon.com
  13. SOC 2: aws.amazon.com
  14. HIPAA: docs.aws.amazon.com · aws.amazon.com
  15. GDPR: aws.amazon.com
  16. ISO 27001: aws.amazon.com
  17. PCI DSS: aws.amazon.com
  18. Published SLA: aws.amazon.com
  19. Rate limits: docs.aws.amazon.com
  20. Known restrictions: docs.aws.amazon.com · docs.aws.amazon.com

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

  1. 2026-06-21 Capabilities: {} → {"medical":true,"pii_redaction":true,"real_time_streaming":true,"speaker_diariz…
  2. 2026-06-21 Summary Md: (none) → Amazon Transcribe is an automatic speech recognition service from AWS that conv…
  3. 2026-06-21 Score Setup Speed: (none) → 85
  4. 2026-06-21 Score Pricing Transparency: (none) → 100
  5. 2026-06-21 Score Docs Quality: (none) → 15
  6. 2026-06-21 Score Procurement Friction: (none) → 100
  7. 2026-06-21 Score Trust Readiness: (none) → 100
  8. 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
  9. 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
  10. 2026-06-21 Score Agent Friendliness: (none) → 30
  11. 2026-06-21 Has Structured Data: (none) → Yes
  12. 2026-06-21 Status Page URL: (none) → https://status.aws.amazon.com
  13. 2026-06-21 Docs URL: (none) → https://docs.aws.amazon.com/
  14. 2026-06-21 Llms Txt Present: (none) → No
  15. 2026-06-21 Rendering: (none) → static
  16. 2026-06-21 Robots Allows Agents: (none) → Yes
  17. 2026-06-21 SDK Packages: set to Python (batch), Python (streaming), JavaScript/Node.js (streaming), Java V2 (st…
  18. 2026-06-21 MCP Server Available: set to No
  19. 2026-06-21 Pricing Model: set to usage_based
  20. 2026-06-21 Has Published Pricing: set to Yes
  21. 2026-06-21 Free Tier Available: set to Yes
  22. 2026-06-21 Free Tier Details: set to 60 minutes per month for the first 12 months after account creation, shared acr…
  23. 2026-06-21 Self Serve Signup: set to Yes
  24. 2026-06-21 Requires Sales Call: set to No
  25. 2026-06-21 Enterprise Plan Available: set to Yes
  26. 2026-06-21 SOC 2: set to type_2
  27. 2026-06-21 HIPAA: set to Yes
  28. 2026-06-21 GDPR: set to Yes
  29. 2026-06-21 ISO 27001: set to Yes
  30. 2026-06-21 PCI DSS: set to Yes
  31. 2026-06-21 SLA Published: set to Yes
  32. 2026-06-21 SLA URL: set to https://aws.amazon.com/ai/services/language-sla/
  33. 2026-06-21 Data Retention Policy URL: set to https://docs.aws.amazon.com/transcribe/latest/dg/opt-out.html
  34. 2026-06-21 Documented Rate Limits: set to Concurrent transcription jobs: 250 (adjustable). Concurrent streams (HTTP/2 + W…
  35. 2026-06-21 Rate Limit Requests: set to 25
  36. 2026-06-21 Rate Limit Window: set to second
  37. 2026-06-21 Auth Methods: set to hmac_signature
  38. 2026-06-21 Auth Docs URL: set to https://docs.aws.amazon.com/transcribe/latest/dg/security-iam.html
  39. 2026-06-21 API Style: set to rest
  40. 2026-06-21 Base URL: set to https://transcribe.{region}.amazonaws.com
  41. 2026-06-21 Stability: set to ga
  42. 2026-06-21 Quickstart URL: set to https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html
  43. 2026-06-21 Error Format: set to vendor-specific
  44. 2026-06-21 Requires Verification: set to No
  45. 2026-06-21 Starting Price Usd: set to 0.006
  46. 2026-06-21 Price Basis: set to minute
  47. 2026-06-21 Free Tier Limit: set to 60 minutes/month for 12 months
  48. 2026-06-21 Launched At: set to 2017-11-29
  49. 2026-06-21 GA Date: set to 2018-04-04
  50. 2026-06-21 Notable Customers: set to (none)

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/aws-transcribe \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/aws-transcribe/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
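
For callers who prefer Python to curl, the field-correction request above could be assembled along these lines using only the standard library. This is a sketch that mirrors the curl example; `suggestion_request` is a hypothetical helper, and it builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request


def suggestion_request(field, value, source_url, excerpt, note):
    """Build (but do not send) the POST shown in the curl example above."""
    payload = {
        "value": value,
        "citations": [{"url": source_url, "excerpt": excerpt}],
        "note": note,
    }
    # Percent-encode the field key so it is safe as a path segment.
    path_field = urllib.parse.quote(field, safe="")
    return urllib.request.Request(
        f"https://apio.sh/api/suggest/aws-transcribe/{path_field}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = suggestion_request(
        "FIELD",  # replace with a key from the profile's JSON
        "corrected value",
        "https://source.example/page",
        "supporting quote",
        "what changed and why",
    )
    print(req.get_method(), req.full_url)
    # To submit: urllib.request.urlopen(req)
```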
