Amazon Transcribe

"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application." [1]

Speech-to-Text & Transcription APIs

aws.amazon.com/transcribe/ · By Amazon Web Services · Last verified 2026-06-21 · Source confidence: high

Amazon Transcribe is an automatic speech recognition service from AWS that converts audio to text via batch or real-time streaming, with support for speaker diarization, custom vocabularies, custom language models, and multi-language identification. It targets a broad range of applications including contact center analytics, clinical documentation through a dedicated medical variant, accessibility captioning, and toxic content detection in gaming. Pricing starts at $0.006 per minute on a pay-as-you-go basis, with a free tier of 60 minutes per month for the first 12 months. The service is HIPAA-eligible, SOC 2 Type 2 certified, ISO 27001 and PCI DSS compliant, available across 25 AWS regions including GovCloud, and provides SDKs for Python, JavaScript, Java, Go, C++, Ruby, and PHP.

Best for / Avoid if

Best for: Prototypes and side projects - free to start, no sales call; Regulated or enterprise workloads - compliance attestations and an enterprise plan; Teams needing broad API coverage out of the box

Pricing & procurement

Pricing model: Usage-based [2]
Published pricing: Yes [3]
Free tier: Yes [4]
Free tier details: 60 minutes per month for the first 12 months after account creation, shared across Amazon Transcribe standard, Call Analytics, and Transcribe Medical. Unused minutes do not roll over. [5]
Self-serve signup: Yes
Requires sales call: No
Enterprise plan: Yes [6]

Published prices
Plan	Item	Per	Amount	Source
Free Tier	Standard transcription (batch or streaming)	60 minutes per month for first 12 months	$0	source
Standard - Tier 1	Batch transcription	minute (first 250,000 minutes/month)	$0.006	source
Standard - Tier 2	Batch transcription	minute (next 750,000 minutes/month)	$0.0042	source
Standard - Tier 3	Batch transcription	minute (over 1,000,000 minutes/month)	$0.0029	source
Standard - Tier 1	Streaming transcription	minute (first 250,000 minutes/month)	$0.01	source
Standard - Tier 2	Streaming transcription	minute (next 750,000 minutes/month)	$0.0062	source
Standard - Tier 3	Streaming transcription	minute (over 1,000,000 minutes/month)	$0.0042	source
Custom Language Model - Tier 1	CLM batch transcription	minute (first 250,000 minutes/month)	$0.007	source
Custom Language Model - Tier 2	CLM batch transcription	minute (next 750,000 minutes/month)	$0.0043	source
Custom Language Model - Tier 3	CLM batch transcription	minute (over 1,000,000 minutes/month)	$0.0031	source
Custom Language Model - Tier 1	CLM streaming transcription	minute (first 250,000 minutes/month)	$0.012	source
Custom Language Model - Tier 2	CLM streaming transcription	minute (next 750,000 minutes/month)	$0.0074	source
Custom Language Model - Tier 3	CLM streaming transcription	minute (over 1,000,000 minutes/month)	$0.0052	source
Add-on - Tier 1	Automatic content redaction (PII) - batch	minute (first 250,000 minutes/month)	$0.0024	source
Add-on - Tier 2	Automatic content redaction (PII) - batch	minute (next 750,000 minutes/month)	$0.0015	source
Add-on - Tier 3	Automatic content redaction (PII) - batch	minute (over 1,000,000 minutes/month)	$0.001	source
Add-on - Tier 1	Automatic content redaction (PII) - streaming	minute (first 250,000 minutes/month)	$0.003	source
Add-on - Tier 2	Automatic content redaction (PII) - streaming	minute (next 750,000 minutes/month)	$0.0019	source
Add-on - Tier 3	Automatic content redaction (PII) - streaming	minute (over 1,000,000 minutes/month)	$0.0013	source
Add-on - Tier 1	Toxicity detection - batch	minute (first 250,000 minutes/month)	$0.002	source
Add-on - Tier 2	Toxicity detection - batch	minute (next 750,000 minutes/month)	$0.0012	source
Add-on - Tier 3	Toxicity detection - batch	minute (over 1,000,000 minutes/month)	$0.0009	source
Call Analytics - Tier 1	Post-call analytics	minute (first 250,000 minutes/month)	$0.03	source
Call Analytics - Tier 2	Post-call analytics	minute (next 750,000 minutes/month)	$0.0186	source
Call Analytics - Tier 3	Post-call analytics	minute (next 4,000,000 minutes/month)	$0.0138	source
Call Analytics - Tier 1	Real-time call analytics	minute (first 250,000 minutes/month)	$0.0375	source
Call Analytics - Tier 2	Real-time call analytics	minute (next 750,000 minutes/month)	$0.0233	source
Call Analytics - Tier 3	Real-time call analytics	minute (next 4,000,000 minutes/month)	$0.0173	source
Add-on - Tier 1	Generative call summarization	minute (first 250,000 minutes/month)	$0.0024	source
Add-on - Tier 2	Generative call summarization	minute (next 750,000 minutes/month)	$0.0015	source
Add-on - Tier 3	Generative call summarization	minute (next 4,000,000 minutes/month)	$0.0011	source
Transcribe Medical	Medical batch transcription	minute	$0.075	source
Transcribe Medical	Medical streaming transcription	minute	$0.075	source
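The tiers in the table are graduated: each rate applies only to the minutes that fall inside its band. A minimal sketch of that arithmetic for standard batch transcription follows. The function name, the `free_minutes` parameter, and the choice to subtract free-tier minutes before the first tier are assumptions made for illustration, not something the pricing page spells out here.

```python
from decimal import Decimal

# Standard batch tiers copied from the table above: (band size in minutes, USD per minute).
# A size of None means "all minutes beyond the earlier bands".
STANDARD_BATCH_TIERS = [
    (250_000, Decimal("0.006")),
    (750_000, Decimal("0.0042")),
    (None, Decimal("0.0029")),
]


def monthly_cost(minutes, tiers=STANDARD_BATCH_TIERS, free_minutes=0):
    """Estimate one month's charge in USD for `minutes` of audio.

    Assumption: free-tier minutes are deducted before tier 1 is applied.
    """
    remaining = max(Decimal(minutes) - Decimal(free_minutes), Decimal(0))
    total = Decimal(0)
    for size, price in tiers:
        if remaining <= 0:
            break
        # Consume either the whole band or whatever is left, whichever is smaller.
        used = remaining if size is None else min(remaining, Decimal(size))
        total += used * price
        remaining -= used
    return total


if __name__ == "__main__":
    # 250,000 min at $0.006 plus 50,000 min at $0.0042
    print(monthly_cost(300_000))
```

Passing a different tier list (for example the streaming or Call Analytics rows) reuses the same loop. `Decimal` is used so that sub-cent rates do not pick up binary floating-point error.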

Capabilities

Real-time streaming
Speaker diarization
Medical transcription
PII redaction

Supported actions: transcribe_batch, transcribe_streaming, speaker_diarization, language_detection, multi_language_identification, word_timestamps, confidence_scores, custom_vocabulary, custom_language_models, vocabulary_filtering, automatic_punctuation, channel_identification, pii_redaction, pii_identification, subtitles_generation, alternative_transcriptions, call_analytics_batch, call_analytics_streaming, sentiment_analysis, call_summarization, issue_detection, call_categorization, medical_transcription, phi_identification, job_queueing, content_redaction_audio [7]
Regions: US East (N. Virginia) us-east-1, US East (Ohio) us-east-2, US West (N. California) us-west-1, US West (Oregon) us-west-2, Africa (Cape Town) af-south-1, Asia Pacific (Hong Kong) ap-east-1, Asia Pacific (Mumbai) ap-south-1, Asia Pacific (Seoul) ap-northeast-2, Asia Pacific (Singapore) ap-southeast-1, Asia Pacific (Sydney) ap-southeast-2, Asia Pacific (Tokyo) ap-northeast-1, Asia Pacific (Malaysia) ap-southeast-5, Asia Pacific (Thailand) ap-southeast-7, Canada (Central) ca-central-1, Europe (Frankfurt) eu-central-1, Europe (Ireland) eu-west-1, Europe (London) eu-west-2, Europe (Paris) eu-west-3, Europe (Stockholm) eu-north-1, Europe (Zurich) eu-central-2, Middle East (Bahrain) me-south-1, Mexico (Central) mx-central-1, South America (São Paulo) sa-east-1, AWS GovCloud (US-East) us-gov-east-1, AWS GovCloud (US-West) us-gov-west-1 [8]
Languages: Abkhaz (ab-GE), Afrikaans (af-ZA), Albanian (sq-AL), Amharic (am-ET), Arabic Gulf (ar-AE), Arabic Modern Standard (ar-SA), Armenian (hy-AM), Asturian (ast-ES), Azerbaijani (az-AZ), Bashkir (ba-RU), Basque (eu-ES), Belarusian (be-BY), Bengali (bn-IN), Bosnian (bs-BA), Bulgarian (bg-BG), Burmese (my-MM), Catalan (ca-ES), Central Kurdish Iran (ckb-IR), Central Kurdish Iraq (ckb-IQ), Chinese Cantonese (zh-HK), Chinese Simplified (zh-CN), Chinese Traditional (zh-TW), Croatian (hr-HR), Czech (cs-CZ), Danish (da-DK), Dutch (nl-NL), English Australian (en-AU), English British (en-GB), English Indian (en-IN), English Irish (en-IE), English New Zealand (en-NZ), English Scottish (en-AB), English South African (en-ZA), English US (en-US), English Welsh (en-WL), Estonian (et-EE), Farsi (fa-IR), Farsi Afghan (fa-AF), Finnish (fi-FI), French (fr-FR), French Canadian (fr-CA), Galician (gl-ES), Georgian (ka-GE), German (de-DE), German Swiss (de-CH), Greek (el-GR), Gujarati (gu-IN), Haitian Creole (ht-HT), Hausa (ha-NG), Hebrew (he-IL), Hindi Indian (hi-IN), Hungarian (hu-HU), Icelandic (is-IS), Indonesian (id-ID), Italian (it-IT), Japanese (ja-JP), Javanese (jv-ID), Kabyle (kab-DZ), Kannada (kn-IN), Kazakh (kk-KZ), Khmer (km-KH), Kinyarwanda (rw-RW), Korean (ko-KR), Kyrgyz (ky-KG), Latvian (lv-LV), Lithuanian (lt-LT), Luganda (lg-IN), Macedonian (mk-MK), Malay (ms-MY), Malayalam (ml-IN), Maltese (mt-MT), Marathi (mr-IN), Meadow Mari (mhr-RU), Mongolian (mn-MN), Nepali (ne-NP), Norwegian Bokmål (no-NO), Odia/Oriya (or-IN), Pashto (ps-AF), Polish (pl-PL), Portuguese (pt-PT), Portuguese Brazilian (pt-BR), Punjabi (pa-IN), Romanian (ro-RO), Russian (ru-RU), Serbian (sr-RS), Sinhala (si-LK), Slovak (sk-SK), Slovenian (sl-SI), Somali (so-SO), Spanish (es-ES), Spanish Mexican (es-MX), Spanish US (es-US), Sundanese (su-ID), Swahili Kenya (sw-KE), Swahili Burundi (sw-BI), Swahili Rwanda (sw-RW), Swahili Tanzania (sw-TZ), Swahili Uganda (sw-UG), Swedish (sv-SE), Tagalog/Filipino (tl-PH), Tamil (ta-IN), Tatar (tt-RU), Telugu (te-IN), Thai (th-TH), Turkish (tr-TR), Ukrainian (uk-UA), Uyghur (ug-CN), Uzbek (uz-UZ), Vietnamese (vi-VN), Welsh (cy-WL), Wolof (wo-SN), Zulu (zu-ZA) [9]
Input types: audio file via Amazon S3 (batch), media stream via HTTP/2 (streaming), media stream via WebSocket (streaming), FLAC (recommended lossless), WAV with PCM 16-bit encoding (recommended lossless), single-channel audio, dual-channel audio, sample rates 8,000 Hz to 48,000 Hz
Output types: JSON transcript with full text, word-level timestamps (start time, end time), confidence scores per word, speaker-labeled transcript (diarization), channel-identified transcript, SRT/VTT subtitles (batch), redacted transcript (PII removed), call analytics JSON with sentiment and categories
Webhooks: No [10]
Sandbox / test mode: No [11]
SDK languages: Python (batch), Python (streaming), JavaScript/Node.js (streaming), Java V2 (streaming), C++ (streaming), Ruby V3, PHP V3, Go [12]
MCP server: No
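As a rough illustration of how the batch input (an audio file in Amazon S3) and the speaker diarization capability listed above map onto an SDK call, here is a minimal sketch using boto3's `start_transcription_job`. The helper names, the job name, and the bucket URI are hypothetical; the request keys are the ones the boto3 Transcribe client accepts.

```python
def build_batch_job_request(job_name, media_uri, language_code="en-US", max_speakers=None):
    """Build keyword arguments for the Transcribe StartTranscriptionJob call."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},  # S3 URI of the audio file
        "LanguageCode": language_code,
    }
    if max_speakers is not None:
        # Speaker diarization: ask for speaker labels, up to `max_speakers` speakers.
        request["Settings"] = {
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        }
    return request


def start_batch_job(region, **kwargs):
    """Submit the job with boto3. Needs AWS credentials and network access."""
    import boto3  # imported lazily so the request builder works without boto3 installed

    client = boto3.client("transcribe", region_name=region)
    return client.start_transcription_job(**build_batch_job_request(**kwargs))


if __name__ == "__main__":
    # Hypothetical job name and bucket, shown only to illustrate the request shape.
    print(build_batch_job_request("demo-job", "s3://example-bucket/call.flac", max_speakers=2))
```

Because the profile lists no webhooks, a caller would typically poll `get_transcription_job` and read `TranscriptionJob.TranscriptionJobStatus` until the job finishes.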

Trust & compliance

SOC 2: SOC 2 Type II [13]
HIPAA: Yes [14]
GDPR: Yes [15]
ISO 27001: Yes [16]
PCI DSS: Yes [17]
Published SLA: Yes [18]
Rate limits: Concurrent transcription jobs: 250 (adjustable). Concurrent streams (HTTP/2 + WebSocket): 25 (adjustable). StartTranscriptionJob: 25 TPS (adjustable). StartStreamTranscription: 25 TPS (adjustable). Maximum audio file length: 28,800 seconds (8 hours). Maximum audio file size: 2 GB. Minimum audio file duration: 500 milliseconds. Job records retained: 90 days. [19]
Known restrictions: Maximum audio file length: 28,800 seconds (8 hours) for standard batch; Maximum audio file size: 2 GB; Maximum audio file length for Medical batch: 14,400 seconds (4 hours); Maximum audio file length for Call Analytics batch: 14,400 seconds (4 hours); Streaming sessions limited to 4 hours per open connection; Media with more than two channels is not currently supported; Amazon Transcribe Medical is only available in US English; Automatic content redaction does not remove PII from source audio files, only transcripts; Custom language model training limited to 5 concurrent jobs and 10 models per account by default; Billing in one-second increments with a 15-second minimum per request [20]
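The file limits and the billing minimum above lend themselves to a quick pre-flight check before uploading audio. The sketch below is illustrative only: the function names are hypothetical, "2 GB" is read as 2 GiB, and partial seconds are assumed to round up to the next whole second, which the listing does not state.

```python
import math

MAX_BATCH_SECONDS = 28_800        # 8 hours, standard batch
MAX_FILE_BYTES = 2 * 1024 ** 3    # "2 GB", interpreted here as 2 GiB (assumption)
MIN_SECONDS = 0.5                 # 500 milliseconds
MIN_BILLED_SECONDS = 15           # 15-second minimum per request


def check_batch_input(duration_seconds, size_bytes):
    """Return a list of limit violations for a standard batch job (empty if none)."""
    problems = []
    if duration_seconds < MIN_SECONDS:
        problems.append("audio shorter than 500 milliseconds")
    if duration_seconds > MAX_BATCH_SECONDS:
        problems.append("audio longer than 28,800 seconds")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 2 GB")
    return problems


def billable_seconds(duration_seconds):
    """Seconds charged for one request: whole seconds, never fewer than 15.

    Assumption: fractional seconds round up.
    """
    return max(MIN_BILLED_SECONDS, math.ceil(duration_seconds))


if __name__ == "__main__":
    print(check_batch_input(30_000, 10 * 1024 ** 2))
    print(billable_seconds(3.2))
```

For Medical or Call Analytics batch jobs the same check would use 14,400 seconds as the length ceiling.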

Developer surface

Docs rendering: static

Integration

API style: rest
Base URL: https://transcribe.{region}.amazonaws.com
Versioning: none
Stability: ga
Auth methods: hmac_signature
Error format: vendor-specific
Rate limit: 25 / second
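To make the base URL template and the HMAC-based auth entry more concrete, here is a small sketch that fills in the region and derives a signing key the way AWS Signature Version 4 does (a chain of HMAC-SHA256 operations over the date, region, and service name). The function names are hypothetical and the credential shown is a dummy value. In practice an AWS SDK performs the full signing process, of which this key derivation is only one step.

```python
import hashlib
import hmac


def endpoint_for(region):
    """Fill the {region} placeholder in the base URL listed above."""
    return f"https://transcribe.{region}.amazonaws.com"


def _hmac_sha256(key, message):
    """HMAC-SHA256 of a text message under a bytes key, returned as raw bytes."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sigv4_signing_key(secret_key, date_stamp, region, service="transcribe"):
    """Derive a Signature Version 4 signing key.

    date_stamp is the request date as YYYYMMDD.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


if __name__ == "__main__":
    print(endpoint_for("us-west-2"))
    # Dummy secret for illustration only; never hard-code real credentials.
    print(sigv4_signing_key("dummy-secret", "20260621", "us-west-2").hex())
```

The derived key depends on the date and region, so it is scoped to a single day and a single regional endpoint.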

SDKs

Python (batch) boto3 · repo
Python (streaming) amazon-transcribe · repo
JavaScript/Node.js (streaming) @aws-sdk/client-transcribe-streaming · repo
Java V2 (streaming) software.amazon.awssdk:transcribestreaming · repo
C++ (streaming) aws-cpp-sdk-transcribestreaming · repo
Ruby V3 aws-sdk-transcribestreamingservice · repo
PHP V3 aws/aws-sdk-php · repo
Go github.com/aws/aws-sdk-go-v2/service/transcribestreaming · repo

Adoption & maturity

Launched: 2017-11-29
GA: 2018-04-04

Other Speech-to-Text & Transcription APIs

ElevenLabs Scribe (Speech to Text)
"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."
Hybrid · free tier · public pricing · self-serve
Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Usage · free tier · public pricing · self-serve
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
Usage · free tier · public pricing · self-serve
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
Usage · free tier · public pricing · self-serve
AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."
Usage · public pricing · self-serve
Speechmatics
"Low-latency speech-to-text for multilingual, multi-speaker conversations."
Usage · free tier · public pricing · self-serve


References

Each field above carries a numbered source, listed below.

[1] Description: docs.aws.amazon.com · aws.amazon.com
[2] Pricing model: aws.amazon.com · docs.aws.amazon.com
[3] Published pricing: aws.amazon.com
[4] Free tier: aws.amazon.com · aws.amazon.com
[5] Free tier details: aws.amazon.com
[6] Enterprise plan: aws.amazon.com
[7] Supported actions: docs.aws.amazon.com
[8] Regions: docs.aws.amazon.com
[9] Languages: docs.aws.amazon.com
[10] Webhooks: docs.aws.amazon.com
[11] Sandbox: docs.aws.amazon.com
[12] SDK languages: docs.aws.amazon.com
[13] SOC 2: aws.amazon.com
[14] HIPAA: docs.aws.amazon.com · aws.amazon.com
[15] GDPR: aws.amazon.com
[16] ISO 27001: aws.amazon.com
[17] PCI DSS: aws.amazon.com
[18] Published SLA: aws.amazon.com
[19] Rate limits: docs.aws.amazon.com
[20] Known restrictions: docs.aws.amazon.com · docs.aws.amazon.com

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

2026-06-21 Capabilities: {} → {"medical":true,"pii_redaction":true,"real_time_streaming":true,"speaker_diariz…
2026-06-21 Summary Md: (none) → Amazon Transcribe is an automatic speech recognition service from AWS that conv…
2026-06-21 Score Setup Speed: (none) → 85
2026-06-21 Score Pricing Transparency: (none) → 100
2026-06-21 Score Docs Quality: (none) → 15
2026-06-21 Score Procurement Friction: (none) → 100
2026-06-21 Score Trust Readiness: (none) → 100
2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
2026-06-21 Score Agent Friendliness: (none) → 30
2026-06-21 Has Structured Data: (none) → Yes
2026-06-21 Status Page URL: (none) → https://status.aws.amazon.com
2026-06-21 Docs URL: (none) → https://docs.aws.amazon.com/
2026-06-21 Llms Txt Present: (none) → No
2026-06-21 Rendering: (none) → static
2026-06-21 Robots Allows Agents: (none) → Yes
2026-06-21 SDK Packages: set to Python (batch), Python (streaming), JavaScript/Node.js (streaming), Java V2 (st…
2026-06-21 MCP Server Available: set to No
2026-06-21 Pricing Model: set to usage_based
2026-06-21 Has Published Pricing: set to Yes
2026-06-21 Free Tier Available: set to Yes
2026-06-21 Free Tier Details: set to 60 minutes per month for the first 12 months after account creation, shared acr…
2026-06-21 Self Serve Signup: set to Yes
2026-06-21 Requires Sales Call: set to No
2026-06-21 Enterprise Plan Available: set to Yes
2026-06-21 SOC 2: set to type_2
2026-06-21 HIPAA: set to Yes
2026-06-21 GDPR: set to Yes
2026-06-21 ISO 27001: set to Yes
2026-06-21 PCI DSS: set to Yes
2026-06-21 SLA Published: set to Yes
2026-06-21 SLA URL: set to https://aws.amazon.com/ai/services/language-sla/
2026-06-21 Data Retention Policy URL: set to https://docs.aws.amazon.com/transcribe/latest/dg/opt-out.html
2026-06-21 Documented Rate Limits: set to Concurrent transcription jobs: 250 (adjustable). Concurrent streams (HTTP/2 + W…
2026-06-21 Rate Limit Requests: set to 25
2026-06-21 Rate Limit Window: set to second
2026-06-21 Auth Methods: set to hmac_signature
2026-06-21 Auth Docs URL: set to https://docs.aws.amazon.com/transcribe/latest/dg/security-iam.html
2026-06-21 API Style: set to rest
2026-06-21 Base URL: set to https://transcribe.{region}.amazonaws.com
2026-06-21 Stability: set to ga
2026-06-21 Quickstart URL: set to https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html
2026-06-21 Error Format: set to vendor-specific
2026-06-21 Requires Verification: set to No
2026-06-21 Starting Price Usd: set to 0.006
2026-06-21 Price Basis: set to minute
2026-06-21 Free Tier Limit: set to 60 minutes/month for 12 months
2026-06-21 Launched At: set to 2017-11-29
2026-06-21 GA Date: set to 2018-04-04
2026-06-21 Notable Customers: set to (none)

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/aws-transcribe \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/aws-transcribe/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
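For agents that prefer Python over curl, the same correction request can be assembled with the standard library. This is a sketch under assumptions: the helper name is hypothetical, `webhooks` stands in for FIELD as an example key that may not match the real Agent JSON, and the citation values are placeholders. The URL shape and JSON body mirror the curl command above.

```python
import json
import urllib.request

API_ROOT = "https://apio.sh/api"


def build_suggestion(slug, field, value, source_url, excerpt, note):
    """Build (but do not send) a POST request proposing a field correction."""
    body = {
        "value": value,
        "citations": [{"url": source_url, "excerpt": excerpt}],
        "note": note,
    }
    return urllib.request.Request(
        f"{API_ROOT}/suggest/{slug}/{field}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_suggestion(
        "aws-transcribe",
        "webhooks",  # hypothetical field key used in place of FIELD
        "corrected value",
        "https://source.example/page",
        "supporting quote",
        "what changed and why",
    )
    print(request.get_method(), request.full_url)
    # To actually submit: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect before anything goes over the network.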
