AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech." [1]
AssemblyAI is a voice AI platform that provides speech-to-text transcription, speaker diarization, and audio intelligence features through a REST API, aimed at developers building products on top of speech data. Pricing is usage-based, starting at $0.0025 per minute ($0.15 per hour of audio), with a one-time $50 free credit that requires no credit card; enterprise plans are also available. The service lists SOC 2 Type II, HIPAA, GDPR, ISO 27001, and PCI DSS compliance, with data processed in the US and EU. Customers include Zoom, Spotify, and Dovetail, and SDKs are actively maintained for Python and Node.js.
Best for / Avoid if
Best for: regulated or enterprise workloads (compliance attestations and an enterprise plan); AI agents and automation (an agent-ready surface via MCP and llms.txt); teams needing broad API coverage out of the box
Avoid if: You need a recurring free tier (only a one-time $50 signup credit is offered)
Pricing & procurement
- Pricing model: Usage-based [2]
- Published pricing: ✓ Yes [3]
- Free tier: ✗ No [4]
- Free tier details: $50 one-time credit on signup, no credit card required (not a recurring free tier)
- Self-serve signup: ✓ Yes [5]
- Requires sales call: ✗ No
- Enterprise plan: ✓ Yes [6]
| Plan | Item | Unit | Price (USD) |
|---|---|---|---|
| Pay As You Go | Batch transcription (Universal-3 Pro model) | hour of audio | $0.21 |
| Pay As You Go | Batch transcription (Universal-2 model) | hour of audio | $0.15 |
| Pay As You Go | Streaming transcription (Universal-3 Pro Streaming / u3-rt-pro) | hour of audio | $0.45 |
| Pay As You Go | Streaming transcription (Universal-Streaming English) | hour of audio | $0.15 |
| Pay As You Go | Streaming transcription (Universal-Streaming Multilingual) | hour of audio | $0.15 |
| Pay As You Go | Speaker Identification add-on | hour of audio | $0.02 |
| Pay As You Go | Translation add-on | hour of audio | $0.06 |
| Pay As You Go | Custom Formatting add-on | hour of audio | $0.03 |
| Pay As You Go | Entity Detection add-on | hour of audio | $0.08 |
| Pay As You Go | Sentiment Analysis add-on | hour of audio | $0.02 |
| Pay As You Go | Key Phrases / Auto Highlights add-on | hour of audio | $0.01 |
| Pay As You Go | Topic Detection (IAB) add-on | hour of audio | $0.15 |
| Pay As You Go | Medical Mode add-on | hour of audio | $0.15 |
| Pay As You Go | Speaker Diarization (async standard) add-on | hour of audio | $0.02 |
| Pay As You Go | Speaker Diarization (async experimental) add-on | hour of audio | $0.065 |
| Pay As You Go | Speaker Diarization (streaming) add-on | hour of audio | $0.12 |
| Pay As You Go | Keyterms Prompting add-on | hour of audio | $0.05 |
| Pay As You Go | General Prompting (Beta) add-on (U3 Pro only) | hour of audio | $0.05 |
| Pay As You Go | Voice Focus add-on (U3 Pro Streaming only) | hour of audio | $0.10 |
| Pay As You Go | Profanity Filtering add-on | hour of audio | $0.01 |
| Pay As You Go | PII Audio Redaction add-on | hour of audio | $0.05 |
| Pay As You Go | PII Text Redaction add-on | hour of audio | $0.08 |
| Pay As You Go | Content Moderation add-on | hour of audio | $0.15 |
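The per-hour prices in the table can be combined into a rough cost estimate. The sketch below assumes that add-on prices are simply added to the base model rate for the same hours of audio; the profile does not state how add-ons are billed, so treat that as an assumption. The dictionary keys and function names are hypothetical labels chosen for this example, and only a few rows of the table are included.

```python
from decimal import Decimal

# Per-hour prices in USD, copied from the Pay As You Go table.
# The keys are hypothetical labels for this sketch, not API identifiers.
BASE_RATES = {
    "universal-3-pro": Decimal("0.21"),
    "universal-2": Decimal("0.15"),
}
ADDON_RATES = {
    "speaker_diarization": Decimal("0.02"),  # async standard
    "sentiment_analysis": Decimal("0.02"),
    "entity_detection": Decimal("0.08"),
    "pii_text_redaction": Decimal("0.08"),
}


def hourly_rate(model, addons=()):
    """Combined USD rate per hour, assuming add-ons are additive."""
    return BASE_RATES[model] + sum(
        (ADDON_RATES[name] for name in addons), Decimal("0")
    )


def estimate_cost(hours, model, addons=()):
    """Estimated USD cost for `hours` of batch audio."""
    return Decimal(str(hours)) * hourly_rate(model, addons)


def credit_hours(credit_usd, model, addons=()):
    """Hours of audio a one-time credit would cover at the combined rate."""
    return Decimal(str(credit_usd)) / hourly_rate(model, addons)
```

Under this additive assumption, 100 hours on Universal-2 with standard diarization and sentiment analysis comes to 100 × (0.15 + 0.02 + 0.02) = $19.00, and the $50 signup credit would cover roughly 333 hours of plain Universal-2 batch transcription.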
Capabilities
- Supported actions: transcribe_batch, transcribe_streaming, speaker_diarization, language_detection, word_timestamps, sentiment_analysis, entity_detection, topic_detection, pii_redaction, pii_audio_redaction, profanity_filtering, content_moderation, multichannel_transcription, custom_vocabulary, keyterms_prompting, webhook_notifications, medical_mode, automatic_punctuation, code_switching_detection, translation, voice_focus_noise_reduction, general_prompting [7]
- Regions: United States, European Union (Dublin, Ireland) [8]
- Languages: English, Spanish, French, German, Indonesian, Italian, Japanese, Dutch, Polish, Portuguese, Russian, Swedish, Turkish, Ukrainian, Catalan, Arabic, Azerbaijani, Bulgarian, Bosnian, Mandarin Chinese, Czech, Danish, Greek, Estonian, Finnish, Galician, Hebrew, Hindi, Croatian, Hungarian, Korean, Macedonian, Malay, Norwegian, Romanian, Slovak, Swiss German, Tagalog, Thai, Urdu, Vietnamese, Afrikaans, Belarusian, Welsh, Persian, Armenian, Icelandic, Kazakh, Lithuanian, Latvian, Maori, Marathi, Slovenian, Swahili, Tamil, Amharic, Assamese, Bengali, Gujarati, Hausa, Javanese, Georgian, Khmer, Kannada, Luxembourgish, Lingala, Lao, Malayalam, Mongolian, Maltese, Burmese, Nepali, Occitan, Punjabi, Pashto, Sindhi, Shona, Somali, Serbian, Telugu, Tajik, Uzbek, Yoruba [9]
- Input types: audio/mpeg (MP3), audio/wav (WAV), audio/aac (AAC), audio/flac (FLAC), audio/ogg (OGG, OGA, MOGG), audio/opus (OPUS), audio/aiff (AIF, AIFF), audio/alac (ALAC), audio/amr (AMR), audio/mp4 (M4A, M4B, M4P, M4R), audio/ac3 (AC3), audio/ape (APE), audio/dss (DSS), audio/flv (FLV), audio/wma (WMA), audio/wv (WV), audio/qcp (QCP), audio/tta (TTA), audio/voc (VOC), video/mp4 (MP4, M4V), video/webm (WEBM), video/quicktime (MOV), video/ts (TS, MTS, M2TS), video/mp2 (MP2), video/mxf (MXF), file URL, local file upload, WebSocket (streaming), live audio stream [10]
- Output types: JSON transcript with word-level timestamps, confidence scores, speaker-labeled utterances, sentiment analysis results, entity detection results, topic/IAB category results, SRT captions, VTT captions, redacted audio (beep or silence)
- Webhooks: ✓ Yes [11]
- Sandbox / test mode: ✗ No [12]
- SDK languages: Python, Node.js (actively maintained); C#/.NET, Java (discontinued April 2025, see Known restrictions) [13]
- MCP server: ✓ Yes [14]
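As an illustration of how the capabilities listed here map onto a request, the sketch below builds (but does not send) a batch transcription request using only the Python standard library. The base URL, the v2 version prefix, and API-key authentication come from this profile's change history, and the /v2/transcript path from its rate-limit notes. The JSON field names `audio_url`, `speaker_labels`, and `webhook_url`, and the convention of passing the raw key in the `Authorization` header, are not stated in this profile; they are assumptions recalled from AssemblyAI's public API reference and should be verified there. `build_transcript_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Base URL and version prefix as recorded in this profile's change history.
BASE_URL = "https://api.assemblyai.com"


def build_transcript_request(api_key, audio_url, webhook_url=None,
                             speaker_labels=False):
    """Build a POST request for the v2 transcript endpoint without sending it.

    Field names and the header convention are assumptions; check them
    against the vendor's API reference before relying on this.
    """
    payload = {"audio_url": audio_url}
    if speaker_labels:
        payload["speaker_labels"] = True  # assumed flag for speaker diarization
    if webhook_url:
        payload["webhook_url"] = webhook_url  # assumed completion-callback field
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,  # assumed: raw API key, no "Bearer" prefix
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job. Keeping construction separate from sending makes the payload easy to inspect, which is useful because the profile lists no sandbox or test mode.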
Trust & compliance
- SOC 2: Type II [15]
- HIPAA: ✓ Yes [16]
- GDPR: ✓ Yes [17]
- ISO 27001: ✓ Yes [18]
- PCI DSS: ✓ Yes [19]
- Published SLA: ✓ Yes [20]
- Rate limits: streaming, 5 new streams per minute on the free tier and 100 new streams per minute on pay-as-you-go; general API, 20,000 requests per 5 minutes. Size limits: 5GB maximum file size and 10 hours maximum audio duration for /v2/transcript; 2.2GB maximum local file upload via /v2/upload. [21]
- Known restrictions: maximum file size for the transcription endpoint is 5GB; maximum audio duration is 10 hours; maximum local file upload size is 2.2GB; free-tier streaming concurrency is 5 new streams/minute; pay-as-you-go streaming concurrency is 100 new streams/minute; Java, C#/.NET, Go, and Ruby SDKs were discontinued in April 2025, so only the Python and JavaScript SDKs are actively maintained; the Summarization and Auto Chapters features are deprecated (migrate to LLM Gateway)
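The size and duration limits listed here lend themselves to a client-side check before any upload is attempted. The sketch below is one way to do that. It assumes decimal gigabytes (10^9 bytes), since the profile does not say whether "GB" means GB or GiB, and `check_limits` is a hypothetical helper name.

```python
# Assumption: decimal gigabytes; the profile does not specify GB vs GiB.
GB = 1000 ** 3

MAX_TRANSCRIPT_BYTES = 5 * GB       # 5GB limit for /v2/transcript
MAX_UPLOAD_BYTES = 22 * GB // 10    # 2.2GB limit for /v2/upload
MAX_DURATION_SECONDS = 10 * 60 * 60  # 10 hours of audio


def check_limits(size_bytes, duration_seconds, local_upload):
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if local_upload and size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the 2.2GB local upload limit")
    if size_bytes > MAX_TRANSCRIPT_BYTES:
        problems.append("file exceeds the 5GB transcription limit")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio exceeds the 10-hour duration limit")
    return problems
```

A 3GB file, for instance, would pass when referenced by URL but fail the local upload check, so hosting it and submitting the URL is the route these limits leave open.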
Adoption & maturity
- Launched: 2017-01-01
- Notable customers: Zoom, Spotify, Veed, CallRail, Dovetail, Calabrio, Kapwing, Jiminny, Grain, Supernormal
Other Speech-to-Text & Transcription APIs
ElevenLabs Scribe (Speech to Text)
"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."
Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Amazon Transcribe
"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
Speechmatics
"Low-latency speech-to-text for multilingual, multi-speaker conversations."
References
- [1] Description: assemblyai.com
- [2] Pricing model: assemblyai.com
- [3] Published pricing: assemblyai.com
- [4] Free tier: assemblyai.com
- [5] Self-serve signup: assemblyai.com
- [6] Enterprise plan: assemblyai.com
- [7] Supported actions: assemblyai.com
- [8] Regions: assemblyai.com
- [9] Languages: assemblyai.com · assemblyai.com
- [10] Input types: support.assemblyai.com
- [11] Webhooks: assemblyai.com
- [12] Sandbox: assemblyai.com
- [13] SDK languages: github.com · assemblyai.com · github.com
- [14] MCP server: assemblyai.com · assemblyai.com
- [15] SOC 2: assemblyai.com
- [16] HIPAA: assemblyai.com
- [17] GDPR: assemblyai.com · assemblyai.com
- [18] ISO 27001: assemblyai.com
- [19] PCI DSS: assemblyai.com
- [20] Published SLA: assemblyai.com
- [21] Rate limits: assemblyai.com · assemblyai.com
Change history
- 2026-06-21 Capabilities: {} → {"medical":true,"translation":true,"pii_redaction":true,"real_time_streaming":t…
- 2026-06-21 Summary Md: (none) → AssemblyAI is a voice AI platform providing speech-to-text transcription, speak…
- 2026-06-21 Score Pricing Transparency: (none) → 85
- 2026-06-21 Score Setup Speed: (none) → 60
- 2026-06-21 Score Docs Quality: (none) → 75
- 2026-06-21 Score Procurement Friction: (none) → 85
- 2026-06-21 Score Trust Readiness: (none) → 100
- 2026-06-21 Best For: (none) → Regulated or enterprise workloads - compliance attestations and an enterprise p…
- 2026-06-21 Avoid If: (none) → You want to try it free before paying
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Score Agent Friendliness: (none) → 70
- 2026-06-21 Llms Txt URL: (none) → https://www.assemblyai.com/llms.txt
- 2026-06-21 Llms Txt Present: (none) → Yes
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 Has Structured Data: (none) → No
- 2026-06-21 Robots Allows Agents: (none) → Yes
- 2026-06-21 Openapi Spec URL: (none) → https://www.assemblyai.com/openapi.json
- 2026-06-21 API Reference URL: (none) → https://www.assemblyai.com/docs
- 2026-06-21 Status Page URL: (none) → https://status.assemblyai.com
- 2026-06-21 Changelog URL: (none) → https://www.assemblyai.com/changelog
- 2026-06-21 Docs URL: (none) → https://www.assemblyai.com/docs
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to Yes
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 ISO 27001: set to Yes
- 2026-06-21 PCI DSS: set to Yes
- 2026-06-21 SLA Published: set to Yes
- 2026-06-21 SLA URL: set to https://www.assemblyai.com/security
- 2026-06-21 Data Retention Policy URL: set to https://www.assemblyai.com/legal/privacy-policy
- 2026-06-21 Documented Rate Limits: set to Free tier: 5 new streams per minute (streaming); Pay-as-you-go: 100 new streams…
- 2026-06-21 Rate Limit Requests: set to 20000
- 2026-06-21 Rate Limit Window: set to 5 minutes
- 2026-06-21 Known Restrictions: set to Maximum file size for transcription endpoint: 5GB, Maximum audio duration: 10 h…
- 2026-06-21 Auth Methods: set to api_key
- 2026-06-21 Auth Docs URL: set to https://www.assemblyai.com/docs/api-reference/overview
- 2026-06-21 API Style: set to rest
- 2026-06-21 Base URL: set to https://api.assemblyai.com
- 2026-06-21 API Version: set to v2
- 2026-06-21 Versioning Scheme: set to url
- 2026-06-21 Stability: set to ga
- 2026-06-21 Deprecation Policy URL: set to https://www.assemblyai.com/changelog
- 2026-06-21 MCP URL: set to https://assemblyai.com/docs/mcp
- 2026-06-21 Quickstart URL: set to https://www.assemblyai.com/docs/getting-started
- 2026-06-21 Idempotency Supported: set to No
- 2026-06-21 Error Format: set to vendor-specific
- 2026-06-21 Webhook Events URL: set to https://www.assemblyai.com/docs/getting-started/webhooks
- 2026-06-21 Requires Verification: set to No
- 2026-06-21 Starting Price Usd: set to 0.0025