Gladia
"End-to-end audio infrastructure to record, transcribe and enrich audio through a single API" [1]
Gladia is an audio infrastructure API for batch and real-time speech-to-text transcription, with speaker diarization, translation, summarization, sentiment and emotion analysis, and named entity recognition. It targets voice agents, contact centers, meeting assistants, and media captioning workflows. Pricing is usage-based, starting at $0.61 per hour of audio for async transcription, with a free tier of 10 hours per month and no sales call required to start. The API is REST-based, with TypeScript/JavaScript and Python SDKs, webhooks, and an MCP server, and is hosted in EU (France, default) and US regions. Gladia reports SOC 2 Type II, HIPAA, and GDPR compliance, and lists Aircall, Citibank, Samsung, Oracle, and Microsoft among its customers.
Best for / Avoid if
Best for:
- Prototypes and side projects: free to start, no sales call
- Regulated or enterprise workloads: compliance attestations and an enterprise plan
- AI agents and automation: an agent-ready surface (MCP / llms.txt)
Pricing & procurement
- Pricing model: Usage-based [2]
- Published pricing: ✓ Yes [3]
- Free tier: ✓ Yes [4]
- Free tier details: 10 hours free monthly on the Starter (Pay-as-you-go) plan; no credit card required; free plan limited to 3 concurrent async transcriptions and 1 live session
- Self-serve signup: ✓ Yes [5]
- Requires sales call: ✗ No
- Enterprise plan: ✓ Yes [6]
| Plan | Item | Per | Amount | Source |
|---|---|---|---|---|
| Starter | Async transcription | hour of audio | $0.61 | source |
| Starter | Real-time (streaming) transcription | hour of audio | $0.75 | source |
| Starter | Free monthly included audio | 10 hours per month | $0 | source |
| Growth | Async transcription (committed volume, as low as) | hour of audio | $0.20 | source |
| Growth | Real-time (streaming) transcription (committed volume, as low as) | hour of audio | $0.25 | source |
| Enterprise | Async and real-time transcription | - | Custom pricing | source |
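As a rough illustration of how the Starter rates in the table combine with the free allowance, the sketch below estimates a monthly bill. It assumes the 10 free hours are deducted from async usage first and then from real-time usage; the source does not say how the allowance is split, so that ordering is hypothetical, as are the function and constant names.

```python
# Starter-plan list prices from the pricing table (USD per hour of audio).
ASYNC_RATE = 0.61
REALTIME_RATE = 0.75
FREE_HOURS = 10.0  # free monthly allowance on the Starter plan


def estimate_starter_cost(async_hours: float, realtime_hours: float) -> float:
    """Estimate a monthly Starter bill in USD.

    Assumption (not confirmed by the source): free hours are applied to
    async usage first, then any remainder to real-time usage.
    """
    if async_hours < 0 or realtime_hours < 0:
        raise ValueError("hours must be non-negative")
    free_left = FREE_HOURS
    free_async = min(async_hours, free_left)
    free_left -= free_async
    free_realtime = min(realtime_hours, free_left)
    billable_async = async_hours - free_async
    billable_realtime = realtime_hours - free_realtime
    total = billable_async * ASYNC_RATE + billable_realtime * REALTIME_RATE
    return round(total, 2)
```

Under these assumptions, 110 async hours in a month leaves 100 billable hours, so `estimate_starter_cost(110, 0)` comes to about $61.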
Capabilities
- Supported actions: transcribe_batch, transcribe_streaming, speaker_diarization, language_detection, code_switching, word_timestamps, translation, summarization, chapterization, sentiment_analysis, emotion_analysis, named_entity_recognition, pii_redaction, subtitle_generation_srt, subtitle_generation_vtt, custom_vocabulary, audio_to_llm_prompting, file_upload, url_input, webhook_callbacks, job_status_polling [7]
- Regions: EU (France, default), US [8]
- Languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Wolof, Yiddish, Yoruba [9]. The vendor docs list 139 languages in total; the enumeration here shows the core 100 (full list at docs.gladia.io/chapters/limits-and-specifications/languages).
- Input types: audio/aac, audio/ac3, audio/eac3, audio/flac, audio/m4a, audio/mp2, audio/mp3, audio/ogg, audio/opus, audio/wav, video/3g2, video/3gp, video/avi, video/flv, video/m4v, video/mkv, video/mov, video/mp4, video/wmv; accepted via file upload, file URL, live WebSocket stream, or online video service URL (TikTok, Instagram, Facebook, Vimeo, Dailymotion, LinkedIn) [10]
- Output types: JSON, SRT subtitles, VTT subtitles, word timestamps, speaker-labeled transcripts, summaries, chapters, named entities, sentiment scores, translated text
- Webhooks: ✓ Yes [11]
- Sandbox / test mode: ✗ No [12]
- SDK languages: TypeScript/JavaScript, Python [13]
- MCP server: ✓ Yes [14]
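To make the batch-by-URL capability a little more concrete, here is a minimal sketch that builds, but does not send, a transcription request against the base URL and version recorded in this profile (https://api.gladia.io, v2, API-key auth). The `pre-recorded` path, the `x-gladia-key` header, and the `audio_url` and `diarization` field names are assumptions that are not stated in this profile; check them against the vendor API reference before use.

```python
import json
import urllib.request

BASE_URL = "https://api.gladia.io"  # base URL recorded in this profile
API_VERSION = "v2"                  # URL-based versioning, per this profile


def build_transcription_request(
    api_key: str, audio_url: str, diarization: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a batch transcription request.

    Assumptions to verify against the vendor API reference: the
    "pre-recorded" path, the "x-gladia-key" header, and the JSON field
    names "audio_url" and "diarization".
    """
    payload = {"audio_url": audio_url, "diarization": diarization}
    return urllib.request.Request(
        url=f"{BASE_URL}/{API_VERSION}/pre-recorded",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-gladia-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the job. Since this profile records idempotency as unsupported, a blind retry of the POST may create a duplicate job, so a caller would probably want to track job IDs itself.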
Trust & compliance
- SOC 2: Type II [15]
- HIPAA: ✓ Yes [16]
- GDPR: ✓ Yes [17]
- ISO 27001: ✗ No [18]
- PCI DSS: ✗ No [19]
- Published SLA: ✗ No [20]
- Rate limits [21]:
  - Free plan: 10 hours/month, 3 concurrent async transcriptions, 1 concurrent live session
  - Paid plan: 25 concurrent async transcriptions, 30 concurrent live sessions, queue of up to 300 async requests
  - Enterprise: on-demand/customizable
  - Single live session max duration: 3 hours
  - Pre-recorded max file duration: 135 minutes (Enterprise: 4h15m)
  - Max file size: 1000 MB
- Known restrictions [22]:
  - Maximum pre-recorded audio duration: 135 minutes per request (up to 4h15m on Enterprise)
  - Maximum file size: 1,000 MB
  - Real-time WebSocket session limited to 3 hours
  - Free plan capped at 10 hours/month total usage
  - Free plan data may be used for model training; paid plans opt out by default
  - Wolof language does not support automatic language detection/code-switching
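The file limits listed above can be checked client-side before uploading. The sketch below is one way to do that; the plan keys (`free`, `paid`, `enterprise`) and all names are hypothetical labels, and the numbers are copied from the limits in this profile (135 minutes, 4h15m = 255 minutes on Enterprise, 1000 MB).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileLimits:
    max_duration_min: float  # longest pre-recorded file, in minutes
    max_size_mb: float       # largest file, in megabytes


# Values taken from the limits listed in this profile; keys are illustrative.
LIMITS = {
    "free": FileLimits(135, 1000),
    "paid": FileLimits(135, 1000),
    "enterprise": FileLimits(255, 1000),  # 4h15m = 255 minutes
}


def check_file(plan: str, duration_min: float, size_mb: float) -> List[str]:
    """Return a list of limit violations (empty if the file fits)."""
    limits = LIMITS[plan]
    problems = []
    if duration_min > limits.max_duration_min:
        problems.append(
            f"duration {duration_min:g} min exceeds {limits.max_duration_min:g} min"
        )
    if size_mb > limits.max_size_mb:
        problems.append(
            f"size {size_mb:g} MB exceeds {limits.max_size_mb:g} MB"
        )
    return problems
```

A file that fails the check would need to be split or compressed before submission, for instance a 200-minute recording on a paid plan.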
Developer surface
Integration
Adoption & maturity
- Launched: 2022-01-01
- GA: 2023-06-01
- Notable customers: Aircall, Attention, Recall, VEED, Mojo, Livestorm, Daily, Carv, Citibank, Samsung, Oracle, Microsoft, SoftBank
Other Speech-to-Text & Transcription APIs
ElevenLabs Scribe (Speech to Text)
"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."
Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Amazon Transcribe
"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."
References
- [1] Description: gladia.io
- [2] Pricing model: gladia.io · gladia.io
- [3] Published pricing: gladia.io
- [4] Free tier: gladia.io · docs.gladia.io
- [5] Self-serve signup: gladia.io
- [6] Enterprise plan: gladia.io
- [7] Supported actions: docs.gladia.io
- [8] Regions: gladia.io · gladia.io
- [9] Languages: docs.gladia.io
- [10] Input types: docs.gladia.io
- [11] Webhooks: docs.gladia.io
- [12] Sandbox: gladia.io
- [13] SDK languages: docs.gladia.io
- [14] MCP server: github.com
- [15] SOC 2: gladia.io · gladia.io
- [16] HIPAA: gladia.io · gladia.io
- [17] GDPR: gladia.io · gladia.io
- [18] ISO 27001: gladia.io
- [19] PCI DSS: gladia.io
- [20] Published SLA: gladia.io · gladia.io
- [21] Rate limits: docs.gladia.io · docs.gladia.io
- [22] Known restrictions: docs.gladia.io · gladia.io
Change history
- 2026-06-21 Capabilities: {} → {"translation":true,"pii_redaction":true,"real_time_streaming":true,"speaker_di…
- 2026-06-21 Summary Md: (none) → Gladia is an audio infrastructure API covering batch and real-time speech-to-te…
- 2026-06-21 Score Pricing Transparency: (none) → 100
- 2026-06-21 Score Setup Speed: (none) → 80
- 2026-06-21 Score Docs Quality: (none) → 35
- 2026-06-21 Score Procurement Friction: (none) → 100
- 2026-06-21 Score Trust Readiness: (none) → 55
- 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Score Agent Friendliness: (none) → 55
- 2026-06-21 Llms Txt Present: (none) → Yes
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 Has Structured Data: (none) → Yes
- 2026-06-21 Robots Allows Agents: (none) → No
- 2026-06-21 Status Page URL: (none) → https://status.gladia.io
- 2026-06-21 Changelog URL: (none) → https://www.gladia.io/changelog
- 2026-06-21 Docs URL: (none) → https://docs.gladia.io/chapters/introduction
- 2026-06-21 Llms Txt URL: (none) → https://www.gladia.io/llms.txt
- 2026-06-21 MCP Server Available: set to Yes
- 2026-06-21 Pricing Model: set to usage_based
- 2026-06-21 Has Published Pricing: set to Yes
- 2026-06-21 Free Tier Available: set to Yes
- 2026-06-21 Free Tier Details: set to 10 hours free monthly on the Starter (Pay-as-you-go) plan; no credit card requi…
- 2026-06-21 Self Serve Signup: set to Yes
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to Yes
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 Data Retention Policy URL: set to https://www.gladia.io/security
- 2026-06-21 Documented Rate Limits: set to Free plan: 10 hours/month, 3 concurrent async transcriptions, 1 concurrent live…
- 2026-06-21 Rate Limit Requests: set to 25
- 2026-06-21 Rate Limit Window: set to concurrent
- 2026-06-21 Known Restrictions: set to Maximum pre-recorded audio duration: 135 minutes per request (up to 4h15m on En…
- 2026-06-21 Auth Methods: set to api_key
- 2026-06-21 Auth Docs URL: set to https://docs.gladia.io/api-reference/authentication
- 2026-06-21 API Style: set to rest
- 2026-06-21 Base URL: set to https://api.gladia.io
- 2026-06-21 API Version: set to v2
- 2026-06-21 Versioning Scheme: set to url
- 2026-06-21 Stability: set to ga
- 2026-06-21 Deprecation Policy URL: set to https://docs.gladia.io/chapters/get-started/pages/migration-from-v1
- 2026-06-21 MCP URL: set to https://github.com/gladiaio/mcp-gladia
- 2026-06-21 Quickstart URL: set to https://docs.gladia.io/chapters/pre-recorded-stt/quickstart
- 2026-06-21 Idempotency Supported: set to No
- 2026-06-21 Error Format: set to vendor-specific
- 2026-06-21 Webhook Signing: set to hmac_sha256
- 2026-06-21 Webhook Events URL: set to https://support.gladia.io/article/how-to-set-up-webhooks-for-real-time-notifica…
- 2026-06-21 SLA Published: set to No
- 2026-06-21 Starting Price Usd: set to 0.61
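The change history records webhook signing as hmac_sha256. A receiver could verify such a signature roughly as follows. This is a generic HMAC-SHA256 check, not the vendor's documented procedure: it assumes the signature is the hex digest of the raw request body under a shared secret, and the header that carries it is not named in this profile.

```python
import hashlib
import hmac


def verify_webhook(secret: bytes, raw_body: bytes, received_signature: str) -> bool:
    """Check an HMAC-SHA256 webhook signature.

    Assumptions (not stated in this profile): the signature is a hex
    digest computed over the raw, unparsed request body.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    received = received_signature.strip().lower()
    # compare_digest runs in constant time to avoid leaking match position.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

The check must run on the bytes exactly as received; re-serializing parsed JSON can change whitespace or key order and invalidate the digest.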
Suggest an edit / leave a review
Leave a review or comment
curl -X POST https://apio.sh/api/feedback/gladia \
-H 'Content-Type: application/json' \
-d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'
Suggest a correction to a field (cite a source)
curl -X POST https://apio.sh/api/suggest/gladia/FIELD \
-H 'Content-Type: application/json' \
-d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'