Gladia

"End-to-end audio infrastructure to record, transcribe and enrich audio through a single API" [1]

Speech-to-Text & Transcription APIs

www.gladia.io · By Gladia · Last verified 2026-06-21 · Source confidence: high

Gladia is an audio infrastructure API covering batch and real-time speech-to-text transcription, speaker diarization, translation, summarization, sentiment and emotion analysis, and named entity recognition. It targets voice agents, contact centers, meeting assistants, and media captioning workflows. Pricing is usage-based, starting at $0.61 per hour of audio for async transcription and $0.75 per hour for real-time transcription, with a free tier of 10 hours per month and no sales call required to start. The API is REST-based, with TypeScript/JavaScript and Python SDKs, webhooks, and an MCP server, and is hosted in the EU (France, default) and the US. Gladia reports SOC 2 Type II attestation and HIPAA and GDPR compliance, and counts Aircall, Citibank, Samsung, Oracle, and Microsoft among its customers.

Best for / Avoid if

Best for: Prototypes and side projects - free to start, no sales call; Regulated or enterprise workloads - compliance attestations and an enterprise plan; AI agents and automation - an agent-ready surface (MCP / llms.txt)

Pricing & procurement

Pricing model: Usage-based [2]
Published pricing: Yes [3]
Free tier: Yes [4]
Free tier details: 10 hours free monthly on the Starter (Pay-as-you-go) plan; no credit card required; free plan limited to 3 concurrent async transcriptions and 1 live session
Self-serve signup: Yes [5]
Requires sales call: No
Enterprise plan: Yes [6]

Published prices
Plan	Item	Per	Amount (USD)
Starter	Async transcription	hour of audio	$0.61
Starter	Real-time (streaming) transcription	hour of audio	$0.75
Starter	Free monthly included audio	10 hours per month	$0
Growth	Async transcription (committed volume, as low as)	hour of audio	$0.20
Growth	Real-time (streaming) transcription (committed volume, as low as)	hour of audio	$0.25
Enterprise	Async and real-time transcription	-	Custom pricing
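As a rough illustration of how the Starter figures combine, the sketch below estimates a monthly bill from hours of audio. The function name `estimate_starter_cost` is hypothetical, and the order in which the 10 free hours are applied (async usage first, then real-time) is an assumption made for this example, not something the published pricing states.

```python
from decimal import Decimal

# Starter-plan figures copied from the published price table.
ASYNC_RATE = Decimal("0.61")     # USD per hour of audio, async
REALTIME_RATE = Decimal("0.75")  # USD per hour of audio, real-time
FREE_HOURS = Decimal("10")       # included hours per month


def estimate_starter_cost(async_hours, realtime_hours=0):
    """Estimate a monthly Starter bill in USD.

    Assumption (not from the vendor): free hours offset async usage
    first, and any remainder offsets real-time usage.
    """
    async_hours = Decimal(str(async_hours))
    realtime_hours = Decimal(str(realtime_hours))

    billable_async = max(async_hours - FREE_HOURS, Decimal("0"))
    free_left = max(FREE_HOURS - async_hours, Decimal("0"))
    billable_realtime = max(realtime_hours - free_left, Decimal("0"))

    total = billable_async * ASYNC_RATE + billable_realtime * REALTIME_RATE
    return total.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # 100 async hours: 90 billable hours at $0.61.
    print(estimate_starter_cost(100))
```

Under these assumptions, usage at or below 10 hours comes out to zero, which matches the free tier description; actual invoices may differ.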

Capabilities

Real-time streaming
Speaker diarization
Speech translation
PII redaction

Supported actions: transcribe_batch, transcribe_streaming, speaker_diarization, language_detection, code_switching, word_timestamps, translation, summarization, chapterization, sentiment_analysis, emotion_analysis, named_entity_recognition, pii_redaction, subtitle_generation_srt, subtitle_generation_vtt, custom_vocabulary, audio_to_llm_prompting, file_upload, url_input, webhook_callbacks, job_status_polling [7]
Regions: EU (France, default), US [8]
Languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Wolof, Yiddish, Yoruba. Vendor docs list 139 languages in total; the enumeration above shows the core 100 (full list at docs.gladia.io/chapters/limits-and-specifications/languages). [9]
Input types: audio/aac, audio/ac3, audio/eac3, audio/flac, audio/m4a, audio/mp2, audio/mp3, audio/ogg, audio/opus, audio/wav, video/3g2, video/3gp, video/avi, video/flv, video/m4v, video/mkv, video/mov, video/mp4, video/wmv, file upload, file URL, live WebSocket stream, online video service URL (TikTok, Instagram, Facebook, Vimeo, Dailymotion, LinkedIn) [10]
Output types: JSON, SRT subtitles, VTT subtitles, word timestamps, speaker-labeled transcripts, summaries, chapters, named entities, sentiment scores, translated text
Webhooks: Yes [11]
Sandbox / test mode: No [12]
SDK languages: TypeScript/JavaScript, Python [13]
MCP server: Yes [14]

Trust & compliance

SOC 2: SOC 2 Type II [15]
HIPAA: Yes [16]
GDPR: Yes [17]
ISO 27001: No [18]
PCI DSS: No [19]
Published SLA: No [20]
Rate limits: Free plan: 10 hours/month, 3 concurrent async transcriptions, 1 concurrent live session. Paid plan: 25 concurrent async transcriptions, 30 concurrent live sessions, queue of up to 300 async requests. Enterprise: on-demand/customizable. Single live session max duration: 3 hours. Pre-recorded max file duration: 135 minutes (Enterprise: 4h15m). Max file size: 1000 MB. [21]
Known restrictions: Maximum pre-recorded audio duration: 135 minutes per request (up to 4h15m on Enterprise), Maximum file size: 1,000 MB, Real-time WebSocket session limited to 3 hours, Free plan capped at 10 hours/month total usage, Free plan data may be used for model training; paid plans opt out by default, Wolof language does not support automatic language detection/code-switching [22]
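The pre-recorded limits above lend themselves to a client-side check before upload. The following is a minimal sketch: the helper name `check_prerecorded_limits` is hypothetical, and it assumes the caller already knows the file's duration in minutes and size in MB. It is not part of any Gladia SDK.

```python
# Limits copied from the rate-limit and restriction fields in this profile.
MAX_FILE_MB = 1000
MAX_MINUTES_STANDARD = 135
MAX_MINUTES_ENTERPRISE = 4 * 60 + 15  # 4h15m = 255 minutes


def check_prerecorded_limits(duration_minutes, size_mb, enterprise=False):
    """Return a list of problems; an empty list means the file is within limits."""
    max_minutes = MAX_MINUTES_ENTERPRISE if enterprise else MAX_MINUTES_STANDARD
    problems = []
    if duration_minutes > max_minutes:
        problems.append(
            f"duration {duration_minutes} min exceeds the {max_minutes} min limit"
        )
    if size_mb > MAX_FILE_MB:
        problems.append(f"size {size_mb} MB exceeds the {MAX_FILE_MB} MB limit")
    return problems


if __name__ == "__main__":
    # A 200-minute file is too long on non-Enterprise plans.
    for problem in check_prerecorded_limits(200, 500):
        print(problem)
```

A file that fails the duration check could be split into shorter segments before submission; how to split it is left to the caller.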

Developer surface

Docs rendering: static · llms.txt present

Integration

API style: rest
Base URL: https://api.gladia.io
Version: v2
Versioning: url
Stability: ga
Auth methods: api_key
Idempotency keys: No
Error format: vendor-specific
Webhook signing: hmac_sha256
Rate limit: 25 concurrent async transcriptions (paid plan)
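The profile lists webhook signing as hmac_sha256. As a generic illustration of that technique, the sketch below recomputes an HMAC-SHA256 digest over the raw request body and compares it in constant time. The hex encoding of the signature and the way the secret and signature reach the function are assumptions for this example; the vendor's webhook documentation defines the actual header name and format.

```python
import hashlib
import hmac


def verify_webhook_signature(secret: bytes, raw_body: bytes, received_hex: str) -> bool:
    """Check an HMAC-SHA256 signature against the raw webhook body.

    Assumption: the sender transmits the digest as a hex string.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    received = received_hex.strip().lower()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))


if __name__ == "__main__":
    secret = b"example-secret"  # hypothetical shared secret
    body = b'{"event":"transcription.success"}'  # hypothetical payload
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    print(verify_webhook_signature(secret, body, signature))
```

The comparison must run on the exact bytes received, before any JSON parsing or re-serialization, since re-encoding can change whitespace and invalidate the digest.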

SDKs

TypeScript/JavaScript: @gladiaio/sdk
Python: gladiaio-sdk

Adoption & maturity

Launched: 2022-01-01
GA: 2023-06-01
Notable customers: Aircall, Attention, Recall, VEED, Mojo, Livestorm, Daily, Carv, Citibank, Samsung, Oracle, Microsoft, SoftBank

Other Speech-to-Text & Transcription APIs

ElevenLabs Scribe (Speech to Text)
"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."
Hybrid · free tier · public pricing · self-serve
Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Usage · free tier · public pricing · self-serve
Amazon Transcribe
"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."
Usage · free tier · public pricing · self-serve
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
Usage · free tier · public pricing · self-serve
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
Usage · free tier · public pricing · self-serve
AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."
Usage · public pricing · self-serve


References

Each field above carries a numbered source, listed here.

[1] Description: gladia.io
[2] Pricing model: gladia.io · gladia.io
[3] Published pricing: gladia.io
[4] Free tier: gladia.io · docs.gladia.io
[5] Self-serve signup: gladia.io
[6] Enterprise plan: gladia.io
[7] Supported actions: docs.gladia.io
[8] Regions: gladia.io · gladia.io
[9] Languages: docs.gladia.io
[10] Input types: docs.gladia.io
[11] Webhooks: docs.gladia.io
[12] Sandbox: gladia.io
[13] SDK languages: docs.gladia.io
[14] MCP server: github.com
[15] SOC 2: gladia.io · gladia.io
[16] HIPAA: gladia.io · gladia.io
[17] GDPR: gladia.io · gladia.io
[18] ISO 27001: gladia.io
[19] PCI DSS: gladia.io
[20] Published SLA: gladia.io · gladia.io
[21] Rate limits: docs.gladia.io · docs.gladia.io
[22] Known restrictions: docs.gladia.io · gladia.io

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

2026-06-21 Capabilities: {} → {"translation":true,"pii_redaction":true,"real_time_streaming":true,"speaker_di…
2026-06-21 Summary Md: (none) → Gladia is an audio infrastructure API covering batch and real-time speech-to-te…
2026-06-21 Score Pricing Transparency: (none) → 100
2026-06-21 Score Setup Speed: (none) → 80
2026-06-21 Score Docs Quality: (none) → 35
2026-06-21 Score Procurement Friction: (none) → 100
2026-06-21 Score Trust Readiness: (none) → 55
2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
2026-06-21 Score Agent Friendliness: (none) → 55
2026-06-21 Llms Txt Present: (none) → Yes
2026-06-21 Rendering: (none) → static
2026-06-21 Has Structured Data: (none) → Yes
2026-06-21 Robots Allows Agents: (none) → No
2026-06-21 Status Page URL: (none) → https://status.gladia.io
2026-06-21 Changelog URL: (none) → https://www.gladia.io/changelog
2026-06-21 Docs URL: (none) → https://docs.gladia.io/chapters/introduction
2026-06-21 Llms Txt URL: (none) → https://www.gladia.io/llms.txt
2026-06-21 MCP Server Available: set to Yes
2026-06-21 Pricing Model: set to usage_based
2026-06-21 Has Published Pricing: set to Yes
2026-06-21 Free Tier Available: set to Yes
2026-06-21 Free Tier Details: set to 10 hours free monthly on the Starter (Pay-as-you-go) plan; no credit card requi…
2026-06-21 Self Serve Signup: set to Yes
2026-06-21 Requires Sales Call: set to No
2026-06-21 Enterprise Plan Available: set to Yes
2026-06-21 SOC 2: set to type_2
2026-06-21 HIPAA: set to Yes
2026-06-21 GDPR: set to Yes
2026-06-21 Data Retention Policy URL: set to https://www.gladia.io/security
2026-06-21 Documented Rate Limits: set to Free plan: 10 hours/month, 3 concurrent async transcriptions, 1 concurrent live…
2026-06-21 Rate Limit Requests: set to 25
2026-06-21 Rate Limit Window: set to concurrent
2026-06-21 Known Restrictions: set to Maximum pre-recorded audio duration: 135 minutes per request (up to 4h15m on En…
2026-06-21 Auth Methods: set to api_key
2026-06-21 Auth Docs URL: set to https://docs.gladia.io/api-reference/authentication
2026-06-21 API Style: set to rest
2026-06-21 Base URL: set to https://api.gladia.io
2026-06-21 API Version: set to v2
2026-06-21 Versioning Scheme: set to url
2026-06-21 Stability: set to ga
2026-06-21 Deprecation Policy URL: set to https://docs.gladia.io/chapters/get-started/pages/migration-from-v1
2026-06-21 MCP URL: set to https://github.com/gladiaio/mcp-gladia
2026-06-21 Quickstart URL: set to https://docs.gladia.io/chapters/pre-recorded-stt/quickstart
2026-06-21 Idempotency Supported: set to No
2026-06-21 Error Format: set to vendor-specific
2026-06-21 Webhook Signing: set to hmac_sha256
2026-06-21 Webhook Events URL: set to https://support.gladia.io/article/how-to-set-up-webhooks-for-real-time-notifica…
2026-06-21 SLA Published: set to No
2026-06-21 Starting Price Usd: set to 0.61

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/gladia \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/gladia/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
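For callers who prefer Python over curl, the same correction request can be assembled with the standard library. This is a sketch only: the helper name `build_suggestion` is hypothetical, the endpoint and JSON shape mirror the curl command above, and the function builds the request without sending it.

```python
import json
import urllib.request


def build_suggestion(slug, field, value, source_url, excerpt, note):
    """Build (but do not send) a field-correction request for a profile."""
    payload = {
        "value": value,
        "citations": [{"url": source_url, "excerpt": excerpt}],
        "note": note,
    }
    return urllib.request.Request(
        f"https://apio.sh/api/suggest/{slug}/{field}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # FIELD is a placeholder, as in the curl example; use a key from the Agent JSON.
    req = build_suggestion(
        "gladia",
        "FIELD",
        "corrected value",
        "https://source.example/page",
        "supporting quote",
        "what changed and why",
    )
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would submit it; submissions are rate-limited and reviewed before going live, as noted above.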

