Groq Speech-to-Text (Whisper)

"Groq API is designed to provide a fast speech-to-text solution, offering OpenAI-compatible endpoints that enable near-instant transcriptions and translations, with the ability to integrate high-quality audio processing into applications at speeds that rival human interaction." [1]

Speech-to-Text & Transcription APIs

groq.com · By Groq · Last verified 2026-06-21 · Source confidence: high

Groq Speech-to-Text runs OpenAI-compatible Whisper endpoints (large-v3 and large-v3-turbo) optimized for speed. It supports batch transcription, word and segment timestamps, language detection, and audio translation to English across dozens of languages. Self-serve pricing starts at $0.04 per hour of audio, and the free tier covers 2,000 requests and 28,800 audio seconds per day. SDKs are available for Python, Node.js, C#, and PHP, along with an MCP server. The platform lists SOC 2 Type II, HIPAA, and GDPR compliance and offers an enterprise plan with higher rate limits.

Best for / Avoid if

Best for: Prototypes and side projects - free to start, no sales call; Regulated or enterprise workloads - compliance attestations and an enterprise plan; AI agents and automation - an agent-ready surface (MCP / llms.txt)

Pricing & procurement

Pricing model: Usage-based [2]
Published pricing: Yes [3]
Free tier: Yes [4]
Free tier details: Free Plan available at no cost with community support and zero-data retention option. Rate limits on free tier: 20 RPM, 2,000 RPD, 7,200 audio seconds/hour (ASH), 28,800 audio seconds/day (ASD) for Whisper models. Max file size 25 MB. No monetary charge; perpetual free access within these limits.
Self-serve signup: Yes
Requires sales call: No
Enterprise plan: Yes [5]

Published prices
Plan	Item	Per	Amount (USD)
Pay As You Go	Whisper Large v3 (whisper-large-v3) transcription	hour of audio transcribed	$0.111
Pay As You Go	Whisper Large v3 Turbo (whisper-large-v3-turbo) transcription	hour of audio transcribed	$0.04
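
As a rough illustration of how the published per-hour prices translate into a bill, the sketch below combines them with the 10-second minimum billable duration listed under Known restrictions. The function name is hypothetical, and linear per-second proration with no rounding is an assumption; actual invoicing may differ.

```python
# Published pay-as-you-go prices from this profile (USD per hour of audio).
PRICE_PER_HOUR_USD = {
    "whisper-large-v3": 0.111,
    "whisper-large-v3-turbo": 0.04,
}

# Minimum billable duration per request, as listed under "Known restrictions".
MIN_BILLABLE_SECONDS = 10.0


def estimate_cost_usd(model: str, durations_seconds: list[float]) -> float:
    """Estimate the charge for a batch of requests, one audio file per request.

    Assumption: billing is prorated linearly per second with no rounding.
    """
    rate = PRICE_PER_HOUR_USD[model]
    # Each request is billed for at least the minimum duration.
    billable = sum(max(d, MIN_BILLABLE_SECONDS) for d in durations_seconds)
    return billable / 3600.0 * rate


if __name__ == "__main__":
    # One hundred 4-second clips are each billed as 10 seconds under this model.
    print(estimate_cost_usd("whisper-large-v3-turbo", [4.0] * 100))
```

Under these assumptions, many very short clips cost more per second of real audio than a few long files, so batching short utterances into one request may reduce the bill.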

Capabilities

Speech translation

Supported actions: transcribe_batch, translation_to_english, word_timestamps, segment_timestamps, language_detection, prompt_guided_transcription [6]
Regions: United States, Global (multi-region)
Languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba [7]
Input types: audio/flac, audio/mp3, audio/mp4, audio/mpeg, audio/mpga, audio/m4a, audio/ogg, audio/wav, audio/webm, file upload, URL (including Base64URL) [8]
Output types: JSON, text, verbose_json (with word and segment timestamps) [9]
Webhooks: No
Sandbox / test mode: No
SDK languages: Python, Node.js, C#, PHP [10]
MCP server: Yes [11]
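
To show what the segment timestamps in a verbose_json response can be used for, here is a minimal sketch that turns them into SRT subtitles. It assumes the response follows the OpenAI-style verbose_json shape, with a "segments" list whose items carry "start" and "end" in seconds plus "text"; those field names are an assumption and are not confirmed by this profile. Both function names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(response: dict) -> str:
    """Convert the assumed "segments" list of a verbose_json response to SRT text."""
    blocks = []
    for index, seg in enumerate(response.get("segments", []), start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # SRT cues are numbered from 1 and separated by a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + ("\n" if blocks else "")


if __name__ == "__main__":
    sample = {"segments": [{"start": 0.0, "end": 1.25, "text": " Hello"}]}
    print(segments_to_srt(sample))
```

The same loop could be pointed at word-level timestamps if the response includes them, though the key holding them would need to be checked against the vendor documentation.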

Trust & compliance

SOC 2: SOC 2 Type II [12]
HIPAA: Yes [13]
GDPR: Yes [14]
ISO 27001: Unknown
PCI DSS: Unknown
Published SLA: No [15]
Rate limits: Free plan: 20 RPM, 2,000 RPD, 7,200 ASH, 28,800 ASD for both whisper-large-v3 and whisper-large-v3-turbo. Developer plan: 300 RPM and 200,000 ASH for whisper-large-v3; 400 RPM and 400,000 ASH for whisper-large-v3-turbo. Higher limits are available for enterprise use cases. (RPM = requests per minute, RPD = requests per day, ASH = audio seconds per hour, ASD = audio seconds per day.) [16]
Known restrictions: Free tier: 25 MB max file size per request; Developer tier: 100 MB max file size per request; minimum billable duration: 10 seconds per request; single audio track only (the first track is used from multi-track files); the translation endpoint outputs English only; whisper-large-v3-turbo does not support the translation endpoint; audio shorter than 30 seconds is padded with silence (minimum 0.01 seconds); the BAA (HIPAA) excludes beta, alpha, free tier, and experimental features; distil-whisper-large-v3-en was deprecated and sunset on August 23, 2025 (migrate to whisper-large-v3-turbo) [17]
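
Several of the limits in this section can be checked on the client before a request is sent. The sketch below is one way to do that; the tier keys ("free", "developer"), the task label "translate", and the function names are hypothetical, and treating 1 MB as 1,000,000 bytes is a conservative assumption because the profile does not say which definition applies.

```python
# Limits copied from this profile; keys and names are illustrative only.
MAX_FILE_BYTES = {
    "free": 25 * 1_000_000,        # assumption: 1 MB = 1,000,000 bytes
    "developer": 100 * 1_000_000,
}
FREE_PLAN_LIMITS = {"rpd": 2_000, "asd": 28_800}  # requests/day, audio seconds/day
TRANSLATION_MODELS = {"whisper-large-v3"}  # turbo is listed as lacking translation


def preflight(tier: str, model: str, task: str, file_bytes: int) -> list[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    limit = MAX_FILE_BYTES[tier]
    if file_bytes > limit:
        problems.append(f"file is {file_bytes} bytes; {tier} tier allows {limit}")
    if task == "translate" and model not in TRANSLATION_MODELS:
        problems.append(f"{model} does not support the translation endpoint")
    return problems


def fits_free_daily_quota(requests_per_day: int, audio_seconds_per_day: float) -> bool:
    """Check a planned daily workload against the free-plan RPD and ASD limits."""
    return (requests_per_day <= FREE_PLAN_LIMITS["rpd"]
            and audio_seconds_per_day <= FREE_PLAN_LIMITS["asd"])


if __name__ == "__main__":
    print(preflight("free", "whisper-large-v3-turbo", "translate", 30_000_000))
    print(fits_free_daily_quota(100, 8 * 3600))
```

For scale, 28,800 audio seconds per day is 8 hours of audio, and 7,200 audio seconds per hour is 2 hours of audio. The per-minute and per-hour limits are not modeled here and would need a time-aware limiter.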

Developer surface

Docs rendering: static

Integration

API style: rest
Base URL: https://api.groq.com/openai/v1
Version: v1
Versioning: url
Stability: ga
Auth methods: api_key
Idempotency keys: No
Error format: vendor-specific
Rate limit: 20 requests / minute (free plan)
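
The integration fields above can be combined into a single HTTP request. The following is a minimal standard-library sketch, not the vendor's reference client. The `/audio/transcriptions` path, the `Authorization: Bearer` header, and the multipart field names (`file`, `model`, `response_format`) are assumptions taken from the OpenAI API convention that the profile says the endpoints are compatible with; verify them against the vendor documentation. The function name is hypothetical.

```python
import urllib.request
import uuid

BASE_URL = "https://api.groq.com/openai/v1"  # base URL as listed in this profile


def build_transcription_request(
    api_key: str,
    audio_bytes: bytes,
    filename: str,
    model: str = "whisper-large-v3-turbo",
    response_format: str = "verbose_json",
) -> urllib.request.Request:
    """Build, but do not send, a multipart/form-data transcription request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields (assumed OpenAI-style names).
    for name, value in {"model": model, "response_format": response_format}.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The audio payload as a file field.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream'
        "\r\n\r\n".encode()
    )
    parts.append(audio_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/transcriptions",  # assumed path
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


if __name__ == "__main__":
    req = build_transcription_request("YOUR_API_KEY", b"", "sample.wav")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Because the profile lists no idempotency keys and a vendor-specific error format, retry logic would need to inspect the response body rather than rely on a standard error schema.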

SDKs

Python: groq
Node.js: groq-sdk
C#: jgravelle.GroqAPILibrary
PHP: lucianotonet/groq-php

Adoption & maturity

Launched: 2024-06-24
GA: 2024-06-24
Notable customers: IBM, PGA of America, Stats Perform, GPTZero, StackAI, Mem0

Other Speech-to-Text & Transcription APIs

ElevenLabs Scribe (Speech to Text)
"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."
Hybrid · free tier · public pricing · self-serve
Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Usage · free tier · public pricing · self-serve
Amazon Transcribe
"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."
Usage · free tier · public pricing · self-serve
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
Usage · free tier · public pricing · self-serve
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
Usage · free tier · public pricing · self-serve
AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."
Usage · public pricing · self-serve


References

Each field above carries a numbered source.

[1] Description: console.groq.com
[2] Pricing model: groq.com · groq.com
[3] Published pricing: groq.com
[4] Free tier: groq.com · console.groq.com
[5] Enterprise plan: groq.com
[6] Supported actions: console.groq.com
[7] Languages: console.groq.com · openai-whisper.mintlify.app · console.groq.com
[8] Input types: console.groq.com
[9] Output types: console.groq.com
[10] SDK languages: console.groq.com
[11] MCP server: github.com
[12] SOC 2: console.groq.com · groq.com
[13] HIPAA: console.groq.com · groq.com
[14] GDPR: groq.com · console.groq.com
[15] Published SLA: groq.com · console.groq.com
[16] Rate limits: console.groq.com · console.groq.com
[17] Known restrictions: console.groq.com · console.groq.com

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

2026-06-21 Capabilities: {} → {"translation":true}
2026-06-21 Summary Md: (none) → Groq Speech-to-Text runs OpenAI-compatible Whisper endpoints (large-v3 and larg…
2026-06-21 Score Pricing Transparency: (none) → 100
2026-06-21 Score Agent Friendliness: (none) → 40
2026-06-21 Score Setup Speed: (none) → 85
2026-06-21 Score Docs Quality: (none) → 45
2026-06-21 Score Procurement Friction: (none) → 100
2026-06-21 Score Trust Readiness: (none) → 55
2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
2026-06-21 Has Structured Data: (none) → No
2026-06-21 Robots Allows Agents: (none) → Yes
2026-06-21 API Reference URL: (none) → https://groq.com/groqcloud
2026-06-21 Status Page URL: (none) → https://status.groq.com
2026-06-21 Changelog URL: (none) → https://groq.com/changelog
2026-06-21 Docs URL: (none) → https://console.groq.com/docs/overview
2026-06-21 Rendering: (none) → static
2026-06-21 Llms Txt Present: (none) → No
2026-06-21 Has Published Pricing: set to Yes
2026-06-21 Free Tier Available: set to Yes
2026-06-21 Free Tier Details: set to Free Plan available at no cost with community support and zero-data retention o…
2026-06-21 Self Serve Signup: set to Yes
2026-06-21 Requires Sales Call: set to No
2026-06-21 Enterprise Plan Available: set to Yes
2026-06-21 SOC 2: set to type_2
2026-06-21 HIPAA: set to Yes
2026-06-21 GDPR: set to Yes
2026-06-21 SLA Published: set to No
2026-06-21 Data Retention Policy URL: set to https://console.groq.com/docs/your-data
2026-06-21 Documented Rate Limits: set to {"free_plan":{"whisper-large-v3":"20 RPM, 2,000 RPD, 7,200 ASH, 28,800 ASD","wh…
2026-06-21 Rate Limit Requests: set to 20
2026-06-21 Rate Limit Window: set to minute
2026-06-21 Known Restrictions: set to Free tier: 25 MB max file size per request, Developer tier: 100 MB max file siz…
2026-06-21 Auth Methods: set to api_key
2026-06-21 Auth Docs URL: set to https://console.groq.com/docs/quickstart
2026-06-21 API Style: set to rest
2026-06-21 Base URL: set to https://api.groq.com/openai/v1
2026-06-21 Versioning Scheme: set to url
2026-06-21 Stability: set to ga
2026-06-21 Deprecation Policy URL: set to https://console.groq.com/docs/deprecations
2026-06-21 Quickstart URL: set to https://console.groq.com/docs/quickstart
2026-06-21 Idempotency Supported: set to No
2026-06-21 Error Format: set to vendor-specific
2026-06-21 Requires Verification: set to No
2026-06-21 Starting Price Usd: set to 0.04
2026-06-21 Slug: set to groq-speech-to-text
2026-06-21 Free Tier Limit: set to 2000 audio requests/day (RPD); 7200 audio seconds/hour (ASH); 28800 audio secon…
2026-06-21 Launched At: set to 2024-06-24
2026-06-21 GA Date: set to 2024-06-24
2026-06-21 Notable Customers: set to IBM, PGA of America, Stats Perform, GPTZero, StackAI, Mem0

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/groq-speech-to-text \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/groq-speech-to-text/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
