Groq Speech-to-Text (Whisper)
"Groq API is designed to provide a fast speech-to-text solution, offering OpenAI-compatible endpoints that enable near-instant transcriptions and translations, with the ability to integrate high-quality audio processing into applications at speeds that rival human interaction." [1]
Groq Speech-to-Text exposes OpenAI-compatible Whisper endpoints (whisper-large-v3 and whisper-large-v3-turbo) optimized for speed. It supports batch transcription, word and segment timestamps, language detection, and translation of audio into English (whisper-large-v3 only) across dozens of languages. Self-serve pricing starts at $0.04 per hour of audio, and the free tier covers 2,000 requests and 28,800 audio seconds per day. SDKs are available for Python, Node.js, C#, and PHP, and an MCP server is available. The platform reports SOC 2 Type II, HIPAA, and GDPR compliance and offers an enterprise plan with higher rate limits.
Best for / Avoid if
Best for: Prototypes and side projects (free to start, no sales call); regulated or enterprise workloads (compliance attestations and an enterprise plan); AI agents and automation (an MCP server is available)
Pricing & procurement
- Pricing model
- Usage-based [2]
- Published pricing
- ✓ Yes [3]
- Free tier
- ✓ Yes [4]
- Free tier details
- Free Plan available at no cost, with community support and a zero data retention option. Rate limits for Whisper models on the free tier: 20 requests per minute (RPM), 2,000 requests per day (RPD), 7,200 audio seconds per hour (ASH), 28,800 audio seconds per day (ASD). Max file size 25 MB. Free access is perpetual within these limits.
- Self-serve signup
- ✓ Yes
- Requires sales call
- ✗ No
- Enterprise plan
- ✓ Yes [5]
| Plan | Item | Per | Amount | Source |
|---|---|---|---|---|
| Pay As You Go | Whisper Large v3 (whisper-large-v3) transcription | hour of audio transcribed | $0.111 | source |
| Pay As You Go | Whisper Large v3 Turbo (whisper-large-v3-turbo) transcription | hour of audio transcribed | $0.04 | source |
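As a rough illustration of how the per-hour prices translate into a bill, the sketch below combines the two table rows with the 10-second minimum billable duration listed under Known restrictions in this profile. The function name and the assumption that billing is linear in audio seconds are illustrative, not taken from Groq's documentation.

```python
from typing import Iterable

# Prices per hour of audio, copied from the pricing table in this profile.
PRICE_PER_HOUR_USD = {
    "whisper-large-v3": 0.111,
    "whisper-large-v3-turbo": 0.04,
}

# Minimum billable duration per request, per this profile's known restrictions.
MIN_BILLED_SECONDS = 10


def estimate_cost_usd(model: str, durations_seconds: Iterable[float]) -> float:
    """Estimate the cost of a batch of requests (hypothetical helper).

    Assumes each request is billed for at least MIN_BILLED_SECONDS and that
    cost scales linearly with billed audio seconds.
    """
    price = PRICE_PER_HOUR_USD[model]
    billed_seconds = sum(max(d, MIN_BILLED_SECONDS) for d in durations_seconds)
    return billed_seconds / 3600 * price


if __name__ == "__main__":
    # One hour of audio in a single request on the turbo model.
    print(estimate_cost_usd("whisper-large-v3-turbo", [3600]))
    # Three 2-second clips are each billed as 10 seconds under the assumption above.
    print(estimate_cost_usd("whisper-large-v3", [2, 2, 2]))
```

Under these assumptions, many very short clips cost more per second of real audio than a few long ones, so batching short utterances into longer files may reduce spend.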
Capabilities
- Supported actions
- transcribe_batch, translation_to_english, word_timestamps, segment_timestamps, language_detection, prompt_guided_transcription [6]
- Regions
- United States, Global (multi-region)
- Languages
- Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba [7]
- Input types
- audio/flac, audio/mp3, audio/mp4, audio/mpeg, audio/mpga, audio/m4a, audio/ogg, audio/wav, audio/webm, file upload, URL (including Base64URL) [8]
- Output types
- JSON, text, verbose_json (with word and segment timestamps) [9]
- Webhooks
- ✗ No
- Sandbox / test mode
- ✗ No
- SDK languages
- Python, Node.js, C#, PHP [10]
- MCP server
- ✓ Yes [11]
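To make the capability list more concrete, here is a minimal sketch of how a transcription request might be assembled. The base URL is the one recorded in this profile's change history. The `/audio/transcriptions` path, the form field names (`response_format`, `timestamp_granularities[]`, `language`, `prompt`), and the `words` array in the response follow the OpenAI audio API convention and are assumptions here; confirm them against Groq's API reference. Both helper functions are hypothetical, and the audio itself (a file part or a URL) is left to whichever HTTP client sends the request.

```python
from typing import Dict, List, Optional, Tuple

# Base URL as recorded in this profile's change history.
BASE_URL = "https://api.groq.com/openai/v1"


def build_transcription_request(
    model: str = "whisper-large-v3-turbo",
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    timestamps: bool = False,
) -> Dict[str, object]:
    """Return the URL and form fields for a transcription call (hypothetical helper).

    Field names are assumed to match the OpenAI-compatible audio API.
    """
    fields: List[Tuple[str, str]] = [("model", model)]
    if timestamps:
        # Timestamps are listed in this profile as part of verbose_json output.
        fields.append(("response_format", "verbose_json"))
        fields.append(("timestamp_granularities[]", "word"))
        fields.append(("timestamp_granularities[]", "segment"))
    else:
        fields.append(("response_format", "json"))
    if language is not None:
        fields.append(("language", language))
    if prompt is not None:
        # Used for prompt-guided transcription.
        fields.append(("prompt", prompt))
    return {"url": BASE_URL + "/audio/transcriptions", "fields": fields}


def word_timings(verbose_json: dict) -> List[Tuple[str, float, float]]:
    """Extract (word, start, end) tuples, assuming an OpenAI-style `words` array."""
    return [(w["word"], w["start"], w["end"]) for w in verbose_json.get("words", [])]
```

The fields are kept as a list of pairs rather than a dict because the granularity field repeats, which most multipart encoders accept in that form.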
Trust & compliance
- SOC 2
- SOC 2 Type II [12]
- HIPAA
- ✓ Yes [13]
- GDPR
- ✓ Yes [14]
- ISO 27001
- – Unknown
- PCI DSS
- – Unknown
- Published SLA
- ✗ No [15]
- Rate limits
- Free plan: 20 RPM, 2,000 RPD, 7,200 ASH, 28,800 ASD for both whisper-large-v3 and whisper-large-v3-turbo. Developer plan: 300 RPM and 200,000 ASH for whisper-large-v3; 400 RPM and 400,000 ASH for whisper-large-v3-turbo. Higher limits are available for enterprise use cases. [16]
- Known restrictions
- Free tier: 25 MB max file size per request; Developer tier: 100 MB max file size per request; minimum billable duration: 10 seconds per request; single audio track only (the first track is used from multi-track files); the translation endpoint outputs English only; whisper-large-v3-turbo does not support the translation endpoint; audio shorter than 30 seconds is padded with silence (minimum file length 0.01 seconds); the BAA (HIPAA) excludes beta, alpha, free tier, and experimental features; distil-whisper-large-v3-en was deprecated and sunset on August 23, 2025 (migrate to whisper-large-v3-turbo) [17]
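Several of the restrictions above can be checked before a request is sent. The following is a minimal sketch, assuming that "MB" means 1,000,000 bytes (the stricter reading, since the profile does not say) and that the tier names `free` and `developer` are just local labels. The `preflight` function is a hypothetical helper, not part of any Groq SDK.

```python
from typing import List

# Per-request file size limits from this profile (MB interpreted as 10**6 bytes,
# which is an assumption chosen to be conservative).
MAX_FILE_BYTES = {"free": 25 * 10**6, "developer": 100 * 10**6}

# Models this profile lists as not supporting the translation endpoint.
NO_TRANSLATION_MODELS = {"whisper-large-v3-turbo"}

# Free-tier audio seconds per day (ASD) from the rate limits field.
FREE_AUDIO_SECONDS_PER_DAY = 28_800


def preflight(
    tier: str,
    model: str,
    file_size_bytes: int,
    task: str = "transcribe",
    audio_seconds: float = 0.0,
    seconds_used_today: float = 0.0,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if file_size_bytes > MAX_FILE_BYTES[tier]:
        problems.append("file exceeds the per-request size limit for this tier")
    if task == "translate" and model in NO_TRANSLATION_MODELS:
        problems.append("model does not support the translation endpoint")
    if tier == "free" and seconds_used_today + audio_seconds > FREE_AUDIO_SECONDS_PER_DAY:
        problems.append("request would exceed the free-tier daily audio seconds")
    return problems
```

A caller might log the returned problems and skip the upload, for example `preflight("free", "whisper-large-v3-turbo", 30_000_000, task="translate")` reports both the size and the translation issue. The server remains the authority on limits, so this check only avoids obviously doomed requests.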
Developer surface
Integration
Adoption & maturity
- Launched
- 2024-06-24
- GA
- 2024-06-24
- Notable customers
- IBM, PGA of America, Stats Perform, GPTZero, StackAI, Mem0
Other Speech-to-Text & Transcription APIs
ElevenLabs Scribe (Speech to Text)
"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."
Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Amazon Transcribe
"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."
References
- [1] Description: console.groq.com
- [2] Pricing model: groq.com · groq.com
- [3] Published pricing: groq.com
- [4] Free tier: groq.com · console.groq.com
- [5] Enterprise plan: groq.com
- [6] Supported actions: console.groq.com
- [7] Languages: console.groq.com · openai-whisper.mintlify.app · console.groq.com
- [8] Input types: console.groq.com
- [9] Output types: console.groq.com
- [10] SDK languages: console.groq.com
- [11] MCP server: github.com
- [12] SOC 2: console.groq.com · groq.com
- [13] HIPAA: console.groq.com · groq.com
- [14] GDPR: groq.com · console.groq.com
- [15] Published SLA: groq.com · console.groq.com
- [16] Rate limits: console.groq.com · console.groq.com
- [17] Known restrictions: console.groq.com · console.groq.com
Change history
- 2026-06-21 Capabilities: {} → {"translation":true}
- 2026-06-21 Summary Md: (none) → Groq Speech-to-Text runs OpenAI-compatible Whisper endpoints (large-v3 and larg…
- 2026-06-21 Score Pricing Transparency: (none) → 100
- 2026-06-21 Score Agent Friendliness: (none) → 40
- 2026-06-21 Score Setup Speed: (none) → 85
- 2026-06-21 Score Docs Quality: (none) → 45
- 2026-06-21 Score Procurement Friction: (none) → 100
- 2026-06-21 Score Trust Readiness: (none) → 55
- 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Has Structured Data: (none) → No
- 2026-06-21 Robots Allows Agents: (none) → Yes
- 2026-06-21 API Reference URL: (none) → https://groq.com/groqcloud
- 2026-06-21 Status Page URL: (none) → https://status.groq.com
- 2026-06-21 Changelog URL: (none) → https://groq.com/changelog
- 2026-06-21 Docs URL: (none) → https://console.groq.com/docs/overview
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 Llms Txt Present: (none) → No
- 2026-06-21 Has Published Pricing: set to Yes
- 2026-06-21 Free Tier Available: set to Yes
- 2026-06-21 Free Tier Details: set to Free Plan available at no cost with community support and zero-data retention o…
- 2026-06-21 Self Serve Signup: set to Yes
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to Yes
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 SLA Published: set to No
- 2026-06-21 Data Retention Policy URL: set to https://console.groq.com/docs/your-data
- 2026-06-21 Documented Rate Limits: set to {"free_plan":{"whisper-large-v3":"20 RPM, 2,000 RPD, 7,200 ASH, 28,800 ASD","wh…
- 2026-06-21 Rate Limit Requests: set to 20
- 2026-06-21 Rate Limit Window: set to minute
- 2026-06-21 Known Restrictions: set to Free tier: 25 MB max file size per request, Developer tier: 100 MB max file siz…
- 2026-06-21 Auth Methods: set to api_key
- 2026-06-21 Auth Docs URL: set to https://console.groq.com/docs/quickstart
- 2026-06-21 API Style: set to rest
- 2026-06-21 Base URL: set to https://api.groq.com/openai/v1
- 2026-06-21 Versioning Scheme: set to url
- 2026-06-21 Stability: set to ga
- 2026-06-21 Deprecation Policy URL: set to https://console.groq.com/docs/deprecations
- 2026-06-21 Quickstart URL: set to https://console.groq.com/docs/quickstart
- 2026-06-21 Idempotency Supported: set to No
- 2026-06-21 Error Format: set to vendor-specific
- 2026-06-21 Requires Verification: set to No
- 2026-06-21 Starting Price Usd: set to 0.04
- 2026-06-21 Slug: set to groq-speech-to-text
- 2026-06-21 Free Tier Limit: set to 2000 audio requests/day (RPD); 7200 audio seconds/hour (ASH); 28800 audio secon…
- 2026-06-21 Launched At: set to 2024-06-24
- 2026-06-21 GA Date: set to 2024-06-24
- 2026-06-21 Notable Customers: set to IBM, PGA of America, Stats Perform, GPTZero, StackAI, Mem0