Groq Speech-to-Text (Whisper)

"Groq API is designed to provide a fast speech-to-text solution, offering OpenAI-compatible endpoints that enable near-instant transcriptions and translations, with the ability to integrate high-quality audio processing into applications at speeds that rival human interaction." [1]

groq.com · By Groq · Last verified 2026-06-21 · Source confidence: high

Groq Speech-to-Text runs OpenAI-compatible Whisper endpoints (large-v3 and large-v3-turbo) optimized for speed. It supports batch transcription, word and segment timestamps, language detection, and audio translation to English across dozens of source languages. Self-serve pricing starts at $0.04 per hour of audio, and the free tier covers 2,000 requests and 28,800 audio seconds (8 hours) per day. SDKs are available for Python, Node.js, C#, and PHP, along with an MCP server. The platform lists SOC 2 Type II, HIPAA, and GDPR compliance and offers an enterprise plan with higher rate limits.

Best for / Avoid if

Best for:

  • Prototypes and side projects: free to start, no sales call
  • Regulated or enterprise workloads: compliance attestations and an enterprise plan
  • AI agents and automation: an agent-ready surface (MCP server; no llms.txt is published)

Pricing & procurement

Pricing model
Usage-based [2]
Published pricing
Yes [3]
Free tier
Yes [4]
Free tier details
Free Plan available at no cost with community support and zero-data retention option. Rate limits on free tier: 20 RPM, 2,000 RPD, 7,200 audio seconds/hour (ASH), 28,800 audio seconds/day (ASD) for Whisper models. Max file size 25 MB. No monetary charge; perpetual free access within these limits.
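To make the free-tier caps concrete, the sketch below converts the audio-second limits to hours and checks whether a daily workload fits. The limit values come from this profile; the function name `fits_free_tier` and the workload model (a fixed number of clips of average length) are illustrative assumptions, and the check covers only the daily caps, not the per-minute or per-hour ones.

```python
# Free-tier Whisper limits as listed in this profile.
FREE_TIER = {
    "requests_per_minute": 20,
    "requests_per_day": 2_000,
    "audio_seconds_per_hour": 7_200,
    "audio_seconds_per_day": 28_800,
}


def fits_free_tier(clips_per_day: int, avg_clip_seconds: float) -> bool:
    """Return True if a daily workload stays within the daily free-tier caps.

    Hypothetical helper: it checks requests/day and audio seconds/day only.
    """
    total_seconds = clips_per_day * avg_clip_seconds
    return (
        clips_per_day <= FREE_TIER["requests_per_day"]
        and total_seconds <= FREE_TIER["audio_seconds_per_day"]
    )


# 28,800 s/day is 8 hours of audio; 7,200 s/hour is 2 hours of audio.
hours_per_day = FREE_TIER["audio_seconds_per_day"] / 3600
hours_per_hour = FREE_TIER["audio_seconds_per_hour"] / 3600
```

For example, 100 one-minute clips per day (6,000 audio seconds) fit, while 100 ten-minute clips (60,000 audio seconds) do not.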
Self-serve signup
Yes
Requires sales call
No
Enterprise plan
Yes [5]
Published prices
Plan | Item | Per | Amount
Pay As You Go | Whisper Large v3 (whisper-large-v3) transcription | hour of audio transcribed | $0.111
Pay As You Go | Whisper Large v3 Turbo (whisper-large-v3-turbo) transcription | hour of audio transcribed | $0.04
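The per-hour prices can be turned into a rough cost estimate. This is a minimal sketch, not Groq's billing logic: it uses the two published prices and the 10-second minimum billable duration listed under Known restrictions, and it assumes each request is billed for its exact duration once above that minimum. The function name `estimate_cost_usd` is hypothetical.

```python
# Published pay-as-you-go prices from this profile, in USD per audio hour.
PRICE_PER_HOUR_USD = {
    "whisper-large-v3": 0.111,
    "whisper-large-v3-turbo": 0.04,
}

# Minimum billable duration per request, as listed under Known restrictions.
MIN_BILLABLE_SECONDS = 10


def estimate_cost_usd(model: str, clip_durations_seconds: list[float]) -> float:
    """Estimate the cost of transcribing clips, one request per clip.

    Assumption: clips shorter than the minimum are billed as 10 seconds,
    and longer clips are billed for their exact duration.
    """
    billable_seconds = sum(
        max(duration, MIN_BILLABLE_SECONDS) for duration in clip_durations_seconds
    )
    return billable_seconds / 3600 * PRICE_PER_HOUR_USD[model]
```

Under these assumptions, many very short clips cost more than their raw duration suggests: 360 clips of 3 seconds each are billed as 3,600 seconds rather than 1,080.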

Capabilities

  • Speech translation
Supported actions
transcribe_batch, translation_to_english, word_timestamps, segment_timestamps, language_detection, prompt_guided_transcription [6]
Regions
United States, Global (multi-region)
Languages
Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba [7]
Input types
audio/flac, audio/mp3, audio/mp4, audio/mpeg, audio/mpga, audio/m4a, audio/ogg, audio/wav, audio/webm, file upload, URL (including Base64URL) [8]
Output types
JSON, text, verbose_json (with word and segment timestamps) [9]
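Segment timestamps from a verbose_json response can be turned into subtitles. The sketch below assumes the response follows the OpenAI-compatible shape, with a `segments` list whose items carry `start`, `end` (both in seconds), and `text`; this profile does not spell out the schema, so treat those field names as assumptions. Both function names are illustrative.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(response: dict) -> str:
    """Convert the assumed `segments` list of a verbose_json response to SRT."""
    blocks = []
    for index, segment in enumerate(response.get("segments", []), start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}")
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + ("\n" if blocks else "")
```

A response without a `segments` key yields an empty string rather than an error, which keeps the helper usable with the plain JSON and text output types.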
Webhooks
No
Sandbox / test mode
No
SDK languages
Python, Node.js, C#, PHP [10]
MCP server
Yes [11]

Trust & compliance

SOC 2
SOC 2 Type II [12]
HIPAA
Yes [13]
GDPR
Yes [14]
ISO 27001
Unknown
PCI DSS
Unknown
Published SLA
No [15]
Rate limits
  • Free plan, whisper-large-v3: 20 RPM, 2,000 RPD, 7,200 ASH, 28,800 ASD
  • Free plan, whisper-large-v3-turbo: 20 RPM, 2,000 RPD, 7,200 ASH, 28,800 ASD
  • Developer plan, whisper-large-v3: 300 RPM, 200,000 ASH
  • Developer plan, whisper-large-v3-turbo: 400 RPM, 400,000 ASH
  • Higher limits are available for enterprise use cases. [16]
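A client can stay under a requests-per-minute cap by throttling itself before calling the API. The following is a minimal sliding-window sketch, not anything Groq provides: the class name `SlidingWindowLimiter` is hypothetical, the default of 20 requests per 60 seconds mirrors the free-plan RPM figure above, and it tracks request counts only, not audio seconds.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls within any `window_seconds` span."""

    def __init__(self, max_requests=20, window_seconds=60.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._sent = deque()  # timestamps of accepted requests, oldest first

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if over the cap."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            self._sent.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each request and wait or queue the work when it returns False. Server-side limits remain authoritative, so a client like this reduces rejected requests but does not replace handling them.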
Known restrictions
  • Free tier: 25 MB max file size per request
  • Developer tier: 100 MB max file size per request
  • Minimum billable duration: 10 seconds per request
  • Single audio track only (first track used from multi-track files)
  • Translation endpoint outputs English only
  • whisper-large-v3-turbo does not support translation endpoint
  • Audio shorter than 30 seconds is padded with silence (minimum 0.01 seconds)
  • BAA (HIPAA) excludes beta, alpha, free tier, and experimental features
  • distil-whisper-large-v3-en was deprecated and sunset on August 23, 2025; migrate to whisper-large-v3-turbo [17]
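Several of these restrictions can be checked locally before uploading. The sketch below is an illustrative pre-flight check, not part of any Groq SDK: the function name, the tier and task labels, and the choice to treat 1 MB as 1,000,000 bytes (the more conservative reading) are all assumptions.

```python
# Per-request file size caps from this profile.
# Assumption: 1 MB = 1,000,000 bytes, the stricter interpretation.
MAX_FILE_BYTES = {"free": 25_000_000, "developer": 100_000_000}

# Per this profile, whisper-large-v3-turbo does not support translation.
TRANSLATION_MODELS = {"whisper-large-v3"}

SUNSET_MODEL = "distil-whisper-large-v3-en"


def preflight_problems(tier: str, model: str, task: str, file_bytes: int) -> list[str]:
    """Return human-readable problems; an empty list means none were found.

    `task` is a hypothetical label: "transcribe" or "translate".
    """
    problems = []
    limit = MAX_FILE_BYTES[tier]
    if file_bytes > limit:
        problems.append(f"file is {file_bytes} bytes; {tier} tier allows {limit}")
    if task == "translate" and model not in TRANSLATION_MODELS:
        problems.append(f"{model} does not support the translation endpoint")
    if model == SUNSET_MODEL:
        problems.append(f"{SUNSET_MODEL} was sunset; use whisper-large-v3-turbo")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem with a request at once.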

Developer surface

Integration

API style
rest
Base URL
https://api.groq.com/openai/v1
Version
v1
Versioning
url
Stability
ga
Auth methods
api_key
Idempotency keys
No
Error format
vendor-specific
Rate limit
20 requests / minute (free plan)
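The integration fields above can be combined into a request using only the Python standard library. This is a hedged sketch: the base URL comes from this profile, while the `/audio/transcriptions` path, the `Authorization: Bearer` header, and the `model`, `response_format`, and `file` form fields are assumed from the OpenAI-compatible convention rather than confirmed here. The function name is hypothetical, and the request is built but not sent.

```python
import urllib.request
import uuid

BASE_URL = "https://api.groq.com/openai/v1"  # from this profile


def build_transcription_request(
    api_key: str,
    filename: str,
    audio_bytes: bytes,
    model: str = "whisper-large-v3-turbo",
    response_format: str = "verbose_json",
) -> urllib.request.Request:
    """Build (without sending) a multipart/form-data transcription request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields (names assumed from the OpenAI-compatible API).
    for name, value in (("model", model), ("response_format", response_format)):
        field = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(field.encode())
    # The audio file part.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{BASE_URL}/audio/transcriptions",
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. In practice one of the listed SDKs is likely the simpler route; the sketch only shows what the REST call might look like on the wire.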

SDKs

  • Python: groq
  • Node.js: groq-sdk
  • C#: jgravelle.GroqAPILibrary
  • PHP: lucianotonet/groq-php

Adoption & maturity

Launched
2024-06-24
GA
2024-06-24
Notable customers
IBM, PGA of America, Stats Perform, GPTZero, StackAI, Mem0

Other Speech-to-Text & Transcription APIs

  • ElevenLabs Scribe (Speech to Text)

    "Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."

    Hybrid · free tier · public pricing · self-serve

  • Azure AI Speech to Text

    "Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."

    Usage · free tier · public pricing · self-serve

  • Amazon Transcribe

    "Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."

    Usage · free tier · public pricing · self-serve

  • Google Cloud Speech-to-Text

    "Accurate voice typing and transcription powered by Gemini."

    Usage · free tier · public pricing · self-serve

  • IBM watsonx Speech to Text

    "IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."

    Usage · free tier · public pricing · self-serve

  • AssemblyAI

    "Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."

    Usage · public pricing · self-serve


References

Each field above carries a numbered source, listed here.

  1. Description: console.groq.com
  2. Pricing model: groq.com · groq.com
  3. Published pricing: groq.com
  4. Free tier: groq.com · console.groq.com
  5. Enterprise plan: groq.com
  6. Supported actions: console.groq.com
  7. Languages: console.groq.com · openai-whisper.mintlify.app · console.groq.com
  8. Input types: console.groq.com
  9. Output types: console.groq.com
  10. SDK languages: console.groq.com
  11. MCP server: github.com
  12. SOC 2: console.groq.com · groq.com
  13. HIPAA: console.groq.com · groq.com
  14. GDPR: groq.com · console.groq.com
  15. Published SLA: groq.com · console.groq.com
  16. Rate limits: console.groq.com · console.groq.com
  17. Known restrictions: console.groq.com · console.groq.com

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

  1. 2026-06-21 Capabilities: {} → {"translation":true}
  2. 2026-06-21 Summary Md: (none) → Groq Speech-to-Text runs OpenAI-compatible Whisper endpoints (large-v3 and larg…
  3. 2026-06-21 Score Pricing Transparency: (none) → 100
  4. 2026-06-21 Score Agent Friendliness: (none) → 40
  5. 2026-06-21 Score Setup Speed: (none) → 85
  6. 2026-06-21 Score Docs Quality: (none) → 45
  7. 2026-06-21 Score Procurement Friction: (none) → 100
  8. 2026-06-21 Score Trust Readiness: (none) → 55
  9. 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
  10. 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
  11. 2026-06-21 Has Structured Data: (none) → No
  12. 2026-06-21 Robots Allows Agents: (none) → Yes
  13. 2026-06-21 API Reference URL: (none) → https://groq.com/groqcloud
  14. 2026-06-21 Status Page URL: (none) → https://status.groq.com
  15. 2026-06-21 Changelog URL: (none) → https://groq.com/changelog
  16. 2026-06-21 Docs URL: (none) → https://console.groq.com/docs/overview
  17. 2026-06-21 Rendering: (none) → static
  18. 2026-06-21 Llms Txt Present: (none) → No
  19. 2026-06-21 Has Published Pricing: set to Yes
  20. 2026-06-21 Free Tier Available: set to Yes
  21. 2026-06-21 Free Tier Details: set to Free Plan available at no cost with community support and zero-data retention o…
  22. 2026-06-21 Self Serve Signup: set to Yes
  23. 2026-06-21 Requires Sales Call: set to No
  24. 2026-06-21 Enterprise Plan Available: set to Yes
  25. 2026-06-21 SOC 2: set to type_2
  26. 2026-06-21 HIPAA: set to Yes
  27. 2026-06-21 GDPR: set to Yes
  28. 2026-06-21 SLA Published: set to No
  29. 2026-06-21 Data Retention Policy URL: set to https://console.groq.com/docs/your-data
  30. 2026-06-21 Documented Rate Limits: set to {"free_plan":{"whisper-large-v3":"20 RPM, 2,000 RPD, 7,200 ASH, 28,800 ASD","wh…
  31. 2026-06-21 Rate Limit Requests: set to 20
  32. 2026-06-21 Rate Limit Window: set to minute
  33. 2026-06-21 Known Restrictions: set to Free tier: 25 MB max file size per request, Developer tier: 100 MB max file siz…
  34. 2026-06-21 Auth Methods: set to api_key
  35. 2026-06-21 Auth Docs URL: set to https://console.groq.com/docs/quickstart
  36. 2026-06-21 API Style: set to rest
  37. 2026-06-21 Base URL: set to https://api.groq.com/openai/v1
  38. 2026-06-21 Versioning Scheme: set to url
  39. 2026-06-21 Stability: set to ga
  40. 2026-06-21 Deprecation Policy URL: set to https://console.groq.com/docs/deprecations
  41. 2026-06-21 Quickstart URL: set to https://console.groq.com/docs/quickstart
  42. 2026-06-21 Idempotency Supported: set to No
  43. 2026-06-21 Error Format: set to vendor-specific
  44. 2026-06-21 Requires Verification: set to No
  45. 2026-06-21 Starting Price Usd: set to 0.04
  46. 2026-06-21 Slug: set to groq-speech-to-text
  47. 2026-06-21 Free Tier Limit: set to 2000 audio requests/day (RPD); 7200 audio seconds/hour (ASH); 28800 audio secon…
  48. 2026-06-21 Launched At: set to 2024-06-24
  49. 2026-06-21 GA Date: set to 2024-06-24
  50. 2026-06-21 Notable Customers: set to IBM, PGA of America, Stats Perform, GPTZero, StackAI, Mem0

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/groq-speech-to-text \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/groq-speech-to-text/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
