Voicegain
"Voice AI under your Control" - Build AI Voice Agents and Voice AI Apps with Speech-to-Text and LLM APIs, deployable in datacenter or cloud. [1]
Voicegain is a speech-to-text and voice AI platform aimed at contact centers, healthcare payers, and enterprises that need telephony transcription, PII/PCI redaction, real-time agent assist, and custom ASR model training. Pay-as-you-go pricing starts at $0.0015 per minute, with a one-time $50 signup credit and no credit card required; on-premise and private-cloud deployments are available but require an annual commitment. The platform lists SOC 2 Type II, HIPAA, GDPR, and PCI DSS compliance, and its customers include Aetna, Samsung, and Sutherland.
Best for / Avoid if
Best for: Regulated or enterprise workloads - compliance attestations and an enterprise plan; Teams needing broad API coverage out of the box; Cost-sensitive teams - low, transparent entry price
Avoid if: You want to try it free before paying
Pricing & procurement
- Pricing model
- Usage-based [2]
- Published pricing
- ✓ Yes [3]
- Free tier
- ✗ No [4]
- Free tier details
- $50 one-time credit on signup, no credit card required (not a recurring free allowance)
- Self-serve signup
- ✓ Yes
- Requires sales call
- ✗ No
- Enterprise plan
- ✓ Yes [5]
- Minimum commitment
- Edge/on-premise deployment requires annual commitment and minimum port purchase
| Plan | Item | Per | Amount | Source |
|---|---|---|---|---|
| Pay As You Go | STT Offline Basic (mono-channel, no diarization) | second | $0.000025 | source |
| Pay As You Go | STT Offline Basic (mono-channel, no diarization) | minute | $0.0015 | source |
| Pay As You Go | STT Offline Basic (mono-channel, no diarization) | hour | $0.09 | source |
| Pay As You Go | STT Offline Enhanced (two-channel call center, diarization, PII redaction) | second | $0.00005 | source |
| Pay As You Go | STT Offline Enhanced (two-channel call center, diarization, PII redaction) | minute | $0.003 | source |
| Pay As You Go | STT Offline Enhanced (two-channel call center, diarization, PII redaction) | hour | $0.18 | source |
| Pay As You Go | STT Realtime Basic (streaming transcription) | second | $0.00005 | source |
| Pay As You Go | STT Realtime Basic (streaming transcription) | minute | $0.003 | source |
| Pay As You Go | STT Realtime Basic (streaming transcription) | hour | $0.18 | source |
| Pay As You Go | STT Realtime Enhanced / MRCP ASR | second | $0.00009 | source |
| Pay As You Go | STT Realtime Enhanced / MRCP ASR | minute | $0.0054 | source |
| Pay As You Go | STT Realtime Enhanced / MRCP ASR | hour | $0.324 | source |
| Edge Deployment | STT Offline Enhanced & Multi-channel - port-based license | port/month | $60 | source |
| Edge Deployment | STT Offline Enhanced & Multi-channel - usage-based license | audio hour | $0.16 | source |
| Edge Deployment | STT Realtime Transcription - port-based license | port/month | $72 | source |
| Edge Deployment | STT Realtime Transcription - usage-based license | audio hour | $0.20 | source |
| Edge Deployment | MRCP ASR Tier 1 - port-based license | port/month | $40 | source |
| Edge Deployment | MRCP ASR Tier 2 - port-based license | port/month | $70 | source |
| Voice Agent Platform | AI Voice Agent Standard (Voicegain STT + Standard TTS + SIP Stack + LLM integration) | minute | $0.04 | source |
| Voice Agent Platform | AI Voice Agent Premium (Premium Neural TTS + Voicegain STT + SIP Stack + LLM integration) | minute | $0.06 | source |
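The per-minute rates in the table lend themselves to a quick budget check. The following is a minimal sketch, not an official calculator: the tier keys (`offline_basic` and so on) and both function names are hypothetical labels chosen for illustration, and the estimate ignores the per-request minimum billing noted under Known restrictions.

```python
from decimal import Decimal

# Pay-as-you-go per-minute rates (USD) copied from the pricing table.
# The dictionary keys are hypothetical labels, not Voicegain identifiers.
RATES_PER_MINUTE = {
    "offline_basic": Decimal("0.0015"),
    "offline_enhanced": Decimal("0.003"),
    "realtime_basic": Decimal("0.003"),
    "realtime_enhanced_mrcp": Decimal("0.0054"),
}


def estimate_cost(tier: str, audio_minutes: float) -> Decimal:
    """Estimated USD cost for the given number of audio minutes."""
    return RATES_PER_MINUTE[tier] * Decimal(str(audio_minutes))


def minutes_covered_by_credit(tier: str, credit_usd: str = "50") -> Decimal:
    """Audio minutes the one-time signup credit would cover at this tier."""
    return Decimal(credit_usd) / RATES_PER_MINUTE[tier]


if __name__ == "__main__":
    for tier in RATES_PER_MINUTE:
        print(tier, estimate_cost(tier, 60), int(minutes_covered_by_credit(tier)))
```

`Decimal` is used instead of `float` so that sub-cent rates multiply without binary rounding noise.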
Capabilities
- Supported actions
- transcribe_batch, transcribe_streaming, speaker_diarization, word_timestamps, language_detection, sentiment_analysis, named_entity_recognition, keyword_extraction, intent_classification, pii_redaction, pci_redaction, custom_model_training, telephony_bot_api, mrcp_asr, speech_analytics, call_summarization, real_time_agent_assist, automated_qa [6]
- Regions
- US (Google Cloud), AWS VPC, Azure VPC, IBM Cloud VPC, Oracle VPC, on-premise datacenter [7]
- Languages
- English, Spanish, Hindi, German, Portuguese (Alpha early access), Polish (Alpha early access), Korean (Alpha early access), Dutch (Alpha early access), Ukrainian (Alpha early access), French (coming soon), Arabic (coming soon), Italian (coming soon), 50+ languages for batch transcription via Whisper API [8]
- Input types
- audio file upload (40+ formats via ffmpeg for batch), WebSocket streaming (L16 linear PCM 16-bit mono, F32 linear PCM 32-bit floating point mono), RTP audio stream, SIPREC, MRCP, stereo/two-channel audio
- Output types
- JSON transcript, word-level timestamps, speaker diarization labels, sentiment scores, named entities, keywords, intent labels, call summaries, redacted transcript, redacted audio
- Webhooks
- ✓ Yes [9]
- Sandbox / test mode
- ✗ No [10]
- SDK languages
- Python [11]
- MCP server
- ✗ No
Trust & compliance
- SOC 2
- SOC 2 Type II [12]
- HIPAA
- ✓ Yes [13]
- GDPR
- ✓ Yes [14]
- ISO 27001
- ✗ No [15]
- PCI DSS
- ✓ Yes [16]
- Published SLA
- ✓ Yes [17]
- Rate limits
- 4 concurrent/simultaneous requests or 4 hours of audio-processing per hour (standard pay-as-you-go). API Request Limit: 75 requests per minute (fixed 1-minute window). Higher limits available with volume/term commitments. [18]
- Known restrictions
- Streaming supports English and Spanish only (batch supports 50+ languages via Whisper); Real-time transcription input limited to L16 and F32 PCM audio formats; Whisper API is batch-only (no real-time/streaming); Minimum billing of 6 seconds per API request, then 1-second increments; Alpha/early-access language models initially available in offline/batch mode only; Edge/on-premise deployment requires annual commitment and minimum port purchase
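The billing minimum and the request-rate limit listed above can be modelled on the client side. This is a rough sketch under stated assumptions: rounding partial seconds up is assumed (the profile only says "1-second increments"), the limiter's window starts when the object is created rather than being aligned with the server's window, and `billed_seconds` and `FixedWindowLimiter` are hypothetical names.

```python
import math
import time


def billed_seconds(duration_s: float, minimum_s: int = 6) -> int:
    """Billable seconds for one request: at least `minimum_s`, then whole seconds.

    Rounding partial seconds up is an assumption, not a documented rule.
    """
    return max(minimum_s, math.ceil(duration_s))


class FixedWindowLimiter:
    """Client-side counter for a fixed window (75 requests per 60 s by default)."""

    def __init__(self, limit: int = 75, window_s: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def try_acquire(self) -> bool:
        """Return True if another request fits in the current window."""
        now = self.clock()
        if now - self.window_start >= self.window_s:
            # The previous window has elapsed: start a fresh one.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

A caller would check `try_acquire()` before each request and wait or queue when it returns `False`; the injectable `clock` makes the limiter easy to test without sleeping.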
Developer surface
Integration
- API style
- REST
- Base URL
- https://api.voicegain.ai/v1
- Version
- v1
- Versioning
- URL path (/v1)
- Stability
- GA (generally available)
- Auth methods
- JWT
- Rate limit
- 4 concurrent requests
- Python
- voicegain-speech · repo
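To illustrate how the integration fields above fit together, here is a minimal standard-library sketch that builds a JSON request against the listed base URL and caps in-flight calls at four. It deliberately does not use the voicegain-speech SDK. The `Authorization: Bearer` header scheme is an assumption about how the JWT is presented, and `build_request`, `send_limited`, and any endpoint path passed in are hypothetical; consult the API reference for real paths and payloads.

```python
import json
import threading
import urllib.request

BASE_URL = "https://api.voicegain.ai/v1"  # base URL listed in this profile
MAX_CONCURRENT = 4  # pay-as-you-go concurrency limit listed in this profile

# Shared gate so that at most MAX_CONCURRENT requests are in flight at once.
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)


def build_request(path: str, jwt_token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the JWT.

    The Bearer scheme is an assumption; the path is supplied by the caller.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_limited(request: urllib.request.Request, timeout: float = 30.0) -> bytes:
    """Send the request while holding one of the concurrency slots."""
    with _slots:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
```

Worker threads that all call `send_limited` will block on the semaphore once four requests are outstanding, which keeps a batch job within the concurrency limit without extra bookkeeping.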
Adoption & maturity
- Launched
- 2019-01-01
- Notable customers
- Sutherland, Samsung, Aetna, LevelAI, Onvisource, Hammer
Other Speech-to-Text & Transcription APIs
ElevenLabs Scribe (Speech to Text)
"Scribe v2 is the most accurate Speech to Text model" offering "real-time Speech to Text in under 150 ms" across "90+ languages."
Azure AI Speech to Text
"Azure Speech in Foundry Tools provides speech to text, text to speech, and other capabilities through a Microsoft Foundry resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and conduct live AI voice conversations."
Amazon Transcribe
"Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application."
Google Cloud Speech-to-Text
"Accurate voice typing and transcription powered by Gemini."
IBM watsonx Speech to Text
"IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics."
AssemblyAI
"Voice AI infrastructure for developers building products that transcribe, understand, and act on speech."
References
- [1] Description: voicegain.ai
- [2] Pricing model: voicegain.ai
- [3] Published pricing: voicegain.ai
- [4] Free tier: voicegain.ai
- [5] Enterprise plan: voicegain.ai
- [6] Supported actions: voicegain.ai · voicegain.ai
- [7] Regions: voicegain.ai
- [8] Languages: voicegain.ai · support.voicegain.ai
- [9] Webhooks: voicegain.ai
- [10] Sandbox: voicegain.ai
- [11] SDK languages: voicegain.ai
- [12] SOC 2: voicegain.ai · voicegain.ai
- [13] HIPAA: voicegain.ai · voicegain.ai
- [14] GDPR: voicegain.ai
- [15] ISO 27001: voicegain.ai
- [16] PCI DSS: voicegain.ai · voicegain.ai
- [17] Published SLA: voicegain.ai
- [18] Rate limits: voicegain.ai · support.voicegain.ai
Change history
- 2026-06-21 Capabilities: {} → {"self_hosted":true,"pii_redaction":true,"real_time_streaming":true,"speaker_di…
- 2026-06-21 Summary Md: (none) → Voicegain is a speech-to-text and voice AI platform aimed at contact centers, h…
- 2026-06-21 Score Setup Speed: (none) → 50
- 2026-06-21 Score Docs Quality: (none) → 35
- 2026-06-21 Score Procurement Friction: (none) → 85
- 2026-06-21 Score Trust Readiness: (none) → 85
- 2026-06-21 Best For: (none) → Regulated or enterprise workloads - compliance attestations and an enterprise p…
- 2026-06-21 Avoid If: (none) → You want to try it free before paying
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Score Agent Friendliness: (none) → 20
- 2026-06-21 Score Pricing Transparency: (none) → 85
- 2026-06-21 Docs URL: (none) → https://www.voicegain.ai/api
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 Has Structured Data: (none) → No
- 2026-06-21 Llms Txt Present: (none) → No
- 2026-06-21 API Reference URL: (none) → https://www.voicegain.ai/api
- 2026-06-21 Robots Allows Agents: (none) → Yes
- 2026-06-21 Pricing Model: set to usage_based
- 2026-06-21 Has Published Pricing: set to Yes
- 2026-06-21 Free Tier Available: set to No
- 2026-06-21 Free Tier Details: set to $50 one-time credit on signup, no credit card required (not a recurring free al…
- 2026-06-21 Minimum Commitment: set to Edge/on-premise deployment requires annual commitment and minimum port purchase
- 2026-06-21 Self Serve Signup: set to Yes
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to Yes
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 ISO 27001: set to No
- 2026-06-21 PCI DSS: set to Yes
- 2026-06-21 SLA Published: set to Yes
- 2026-06-21 SLA URL: set to https://www.voicegain.ai/post/voicegain-introduces-industry-first-relative-spee…
- 2026-06-21 Data Retention Policy URL: set to https://www.voicegain.ai/privacy-policy
- 2026-06-21 Documented Rate Limits: set to 4 concurrent/simultaneous requests or 4 hours of audio-processing per hour (sta…
- 2026-06-21 Rate Limit Requests: set to 4
- 2026-06-21 Rate Limit Window: set to concurrent
- 2026-06-21 Known Restrictions: set to Streaming supports English and Spanish only (batch supports 50+ languages via W…
- 2026-06-21 Auth Methods: set to jwt
- 2026-06-21 Auth Docs URL: set to https://console.voicegain.ai/api/v1/index.html
- 2026-06-21 API Style: set to rest
- 2026-06-21 API Version: set to v1
- 2026-06-21 Versioning Scheme: set to url
- 2026-06-21 Stability: set to ga
- 2026-06-21 Quickstart URL: set to https://www.voicegain.ai/trial
- 2026-06-21 Requires Verification: set to No
- 2026-06-21 Starting Price Usd: set to 0.0015
- 2026-06-21 Price Basis: set to minute
- 2026-06-21 Free Tier Limit: set to $50 one-time credit on signup, no credit card required
- 2026-06-21 Launched At: set to 2019-01-01
- 2026-06-21 Notable Customers: set to Sutherland, Samsung, Aetna, LevelAI, Onvisource, Hammer
Suggest an edit / leave a review
Leave a review or comment
curl -X POST https://apio.sh/api/feedback/voicegain \
-H 'Content-Type: application/json' \
-d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'
Suggest a correction to a field (cite a source)
curl -X POST https://apio.sh/api/suggest/voicegain/FIELD \
-H 'Content-Type: application/json' \
-d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'