Cartesia (Sonic)
"The fastest and most natural text to speech model" [1]
Cartesia's Sonic API is a text-to-speech service built for low-latency voice applications such as conversational AI agents, customer support, dubbing, and audiobook narration, with a reported first-audio-byte latency of 90ms on Sonic 3.5. Pricing starts at $50 per million characters (the Pro plan: $5/month for 100,000 credits at roughly 1 credit per character), with a free tier of 20,000 characters per month and self-serve signup that needs no sales call. The API supports REST and WebSocket streaming and instant voice cloning on all plans, and it can be deployed in cloud regions, on-premises, or on-device. Cartesia reports SOC 2 Type II, HIPAA, GDPR, and PCI DSS compliance, and counts Quora, Cresta, and Rasa among its customers.
Best for / Avoid if
Best for:
- Prototypes and side projects: free to start, no sales call
- Regulated or enterprise workloads: compliance attestations and an enterprise plan
- AI agents and automation: an agent-ready surface (MCP server available; llms.txt is recorded as not present)
Pricing & procurement
- Pricing model: Hybrid (base + usage) [2]
- Published pricing: ✓ Yes
- Free tier: ✓ Yes [3]
- Free tier details: Free plan at $0/month includes 20,000 credits/month (approximately 20,000 characters of TTS at 1 credit per character). Includes instant voice cloning. Does not include professional voice cloning or a commercial use license. [4]
- Self-serve signup: ✓ Yes
- Requires sales call: ✗ No
- Enterprise plan: ✓ Yes [5]
| Plan | Item | Unit / quantity | Amount | Source |
|---|---|---|---|---|
| Free | Monthly subscription | month | $0 | source |
| Free | TTS credits included | 20,000 credits/month | $0 | source |
| Pro | Monthly subscription | month | $5 | source |
| Pro | TTS credits included | 100,000 credits/month | $5 | source |
| Startup | Monthly subscription | month | $49 | source |
| Startup | TTS credits included | 1,250,000 credits/month | $49 | source |
| Scale | Monthly subscription | month | $299 | source |
| Scale | TTS credits included | 8,000,000 credits/month | $299 | source |
| - | Standard Sonic TTS: credit consumption rate | ~1 credit per character | - | source |
| - | TTS with Professional voice clone: credit consumption rate | ~1.5 credits per character | - | source |
| Startup | Professional voice clone fine-tuning (one-time per clone) | 1,000,000 credits per fine-tune | - | source |
| - | Voice localization (one-time per voice) | 225 credits per voice | - | source |
| - | Voice changer: credit consumption rate | 15 credits per second of audio | - | source |
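The plan table implies an effective per-credit price that falls as plans get larger. The sketch below works through that arithmetic using only the figures in the table. It assumes the full monthly allowance is used and ignores overage pricing, which this profile does not list; the names `PLANS`, `usd_per_million_credits`, `credits_for_tts`, and `smallest_plan_for` are illustrative.

```python
import math
from typing import Optional

# Monthly price (USD) and included credits, copied from the plan table above.
PLANS = {
    "Free": (0, 20_000),
    "Pro": (5, 100_000),
    "Startup": (49, 1_250_000),
    "Scale": (299, 8_000_000),
}

# Approximate credits consumed per character, from the same table.
CREDITS_PER_CHAR = {"standard": 1.0, "professional_clone": 1.5}


def usd_per_million_credits(plan: str) -> float:
    """Effective price per 1M credits if the whole allowance is used."""
    price, credits = PLANS[plan]
    return price * 1_000_000 / credits


def credits_for_tts(characters: int, voice: str = "standard") -> int:
    """Estimate credits needed to synthesize `characters` characters (rounded up)."""
    return math.ceil(characters * CREDITS_PER_CHAR[voice])


def smallest_plan_for(credits_needed: int) -> Optional[str]:
    """Cheapest listed plan whose monthly allowance covers the estimate."""
    for name, (_, included) in sorted(PLANS.items(), key=lambda kv: kv[1][0]):
        if included >= credits_needed:
            return name
    return None  # larger than Scale; the profile lists an Enterprise plan


if __name__ == "__main__":
    for name in PLANS:
        print(f"{name}: ${usd_per_million_credits(name):.2f} per 1M credits")
    monthly = credits_for_tts(600_000, "professional_clone")
    print(monthly, smallest_plan_for(monthly))
```

At full utilization the Pro plan works out to $50 per million credits, which matches the starting price quoted in the summary, while Startup and Scale come to $39.20 and roughly $37.38.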
Capabilities
- Supported actions: synthesize_speech, streaming_tts, websocket_tts, instant_voice_cloning, professional_voice_cloning, word_timestamps, phoneme_timestamps, speech_to_speech, multilingual_synthesis, emotion_control, speed_control, volume_control, pronunciation_dictionaries, voice_design, voice_localization, voice_infill, voice_changer [6]
- Regions: US, EU, global (cloud), on-premises, on-device
- Languages: American English, British English, Australian English, French (France), French (Canada), German, Spanish (Castilian), Spanish (Mexican), Italian, Portuguese (Brazilian), Russian, Japanese, Korean, Mandarin Chinese, Dutch, Hindi, Bengali, Bulgarian, Croatian, Czech, Danish, Finnish, Georgian, Greek, Gujarati, Hebrew, Hungarian, Indonesian, Kannada, Malayalam, Marathi, Norwegian, Polish, Punjabi, Romanian, Slovak, Swedish, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Vietnamese, Emirati Arabic, Malay, Arabic [7]
- Input types: plain text
- Output types: pcm_f32le, pcm_s16le, pcm_mulaw, pcm_alaw, wav, mp3, streaming chunks (SSE), streaming chunks (WebSocket) [8]
- Webhooks: ✗ No [9]
- Sandbox / test mode: ✗ No
- SDK languages: Python, JavaScript/TypeScript [10]
- MCP server: ✓ Yes [11]
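Of the output types listed above, the raw `pcm_*` encodings carry no container header, so a player has to be told the sample rate and channel count separately. Below is a minimal sketch, using only Python's standard `wave` module, of wrapping `pcm_s16le` bytes in a WAV container. The 44,100 Hz mono defaults and the function names are assumptions for illustration, not values taken from Cartesia's documentation; they must match whatever was actually requested from the API.

```python
import io
import wave


def pcm_s16le_to_wav(pcm: bytes, sample_rate: int = 44_100, channels: int = 1) -> bytes:
    """Wrap raw signed 16-bit little-endian PCM in a WAV (RIFF) container."""
    frame_size = 2 * channels  # 2 bytes per sample, one sample per channel
    if len(pcm) % frame_size:
        raise ValueError("PCM length is not a whole number of frames")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def duration_seconds(pcm: bytes, sample_rate: int = 44_100, channels: int = 1) -> float:
    """Playback length of a raw pcm_s16le buffer."""
    return len(pcm) / (2 * channels * sample_rate)


if __name__ == "__main__":
    silence = b"\x00\x00" * 44_100  # one second of mono silence
    with open("out.wav", "wb") as fh:
        fh.write(pcm_s16le_to_wav(silence))
    print(duration_seconds(silence))
```

The same approach does not apply directly to `pcm_f32le`, `pcm_mulaw`, or `pcm_alaw`, which would need converting to integer PCM first because the `wave` module only writes uncompressed integer samples.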
Trust & compliance
- SOC 2: Type II [12]
- HIPAA: ✓ Yes [13]
- GDPR: ✓ Yes [14]
- ISO 27001: – Unknown [15]
- PCI DSS: ✓ Yes [16]
- Published SLA: ✗ No [17]
- Rate limits: TTS concurrent requests: Free = 2, Pro = 3, Startup = 5, Scale = 15, Enterprise = custom. WebSocket connections are limited to 10x the plan's concurrency limit, and idle WebSocket connections are closed after 5 minutes. Exceeding a limit returns HTTP 429. First-audio-byte latency: 90ms (Sonic 3.5). [18]
- Known restrictions: Voice cloning requires explicit consent from the voice owner under the Acceptable Use Policy; audio of deceased persons, political candidates, or other people may not be posted without express permission; no SSML support is documented; professional voice cloning is available on the Startup plan and above only, while instant voice cloning is available on all plans including Free; the Free plan does not include a commercial use license; request length is described as unlimited (no per-request character cap is documented); voice sample recordings may be retained by Cartesia solely for voice cloning functionality. [19]
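The rate limits above are expressed as concurrent requests rather than requests per time window, so a client-side semaphore sized to the plan's limit is a natural fit. Below is a minimal sketch, assuming the application supplies its own async `send` callable that returns an HTTP status code and a response body. `CONCURRENCY`, `RateLimitedClient`, `max_websocket_connections`, and the backoff values are hypothetical and not part of any Cartesia SDK.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple

# Concurrent TTS request limits per plan, as listed in this profile.
CONCURRENCY = {"Free": 2, "Pro": 3, "Startup": 5, "Scale": 15}

# Hypothetical transport: takes the text, returns (HTTP status, body).
Send = Callable[[str], Awaitable[Tuple[int, bytes]]]


def max_websocket_connections(plan: str) -> int:
    """WebSocket connections are capped at 10x the concurrency limit."""
    return 10 * CONCURRENCY[plan]


class RateLimitedClient:
    """Caps in-flight requests at the plan limit and retries HTTP 429."""

    def __init__(self, send: Send, plan: str, max_retries: int = 4,
                 base_delay: float = 0.25) -> None:
        self._send = send
        self._slots = asyncio.Semaphore(CONCURRENCY[plan])
        self._max_retries = max_retries
        self._base_delay = base_delay

    async def synthesize(self, text: str) -> bytes:
        for attempt in range(self._max_retries + 1):
            async with self._slots:  # hold a slot only while the request is in flight
                status, body = await self._send(text)
            if status != 429:
                return body  # handling of other error statuses is left to `send`
            if attempt < self._max_retries:
                # Exponential backoff with jitter before the next attempt.
                delay = self._base_delay * (2 ** attempt)
                await asyncio.sleep(delay + random.uniform(0, delay))
        raise RuntimeError("still rate limited after retries")
```

Create the client inside the running event loop (for example within the coroutine passed to `asyncio.run`) so the semaphore is bound to the loop that uses it on older Python versions. Releasing the slot before sleeping keeps a backing-off request from blocking others.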
Developer surface
Integration
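The change history on this profile records a REST API at `https://api.cartesia.ai`, API-key authentication, and a date-based API version of `2026-03-01`. Below is a hedged sketch of how a synthesis request might be assembled from those fields with Python's standard library. The `/tts/bytes` path, the `Cartesia-Version` and `X-API-Key` header names, and the body fields are assumptions that this profile does not confirm, and the model and voice IDs are caller-supplied placeholders. Check the quickstart linked in the change history before relying on any of them.

```python
import json
import urllib.request

BASE_URL = "https://api.cartesia.ai"  # from this profile
API_VERSION = "2026-03-01"            # from this profile


def build_tts_request(api_key: str, transcript: str, voice_id: str,
                      model_id: str) -> urllib.request.Request:
    """Assemble (but do not send) a TTS request.

    The endpoint path, header names, and body shape are assumed, not sourced.
    """
    body = {
        "model_id": model_id,
        "transcript": transcript,
        "voice": {"mode": "id", "id": voice_id},
        "output_format": {
            "container": "raw",
            "encoding": "pcm_s16le",  # one of the output types listed in this profile
            "sample_rate": 44100,     # assumed value
        },
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/tts/bytes",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Cartesia-Version": API_VERSION,  # assumed header name
            "X-API-Key": api_key,             # assumed header name
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately keeps the assumed parts in one place, so they are easy to correct against the official reference.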
Adoption & maturity
- Launched: 2023-01-01
- Notable customers: Quora, Cresta, Rasa
Other Text-to-Speech APIs
ElevenLabs Text to Speech
"Text to Speech with high quality, human-like AI voices"
Azure AI Text to Speech
"Text to speech enables your applications, tools, or devices to convert text into human like synthesized speech. The text to speech capability is also known as speech synthesis. Use human like standard voices out of the box, or create a custom voice that's unique to your product or brand."
Amazon Polly
"Amazon Polly is a cloud service that converts text into lifelike speech. You can use Amazon Polly to develop applications that increase engagement and accessibility."
Google Cloud Text-to-Speech
"Cloud Text-to-Speech converts text or Speech Synthesis Markup Language (SSML) input into audio data of natural human speech."
Murf AI
"Enterprise-grade AI voice generation with 150+ natural-sounding voices across 35 languages and 20+ speaking styles."
OpenAI Text to Speech (gpt-4o-mini-tts / tts-1)
"Transform text into lifelike spoken audio" - OpenAI's TTS service enabling blog narration, multilingual audio production, and realtime voice output via gpt-4o-mini-tts, tts-1, and tts-1-hd models.
References
- [1] Description: cartesia.ai · docs.cartesia.ai
- [2] Pricing model: cartesia.ai
- [3] Free tier: cartesia.ai
- [4] Free tier details: cartesia.ai
- [5] Enterprise plan: cartesia.ai
- [6] Supported actions: docs.cartesia.ai · docs.cartesia.ai
- [7] Languages: docs.cartesia.ai · cartesia.ai
- [8] Output types: docs.cartesia.ai · docs.cartesia.ai
- [9] Webhooks: docs.cartesia.ai
- [10] SDK languages: docs.cartesia.ai
- [11] MCP server: docs.cartesia.ai · github.com
- [12] SOC 2: cartesia.ai · cartesia.ai
- [13] HIPAA: cartesia.ai · cartesia.ai
- [14] GDPR: cartesia.ai · cartesia.ai
- [15] ISO 27001: cartesia.ai
- [16] PCI DSS: cartesia.ai · cartesia.ai
- [17] Published SLA: cartesia.ai
- [18] Rate limits: docs.cartesia.ai · docs.cartesia.ai
- [19] Known restrictions: cartesia.ai · cartesia.ai
Change history
- 2026-06-21 Capabilities: {} → {"streaming":true,"multilingual":true,"voice_design":true,"voice_cloning":true,…
- 2026-06-21 Summary Md: (none) → Cartesia's Sonic API is a text-to-speech service built for low-latency voice ap…
- 2026-06-21 Score Docs Quality: (none) → 15
- 2026-06-21 Score Procurement Friction: (none) → 100
- 2026-06-21 Score Trust Readiness: (none) → 65
- 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Score Agent Friendliness: (none) → 50
- 2026-06-21 Score Pricing Transparency: (none) → 100
- 2026-06-21 Score Setup Speed: (none) → 80
- 2026-06-21 Llms Txt Present: (none) → No
- 2026-06-21 Has Structured Data: (none) → Yes
- 2026-06-21 Robots Allows Agents: (none) → Yes
- 2026-06-21 Status Page URL: (none) → https://status.cartesia.ai
- 2026-06-21 Docs URL: (none) → https://docs.cartesia.ai/get-started/overview
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 MCP Server Available: set to Yes
- 2026-06-21 Pricing Model: set to hybrid
- 2026-06-21 Has Published Pricing: set to Yes
- 2026-06-21 Free Tier Available: set to Yes
- 2026-06-21 Free Tier Details: set to Free plan at $0/month includes 20,000 credits/month (approximately 20,000 chara…
- 2026-06-21 Self Serve Signup: set to Yes
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to Yes
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 PCI DSS: set to Yes
- 2026-06-21 SLA Published: set to No
- 2026-06-21 Data Retention Policy URL: set to https://cartesia.ai/legal/privacy.html
- 2026-06-21 Documented Rate Limits: set to TTS concurrent requests: Free=2, Pro=3, Startup=5, Scale=15, Enterprise=custom.…
- 2026-06-21 Rate Limit Requests: set to 2
- 2026-06-21 Rate Limit Window: set to concurrent
- 2026-06-21 Known Restrictions: set to Voice cloning requires explicit consent from the voice owner per Acceptable Use…
- 2026-06-21 Auth Methods: set to api_key, jwt
- 2026-06-21 Auth Docs URL: set to https://docs.cartesia.ai/get-started/authenticate-your-client-applications.md
- 2026-06-21 API Style: set to rest
- 2026-06-21 Base URL: set to https://api.cartesia.ai
- 2026-06-21 API Version: set to 2026-03-01
- 2026-06-21 Versioning Scheme: set to date
- 2026-06-21 Stability: set to ga
- 2026-06-21 Deprecation Policy URL: set to https://docs.cartesia.ai/use-the-api/api-conventions.md
- 2026-06-21 MCP URL: set to https://github.com/cartesia-ai/cartesia-mcp
- 2026-06-21 Quickstart URL: set to https://docs.cartesia.ai/get-started/realtime-text-to-speech-quickstart.md
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 Requires Verification: set to No
- 2026-06-21 Starting Price Usd: set to 50
- 2026-06-21 Price Basis: set to 1M characters
- 2026-06-21 Free Tier Limit: set to 20,000 characters/month
- 2026-06-21 Launched At: set to 2023-01-01
- 2026-06-21 Notable Customers: set to Quora, Cresta, Rasa
Suggest an edit / leave a review
Leave a review or comment
curl -X POST https://apio.sh/api/feedback/cartesia \
-H 'Content-Type: application/json' \
-d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'
Suggest a correction to a field (cite a source)
curl -X POST https://apio.sh/api/suggest/cartesia/FIELD \
-H 'Content-Type: application/json' \
-d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'