Cartesia (Sonic)

"The fastest and most natural text to speech model" [1]

cartesia.ai · By Cartesia · Last verified 2026-06-21 · Source confidence: high

Cartesia's Sonic API is a text-to-speech service built for low-latency voice applications such as conversational AI agents, customer support, dubbing, and audiobook narration, with a reported first-audio-byte latency of 90 ms on Sonic 3.5. Paid pricing starts at $50 per million characters (the Pro plan: $5 for 100,000 credits at roughly 1 credit per character), there is a free tier of 20,000 characters per month, and self-serve signup is available without a sales call. The API supports REST and WebSocket streaming, offers instant voice cloning on all plans, and can be deployed in cloud regions, on-premises, or on-device. Cartesia reports a SOC 2 Type II attestation and HIPAA, GDPR, and PCI DSS compliance, and counts Quora, Cresta, and Rasa among its customers.

Best for / Avoid if

Best for:

  • Prototypes and side projects - free to start, no sales call
  • Regulated or enterprise workloads - compliance attestations and an enterprise plan
  • AI agents and automation - an MCP server is available (no llms.txt was detected)

Pricing & procurement

Pricing model
Hybrid (base + usage) [2]
Published pricing
Yes
Free tier
Yes [3]
Free tier details
The Free plan ($0/month) includes 20,000 credits/month, or approximately 20,000 characters of TTS at 1 credit per character. It includes instant voice cloning but not professional voice cloning or a commercial-use license. [4]
Self-serve signup
Yes
Requires sales call
No
Enterprise plan
Yes [5]
Published prices
Plan | Item | Per | Amount
Free | Monthly subscription | month | $0
Free | TTS credits included | 20,000 credits/month | $0
Pro | Monthly subscription | month | $5
Pro | TTS credits included | 100,000 credits/month | $5
Startup | Monthly subscription | month | $49
Startup | TTS credits included | 1,250,000 credits/month | $49
Scale | Monthly subscription | month | $299
Scale | TTS credits included | 8,000,000 credits/month | $299
- | Standard Sonic TTS: credit consumption rate | ~1 credit per character | -
- | TTS with Professional voice clone: credit consumption rate | ~1.5 credits per character | -
Startup | Professional voice clone fine-tuning (one-time per clone) | 1,000,000 credits per fine-tune | -
- | Voice localization (one-time per voice) | 225 credits per voice | -
- | Voice changer: credit consumption rate | 15 credits per second of audio | -
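
The plan figures above can be turned into an effective price per million characters. The following is a minimal sketch, assuming the full monthly allowance is used and ignoring any overage or rollover rules, which this profile does not describe. The names `Plan`, `usd_per_million_chars`, and `cheapest_plan_for` are illustrative, and the consumption rates are the approximate ones from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_usd: float
    monthly_credits: int


# Plan figures copied from the published-prices table, ordered by price.
PLANS = [
    Plan("Free", 0, 20_000),
    Plan("Pro", 5, 100_000),
    Plan("Startup", 49, 1_250_000),
    Plan("Scale", 299, 8_000_000),
]

# Approximate consumption rates from the table (credits per character).
STANDARD_RATE = 1.0
PRO_CLONE_RATE = 1.5


def usd_per_million_chars(plan: Plan, credits_per_char: float = STANDARD_RATE) -> float:
    """Effective price per 1M characters if the whole allowance is consumed."""
    chars = plan.monthly_credits / credits_per_char
    return plan.monthly_usd * 1_000_000 / chars


def cheapest_plan_for(chars_per_month: int) -> Optional[Plan]:
    """First (cheapest) plan whose allowance covers standard TTS at this volume."""
    needed = chars_per_month * STANDARD_RATE
    for plan in PLANS:
        if plan.monthly_credits >= needed:
            return plan
    return None  # volume exceeds every self-serve plan listed here


if __name__ == "__main__":
    for p in PLANS:
        print(f"{p.name}: ${usd_per_million_chars(p):.2f} per 1M characters")
```

Under these assumptions the Pro plan works out to $50 per million characters, which matches the starting price quoted in the summary, and the larger plans come out cheaper per character.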

Capabilities

  • Real-time streaming
  • Voice cloning
  • Voice design
  • Multilingual voices
  • Word timestamps
Supported actions
synthesize_speech, streaming_tts, websocket_tts, instant_voice_cloning, professional_voice_cloning, word_timestamps, phoneme_timestamps, speech_to_speech, multilingual_synthesis, emotion_control, speed_control, volume_control, pronunciation_dictionaries, voice_design, voice_localization, voice_infill, voice_changer [6]
Regions
US, EU, global (cloud), on-premises, on-device
Languages
American English, British English, Australian English, French (France), French (Canada), German, Spanish (Castilian), Spanish (Mexican), Italian, Portuguese (Brazilian), Russian, Japanese, Korean, Mandarin Chinese, Dutch, Hindi, Bengali, Bulgarian, Croatian, Czech, Danish, Finnish, Georgian, Greek, Gujarati, Hebrew, Hungarian, Indonesian, Kannada, Malayalam, Marathi, Norwegian, Polish, Punjabi, Romanian, Slovak, Swedish, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Vietnamese, Emirati Arabic, Malay, Arabic [7]
Input types
plain text
Output types
pcm_f32le, pcm_s16le, pcm_mulaw, pcm_alaw, wav, mp3, streaming chunks (SSE), streaming chunks (WebSocket) [8]
Webhooks
No [9]
Sandbox / test mode
No
SDK languages
Python, JavaScript/TypeScript [10]
MCP server
Yes [11]
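
The raw PCM encodings listed under Output types differ mainly in bytes per sample, which matters when sizing buffers or estimating streaming bandwidth. Here is a small sketch of that arithmetic. The byte widths follow from the encoding names themselves; the sample rates in the usage lines are assumptions chosen for illustration, since this profile does not list supported sample rates.

```python
# Bytes per sample for the raw encodings listed under "Output types".
BYTES_PER_SAMPLE = {
    "pcm_f32le": 4,  # 32-bit float, little-endian
    "pcm_s16le": 2,  # 16-bit signed integer, little-endian
    "pcm_mulaw": 1,  # 8-bit mu-law companded
    "pcm_alaw": 1,   # 8-bit A-law companded
}


def raw_audio_bytes(encoding: str, sample_rate_hz: int, seconds: float, channels: int = 1) -> int:
    """Size in bytes of uncompressed audio of the given duration."""
    return int(BYTES_PER_SAMPLE[encoding] * sample_rate_hz * seconds * channels)


def bitrate_kbps(encoding: str, sample_rate_hz: int, channels: int = 1) -> float:
    """Sustained bitrate needed to stream the encoding in real time."""
    return BYTES_PER_SAMPLE[encoding] * 8 * sample_rate_hz * channels / 1000


if __name__ == "__main__":
    # Sample rates below are illustrative assumptions.
    print(raw_audio_bytes("pcm_f32le", 44_100, 1.0))  # bytes for one second of float audio
    print(bitrate_kbps("pcm_mulaw", 8_000))           # telephony-style stream
```

The companded 8-bit encodings need a quarter of the bandwidth of `pcm_f32le` at the same sample rate, which is one reason they are common in telephony pipelines.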

Trust & compliance

SOC 2
SOC 2 Type II [12]
HIPAA
Yes [13]
GDPR
Yes [14]
ISO 27001
Unknown [15]
PCI DSS
Yes [16]
Published SLA
No [17]
Rate limits
TTS concurrent requests: Free = 2, Pro = 3, Startup = 5, Scale = 15, Enterprise = custom. WebSocket connections are limited to 10x the plan's concurrency limit, and idle WebSocket connections are closed after 5 minutes. Exceeding a limit returns HTTP 429. First-audio-byte latency: 90 ms (Sonic 3.5). [18]
Known restrictions
  • Voice cloning requires explicit consent from the voice owner, per the Acceptable Use Policy.
  • Audio of deceased persons, political candidates, or others cannot be posted without express permission.
  • No SSML support is documented.
  • Professional voice cloning is available on the Startup plan and above only; instant voice cloning is available on all plans, including Free.
  • The Free plan does not include a commercial-use license.
  • Request length is described as unlimited (no per-request character cap is documented).
  • Voice sample recordings may be retained by Cartesia solely for voice cloning functionality. [19]
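
Because the limits above are on concurrent requests rather than requests per second, a client can stay within them by gating calls with a semaphore and backing off when it sees HTTP 429. The sketch below shows one way to do that. `RateLimited`, `with_backoff`, and `synthesize_all` are hypothetical names, and `synthesize` is any coroutine you supply that performs the actual call; nothing here is part of Cartesia's SDK.

```python
import asyncio
import random

# Concurrent-request limits per plan, taken from the Rate limits field.
CONCURRENCY = {"Free": 2, "Pro": 3, "Startup": 5, "Scale": 15}


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from the API."""


async def with_backoff(call, *, retries: int = 4, base_delay: float = 0.5):
    """Await `call()`, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except RateLimited:
            if attempt == retries:
                raise  # give up after the final attempt
            await asyncio.sleep(base_delay * 2**attempt * (1 + random.random()))


async def synthesize_all(texts, synthesize, plan: str = "Free"):
    """Run `synthesize(text)` for every text without exceeding the plan's limit."""
    gate = asyncio.Semaphore(CONCURRENCY[plan])

    async def one(text):
        async with gate:  # at most CONCURRENCY[plan] calls in flight
            return await with_backoff(lambda: synthesize(text))

    # gather() returns results in the same order as `texts`.
    return await asyncio.gather(*(one(t) for t in texts))
```

Holding the semaphore slot while backing off is a deliberate choice in this sketch: a request that was just rejected keeps its place instead of letting new requests pile onto a limit that is already saturated.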

Developer surface

Docs rendering: static

Integration

API style
rest
Base URL
https://api.cartesia.ai
Version
2026-03-01
Versioning
date
Stability
ga
Auth methods
api_key, jwt
Error format
vendor-specific
Rate limit
2 concurrent requests (Free plan; higher plans allow more, as listed under Rate limits)
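
To make the integration fields concrete, here is a rough sketch of how a request might be assembled from them. Only the base URL, the version string, and the use of API-key authentication come from this profile. The `/tts/bytes` path, the `Cartesia-Version` and `Authorization` header names, and every body field name are assumptions that should be checked against the official API reference before use. The function builds the request but does not send it.

```python
import json
import urllib.request

BASE_URL = "https://api.cartesia.ai"  # from this profile
API_VERSION = "2026-03-01"            # from this profile (date-based versioning)


def build_tts_request(api_key: str, transcript: str, voice_id: str, model_id: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request.

    ASSUMPTIONS not confirmed by this profile: the /tts/bytes path, the
    header names, and the JSON body field names and values.
    """
    body = {
        "model_id": model_id,                    # caller-supplied; no model ID is given here
        "transcript": transcript,
        "voice": {"mode": "id", "id": voice_id},
        "output_format": {
            "container": "raw",
            "encoding": "pcm_s16le",             # one of the listed output types
            "sample_rate": 16000,                # illustrative value
        },
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/tts/bytes",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Cartesia-Version": API_VERSION,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Pinning the version string in one constant keeps date-based upgrades to a one-line change.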

SDKs

  • Python cartesia · repo
  • JavaScript/TypeScript @cartesia/cartesia-js · repo

Adoption & maturity

Launched
2023-01-01
Notable customers
Quora, Cresta, Rasa

Other Text-to-Speech APIs

  • ElevenLabs Text to Speech

    "Text to Speech with high quality, human-like AI voices"

    Hybrid · free tier · public pricing · self-serve

  • Azure AI Text to Speech

    "Text to speech enables your applications, tools, or devices to convert text into human like synthesized speech. The text to speech capability is also known as speech synthesis. Use human like standard voices out of the box, or create a custom voice that's unique to your product or brand."

    Usage · free tier · public pricing · self-serve

  • Amazon Polly

    "Amazon Polly is a cloud service that converts text into lifelike speech. You can use Amazon Polly to develop applications that increase engagement and accessibility."

    Usage · free tier · public pricing · self-serve

  • Google Cloud Text-to-Speech

    "Cloud Text-to-Speech converts text or Speech Synthesis Markup Language (SSML) input into audio data of natural human speech."

    Usage · free tier · public pricing · self-serve

  • Murf AI

    "Enterprise-grade AI voice generation with 150+ natural-sounding voices across 35 languages and 20+ speaking styles."

    Usage · public pricing · self-serve

  • OpenAI Text to Speech (gpt-4o-mini-tts / tts-1)

    "Transform text into lifelike spoken audio" - OpenAI's TTS service enabling blog narration, multilingual audio production, and realtime voice output via gpt-4o-mini-tts, tts-1, and tts-1-hd models.

    Usage · public pricing · self-serve

References

Each field above carries a numbered source, listed here.

  1. Description: cartesia.ai · docs.cartesia.ai
  2. Pricing model: cartesia.ai
  3. Free tier: cartesia.ai
  4. Free tier details: cartesia.ai
  5. Enterprise plan: cartesia.ai
  6. Supported actions: docs.cartesia.ai · docs.cartesia.ai
  7. Languages: docs.cartesia.ai · cartesia.ai
  8. Output types: docs.cartesia.ai · docs.cartesia.ai
  9. Webhooks: docs.cartesia.ai
  10. SDK languages: docs.cartesia.ai
  11. MCP server: docs.cartesia.ai · github.com
  12. SOC 2: cartesia.ai · cartesia.ai
  13. HIPAA: cartesia.ai · cartesia.ai
  14. GDPR: cartesia.ai · cartesia.ai
  15. ISO 27001: cartesia.ai
  16. PCI DSS: cartesia.ai · cartesia.ai
  17. Published SLA: cartesia.ai
  18. Rate limits: docs.cartesia.ai · docs.cartesia.ai
  19. Known restrictions: cartesia.ai · cartesia.ai

Change history

Every field change, who made it, and when - from our audited data pipeline and editors.

  1. 2026-06-21 Capabilities: {} → {"streaming":true,"multilingual":true,"voice_design":true,"voice_cloning":true,…
  2. 2026-06-21 Summary Md: (none) → Cartesia's Sonic API is a text-to-speech service built for low-latency voice ap…
  3. 2026-06-21 Score Docs Quality: (none) → 15
  4. 2026-06-21 Score Procurement Friction: (none) → 100
  5. 2026-06-21 Score Trust Readiness: (none) → 65
  6. 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Regulated or enter…
  7. 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
  8. 2026-06-21 Score Agent Friendliness: (none) → 50
  9. 2026-06-21 Score Pricing Transparency: (none) → 100
  10. 2026-06-21 Score Setup Speed: (none) → 80
  11. 2026-06-21 Llms Txt Present: (none) → No
  12. 2026-06-21 Has Structured Data: (none) → Yes
  13. 2026-06-21 Robots Allows Agents: (none) → Yes
  14. 2026-06-21 Status Page URL: (none) → https://status.cartesia.ai
  15. 2026-06-21 Docs URL: (none) → https://docs.cartesia.ai/get-started/overview
  16. 2026-06-21 Rendering: (none) → static
  17. 2026-06-21 MCP Server Available: set to Yes
  18. 2026-06-21 Pricing Model: set to hybrid
  19. 2026-06-21 Has Published Pricing: set to Yes
  20. 2026-06-21 Free Tier Available: set to Yes
  21. 2026-06-21 Free Tier Details: set to Free plan at $0/month includes 20,000 credits/month (approximately 20,000 chara…
  22. 2026-06-21 Self Serve Signup: set to Yes
  23. 2026-06-21 Requires Sales Call: set to No
  24. 2026-06-21 Enterprise Plan Available: set to Yes
  25. 2026-06-21 SOC 2: set to type_2
  26. 2026-06-21 HIPAA: set to Yes
  27. 2026-06-21 PCI DSS: set to Yes
  28. 2026-06-21 SLA Published: set to No
  29. 2026-06-21 Data Retention Policy URL: set to https://cartesia.ai/legal/privacy.html
  30. 2026-06-21 Documented Rate Limits: set to TTS concurrent requests: Free=2, Pro=3, Startup=5, Scale=15, Enterprise=custom.…
  31. 2026-06-21 Rate Limit Requests: set to 2
  32. 2026-06-21 Rate Limit Window: set to concurrent
  33. 2026-06-21 Known Restrictions: set to Voice cloning requires explicit consent from the voice owner per Acceptable Use…
  34. 2026-06-21 Auth Methods: set to api_key, jwt
  35. 2026-06-21 Auth Docs URL: set to https://docs.cartesia.ai/get-started/authenticate-your-client-applications.md
  36. 2026-06-21 API Style: set to rest
  37. 2026-06-21 Base URL: set to https://api.cartesia.ai
  38. 2026-06-21 API Version: set to 2026-03-01
  39. 2026-06-21 Versioning Scheme: set to date
  40. 2026-06-21 Stability: set to ga
  41. 2026-06-21 Deprecation Policy URL: set to https://docs.cartesia.ai/use-the-api/api-conventions.md
  42. 2026-06-21 MCP URL: set to https://github.com/cartesia-ai/cartesia-mcp
  43. 2026-06-21 Quickstart URL: set to https://docs.cartesia.ai/get-started/realtime-text-to-speech-quickstart.md
  44. 2026-06-21 GDPR: set to Yes
  45. 2026-06-21 Requires Verification: set to No
  46. 2026-06-21 Starting Price Usd: set to 50
  47. 2026-06-21 Price Basis: set to 1M characters
  48. 2026-06-21 Free Tier Limit: set to 20,000 characters/month
  49. 2026-06-21 Launched At: set to 2023-01-01
  50. 2026-06-21 Notable Customers: set to Quora, Cresta, Rasa

Suggest an edit / leave a review

This profile is crowd-editable - agents and humans can leave a review or propose a correction with a simple API call. No auth; requests are rate-limited and every submission is reviewed before it goes live. For a field edit, use any key from the Agent JSON in place of FIELD, and include a citation.

Leave a review or comment

curl -X POST https://apio.sh/api/feedback/cartesia \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'

Suggest a correction to a field (cite a source)

curl -X POST https://apio.sh/api/suggest/cartesia/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'
