Amazon Polly
"Amazon Polly is a cloud service that converts text into lifelike speech. You can use Amazon Polly to develop applications that increase engagement and accessibility." [1]
Amazon Polly is AWS's cloud text-to-speech service, launched in 2016 and suited to mobile apps, eLearning platforms, accessibility tools, IVR systems, and IoT applications. Pricing is usage-based, starting at $4.00 per million characters for standard voices, with a permanent free tier of 5 million standard-voice characters per month and additional neural, long-form, and generative allowances during the first 12 months. SDKs are available in ten languages, including Python, Node.js, Java, Go, and Rust, and the service runs in more than 20 AWS regions, including GovCloud. This profile lists SOC 2 Type II, HIPAA, GDPR, ISO 27001, and PCI DSS coverage.
Best for / Avoid if
Best for: Prototypes and side projects - free to start, no sales call; Teams needing broad API coverage out of the box; Cost-sensitive teams - low, transparent entry price
Pricing & procurement
- Pricing model
- Usage-based [2]
- Published pricing
- ✓ Yes [3]
- Free tier
- ✓ Yes [4]
- Free tier details
- Standard voices: 5 million characters/month (permanent, always-free). Neural voices: 1 million characters/month for the first 12 months. Long-Form voices: 500 thousand characters/month for the first 12 months. Generative voices: 100 thousand characters/month for the first 12 months. [5]
- Self-serve signup
- ✓ Yes
- Requires sales call
- ✗ No
- Enterprise plan
- ✗ No
| Plan | Item | Per | Amount | Source |
|---|---|---|---|---|
| Free Tier (Always Free) | Speech synthesis (Standard voices) | 5M characters per month (permanent, no expiry) | $0 | source |
| Free Tier (12 Months) | Speech synthesis (Neural voices) | 1M characters per month (first 12 months) | $0 | source |
| Free Tier (12 Months) | Speech synthesis (Long-Form voices) | 500K characters per month (first 12 months) | $0 | source |
| Free Tier (12 Months) | Speech synthesis (Generative voices) | 100K characters per month (first 12 months) | $0 | source |
| Pay As You Go | Speech synthesis (Standard voices) | 1M characters | $4.00 | source |
| Pay As You Go | Speech synthesis (Neural voices) | 1M characters | $16.00 | source |
| Pay As You Go | Speech synthesis (Long-Form voices) | 1M characters | $100.00 | source |
| Pay As You Go | Speech synthesis (Generative voices) | 1M characters | $30.00 | source |
| Pay As You Go (GovCloud) | Speech synthesis (Standard voices) | 1M characters | $4.80 | source |
| Pay As You Go (GovCloud) | Speech synthesis (Neural voices) | 1M characters | $19.20 | source |
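The rates and free allowances in the table can be turned into a rough monthly estimate. The sketch below is illustrative only: `estimate_monthly_cost` is a hypothetical helper, and it assumes that each engine is billed independently, that the free allowance is subtracted before any charge applies, and that the non-GovCloud rates are in effect. Check the published pricing page before relying on the numbers.

```python
# Rates per 1M characters and free monthly allowances, copied from the table.
RATE_PER_MILLION_USD = {
    "standard": 4.00,
    "neural": 16.00,
    "long-form": 100.00,
    "generative": 30.00,
}
FREE_CHARS_PER_MONTH = {
    "standard": 5_000_000,  # listed as permanent
    "neural": 1_000_000,    # listed for the first 12 months only
    "long-form": 500_000,   # listed for the first 12 months only
    "generative": 100_000,  # listed for the first 12 months only
}


def estimate_monthly_cost(characters: int, engine: str,
                          in_first_year: bool = True) -> float:
    """Estimate one month's charge in USD for a single engine (hypothetical helper)."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    rate = RATE_PER_MILLION_USD[engine]
    free = FREE_CHARS_PER_MONTH[engine]
    # Assumption: only the standard-voice allowance survives past the first year.
    if engine != "standard" and not in_first_year:
        free = 0
    billable = max(0, characters - free)
    return round(billable * rate / 1_000_000, 2)
```

For example, 6 million standard-voice characters in a month leave 1 million billable characters under these assumptions, so `estimate_monthly_cost(6_000_000, "standard")` returns `4.0`.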
Capabilities
- Supported actions
- synthesize_speech, streaming_tts, async_synthesis_task, ssml_support, custom_lexicons, speech_marks, word_timestamps, sentence_timestamps, viseme_timestamps, newscaster_speaking_style, neural_tts, standard_tts, long_form_tts, generative_tts, describe_voices, put_lexicon, get_lexicon, list_lexicons, delete_lexicon, get_speech_synthesis_task, list_speech_synthesis_tasks, start_speech_synthesis_stream [6]
- Regions
- US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Africa (Cape Town), Asia Pacific (Hong Kong), Asia Pacific (Malaysia), Asia Pacific (Mumbai), Asia Pacific (Osaka), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Thailand), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Spain), Europe (Stockholm), Europe (Zurich), Middle East (Bahrain), South America (São Paulo), AWS GovCloud (US-West) [7]
- Languages
- Arabic (arb), Arabic Gulf (ar-AE), Catalan (ca-ES), Chinese Cantonese (yue-CN), Chinese Mandarin (cmn-CN), Czech (cs-CZ), Danish (da-DK), Dutch Belgian (nl-BE), Dutch (nl-NL), English Australian (en-AU), English British (en-GB), English Indian (en-IN), English New Zealand (en-NZ), English Singaporean (en-SG), English South African (en-ZA), English US (en-US), English Welsh (en-GB-WLS), Finnish (fi-FI), French (fr-FR), French Belgian (fr-BE), French Canadian (fr-CA), Hindi (hi-IN), German (de-DE), German Austrian (de-AT), German Swiss (de-CH), Icelandic (is-IS), Italian (it-IT), Japanese (ja-JP), Korean (ko-KR), Norwegian (nb-NO), Polish (pl-PL), Portuguese Brazilian (pt-BR), Portuguese European (pt-PT), Romanian (ro-RO), Russian (ru-RU), Spanish Spain (es-ES), Spanish Mexican (es-MX), Spanish US (es-US), Swedish (sv-SE), Turkish (tr-TR), Welsh (cy-GB) [8]
- Input types
- plain text, SSML
- Output types
- mp3, ogg_vorbis, ogg_opus, pcm, mulaw, alaw, json (speech marks) [9]
- Webhooks
- ✗ No [10]
- Sandbox / test mode
- ✗ No [11]
- SDK languages
- Python, Node.js, Java, Go, C++, .NET, Ruby, PHP, Rust, Kotlin [12]
- MCP server
- ✗ No [13]
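As a rough illustration of how the input types, output types, and engines listed above fit together in one request, here is a minimal sketch using boto3's `synthesize_speech` call. The helper names, the default voice `Joanna`, and the default region are assumptions made for the example. The accepted format list is copied from this profile, and whether an installed SDK version accepts every value should be verified.

```python
from typing import Any, Dict

# Copied from the "Output types" entry and the engine names in this profile.
OUTPUT_FORMATS = {"mp3", "ogg_vorbis", "ogg_opus", "pcm", "mulaw", "alaw", "json"}
ENGINES = {"standard", "neural", "long-form", "generative"}


def build_synthesize_request(text: str, voice_id: str = "Joanna",
                             output_format: str = "mp3",
                             engine: str = "standard",
                             ssml: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for SynthesizeSpeech (hypothetical helper)."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    if engine not in ENGINES:
        raise ValueError(f"unsupported engine: {engine}")
    params: Dict[str, Any] = {
        "Text": text,
        "TextType": "ssml" if ssml else "text",
        "VoiceId": voice_id,
        "OutputFormat": output_format,
        "Engine": engine,
    }
    # Speech marks are returned as JSON instead of audio, so they need
    # their own request with the mark types spelled out.
    if output_format == "json":
        params["SpeechMarkTypes"] = ["word", "sentence"]
    return params


def synthesize_to_file(text: str, path: str, region: str = "us-east-1",
                       **options: Any) -> None:
    """Call Polly and write the returned stream to disk (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("polly", region_name=region)
    response = client.synthesize_speech(**build_synthesize_request(text, **options))
    with open(path, "wb") as handle:
        handle.write(response["AudioStream"].read())
```

A call such as `synthesize_to_file("Hello from Polly.", "hello.mp3")` would write an MP3 file, provided credentials are configured and the chosen voice supports the chosen engine.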
Trust & compliance
- SOC 2
- SOC 2 Type II [14]
- HIPAA
- ✓ Yes [15]
- GDPR
- ✓ Yes [16]
- ISO 27001
- ✓ Yes [17]
- PCI DSS
- ✓ Yes [18]
- Published SLA
- ✓ Yes [19]
- Rate limits
- SynthesizeSpeech (standard): 80 requests/sec; SynthesizeSpeech (neural/long-form/generative): 8 requests/sec; StartSpeechSynthesisTask (standard/neural): 10 requests/sec; StartSpeechSynthesisTask (generative/long-form): 1 request/sec. SynthesizeSpeech max input: 6,000 total characters / 3,000 billed characters per request. StartSpeechSynthesisTask max input: 200,000 total characters / 100,000 billed characters per request. All limits are per region and adjustable via Service Quotas. [20]
- Known restrictions
- SynthesizeSpeech API: maximum 6,000 total characters (including SSML tags) or 3,000 billed characters per request; StartSpeechSynthesisTask API: maximum 200,000 total characters or 100,000 billed characters per request; maximum 5 lexicons per synthesis request; maximum 100 lexicons per account per region; maximum lexicon size: 40,000 characters; speech marks (JSON) cannot be combined with audio output formats in a single request; not all voices support all engines (standard/neural/long-form/generative); not all alphabets/scripts are available for all voices [21]
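Because SynthesizeSpeech caps each request at 3,000 billed characters, longer plain text has to be split before it is sent (or routed to StartSpeechSynthesisTask instead). The following is a minimal sketch of one way to do the splitting. `split_for_synthesis` is a hypothetical helper, it treats every character as billed, and it is meant for plain text only, since cutting SSML at arbitrary points would break its tags.

```python
import re
from typing import List

MAX_BILLED_CHARS = 3000  # SynthesizeSpeech per-request limit quoted above


def split_for_synthesis(text: str, limit: int = MAX_BILLED_CHARS) -> List[str]:
    """Split plain text into chunks of at most `limit` characters.

    Prefers sentence boundaries, falls back to whitespace, and hard-cuts
    only when a single word is longer than the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Break up any sentence that cannot fit in one request by itself.
        while len(sentence) > limit:
            cut = sentence.rfind(" ", 0, limit + 1)
            if cut <= 0:
                cut = limit  # no whitespace available: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].rstrip())
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own request, keeping the per-second request limits above in mind when issuing them in a loop.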
Developer surface
Integration
- API style
- rest
- Base URL
- https://polly.{region}.amazonaws.com
- Version
- 2016-06-10
- Versioning
- url
- Stability
- ga
- Auth methods
- hmac_signature
- Error format
- vendor-specific
- Rate limit
- 80 requests / second (SynthesizeSpeech, standard voices; per region)
- Python
- boto3
- Node.js
- @aws-sdk/client-polly
- Java
- software.amazon.awssdk:polly
- Go
- github.com/aws/aws-sdk-go-v2/service/polly
- C++
- aws-sdk-cpp (polly)
- .NET
- AWSSDK.Polly
- Ruby
- aws-sdk-polly
- PHP
- aws/aws-sdk-php
- Rust
- aws-sdk-polly
- Kotlin
- aws.sdk.kotlin:polly
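The SDKs listed above handle request signing automatically, so most integrations never touch it. For readers curious what `hmac_signature` involves, the sketch below fills the regional base URL template and derives a signing key the way AWS Signature Version 4 does. The profile itself does not name the signing scheme, so treating it as Signature Version 4 with the service name `polly` is an assumption, and the function names are hypothetical. This covers only key derivation, not the full canonical-request signing process.

```python
import hashlib
import hmac


def polly_endpoint(region: str) -> str:
    """Fill the region into the base URL template from this profile."""
    return f"https://polly.{region}.amazonaws.com"


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "polly") -> bytes:
    """Derive a Signature Version 4 signing key.

    `date_stamp` is the request date as YYYYMMDD. Each step keys the next
    HMAC with the previous digest, scoping the key to a day, a region,
    and a service.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")
```

In practice the derived key signs a string built from the canonical request. Using an official SDK avoids having to implement that part by hand.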
Adoption & maturity
- Launched
- 2016-11-30
- GA
- 2016-11-30
Other Text-to-Speech APIs
ElevenLabs Text to Speech
"Text to Speech with high quality, human-like AI voices"
Azure AI Text to Speech
"Text to speech enables your applications, tools, or devices to convert text into human like synthesized speech. The text to speech capability is also known as speech synthesis. Use human like standard voices out of the box, or create a custom voice that's unique to your product or brand."
Google Cloud Text-to-Speech
"Cloud Text-to-Speech converts text or Speech Synthesis Markup Language (SSML) input into audio data of natural human speech."
Cartesia (Sonic)
"The fastest and most natural text to speech model"
Murf AI
"Enterprise-grade AI voice generation with 150+ natural-sounding voices across 35 languages and 20+ speaking styles."
OpenAI Text to Speech (gpt-4o-mini-tts / tts-1)
"Transform text into lifelike spoken audio" - OpenAI's TTS service enabling blog narration, multilingual audio production, and realtime voice output via gpt-4o-mini-tts, tts-1, and tts-1-hd models.
References
- [1] Description: docs.aws.amazon.com
- [2] Pricing model: aws.amazon.com
- [3] Published pricing: aws.amazon.com
- [4] Free tier: aws.amazon.com · aws.amazon.com
- [5] Free tier details: aws.amazon.com
- [6] Supported actions: docs.aws.amazon.com · docs.aws.amazon.com
- [7] Regions: docs.aws.amazon.com
- [8] Languages: docs.aws.amazon.com
- [9] Output types: docs.aws.amazon.com
- [10] Webhooks: docs.aws.amazon.com
- [11] Sandbox: aws.amazon.com
- [12] SDK languages: docs.aws.amazon.com
- [13] MCP server: aws.amazon.com
- [14] SOC 2: docs.aws.amazon.com · aws.amazon.com
- [15] HIPAA: aws.amazon.com · docs.aws.amazon.com
- [16] GDPR: aws.amazon.com · aws.amazon.com
- [17] ISO 27001: aws.amazon.com
- [18] PCI DSS: aws.amazon.com · docs.aws.amazon.com
- [19] Published SLA: aws.amazon.com
- [20] Rate limits: docs.aws.amazon.com · docs.aws.amazon.com
- [21] Known restrictions: docs.aws.amazon.com · docs.aws.amazon.com
Change history
- 2026-06-21 Capabilities: {} → {"ssml":true,"streaming":true,"multilingual":true,"word_timestamps":true}
- 2026-06-21 Summary Md: (none) → Amazon Polly is an AWS cloud text-to-speech service, launched in 2016, suited f…
- 2026-06-21 Score Pricing Transparency: (none) → 100
- 2026-06-21 Score Setup Speed: (none) → 85
- 2026-06-21 Score Docs Quality: (none) → 15
- 2026-06-21 Score Procurement Friction: (none) → 100
- 2026-06-21 Score Trust Readiness: (none) → 100
- 2026-06-21 Best For: (none) → Prototypes and side projects - free to start, no sales call, Teams needing broa…
- 2026-06-21 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-21 Score Agent Friendliness: (none) → 30
- 2026-06-21 Robots Allows Agents: (none) → Yes
- 2026-06-21 Status Page URL: (none) → https://status.aws.amazon.com
- 2026-06-21 Docs URL: (none) → https://docs.aws.amazon.com/
- 2026-06-21 Has Structured Data: (none) → Yes
- 2026-06-21 Rendering: (none) → static
- 2026-06-21 Llms Txt Present: (none) → No
- 2026-06-21 Pricing Model: set to usage_based
- 2026-06-21 Has Published Pricing: set to Yes
- 2026-06-21 Free Tier Available: set to Yes
- 2026-06-21 Free Tier Details: set to Standard voices: 5 million characters/month (permanent, always-free). Neural vo…
- 2026-06-21 Self Serve Signup: set to Yes
- 2026-06-21 Requires Sales Call: set to No
- 2026-06-21 Enterprise Plan Available: set to No
- 2026-06-21 SOC 2: set to type_2
- 2026-06-21 HIPAA: set to Yes
- 2026-06-21 GDPR: set to Yes
- 2026-06-21 ISO 27001: set to Yes
- 2026-06-21 PCI DSS: set to Yes
- 2026-06-21 SLA Published: set to Yes
- 2026-06-21 SLA URL: set to https://aws.amazon.com/ai/services/language-sla/
- 2026-06-21 Data Retention Policy URL: set to https://docs.aws.amazon.com/polly/latest/dg/data-protection.html
- 2026-06-21 Documented Rate Limits: set to SynthesizeSpeech (standard): 80 requests/sec; SynthesizeSpeech (neural/long-for…
- 2026-06-21 Rate Limit Requests: set to 80
- 2026-06-21 Rate Limit Window: set to second
- 2026-06-21 Known Restrictions: set to SynthesizeSpeech API: maximum 6,000 total characters (including SSML tags) or 3…
- 2026-06-21 Auth Methods: set to hmac_signature
- 2026-06-21 Auth Docs URL: set to https://docs.aws.amazon.com/polly/latest/dg/security-iam.html
- 2026-06-21 API Style: set to rest
- 2026-06-21 Base URL: set to https://polly.{region}.amazonaws.com
- 2026-06-21 API Version: set to 2016-06-10
- 2026-06-21 Versioning Scheme: set to url
- 2026-06-21 Stability: set to ga
- 2026-06-21 Deprecation Policy URL: set to https://docs.aws.amazon.com/general/latest/gr/service-lifecycle.html
- 2026-06-21 Quickstart URL: set to https://docs.aws.amazon.com/polly/latest/dg/getting-started.html
- 2026-06-21 Error Format: set to vendor-specific
- 2026-06-21 Slug: set to amazon-polly
- 2026-06-21 Starting Price Usd: set to 4
- 2026-06-21 Price Basis: set to 1M characters
- 2026-06-21 Free Tier Limit: set to 5 million characters/month (standard voices, permanent); 1 million characters/m…
- 2026-06-21 Launched At: set to 2016-11-30