Mistral Document AI (Mistral OCR)
"The world's best document extraction and understanding model." [1]
Mistral Document AI (Mistral OCR) is a REST API for extracting text, tables, images, and structured data from PDFs and scanned documents, with support for multilingual content, mathematical notation, and document annotation driven by a custom prompt. Pricing is usage-based, starting at $2.00 per 1,000 pages, with self-serve signup, no required sales call, and an enterprise plan for larger volumes. The service lists SOC 2 Type II and ISO 27001 attestations and GDPR compliance, and counts BNP Paribas, HSBC, BMW, SAP, and Snowflake among its notable customers. Python and TypeScript SDKs are available, and processing can be directed to either EU (default) or US infrastructure.
Best for / Avoid if
Best for: Regulated or enterprise workloads (compliance attestations and an enterprise plan); AI agents and automation (an agent-readable surface with llms.txt and an OpenAPI spec, though no MCP server is listed); teams needing broad API coverage out of the box
Avoid if: You want to try it free before paying
Scores
- Agent friendliness: 50 / 100
- Pricing transparency: 85 / 100
- Setup speed: 55 / 100
- Docs quality: 65 / 100
- Procurement ease: 85 / 100
- Trust readiness: 55 / 100
Pricing & procurement
- Pricing model: Usage-based [2]
- Published pricing: ✓ Yes [3]
- Free tier: ✗ No [4]
- Self-serve signup: ✓ Yes [5]
- Requires sales call: ✗ No
- Enterprise plan: ✓ Yes [6]
| Plan | Item | Per | Amount |
|---|---|---|---|
| | Document extraction (OCR), standard | 1,000 pages | $2 |
| | Annotated pages (structured JSON extraction) | 1,000 annotated pages | $3 |
| Batch API | Document extraction (OCR), batch (50% discount) | 1,000 pages | $1 |
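The per-1,000-page rates in the table above lend themselves to a quick budget estimate. The following is a minimal sketch, not an official calculator: the function name `estimate_cost_usd` is hypothetical, and it assumes strictly linear billing with no minimums or rounding to whole thousands. The table does not say whether annotated pages are billed in addition to or instead of OCR pages, so the function simply sums the counts it is given.

```python
# Rates in USD per 1,000 pages, copied from the pricing table above.
OCR_STANDARD_PER_1K = 2.00
ANNOTATION_PER_1K = 3.00
OCR_BATCH_PER_1K = 1.00


def estimate_cost_usd(pages: int, annotated_pages: int = 0, batch: bool = False) -> float:
    """Estimate spend for a job (assumption: linear per-page billing, no minimums)."""
    if pages < 0 or annotated_pages < 0:
        raise ValueError("page counts must be non-negative")
    # The batch rate replaces the standard OCR rate when batch=True.
    ocr_rate = OCR_BATCH_PER_1K if batch else OCR_STANDARD_PER_1K
    total = pages / 1000 * ocr_rate + annotated_pages / 1000 * ANNOTATION_PER_1K
    return round(total, 2)
```

For example, `estimate_cost_usd(10_000)` models 10,000 pages at the standard rate, and passing `batch=True` models the same volume at the discounted batch rate.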
Capabilities
- Supported actions: ocr_process (POST /v1/ocr), extract_text_markdown, extract_images_base64, extract_tables_markdown_or_html, extract_headers_footers, extract_bounding_boxes, confidence_scoring_per_word_or_page, structured_json_annotation, document_annotation_with_custom_prompt, batch_ocr (via /v1/batch), page_selection, document_qna (via /v1/chat/completions with OCR model) [7]
- Regions: European Union (default), United States [8]
- Languages: Russian, French, Hindi, Chinese, Portuguese, German, Spanish, Turkish, Ukrainian, Italian, Romanian; 40+ languages total (multilingual, thousands of scripts and fonts) [9]
- Input types: PDF, PNG, JPG, JPEG, TIFF, BMP, GIF, WEBP, AVIF, PPTX, DOCX, document_url, base64_encoded_document, image_url [10]
- Output types: JSON, Markdown, HTML (for tables) [11]
- Webhooks: ✗ No
- Sandbox / test mode: ✗ No
- SDK languages: Python, TypeScript [12]
- MCP server: ✗ No
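As a rough illustration of the `ocr_process` action (POST /v1/ocr) with a `document_url` input and page selection, the sketch below builds a request with the Python standard library but does not send it. The helper name `build_ocr_request` is hypothetical. The JSON field names (`model`, `document`, `pages`), the default model name `mistral-ocr-latest`, and the Bearer authorization scheme are assumptions not stated in this profile, so check them against the API reference before relying on them.

```python
import json
import urllib.request

# Base URL as recorded in this profile's change history.
BASE_URL = "https://api.mistral.ai/v1"


def build_ocr_request(api_key, document_url, pages=None, model="mistral-ocr-latest"):
    """Build (but do not send) a POST /v1/ocr request.

    Assumptions: body field names, the default model name, and the
    Bearer auth scheme are illustrative and not confirmed by this profile.
    """
    body = {
        "model": model,
        "document": {"type": "document_url", "document_url": document_url},
    }
    if pages is not None:
        # page_selection: assumed to be a list of page indexes.
        body["pages"] = list(pages)
    return urllib.request.Request(
        f"{BASE_URL}/ocr",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Calling `urllib.request.urlopen(build_ocr_request(key, url))` would perform the actual request. Building the request separately keeps the payload easy to inspect, which helps here because the profile lists no sandbox or test mode.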
Trust & compliance
- SOC 2: SOC 2 Type II [13]
- HIPAA: ✗ No
- GDPR: ✓ Yes [14]
- ISO 27001: ✓ Yes [15]
- PCI DSS: Unknown
- Published SLA: ✗ No
- Rate limits: Applied at the Workspace level and vary by usage tier. Limits cover requests per second (RPS), tokens per minute, and tokens per month. Tiers advance with cumulative billing: Free mode (default; limited, intended for evaluation and prototyping), Tier 2 (>$20), Tier 3 (>$100), Tier 4 (>$500), Custom (>$2,000; contact support). [16]
- Known restrictions: Maximum upload file size is 512 MB; maximum image size is 20 MB per image; uploaded files are retained for 30 days unless deleted earlier; batch jobs support up to 1 million requests per batch; rate limits vary by subscription tier; on-premises and cloud partner deployment is available for self-hosting use cases. [17]
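The tier thresholds and size caps listed above can be encoded as simple pre-flight checks. This is a sketch under stated assumptions: the function names `usage_tier` and `within_size_limit` are hypothetical, "MB" is read conservatively as 1,000,000 bytes, and each tier is assumed to apply once cumulative billing strictly exceeds its threshold. The profile notes that the Custom tier involves contacting support, so a "Custom" result here only signals eligibility.

```python
# Thresholds in USD of cumulative billing, highest first, from the
# rate-limit tiers listed above.
TIER_THRESHOLDS = [
    (2000, "Custom"),
    (500, "Tier 4"),
    (100, "Tier 3"),
    (20, "Tier 2"),
]

# Assumption: "MB" interpreted as 10**6 bytes (the more conservative reading).
MAX_UPLOAD_BYTES = 512 * 1000 * 1000
MAX_IMAGE_BYTES = 20 * 1000 * 1000


def usage_tier(cumulative_billing_usd: float) -> str:
    """Return the highest tier whose threshold has been strictly exceeded."""
    for threshold, name in TIER_THRESHOLDS:
        if cumulative_billing_usd > threshold:
            return name
    return "Free mode"


def within_size_limit(size_bytes: int, is_image: bool = False) -> bool:
    """Check a payload size against the upload or per-image cap."""
    limit = MAX_IMAGE_BYTES if is_image else MAX_UPLOAD_BYTES
    return 0 <= size_bytes <= limit
```

Exact RPS and token limits per tier are not published in this profile, so the sketch stops at naming the tier and does not attempt to throttle requests.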
Developer surface
Integration
Adoption & maturity
- Launched: 2025-03-06
- GA: 2025-03-06
- Notable customers: BNP Paribas, HSBC, ASML, CMA CGM, BMW, IBM, SAP, Stellantis, Snowflake, AXA, Cisco, Ericsson, Orange, TotalEnergies
Other OCR & Document Parsing APIs
Amazon Textract
"Automatically extract printed text, handwriting, layout elements, and data from any document"
Veryfi
"Documents into Data - securely, in seconds"
Google Document AI
"A document processing and understanding platform that takes unstructured data from documents and transforms it into structured data, making it easier to understand, analyze, and consume."
Azure AI Document Intelligence
"Azure Document Intelligence in Foundry Tools is a machine-learning based OCR and intelligent document processing service to automate extraction of key data from forms and documents."
Nanonets
"AI Agents for Enterprise Data Processing."
Extend
"Turn documents into high quality data"
References
- [1] Description: mistral.ai · mistral.ai
- [2] Pricing model: mistral.ai · mistral.ai
- [3] Published pricing: docs.mistral.ai · mistral.ai
- [4] Free tier: mistral.ai
- [5] Self-serve signup: mistral.ai · docs.mistral.ai
- [6] Enterprise plan: mistral.ai · mistral.ai
- [7] Supported actions: docs.mistral.ai · docs.mistral.ai
- [8] Regions: help.mistral.ai
- [9] Languages: mistral.ai · docs.mistral.ai
- [10] Input types: docs.mistral.ai · docs.mistral.ai
- [11] Output types: docs.mistral.ai · docs.mistral.ai
- [12] SDK languages: docs.mistral.ai · docs.mistral.ai
- [13] SOC 2: help.mistral.ai · trust.mistral.ai
- [14] GDPR: legal.mistral.ai · help.mistral.ai
- [15] ISO 27001: help.mistral.ai
- [16] Rate limits: help.mistral.ai · docs.mistral.ai
- [17] Known restrictions: docs.mistral.ai · docs.mistral.ai
Change history
- 2026-06-15 Score Docs Quality: 25 → 65
- 2026-06-15 Score Agent Friendliness: 25 → 50
- 2026-06-14 Openapi Spec URL: (none) → https://docs.mistral.ai/openapi.yaml
- 2026-06-14 Robots Allows Agents: (none) → Yes
- 2026-06-14 Has Structured Data: (none) → No
- 2026-06-14 API Reference URL: (none) → https://docs.mistral.ai/api
- 2026-06-14 Llms Txt URL: https://mistral.ai/llms.txt → https://docs.mistral.ai/llms.txt
- 2026-06-14 Capabilities: {} → {"tables":true,"agentic_output":true,"receipts_invoices":true}
- 2026-06-14 Summary Md: (none) → Mistral Document AI (Mistral OCR) is a REST API for extracting text, tables, im…
- 2026-06-14 Score Docs Quality: 0 → 25
- 2026-06-14 Best For: Regulated or enterprise workloads - compliance attestations and an enterprise p… → Regulated or enterprise workloads - compliance attestations and an enterprise p…
- 2026-06-14 Score Agent Friendliness: 10 → 25
- 2026-06-14 Rendering: (none) → static
- 2026-06-14 Llms Txt Present: (none) → Yes
- 2026-06-14 Llms Txt URL: (none) → https://mistral.ai/llms.txt
- 2026-06-14 Status Page URL: (none) → https://status.mistral.ai
- 2026-06-14 Docs URL: (none) → https://docs.mistral.ai
- 2026-06-14 Score Agent Friendliness: (none) → 10
- 2026-06-14 Scoring Methodology: (none) → Scores are computed deterministically from this profile's published, sourced fi…
- 2026-06-14 Avoid If: (none) → You want to try it free before paying
- 2026-06-14 Best For: (none) → Regulated or enterprise workloads - compliance attestations and an enterprise p…
- 2026-06-14 Score Trust Readiness: (none) → 55
- 2026-06-14 Score Procurement Friction: (none) → 85
- 2026-06-14 Score Docs Quality: (none) → 0
- 2026-06-14 Score Setup Speed: (none) → 55
- 2026-06-14 Score Pricing Transparency: (none) → 85
- 2026-06-14 Starting Price Usd: 2 → 2
- 2026-06-14 Last Verified At: 2026-06-13T00:00:00.000Z → 2026-06-14T00:00:00.000Z
- 2026-06-14 SDK Packages: Python, TypeScript → Python, TypeScript
- 2026-06-13 Known Restrictions: set to Maximum file size for uploads: 512 MB, Maximum image size: 20 MB per image, Upl…
- 2026-06-13 Auth Methods: set to api_key
- 2026-06-13 Auth Docs URL: set to https://docs.mistral.ai/getting-started/quickstarts/studio/activate-and-generat…
- 2026-06-13 API Style: set to rest
- 2026-06-13 Base URL: set to https://api.mistral.ai/v1
- 2026-06-13 API Version: set to v1
- 2026-06-13 Versioning Scheme: set to url
- 2026-06-13 Stability: set to ga
- 2026-06-13 Deprecation Policy URL: set to https://docs.mistral.ai/getting-started/models
- 2026-06-13 Quickstart URL: set to https://docs.mistral.ai/getting-started/quickstarts/developer/first-api-request
- 2026-06-13 Error Format: set to vendor-specific
- 2026-06-13 Requires Verification: set to No
- 2026-06-13 Starting Price Usd: set to 2
- 2026-06-13 Price Basis: set to 1,000 pages
- 2026-06-13 Launched At: set to 2025-03-06
- 2026-06-13 GA Date: set to 2025-03-06
- 2026-06-13 Notable Customers: set to BNP Paribas, HSBC, ASML, CMA CGM, BMW, IBM, SAP, Stellantis, Snowflake, AXA, Ci…
- 2026-06-13 Fields Not Found: set to pci_dss, exact per-model rate limits (RPS/TPM numbers not published), maximum p…
- 2026-06-13 Source Confidence: set to high
- 2026-06-13 Extractor: set to claude-subagent:sonnet
- 2026-06-13 Last Verified At: set to 2026-06-13T00:00:00.000Z
Suggest an edit / leave a review
Leave a review or comment
curl -X POST https://apio.sh/api/feedback/mistral-document-ai \
  -H 'Content-Type: application/json' \
  -d '{"kind":"review","rating":5,"body":"Your experience with this API…"}'
Suggest a correction to a field (cite a source)
curl -X POST https://apio.sh/api/suggest/mistral-document-ai/FIELD \
  -H 'Content-Type: application/json' \
  -d '{"value":"corrected value","citations":[{"url":"https://source.example/page","excerpt":"supporting quote"}],"note":"what changed and why"}'