Best AI model for TTS Replacement (2026)
Models that produce natural-sounding speech. Ranked from 346 live models in the OpenRouter catalog, weighted for audio input support.
| # | Model | Model ID | Score | Input / 1M tokens | Output / 1M tokens | Context (tokens) |
|---|---|---|---|---|---|---|
| 1 | Xiaomi: MiMo-V2.5 | xiaomi/mimo-v2.5 | 115 | $0.40 | $2.00 | 1,048,576 |
| 2 | Xiaomi: MiMo-V2-Omni | xiaomi/mimo-v2-omni | 115 | $0.40 | $2.00 | 262,144 |
| 3 | Google: Gemini 3.1 Flash Lite Preview | google/gemini-3.1-flash-lite-preview | 115 | $0.25 | $1.50 | 1,048,576 |
| 4 | Google: Gemini 3.1 Pro Preview Custom Tools | google/gemini-3.1-pro-preview-customtools | 115 | $2.00 | $12.00 | 1,048,576 |
| 5 | Google: Gemini 3.1 Pro Preview | google/gemini-3.1-pro-preview | 115 | $2.00 | $12.00 | 1,048,576 |
| 6 | Google: Gemini 3 Flash Preview | google/gemini-3-flash-preview | 115 | $0.50 | $3.00 | 1,048,576 |
| 7 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google/gemini-2.5-flash-lite-preview-09-2025 | 115 | $0.10 | $0.40 | 1,048,576 |
| 8 | Google: Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite | 115 | $0.10 | $0.40 | 1,048,576 |
| 9 | Google: Gemini 2.5 Flash | google/gemini-2.5-flash | 115 | $0.30 | $2.50 | 1,048,576 |
| 10 | Google: Gemini 2.5 Pro | google/gemini-2.5-pro | 115 | $1.25 | $10.00 | 1,048,576 |
| 11 | Google: Gemini 2.5 Pro Preview 06-05 | google/gemini-2.5-pro-preview | 115 | $1.25 | $10.00 | 1,048,576 |
| 12 | Google: Gemini 2.5 Pro Preview 05-06 | google/gemini-2.5-pro-preview-05-06 | 115 | $1.25 | $10.00 | 1,048,576 |
| 13 | Google: Gemini 2.0 Flash Lite | google/gemini-2.0-flash-lite-001 | 115 | $0.07 | $0.30 | 1,048,576 |
| 14 | Google: Gemini 2.0 Flash | google/gemini-2.0-flash-001 | 115 | $0.10 | $0.40 | 1,048,576 |
| 15 | OpenAI: GPT Audio Mini | openai/gpt-audio-mini | 104 | $0.60 | $2.40 | 128,000 |
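The input and output price columns are quoted per one million tokens, so a per-request cost follows from simple arithmetic. The sketch below is only an illustration: `PRICES` and `estimate_cost` are hypothetical names, the two entries are copied from the table, and it assumes the listed rates apply uniformly to every token in a request, which may not hold for audio tokens.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from two rows of the table.
# The dictionary name and structure are assumptions made for this sketch.
PRICES = {
    "google/gemini-2.5-flash-lite": (Decimal("0.10"), Decimal("0.40")),
    "openai/gpt-audio-mini": (Decimal("0.60"), Decimal("2.40")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from per-million-token prices."""
    in_price, out_price = PRICES[model_id]
    total = in_price * input_tokens + out_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Compare the two models on the same hypothetical request size.
    for model_id in PRICES:
        print(model_id, estimate_cost(model_id, 2_000, 500))
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens costs $0.0004 on google/gemini-2.5-flash-lite and $0.0024 on openai/gpt-audio-mini. `Decimal` is used so that small per-request amounts are not distorted by floating-point rounding.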
How we ranked these
For TTS Replacement, we weight models on audio input support; a higher score is better. Scores combine OpenRouter's model metadata (context length, modality support, tool calling, structured output, reasoning capability) with public pricing.