Text to Speech
Convert text and documents into natural AI voiceovers.
2 credits per minute.
What this tool is built for
Paste your text or upload a document — TXT, PDF, DOCX, PPTX, or VTT — or even an image with OCR. Choose an AI voice, set the emotion, speed, pitch, and volume, then generate and download a natural voiceover. Built for long-form scripts and documents.

Highlights
Choose an AI voice, then apply emotion presets — happy, sad, calm, neutral, and more — and adjust speed, pitch, and volume to match the tone.
Paste text, upload TXT, PDF, DOCX, PPTX, or VTT documents, or drop in an image and pull the text out with OCR.
Process articles, scripts, PDFs, and slide decks in one run — ideal for narration, study material, and business voiceovers.
How it works
Paste text or upload a document, VTT file, or image.
Pick a voice and fine-tune emotion, speed, pitch, and volume.
Generate the voiceover, preview the result, then download the audio.
FAQ
Supported inputs are direct text input, document uploads (TXT, MD, PDF, DOC, DOCX, PPTX, VTT), and image uploads with OCR, so you can convert text from many sources into speech.
You can fine-tune speed, pitch, and volume, and apply emotion presets such as happy, sad, calm, and neutral to match your use case.
The tool handles long-form content such as articles, scripts, PDFs, DOCX files, and slide decks, up to 200,000 characters per run.
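If a script is longer than the 200,000-character limit, it has to be divided across several runs. The sketch below shows one way to do that locally before pasting or uploading. It is only an illustration: the function name `split_into_runs` is hypothetical, it assumes paragraphs are separated by blank lines, and it assumes the limit is counted the same way Python counts string length, which this page does not confirm.

```python
MAX_CHARS_PER_RUN = 200_000  # per-run limit stated on this page


def split_into_runs(text: str, limit: int = MAX_CHARS_PER_RUN) -> list[str]:
    """Split text into pieces of at most `limit` characters.

    Paragraph boundaries (blank lines) are preferred as split points.
    A single paragraph longer than `limit` is cut at fixed offsets.
    """
    runs: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Cut an oversized paragraph into limit-sized pieces.
        pieces = [paragraph[i:i + limit] for i in range(0, len(paragraph), limit)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= limit:
                current = candidate      # still fits in the current run
            else:
                runs.append(current)     # close the current run
                current = piece          # start a new run with this piece
    if current:
        runs.append(current)
    return runs
```

Each returned piece can then be generated as its own run, and the resulting audio files joined afterwards in an editor of your choice.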
Use text to speech for plain text, long documents, and general voiceovers. Use subtitle to speech when you need VTT subtitle timing and cue-by-cue control.
Generation costs 2 credits per minute of generated audio.
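As a rough worked example of the pricing, the sketch below estimates credits from audio length. The rate of 2 credits per minute comes from this page. How partial minutes are billed is not stated here, so the `round_up_minutes` flag is an assumption that lets you compare both possibilities, and `estimate_credits` is a hypothetical helper name.

```python
import math

CREDITS_PER_MINUTE = 2  # rate stated on this page


def estimate_credits(audio_seconds: float, round_up_minutes: bool = True) -> float:
    """Estimate the credits used for a voiceover of the given length.

    round_up_minutes=True assumes partial minutes are billed as full
    minutes; False assumes pro-rata billing. Both are assumptions.
    """
    minutes = audio_seconds / 60
    if round_up_minutes:
        minutes = math.ceil(minutes)
    return CREDITS_PER_MINUTE * minutes


# A 10-minute narration costs 20 credits under either assumption.
print(estimate_credits(600))
# A 90-second clip: 4 credits if rounded up, 3 credits pro rata.
print(estimate_credits(90), estimate_credits(90, round_up_minutes=False))
```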