VTT-first editor
Subtitle to Speech
Convert VTT subtitle files into AI voiceover audio.
4 credits per minute.
What this tool is built for
Upload a VTT subtitle file, review each timestamped cue, choose an AI voice, and generate clean audio. Turn on strict mode to align the speech closer to your subtitle timing. Best for timed captions, dubbing drafts, and accessibility narration.

Highlights
Upload subtitle files and review each timestamped cue instead of working in a generic text box.
Enable strict mode when alignment to the subtitle windows matters more than perfectly natural pacing.
Upload script files like TXT, PDF, DOCX, or PPTX when you want speech output without subtitle timestamps.
How it works
Drop a VTT subtitle file, or a TXT, PDF, DOCX, or PPTX script for a non-timed run.
Select the AI voice and decide whether to enable strict subtitle timing.
Generate the voice track and download it as MP3 to sync with your video.
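The cue review step above can be sketched in code. The example below is a minimal illustration, not the tool's actual parser: it assumes well-formed WebVTT input, and the names `Cue`, `parse_vtt`, and `_seconds` are hypothetical. It splits a VTT file into timestamped cues, skipping the header and any block without a timing line, and ignores cue settings such as `align:start`.

```python
import re
from dataclasses import dataclass

# Matches WebVTT timestamps: "HH:MM:SS.mmm" or the short form "MM:SS.mmm".
_TS = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})")


@dataclass
class Cue:
    start: float  # cue start, in seconds
    end: float    # cue end, in seconds
    text: str     # cue text, joined onto one line


def _seconds(ts: str) -> float:
    """Convert a WebVTT timestamp string to seconds."""
    m = _TS.fullmatch(ts)
    if m is None:
        raise ValueError(f"bad timestamp: {ts!r}")
    hours, minutes, secs, millis = m.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(secs) + int(millis) / 1000


def parse_vtt(source: str) -> list[Cue]:
    """Return the timed cues found in a WebVTT document."""
    cues = []
    # Blocks (header, notes, cues) are separated by blank lines.
    blocks = re.split(r"\n\s*\n", source.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        # The timing line is the one containing "-->"; blocks without it
        # (the WEBVTT header, NOTE or STYLE blocks) are skipped.
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue
        start_raw, rest = lines[idx].split("-->", 1)
        end_raw = rest.split()[0]  # drop cue settings after the end time
        text = " ".join(line.strip() for line in lines[idx + 1:] if line.strip())
        cues.append(Cue(_seconds(start_raw.strip()), _seconds(end_raw), text))
    return cues
```

Each resulting `Cue` carries the time window that a generated voice clip would need to fit when strict timing is enabled.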
FAQ
How does subtitle to speech work?
Upload a VTT subtitle file, review the subtitle lines and timestamps, choose an AI voice, then generate audio. Each subtitle cue is turned into speech and assembled into the final track for download.
What does strict mode do?
Strict mode forces the generated speech to fit the subtitle timestamps more closely. It is useful when timing accuracy matters, but the voice can sound slightly less natural in tight subtitle windows.
Can I upload files other than VTT?
Yes. You can upload text-based files such as TXT, MD, PDF, DOC, DOCX, and PPTX. VTT is the best option when you need timestamp-aware subtitle timing.
When should I use subtitle to speech instead of text to speech?
Use subtitle to speech when you have VTT subtitle timing and want cue-by-cue control. Use text to speech for plain text, articles, and long documents that do not need subtitle timing.
How much does it cost?
4 credits per minute of generated audio.
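Strict timing and pricing both come down to simple arithmetic, sketched below. This is an illustration under stated assumptions rather than a description of how the tool works internally: `strict_speed_factor`, `estimate_credits`, and the `max_factor` cap are hypothetical, and the credit estimate assumes pro-rated billing with no rounding, which the page does not confirm.

```python
def strict_speed_factor(speech_seconds: float, cue_start: float, cue_end: float,
                        max_factor: float = 1.5) -> float:
    """Playback speed-up needed for a clip to fit its subtitle window.

    Returns 1.0 when the clip already fits. The cap (max_factor) is an
    assumed safeguard against unnaturally fast speech in tight windows.
    """
    window = cue_end - cue_start
    if window <= 0:
        raise ValueError("cue must end after it starts")
    if speech_seconds <= window:
        return 1.0
    return min(speech_seconds / window, max_factor)


def estimate_credits(audio_seconds: float, credits_per_minute: float = 4) -> float:
    """Estimated cost at 4 credits per minute, assuming pro-rated billing."""
    return audio_seconds / 60 * credits_per_minute
```

For instance, a 2.5 second clip in a 2 second window would need a 1.25x speed-up, and 90 seconds of generated audio would come to an estimated 6 credits under the pro-rating assumption.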