Transcribe
Transcribe audio and video to text with AI in 100+ languages.
6 credits per minute.
What this tool is built for
Upload any audio or video file, paste a YouTube, Vimeo, or Instagram link, or record directly from your microphone. AI transcribes speech to text with high accuracy, automatic speaker identification, and one-click translation into 100+ languages.

Highlights
AI-powered speech recognition detects and labels different speakers, so interviews and meetings read cleanly.
Paste a YouTube, Vimeo, or Instagram link, or pull a file from Google Drive, Dropbox, or OneDrive — no manual download.
Turn on auto-translate to get the transcript in your target language alongside the source-language text.
How it works
Upload a file up to 2 GB, paste a streaming URL, or record directly in the browser.
Speech is segmented, timestamped, and labelled by speaker, with optional translation applied.
Download as SRT, VTT, TXT, PDF, DOCX, or CSV — with or without timestamps.
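To make the segment, timestamp, and export steps concrete, here is a minimal sketch of how speaker-labelled segments could be rendered as SRT cues. The `Segment` fields and the `to_srt` helper are hypothetical names chosen for illustration, not the tool's actual data model or API; only the SRT cue layout itself (cue number, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is standard.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span. Field names are assumptions for this sketch."""
    start: float   # seconds from the start of the media
    end: float     # seconds from the start of the media
    speaker: str   # label such as "Speaker 1"
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues, prefixing each with its speaker."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.speaker}: {seg.text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.5, "Speaker 1", "Thanks for joining."),
        Segment(2.5, 5.0, "Speaker 2", "Happy to be here."),
    ]
    print(to_srt(demo))
```

An export without timestamps, such as plain TXT, would reuse the same segments and simply drop the timing line.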
FAQ
Which file formats are supported?
MP3, MP4, WAV, M4A, FLAC, OGG, WebM, MOV, and AVI are all supported, up to 2 GB per file.
Can I transcribe a video from a link without downloading it?
Yes. Paste the YouTube, Vimeo, or Instagram URL directly and the tool fetches and transcribes the audio without a manual download step.
Which export formats are available?
Transcripts export as SRT, VTT, TXT, PDF, DOCX, or CSV, with or without timestamps. You can also auto-translate the transcript into 100+ languages before exporting.
Does the tool identify different speakers?
Yes. Automatic speaker identification labels who is speaking, which is useful for interviews, panels, and meetings.
How much does transcription cost?
6 credits per minute of audio. The credit cost is shown before you confirm the job.
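For a rough estimate before uploading, the per-minute rate can be applied to the length of the recording. The sketch below uses the 6 credits per minute stated on this page; the rounding rule is an assumption (partial minutes billed as whole minutes), since the page does not say how fractions of a minute are charged, and `estimate_credits` is a hypothetical helper rather than part of the tool. The figure shown before you confirm the job is the authoritative one.

```python
import math

CREDITS_PER_MINUTE = 6  # rate stated on this page


def estimate_credits(duration_seconds: float) -> int:
    """Estimate the credit cost of a recording.

    ASSUMPTION: a partial minute is billed as a full minute. The page does
    not specify the rounding rule, so treat the result as an upper-bound guess.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    billed_minutes = math.ceil(duration_seconds / 60)
    return billed_minutes * CREDITS_PER_MINUTE


if __name__ == "__main__":
    # A 45-minute interview: 45 minutes x 6 credits = 270 credits.
    print(estimate_credits(45 * 60))
```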