Transcribe Audio Files with OpenAI Whisper

I record a lot of voice notes as I’m out and about, it’s often quicker then typing thoughts out, but when I get back to my laptop there will often be voice notes I want to transcribe into text. it’s not just my own voice notes though, sometimes there will be something that really stands out in something I’ve listened to on YouTube or a Podcast that I want to clip and transcribe.

It’s hard to overstate just how fantastic OpenAI’s Whisper is. It may be that I happen to have a nice clear voice, but I can transcribe everything in seconds on my HP EliteBook (with Intel onboard graphics only) and it’s extremely accurate, requiring only a few edits to the output text.

Installation is easy:

pip install -U openai-whisper

Then once you’ve done that just feed whisper one or more audio files and specify a model. I never really have to use anything more than the tiny model, it generally transcribes everything I say spot-on, any issues I can quickly correct. It really is amazing.

# Single audio file:
whisper audio_2025-01-16_21-54-03.ogg --model tiny --output_format txt

# All audio files in a directory:
whisper * --model tiny --output_format txt

The above command will output the transcription on the console, but by specifying --output_format it’ll also save it to a text file. If you don’t specify that, it’ll output the transcription in various formats that you may or may not need like json, srt etc.

It’s absolutely excellent. There’s no need to use any dodgy, paid online services, just use whisper.