Which model should I choose?

Pick a model by what matters most to you: privacy, speed, accuracy or a light footprint. Here is a cheat sheet.

There is no single best model, only the best one for what you are doing. Find your priority in the first column and start there. You can always change it later, per mode.

Your priority	Pick	Why
Least setup, good balance	Whisper Base, local	Small download, transcribes on device, the recommended default
Best accuracy offline	Whisper Large or Turbo, local	Highest quality without sending audio anywhere
Fastest on a Mac	Parakeet, local	Uses Apple MLX, very fast on Apple Silicon
Top accuracy, no download	OpenAI, cloud	Strong models, needs your API key and a connection
Fastest cloud	Groq, cloud	whisper-large-v3-turbo with very low latency
An older, low-spec laptop	Whisper Tiny, local	Smallest and lightest, at some cost to accuracy

Rules of thumb

If privacy matters most, choose a local model. Your audio never leaves your device.
If you want the least fuss, Whisper Base is a solid balance of speed and accuracy.
If your machine struggles, drop to a smaller local model or switch to a cloud provider.
Keep a fast model for quick notes and an accurate one for long dictation, each in its own mode.

Questions and answers

I just installed wispa. What should I start with?

Whisper Base, local. It downloads quickly, runs on your device and is accurate enough for everyday dictation. If you want more accuracy, move up to a larger local model or switch to a cloud model.

My transcriptions are slow. What helps?

Use a smaller local model like Base or Tiny, or switch to a cloud provider such as Groq, which is very fast because the work happens on their servers.

Which is most accurate?

A large Whisper model or a strong cloud model such as OpenAI's. For clear speech, the gap to a mid-size model is small, so try Base first.
