Transcription models
wispa can transcribe with local models that run on your device or cloud models that use your own API key. Here is the full lineup.
The model is what turns your voice into text. wispa gives you two kinds: models that run on your device, and cloud models you reach with your own API key. Each mode uses one model, and you choose which.
On-device models
Whisper runs locally in sizes from Tiny to Large, plus a Turbo build, so you can trade speed for accuracy: smaller sizes are faster and lighter, larger sizes are more accurate. Parakeet runs through Apple MLX and is very fast on Apple Silicon Macs. With any local model, your audio is transcribed and then discarded on your computer, and transcription works offline.
Cloud models
OpenAI offers whisper-1 and the newer gpt-4o-transcribe and gpt-4o-mini-transcribe. Groq serves whisper-large-v3-turbo with very low latency. Cloud models need an internet connection and your own API key, and your audio is sent to that provider.
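To make "your audio is sent to that provider" concrete, here is a rough sketch of the kind of HTTP request a cloud transcription involves. It is not wispa's own code. The URLs are the providers' public transcription endpoints (Groq's is OpenAI-compatible); the `build_request` helper, the fixed multipart boundary, and the `clip.wav` filename are assumptions made for illustration. The function only builds the request and never sends it.

```python
import urllib.request

# Public transcription endpoints. Groq exposes an OpenAI-compatible route.
ENDPOINTS = {
    "openai": "https://api.openai.com/v1/audio/transcriptions",
    "groq": "https://api.groq.com/openai/v1/audio/transcriptions",
}


def build_request(provider, model, api_key, audio_bytes, filename="clip.wav"):
    """Build (but do not send) a multipart transcription request.

    Hypothetical helper for illustration. A real client would generate a
    random boundary; a fixed one keeps the sketch short.
    """
    boundary = "example-boundary-1234"
    body = b"".join([
        # Form field: which model should transcribe the audio.
        f'--{boundary}\r\nContent-Disposition: form-data; name="model"\r\n\r\n'
        f'{model}\r\n'.encode(),
        # Form field: the audio file itself. This is what leaves your device.
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode(),
        audio_bytes,
        f'\r\n--{boundary}--\r\n'.encode(),
    ])
    return urllib.request.Request(
        ENDPOINTS[provider],
        data=body,
        method="POST",
        headers={
            # Your own API key authenticates the request.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

For example, `build_request("groq", "whisper-large-v3-turbo", key, audio)` targets Groq, while `build_request("openai", "gpt-4o-transcribe", key, audio)` targets OpenAI. In both cases the audio bytes travel in the request body, which is why cloud models need a connection and a key.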
| Model | Where it runs | Best for |
|---|---|---|
| Whisper Tiny to Base | On your device | Fast and light, low-resource machines |
| Whisper Small to Large | On your device | Higher accuracy, fully offline |
| Parakeet | On your device (Apple Silicon) | Very fast transcription on a Mac |
| OpenAI | Cloud, your key | Top accuracy without a download |
| Groq | Cloud, your key | Very low-latency cloud transcription |
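The table above can also be read as a set of simple rules. The sketch below encodes each row and filters by three constraints. Everything in it is illustrative: `Model`, `CATALOG`, and `usable_models` are hypothetical names, not wispa settings, and it assumes a single yes/no flag for "has an API key" rather than one key per provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    local: bool                       # True: runs on your device
    apple_silicon_only: bool = False  # True: needs an Apple Silicon Mac


# One entry per table row. Names are the table's labels, not wispa setting keys.
CATALOG = [
    Model("Whisper Tiny to Base", local=True),
    Model("Whisper Small to Large", local=True),
    Model("Parakeet", local=True, apple_silicon_only=True),
    Model("OpenAI", local=False),
    Model("Groq", local=False),
]


def usable_models(apple_silicon: bool, online: bool, has_api_key: bool) -> list[str]:
    """Return the table rows that can run under the given constraints."""
    names = []
    for model in CATALOG:
        if model.local:
            # Local models work offline; Parakeet also needs Apple Silicon.
            ok = apple_silicon or not model.apple_silicon_only
        else:
            # Cloud models need both a connection and your own API key.
            ok = online and has_api_key
        if ok:
            names.append(model.name)
    return names
```

On a Windows laptop with no connection, `usable_models(apple_silicon=False, online=False, has_api_key=False)` leaves only the two Whisper rows, which matches the table.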
FAQ
Questions and answers
Which model is the default?
A local Whisper Base model is the recommended starting point. It downloads quickly and transcribes well on most machines without sending audio anywhere.
Can I use different models for different tasks?
Yes. Each mode sets its own model, so you might use a fast local model for quick notes and a cloud model for long, accuracy-critical dictation.
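As a mental model, per-mode selection is just a mapping from mode to model. The sketch below is hypothetical: the mode names, the `(provider, model)` pairs, and `sends_audio_off_device` are invented for illustration and do not reflect wispa's actual settings format.

```python
# Hypothetical mode names, each mapped to a (provider, model) pair.
# "local" means the model runs on your device.
MODE_MODELS = {
    "quick-notes": ("local", "whisper-base"),
    "long-dictation": ("openai", "gpt-4o-transcribe"),
}


def sends_audio_off_device(mode: str) -> bool:
    """True if the mode's model is a cloud model, so audio leaves your computer."""
    provider, _model = MODE_MODELS[mode]
    return provider != "local"
```

Here `sends_audio_off_device("quick-notes")` is `False` and `sends_audio_off_device("long-dictation")` is `True`, which is the trade-off to keep in mind when assigning models to modes.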
Is Parakeet available on Windows?
No. Parakeet runs through Apple MLX and is Apple Silicon only. On Windows you can use local Whisper models or a cloud provider.