Models
The Whisper models behind Ditto's transcription, how to pick one, switch between them, and clean up disk space.
What a model is
Ditto runs whisper.cpp, an optimized C/C++ port of OpenAI’s Whisper, and offers five model sizes. Each model is a single binary file that is loaded into RAM (and into VRAM when running on a GPU) when you transcribe.
Bigger models are more accurate, especially on accents, technical jargon, and noisy audio. They’re also slower and use more resources. For most day-to-day dictation, Small strikes the best balance.
The five sizes
| Model | Size | Time (GPU) | Time (CPU) | Best for |
|---|---|---|---|---|
| Tiny | 75 MB | ~150 ms | ~1.5 s | Quick notes, low-spec machines |
| Base | 142 MB | ~250 ms | ~3 s | Casual everyday use |
| Small | 466 MB | ~500 ms | ~8 s | Daily use, recommended |
| Medium | 1.5 GB | ~1.2 s | ~25 s | Accents, jargon, noisy environments |
| Large-v3 | 2.9 GB | ~2.5 s | ~80 s | Maximum quality, slow without GPU |
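As a rough illustration of how to read the table, the sketch below picks the largest model whose approximate time fits a latency budget. The numbers are copied from the table; the function name `largest_model_within` and the idea of a "budget" are assumptions made for this example, not part of Ditto.

```python
from typing import Optional

# Approximate timings from the table: (model, GPU seconds, CPU seconds),
# ordered from smallest to largest model.
MODELS = [
    ("Tiny", 0.15, 1.5),
    ("Base", 0.25, 3.0),
    ("Small", 0.5, 8.0),
    ("Medium", 1.2, 25.0),
    ("Large-v3", 2.5, 80.0),
]


def largest_model_within(budget_seconds: float, has_gpu: bool) -> Optional[str]:
    """Return the largest model whose approximate time fits the budget.

    Hypothetical helper for illustration; returns None if even Tiny is too slow.
    """
    best = None
    for name, gpu_s, cpu_s in MODELS:
        latency = gpu_s if has_gpu else cpu_s
        if latency <= budget_seconds:
            best = name  # later entries are larger, so keep overwriting
    return best


if __name__ == "__main__":
    print(largest_model_within(1.0, has_gpu=True))
    print(largest_model_within(3.0, has_gpu=False))
```

With a one-second budget on a GPU this selects Small, which matches the recommendation above. On CPU, the same budget rules out every model, so a looser budget or a GPU is needed.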
Switching the active model
- Open Settings from the tray icon.
- Go to the Models panel.
- Click the row of the model you want to use. Only downloaded models can be selected. Non-downloaded models show a Download button instead.
The change applies to the next transcription. The active model stays loaded in memory between transcriptions to keep latency low, so the first transcription after a switch may take slightly longer while the new model loads.
Downloading
If a model isn’t downloaded yet, click Download in its row. Ditto fetches it from Hugging Face and saves it to your %APPDATA%\ditto\models\ folder.
A progress bar appears while it downloads. You can:
- Cancel by clicking the X next to the progress bar. Partial files are cleaned up automatically.
- Switch tabs in Settings while it runs. The download keeps going in the background.
- Trigger another transcription with whatever model is currently active. Downloads don’t block recording.
Refreshing the list
If you delete a model file manually (from %APPDATA%\ditto\models\ in Explorer), Ditto won’t notice until you tell it to recheck. There’s a refresh button next to the Active model title. Click it and Ditto re-scans the folder.
If the model that was active is gone, Ditto falls back to whichever model is still available, or reopens the welcome window if none remain.
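The fallback behaviour can be sketched as follows. This page does not say which remaining model Ditto prefers, so the preference order here is an assumption supplied by the caller, and `pick_fallback` is a hypothetical name rather than Ditto's actual code.

```python
from typing import Iterable, Optional, Sequence


def pick_fallback(available: Iterable[str], preference: Sequence[str]) -> Optional[str]:
    """Return the first preferred model that is still available.

    Returns None when no model remains, which corresponds to the case
    where the welcome window is shown again.
    """
    available_set = set(available)
    for name in preference:
        if name in available_set:
            return name
    return None


if __name__ == "__main__":
    # Assumed preference order, for illustration only.
    order = ["small", "base", "medium", "tiny", "large-v3"]
    print(pick_fallback({"tiny", "medium"}, order))
    print(pick_fallback(set(), order))
```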
Where they live
All models live in a single folder:
%APPDATA%\ditto\models\
Filenames follow the pattern ggml-<size>.bin:
- ggml-tiny.bin
- ggml-base.bin
- ggml-small.bin
- ggml-medium.bin
- ggml-large-v3.bin
You can open this folder directly from Settings → Models → Storage → Open folder.
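If you would rather check the folder from a script, a minimal sketch like the one below lists the model files that match the ggml-<size>.bin pattern along with their sizes. The helper names `default_models_dir` and `downloaded_models` are made up for this example, and the script only reads the folder; it is not how Ditto itself scans it.

```python
import os
import re
from pathlib import Path

# Matches the documented naming pattern ggml-<size>.bin.
MODEL_FILE = re.compile(r"^ggml-(?P<size>.+)\.bin$")


def default_models_dir() -> Path:
    """Build %APPDATA%\\ditto\\models; falls back to the current directory
    if APPDATA is not set (for example, when run outside Windows)."""
    return Path(os.environ.get("APPDATA", ".")) / "ditto" / "models"


def downloaded_models(models_dir: Path) -> dict:
    """Map each model size name found in models_dir to its file size in bytes."""
    found = {}
    if not models_dir.is_dir():
        return found
    for entry in models_dir.iterdir():
        match = MODEL_FILE.match(entry.name)
        if match and entry.is_file():
            found[match.group("size")] = entry.stat().st_size
    return found


if __name__ == "__main__":
    for size, num_bytes in sorted(downloaded_models(default_models_dir()).items()):
        print(f"{size}: {num_bytes / 1_000_000:.0f} MB")
```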
Removing a model
To free disk space, just delete the .bin file from the models folder in Explorer. Then click the refresh button in Settings → Models so Ditto updates its UI.
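The same manual deletion can be scripted. The sketch below removes one model file by size name and reports how many bytes were freed. `remove_model` is a hypothetical helper, not a Ditto command, and it assumes the ggml-<size>.bin naming described above.

```python
from pathlib import Path


def remove_model(models_dir: Path, size: str) -> int:
    """Delete ggml-<size>.bin from models_dir.

    Returns the number of bytes freed, or 0 if the file was not there.
    """
    target = models_dir / f"ggml-{size}.bin"
    if not target.is_file():
        return 0
    freed = target.stat().st_size
    target.unlink()
    return freed
```

As with deleting the file in Explorer, Ditto will not notice the change until you click the refresh button in Settings → Models.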
To remove everything Ditto stored (settings + all models):
- Settings → Advanced → Delete all data
This wipes %APPDATA%\ditto\ entirely and restarts the app from scratch with the welcome window.
Which one should I pick?
A rough guide:
- Default for trying it out: Base. Small enough to download fast, big enough to get a feel for what Whisper can do.
- For everyday dictation: Small. The sweet spot of speed and accuracy.
- For technical writing, accents, noisy rooms: Medium or Large-v3. The jump in accuracy is real, especially with proper nouns and rare words.
- For older or low-RAM machines: Tiny. Less accurate but always responsive.
You can always switch later; there’s no commitment to your first pick.