Privacy & data
What Ditto does and doesn't send to the internet, what it stores on your machine, and what other apps can see.
The short version
- Audio is captured, processed, and discarded locally. It never leaves your machine.
- Transcriptions are generated by whisper.cpp running on your CPU or GPU. No transcription server is involved.
- Ditto talks to the internet only to download Whisper models from HuggingFace, on demand, when you click Download.
- No telemetry, no analytics, and no crash reports are sent anywhere.
Audio
When you record:
- The renderer captures audio with getUserMedia and MediaRecorder, the same APIs a website uses for a “record voice” feature.
- The audio is decoded, resampled to 16 kHz mono, and converted to 16-bit PCM in the renderer process.
- The PCM is sent to the main process over IPC as an ArrayBuffer.
- Main writes it to a temporary .wav file in %TEMP%.
- whisper-cli.exe reads the WAV, transcribes it, and prints the text to stdout.
- The temp .wav is deleted in the finally block of the IPC handler, immediately after transcription.
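The PCM conversion and WAV steps above can be sketched as follows. This is a minimal illustration, not Ditto's actual code: the function names floatTo16BitPcm and encodeWav are hypothetical, decoding and resampling are left out, and the input is assumed to already be 16 kHz mono Float32 samples.

```typescript
// Sketch only: function names are hypothetical, not taken from Ditto's source.

/** Convert Float32 samples in [-1, 1] to signed 16-bit PCM. */
export function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  samples.forEach((sample, i) => {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, sample));
    // Negative values scale by 32768 and positive by 32767 to use the full range.
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  });
  return out;
}

/** Wrap mono 16-bit PCM in a 44-byte RIFF/WAVE header. */
export function encodeWav(pcm: Int16Array, sampleRate = 16000): ArrayBuffer {
  const dataBytes = pcm.length * 2;
  const buffer = new ArrayBuffer(44 + dataBytes);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataBytes, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, 1, true); // one channel (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate = rate * channels * 2 bytes
  view.setUint16(32, 2, true); // block align = channels * 2 bytes
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataBytes, true);
  pcm.forEach((sample, i) => view.setInt16(44 + i * 2, sample, true));
  return buffer;
}
```

Everything here runs on in-memory buffers, which is why the conversion can happen in the renderer without touching the disk or the network.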
The audio never crosses a network boundary. The only time it touches disk is the temp WAV, which is removed right after use. If the app crashes mid-transcription, a leftover WAV can remain in %TEMP% until Windows’ temp folder maintenance (for example Storage Sense or Disk Cleanup) removes it.
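The delete-in-finally pattern might look roughly like this. It is a sketch under assumptions: withTempWav and the ditto- file name prefix are hypothetical, and the transcribe callback stands in for whatever launches whisper-cli.exe in the real IPC handler.

```typescript
import { randomUUID } from "node:crypto";
import { promises as fs } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/**
 * Write the WAV bytes to a uniquely named temp file, hand the path to
 * `transcribe`, and delete the file whether transcription succeeds or throws.
 * (Hypothetical helper; the real handler may be structured differently.)
 */
export async function withTempWav<T>(
  wav: Uint8Array,
  transcribe: (wavPath: string) => Promise<T>,
): Promise<T> {
  // os.tmpdir() resolves to %TEMP% on Windows.
  const wavPath = join(tmpdir(), `ditto-${randomUUID()}.wav`);
  await fs.writeFile(wavPath, wav);
  try {
    return await transcribe(wavPath);
  } finally {
    // force: true means a file that is already gone is not an error.
    await fs.rm(wavPath, { force: true });
  }
}
```

The finally block covers both the success path and a thrown error, but not a hard crash of the process, which is the case where a WAV can be left behind.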
Transcriptions
Same story. Once Whisper produces text, Ditto:
- Sets your clipboard to the text.
- Optionally simulates Ctrl+V into the active app.
- Optionally restores your previous clipboard content.
That’s it. The transcription is not logged to a file, not stored in a history, not sent anywhere. If you need it after the paste, copy it out of the target app.
Models
Ditto downloads Whisper models from HuggingFace’s CDN on demand:
- The download happens only when you click Download in Settings → Models or in the welcome window.
- The destination is %APPDATA%\ditto\models\.
- Once downloaded, models are never re-fetched. Ditto doesn’t check for updates or phone home about them.
The full URL pattern is https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-<size>.bin, served by HuggingFace’s standard CDN. They may keep their own access logs; that’s between you and HuggingFace.
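To make the URL pattern concrete, here is a small sketch that builds the download URL and a destination path for a given model size. The helper names are hypothetical, the size value "base" is only an example, and storing the file under its original ggml-<size>.bin name is an assumption rather than something this page confirms.

```typescript
// Base of the URL pattern quoted above.
const MODEL_BASE_URL =
  "https://huggingface.co/ggerganov/whisper.cpp/resolve/main";

/** File name for a model size, e.g. "base" -> "ggml-base.bin". */
export function modelFileName(size: string): string {
  // Reject anything that could change the path or URL, such as slashes.
  if (!/^[A-Za-z0-9._-]+$/.test(size)) {
    throw new Error(`unexpected model size: ${size}`);
  }
  return `ggml-${size}.bin`;
}

/** Full download URL for a model size. */
export function modelUrl(size: string): string {
  return `${MODEL_BASE_URL}/${modelFileName(size)}`;
}

/** Destination under %APPDATA%\ditto\models\ (file name is an assumption). */
export function modelDestination(appData: string, size: string): string {
  return `${appData}\\ditto\\models\\${modelFileName(size)}`;
}
```

For example, modelUrl("base") yields https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin, which you can compare against the requests you see in a network monitor.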
Settings and other local data
Ditto stores everything related to your configuration in:
%APPDATA%\ditto\config.json
This is plain text JSON, managed by electron-store. You can open it in any editor to inspect or back up. The schema matches the DittoSettings type in the code.
What’s in there:
- Your shortcut, theme, mic, language preferences
- Cumulative usage counters (transcribe count, total audio seconds processed, number of auto-pastes)
- Last pill drag position, if you have “remember last position” on
What’s not:
- Any audio
- Any transcription text
- Any account or identifier — Ditto has no concept of accounts
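If you would rather check the file from a script than open it in an editor, something like the following lists its top-level keys. This is a sketch: the helper names are hypothetical, and it deliberately makes no assumption about which keys exist.

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";

/** Return the sorted top-level keys of a JSON object given as text. */
export function topLevelKeys(jsonText: string): string[] {
  const parsed: unknown = JSON.parse(jsonText);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("expected a JSON object at the top level");
  }
  return Object.keys(parsed).sort();
}

/** Read %APPDATA%\ditto\config.json and list its top-level keys. */
export function readDittoConfigKeys(
  appData: string | undefined = process.env.APPDATA,
): string[] {
  if (!appData) {
    throw new Error("APPDATA is not set (is this running on Windows?)");
  }
  const configPath = join(appData, "ditto", "config.json");
  return topLevelKeys(readFileSync(configPath, "utf8"));
}

// Usage on a machine with Ditto installed:
//   console.log(readDittoConfigKeys());
```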
Telemetry, analytics, crash reports
There are none. Ditto does not include:
- An analytics SDK
- A crash reporting service (Sentry, Bugsnag, etc.)
- An auto-update channel that pings home (auto-update is planned, but you’ll be able to opt out, and it’ll only contact GitHub Releases when checking for updates)
- Any network call other than the model download from HuggingFace
If a future release adds any of those, it will be opt-in and disclosed in the release notes.
Microphone permission
Windows treats microphone access as a sensitive permission. The first time Ditto records, Windows pops up a permission prompt. Ditto needs that to work — without it, no audio reaches the app.
You can review or revoke the permission at any time:
- Windows Settings → Privacy & security → Microphone
- Find Ditto in the list, toggle access on/off
Ditto only uses the mic during an active recording. There’s no background listening, no wake-word detection, no continuous capture. The getUserMedia stream is requested when you press the shortcut and stopped the moment you press it again.
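The request-on-press, stop-on-press lifecycle can be sketched like this. The MicToggle class is hypothetical and the stream opener is injected so the sketch runs outside a browser; in a renderer you would pass a function that calls navigator.mediaDevices.getUserMedia({ audio: true }).

```typescript
// Minimal shapes that a real MediaStream and MediaStreamTrack satisfy.
interface TrackLike {
  stop(): void;
}
interface StreamLike {
  getTracks(): TrackLike[];
}

/** Hypothetical helper: holds a mic stream only while a recording is active. */
export class MicToggle {
  private stream: StreamLike | null = null;
  private readonly openStream: () => Promise<StreamLike>;

  constructor(openStream: () => Promise<StreamLike>) {
    this.openStream = openStream;
  }

  get isRecording(): boolean {
    return this.stream !== null;
  }

  /** Returns true if recording just started, false if it just stopped. */
  async toggle(): Promise<boolean> {
    if (this.stream === null) {
      // First press: this is the only point where the mic is requested.
      this.stream = await this.openStream();
      return true;
    }
    // Second press: stopping every track releases the microphone.
    for (const track of this.stream.getTracks()) {
      track.stop();
    }
    this.stream = null;
    return false;
  }
}

// In a renderer (assumption about wiring, not Ditto's actual code):
//   const mic = new MicToggle(() => navigator.mediaDevices.getUserMedia({ audio: true }));
```

Because no stream exists between recordings, there is nothing for a background listener to read from.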
Clipboard
Ditto reads the clipboard only to snapshot it for the “Keep previous clipboard” feature, and only when both auto-paste and that option are on.
Ditto writes to the clipboard when transcription finishes — your text goes there before the simulated Ctrl+V (or as the only effect, if auto-paste is off).
That’s the entire interaction Ditto has with the clipboard. It doesn’t read the clipboard outside of recordings. It doesn’t store clipboard history.
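The read and write rules above can be expressed as one small function. This is a sketch, not Ditto's code: deliverTranscription, the option names, and the simulatePaste callback are hypothetical, and it snapshots text only. The ClipboardLike interface mirrors the readText and writeText methods of Electron's clipboard module so the real module could be passed in.

```typescript
// Shape compatible with Electron's `clipboard` module (readText / writeText).
interface ClipboardLike {
  readText(): string;
  writeText(text: string): void;
}

// Hypothetical option names for the two settings described above.
interface DeliverOptions {
  autoPaste: boolean;
  keepPreviousClipboard: boolean;
}

export async function deliverTranscription(
  text: string,
  clipboard: ClipboardLike,
  simulatePaste: () => Promise<void>, // stand-in for the simulated Ctrl+V
  options: DeliverOptions,
): Promise<void> {
  // The clipboard is read only when both options are on.
  const snapshot =
    options.autoPaste && options.keepPreviousClipboard
      ? clipboard.readText()
      : null;

  // The transcription always goes to the clipboard.
  clipboard.writeText(text);
  if (!options.autoPaste) {
    return; // writing the clipboard is the only effect
  }

  await simulatePaste();

  // Restore the earlier content once the paste has been issued.
  if (snapshot !== null) {
    clipboard.writeText(snapshot);
  }
}
```

Note that with auto-paste off the function never calls readText at all, which matches the rule that reading happens only for the “Keep previous clipboard” feature.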