First transcription

A walkthrough of dictating your first sentence, what each pill state means, and how to cancel or recover if something goes wrong.

Before you start

Two things should be in place from the previous steps:

Dictating

  1. Click into a text field. Anywhere a cursor blinks works: a Slack message, a code editor, an email draft, a search bar. Ditto pastes wherever your cursor is when transcription finishes.
  2. Press Ctrl+Shift+Space. The pill switches to its recording state. On Matte and Onyx themes it shows voice-driven bars; on Aurora, Sunset, and Mint it shows a wave animation at the bottom that reacts to your voice.
  3. Talk normally. Speak at your usual pace. Pauses are fine; Whisper handles them. Moderate background noise is OK, especially with the noise filter on.
  4. Press Ctrl+Shift+Space again. The pill collapses to a small dots loader while Ditto runs the audio through the model.
  5. Your text is pasted. Within a second or two (longer for big models or long recordings), the transcription appears in the text field where your cursor was.

What each pill state looks like

| State | What you see | Meaning |
| --- | --- | --- |
| Idle | Wide pill that reads “Ditto” | Ready, waiting for the shortcut |
| Recording | Voice-reactive bars or wave | Capturing audio |
| Transcribing | Small pill with a row of animated dots | Running Whisper, generating text |
| Copied | Short pill with a check icon and “Copied!” | Text is in your clipboard but auto-paste is off |

The “Copied” state only appears if you turned off auto-paste in Settings → General. By default Ditto pastes the text for you.
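
If it helps to think of the pill as a small state machine, the sketch below models the four states from the table and the transitions described in the dictation steps. It is an illustrative model only, not Ditto's actual code: the names `PillState` and `next_state`, the event strings, and the `auto_paste` flag are all hypothetical. It also assumes that canceling returns the pill to idle and that the pill returns to idle after an auto-paste, neither of which this page states explicitly.

```python
from enum import Enum


class PillState(Enum):
    IDLE = "idle"                  # wide pill that reads "Ditto"
    RECORDING = "recording"        # voice-reactive bars or wave
    TRANSCRIBING = "transcribing"  # small pill with animated dots
    COPIED = "copied"              # check icon and "Copied!"


def next_state(state: PillState, event: str, auto_paste: bool = True) -> PillState:
    """Return the pill state after an event (hypothetical model, not Ditto's API).

    Events: "shortcut" (Ctrl+Shift+Space), "cancel", "done" (transcription finished).
    """
    if state is PillState.IDLE and event == "shortcut":
        return PillState.RECORDING
    if state is PillState.RECORDING and event == "shortcut":
        return PillState.TRANSCRIBING
    if state is PillState.RECORDING and event == "cancel":
        # Assumption: a canceled recording drops back to idle.
        return PillState.IDLE
    if state is PillState.TRANSCRIBING and event == "done":
        # Assumption: with auto-paste on, the pill goes straight back to idle;
        # with auto-paste off, it shows "Copied!" instead.
        return PillState.IDLE if auto_paste else PillState.COPIED
    # Any other event leaves the state unchanged.
    return state


# Walk through one dictation with auto-paste turned off.
state = PillState.IDLE
for event in ("shortcut", "shortcut", "done"):
    state = next_state(state, event, auto_paste=False)
print(state.name)
```

Pressing the same shortcut twice is what moves the pill from idle to recording and then to transcribing; the auto-paste setting only matters at the final step.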

Canceling a recording

If you change your mind mid-sentence:

The cancel shortcut (Esc by default) is configurable in Settings → Shortcut, in case it clashes with something else for you.

If something goes wrong

A few common situations:

For broader issues, jump to Troubleshooting.

A few tips for better results