The macOS runtime has two parts:
  • Python owns capture, transcription, grammar cleanup, replacements, clipboard and paste output, history, sockets, LaunchAgent lifecycle, and CLI commands.
  • Swift owns the visible menu bar app, overlay, settings, onboarding, and IPC rendering.

Hotkeys

ShortcutAction
Double-tap Right OptionStart recording and copy the result by default
Hold Right OptionRecord until release and paste the result at the cursor
Tap Right Option or SpaceStop a double-tap recording and transcribe
EscCancel recording or stop speech
Option+TRead selected text aloud
Ctrl+Shift+GProofread selected text
Ctrl+Shift+RRewrite selected text
Ctrl+Shift+POptimize selected text as an LLM prompt
Double-tap dictation follows ui.auto_paste: clipboard by default, cursor paste when enabled. Hold-to-record always pastes the result at the cursor. Text-transform shortcuts require grammar correction to be enabled with a working backend. The menu bar shows service state and exposes:
ItemPurpose
StatusCurrent state, active engine, and backend subtitle
EngineSwitch transcription engine in place
GrammarSwitch grammar backend in place
ReplacementsToggle replacements and show rule count
Retry Last / Copy LastRe-transcribe or re-copy
TranscriptionsRecent entries, click to copy
RecordingsAudio files, click to reveal in Finder
SettingsFull sidebar settings window
ServiceRestart, check for updates, and open logs

Settings panels

PanelCovers
RecordingTrigger key, double-tap window, audio cleanup, and duration limits
TranscriptionEngine picker, verified model cache state, inline download progress, cancel, and per-engine sampling or decoding parameters
GrammarBackend picker and per-backend connection and limits
VoiceText-to-speech voice, shortcut, and dictation command help
VocabularySearchable replacement editor with import and export
OutputOverlay, sounds, notifications, paste-at-cursor, and history limit
ShortcutsProofread, rewrite, prompt-engineer bindings, and cheatsheet
ActivitySessions, words, 30-day chart, top words, and top replacement triggers
AdvancedPermission request buttons, storage paths, model idle unload, service log, doctor, restart, and update
AboutVersion, credits, and replay tutorial
Settings save to ~/.whisper/config.toml. Fields that require restart warn and offer immediate restart.

Overlay states

The floating overlay shows recording duration, waveform, processing progress, copied or pasted result, errors, and speaking state. Python publishes state snapshots over IPC. Swift drops stalled IPC consumers instead of blocking the transcription pipeline.