Voice control for your coding agent.
Hold Fn, talk to Claude Code (or anything in your terminal), release — the transcript pastes at the cursor. Five STT engines including two on-device, voice commands that survive terminal apps, optional Claude cleanup, AI Actions to reshape any transcript on demand, and an encrypted voice journal on a second hotkey.
Apple Silicon · v0.4.3 · signed & notarized · built by a Claude Code daily driver
Talking to your agent shouldn't mean typing at it.
Claude Code, Cursor, aider, OpenCode, your terminal — whatever you live in. If it accepts a paste, Dictabit feeds it.
Stop transcribing your own thoughts into a textarea. Hold Fn, talk through what you want changed, release. A 200-word prompt takes 30 seconds instead of three minutes of typing.
When Claude is grinding through a multi-step task, your hands are free. Use that. Queue the next prompt, leave a steering note, draft the follow-up — without leaving the terminal.
Voice is faster than typing for fuzzy thinking. Dictate a stream-of-consciousness design note straight into your scratch file or directly into the model. Let the cleanup pass tidy the punctuation.
The two writing tasks every dev hates the most. Both are short, both are conversational, both belong in your voice — not bashed out one-finger between context switches.
Three keys. No app to focus.
The pill is click-through and transparent. It doesn't steal focus, it doesn't move your cursor, it doesn't show up in your screenshots.
Press and hold your hotkey from any app — Fn by default, or rebind to any combo. The pill at the bottom of your screen lights up. Modifier-only debouncing absorbs N-key-rollover flutter so a brief flag drop mid-sentence doesn't end your recording.
Audio is captured at the mic's native sample rate, downmixed to mono, high-passed at 80 Hz, peak-normalized, and silence-trimmed with an adaptive noise floor. The frontmost app is detected in parallel so the prompt biases toward terms common in that app type.
Release. Transcript is cleaned (optional Claude pass), Cmd+V is synthesized via core-graphics, your prior clipboard contents are restored, and a Return key fires after a configurable delay if you want it to.
One hotkey. Reshape any transcript.
Dictate freely. Then press the AI hotkey and pick what you want done — summarize, restructure, tighten, JSONify, translate. Your last transcript goes to Claude with the action's prompt, the result pastes at the cursor. Six built-in actions, add your own with custom prompts.
2-3 sentences. Cut everything else.
One topic per bullet. Keep all meaning.
Lowercase_snake_case keys. No preamble.
Professional tone. Same structure.
Remove filler. Same meaning, fewer words.
Natural English. Translation only.
Uses your Anthropic key — bring your own, no Dictabit billing in the middle. Each action is just a system prompt; add custom ones for tones, formats, or domain- specific rewrites (PR titles, commit messages, JIRA tickets, tweet drafts, code comments). Source can be your last transcript, the clipboard, or selection.
Four providers. Five engines. Bring your own keys.
No Dictabit server in the middle. Your audio goes from your Mac to the provider you chose — or stays on your Mac entirely with one of the two on-device engines.
NVIDIA's ONNX transducer, multilingual, fast on CPU. The recommended on-device engine — no API key, no quota, nothing leaves your Mac.
whisper.cpp via Metal on Apple Silicon. Five model sizes. Pre-warmed at startup so the first offline dictation isn't a 3–5 s cold start.
Default cloud option. whisper-large-v3 on Groq hardware — typical turnaround under a second. Free tier covers ~8h/day.
If you already pay for OpenAI. ~$0.006/min. Supports translate-to-English per dictation.
Nova-3, transcribe-only. $200 free credit ≈ 200 h. Sometimes wins on heavy accents, noisy rooms, or unusual technical vocab.
Hot-swappable from the Providers tab — grouped Online · Offline · LLM cleanup. The last key of the active provider is locked so you can't accidentally disconnect a live pipeline. Per-app prompt biasing ships with built-in vocab for Terminal, code editors, browsers, and chat apps — and a Dev-mode toggle adds extra programming vocabulary.
Commands that survive a terminal.
The voice features other dictation apps gloss over. These are the ones that matter when you actually live in your editor.
End any dictation with “scratch that”, “cancel that”, “undo that”, or six other triggers. The current dictation is discarded and the previously-pasted dictation is backspaced out — works in terminals, where ⌘Zdoesn't do anything.
End with “send” or “submit” and Dictabit fires Enter immediately, stripping the trigger word. Pair with an auto-send delay (Instant · 1s · 2s · 3s · 5s) and chain follow-up prompts by holding the hotkey during the countdown. Talk to your agent in turns, without ever touching Return.
Say “open paren”, “new paragraph”, “hyphen”, “new line”, “close brace” and Dictabit substitutes the actual character. Dictating prompts with code-flavored content stops being a fight with the transcriber.
Say “my email”and get your address. Dictate a teammate's name and get the right spelling every time. Snippets, user-defined corrections, and a built-in dev vocabulary all run before the text hits your cursor — so your custom variable names and weird CLI tools come out right the first time.
Five quiet passes between your voice and the cursor.
Frontmost app's bundle ID is detected the moment recording starts. Terminal gets CLI vocabulary. Editors get programming keywords. Browsers and chat apps get their own term sets. Recognition gets sharper without any config from you.
Optional Claude 4.5 Haiku pass that strips filler words, restores punctuation, and applies self-corrections heard mid-dictation ( “send it to John… actually Sarah” → “send it to Sarah”). System prompt is editable. Roughly $0.001 per dictation. Falls back to the raw transcript on any error, so a missing key never swallows a dictation.
Optional URL receives every transcript as JSON { text, timestamp_ms }. n8n, Zapier, your own agent backend. Webhook-only mode skips the cursor paste — Dictabit becomes a voice-to-API trigger for whatever pipeline you've wired up.
Want filler cleanup without the LLM cost? A regex pass handles “um”, “uh”, “like”, “you know”, “basically” and friends. Free, instant, runs whether or not the LLM pass is on.
Dictate in any language Groq supports, get English back. Per-dictation toggle from the General tab. Useful when you're more fluent in a non-English language than the audience you're writing to.
Every dictation lands in a searchable, click-to-copy history panel. Configurable cap (200 · 500 · 1000 · 2000 · Unlimited). One-click export. Lifetime stats — dictations, words, minutes saved — persist independently of the cap.
There's a second mode for thinking out loud.
A separate hotkey, a separate flow. Talk for as long as you want, get formatted Markdown back, encrypted at rest.
Press the journal hotkey, ramble. Audio splits at natural silence boundaries, transcribes in sequence, stitches back together. A single failed chunk becomes a [… inaudible …] marker so the entry never dies on one bad upload.
When you're done, Claude turns rambling speech into proper paragraphs and bullet lists. Per-notebook prompt overrides — write a tone for each notebook and Claude follows it.
Entries open in an inline editor that renders formatting as you type — bold appears bold, headings styled, lists indented. No raw asterisks visible. Toolbar for the common stuff, "Show markdown source" disclosure for power users.
Entries encrypted with a per-machine key in the macOS Keychain. Plaintext never touches disk. Titles and previews are derived on demand by decrypting just the entry you open. One-click decrypted-Markdown export per notebook is the backup escape hatch.
Different jobs.
Wispr Flow is the one for writing emails. Dictabit is the one for terminals and agents. Same gesture, very different posture.
A utility that disappears when it's working.
Hold-to-talk for the button feel, or tap once to start and tap again to stop. Hot-swappable in the UI, no restart.
A 150 ms window absorbs N-key-rollover flutter on combos like ⌃⇧, so a brief flag drop mid-sentence doesn't end your recording.
The pill positions itself on the display under your cursor. Move screens mid-dictation? It follows. Click-through and transparent — it never steals focus.
Dictabit synthesizes Cmd+V to paste, then restores whatever you had on the clipboard before. The transcript paste doesn't blow away the link you were about to share.
Parakeet and the Whisper Metal kernel pre-warm at startup; the mic device pre-opens for ~150 ms in the background. Your first hotkey press of the session isn't a cold start.
When something goes wrong the error pill surfaces the actual reason — a missing key, a rate limit, a mic that's muted — instead of a generic Error.
Lowering the history cap or clicking Clear auto-snapshots history.json to a backups dir first. The last 5 snapshots stick around with a one-click Restore — accidental deletions stop being permanent.
Slack messages. Emails. Notion pages. The PR description you've been putting off.
Dictabit is built for the AI-assisted dev workflow, but it pastes into any text field on your Mac. If you'd otherwise be typing it, you can dictate it.