Push-to-talk for macOS. Never lose the first word — recording starts in 7ms. 100% local, 25 languages
Talk to Claude Code, Cursor, ChatGPT and multi-agent workflows — 10x faster than typing.
⌘ → 📝 Paste in active app
⌘+1 → 🤖 Claude Code CEO Agent
⌘+2 → 💬 Slack
⌘+3 → ✅ Linear task
Everything runs on your Mac. No accounts, no servers, no cloud.
Your voice never leaves your Mac. Transcription runs entirely on the Apple Neural Engine via CoreML. No internet required.
Auto-detect and transcribe English, Spanish, French, German, Japanese, Chinese, and 19 more languages. Switch mid-sentence.
Press the key — recording starts in 7 milliseconds. Other apps take 200ms+ and lose the first word. SpeechButton captures everything from the very first syllable.
All settings in a plain config.toml file. Hand it to an AI agent — it configures everything for you. Per-hotkey channel bindings, transform pipelines, device rules. Professional-grade control without a GUI.
SpeechButton transcribes in chunks as you speak. Every time you pause briefly, the chunk is transcribed and pasted instantly. When you finish talking, almost all text is already there.
With auto-Enter enabled, a longer silence (3s) sends the full message automatically. Hold the hotkey, talk to Claude Code or ChatGPT, and pause briefly between thoughts: chunks appear in real time. Stop talking for 3 seconds and the message is sent. No keyboard needed.
# Hands-free voice activity detection
[vad]
enabled = true
chunk_silence_sec = 0.7 # pause → transcribe chunk
[global]
auto_send = true
send_delay_sec = 3.0 # 3s silence → auto-Enter
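How the two thresholds interact can be sketched in a few lines of Python. This only illustrates the behavior described here; it is not SpeechButton's implementation. The function name `classify_silence`, its return labels, and the use of `>=` at the boundaries are assumptions.

```python
# Illustrative only: mirrors the config values shown above,
# not SpeechButton's actual internals.
CHUNK_SILENCE_SEC = 0.7  # [vad] chunk_silence_sec
SEND_DELAY_SEC = 3.0     # [global] send_delay_sec


def classify_silence(silence_sec: float, auto_send: bool = True) -> str:
    """Return the furthest action a pause of `silence_sec` seconds reaches."""
    if auto_send and silence_sec >= SEND_DELAY_SEC:
        return "send"    # long silence: auto-Enter sends the full message
    if silence_sec >= CHUNK_SILENCE_SEC:
        return "chunk"   # short pause: transcribe and paste the chunk
    return "wait"        # still speaking: keep recording


for pause in (0.2, 0.9, 3.5):
    print(pause, classify_silence(pause))
```

Raising `chunk_silence_sec` would mean fewer, longer chunks; raising `send_delay_sec` leaves more time to think before the message goes out.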
SpeechButton doesn't just paste text at your cursor. It's a programmable voice pipeline — route, transform, and deliver your speech anywhere.
Different key combos → different destinations. Hold Command and press 1, 2, or 3 to route your speech to different agents, APIs, or scripts.
Process your speech before it arrives. Run it through a script, an LLM, or any command. The transformed text is what gets delivered.
/start for agent commands
Text doesn't just go to your cursor. Send it to a webhook, log it to a file, pipe it to a script, or fire it at an API, all at once.
paste: at cursor in any app
exec: pipe to any script or CLI
webhook: POST to any URL
file: append to log for history
# Command+1: voice → structured task → send to AI agent
[[hotkey]]
key = "RightCommand"
channel = "1"
name = "agent-task"
transform = "python3 ~/scripts/to_task.py" # speech → "/start-task ..." command
# Command+2: voice → clean text → paste at cursor
[[hotkey]]
key = "RightCommand"
channel = "2"
name = "clean-paste"
transform = "python3 ~/scripts/clean_filler.py" # remove filler words
# Command+3: voice → create issue in Linear/GitHub
[[hotkey]]
key = "RightCommand"
channel = "3"
name = "create-issue"
transform = "python3 ~/scripts/format_issue.py"
exec = "bash ~/scripts/create_linear_issue.sh"
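A transform script such as `clean_filler.py` could be as small as the sketch below. The page does not say how SpeechButton hands text to a transform, so this assumes the transcript arrives on stdin and the result is read from stdout. The filler list and the `clean` function are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical clean_filler.py: strip filler words from a transcript."""
import re
import sys

# Assumed filler list; extend it for your own speech habits.
FILLERS = ("um", "uh", "erm")

# Match a whole filler word plus any trailing comma/period and whitespace.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b[,.]?\s*",
    flags=re.IGNORECASE,
)


def clean(text: str) -> str:
    """Remove filler words and collapse leftover whitespace."""
    without = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", without).strip()


# Assumption: transcript in on stdin, cleaned text out on stdout.
# Skipped when run interactively with nothing piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    sys.stdout.write(clean(sys.stdin.read()))
```

You can try it outside SpeechButton with `echo "Um, so I think uh we should ship" | python3 clean_filler.py`.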
Hands-free agent workflows, voice-driven automation, and programmable text pipelines.
Voice Activity Detection sends text as you speak. Combined with auto-Enter, you can talk to AI agents (Claude Code, ChatGPT, Slack bots) without touching the keyboard at all.
All settings in a single config.toml file. AI agents can configure SpeechButton programmatically — no GUI needed. Changes apply instantly without restart.
Process text before it's pasted: run it through a script, send it to an LLM API, or transform it locally. Get cleaned-up, formatted, or translated text — all from your voice.
Hold Command, then press 1, 2, or 3 to route your speech to different destinations. Send voice to one agent, then switch to another with a different channel — perfect for multi-agent workflows.
Send transcribed text to a webhook URL for integrations, or log everything to a file for history and audit. All outputs work simultaneously — paste, file, webhook, and exec at once.
Use your iPhone as an external microphone. With keep_hot = true the mic stays always-on — no 300ms wake-up delay when you start talking. Same blazing fast response, even over wireless.
Choose output_format = "text" for plain text or "json" for structured data with timestamps, language, and confidence — ideal for programmatic pipelines.
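A script consuming the JSON format might look like the sketch below. The schema is not documented on this page, so the field names `text`, `language`, and `confidence`, the sample payload, and the `accept` helper are all assumptions for illustration. Check the real output before relying on them.

```python
import json
from typing import Optional

# Hypothetical payload: field names are assumed, not taken from SpeechButton.
SAMPLE = '{"text": "ship the release notes", "language": "en", "confidence": 0.93}'


def accept(payload: str, min_confidence: float = 0.8) -> Optional[str]:
    """Return the transcript text, or None if confidence is too low."""
    data = json.loads(payload)
    if float(data.get("confidence", 0.0)) < min_confidence:
        return None  # drop low-confidence transcriptions
    return data.get("text", "")


print(accept(SAMPLE))
```

Filtering on confidence is one reason to prefer `"json"` over `"text"` in an automated pipeline: plain text gives a script nothing to decide on.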
# Default: paste raw text at cursor
[[hotkey]]
key = "RightCommand"
name = "default"
# Command+1: clean filler words via Python before pasting
[[hotkey]]
key = "RightCommand"
channel = "1"
name = "ai-cleanup"
transform = "python3 ~/scripts/clean_filler.py"
# Command+2: translate to English via LLM API
[[hotkey]]
key = "RightCommand"
channel = "2"
name = "translate"
transform = "python3 ~/scripts/translate_en.py"
# Command+3: send to Slack via bash script
[[hotkey]]
key = "RightCommand"
channel = "3"
name = "slack-post"
exec = "bash ~/scripts/slack-post.sh"
output_format = "json" # structured output with timestamps
# Output destinations (all work simultaneously)
[output]
paste = "accessibility" # paste at cursor
file = "/tmp/speechbutton_log" # transcription history
webhook = "http://localhost:8080/transcription"
# Per-device settings (match by name)
[[device_rule]]
match = "iPhone"
keep_hot = true # always-on mic → no 300ms wake delay
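To try the webhook output locally, a minimal receiver for the `http://localhost:8080/transcription` URL in the config above could be sketched with Python's standard library. This assumes the request body is UTF-8 text; the actual body format is not specified here. The names `record`, `TranscriptionHandler`, and `serve` are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # in-memory history of transcriptions


def record(body: bytes) -> str:
    """Decode a request body (assumed UTF-8) and remember it."""
    text = body.decode("utf-8", errors="replace").strip()
    received.append(text)
    return text


class TranscriptionHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/transcription":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        record(self.rfile.read(length))
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Block and handle POSTs on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), TranscriptionHandler).serve_forever()
```

Calling `serve()` starts the listener. Binding to `127.0.0.1` keeps it reachable only from your Mac, in line with the local-only design.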
Three steps. Zero configuration.
Press and hold Right Command (or your custom hotkey). The menu bar mic starts pulsing.
Talk naturally in any supported language. Voice Activity Detection sends text as you talk.
Let go of the key. Text is instantly pasted wherever your cursor is. Done.
Start free. Upgrade when you need more.
Both plans: 100% local. Your voice never leaves your Mac. No cloud. No data collection.
Download SpeechButton and get instant speech-to-text on your Mac. Free 15 minutes/day. Pro: unlimited.
Download for macOS
Requires macOS 14 Sonoma or later · Apple Silicon (M1+)