The Simplest Entry Point
You don't need a remote server. You don't need a P2P mesh. You don't need multiple agents running in terminals.
You need one thing: a hotkey that turns your voice into a coding task.
"Write a function that parses a CSV file and returns an array of JSON objects where each column header becomes a key."
Release the key. SpeechButton transcribes your words, the on-device Local AI Transform structures them as a clear prompt, and send_claude_code.py runs claude --print --bare -p, a one-shot Claude Code invocation that executes the task and exits.
30 seconds later, the function exists. You spoke it into existence.
Setup: 3 Files, 5 Minutes
Three files in ~/.config/speechbutton/. That's the entire integration.
File 1: config.toml
One extra hotkey. That's it.
# ~/.config/speechbutton/config.toml
[global]
model = "parakeet-tdt-0.6b-v3-int8"
language = "auto"
auto_punctuation = true

# Default: paste at cursor
[[hotkey]]
key = "RightCommand"
name = "default"
paste = "accessibility"

# One-shot agent: speak a task, agent executes once
[[hotkey]]
key = "RightCommand"
channel = "1"
name = "claude-code"
transform = "prompts/claude_code_task.md"
exec = "integrations/send_claude_code.py"
File 2: Transform prompt — spoken task → clear prompt
Your rambling speech becomes a clean, structured prompt the agent can act on immediately. The Local AI Transform runs a built-in on-device model (Gemma 4) on your Mac — no API key, no cloud, no cost.
You say:
"um write a function that like parses CSV and returns JSON objects where each column header becomes a key and handle the case where there are empty values"
The transform prompt (prompts/claude_code_task.md):
Clean up this spoken coding task. Fix grammar, remove filler words, make the instruction precise and actionable for a coding agent. Keep technical terms exact. Output ONLY the cleaned task.
Agent receives (on-device, instant):
Write a function that parses a CSV file and returns an array of JSON objects. Each column header becomes a key. Handle empty values gracefully (use null for empty cells).
Your stream-of-consciousness became a precise instruction instantly, fully on-device. No API key needed for the transform step — only Claude Code itself requires your Anthropic account.
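To make the cleaned prompt concrete, here is one function that would satisfy it. This is purely an illustration written for this article, not captured agent output; what Claude Code actually produces will depend on your codebase and language. The name csv_to_records is hypothetical.

```python
import csv
import io
import json


def csv_to_records(text):
    """Parse CSV text into a list of dicts keyed by column header.

    Empty cells become None, which json.dumps serializes as null.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [
        {header: (value if value != "" else None) for header, value in row.items()}
        for row in reader
    ]


def csv_to_json(text):
    """Return the parsed records as a JSON array string."""
    return json.dumps(csv_to_records(text))
```

Note how each clause of the cleaned prompt maps to something checkable: headers become keys, and empty cells become null.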
File 3: One-shot agent runner
Runs Claude Code once with the given task, then exits.
#!/usr/bin/env python3
"""Send text to Claude Code CLI as a one-shot prompt."""
import os
import shutil
import subprocess
import sys


def find_claude_binary():
    """Locate the claude CLI on PATH, then in common install locations."""
    found = shutil.which("claude")
    if found:
        return found
    for path in [os.path.expanduser("~/.local/bin/claude"), "/opt/homebrew/bin/claude"]:
        if os.path.exists(path):
            return path
    print("claude CLI not found", file=sys.stderr)
    sys.exit(1)


def main():
    # The transformed task arrives on stdin.
    task = sys.stdin.read().strip()
    if not task:
        sys.exit(0)
    claude = find_claude_binary()
    try:
        result = subprocess.run(
            [claude, "--print", "--bare", "-p", task],
            capture_output=True, text=True, timeout=120,
        )
    except subprocess.TimeoutExpired:
        # Without this, a long-running task would end in a raw traceback.
        print("Error: claude timed out after 120s", file=sys.stderr)
        sys.exit(1)
    if result.returncode == 0:
        response = result.stdout.strip()
        # Report only the first line of the reply, capped at 100 characters.
        print(f"Claude: {response.splitlines()[0][:100]}" if response else "Done")
    else:
        print(f"Error: {result.stderr[:100]}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
Done. Hold RightCommand, speak, release. Agent runs, writes code, exits.
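If the hotkey does nothing, it is worth testing the runner on its own before suspecting the voice side. The sketch below assumes what the runner script implies: that the text arrives on the script's stdin and the result is reported on stdout or stderr. The helper name run_exec_script is hypothetical.

```python
import subprocess
import sys


def run_exec_script(script_path, text, timeout=180):
    """Pipe text to the script's stdin; return (exit code, stdout, stderr)."""
    result = subprocess.run(
        [sys.executable, script_path],  # run the script with the current Python
        input=text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout.strip(), result.stderr.strip()
```

For example, call run_exec_script with the expanded path to ~/.config/speechbutton/integrations/send_claude_code.py and a harmless task such as "List the files in this directory." A non-zero exit code plus the stderr text usually tells you whether the claude binary was found.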
How It Works
RightCommand (hold) → speak → release
│
▼
SpeechButton STT (7ms, offline)
│
▼
Local AI Transform: Gemma 4 on-device
prompts/claude_code_task.md → clean prompt
│
▼
integrations/send_claude_code.py
└─ claude --print --bare -p "Write a function that..."
│
▼
Agent reads your codebase, writes the code, done.
The Local AI Transform runs on your Mac using Gemma 4 — completely offline, no API key, no cost. The --bare flag skips hooks and MCP servers for fast startup. The -p flag runs non-interactively — prompt in, result out. The agent gets the tools it needs, does the work, and exits.
No persistent process. No terminal to manage. No session to resume. One shot.
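One detail of the last step is easy to miss: the runner hands the task to claude as a single argument in a list, so no shell is involved and spoken text containing quotes or apostrophes needs no escaping. A small sketch of that, using only the flags from the runner script; one_shot_command and as_shell_line are illustrative names.

```python
import shlex


def one_shot_command(task, claude="claude"):
    """Build the argv list for a one-shot run, using the flags from the runner."""
    return [claude, "--print", "--bare", "-p", task]


def as_shell_line(argv):
    """Render argv as a copy-pasteable shell command with safe quoting."""
    return shlex.join(argv)
```

Printing as_shell_line(one_shot_command(task)) gives you the exact command to paste into a terminal when you want to reproduce a run by hand.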
What You Can Do With It
Generate code
"Add a rate limiter middleware to the Express app. Use a sliding window algorithm with Redis. 100 requests per minute per IP."
The agent reads your Express app, finds the middleware chain, writes the rate limiter, adds the Redis connection, and exits.
Fix bugs
"The user profile page crashes when the bio field is null. Add a null check in the profile component and show a placeholder instead."
The agent greps for the profile component, finds the null dereference, adds the check, done.
Write tests
"Write unit tests for the CSV parser function in utils.ts. Cover: normal CSV, empty file, missing headers, quoted values with commas, and unicode characters."
The agent reads your parser, writes five tests, runs them to verify they pass.
Refactor
"Extract the database connection logic from server.ts into a separate db module. Keep the same interface, just move it."
The agent reads server.ts, creates the new module, updates imports, verifies nothing breaks.
Quick scripts
"Write a bash script that finds all TODO comments in the codebase, counts them per file, and outputs a sorted table."
The agent writes the script, makes it executable, even runs it to show you the output.
Why One-Shot > Typing the Prompt
You could open a terminal and type claude -p "Write a function...". But typing has costs:
- Context switching: you leave your editor to open a terminal
- Typing speed: a complex prompt takes 30–60 seconds to type
- Raw input: what you type goes straight to the agent, unstructured
Speaking the task instead:
- Stay in flow: hold a key, speak, release, and never leave your editor
- 10 seconds: speaking a complex prompt takes about 10 seconds, 3× to 6× faster than typing it
- Transform: rambling speech becomes a clean, precise prompt automatically
The one-shot pattern makes Claude Code feel like a voice command. You don't think about terminals, prompts, or flags. You hold a key, describe what you want, and it happens.
And the 7ms capture matters here: you start talking the instant you hold the key. No pause, no "listening..." indicator, no lost first word. The first syllable lands.
Advanced: Per-Project Agents
Different projects need different tool permissions. Add multiple channels with different hotkeys and permission sets:
# Frontend project: channel 1, no Bash access
[[hotkey]]
key = "RightCommand"
channel = "1"
name = "claude-frontend"
transform = "prompts/claude_code_task.md"
exec = "integrations/send_claude_safe.py"

# Backend project: channel 2, full access
[[hotkey]]
key = "RightCommand"
channel = "2"
name = "claude-backend"
transform = "prompts/claude_code_task.md"
exec = "integrations/send_claude_full.py"
# send_claude_safe.py: restricted tools (frontend)
import shutil, subprocess, sys

task = sys.stdin.read().strip()
claude = shutil.which("claude")
if task and claude:  # skip empty input or a missing CLI instead of crashing
    subprocess.run([claude, "--print", "--bare", "-p", task,
                    "--allowedTools", "Read,Edit,Write,Grep,Glob"])

# send_claude_full.py: full access (backend)
import shutil, subprocess, sys

task = sys.stdin.read().strip()
claude = shutil.which("claude")
if task and claude:  # skip empty input or a missing CLI instead of crashing
    subprocess.run([claude, "--print", "--bare", "-p", task,
                    "--allowedTools", "Read,Edit,Write,Bash,Grep,Glob"])
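The two scripts differ only in their tool list, so the shared logic can live in one place. The sketch below is one way to do that, under two assumptions: the profile names "safe" and "full" are made up for illustration, and the tool lists are simply the ones from the two scripts above.

```python
import shutil
import subprocess
import sys

# Hypothetical profile names; the tool lists are copied from the two scripts.
PROFILES = {
    "safe": "Read,Edit,Write,Grep,Glob",
    "full": "Read,Edit,Write,Bash,Grep,Glob",
}


def build_command(claude, task, profile):
    """Return the claude argv for a task under the named permission profile."""
    return [claude, "--print", "--bare", "-p", task,
            "--allowedTools", PROFILES[profile]]


def run_profile(profile):
    """Read a task from stdin and run it once under the given profile."""
    task = sys.stdin.read().strip()
    if not task:
        return 0
    claude = shutil.which("claude")
    if claude is None:
        print("claude CLI not found", file=sys.stderr)
        return 1
    return subprocess.run(build_command(claude, task, profile)).returncode
```

Whether the exec setting accepts extra arguments is not stated here, so the conservative layout is to keep send_claude_safe.py and send_claude_full.py as thin wrappers that import this module and call sys.exit(run_profile("safe")) or sys.exit(run_profile("full")).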
From One-Shot to Multi-Agent
The one-shot channel is the gateway. Once you're comfortable speaking tasks:
One-shot agent (today)
Hold RightCommand. Speak a task. Agent executes and exits.
Persistent agent (next step)
Add a second hotkey for a persistent Claude Code agent that handles bigger tasks without exiting.
Slack notifications
A third hotkey that routes to a Slack channel when agents finish tasks.
Linear issue creation
A fourth hotkey that creates a Linear issue when you describe a bug by voice.
SpeechButton grows with you. Start with one hotkey, one agent. Scale to five hotkeys, five destinations, five transform pipelines. The config.toml grows one channel at a time.
Get Started
Prerequisites
- macOS 15+ (Sequoia), Apple Silicon
- Claude Code CLI installed (npm install -g @anthropic-ai/claude-code)
- Signed in to Claude Code with your Anthropic account
Quick Start
1. Download SpeechButton (free 15 minutes/day, no account needed)
2. Install Claude Code: npm install -g @anthropic-ai/claude-code
3. Create the folders integrations/ and prompts/ inside ~/.config/speechbutton/
4. Copy the 3 files from this article: config.toml, prompts/claude_code_task.md, and integrations/send_claude_code.py
5. Hold RightCommand, describe a task, release. Your first voice-to-code in under 5 minutes. No servers, no setup, no complexity.
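The folder and file steps can also be scripted. This is a minimal sketch that creates the two folders and reports which of the three files from this article are still missing; the function names are illustrative, and it only creates directories, never the files themselves.

```python
from pathlib import Path


def create_layout(base):
    """Create integrations/ and prompts/ under base; return the three expected file paths."""
    base = Path(base).expanduser()
    for sub in ("integrations", "prompts"):
        (base / sub).mkdir(parents=True, exist_ok=True)
    return [
        base / "config.toml",
        base / "prompts" / "claude_code_task.md",
        base / "integrations" / "send_claude_code.py",
    ]


def missing_files(base):
    """List the expected files that do not exist yet under base."""
    return [path for path in create_layout(base) if not path.is_file()]
```

Calling missing_files("~/.config/speechbutton") after copying the files should return an empty list.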
Pro ($7.99/mo) removes the daily limit.