The Friction of Switching to Send a Message
You're in the middle of a debugging session. Your partner messages asking if you'll be late for dinner. You need to reply — quick, human, no big deal.
So you switch to Telegram. The app takes a second to focus. You find the chat. You start typing. You mistype because you're still half-thinking about the bug. You fix it. You send. You switch back to the terminal.
By the time you're done, you've lost the mental model you were holding. The context switch cost you more than the 15 seconds it took.
What if you could just speak?
"Hey, I'll be 10 minutes late, stuck in traffic."
Release. SpeechButton transcribes your words, passes them through a local AI to clean up the phrasing, and sends the message via Telegram Bot API. Your partner sees it in seconds.
Three seconds. No app switch. No typing. You're still looking at the same screen.
How It Works
Your voice ──▶ SpeechButton STT ──▶ Local AI Transform ──▶ Telegram Bot API
(7ms) (Apple Neural (Gemma 4, local — (delivers message
Engine, offline) cleans up speech, to chat)
formats for chat)
Three components:
- SpeechButton captures and transcribes your voice locally on Apple Neural Engine
- Local AI (Gemma 4) reads your prompt file and cleans up raw transcription into a natural, well-punctuated message — entirely on your Mac
- A Python script in your
integrations/folder sends the text to the Telegram Bot API
The result: a clean Telegram message from spoken words, in under 4 seconds total. Everything is local except the final Telegram API call — no Anthropic API key needed.
Setup
Five steps. Under five minutes.
Step 1: Create a Telegram bot
- Open Telegram and message
@BotFather - Send
/newbotand follow the prompts to name your bot - Copy the bot token (looks like
123456789:ABCdefGHIjkl...)
Step 2: Get your chat_id
Send any message to your bot first, then run:
curl https://api.telegram.org/bot<TOKEN>/getUpdates | jq '.result[0].message.chat.id'
The returned number is your chat_id. For a group chat, send a message in the group first — the same command returns the group's chat_id.
Step 3: Create the prompt file
This prompt tells the local AI how to clean up raw speech for a chat message.
You are a message formatter for Telegram chat. Clean up raw speech transcription into a natural, well-punctuated chat message. Fix filler words, false starts, and run-on sentences. Keep the tone conversational. Preserve the original meaning exactly. Output ONLY the cleaned message text — no quotes, no explanation.
Edit this file any time to change how your messages are cleaned up — no recompilation needed. You can make it more formal, add emoji, or keep it minimal.
Step 4: SpeechButton config.toml
Add a hotkey for Telegram. RightCommand on channel 6 sends a message; your other channels remain unchanged.
# ~/.config/speechbutton/config.toml [global] model = "parakeet-tdt-0.6b-v3-int8" language = "en" auto_punctuation = true [audio] vad_enabled = true vad_silence_threshold = 1.0 # Telegram — send message from voice [[hotkey]] key = "RightCommand" channel = "6" name = "telegram" transform = "prompts/telegram_message.md" exec = "TELEGRAM_BOT_TOKEN=123:ABC TELEGRAM_CHAT_ID=456 integrations/send_telegram.py"
Step 5: Integration script — send via Telegram Bot API
Reads the cleaned message from stdin and posts it to the Telegram Bot API. No external dependencies — uses only Python's standard library.
#!/usr/bin/env python3 """Send a message to Telegram via Bot API.""" import json, os, sys, urllib.request def main(): text = sys.stdin.read().strip() if not text: sys.exit(0) token = os.environ.get("TELEGRAM_BOT_TOKEN") chat_id = os.environ.get("TELEGRAM_CHAT_ID") if not token or not chat_id: print("TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID required", file=sys.stderr) sys.exit(1) req = urllib.request.Request( f"https://api.telegram.org/bot{token}/sendMessage", json.dumps({"chat_id": chat_id, "text": text, "parse_mode": "Markdown"}).encode(), {"Content-Type": "application/json"}) result = json.loads(urllib.request.urlopen(req, timeout=10).read()) if result.get("ok"): print(f"Telegram: {text[:60]}...") else: print(f"Error: {result}", file=sys.stderr) sys.exit(1) if __name__ == "__main__": main()
chmod +x ~/.config/speechbutton/integrations/send_telegram.py
Done. Hold RightCommand, speak your message, release. It lands in Telegram in seconds.
Local AI or Claude API?
SpeechButton supports two ways to clean up your speech before sending to Telegram. Option A is recommended — it's free, offline, and private.
Uses Gemma 4 running locally on your Mac. No API key required. No data leaves your machine except the final Telegram message.
transform = "prompts/telegram_message.md"
Uses Claude API for potentially more nuanced message polish. Requires an ANTHROPIC_API_KEY and incurs per-request costs.
transform = "transforms/transform_claude.py prompts/telegram_message.md"
The integration script is identical for both options. Only the transform line in config.toml changes. If you choose Option B, set ANTHROPIC_API_KEY in your environment and swap the transform line.
Real Workflows
Quick message to partner or team
You're in the zone and don't want to stop. But someone needs to know something.
"Hey, I'll be 10 minutes late, stuck in traffic."
Sent. You never left your current window.
Project update to a group
Async teams on Telegram need regular status updates. Speak it instead of typing it.
"Sprint review done. Shipped 3 features, one bug left on the auth module. Should be resolved by EOD."
The local AI cleans up the phrasing and adds proper punctuation. Your team gets a clear, well-formatted update.
Note to self
Saved Messages in Telegram is a great inbox. Send voice reminders to yourself without breaking concentration.
"Remember to buy groceries and call the dentist tomorrow."
Set TELEGRAM_CHAT_ID to your own Telegram user ID to send to Saved Messages. Instant voice-to-reminder.
Privacy
Here's exactly what stays on your Mac and what goes to the cloud:
| Component | Where it runs | Data sent externally |
|---|---|---|
| Voice capture | Your Mac | ✓ Nothing |
| Speech-to-text | Apple Neural Engine | ✓ Nothing |
| AI transform | Your Mac (Gemma 4) | ✓ Nothing |
| Send to Telegram | Telegram Bot API | Message text only |
With the default local AI transform, everything stays on your Mac except the final Telegram Bot API call: voice → local STT (Apple Neural Engine) → local AI transform (Gemma 4, on your Mac) → Telegram API. No audio leaves your machine. No transcription leaves your machine. Only the cleaned message text reaches Telegram's servers — which is where it needs to go anyway. No Anthropic API key required. If you prefer higher-quality message polish at the cost of privacy, you can optionally switch to the Claude API transform (Option B) — in that case, transcribed text is sent to Anthropic's servers for processing.
Prerequisites
- ✓ macOS 15+ (Sequoia) — required for local STT
- ✓ Apple Silicon (M1 or later) — required for Apple Neural Engine
- ✓ Telegram account — to receive messages
- ✓ Bot token from @BotFather — free, takes 60 seconds to create
- ✓ Python 3 — pre-installed on macOS, no packages needed
Get Started
- 1 Download SpeechButton — free 15 minutes/day, no account needed
- 2 Create a Telegram bot — message @BotFather, send /newbot, copy the token
- 3 Get your chat_id — send a message to your bot, then run the curl command above
-
4
Copy the config, prompt file, and integration script from this article into
~/.config/speechbutton/ - 5 Hold RightCommand, speak your message, release. It arrives in Telegram in under 4 seconds.
Start sending Telegram messages by voice today
Free 15 min/day · No account needed · macOS 15+ · Apple Silicon
Download for macOS — FreePro ($7.99/mo) removes the daily limit. Requires macOS 15+ and Apple Silicon.