Voice to text, the way it should work
Hold Fn to speak. Release to get clean, polished text — right where your cursor is. Mumbli is a macOS menu bar app that turns speech into text in any application. No app-specific setup. No integrations. If there’s a text cursor, Mumbli works there.Works Everywhere
Dictate into any text field — browser, email, Slack, IDE, terminal. Mumbli uses the Accessibility API to inject text at your cursor.
Hold or Hands-Free
Hold Fn to dictate while you press, or double-tap to go hands-free. Both modes are always available.
AI Polishing
Filler words removed, grammar fixed, self-corrections handled. Your voice, cleaned up — not rewritten.
Custom Vocabulary
Add proper nouns and technical terms that get mistranscribed. Benchmarked at 36% to 100% accuracy improvement.
How it works
Activate
Press and hold Fn (or double-tap for hands-free mode). A small overlay appears near your cursor.
Speak
Audio is captured at 16 kHz and sent to an AI transcription service (ElevenLabs or Groq Whisper).
Two engines, your choice
| Engine | Transcription | Polishing | Typical Latency |
|---|---|---|---|
| Standard | ElevenLabs Scribe v1 | OpenAI GPT-5.4 Nano | ~3-5s |
| Fast | Groq Whisper large-v3-turbo | Groq Llama 3.1 8B | ~0.5-1s |
Built for safety
LLMs can hallucinate, especially smaller ones. Mumbli has three layers of defense:- XML boundaries — Dictation is wrapped in tags, separating it from system instructions
- Injection-hardened prompts — The LLM is explicitly forbidden from inventing content
- RepetitionGuard — Catches sentence explosion, length explosion, and tag leakage. Falls back to raw transcription if polishing goes wrong.
Free and open source
Mumbli is MIT-licensed. Bring your own API keys for the transcription and polishing services.Download Mumbli
Get the latest DMG from GitHub Releases. macOS 13.0+ (Ventura or later).