An honest comparison to help you find the right tool for your setup. We want you to succeed with voice input, even if Voxtype isn't the right fit.
| Tool | Engine | Offline | GPU | Wayland | CJK Output | Any Desktop | Setup |
|---|---|---|---|---|---|---|---|
| Voxtype | Whisper | Yes | Vulkan/CUDA | Native | Yes (wtype) | Yes | Easy |
| Vocalinux | Whisper | Yes | No | Yes | No (ydotool) | Yes | Medium |
| Nerd-dictation | VOSK | Yes | No | Via ydotool | No (ydotool) | Yes | Medium |
| Speech Note | Multi | Yes | CUDA only | Yes | Yes (clipboard) | GUI app | Easy |
| Blurt | Whisper | Yes | No | Yes | Yes (clipboard) | GNOME only | Medium |
| WhisperWriter | Whisper | Yes | CUDA only | Partial | Varies | Yes | Medium |
| Numen Voice | VOSK | Yes | No | Yes | Varies | Yes | Hard |
| ibus-speech-to-text | VOSK | Yes | No | Varies | Yes (IBus) | IBus req. | Hard |
| hyprwhspr | Whisper | Yes | CUDA | Native | Yes (wl-copy) | Arch/Hyprland | Easy |
| waystt | Whisper | Optional | No | Native | Varies | Yes | Medium |
| VoxInput | LocalAI | Via LocalAI | Via LocalAI | Yes | Varies | Yes | Hard |
| VOXD | Whisper | Yes | No | Yes | No (ydotool) | Yes | Medium |
Need to dictate in CJK languages?
Voxtype on Wayland - one of the few tools that outputs CJK characters correctly. It uses wtype for native text injection; most alternatives rely on ydotool, which cannot type non-ASCII characters. (Note: CJK output requires Wayland; on X11, Voxtype falls back to ydotool.)
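The rule above (wtype on Wayland, ydotool fallback on X11, and ydotool being limited to ASCII) can be sketched as a small decision function. This is only an illustration of the logic as described here, not Voxtype's actual code: the function name `pick_typing_backend` and its return shape are hypothetical, and the session type is assumed to come from the standard `XDG_SESSION_TYPE` environment variable.

```python
import os
from typing import Optional, Tuple


def pick_typing_backend(text: str, session_type: Optional[str] = None) -> Tuple[str, bool]:
    """Return (backend, reliable) for typing `text` at the cursor.

    Hypothetical helper: mirrors the rule described in the text, where
    wtype handles any characters on Wayland and ydotool is only
    dependable for ASCII text.
    """
    if session_type is None:
        # XDG_SESSION_TYPE is normally "wayland" or "x11" on Linux desktops.
        session_type = os.environ.get("XDG_SESSION_TYPE", "")

    if session_type.lower() == "wayland":
        return ("wtype", True)

    # Assumed fallback path: ydotool cannot type non-ASCII characters,
    # so CJK text is flagged as unreliable here.
    return ("ydotool", text.isascii())


if __name__ == "__main__":
    for sample in ("hello world", "こんにちは"):
        print(sample, pick_typing_backend(sample, "x11"))
```

With this sketch, Japanese text on an X11 session is reported as unreliable, which matches the note that CJK output requires Wayland.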
Want native compositor keybindings?
Voxtype - use Hyprland's bind/bindr or Sway's bindsym --release for push-to-talk. No input group membership required. Bind any key combo, such as Super+V.
Using GNOME, KDE, or other desktops?
Voxtype - works on Wayland and X11 with kernel-level hotkey detection. Optimized for Wayland.
Running GNOME Shell?
Blurt - native GNOME extension. Note: clipboard-only (requires paste).
KDE has no built-in speech-to-text (STT).
Voxtype - works perfectly on KDE Wayland with full features.
Need to transcribe recordings?
Speech Note - GUI app with multiple engines and file import.
Need hands-free computer control?
Numen Voice - designed for full voice control, not just dictation.
Want to customize everything?
Nerd-dictation - single Python file with powerful hooks.
Use Windows and Linux?
WhisperWriter - Python app that works on both platforms.
Want status bar integration?
Voxtype - built-in Waybar module shows recording state. Runs as systemd service.
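For readers unfamiliar with how a status bar shows recording state, here is a minimal sketch of the JSON payload a Waybar custom module configured with `"return-type": "json"` can consume (`text`, `class`, and `tooltip` are standard Waybar fields). It is not Voxtype's built-in module: the function name `waybar_payload` and the state names `idle`, `recording`, and `transcribing` are assumptions made for illustration.

```python
import json

# Hypothetical state names; the real module may use different ones.
_LABELS = {
    "idle": "Idle",
    "recording": "Recording",
    "transcribing": "Transcribing",
}


def waybar_payload(state: str) -> str:
    """Build one line of JSON for a Waybar custom module.

    Waybar reads `text` for display, `class` for CSS styling,
    and `tooltip` for hover text.
    """
    css_class = state if state in _LABELS else "unknown"
    label = _LABELS.get(state, "Unknown")
    return json.dumps(
        {
            "text": label,
            "class": css_class,
            "tooltip": f"Voice input: {label.lower()}",
        }
    )


if __name__ == "__main__":
    # A status script would print one such line each time the state changes.
    print(waybar_payload("recording"))
```

Using the `class` field lets the bar's stylesheet change color while recording without the script knowing anything about the theme.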
The two main offline speech recognition engines
Whisper - maximum accuracy
Used by: Voxtype, Blurt, WhisperWriter, Speech Note
VOSK - lightweight and fast
Used by: Nerd-dictation, Numen Voice, ibus-speech-to-text, Speech Note
Deep dives into how Voxtype compares with each alternative
Whisper accuracy vs VOSK hackability. Daemon vs manual activation.
Both use Whisper. Desktop-agnostic vs GNOME-native.
Push-to-talk daemon vs GUI transcription suite.
Dictation vs full hands-free computer control.
Linux-native vs cross-platform Python app.
Daemon vs foreground. Both have audio feedback and cursor injection.
Both Whisper-based. Universal vs Arch/Hyprland-focused.
Both Rust. Daemon vs signal-driven. Offline vs cloud default.
Embedded Whisper vs LocalAI API. Simple vs flexible.
CLI daemon vs multi-UI app. Both use whisper.cpp offline.
Every tool compared here supports fully offline operation. In offline mode, your voice never leaves your computer. No cloud accounts, no subscriptions, no data collection. This is how speech recognition should work.
Hold a key. Speak. Release. Your words appear at the cursor.