Voxtype vs Vocalinux
Two Whisper-based tools with similar goals. Both work on Wayland, both inject text at the cursor, and both give audio feedback. The key difference? How they run.
At a Glance
| Aspect | Voxtype | Vocalinux |
|---|---|---|
| Engine | Whisper (whisper.cpp) | Whisper |
| Language | Rust | Python |
| Architecture | Systemd daemon | Foreground process |
| Text Output | wtype (native Wayland) | ydotool |
| CJK/Unicode Output | Yes | No (ydotool limitation) |
| Recording Feedback | Audio + Notifications | Audio + Visual |
| GPU Acceleration | Vulkan, CUDA, Metal, ROCm | No |
| Text Processing | Word replacements, spoken punctuation | No |
| Voice Commands | No | Yes |
The Big Difference: Daemon vs Foreground
Voxtype: Background Daemon
Voxtype runs as a systemd user service. It starts automatically at login, runs invisibly in the background, and is always ready. You never need to think about starting it.
systemctl --user enable --now voxtype
# That's it. It is running now and starts again at every login.
Vocalinux: Foreground Process
Vocalinux runs in the foreground. You must start it manually each session and keep it running. Close the terminal or stop the process and you lose dictation.
source venv/bin/activate
vocalinux
# Must stay running in a terminal
You can create your own systemd service or autostart script, but neither is provided out of the box.
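As a rough illustration of what such a do-it-yourself service might look like, the sketch below writes a systemd user unit for Vocalinux. It is not something Vocalinux ships. The clone location (~/vocalinux), the venv/bin/vocalinux executable path, and the choice of graphical-session.target are all assumptions that may need adjusting for your setup.

```shell
#!/bin/sh
# Sketch only: generate a user-written systemd unit for Vocalinux.
# ASSUMPTIONS (not from the Vocalinux docs): the repo lives in ~/vocalinux
# and install.sh created an executable at venv/bin/vocalinux inside it.
VOCALINUX_DIR="${VOCALINUX_DIR:-$HOME/vocalinux}"
UNIT_DIR="${UNIT_DIR:-$HOME/.config/systemd/user}"

write_vocalinux_unit() {
    mkdir -p "$UNIT_DIR"
    # Unquoted heredoc delimiter so $VOCALINUX_DIR expands to an absolute path,
    # which ExecStart requires.
    cat > "$UNIT_DIR/vocalinux.service" <<EOF
[Unit]
Description=Vocalinux dictation (user-written unit, not shipped upstream)
# Assumes your desktop activates graphical-session.target at login.
After=graphical-session.target
PartOf=graphical-session.target

[Service]
ExecStart=$VOCALINUX_DIR/venv/bin/vocalinux
Restart=on-failure

[Install]
WantedBy=graphical-session.target
EOF
}

write_vocalinux_unit
echo "Wrote $UNIT_DIR/vocalinux.service"
```

After the file is written, `systemctl --user daemon-reload` followed by `systemctl --user enable --now vocalinux` would register and start it, in the same way as the Voxtype command shown earlier.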
Voice Commands
Vocalinux: Built-in Commands
Vocalinux includes voice commands for:
- Punctuation: "period", "comma", "question mark"
- Formatting: "new line", "new paragraph"
- Editing: "backspace", "delete", "undo", "redo"
Say "period" and you get a period. No keyboard needed.
Voxtype: Pure Transcription
Voxtype has no voice commands. It transcribes what you say and leaves punctuation to Whisper, which often adds appropriate punctuation automatically based on speech patterns. If you say "Hello comma how are you question mark", the words "comma" and "question mark" are transcribed literally unless the spoken punctuation text processing listed in the table above is set up to convert them.
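To make the difference concrete, converting spoken punctuation amounts to a text substitution applied to the transcript. The sketch below is a deliberately naive version using sed. It is not how either tool implements the feature, and the function name spoken_punctuation is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: turn a few spoken punctuation words into symbols.
# Each word must be preceded by a space and followed by a space or end of
# line, so words that merely start with "comma" or "period" are not touched.
spoken_punctuation() {
    sed -E \
        -e 's/ comma( |$)/,\1/g' \
        -e 's/ question mark( |$)/?\1/g' \
        -e 's/ period( |$)/.\1/g'
}

# A literal transcript in, a punctuated sentence out.
printf '%s\n' "Hello comma how are you question mark" | spoken_punctuation
# prints: Hello, how are you?
```

A purely literal transcription skips this step and relies on Whisper to punctuate from speech patterns alone.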
Installation
Voxtype
# Package install (Debian/Ubuntu)
curl -LO https://github.com/peteonrails/voxtype/releases/download/v0.2.1/voxtype_0.2.1-1_amd64.deb
sudo dpkg -i voxtype_0.2.1-1_amd64.deb
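voxtype setup model      # choose and fetch a Whisper model
voxtype setup systemd    # install the systemd user service
systemctl --user enable --now voxtype    # start now and at every login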
Time to first dictation: ~5 minutes
Vocalinux
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux
./install.sh
source venv/bin/activate
vocalinux
Time to first dictation: ~10-15 minutes
Similar Strengths
Both tools share these capabilities:
- Whisper engine - Excellent accuracy
- 100% offline - No cloud, no subscriptions
- Wayland support - Works on modern desktops
- Cursor injection - Text is typed wherever your cursor is
- Audio feedback - Sounds when recording starts/stops
Which to Choose?
Choose Voxtype if: You want a daemon that starts automatically, prefer packaged installation, and don't need voice commands for punctuation.
Choose Vocalinux if: You want built-in voice commands for editing, prefer Python/YAML configuration, and don't mind manual startup.