Voxtype vs Numen Voice
Dictation vs hands-free computing. Two different philosophies for voice input on Linux.
Fundamentally Different Goals
Voxtype: "I want to type by speaking"
Hold a key, speak naturally, release. Your words appear as text. That's it.
Numen Voice: "I want to control my computer with my voice"
Numen is for users who cannot or choose not to use a keyboard. It provides:
- Text input via syllables ("hoof each yank" → "hey")
- Mouse control
- Window management
- Application shortcuts
- Full hands-free operation
Who Needs What
| If you... | Choose... |
|---|---|
| Can use a keyboard comfortably | Voxtype |
| Want occasional dictation | Voxtype |
| Have RSI or mobility issues | Numen Voice |
| Cannot use keyboard/mouse | Numen Voice |
| Want voice commands beyond text | Numen Voice |
Recognition Approach
Voxtype (Whisper)
Transcribes natural speech accurately.
"The quick brown fox jumps over the lazy dog" → exactly that
Handles accents, technical terms, and proper nouns. Supports 99+ languages.
Numen (VOSK)
Optimized for command recognition. Text is entered as a sequence of short spoken syllables, each standing for one letter, rather than as natural sentences.
"hoof each yank" → recognized reliably as "hey"
Sacrifices natural language for command reliability.
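The syllable-to-letter idea can be sketched in a few lines of Python. This is only an illustration: the table holds just the three pairs quoted in this comparison, and the names `LETTER_WORDS` and `decode_spoken_letters` are hypothetical, not part of Numen.

```python
# Hypothetical table: only the three pairs quoted in this comparison.
# It is NOT Numen's actual alphabet.
LETTER_WORDS = {"hoof": "h", "each": "e", "yank": "y"}


def decode_spoken_letters(utterance: str) -> str:
    """Map each spoken code word to the single letter it stands for."""
    letters = []
    for word in utterance.lower().split():
        if word not in LETTER_WORDS:
            # A closed vocabulary means unknown words are rejected, not guessed.
            raise KeyError(f"unknown code word: {word!r}")
        letters.append(LETTER_WORDS[word])
    return "".join(letters)


print(decode_spoken_letters("hoof each yank"))  # prints: hey
```

A small, closed vocabulary like this is what makes command-style recognition predictable: the recognizer only has to pick among known words instead of transcribing open-ended speech.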
Example Workflows
Writing "Hey Sarah" with Voxtype
[Hold key]
"Hey Sarah"
[Release]
→ "Hey Sarah" appears
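The hold, speak, release flow above can be modeled as a small state machine. The sketch below is an assumption about how such a push-to-talk loop could be structured, not Voxtype's actual implementation; `PushToTalk` and `fake_transcribe` are hypothetical names, and the recognizer is a stand-in so the example runs without a speech model.

```python
from typing import Callable, List


class PushToTalk:
    """Buffer audio only while the key is held; transcribe once on release."""

    def __init__(self, transcribe: Callable[[bytes], str]) -> None:
        self._transcribe = transcribe
        self._chunks: List[bytes] = []
        self._recording = False

    def key_down(self) -> None:
        # Start a fresh recording each time the key is pressed.
        self._chunks = []
        self._recording = True

    def feed(self, chunk: bytes) -> None:
        # Audio arriving while the key is up is dropped, not stored.
        if self._recording:
            self._chunks.append(chunk)

    def key_up(self) -> str:
        # Stop recording and hand the buffered audio to the recognizer.
        self._recording = False
        audio = b"".join(self._chunks)
        self._chunks = []
        return self._transcribe(audio) if audio else ""


def fake_transcribe(audio: bytes) -> str:
    """Stand-in recognizer: treats the 'audio' bytes as UTF-8 text."""
    return audio.decode("utf-8")


ptt = PushToTalk(fake_transcribe)
ptt.feed(b"ignored ")  # key not held, so this chunk is dropped
ptt.key_down()
ptt.feed(b"Hey ")
ptt.feed(b"Sarah")
print(ptt.key_up())  # prints: Hey Sarah
```

Because nothing is buffered while the key is up, the recognizer runs only on demand, which is the contrast with an always-listening design.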
Writing "Hey Sarah" with Numen
"scribe" (dictation mode)
"hoof each yank" → "hey"
"scribe cap sarah" → "Sarah"
Numen is powerful but has a learning curve. It's designed for users with accessibility needs who are willing to invest the time to master it.
Resource Usage
| Aspect | Voxtype | Numen |
|---|---|---|
| Architecture | On-demand | Always listening |
| Memory | ~50 MB idle | ~200 MB active |
| Model size | 300 MB to 3 GB | ~50 MB per language |
| GPU Acceleration | Vulkan, CUDA, Metal, ROCm | No (VOSK is CPU-only) |
The Right Choice
Ask yourself one question: "Can I use a keyboard comfortably?"
Yes → Voxtype (or another dictation tool)
No → Numen Voice (designed for your needs)
If you have RSI or accessibility needs, Numen is purpose-built for you. The learning curve pays off with full computer control via voice.