stt test

2026-01-13 14:10:10 -06:00 · 02f24bb524
commit 02f24bb524
parent c408693861
15 changed files with 4184 additions and 15 deletions
--- a/flakes/stt_ime/README.md
+++ b/flakes/stt_ime/README.md
@@ -0,0 +1,108 @@
+# stt_ime - Speech-to-Text Input Method for Fcitx5
+
+Local, privacy-preserving speech-to-text that integrates as a native Fcitx5 input method.
+
+## Components
+
+- **stt-stream**: Rust CLI that captures audio, runs VAD, and transcribes with Whisper
+- **fcitx5-stt**: C++ Fcitx5 addon that spawns stt-stream and commits text to apps
+
+## Modes
+
+- **Manual**: Press `Ctrl+Space` or `Ctrl+R` to start/stop recording
+- **Oneshot**: Automatically starts on speech, commits on silence, then resets
+- **Continuous**: Always listening, commits each utterance automatically
+
+Press `Ctrl+M` while STT is active to cycle between modes.
+
+## Keys (when STT input method is active)
+
+| Key | Action |
+|-----|--------|
+| `Ctrl+Space` / `Ctrl+R` | Toggle recording (manual mode) |
+| `Ctrl+M` | Cycle mode (manual → oneshot → continuous) |
+| `Enter` | Accept current preedit text |
+| `Escape` | Cancel recording / clear preedit |
+
+## Usage
+
+### NixOS Module
+
+```nix
+# In your host's flake.nix inputs:
+stt_ime.url = "git+https://git.ros.one/josh/nixos-config?dir=flakes/stt_ime";
+
+# In your NixOS config:
+{
+  imports = [ inputs.stt_ime.nixosModules.default ];
+
+  ringofstorms.sttIme = {
+    enable = true;
+    model = "base.en";  # tiny, base, small, medium (add .en for English-only), or large-v3 (multilingual)
+    useGpu = false;     # set true for CUDA acceleration
+  };
+}
+```
+
+### Standalone CLI
+
+```bash
+# Run with default settings (manual mode)
+stt-stream
+
+# Run in continuous mode
+stt-stream --mode continuous
+
+# Use a specific model
+stt-stream --model small.en
+
+# Commands are read from stdin, one per line (manual mode).
+# They must reach the same running process, so pipe them together:
+#   start     begin recording
+#   stop      stop and transcribe
+#   cancel    cancel without transcribing
+#   shutdown  exit
+{ echo "start"; sleep 5; echo "stop"; echo "shutdown"; } | stt-stream
+```
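+
+Because commands arrive on stdin, a parent process can drive `stt-stream` through a pipe. The Python sketch below is illustrative only and is not the addon's implementation: `send_command` and `KNOWN_COMMANDS` are hypothetical names, and it assumes one newline-terminated command per line, which this README implies but does not specify.
+
+```python
+import io
+
+# The four commands listed above; anything else is rejected before it is written.
+KNOWN_COMMANDS = ("start", "stop", "cancel", "shutdown")
+
+
+def send_command(stream, command):
+    """Write one newline-terminated command to a text stream and flush it."""
+    if command not in KNOWN_COMMANDS:
+        raise ValueError(f"unknown command: {command!r}")
+    stream.write(command + "\n")
+    stream.flush()
+
+
+# Demonstration against an in-memory buffer instead of a real process.
+buffer = io.StringIO()
+for cmd in ("start", "stop", "shutdown"):
+    send_command(buffer, cmd)
+sent = buffer.getvalue()
+```
+
+With a real process, `stream` would be the `stdin` of a handle created with `subprocess.Popen(["stt-stream"], stdin=subprocess.PIPE, text=True)`.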
+
+### Output Format (NDJSON)
+
+```json
+{"type":"ready"}
+{"type":"recording_started"}
+{"type":"partial","text":"hello worl"}
+{"type":"partial","text":"hello world"}
+{"type":"final","text":"Hello world."}
+{"type":"recording_stopped"}
+{"type":"shutdown"}
+```
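+
+A consumer can process this stream one line at a time. The following Python sketch is an illustration under stated assumptions rather than the addon's actual logic: `collect_finals` is a hypothetical helper, and it relies only on the `type` and `text` fields shown in the sample above.
+
+```python
+import json
+
+
+def collect_finals(lines):
+    """Return the text of every "final" event, stopping at "shutdown"."""
+    finals = []
+    for line in lines:
+        line = line.strip()
+        if not line:
+            continue  # tolerate blank lines between events
+        event = json.loads(line)
+        kind = event.get("type")
+        if kind == "final":
+            finals.append(event.get("text", ""))
+        elif kind == "shutdown":
+            break
+        # Other event types (ready, partial, recording_*) are ignored here;
+        # a UI might show "partial" text as preedit (assumption).
+    return finals
+
+
+sample = [
+    '{"type":"ready"}',
+    '{"type":"recording_started"}',
+    '{"type":"partial","text":"hello worl"}',
+    '{"type":"final","text":"Hello world."}',
+    '{"type":"recording_stopped"}',
+    '{"type":"shutdown"}',
+]
+finals = collect_finals(sample)
+```
+
+Passing `process.stdout` from a running `stt-stream` in place of `sample` would work the same way, since a text-mode pipe is iterable by line.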
+
+## Models
+
+Models are automatically downloaded from Hugging Face on first run and cached in `~/.cache/stt-stream/models/`.
+
+| Model | Size | Speed | Quality |
+|-------|------|-------|---------|
+| tiny.en | ~75MB | Fastest | Basic |
+| base.en | ~150MB | Fast | Good (default) |
+| small.en | ~500MB | Medium | Better |
+| medium.en | ~1.5GB | Slow | Great |
+| large-v3 | ~3GB | Slowest | Best (multilingual) |
+
+## Environment Variables
+
+- `STT_STREAM_MODEL_PATH`: Path to a specific model file
+- `STT_STREAM_MODEL`: Model name (overridden by the `--model` CLI flag)
+- `STT_STREAM_USE_GPU`: Set to "1" for GPU acceleration
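+
+The note on `STT_STREAM_MODEL` implies a precedence order. Here is a minimal Python sketch of one plausible reading: `resolve_model` and `gpu_enabled` are hypothetical functions, and the fallback to `base.en` is taken from the default marked in the Models table. How `STT_STREAM_MODEL_PATH` interacts with the model name is not stated here, so the sketch leaves it out.
+
+```python
+def resolve_model(cli_model, env):
+    """Pick a model name: CLI value first, then STT_STREAM_MODEL, then the default."""
+    if cli_model:
+        return cli_model
+    return env.get("STT_STREAM_MODEL") or "base.en"
+
+
+def gpu_enabled(env):
+    """Assumption: only the documented value "1" turns GPU acceleration on."""
+    return env.get("STT_STREAM_USE_GPU") == "1"
+
+
+from_cli = resolve_model("small.en", {"STT_STREAM_MODEL": "tiny.en"})
+from_env = resolve_model(None, {"STT_STREAM_MODEL": "tiny.en"})
+fallback = resolve_model(None, {})
+```
+
+In real use, `env` would be `os.environ` and `cli_model` the parsed value of `--model`, or `None` when the flag is absent.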
+
+## Building
+
+```bash
+cd flakes/stt_ime
+nix build .#stt-stream    # Rust CLI only
+nix build .#fcitx5-stt    # Fcitx5 addon (includes stt-stream)
+nix build                  # Default: fcitx5-stt
+```
+
+## Integration with de_plasma
+
+The addon is automatically added to Fcitx5 when `ringofstorms.sttIme.enable = true`.
+It appears as "Speech to Text" (STT) in the input method switcher alongside US and Mozc.