VoxCli 0.1.2


vox

A small CLI for transcribing audio files locally with Whisper, built around recording workflows (notably Apple Voice Memos). Distributed as a .NET global tool.

vox orchestrates two existing CLIs — whisper-cli (whisper.cpp) and ffmpeg — rather than bundling its own engine. It also ships an MCP mode so it can be used directly from Claude Code and other MCP clients.

See PLAN.md for design decisions, the full CLI shape, configuration, model handling, and the index.

Prerequisites

brew install whisper-cpp ffmpeg

Both must be in PATH. vox refuses to run with a clear error if either is missing.
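To confirm the prerequisites before running vox, a check along these lines can help. This is a minimal sketch, not vox's own check: check_prereqs is a hypothetical helper name, and the message wording is illustrative.

```shell
# Report which of the given commands are missing from PATH.
# Prints one line per missing command on stderr; returns 1 if any is missing.
# check_prereqs is a hypothetical helper, not part of vox.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing prerequisite: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_prereqs whisper-cli ffmpeg returns 0 only when both commands resolve in the current PATH.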

Install

dotnet tool install --global VoxCli

Quick start

First-time setup picks source directories (defaults to the Apple Voice Memos shared container on macOS), an output directory, and a default model:

vox setup

Then use the recording-workflow commands:

vox listen                             # pick a recording from configured sources
vox listen --latest                    # transcribe the most recent recording (same as 'vox latest')
vox latest                             # transcribe the most recent recording
vox list                               # list recordings, marking transcribed/untranscribed
vox find <query>                       # find transcripts whose preview matches <query>
vox find --latest                      # print the path of the most recent transcript

(Running plain vox with no arguments prints help.)

Or transcribe a specific file:

vox path/to/recording.m4a              # transcribe → prints output path
vox path/to/recording.m4a --model small --format txt --language en
vox help                               # full usage

On success, vox prints the absolute output path on stdout and nothing else. Errors and progress go to stderr. This makes it pipe-friendly:

TRANSCRIPT=$(vox recording.m4a) && cat "$TRANSCRIPT"
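The same stdout contract makes batch use straightforward. The sketch below loops over several recordings and collects one transcript path per line. transcribe_all and the VOX_BIN override are hypothetical names introduced here for illustration, so the loop can be tried with a stand-in command. It also assumes vox exits non-zero on failure.

```shell
# Transcribe each file given as an argument and print one transcript path per line.
# VOX_BIN is a hypothetical override (not a vox feature); it defaults to "vox".
# Assumption: vox exits with a non-zero status when a transcription fails.
transcribe_all() {
    status=0
    for recording in "$@"; do
        # stdout carries only the output path; progress and errors stay on stderr
        if transcript=$("${VOX_BIN:-vox}" "$recording"); then
            printf '%s\n' "$transcript"
        else
            echo "skipped: $recording" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, transcribe_all *.m4a > transcripts.txt writes the list of output paths to a file while progress remains visible on the terminal.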

Models

The default model is large-v3 (OpenAI's released large checkpoint, ~2.9 GB). It is downloaded lazily on the first transcription that needs it; subsequent runs reuse the cached file. Change the default with vox config set-default-model <name> (e.g. small, medium, large-v3-turbo).

vox models list                        # list models present locally
vox models download <name>             # pre-fetch a model
vox models remove <name>               # delete a model from the vox cache
vox models path                        # print the vox model cache directory

vox looks for a model in this order: its own cache (~/.cache/voxcli/models), then the known reuse paths (~/.cache/openwhispr/whisper-models, ~/.cache/whisper.cpp, $WHISPER_CPP_MODEL_DIR). It downloads from Hugging Face only if the model is in none of them.
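The lookup order can be sketched as a small shell function. This is an illustration, not vox's implementation: find_model is a hypothetical name, and the ggml-<name>.bin file naming is an assumption based on the usual whisper.cpp convention, which vox may not follow exactly.

```shell
# Print the first existing model file for a model name, following the lookup order above.
# find_model is a hypothetical helper, not a vox command.
# Assumption: model files are named ggml-<name>.bin (the usual whisper.cpp naming).
find_model() {
    name=$1
    file="ggml-$name.bin"
    for dir in \
        "$HOME/.cache/voxcli/models" \
        "$HOME/.cache/openwhispr/whisper-models" \
        "$HOME/.cache/whisper.cpp" \
        "${WHISPER_CPP_MODEL_DIR:-}"
    do
        # Skip the environment-variable slot when it is unset or empty
        [ -n "$dir" ] || continue
        if [ -f "$dir/$file" ]; then
            printf '%s\n' "$dir/$file"
            return 0
        fi
    done
    return 1   # not found locally; this is the point where a download would be needed
}
```

Running find_model small prints the path of the first match, or returns 1 when no local copy exists.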

Voice Activity Detection

Whisper occasionally falls into hallucination loops on audio with long silent stretches — repeating the same sentence hundreds of times, or emitting boilerplate phrases it learned from YouTube training data. To prevent this, vox passes the Silero VAD model to whisper-cli to trim silence before transcription. VAD is on by default.

The VAD model (~2 MB) is downloaded once on first use and cached at ~/.cache/voxcli/vad/.

vox <file> --no-vad                    # disable VAD for a single run
vox <file> --vad                       # force VAD on even if disabled in config
vox config set-vad off                 # disable persistently
vox config set-vad on                  # re-enable persistently
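The per-run flags override the persisted setting. A rough sketch of that precedence follows. resolve_vad is a hypothetical helper, and the rule that the last flag wins when both are given is an assumption; the behavior of vox in that case is not documented here.

```shell
# Decide whether VAD is active for one run.
# $1 is the persisted setting ("on" or "off"); the remaining arguments are the command line.
# resolve_vad is a hypothetical helper, not part of vox.
# Assumption: when both --vad and --no-vad appear, the last one wins.
resolve_vad() {
    vad=$1
    shift
    for arg in "$@"; do
        case "$arg" in
            --vad)    vad=on ;;
            --no-vad) vad=off ;;
        esac
    done
    printf '%s\n' "$vad"
}
```

With the setting stored as off, resolve_vad off recording.m4a --vad prints on, matching the "force VAD on even if disabled in config" case.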

MCP mode

claude mcp add vox -- vox --mcp

vox exposes tools so Claude can list recordings, transcribe, and look up existing transcripts without being told about the CLI.

Build from source

dotnet build                                                            # build
dotnet test                                                              # run tests
dotnet run --project VoxCli.Cli -- <args>                                # run from source
dotnet pack VoxCli.Cli/VoxCli.Cli.csproj --configuration Release         # build the global-tool nupkg

License

MIT.

Target frameworks

net10.0 is compatible. Computed: net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows.

This package has no dependencies.

Version  Downloads  Last Updated
0.1.2    97         5/15/2026
0.1.1    95         5/13/2026