VoiceToText 0.1.0
dotnet add package VoiceToText --version 0.1.0
VoiceToText
A C# speech-to-text library with a provider abstraction layer. Supports multiple STT backends through a common interface, designed for reuse across transcribers, Unity games, desktop apps, and more.
Packages
| Package | Description |
|---|---|
| VoiceToText | Core abstractions (zero STT dependencies) |
| VoiceToText.Vosk | Vosk provider: true streaming, lightweight, sub-second latency |
| VoiceToText.Whisper | Whisper.net provider: best accuracy, batch model (streaming simulated with a 2-3 s buffer) |
| VoiceToText.Audio.NAudio | Windows microphone capture via NAudio |
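To make the "simulated streaming" idea concrete, here is a minimal sketch of how a batch-only recognizer can be fed from a live source: incoming PCM chunks are accumulated until a fixed window of audio is available, and only then handed off for transcription. The `WindowBuffer` class, its `TryPush` method, and the 2-second default are hypothetical names and values chosen for illustration; this is not taken from VoiceToText.Whisper and the actual provider may buffer differently.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of VoiceToText): collects PCM chunks and
// releases them as one window once enough audio has accumulated.
public sealed class WindowBuffer
{
    private const int SampleRate = 16000;  // 16 kHz
    private const int BytesPerSample = 2;  // 16-bit mono

    private readonly int _windowBytes;
    private readonly MemoryStream _pending = new MemoryStream();

    public WindowBuffer(double windowSeconds = 2.0)
    {
        _windowBytes = (int)(windowSeconds * SampleRate * BytesPerSample);
    }

    // Appends a chunk. Returns true and the buffered audio once at least
    // one full window is available; otherwise returns false.
    public bool TryPush(byte[] chunk, out byte[] window)
    {
        _pending.Write(chunk, 0, chunk.Length);
        if (_pending.Length < _windowBytes)
        {
            window = Array.Empty<byte>();
            return false;
        }

        window = _pending.ToArray();
        _pending.SetLength(0); // start collecting the next window
        return true;
    }
}
```

Each returned window could then be wrapped in a stream and passed to a batch call. The cost of this approach is latency: no text can appear until a whole window has been captured, which is why a natively streaming backend responds faster.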
Prerequisites
- .NET 10 SDK (10.0.100 or later)
- A speech model — Whisper models auto-download; Vosk models must be downloaded manually
Usage
Register services
services.AddVoiceToText()
    .AddVoskRecognizer(opts => opts.ModelPath = "models/vosk-model-small-en-us")
    .AddNAudioMicrophone();
Batch transcription
await using var recognizer = serviceProvider.GetRequiredService<ISpeechRecognizer>();
await using var stream = File.OpenRead("audio.wav");
var result = await recognizer.TranscribeAsync(stream);
Console.WriteLine(result.Text);
Real-time streaming
var streaming = serviceProvider.GetRequiredService<IStreamingRecognizer>();
streaming.FinalResultReceived += (_, e) => Console.WriteLine(e.Text);
await streaming.StartAsync();
// Push audio chunks...
await streaming.StopAsync();
Console Sample
The included console sample supports file transcription and live microphone input.
# Transcribe a WAV file (defaults to Whisper)
dotnet run --project samples/VoiceToText.Samples.Console -- hello-world.wav
# Live microphone with Vosk
dotnet run --project samples/VoiceToText.Samples.Console -- --mic --vosk --model vosk-model-small-en-us-0.15
# Live microphone with Whisper (default)
dotnet run --project samples/VoiceToText.Samples.Console -- --mic
Options:
| Flag | Description |
|---|---|
| --mic | Live microphone streaming (Windows only) |
| --vosk | Use Vosk provider |
| --whisper | Use Whisper provider (default) |
| --model <path> | Path to model file (.bin) or directory |
Project Structure
src/
  VoiceToText/                   Core abstractions (zero STT dependencies)
  VoiceToText.Vosk/              Vosk provider (true streaming)
  VoiceToText.Whisper/           Whisper.net provider (batch, best accuracy)
  VoiceToText.Audio.NAudio/      Windows microphone capture
samples/
  VoiceToText.Samples.Console/   Push-to-talk console demo
tests/
  VoiceToText.Tests/             Unit tests (26 tests)
Architecture
Core Interfaces
- ISpeechRecognizer: batch/file transcription (TranscribeAsync, TranscribeSegmentsAsync)
- IStreamingRecognizer: real-time streaming with PushAudio() + PartialResultReceived / FinalResultReceived events
- IAudioSource: audio input abstraction (microphone, file) with DataAvailable event
Audio Pipeline
All audio is normalized to 16 kHz mono 16-bit PCM. The AudioFormatConverter utility handles stereo-to-mono downmixing, resampling, and PCM/float conversion.
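As an illustration of two of these steps, the sketch below downmixes interleaved stereo to mono and converts between 16-bit PCM and floating-point samples. The `PcmSketch` class and its method names are assumptions made for this example; they are not the AudioFormatConverter API, whose signatures are not documented here. Resampling is deliberately left out, since a correct resampler needs filtering that does not fit in a short example.

```csharp
using System;

// Illustrative helpers only; not the library's AudioFormatConverter.
public static class PcmSketch
{
    // Averages interleaved stereo samples (L, R, L, R, ...) into mono.
    public static short[] StereoToMono(short[] interleaved)
    {
        if (interleaved.Length % 2 != 0)
            throw new ArgumentException("Expected an even number of samples.", nameof(interleaved));

        var mono = new short[interleaved.Length / 2];
        for (int i = 0; i < mono.Length; i++)
        {
            // short + short is computed as int, so the sum cannot overflow.
            int sum = interleaved[2 * i] + interleaved[2 * i + 1];
            mono[i] = (short)(sum / 2);
        }
        return mono;
    }

    // Maps 16-bit PCM samples to floats in the range [-1.0, 1.0).
    public static float[] Pcm16ToFloat(short[] pcm)
    {
        var result = new float[pcm.Length];
        for (int i = 0; i < pcm.Length; i++)
            result[i] = pcm[i] / 32768f;
        return result;
    }

    // Maps floats back to 16-bit PCM, clamping out-of-range input.
    public static short[] FloatToPcm16(float[] samples)
    {
        var result = new short[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            float clamped = Math.Clamp(samples[i], -1f, 1f);
            result[i] = (short)Math.Round(clamped * 32767f);
        }
        return result;
    }
}
```

Averaging the two channels is the simplest downmix and keeps the result within 16-bit range; clamping on the way back to PCM guards against float samples that exceed full scale.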
Target Frameworks
- net10.0: primary target
- netstandard2.1: Unity 2021+ and broad compatibility
Building
dotnet build
dotnet test
dotnet pack # produces 4 .nupkg files
Contributing
Feature branches off main. PRs welcome.
License
Dependencies
- .NETStandard 2.1
  - Microsoft.Bcl.AsyncInterfaces (>= 9.0.0)
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.0)
  - Microsoft.Extensions.Logging.Abstractions (>= 9.0.0)
  - Microsoft.Extensions.Options (>= 9.0.0)
- net10.0
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.0)
  - Microsoft.Extensions.Logging.Abstractions (>= 9.0.0)
  - Microsoft.Extensions.Options (>= 9.0.0)
| Version | Downloads | Last Updated |
|---|---|---|
| 0.1.0 | 173 | 3/12/2026 |