ElBruno.Realtime
0.6.0
A pluggable real-time audio conversation framework for .NET, following Microsoft.Extensions.AI patterns. Build voice-powered apps with local STT, TTS, VAD, and any LLM — all running on your machine, no cloud required.
Powered By
This project is built on two core Microsoft frameworks for AI and conversation management:
- Microsoft.Extensions.AI (MEAI) — Provides unified abstractions for chat clients (IChatClient) and speech-to-text (ISpeechToTextClient), enabling pluggable LLM and STT providers throughout the pipeline.
- Microsoft Agent Framework — Manages conversation sessions, per-user state, and dialogue continuity, ensuring each user gets a consistent, stateful conversation experience.
Together with industry-standard models (Whisper STT, Silero VAD, ONNX Runtime), these frameworks provide a production-ready foundation for real-time voice applications.
Architecture
Microphone (Audio Input)
│ raw PCM audio
▼
🔇 Silero VAD ─── Voice Activity Detection (~2 MB ONNX)
│ speech segments
▼
🎙️ Whisper STT ─── Speech-to-Text (~75 MB GGML)
│ transcribed text
▼
🤖 Any IChatClient ─── LLM Chat (Ollama / OpenAI / Azure)
│ response text
▼
🗣️ Any TTS ─── Text-to-Speech (pluggable)
│ WAV audio
▼
Speaker (Audio Output)
The VAD and STT models download automatically on first use. The LLM is pluggable via IChatClient (Ollama, OpenAI, Azure, or any other provider) and is set up separately.
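The data flow above can be sketched as a chain of four stages. This is only an illustration with stand-in delegates, not the library's actual types: the names detectSpeech, transcribe, chat, synthesize, and RunTurnAsync are hypothetical, and each stub returns canned data so the example runs without any model.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-ins for the four stages. Real implementations would wrap
// a VAD, an STT engine, an IChatClient, and a TTS engine respectively.
Func<byte[], byte[]> detectSpeech = pcm => pcm;                                    // VAD stub: keep everything
Func<byte[], Task<string>> transcribe = _ => Task.FromResult("what time is it");   // STT stub
Func<string, Task<string>> chat = text => Task.FromResult($"You asked: {text}");   // LLM stub
Func<string, Task<byte[]>> synthesize = text =>
    Task.FromResult(Encoding.UTF8.GetBytes(text));                                 // TTS stub

// One conversational turn: audio in, (user text, reply text, reply audio) out.
async Task<(string UserText, string ResponseText, byte[] Audio)> RunTurnAsync(byte[] pcm)
{
    var speech = detectSpeech(pcm);
    var userText = await transcribe(speech);
    var responseText = await chat(userText);
    var audio = await synthesize(responseText);
    return (userText, responseText, audio);
}

var turn = await RunTurnAsync(new byte[3200]);
Console.WriteLine($"User: {turn.UserText}");
Console.WriteLine($"AI: {turn.ResponseText}");
```

Because every stage is just a function of the previous stage's output, any one of them can be replaced without touching the others, which is the property the pluggable-provider design relies on.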
Features
- Local-First — All audio processing runs locally. No data leaves your machine.
- Microsoft.Extensions.AI — Implements ISpeechToTextClient and follows M.E.AI patterns throughout
- Pluggable Providers — Swap STT, TTS, VAD, or LLM independently
- Auto Model Download — Models download from HuggingFace/Whisper.net on first use
- DI-Ready — One-line setup with AddPersonaPlexRealtime() + fluent builder
- Streaming — Full async streaming via IAsyncEnumerable for real-time processing
- Multi-Target — Supports .NET 8.0 and .NET 10.0
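As a rough illustration of the streaming feature, the sketch below shows the general IAsyncEnumerable pattern: a producer yields text chunks as they become available and the consumer handles each one immediately. StreamReplyAsync is a hypothetical stand-in, not part of this package.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Hypothetical producer: yields response text piece by piece, the way a
// streaming chat or conversation client might.
async IAsyncEnumerable<string> StreamReplyAsync()
{
    foreach (var piece in new[] { "Hello", ", ", "world", "!" })
    {
        await Task.Delay(10); // simulate latency between chunks
        yield return piece;
    }
}

var full = new StringBuilder();
await foreach (var chunk in StreamReplyAsync())
{
    Console.Write(chunk);  // show each chunk as soon as it arrives
    full.Append(chunk);    // and accumulate the complete reply
}
Console.WriteLine();
```

The consumer never waits for the whole reply, so the first words can be displayed (or sent to a TTS engine) while later ones are still being generated.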
Quick Start
Install
dotnet add package ElBruno.Realtime # Core abstractions + pipeline
dotnet add package ElBruno.Realtime.Whisper # Local STT (Whisper.net)
dotnet add package ElBruno.Realtime.SileroVad # Local VAD (Silero)
Basic Usage
using ElBruno.Realtime;
using ElBruno.Realtime.Whisper;
using ElBruno.Realtime.SileroVad;
using Microsoft.Extensions.AI;
// `builder` is your application's host builder, e.g. WebApplication.CreateBuilder(args)
// 1. Configure the pipeline (models auto-download on first use)
builder.Services.AddPersonaPlexRealtime(opts =>
{
    opts.DefaultSystemPrompt = "You are a helpful voice assistant.";
})
.UseWhisperStt("whisper-tiny.en")  // ~75 MB, or "whisper-base.en" for accuracy
.UseSileroVad();                   // Voice activity detection
// To add TTS, chain .UseYourTts() before the semicolon (any ITextToSpeechClient)
// 2. Register your LLM (any IChatClient provider)
builder.Services.AddChatClient(new OllamaChatClient(
    new Uri("http://localhost:11434"), "phi4-mini"));
// 3. Build the app and resolve the pipeline
var app = builder.Build();
var conversation = app.Services.GetRequiredService<IRealtimeConversationClient>();
// One-shot turn
using var audio = File.OpenRead("question.wav");
var turn = await conversation.ProcessTurnAsync(audio);
Console.WriteLine($"User: {turn.UserText}");
Console.WriteLine($"AI: {turn.ResponseText}");
// Streaming full-duplex (`microphoneStream` is your live microphone audio source)
await foreach (var evt in conversation.ConverseAsync(microphoneStream))
{
    if (evt.Kind == ConversationEventKind.ResponseTextChunk)
        Console.Write(evt.ResponseText);
}
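Whisper models operate on 16 kHz mono audio, so it can be useful to inspect a file such as question.wav before handing it to the pipeline. Whether this package resamples other formats is not stated here, so treat the check as a precaution. The sketch below is a hypothetical helper (ReadWavFormat is not part of the package) that assumes the canonical WAV layout with the "fmt " chunk directly after the RIFF header.

```csharp
using System;
using System.IO;
using System.Text;

// Reads channel count, sample rate, and bit depth from a canonical PCM WAV
// header. Files with extra chunks before "fmt " are not handled in this sketch.
(short Channels, int SampleRate, short BitsPerSample) ReadWavFormat(Stream wav)
{
    using var reader = new BinaryReader(wav, Encoding.ASCII, leaveOpen: true);
    string Tag() => new string(reader.ReadChars(4));
    if (Tag() != "RIFF") throw new InvalidDataException("Missing RIFF tag");
    reader.ReadInt32();                      // overall chunk size
    if (Tag() != "WAVE") throw new InvalidDataException("Missing WAVE tag");
    if (Tag() != "fmt ") throw new InvalidDataException("Missing fmt chunk");
    reader.ReadInt32();                      // fmt chunk size
    reader.ReadInt16();                      // audio format (1 = PCM)
    short channels = reader.ReadInt16();
    int sampleRate = reader.ReadInt32();
    reader.ReadInt32();                      // byte rate
    reader.ReadInt16();                      // block align
    short bits = reader.ReadInt16();
    return (channels, sampleRate, bits);
}

// Build a 16 kHz mono 16-bit header in memory so the example runs without a file.
var ms = new MemoryStream();
using (var w = new BinaryWriter(ms, Encoding.ASCII, leaveOpen: true))
{
    w.Write(Encoding.ASCII.GetBytes("RIFF")); w.Write(36);
    w.Write(Encoding.ASCII.GetBytes("WAVE"));
    w.Write(Encoding.ASCII.GetBytes("fmt ")); w.Write(16);
    w.Write((short)1); w.Write((short)1);    // PCM, mono
    w.Write(16000); w.Write(32000);          // sample rate, byte rate
    w.Write((short)2); w.Write((short)16);   // block align, bits per sample
}
ms.Position = 0;
var fmt = ReadWavFormat(ms);
Console.WriteLine($"{fmt.Channels} ch, {fmt.SampleRate} Hz, {fmt.BitsPerSample}-bit");
```

For a real file, pass the result of File.OpenRead to the helper and rewind or reopen the stream before sending it to the conversation client.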
Packages
| Package | Description |
|---|---|
| ElBruno.Realtime | Core: ITextToSpeechClient, IVoiceActivityDetector, IRealtimeConversationClient, pipeline orchestration, DI |
| ElBruno.Realtime.Whisper | ISpeechToTextClient (M.E.AI) via Whisper.net — auto-downloads GGML models |
| ElBruno.Realtime.SileroVad | IVoiceActivityDetector via Silero VAD v5 ONNX — configurable thresholds |
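To give a feel for what a configurable VAD threshold does, the sketch below turns per-frame speech probabilities into speech segments using two thresholds (hysteresis). It is a generic illustration, not the Silero model or this package's API: FindSegments, startThreshold, and endThreshold are hypothetical names, and the probabilities are made up.

```csharp
using System;
using System.Collections.Generic;

// Turns per-frame speech probabilities into (start, end) frame ranges.
// A segment opens when the probability reaches startThreshold and closes
// when it drops below endThreshold; the gap between the two avoids flicker.
List<(int Start, int End)> FindSegments(float[] probs, float startThreshold, float endThreshold)
{
    var segments = new List<(int Start, int End)>();
    int start = -1;
    for (int i = 0; i < probs.Length; i++)
    {
        if (start < 0 && probs[i] >= startThreshold)
        {
            start = i;
        }
        else if (start >= 0 && probs[i] < endThreshold)
        {
            segments.Add((start, i - 1));
            start = -1;
        }
    }
    if (start >= 0) segments.Add((start, probs.Length - 1)); // still speaking at the end
    return segments;
}

var frameProbs = new float[] { 0.1f, 0.2f, 0.8f, 0.6f, 0.4f, 0.1f, 0.9f, 0.7f };
var found = FindSegments(frameProbs, startThreshold: 0.5f, endThreshold: 0.35f);
foreach (var (s, e) in found)
    Console.WriteLine($"speech frames {s}..{e}");
// prints "speech frames 2..4" then "speech frames 6..7"
```

Raising startThreshold makes the detector less likely to trigger on noise, while lowering endThreshold keeps a segment open through brief dips in confidence.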
Samples
| Sample | Description |
|---|---|
| scenario-01-console | Realtime console app |
| scenario-02-api | ASP.NET Core API with SignalR |
| scenario-03-blazor-aspire | Blazor + .NET Aspire with voice chat + voice-controlled side-scroller game |
| scenario-04-realtime-console | Real-time microphone conversation with Whisper STT + Ollama LLM |
Run a Sample
# Prerequisites: Ollama running with phi4-mini
ollama pull phi4-mini
ollama serve
# Run the console sample
cd src/samples/scenario-01-console
dotnet run
Auto-Downloaded Models
Auto-downloaded models are cached in %LOCALAPPDATA%/ElBruno/Realtime/ and shared across apps; the Ollama model is managed by Ollama itself:
| Model | Size | Purpose | Auto-Download |
|---|---|---|---|
| Silero VAD v5 | ~2 MB | Voice activity detection | ✅ Yes |
| Whisper tiny.en | ~75 MB | Speech-to-text (fast) | ✅ Yes |
| Whisper base.en | ~142 MB | Speech-to-text (accurate) | ✅ Yes |
| Phi4-Mini (Ollama) | ~2.7 GB | LLM chat | ❌ Manual: ollama pull phi4-mini |
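The download-once, reuse-afterwards behaviour described above can be sketched as follows. This is an assumption about the general pattern, not the package's actual code: GetModelCacheDir and EnsureModel are hypothetical helpers, and the download is simulated with a callback.

```csharp
using System;
using System.IO;

// Resolves the per-user cache folder. On Windows this expands to %LOCALAPPDATA%;
// .NET maps the same special folder to a per-user data directory on Linux and macOS.
string GetModelCacheDir() => Path.Combine(
    Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
    "ElBruno", "Realtime");

// Returns the cached file path, invoking the download callback only on a cache miss.
string EnsureModel(string cacheDir, string fileName, Action<string> download)
{
    Directory.CreateDirectory(cacheDir);
    var path = Path.Combine(cacheDir, fileName);
    if (!File.Exists(path)) download(path);
    return path;
}

Console.WriteLine(GetModelCacheDir());

// Demo against a temp folder so the real cache is left untouched.
var demoDir = Path.Combine(Path.GetTempPath(), "realtime-cache-demo-" + Guid.NewGuid().ToString("N"));
int downloads = 0;
Action<string> fakeDownload = p => { File.WriteAllText(p, "model bytes"); downloads++; };
EnsureModel(demoDir, "demo-model.bin", fakeDownload);
EnsureModel(demoDir, "demo-model.bin", fakeDownload); // second call hits the cache
Console.WriteLine($"downloads: {downloads}");
Directory.Delete(demoDir, recursive: true);
```

Keying the cache on a per-user folder rather than the application directory is what lets several apps share one copy of each model.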
Documentation
| Document | Description |
|---|---|
| Models Overview | How each model is used in the pipeline |
| Architecture | Three-layer architecture, data flow, M.E.AI integration |
| Publishing | NuGet publishing guide |
Building from Source
git clone https://github.com/elbruno/ElBruno.Realtime.git
cd ElBruno.Realtime
dotnet build
dotnet test
Requirements
- .NET 8.0 or .NET 10.0 SDK
- ONNX Runtime compatible platform (Windows, Linux, macOS)
- Ollama (or any IChatClient provider) for the LLM
- Sufficient disk space for model files
Contributing
Contributions are welcome! Here's how to get started:
- Fork the repository
- Create a branch for your feature or fix: git checkout -b feature/my-feature
- Make your changes and ensure the solution builds: dotnet build
- Run tests: dotnet test
- Submit a pull request with a clear description of the changes
Please open an issue first for major changes or new features to discuss the approach.
👋 About the Author
Hi! I'm ElBruno 🧡, a passionate developer and content creator exploring AI, .NET, and modern development practices.
Made with ❤️ by ElBruno
If you like this project, consider following my work across platforms:
- 📻 Podcast: No Tienen Nombre — Spanish-language episodes on AI, development, and tech culture
- 💻 Blog: ElBruno.com — Deep dives on embeddings, RAG, .NET, and local AI
- 📺 YouTube: youtube.com/elbruno — Demos, tutorials, and live coding
- 🔗 LinkedIn: @elbruno — Professional updates and insights
- 𝕏 Twitter: @elbruno — Quick tips, releases, and tech news
License
This project is licensed under the MIT License — see the LICENSE file for details.
Related Projects
- ElBruno.PersonaPlex — NVIDIA PersonaPlex-7B-v1 ONNX inference
- ElBruno.QwenTTS — QwenTTS text-to-speech
- ElBruno.VibeVoiceTTS — VibeVoiceTTS
- ElBruno.Text2Image — Text-to-image generation
- ElBruno.HuggingFace.Downloader — HuggingFace model downloader
Dependencies
- net10.0
  - Microsoft.Extensions.AI.Abstractions (>= 10.0.0)
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.13)
- net8.0
  - Microsoft.Extensions.AI.Abstractions (>= 10.0.0)
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.13)