SharpInference.ModelHandler 1.0.0-alpha.2

This is a prerelease version of SharpInference.ModelHandler.
dotnet add package SharpInference.ModelHandler --version 1.0.0-alpha.2
                    
NuGet\Install-Package SharpInference.ModelHandler -Version 1.0.0-alpha.2
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="SharpInference.ModelHandler" Version="1.0.0-alpha.2" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="SharpInference.ModelHandler" Version="1.0.0-alpha.2" />
                    
Directory.Packages.props
<PackageReference Include="SharpInference.ModelHandler" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add SharpInference.ModelHandler --version 1.0.0-alpha.2
                    
#r "nuget: SharpInference.ModelHandler, 1.0.0-alpha.2"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package SharpInference.ModelHandler@1.0.0-alpha.2
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=SharpInference.ModelHandler&version=1.0.0-alpha.2&prerelease
                    
Install as a Cake Addin
#tool nuget:?package=SharpInference.ModelHandler&version=1.0.0-alpha.2&prerelease
                    
Install as a Cake Tool

SharpInference

A pure C#/.NET AI inference engine for image generation, speech, vision, video, and interactive world models — no Python, no native runtime DLLs.

SharpInference loads .safetensors and .gguf checkpoints directly and runs them on NVIDIA CUDA, cross-vendor Vulkan, or CPU SIMD — entirely in managed .NET. GPU kernels are PTX/SPIR-V shipped with the package and JIT-compiled at runtime; there are no C++ wrappers, no bundled native inference library, and no external Python process to manage. Just NuGet packages.

It is the non-LLM companion to dotLLM: together they form a complete AI stack in pure .NET.


⚠️ Alpha software

This is 1.0.0-alpha — an early, fast-moving preview. Use it to experiment, not in production.

  • APIs will change without notice between alpha releases. Pin an exact version.
  • Model coverage is broad but maturity varies. Many architectures are implemented and load/run end-to-end but are still being validated numerically against their reference implementations. Treat output quality per-model as "verify before you rely on it."
  • No support guarantees, no semver stability until 1.0.0.
  • The OpenAI-compatible server and CLI are not published as packages in this alpha — they live in the source repository.

Found a bug or a mismatch against a reference? Please open an issue.


Install

One package pulls in the whole stack (all backends + every modality):

dotnet add package SharpInference --prerelease

Or reference only the pieces you need (see Packages):

dotnet add package SharpInference.Audio --prerelease
dotnet add package SharpInference.Cpu   --prerelease

Requires .NET 8 or .NET 10.


Quick start — speech-to-text

The Whisper pipeline downloads a checkpoint from HuggingFace on first use and runs on whichever backend you pass:

using SharpInference.Audio.Pipelines;
using SharpInference.Core.Backends;
using SharpInference.Cpu;          // or SharpInference.Cuda / SharpInference.Vulkan

using WhisperPipeline pipeline = await WhisperPipeline.LoadAsync("openai/whisper-base");
using IBackend backend = new CpuBackend();

var options = new WhisperOptions { Language = "en", WithTimestamps = false };
string text = pipeline.TranscribeWav(backend, "audio.wav", options);

Console.WriteLine(text);

Swap new CpuBackend() for new CudaBackend() or new VulkanBackend() — the pipeline is backend-agnostic. The same LoadAsync / pipeline pattern applies across modalities (StableDiffusion15Pipeline, WanVideoPipeline, KokoroPipeline, …); see the samples in the repo for image, video, and TTS walkthroughs.


What it can do

Modality Highlights
Image generation SD 1.5, SDXL, SD3, Flux.1 / Flux.2, AuraFlow, Chroma, HiDream, Qwen-Image, Lumina 2, OmniGen2, HunyuanImage, Ideogram, Kandinsky 5, and more — with LoRA, img2img, and tiling
Video generation LTX-Video, Wan 2.x, Lance, Kandinsky 5 video
Interactive / world models Matrix-Game 2 & 3, Oasis — action-conditioned, frame-by-frame generation
Speech-to-text Whisper (tiny → large-v3), Moonshine — with streaming and timestamps
Text-to-speech & voice Kokoro, F5-TTS, StyleTTS2, Bark, CosyVoice, Spark-TTS, VibeVoice, CSM
Music ACE-Step, MusicGen, YuE
Vision CLIP & SigLIP embeddings, YOLO detection, SAM segmentation, face detection

Checkpoints load directly from .safetensors / .gguf, including quantized weights (GGUF, MXFP4/8, NVFP4, block-scaled).

Coverage is wide because the engine shares a common core (tensors, schedulers, VAEs, text encoders, DSP) across architectures. Per-model numerical validation is ongoing — see the alpha note above.


Design pillars

Pillar What it means
Pure C# GPU access via PTX (CUDA Driver API) and SPIR-V (Vulkan) — no native shared inference libraries
Eager execution Ops run immediately; no computation graph to compile
Zero-allocation hot paths Tensor storage in NativeMemory.AlignedAlloc; weights memory-mapped; Span<T> throughout
Modular packages Pull in only the modality and backend you need

Packages

Package Description
SharpInference Meta-package — references everything below
SharpInference.Core Tensor types, IBackend, schedulers, pipeline base types
SharpInference.ModelHandler Safetensors/GGUF loaders, quant dequant, HuggingFace download, model registry
SharpInference.Tokenizers CLIP, T5, Whisper, and LLM-style tokenizers
SharpInference.Cpu CPU backend with AVX2 / AVX-512 / NEON SIMD kernels
SharpInference.Cuda CUDA backend — PTX kernels + cuBLAS
SharpInference.Vulkan Cross-vendor Vulkan backend (NVIDIA / AMD / Intel) via SPIR-V
SharpInference.Diffusion Image + music diffusion pipelines, VAEs, text encoders, LoRA
SharpInference.Audio Whisper/Moonshine STT, TTS, voice conversion, music
SharpInference.Vision CLIP/SigLIP embeddings, YOLO, SAM, face detection
SharpInference.Video LTX-Video, Wan, Lance, Kandinsky 5 video
SharpInference.Interactive Action-conditioned world models (Matrix-Game, Oasis)

Requirements

  • .NET 8 or .NET 10 SDK

CUDA backend (NVIDIA, fastest)

  • CUDA 12.x runtime
  • NVIDIA GPU, compute capability 8.0+ (RTX 30xx/40xx, A100, H100)

Vulkan backend (NVIDIA / AMD / Intel, cross-vendor)

  • Vulkan 1.3+ runtime (ships with the GPU vendor driver)
  • GPU with FP16 compute (shaderFloat16) — most discrete GPUs from 2019+

CPU backend

  • Any x86-64 (AVX2+) or ARM64 (NEON) machine — no GPU required


License

MIT © 2026 kalebbroo

Product Compatible and additional computed target framework versions.
.NET net8.0 is compatible.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed.  net9.0 was computed.  net9.0-android was computed.  net9.0-browser was computed.  net9.0-ios was computed.  net9.0-maccatalyst was computed.  net9.0-macos was computed.  net9.0-tvos was computed.  net9.0-windows was computed.  net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on SharpInference.ModelHandler:

Package Downloads
SharpInference.Audio

Audio inference for SharpInference — Whisper STT, Kokoro TTS, voice conversion, mel-spectrogram preprocessing, audio codecs, and HuggingFace model caching.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
1.0.0-alpha.2 51 6/14/2026