SharpInference 0.7.1
SharpInference
A high-performance LLM inference engine and image generation pipeline for .NET 10. Reads GGUF model files and runs transformer inference on CPU (AVX2/AVX-512 SIMD) or GPU (Vulkan compute shaders / CUDA cuBLAS). Includes Z-Image-Turbo text-to-image and Real-ESRGAN upscaling.
This is the library package. For a command-line tool, install SharpInference.Cli instead.
Install
dotnet add package SharpInference
Quick start
using SharpInference.Core;
using SharpInference.Cpu;
using SharpInference.Engine;
var model = GgufModelLoader.Load("models/SmolLM2-1.7B-Instruct-Q4_K_M.gguf");
var backend = new CpuBackend();
var forward = new ForwardPass(model, backend);
var engine = new InferenceEngine(forward, model.Tokenizer);
await foreach (var token in engine.GenerateAsync("Hello, ", new SamplingParams { Temperature = 0.7f }))
{
    Console.Write(token);
}
For GPU inference, swap CpuBackend for VulkanBackend or CudaBackend, or use HybridForwardPass to offload selected layers.
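When the whole completion is wanted as one string rather than a console stream, the await foreach pattern from the quick start can fill a buffer instead. The sketch below is a minimal illustration, assuming GenerateAsync returns an IAsyncEnumerable (implied by the await foreach above, but not confirmed here). FakeTokens and CollectAsync are hypothetical names, not SharpInference APIs; FakeTokens stands in for the engine so the snippet runs without a model file.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for engine.GenerateAsync(...): yields a few fixed
// pieces so the sketch runs without loading a model.
static async IAsyncEnumerable<string> FakeTokens()
{
    foreach (var piece in new[] { "Hello, ", "world", "!" })
    {
        await Task.Yield();
        yield return piece;
    }
}

// Hypothetical helper: appends each streamed item to a buffer, stops after
// maxItems items, and throws OperationCanceledException if ct is cancelled
// between items.
static async Task<string> CollectAsync<T>(
    IAsyncEnumerable<T> tokens, int maxItems, CancellationToken ct = default)
{
    var text = new StringBuilder();
    var count = 0;
    await foreach (var token in tokens)
    {
        ct.ThrowIfCancellationRequested();
        text.Append(token);
        if (++count >= maxItems) break;
    }
    return text.ToString();
}

Console.WriteLine(await CollectAsync(FakeTokens(), maxItems: 256));
```

With a real engine, the first argument would be the stream returned by engine.GenerateAsync(...) in place of FakeTokens().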
What's in the package
All 8 SharpInference library assemblies are bundled in one package:
| Assembly | Purpose |
|---|---|
| SharpInference.Core | GGUF parsing, BPE tokenizer, tensor types, model graph |
| SharpInference.Cpu | CPU backend (AVX2/AVX-512 SIMD, Q4_K_M dequant, optional OpenBLAS) |
| SharpInference.Vulkan | Vulkan compute backend |
| SharpInference.Cuda | CUDA / cuBLAS backend + NVRTC kernels |
| SharpInference.Engine | Forward pass, paged KV cache, samplers, speculative decoding |
| SharpInference.Diffusion | Z-Image-Turbo + FLUX.1 image generation |
| SharpInference.Pipeline | 3-tier VRAM → RAM → NVMe memory hierarchy |
| SharpInference.TurboQuant | 3-bit KV-cache compression |
Optional native dependencies
- OpenBLAS (CPU GEMM acceleration) — auto-detected on PATH, silently skipped if absent.
- Vulkan drivers — up-to-date GPU drivers (AMD / Intel / NVIDIA). No extra install on Windows.
- CUDA Toolkit 11.x or 12.x, with cublas64_*.dll and cudart64_*.dll on PATH. NVIDIA only.
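Because the CUDA libraries are located through PATH, checking for them before constructing a GPU backend can save a confusing load failure. The following is a rough diagnostic sketch and not part of SharpInference: FindOnPath is a hypothetical helper that searches each PATH directory for the two file patterns named above, and it assumes the Windows-style .dll names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not a SharpInference API): returns every file in the
// PATH directories whose name matches the given wildcard pattern.
static string[] FindOnPath(string pattern)
{
    var path = Environment.GetEnvironmentVariable("PATH") ?? "";
    var matches = new List<string>();
    foreach (var dir in path.Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries))
    {
        if (!Directory.Exists(dir)) continue;
        try
        {
            matches.AddRange(Directory.GetFiles(dir, pattern));
        }
        catch (Exception e) when (e is UnauthorizedAccessException or IOException)
        {
            // Skip directories that cannot be read.
        }
    }
    return matches.ToArray();
}

// Report the first match for each CUDA library pattern, if any.
foreach (var pattern in new[] { "cublas64_*.dll", "cudart64_*.dll" })
{
    var found = FindOnPath(pattern);
    Console.WriteLine(found.Length > 0
        ? $"{pattern}: {found[0]}"
        : $"{pattern}: not found on PATH");
}
```

A missing file here only indicates that the CUDA backend is unlikely to load; the CPU and Vulkan backends do not depend on these libraries.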
NativeAOT
All assemblies are trim-safe and NativeAOT-compatible. To publish a single-binary application, set <PublishAot>true</PublishAot> in the project file and run:
dotnet publish -c Release -r win-x64
License
MIT. Copyright (c) 2026 Pekka Heikura.
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies
- net10.0
  - Microsoft.ML.Tokenizers (>= 2.0.0)
  - System.Numerics.Tensors (>= 10.0.5)
  - Vortice.Vulkan (>= 3.2.3)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on SharpInference:
| Package | Downloads |
|---|---|
| SharpInference.Server: ASP.NET Core endpoints, options, and DI extensions that expose SharpInference as OpenAI- and Anthropic-compatible HTTP APIs. Bring your own host (Kestrel, IIS, YARP, etc.). | |
| Version | Downloads | Last Updated |
|---|---|---|
| 0.7.2-alpha.0.6 | 0 | 6/6/2026 |
| 0.7.2-alpha.0.5 | 0 | 6/6/2026 |
| 0.7.2-alpha.0.4 | 0 | 6/6/2026 |
| 0.7.2-alpha.0.3 | 27 | 6/6/2026 |
| 0.7.2-alpha.0.2 | 44 | 6/5/2026 |
| 0.7.2-alpha.0.1 | 38 | 6/5/2026 |
| 0.7.1 | 67 | 6/4/2026 |
| 0.7.1-alpha.0.1 | 38 | 6/4/2026 |
| 0.7.0 | 69 | 6/4/2026 |
| 0.6.1-alpha.0.9 | 38 | 6/4/2026 |
| 0.6.1-alpha.0.8 | 43 | 6/4/2026 |
| 0.6.1-alpha.0.7 | 39 | 6/4/2026 |
| 0.6.1-alpha.0.6 | 37 | 6/4/2026 |
| 0.6.1-alpha.0.5 | 46 | 6/4/2026 |
| 0.6.1-alpha.0.4 | 41 | 6/3/2026 |
| 0.6.1-alpha.0.3 | 43 | 6/3/2026 |
| 0.6.1-alpha.0.2 | 37 | 6/3/2026 |
| 0.6.1-alpha.0.1 | 34 | 6/3/2026 |
| 0.6.0 | 96 | 6/3/2026 |
| 0.5.1-alpha.0.44 | 40 | 6/3/2026 |