SharpInference.Cli
0.7.1
There is a newer prerelease version of this package available.
See the version list below for details.
dotnet tool install --global SharpInference.Cli --version 0.7.1
This package contains a .NET tool you can call from the shell/command line.
dotnet new tool-manifest
dotnet tool install --local SharpInference.Cli --version 0.7.1
#tool dotnet:?package=SharpInference.Cli&version=0.7.1
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
nuke :add-package SharpInference.Cli --version 0.7.1
SharpInference.Cli
sharpi-cli — a command-line tool for LLM inference and image generation, powered by SharpInference. Reads GGUF models and runs transformer inference on CPU (AVX2/AVX-512 SIMD) or GPU (Vulkan / CUDA).
Install
dotnet tool install -g SharpInference.Cli
Or update:
dotnet tool update -g SharpInference.Cli
Usage
# Text generation (CPU)
sharpi-cli -m models/SmolLM2-1.7B-Instruct-Q4_K_M.gguf -p "Once upon a time" --temp 0.7
# All layers on GPU (Vulkan or CUDA, auto-selected)
sharpi-cli -m models/Qwen3-8B-Q4_K_M.gguf -p "Explain mmap" -g -1
# Interactive chat (omit -p to enter chat mode)
sharpi-cli -m models/Qwen3-8B-Q4_K_M.gguf
# Image generation (Z-Image-Turbo, requires CUDA)
sharpi-cli image \
-m models/z_image_turbo-Q5_K_M.gguf \
--vae models/z-image-turbo/vae \
--qwen-encoder models/Z-Image-AbliteratedV1.Q5_K_M.gguf \
--qwen-tokenizer models/z-image-turbo/tokenizer/tokenizer.json \
-p "a serene mountain lake at sunrise" -W 512 -H 512 --steps 4 -o out.png
Flag names are intentionally compatible with llama.cpp / llama-cli.
| Flag | Default | Description |
|---|---|---|
| `-m`, `--model` | auto-detect | Path to GGUF model file |
| `-p`, `--prompt` | (interactive) | Input prompt; omit to enter chat |
| `-n`, `--n-predict` | `512` | Maximum tokens to generate |
| `--temp` | `0.7` | Sampling temperature (0 = greedy) |
| `--top-k` | `40` | Top-k sampling |
| `--top-p` | `0.95` | Top-p nucleus sampling |
| `--min-p` | `0.05` | Min-p sampling |
| `-g`, `--n-gpu-layers` | `0` | Layers on GPU (0 = CPU only, -1 = all) |
| `-c`, `--ctx-size` | model default | Context / max sequence length |
| `--tq` | off | TurboQuant KV cache compression (3-bit, ~5× VRAM reduction) |
Run sharpi-cli --help for the full reference.
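As an illustration of how the flags in the table combine, the sketch below assembles a greedy, fully GPU-offloaded invocation with a capped output length and an explicit context size. The model path and the chosen values (`128`, `4096`) are assumptions for illustration; only the flag names come from the table. The command is printed rather than executed so it can be inspected first.

```shell
# Sketch: compose a sharpi-cli command line from flags documented in the table.
MODEL="models/Qwen3-8B-Q4_K_M.gguf"   # assumed path; point this at your own GGUF file

# Collect the arguments as positional parameters so quoting survives.
# --temp 0 selects greedy decoding, -g -1 puts all layers on the GPU,
# -n caps the number of generated tokens, -c sets the context size.
set -- -m "$MODEL" -p "Explain mmap" -n 128 --temp 0 -g -1 -c 4096

# Print the command for inspection instead of running it.
CMD="sharpi-cli $*"
echo "$CMD"

# To execute it with the original quoting intact:
#   sharpi-cli "$@"
```

Keeping the arguments in `"$@"` rather than a flat string means a prompt containing spaces is still passed as a single argument when the command is eventually run.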
Requirements
- .NET 10 runtime (the tool installs framework-dependent)
- x86-64 CPU with AVX2 support
- For GPU inference: Vulkan-capable GPU (any vendor) or NVIDIA GPU with CUDA 11.x / 12.x
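The CPU and runtime requirements above can be checked before installing. The following is a minimal sketch for Linux, assuming `/proc/cpuinfo` is available; `has_avx2` is a hypothetical helper name, not part of the tool. It does not check for Vulkan or CUDA.

```shell
# Sketch: pre-flight checks for the AVX2 and .NET 10 requirements (Linux only).

# has_avx2 is a hypothetical helper: succeeds if the cpuinfo-style file
# (default /proc/cpuinfo) lists the avx2 CPU flag as a whole word.
has_avx2() {
  grep -qw avx2 "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_avx2; then
  echo "AVX2: yes"
else
  echo "AVX2: not detected"
fi

# dotnet --list-runtimes prints one installed runtime per line,
# starting with the runtime name followed by its version.
if command -v dotnet >/dev/null 2>&1 &&
   dotnet --list-runtimes | grep -q '^Microsoft\.NETCore\.App 10\.'; then
  echo ".NET 10 runtime: yes"
else
  echo ".NET 10 runtime: not detected"
fi
```

On other platforms the AVX2 check reports "not detected" simply because `/proc/cpuinfo` is missing, so treat a negative result there as inconclusive.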
License
MIT. Copyright (c) 2026 Pekka Heikura.
| Product | Versions (compatible and additional computed target frameworks) |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.
| Version | Downloads | Last Updated |
|---|---|---|
| 0.7.2-alpha.0.2 | 0 | 6/5/2026 |
| 0.7.2-alpha.0.1 | 0 | 6/5/2026 |
| 0.7.1 | 38 | 6/4/2026 |
| 0.7.1-alpha.0.1 | 26 | 6/4/2026 |
| 0.7.0 | 31 | 6/4/2026 |
| 0.6.1-alpha.0.9 | 23 | 6/4/2026 |
| 0.6.1-alpha.0.8 | 28 | 6/4/2026 |
| 0.6.1-alpha.0.7 | 39 | 6/4/2026 |
| 0.6.1-alpha.0.6 | 32 | 6/4/2026 |
| 0.6.1-alpha.0.5 | 43 | 6/4/2026 |
| 0.6.1-alpha.0.4 | 40 | 6/3/2026 |
| 0.6.1-alpha.0.3 | 44 | 6/3/2026 |
| 0.6.1-alpha.0.2 | 35 | 6/3/2026 |
| 0.6.1-alpha.0.1 | 33 | 6/3/2026 |
| 0.6.0 | 89 | 6/3/2026 |
| 0.5.1-alpha.0.44 | 36 | 6/3/2026 |
| 0.5.1-alpha.0.43 | 41 | 6/2/2026 |
| 0.5.1-alpha.0.42 | 37 | 6/2/2026 |
| 0.5.1-alpha.0.23 | 53 | 5/31/2026 |
| 0.5.1-alpha.0.22 | 50 | 5/31/2026 |