SharpInference.Cli
0.6.0
There is a newer version of this package available.
See the version list below for details.
dotnet tool install --global SharpInference.Cli --version 0.6.0
This package contains a .NET tool you can call from the shell/command line.
dotnet new tool-manifest
dotnet tool install --local SharpInference.Cli --version 0.6.0
#tool dotnet:?package=SharpInference.Cli&version=0.6.0
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
nuke :add-package SharpInference.Cli --version 0.6.0
SharpInference.Cli
sharpi-cli is a command-line tool for LLM inference and image generation, powered by SharpInference. It reads GGUF models and runs transformer inference on the CPU (AVX2/AVX-512 SIMD) or on the GPU (Vulkan / CUDA).
Install
dotnet tool install -g SharpInference.Cli
Or update:
dotnet tool update -g SharpInference.Cli
Usage
# Text generation (CPU)
sharpi-cli -m models/SmolLM2-1.7B-Instruct-Q4_K_M.gguf -p "Once upon a time" --temp 0.7
# All layers on GPU (Vulkan or CUDA, auto-selected)
sharpi-cli -m models/Qwen3-8B-Q4_K_M.gguf -p "Explain mmap" -g -1
# Interactive chat (omit -p to enter chat mode)
sharpi-cli -m models/Qwen3-8B-Q4_K_M.gguf
# Image generation (Z-Image-Turbo, requires CUDA)
sharpi-cli image \
-m models/z_image_turbo-Q5_K_M.gguf \
--vae models/z-image-turbo/vae \
--qwen-encoder models/Z-Image-AbliteratedV1.Q5_K_M.gguf \
--qwen-tokenizer models/z-image-turbo/tokenizer/tokenizer.json \
-p "a serene mountain lake at sunrise" -W 512 -H 512 --steps 4 -o out.png
Flag names are intentionally compatible with llama.cpp / llama-cli.
| Flag | Default | Description |
|---|---|---|
| -m, --model | auto-detect | Path to GGUF model file |
| -p, --prompt | (interactive) | Input prompt; omit to enter chat |
| -n, --n-predict | 512 | Maximum tokens to generate |
| --temp | 0.7 | Sampling temperature (0 = greedy) |
| --top-k | 40 | Top-k sampling |
| --top-p | 0.95 | Top-p nucleus sampling |
| --min-p | 0.05 | Min-p sampling |
| -g, --n-gpu-layers | 0 | Layers on GPU (0 = CPU only, -1 = all) |
| -c, --ctx-size | model default | Context / max sequence length |
| --tq | off | TurboQuant KV cache compression (3-bit, ~5× VRAM reduction) |
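The flags in the table can be combined in a single call. The sketch below is a small wrapper that assembles such a command line from the documented flags only. The function name `run_sharpi` and the `DRY_RUN` variable are hypothetical helpers introduced for illustration, and `--tq` is assumed to be a bare switch (its default is listed as "off"); check `sharpi-cli --help` before relying on that.

```shell
# run_sharpi MODEL PROMPT
# Hypothetical wrapper: greedy decoding, all layers on the GPU, 4096-token
# context, at most 128 new tokens, TurboQuant KV cache compression enabled.
# Assumption: --tq takes no value.
run_sharpi() {
  model=$1
  prompt=$2
  set -- -m "$model" -p "$prompt" -n 128 --temp 0 -g -1 -c 4096 --tq
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default here): print the command instead of executing it.
    # Quoting is not preserved in this printout.
    printf '%s\n' "sharpi-cli $*"
  else
    sharpi-cli "$@"
  fi
}

run_sharpi models/Qwen3-8B-Q4_K_M.gguf "Explain mmap"
```

Set `DRY_RUN=0` to actually invoke the tool. Since `--temp 0` selects greedy decoding, the top-k, top-p, and min-p settings are left at their defaults here.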
Run sharpi-cli --help for the full reference.
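The "~5×" figure quoted for `--tq` is consistent with simple arithmetic if the uncompressed KV cache is assumed to hold 16-bit values: 16 / 3 ≈ 5.3. The sketch below estimates cache size with the usual formula (keys plus values, per layer, per KV head, per position). The model shape is hypothetical, the 16-bit baseline is an assumption, and any per-block metadata a real 3-bit format stores is ignored, so treat the result as a rough upper bound on the saving rather than a measurement of SharpInference.

```shell
# kv_cache_bytes LAYERS KV_HEADS HEAD_DIM CTX BITS
# Keys and values each hold LAYERS * KV_HEADS * HEAD_DIM * CTX elements,
# hence the leading factor of 2. BITS is the storage width per element.
kv_cache_bytes() {
  echo $(( 2 * $1 * $2 * $3 * $4 * $5 / 8 ))
}

# Hypothetical shape: 32 layers, 8 KV heads, head dimension 128, 4096 tokens.
fp16=$(kv_cache_bytes 32 8 128 4096 16)   # assumed 16-bit baseline
tq3=$(kv_cache_bytes 32 8 128 4096 3)     # 3-bit elements, no metadata

echo "16-bit KV cache: $(( fp16 / 1048576 )) MiB"   # prints 512 MiB
echo "3-bit KV cache:  $(( tq3 / 1048576 )) MiB"    # prints 96 MiB
```

For this shape the estimate drops from 512 MiB to 96 MiB, a ratio of about 5.3.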
Requirements
- .NET 10 runtime (the tool is installed as a framework-dependent application, so the runtime must already be present)
- x86-64 CPU with AVX2 support
- For GPU inference: Vulkan-capable GPU (any vendor) or NVIDIA GPU with CUDA 11.x / 12.x
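The first two requirements can be checked from a shell before installing. This is a minimal sketch, not part of the tool: the helper names are hypothetical, the AVX2 check reads `/proc/cpuinfo` and so only works on Linux, and GPU checks are left out (tools such as `vulkaninfo` or `nvidia-smi` can be used for those where available).

```shell
# Succeeds if `dotnet --list-runtimes` output (read from stdin) lists a
# 10.x Microsoft.NETCore.App runtime.
has_net10_runtime() {
  grep -q '^Microsoft\.NETCore\.App 10\.'
}

# has_cpu_flag FLAG [FILE]
# Succeeds if FILE (default /proc/cpuinfo, Linux only) lists FLAG as a word.
has_cpu_flag() {
  grep -qw "$1" "${2:-/proc/cpuinfo}"
}

if command -v dotnet >/dev/null 2>&1 && dotnet --list-runtimes | has_net10_runtime; then
  echo ".NET 10 runtime: found"
else
  echo ".NET 10 runtime: not found"
fi

if [ -r /proc/cpuinfo ] && has_cpu_flag avx2; then
  echo "AVX2: supported"
else
  echo "AVX2: not detected (or not a Linux host)"
fi
```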
License
MIT. Copyright (c) 2026 Pekka Heikura.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.
| Version | Downloads | Last Updated |
|---|---|---|
| 0.7.2-alpha.0.25 | 0 | 6/12/2026 |
| 0.7.2-alpha.0.24 | 0 | 6/12/2026 |
| 0.7.2-alpha.0.23 | 26 | 6/11/2026 |
| 0.7.2-alpha.0.22 | 42 | 6/10/2026 |
| 0.7.2-alpha.0.21 | 36 | 6/10/2026 |
| 0.7.2-alpha.0.20 | 37 | 6/10/2026 |
| 0.7.2-alpha.0.19 | 45 | 6/10/2026 |
| 0.7.2-alpha.0.18 | 37 | 6/10/2026 |
| 0.7.2-alpha.0.17 | 39 | 6/9/2026 |
| 0.7.2-alpha.0.16 | 36 | 6/9/2026 |
| 0.7.2-alpha.0.15 | 42 | 6/9/2026 |
| 0.7.2-alpha.0.14 | 51 | 6/8/2026 |
| 0.7.2-alpha.0.13 | 53 | 6/8/2026 |
| 0.7.2-alpha.0.12 | 48 | 6/7/2026 |
| 0.7.2-alpha.0.11 | 46 | 6/7/2026 |
| 0.7.2-alpha.0.10 | 47 | 6/7/2026 |
| 0.7.2-alpha.0.9 | 48 | 6/7/2026 |
| 0.7.2-alpha.0.8 | 68 | 6/7/2026 |
| 0.7.1 | 106 | 6/4/2026 |
| 0.6.0 | 105 | 6/3/2026 |