Model2Vec.Net
0.1.1
A pure-managed C# (net10.0) port of MinishLab Model2Vec
static-embedding inference. Load a Model2Vec model folder
(model.safetensors + tokenizer.json + config.json) and compute sentence
embeddings with no Python, no native libraries, and no ONNX.
Unofficial, independent port. Not affiliated with or endorsed by MinishLab. See the project repository for full attribution and third-party notices.
Features
- Pure C# (net10.0), no native dependency, no P/Invoke.
- Reads the safetensors format directly (F32, F16, F64, I8, U8).
- Supports Model2Vec embeddings and Sentence Transformers embedding.weight tensors, plus Model2Vec vocabulary quantization (weights / mapping).
- Hugging Face tokenizers via Microsoft.ML.Tokenizers: WordPiece, byte-level BPE, and SentencePiece-backed Unigram.
- SIMD-accelerated scaling and normalization via System.Numerics.Tensors.
- Output verified against the Python model2vec package (tolerance 1e-4).
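To give a feel for what reading safetensors directly involves, here is a minimal sketch of the file layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. It is not this package's implementation. The helper name ReadF32Tensor and the toy in-memory file are assumptions made for illustration, and only F32 is decoded.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;
using System.Text.Json;

// Hypothetical helper: decodes one F32 tensor from a safetensors byte buffer.
static (long[] Shape, float[] Data) ReadF32Tensor(ReadOnlySpan<byte> buffer, string name)
{
    // Bytes 0..7 hold the JSON header length as a little-endian unsigned 64-bit integer.
    int headerLength = checked((int)BinaryPrimitives.ReadUInt64LittleEndian(buffer));
    ReadOnlySpan<byte> headerUtf8 = buffer.Slice(8, headerLength);
    ReadOnlySpan<byte> payload = buffer.Slice(8 + headerLength);

    using JsonDocument header = JsonDocument.Parse(headerUtf8.ToArray());
    JsonElement entry = header.RootElement.GetProperty(name);
    if (entry.GetProperty("dtype").GetString() != "F32")
        throw new NotSupportedException("This sketch only decodes F32.");

    JsonElement shapeJson = entry.GetProperty("shape");
    long[] shape = new long[shapeJson.GetArrayLength()];
    for (int i = 0; i < shape.Length; i++)
        shape[i] = shapeJson[i].GetInt64();

    // data_offsets are [begin, end) byte positions relative to the start of the payload.
    JsonElement offsets = entry.GetProperty("data_offsets");
    int begin = checked((int)offsets[0].GetInt64());
    int end = checked((int)offsets[1].GetInt64());
    ReadOnlySpan<byte> raw = payload.Slice(begin, end - begin);

    float[] data = new float[raw.Length / sizeof(float)];
    for (int i = 0; i < data.Length; i++)
        data[i] = BinaryPrimitives.ReadSingleLittleEndian(raw.Slice(i * sizeof(float)));
    return (shape, data);
}

// Build a toy in-memory file holding one 2x2 F32 tensor named "embeddings".
byte[] headerBytes = Encoding.UTF8.GetBytes(
    "{\"embeddings\":{\"dtype\":\"F32\",\"shape\":[2,2],\"data_offsets\":[0,16]}}");
float[] values = { 1f, 2f, 3f, 4f };
byte[] file = new byte[8 + headerBytes.Length + values.Length * sizeof(float)];
BinaryPrimitives.WriteUInt64LittleEndian(file, (ulong)headerBytes.Length);
headerBytes.CopyTo(file, 8);
for (int i = 0; i < values.Length; i++)
    BinaryPrimitives.WriteSingleLittleEndian(
        file.AsSpan(8 + headerBytes.Length + i * sizeof(float)), values[i]);

var (parsedShape, parsedData) = ReadF32Tensor(file, "embeddings");
Console.WriteLine($"{string.Join("x", parsedShape)}: {string.Join(", ", parsedData)}");
```

A real reader would also handle the other dtypes listed above and validate offsets against the file size; the sketch skips both to keep the layout visible.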
Installation
dotnet add package Model2Vec.Net
Usage
using Model2VecNet;
var model = Model2VecModel.Load(@"C:\models\potion-base-2M");
float[] embedding = model.Encode("The quick brown fox jumps over the lazy dog.");
Console.WriteLine(model.Dimension); // e.g. 256
Console.WriteLine(embedding.Length); // == Dimension
Batch encoding:
float[][] embeddings = model.Encode([
"First sentence",
"Second sentence",
]);
Model2VecModel is immutable after loading and safe to share across threads.
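Because Encode returns plain float[] vectors, comparing two sentences comes down to ordinary vector math. The sketch below ranks candidates by cosine similarity. The CosineSimilarity helper and the placeholder vectors are assumptions for illustration; in practice the arrays would come from model.Encode.

```csharp
using System;

// Hypothetical helper: cosine similarity between two equal-length vectors.
static float CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Placeholder vectors standing in for the float[] results of model.Encode(...).
float[] query = { 1f, 0f, 1f };
float[][] candidates =
{
    new[] { 2f, 0f, 2f }, // same direction as the query
    new[] { 0f, 1f, 0f }, // orthogonal to the query
};

float[] scores = new float[candidates.Length];
for (int c = 0; c < candidates.Length; c++)
{
    scores[c] = CosineSimilarity(query, candidates[c]);
    Console.WriteLine($"candidate {c}: {scores[c]:F3}");
}
```

System.Numerics.Tensors, which this package already depends on, also offers TensorPrimitives.CosineSimilarity for the same computation; the loop above is written out only to keep the example dependency-free.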
Microsoft.Extensions.AI
Model2VecModel implements Microsoft.Extensions.AI
IEmbeddingGenerator<string, Embedding<float>>, so it plugs directly into the
.NET AI ecosystem (RAG pipelines, vector stores, semantic search):
using Microsoft.Extensions.AI;
using Model2VecNet;
IEmbeddingGenerator<string, Embedding<float>> generator = Model2VecModel.Load(@"C:\models\potion-base-2M");
GeneratedEmbeddings<Embedding<float>> embeddings = await generator.GenerateAsync(["First sentence", "Second sentence"]);
ReadOnlyMemory<float> vector = embeddings[0].Vector;
This composes with any Microsoft.Extensions.VectorData store (for example
Hnsw.Net) to embed and index text with no
external service.
Scope: inference only
This package loads a distilled Model2Vec static model and encodes text. It does
not distill or train models — that is, it does not forward-pass a teacher
sentence-transformer over a vocabulary, run PCA, apply Zipf/SIF weighting, or run
the tokenlearn/classifier training steps.
Why: those steps require a full transformer encoder plus an autodiff/optimizer
training loop, which in .NET means a native deep-learning dependency (ONNX
Runtime or libtorch). That would break this package's pure-managed, no-native-dependency
design. Distillation is a one-time offline step — create the model once with the
upstream Python tooling and load the resulting model.safetensors here.
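By contrast, the inference side is small. As a rough illustration of static-embedding inference in general, and not of this package's internals, the sketch below looks up one row per token id in an embedding table, averages the rows, and L2-normalizes the result. The PoolAndNormalize helper, the toy table, and the token ids are all assumptions; a real model gets its ids from the tokenizer and its table from model.safetensors.

```csharp
using System;

// Hypothetical helper: mean-pools the table rows selected by tokenIds, then L2-normalizes.
static float[] PoolAndNormalize(float[][] table, int[] tokenIds)
{
    int dim = table[0].Length;
    float[] pooled = new float[dim];
    if (tokenIds.Length == 0)
        return pooled; // no tokens: return the zero vector

    // Sum the embedding row of every token.
    foreach (int id in tokenIds)
        for (int d = 0; d < dim; d++)
            pooled[d] += table[id][d];

    // Divide by the token count to get the mean, and accumulate the squared norm.
    float sumSquares = 0f;
    for (int d = 0; d < dim; d++)
    {
        pooled[d] /= tokenIds.Length;
        sumSquares += pooled[d] * pooled[d];
    }

    // Scale to unit length unless the mean is the zero vector.
    float norm = MathF.Sqrt(sumSquares);
    if (norm > 0f)
        for (int d = 0; d < dim; d++)
            pooled[d] /= norm;
    return pooled;
}

// Toy 3-token vocabulary with 2-dimensional embeddings.
float[][] toyTable =
{
    new[] { 3f, 0f },
    new[] { 0f, 4f },
    new[] { 1f, 1f },
};
int[] toyTokenIds = { 0, 1 };

float[] sentenceVector = PoolAndNormalize(toyTable, toyTokenIds);
Console.WriteLine(string.Join(", ", sentenceVector));
```

Nothing here needs a transformer forward pass or gradients, which is why inference fits a pure-managed design while distillation does not.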
Getting a model
Models are published on Hugging Face and are not bundled with this package.
Download a model folder (containing model.safetensors, tokenizer.json, and
config.json) and pass its path to Model2VecModel.Load.
Any Model2Vec model in the standard folder layout will load.
Upstream / attribution
- Model2Vec (reference implementation): https://github.com/MinishLab/model2vec
- Microsoft.ML.Tokenizers: https://github.com/dotnet/machinelearning-tokenizers
License
MIT. See the project repository for the license and third-party notices: https://github.com/ericstj/Model2Vec.Net
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies
- net10.0
  - Microsoft.Extensions.AI.Abstractions (>= 10.6.0)
  - Microsoft.ML.Tokenizers (>= 2.0.0)
  - System.Numerics.Tensors (>= 10.0.8)