Mostlylucid.OcrNer
Local-first OCR, Named Entity Recognition, and Vision captioning for .NET
One line of setup, no manual model downloads. Models are fetched automatically on first use.
Features
- Tesseract OCR - extract text from images with ImageSharp preprocessing
- OpenCV preprocessing - deskew, denoise, and binarize degraded documents (opt-in)
- BERT NER - extract people, organizations, locations, and miscellaneous entities via ONNX
- Florence-2 Vision - local image captioning and scene-text OCR
- Microsoft.Recognizers.Text - rule-based extraction of dates, numbers, URLs, phones, emails, IPs (opt-in)
- Auto-download - models download from HuggingFace/GitHub on first use with atomic caching
- Full DI integration - `AddOcrNer()` registers everything as singletons
Quick Start
```shell
dotnet add package Mostlylucid.OcrNer
```

```csharp
// Register services
builder.Services.AddOcrNer(builder.Configuration);

// Use the pipeline
var pipeline = serviceProvider.GetRequiredService<IOcrNerPipeline>();
var result = await pipeline.ProcessImageAsync("invoice.png");

foreach (var entity in result.NerResult.Entities)
{
    // entity.Label: "PER", "ORG", "LOC", "MISC"
    // entity.Text: "John Smith"
    // entity.Confidence: 0.9996
}
```
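Once the entities are in hand, a common next step is to drop low-confidence hits and group the rest by label. The sketch below shows one way to do that with LINQ. It runs without the package: `EntitySketch` is a hypothetical stand-in that only mirrors the three properties shown above (`Label`, `Text`, `Confidence`), the sample values are made up, and the numeric type of `Confidence` is assumed to be `double`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample data shaped like the Quick Start entities.
var entities = new List<EntitySketch>
{
    new("PER", "John Smith", 0.9996),
    new("ORG", "Microsoft", 0.98),
    new("LOC", "Seattle", 0.97),
    new("MISC", "Inv", 0.41),
};

// Local cut-off for this sketch only; it is independent of the
// package's own MinConfidence setting.
const double threshold = 0.5;

// GroupBy yields groups in order of first appearance of each key.
var lines = entities
    .Where(e => e.Confidence >= threshold)
    .GroupBy(e => e.Label)
    .Select(g => $"{g.Key}: {string.Join(", ", g.Select(e => e.Text))}")
    .ToList();

foreach (var line in lines)
{
    Console.WriteLine(line);
}

// Hypothetical stand-in type, NOT the package's entity class.
record EntitySketch(string Label, string Text, double Confidence);
```

With real results, the same query should apply to `result.NerResult.Entities` in place of the sample list, provided the property names match those shown in the Quick Start.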
Configuration
```json
{
  "OcrNer": {
    "EnableOcr": true,
    "TesseractLanguage": "eng",
    "MinConfidence": 0.5,
    "MaxSequenceLength": 512,
    "ModelDirectory": "models/ocrner",
    "Preprocessing": "Default",
    "EnableAdvancedPreprocessing": false,
    "EnableRecognizers": false,
    "RecognizerCulture": "en-us"
  }
}
```
All settings have sensible defaults. The entire section can be omitted.
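The same keys can also be supplied in code, which is handy in tests or console apps that have no `appsettings.json`. The sketch below builds an in-memory `IConfiguration` that turns on the two opt-in features. It uses only standard `Microsoft.Extensions.Configuration` calls. It assumes that `AddOcrNer` reads the `OcrNer` section from the configuration root it is given, which the Quick Start and the JSON above suggest but do not state outright. The `0.7` value is illustrative only.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Keys use the standard "Section:Key" form of Microsoft.Extensions.Configuration.
var overrides = new Dictionary<string, string?>
{
    ["OcrNer:EnableAdvancedPreprocessing"] = "true", // opt in to OpenCV preprocessing
    ["OcrNer:EnableRecognizers"] = "true",           // opt in to Microsoft.Recognizers.Text
    ["OcrNer:MinConfidence"] = "0.7",                // illustrative value, not a recommendation
};

IConfiguration configuration = new ConfigurationBuilder()
    .AddInMemoryCollection(overrides)
    .Build();

// Read the values back to confirm the section is shaped as expected.
Console.WriteLine(configuration["OcrNer:EnableRecognizers"]);            // prints: true
Console.WriteLine(configuration.GetSection("OcrNer")["MinConfidence"]);  // prints: 0.7

// Assumption: this configuration can then be passed to AddOcrNer(...)
// in place of builder.Configuration, as in the Quick Start.
```

Keys left out of the dictionary are simply absent from the configuration, so the package's own defaults would apply to them.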
Services
| Service | What it does | Use when... |
|---|---|---|
| `INerService` | BERT NER from text | You already have text |
| `IOcrService` | Tesseract OCR from images | You need text from scans/screenshots |
| `IOcrNerPipeline` | OCR then NER in one call | You have images and want entities |
| `ITextRecognizerService` | Dates, phones, URLs, etc. | You want structured data alongside NER |
| `IVisionService` | Florence-2 captioning + OCR | You need image understanding |
CLI Tool
A companion CLI is available at Mostlylucid.OcrNer.CLI:
```shell
ocrner "John Smith works at Microsoft in Seattle"
ocrner ocr invoice.png -o results.json
ocrner caption photo.jpg --ocr --ner
```
License
MIT
| Product | Compatible and computed target framework versions |
|---|---|
| .NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net10.0
  - Florence2 (>= 25.12.63049)
  - Microsoft.Extensions.Configuration.Abstractions (>= 10.0.1)
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 10.0.1)
  - Microsoft.Extensions.Http (>= 10.0.1)
  - Microsoft.Extensions.Logging.Abstractions (>= 10.0.1)
  - Microsoft.Extensions.Options (>= 10.0.1)
  - Microsoft.Extensions.Options.ConfigurationExtensions (>= 10.0.1)
  - Microsoft.ML.OnnxRuntime (>= 1.23.2)
  - Microsoft.Recognizers.Text.DateTime (>= 1.8.10)
  - Microsoft.Recognizers.Text.Number (>= 1.8.10)
  - Microsoft.Recognizers.Text.Sequence (>= 1.8.10)
  - OpenCvSharp4 (>= 4.10.0.20241108)
  - OpenCvSharp4.runtime.win (>= 4.10.0.20241108)
  - SixLabors.ImageSharp (>= 3.1.12)
  - Tesseract (>= 5.2.0)
- net9.0
  - Florence2 (>= 25.12.63049)
  - Microsoft.Extensions.Configuration.Abstractions (>= 10.0.1)
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 10.0.1)
  - Microsoft.Extensions.Http (>= 10.0.1)
  - Microsoft.Extensions.Logging.Abstractions (>= 10.0.1)
  - Microsoft.Extensions.Options (>= 10.0.1)
  - Microsoft.Extensions.Options.ConfigurationExtensions (>= 10.0.1)
  - Microsoft.ML.OnnxRuntime (>= 1.23.2)
  - Microsoft.Recognizers.Text.DateTime (>= 1.8.10)
  - Microsoft.Recognizers.Text.Number (>= 1.8.10)
  - Microsoft.Recognizers.Text.Sequence (>= 1.8.10)
  - OpenCvSharp4 (>= 4.10.0.20241108)
  - OpenCvSharp4.runtime.win (>= 4.10.0.20241108)
  - SixLabors.ImageSharp (>= 3.1.12)
  - Tesseract (>= 5.2.0)
| Version | Downloads | Last Updated |
|---|---|---|
| 1.1.0 | 127 | 2/9/2026 |
| 1.0.0 | 116 | 2/9/2026 |
| 0.0.1-alpha0 | 110 | 2/9/2026 |
1.0.0
Initial release of Mostlylucid.OcrNer.
Features:
- Tesseract OCR with ImageSharp preprocessing (grayscale, contrast, sharpen, upscale)
- BERT NER via ONNX Runtime (PER, ORG, LOC, MISC entities)
- Florence-2 vision model for image captioning and scene-text OCR
- OpenCV advanced preprocessing: deskew, denoise, binarization (opt-in via EnableAdvancedPreprocessing)
- Microsoft.Recognizers.Text: rule-based date, number, URL, phone, email, IP extraction (opt-in via EnableRecognizers)
- Automatic model download from HuggingFace/GitHub on first use
- Full DI integration via AddOcrNer() extension method
- All services registered as thread-safe singletons with lazy initialization
- Targets net9.0 and net10.0