Mostlylucid.OcrNer 1.1.0

.NET 9.0

dotnet add package Mostlylucid.OcrNer --version 1.1.0

Mostlylucid.OcrNer

Local-first OCR, Named Entity Recognition, and Vision captioning for .NET

One line of setup, zero manual model downloads. Every model is downloaded automatically on first use.

Features

Tesseract OCR - extract text from images with ImageSharp preprocessing
OpenCV preprocessing - deskew, denoise, and binarize degraded documents (opt-in)
BERT NER - extract people, organizations, locations, and miscellaneous entities via ONNX
Florence-2 Vision - local image captioning and scene-text OCR
Microsoft.Recognizers.Text - rule-based extraction of dates, numbers, URLs, phones, emails, IPs (opt-in)
Auto-download - models download from HuggingFace/GitHub on first use with atomic caching
Full DI integration - AddOcrNer() registers everything as singletons
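
This page does not spell out what "atomic caching" means for the auto-download feature. One common way to get that property, shown here as a sketch and not as the package's actual implementation, is to write the download to a temporary file next to the target and then rename it into place, so an interrupted download never leaves a half-written model in the cache. The helper name `EnsureCachedAsync`, the file names, and the simulated fetch are all hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper: returns the cached path, fetching content only if it is missing.
// `fetch` stands in for the real download step.
async Task<string> EnsureCachedAsync(string targetPath, Func<Stream, Task> fetch)
{
    if (File.Exists(targetPath))
        return targetPath; // cache hit: nothing to download

    Directory.CreateDirectory(Path.GetDirectoryName(targetPath)!);

    // Write to a temp file beside the target so the final move stays on one volume.
    var tempPath = targetPath + "." + Guid.NewGuid().ToString("N") + ".tmp";
    try
    {
        await using (var output = File.Create(tempPath))
        {
            await fetch(output);
        }
        // Rename into place; readers see either no file or the complete file.
        File.Move(tempPath, targetPath, overwrite: true);
    }
    finally
    {
        if (File.Exists(tempPath))
            File.Delete(tempPath); // clean up after a failed or interrupted fetch
    }
    return targetPath;
}

var cacheDir = Path.Combine(Path.GetTempPath(), "ocrner-cache-demo-" + Guid.NewGuid().ToString("N"));
var modelPath = Path.Combine(cacheDir, "model.bin");
var fetchCount = 0;

// Simulated download so the sketch runs offline.
Func<Stream, Task> fakeFetch = async stream =>
{
    fetchCount++;
    var bytes = Encoding.UTF8.GetBytes("fake model bytes");
    await stream.WriteAsync(bytes, 0, bytes.Length);
};

await EnsureCachedAsync(modelPath, fakeFetch); // first call fetches
await EnsureCachedAsync(modelPath, fakeFetch); // second call is a cache hit
Console.WriteLine($"fetches: {fetchCount}, cached: {File.Exists(modelPath)}");
// prints "fetches: 1, cached: True"
```

In a real downloader the `fetch` delegate would copy an HTTP response stream into `output`; the rename step stays the same.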

Quick Start

dotnet add package Mostlylucid.OcrNer

// Register services
builder.Services.AddOcrNer(builder.Configuration);

// Use the pipeline
var pipeline = serviceProvider.GetRequiredService<IOcrNerPipeline>();
var result = await pipeline.ProcessImageAsync("invoice.png");

foreach (var entity in result.NerResult.Entities)
{
    // entity.Label: "PER", "ORG", "LOC", "MISC"
    // entity.Text: "John Smith"
    // entity.Confidence: 0.9996
}
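
Once entities are available, a typical next step is to drop low-confidence hits and group the rest by label. The sketch below does this with plain LINQ over hand-written sample tuples whose fields mirror the `Label`, `Text`, and `Confidence` properties shown above. The sample values and the 0.5 threshold are illustrative, and the package's real entity type is deliberately not used so the snippet runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data standing in for result.NerResult.Entities (values are made up).
var entities = new List<(string Label, string Text, double Confidence)>
{
    ("PER", "John Smith", 0.9996),
    ("ORG", "Microsoft", 0.98),
    ("LOC", "Seattle", 0.97),
    ("MISC", "Q3", 0.31), // below the threshold, will be dropped
};

const double minConfidence = 0.5; // illustrative cut-off

// Keep confident entities and bucket their text by label.
var byLabel = entities
    .Where(e => e.Confidence >= minConfidence)
    .GroupBy(e => e.Label)
    .ToDictionary(g => g.Key, g => g.Select(e => e.Text).ToList());

foreach (var (label, texts) in byLabel)
    Console.WriteLine($"{label}: {string.Join(", ", texts)}");
```

With the real pipeline, the same query would presumably run over `result.NerResult.Entities` directly.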

Configuration

{
  "OcrNer": {
    "EnableOcr": true,
    "TesseractLanguage": "eng",
    "MinConfidence": 0.5,
    "MaxSequenceLength": 512,
    "ModelDirectory": "models/ocrner",
    "Preprocessing": "Default",
    "EnableAdvancedPreprocessing": false,
    "EnableRecognizers": false,
    "RecognizerCulture": "en-us"
  }
}

All settings have sensible defaults. The entire section can be omitted.
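
To make the shape of the `OcrNer` section concrete, the sketch below parses a trimmed copy of the JSON above with `System.Text.Json` and reads a few keys, falling back to a supplied value when a key is absent. The fallback values simply repeat the sample values; this page does not confirm that they are the library's actual defaults. The library itself reads the section through `AddOcrNer(builder.Configuration)`, not through code like this.

```csharp
using System;
using System.Text.Json;

var json = """
{
  "OcrNer": {
    "EnableOcr": true,
    "TesseractLanguage": "eng",
    "MinConfidence": 0.5,
    "EnableRecognizers": false
  }
}
""";

using var doc = JsonDocument.Parse(json);
var section = doc.RootElement.GetProperty("OcrNer");

// Read a key if present, otherwise use the fallback
// (assumed values, not the library's documented defaults).
string GetString(string key, string fallback) =>
    section.TryGetProperty(key, out var v) ? v.GetString() ?? fallback : fallback;
double GetDouble(string key, double fallback) =>
    section.TryGetProperty(key, out var v) ? v.GetDouble() : fallback;
bool GetBool(string key, bool fallback) =>
    section.TryGetProperty(key, out var v) ? v.GetBoolean() : fallback;

var language = GetString("TesseractLanguage", "eng");
var minConfidence = GetDouble("MinConfidence", 0.5);
var recognizers = GetBool("EnableRecognizers", false);
var culture = GetString("RecognizerCulture", "en-us"); // key omitted above, so the fallback applies

Console.WriteLine($"language={language}, recognizers={recognizers}, culture={culture}");
```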

Services

Service	What it does	Use when...
`INerService`	BERT NER from text	You already have text
`IOcrService`	Tesseract OCR from images	You need text from scans/screenshots
`IOcrNerPipeline`	OCR then NER in one call	You have images and want entities
`ITextRecognizerService`	Dates, phones, URLs, etc.	You want structured data alongside NER
`IVisionService`	Florence-2 captioning + OCR	You need image understanding

CLI Tool

A companion CLI is available at Mostlylucid.OcrNer.CLI:

ocrner "John Smith works at Microsoft in Seattle"
ocrner ocr invoice.png -o results.json
ocrner caption photo.jpg --ocr --ner

License

MIT

Product	Compatible and additional computed target framework versions.
.NET	net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.

Dependencies

net10.0
- Florence2 (>= 25.12.63049)
- Microsoft.Extensions.Configuration.Abstractions (>= 10.0.1)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 10.0.1)
- Microsoft.Extensions.Http (>= 10.0.1)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.1)
- Microsoft.Extensions.Options (>= 10.0.1)
- Microsoft.Extensions.Options.ConfigurationExtensions (>= 10.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.23.2)
- Microsoft.Recognizers.Text.DateTime (>= 1.8.10)
- Microsoft.Recognizers.Text.Number (>= 1.8.10)
- Microsoft.Recognizers.Text.Sequence (>= 1.8.10)
- OpenCvSharp4 (>= 4.10.0.20241108)
- OpenCvSharp4.runtime.win (>= 4.10.0.20241108)
- SixLabors.ImageSharp (>= 3.1.12)
- Tesseract (>= 5.2.0)
net9.0
- Florence2 (>= 25.12.63049)
- Microsoft.Extensions.Configuration.Abstractions (>= 10.0.1)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 10.0.1)
- Microsoft.Extensions.Http (>= 10.0.1)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.1)
- Microsoft.Extensions.Options (>= 10.0.1)
- Microsoft.Extensions.Options.ConfigurationExtensions (>= 10.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.23.2)
- Microsoft.Recognizers.Text.DateTime (>= 1.8.10)
- Microsoft.Recognizers.Text.Number (>= 1.8.10)
- Microsoft.Recognizers.Text.Sequence (>= 1.8.10)
- OpenCvSharp4 (>= 4.10.0.20241108)
- OpenCvSharp4.runtime.win (>= 4.10.0.20241108)
- SixLabors.ImageSharp (>= 3.1.12)
- Tesseract (>= 5.2.0)

Version	Downloads	Last Updated
1.1.0	127	2/9/2026
1.0.0	116	2/9/2026
0.0.1-alpha0	110	2/9/2026

1.0.0
Initial release of Mostlylucid.OcrNer.

Features:
- Tesseract OCR with ImageSharp preprocessing (grayscale, contrast, sharpen, upscale)
- BERT NER via ONNX Runtime (PER, ORG, LOC, MISC entities)
- Florence-2 vision model for image captioning and scene-text OCR
- OpenCV advanced preprocessing: deskew, denoise, binarization (opt-in via EnableAdvancedPreprocessing)
- Microsoft.Recognizers.Text: rule-based date, number, URL, phone, email, IP extraction (opt-in via EnableRecognizers)
- Automatic model download from HuggingFace/GitHub on first use
- Full DI integration via AddOcrNer() extension method
- All services registered as thread-safe singletons with lazy initialization
- Targets net9.0 and net10.0
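
The release notes mention thread-safe singletons with lazy initialization, but how the package achieves this is not shown here. A common .NET approach, sketched below as an assumption, is `Lazy<T>` in its default `ExecutionAndPublication` mode, which runs an expensive loader at most once even when many threads request the value at the same time. The `loadModel` delegate and its return value are placeholders.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var loadCount = 0;

// Placeholder for an expensive step such as loading an ONNX model from disk.
Func<string> loadModel = () =>
{
    Interlocked.Increment(ref loadCount);
    Thread.Sleep(50); // simulate slow initialization
    return "model-ready";
};

// Lazy<T> defaults to LazyThreadSafetyMode.ExecutionAndPublication:
// the factory runs once and every caller receives the same value.
var model = new Lazy<string>(loadModel);

// Nothing is loaded yet; the first .Value access triggers the load.
var results = await Task.WhenAll(
    Enumerable.Range(0, 8).Select(_ => Task.Run(() => model.Value)));

Console.WriteLine($"loads: {loadCount}, callers: {results.Length}");
// prints "loads: 1, callers: 8"
```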