Mostlylucid.OcrNer 1.1.0


Mostlylucid.OcrNer

Local-first OCR, Named Entity Recognition, and Vision captioning for .NET


One line of setup, zero manual model downloads. Every model downloads automatically on first use.


Features

  • Tesseract OCR - extract text from images with ImageSharp preprocessing
  • OpenCV preprocessing - deskew, denoise, and binarize degraded documents (opt-in)
  • BERT NER - extract people, organizations, locations, and miscellaneous entities via ONNX
  • Florence-2 Vision - local image captioning and scene-text OCR
  • Microsoft.Recognizers.Text - rule-based extraction of dates, numbers, URLs, phones, emails, IPs (opt-in)
  • Auto-download - models download from HuggingFace/GitHub on first use with atomic caching
  • Full DI integration - AddOcrNer() registers everything as singletons

Quick Start

dotnet add package Mostlylucid.OcrNer

// Register services
builder.Services.AddOcrNer(builder.Configuration);

// Use the pipeline (serviceProvider is the application's built IServiceProvider, e.g. app.Services)
var pipeline = serviceProvider.GetRequiredService<IOcrNerPipeline>();
var result = await pipeline.ProcessImageAsync("invoice.png");

foreach (var entity in result.NerResult.Entities)
{
    // entity.Label: "PER", "ORG", "LOC", "MISC"
    // entity.Text: "John Smith"
    // entity.Confidence: 0.9996
}
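As a rough illustration of what one might do with the entities from the loop above, the sketch below filters by confidence and groups the remaining entities by label. `EntitySketch` and `EntitySummary` are hypothetical stand-ins written for this example so that it runs without the package. Only the `Label`, `Text`, and `Confidence` members come from the snippet above, and the library's real entity type may look different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data shaped like the entities in the Quick Start loop.
var entities = new List<EntitySketch>
{
    new("PER", "John Smith", 0.9996),
    new("ORG", "Microsoft", 0.98),
    new("LOC", "Seattle", 0.97),
    new("MISC", "Q3", 0.42),
};

foreach (var line in EntitySummary.Summarize(entities, minConfidence: 0.5))
{
    Console.WriteLine(line);
}

// Stand-in for the library's entity type (an assumption, not the real API).
public record EntitySketch(string Label, string Text, double Confidence);

public static class EntitySummary
{
    // Drops entities below the threshold, then groups the rest by label,
    // keeping labels in order of first appearance.
    public static IEnumerable<string> Summarize(
        IEnumerable<EntitySketch> entities, double minConfidence) =>
        entities
            .Where(e => e.Confidence >= minConfidence)
            .GroupBy(e => e.Label)
            .Select(g => $"{g.Key}: {string.Join(", ", g.Select(e => e.Text))}");
}
```

With the sample data, the low-confidence MISC entity is dropped and one line is printed for each of PER, ORG, and LOC. In real code the same LINQ chain could presumably be applied to `result.NerResult.Entities` directly.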

Configuration

{
  "OcrNer": {
    "EnableOcr": true,
    "TesseractLanguage": "eng",
    "MinConfidence": 0.5,
    "MaxSequenceLength": 512,
    "ModelDirectory": "models/ocrner",
    "Preprocessing": "Default",
    "EnableAdvancedPreprocessing": false,
    "EnableRecognizers": false,
    "RecognizerCulture": "en-us"
  }
}

All settings have sensible defaults. The entire section can be omitted.
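To illustrate why omitted keys are harmless, here is a minimal sketch of the general pattern: property initializers supply defaults, and only the keys present in the JSON override them. `OcrNerSettingsSketch` is a hypothetical class written for this example, not the library's options type. It uses `System.Text.Json` rather than the configuration binding that `AddOcrNer` presumably relies on, and it assumes the values in the sample section above are the defaults, which the README does not state.

```csharp
using System;
using System.Text.Json;

// A partial section: only two keys are overridden, the rest are omitted.
const string partialSection = @"{ ""MinConfidence"": 0.7, ""EnableRecognizers"": true }";

var settings = JsonSerializer.Deserialize<OcrNerSettingsSketch>(partialSection)
    ?? throw new InvalidOperationException("Settings JSON was null.");

// Omitted keys keep their initializer values; present keys replace them.
Console.WriteLine($"TesseractLanguage = {settings.TesseractLanguage}");
Console.WriteLine($"MaxSequenceLength = {settings.MaxSequenceLength}");
Console.WriteLine($"EnableRecognizers = {settings.EnableRecognizers}");

// Hypothetical stand-in mirroring some keys of the "OcrNer" section.
// The initializer values are assumed defaults copied from the sample JSON.
public class OcrNerSettingsSketch
{
    public bool EnableOcr { get; set; } = true;
    public string TesseractLanguage { get; set; } = "eng";
    public double MinConfidence { get; set; } = 0.5;
    public int MaxSequenceLength { get; set; } = 512;
    public string ModelDirectory { get; set; } = "models/ocrner";
    public bool EnableRecognizers { get; set; } = false;
}
```

The same idea explains why the whole section can be left out: with no keys to apply, every property stays at its initializer value.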

Services

| Service | What it does | Use when... |
|---|---|---|
| INerService | BERT NER from text | You already have text |
| IOcrService | Tesseract OCR from images | You need text from scans/screenshots |
| IOcrNerPipeline | OCR then NER in one call | You have images and want entities |
| ITextRecognizerService | Dates, phones, URLs, etc. | You want structured data alongside NER |
| IVisionService | Florence-2 captioning + OCR | You need image understanding |

CLI Tool

A companion CLI is available at Mostlylucid.OcrNer.CLI:

ocrner "John Smith works at Microsoft in Seattle"
ocrner ocr invoice.png -o results.json
ocrner caption photo.jpg --ocr --ner

Documentation

License

MIT

Compatible target frameworks

net9.0 and net10.0 are compatible. For each of them, the platform-specific variants (android, browser, ios, maccatalyst, macos, tvos, windows) were computed.


| Version | Downloads | Last Updated |
|---|---|---|
| 1.1.0 | 127 | 2/9/2026 |
| 1.0.0 | 116 | 2/9/2026 |
| 0.0.1-alpha0 | 110 | 2/9/2026 |

1.0.0
Initial release of Mostlylucid.OcrNer.

Features:
- Tesseract OCR with ImageSharp preprocessing (grayscale, contrast, sharpen, upscale)
- BERT NER via ONNX Runtime (PER, ORG, LOC, MISC entities)
- Florence-2 vision model for image captioning and scene-text OCR
- OpenCV advanced preprocessing: deskew, denoise, binarization (opt-in via EnableAdvancedPreprocessing)
- Microsoft.Recognizers.Text: rule-based date, number, URL, phone, email, IP extraction (opt-in via EnableRecognizers)
- Automatic model download from HuggingFace/GitHub on first use
- Full DI integration via AddOcrNer() extension method
- All services registered as thread-safe singletons with lazy initialization
- Targets net9.0 and net10.0