Html2Text.Net 0.2.0

Install with the .NET CLI:

dotnet add package Html2Text.Net --version 0.2.0

Install from the Package Manager Console in Visual Studio (this command uses the NuGet module's version of Install-Package):

NuGet\Install-Package Html2Text.Net -Version 0.2.0

For projects that support PackageReference, copy this XML node into the project file to reference the package:

<PackageReference Include="Html2Text.Net" Version="0.2.0" />

For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package:

<PackageVersion Include="Html2Text.Net" Version="0.2.0" />

Then reference the package, without a version, in the project file:

<PackageReference Include="Html2Text.Net" />

Install with Paket:

paket add Html2Text.Net --version 0.2.0

The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or the source code of the script to reference the package:

#r "nuget: Html2Text.Net, 0.2.0"

The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package:

#:package Html2Text.Net@0.2.0

Install as a Cake Addin:

#addin nuget:?package=Html2Text.Net&version=0.2.0

Install as a Cake Tool:

#tool nuget:?package=Html2Text.Net&version=0.2.0

Html2Text.Net

Just fast HTML → plain text.

Lightweight, hand-rolled, high-performance HTML to plain text conversion for .NET.

Check out the LIVE DEMO!

Use cases

  • Search / indexing pipelines: Strip HTML down to text for full-text search, indexing, classification, or deduping.
    • Example: convert HTML to text before indexing in Elasticsearch / OpenSearch
  • Batch processing: Convert large archives of HTML (docs, KB articles, CMS exports) into text efficiently.
  • Email & notification processing: Get a readable text version of HTML emails for previews, logs, or plain-text fallbacks.
  • LLM / NLP preprocessing: Normalize HTML into clean text before chunking, embedding, or extraction.
  • Logging / auditing: Store a text representation of HTML content for review or compliance.

Usage

The simplest possible usage:

using Html2Text;

string html = "<h1>Hello</h1><p>World</p>";

string text = Html2Text.Convert(html);

Output:

Hello

World
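
For the batch-processing use case, the single call shown above can be wrapped in a loop over files. The following is a minimal sketch, not part of the library: `BatchConverter`, `ConvertDirectory`, and the folder names in the usage note are hypothetical, and the converter is passed in as a delegate so that the helper only assumes a `string -> string` function.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of Html2Text.Net.
static class BatchConverter
{
    // Converts every *.html file in inputDir and writes <name>.txt into outputDir.
    // 'convert' is the HTML-to-text function. Returns the number of files written.
    public static int ConvertDirectory(string inputDir, string outputDir, Func<string, string> convert)
    {
        Directory.CreateDirectory(outputDir); // no-op if the folder already exists
        int count = 0;
        foreach (string path in Directory.EnumerateFiles(inputDir, "*.html"))
        {
            string html = File.ReadAllText(path);
            string text = convert(html);
            string target = Path.Combine(outputDir, Path.GetFileNameWithoutExtension(path) + ".txt");
            File.WriteAllText(target, text);
            count++;
        }
        return count;
    }
}
```

With the package referenced, a call might look like `BatchConverter.ConvertDirectory("html-in", "text-out", html => Html2Text.Convert(html));`, where `html-in` and `text-out` are placeholder folder names.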

Install, build, test, contribute

NuGet

Install using NuGet (recommended):

dotnet add package Html2Text.Net

Supported frameworks

  • .NET 8+
  • .NET Framework 4.6.2+
  • .NET Standard 2.0 for compatibility with other frameworks, including .NET 5/6/7

For .NET Framework users, PackageReference-style dependencies are recommended. Also ensure that binding redirects are enabled.

Contributing

Contributions and pull requests are welcome! With the .NET 10 SDK installed, build locally with:

dotnet build

To run unit and regression tests:

On Windows:

dotnet test

On Linux/macOS:

dotnet test -f net10.0

To run the example console app:

dotnet build
dotnet run --project Html2Text.Example Samples/scottallen.html

How it works

Pipeline

HTML document -> Lexer (tokens) -> Parser (AST nodes) -> Renderer (string text)

  • Text nodes are emitted in document order.
  • Basic block separation is preserved (e.g., paragraphs/headings insert newlines).
  • Whitespace is normalized to produce readable plain text.
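
The three stages above can be illustrated with a toy version. This is a sketch under stated assumptions, not the library's internal code: every name here (`MiniPipeline`, `Token`, `Node`, the `Blocks` set) is hypothetical, and it ignores comments, entities, void elements, and script/style content.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical types for illustration only.
enum TokenKind { Text, Open, Close }

sealed class Token
{
    public TokenKind Kind; public string Value;
    public Token(TokenKind kind, string value) { Kind = kind; Value = value; }
}

sealed class Node
{
    public string Tag;                       // null for text nodes
    public string Text = "";
    public List<Node> Children = new List<Node>();
}

static class MiniPipeline
{
    static readonly HashSet<string> Blocks = new HashSet<string> { "p", "h1", "h2", "div" };

    // Lexer: split the input into tag tokens and text tokens.
    public static List<Token> Lex(string html)
    {
        var tokens = new List<Token>();
        int i = 0;
        while (i < html.Length)
        {
            int end = html[i] == '<' ? html.IndexOf('>', i) : -1;
            if (end >= 0)
            {
                string inner = html.Substring(i + 1, end - i - 1).Trim();
                bool closing = inner.StartsWith("/", StringComparison.Ordinal);
                string name = inner.Trim('/').Split(' ')[0].ToLowerInvariant();
                tokens.Add(new Token(closing ? TokenKind.Close : TokenKind.Open, name));
                i = end + 1;
            }
            else
            {
                // Text runs until the next '<' (or the end of input).
                int next = html.IndexOf('<', i + 1);
                if (next < 0) next = html.Length;
                tokens.Add(new Token(TokenKind.Text, html.Substring(i, next - i)));
                i = next;
            }
        }
        return tokens;
    }

    // Parser: build a tree with a stack; a close tag pops the current element
    // without checking that the names match (a deliberate simplification).
    public static Node Parse(List<Token> tokens)
    {
        var root = new Node { Tag = "#root" };
        var stack = new Stack<Node>();
        stack.Push(root);
        foreach (Token t in tokens)
        {
            if (t.Kind == TokenKind.Text)
                stack.Peek().Children.Add(new Node { Text = t.Value });
            else if (t.Kind == TokenKind.Open)
            {
                var el = new Node { Tag = t.Value };
                stack.Peek().Children.Add(el);
                stack.Push(el);
            }
            else if (stack.Count > 1)
                stack.Pop();
        }
        return root;
    }

    // Renderer: emit text in document order, collapse whitespace runs,
    // and add a blank line after each block element.
    public static string Render(Node root)
    {
        var sb = new StringBuilder();
        Walk(root, sb);
        return sb.ToString().Trim();
    }

    static void Walk(Node n, StringBuilder sb)
    {
        if (n.Tag == null) { sb.Append(Regex.Replace(n.Text, @"\s+", " ")); return; }
        foreach (Node child in n.Children) Walk(child, sb);
        if (Blocks.Contains(n.Tag)) sb.Append("\n\n");
    }
}
```

Calling `MiniPipeline.Render(MiniPipeline.Parse(MiniPipeline.Lex("<h1>Hello</h1><p>World</p>")))` yields `Hello`, a blank line, then `World`. Keeping the stages separate means each one can be tested on its own, which is the main reason to structure even a toy this way.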

Minimal formatting is added to keep the plain text output readable, in only four cases:

  1. HTML tables are given cell separators | and horizontal lines --- under column headers:

| Chart                  | Record Holder     | Record       |
| ---------------------- | ----------------- | ------------ |
| Opening Days           | Avengers: Endgame | $157,461,641 |
| Top Single Day Grosses | Avengers: Endgame | $157,461,641 |

  2. Lists and nested lists are indented and given a leading -, like so:

 - 1 Early life
 - 2 Enigma machine
 - 3 Solving the wiring
 - Toggle Solving the wiring subsection
   - 3.1 French help
 - 4 Solving daily settings
 - Toggle Solving daily settings subsection
   - 4.1 Early methods
   - 4.2 Bomba and sheets
   - 4.3 Allies informed

  3. In preformatted areas (<pre>), whitespace is preserved:

private int GetSmallestNonNegative(int x, int y) {
    return x < 0 && y < 0 ? 0
        : x < 0 ? y
        : y < 0 ? x
        : Math.Min(x, y);
}

  4. The <hr/> element adds a horizontal line of dashes ---.
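
To make the table layout concrete, here is a small sketch that reproduces the same pipe-and-dash format from rows of cell text. It is an illustration of the output format only, not the library's renderer: `TableSketch` is a hypothetical name, the first row is assumed to be the header, and every row is assumed to have the same number of single-line cells.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of Html2Text.Net.
static class TableSketch
{
    // rows[0] is treated as the header row; all rows must have the same cell count.
    public static string Render(IReadOnlyList<string[]> rows)
    {
        int cols = rows[0].Length;
        var widths = new int[cols];

        // Each column is as wide as its longest cell, and at least 3 so "---" fits.
        foreach (string[] row in rows)
            for (int c = 0; c < cols; c++)
                widths[c] = Math.Max(Math.Max(widths[c], row[c].Length), 3);

        var sb = new StringBuilder();
        sb.Append(Line(rows[0], widths)).Append('\n');
        // Separator row: dashes spanning the full width of each column.
        sb.Append(Line(widths.Select(w => new string('-', w)).ToArray(), widths)).Append('\n');
        foreach (string[] row in rows.Skip(1))
            sb.Append(Line(row, widths)).Append('\n');
        return sb.ToString();
    }

    // Pads each cell to its column width and joins the cells with " | ".
    static string Line(string[] cells, int[] widths) =>
        "| " + string.Join(" | ", cells.Select((cell, c) => cell.PadRight(widths[c]))) + " |";
}
```

Passing the header and rows from the example above (`Chart`, `Record Holder`, `Record`, and so on) as a `string[][]` produces the same aligned layout, because column widths are computed from the longest cell before any row is written.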

Goals

This project is focused on:

  • High performance: designed for low allocations and fast throughput.
  • Text extraction only: get the words from the page/document.
  • No dependencies: lightweight, with nothing required beyond .NET itself; it is not an embedded browser engine.

Non-goals (by design)

The following are intentionally out of scope so the library can excel at the goals above:

  • Respecting CSS, computed styles, display:none, or visibility.
  • Pixel-accurate layout, whitespace mirroring, or browser-equivalent rendering.
  • Executing JavaScript or loading remote resources.

Performance notes

High performance is a goal of this project. This library:

  • is designed for converting many documents quickly (batch processing, indexing, search pipelines).
  • avoids DOM dependencies.
  • uses a lightweight, hand-rolled lexer/parser/renderer pipeline.

Benchmarks are in Html2Text.PerfTests and can be run locally with:

dotnet run -c Release --project Html2Text.PerfTests

Or check out the latest automated perf test results here: https://pavlosmcg.github.io/Html2Text.Net/dev/bench/

Regression tests

Each file in the Samples/ directory acts as an acceptance/regression test. The results of converting these HTML files to plain text are saved in Html2Text.RegressionTests/*.verified.txt:

Samples/<file-name>.html -> Html2Text.Convert(<file-contents>) -> <file-name>.verified.txt

For example: scottallen.html -> scottallen.verified.txt

Html2Text.RegressionTests uses Verify to make test assertions against verified output snapshots. If you need to update the outputs, see the Verify docs for snapshot management.

Projects in this repository

  • Html2Text/: core library
  • Html2Text.Example/: small example app
  • Html2Text.Tests/: unit tests
  • Html2Text.RegressionTests/: regression/acceptance tests
  • Html2Text.PerfTests/: performance benchmarking console app
  • Samples/: sample HTML files used during development and automated regression testing

Distributed under MPL-2.0; see LICENSE.txt.

Compatible and additional computed target framework versions

Frameworks marked "compatible" are included in the package; those marked "computed" are additional target frameworks computed as compatible.

  • .NET
    • compatible: net8.0, net10.0
    • computed: net5.0, net5.0-windows, net6.0, net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows
  • .NET Core
    • computed: netcoreapp2.0, netcoreapp2.1, netcoreapp2.2, netcoreapp3.0, netcoreapp3.1
  • .NET Standard
    • compatible: netstandard2.0
    • computed: netstandard2.1
  • .NET Framework
    • compatible: net462
    • computed: net461, net463, net47, net471, net472, net48, net481
  • MonoAndroid
    • computed: monoandroid
  • MonoMac
    • computed: monomac
  • MonoTouch
    • computed: monotouch
  • Tizen
    • computed: tizen40, tizen60
  • Xamarin.iOS
    • computed: xamarinios
  • Xamarin.Mac
    • computed: xamarinmac
  • Xamarin.TVOS
    • computed: xamarintvos
  • Xamarin.WatchOS
    • computed: xamarinwatchos

  • .NETFramework 4.6.2

  • .NETStandard 2.0

  • net10.0

    • No dependencies.
  • net8.0

    • No dependencies.

Version Downloads Last Updated
0.2.0 38 1/27/2026
0.1.1 46 1/25/2026
0.1.0 45 1/24/2026