Docxodus 6.4.0

dotnet add package Docxodus --version 6.4.0
                    
NuGet\Install-Package Docxodus -Version 6.4.0
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Docxodus" Version="6.4.0" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="Docxodus" Version="6.4.0" />
                    
Directory.Packages.props
<PackageReference Include="Docxodus" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add Docxodus --version 6.4.0
                    
#r "nuget: Docxodus, 6.4.0"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package Docxodus@6.4.0
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=Docxodus&version=6.4.0
                    
Install as a Cake Addin
#tool nuget:?package=Docxodus&version=6.4.0
                    
Install as a Cake Tool

<p align="center"> <img src="docxodus-mono-final.svg" alt="Docxodus" width="400"> </p>

<p align="center"> <strong>A powerful Typescript, Python and .NET library for manipulating Open XML documents (DOCX, XLSX, PPTX).</strong> </p>

<p align="center"> <a href="https://github.com/JSv4/Docxodus/actions/workflows/ci.yml"><img src="https://github.com/JSv4/Docxodus/actions/workflows/ci.yml/badge.svg" alt="CI"></a> <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a> </p>


Docxodus is a fork of Open-Xml-PowerTools upgraded to .NET 8.0. It provides tools for comparing Word documents, converting between DOCX and HTML, projecting DOCX to anchor-addressed markdown for LLM pipelines, programmatically editing DOCX content via a stateful session API, merging documents, and more.

Quick Start

Install the Library

# Install from NuGet
dotnet add package Docxodus

Using as a Library

using Docxodus;

// Compare documents
var original = new WmlDocument("original.docx");
var modified = new WmlDocument("modified.docx");

var settings = new WmlComparerSettings
{
    AuthorForRevisions = "Redline",
    DetailThreshold = 0
};

var result = WmlComparer.Compare(original, modified, settings);

// Get list of revisions (with move detection)
var revisions = WmlComparer.GetRevisions(result, settings);
foreach (var rev in revisions)
{
    if (rev.RevisionType == WmlComparer.WmlComparerRevisionType.Moved)
        Console.WriteLine($"Moved (group {rev.MoveGroupId}): {rev.Text}");
    else
        Console.WriteLine($"{rev.RevisionType}: {rev.Text}");
}

// Save the redlined document
result.SaveAs("redline.docx");

CLI Tools

Docxodus includes two command-line tools:

Redline (Document Comparison)

# Install globally
dotnet tool install -g Redline

# Usage
redline original.docx modified.docx output.docx

# With custom author tag
redline original.docx modified.docx output.docx --author="Legal Review"
Option Description
--author=<name> Author name for tracked changes (default: "Redline")
-h, --help Show help message
-v, --version Show version information

docx2html (HTML Conversion)

# Install globally
dotnet tool install -g Docx2Html

# Basic conversion
docx2html document.docx

# Specify output file
docx2html document.docx output.html

# Extract images to files instead of embedding as base64
docx2html document.docx --extract-images

# Use inline styles instead of CSS classes
docx2html document.docx --inline-styles
Option Description
--title=<text> Page title (default: document title or filename)
--css-prefix=<text> CSS class prefix (default: "pt-")
--inline-styles Use inline styles instead of CSS classes
--extract-images Save images to separate files instead of embedding
-h, --help Show help message
-v, --version Show version information

Download Standalone Binaries

Pre-built binaries are available on the Releases page:

redline (Document Comparison):

Platform Download
Windows (x64) redline-win-x64.exe
Linux (x64) redline-linux-x64
macOS (x64) redline-osx-x64
macOS (ARM) redline-osx-arm64

docx2html (HTML Conversion):

Platform Download
Windows (x64) docx2html-win-x64.exe
Linux (x64) docx2html-linux-x64
macOS (x64) docx2html-osx-x64
macOS (ARM) docx2html-osx-arm64

Build from Source

# Clone the repository
git clone https://github.com/JSv4/Docxodus.git
cd Docxodus

# Build
dotnet build Docxodus.sln

# Run the CLI
dotnet run --project tools/redline/redline.csproj -- --help

Testing

.NET Unit Tests

# Run all tests (~1,100 tests)
dotnet test Docxodus.Tests/Docxodus.Tests.csproj

# Run specific test by name
dotnet test --filter "FullyQualifiedName~WC001"

# Run tests for a specific class
dotnet test --filter "FullyQualifiedName~WmlComparerTests"

npm/WASM Browser Tests (Playwright)

# Need to be in npm subdirectory
cd npm

# Install dependencies (first time only)
npm install
npx playwright install chromium

# Build WASM and TypeScript (required before tests)
npm run build

# Run all Playwright tests (~62 tests)
npm test

# Run specific test by name pattern
npx playwright test --grep "Document Structure"

# Run tests with browser visible
npx playwright test --headed

# TypeScript type checking
npx tsc --noEmit

Features

  • WmlComparer - Compare two DOCX files and generate redlines with tracked changes
    • Move Detection - Automatically detects when content is relocated (not just deleted and re-inserted)
    • Format Change Detection - Detects formatting-only changes (bold, italic, font size, etc.)
    • Configurable similarity threshold and minimum word count
    • Links move pairs via MoveGroupId for easy tracking
  • WmlToHtmlConverter / HtmlToWmlConverter - Bidirectional DOCX ↔ HTML conversion
    • Comment rendering (endnote-style, inline, or margin)
    • Paginated output mode for PDF-like viewing
    • Headers, footers, footnotes, and endnotes support
    • Custom annotation rendering
  • WmlToMarkdownConverter - Anchor-addressed markdown projection of a DOCX with stable per-block IDs - a text view suitable for LLM editing pipelines, structured search indexers, and diff/review UIs
  • DocxSession - Stateful in-memory DOCX editor keyed by markdown-projection anchor ids - the write-side counterpart to WmlToMarkdownConverter for agentic editing pipelines
    • Text and structural edits: replace paragraph text, delete blocks, insert/split/merge paragraphs, change paragraph style, adjust list level, replace table-cell content
    • Character-range formatting (bold, italic, underline, strike, code, color, run style) addressed by substring, span, or a prior search match
    • Surgical text replacement that preserves per-run formatting on either side of the rewritten slice - including a span-addressed variant that lets identical placeholders in one paragraph each receive distinct values
    • Cross-run text search with per-fragment run breakdown (Grep), plus a cross-block variant (GrepCrossBlock) that lets a single match span adjacent paragraphs, headings, and list items
    • Template-slot enumeration that classifies bracketed regions as value blanks, alternative clauses, or drafter hints
    • Anchor discovery by text, regex, kind, bookmark, annotation id, or shared label - so an agent told to "edit the indemnification clause" can resolve intent to anchors without re-walking the document
    • NBSP / smart-quote handling so common Word whitespace and punctuation don't sabotage literal find/replace
    • Tracked-change mode that lands every mutation as w:ins / w:del instead of accepted edits
    • Bounded snapshot undo/redo
    • Raw OOXML escape hatch for content the markdown subset can't express (charts, equations, content controls)
    • Typed result envelope on every call - no exceptions across the API boundary
    • Available in .NET, WASM, and an npm TypeScript wrapper
  • DocumentBuilder - Merge and split DOCX files
  • DocumentAssembler - Template population from XML data
  • PresentationBuilder - Merge and split PPTX files
  • SpreadsheetWriter - Simplified XLSX creation API
  • OpenXmlRegex - Search/replace in DOCX/PPTX using regular expressions
  • OpenContractExporter - Export documents to OpenContracts format for NLP/document analysis
  • Supporting utilities for document manipulation

Browser/JavaScript Usage (npm)

Docxodus is also available as an npm package for client-side usage via WebAssembly:

npm install docxodus
import {
  initialize,
  convertDocxToHtml,
  compareDocuments,
  getRevisions,
  getDocumentMetadata,
  isMove,
  isMoveSource,
  isFormatChange,
  findMovePair,
  CommentRenderMode,
  PaginationMode
} from 'docxodus';

await initialize();

// Convert DOCX to HTML with comments and pagination
const html = await convertDocxToHtml(docxFile, {
  commentRenderMode: CommentRenderMode.EndnoteStyle,
  paginationMode: PaginationMode.Paginated,
  renderHeadersAndFooters: true
});

// Compare two documents
const redlinedDocx = await compareDocuments(originalFile, modifiedFile);

// Get revisions with move and format change detection
const revisions = await getRevisions(redlinedDocx);
for (const rev of revisions) {
  if (isMove(rev)) {
    const pair = findMovePair(rev, revisions);
    if (isMoveSource(rev)) {
      console.log(`Content moved from: "${rev.text}" to: "${pair?.text}"`);
    }
  } else if (isFormatChange(rev)) {
    console.log(`Format changed: ${rev.formatChange?.changedPropertyNames?.join(', ')}`);
  }
}

// Get document metadata for lazy loading
const metadata = await getDocumentMetadata(docxFile);
console.log(`${metadata.totalParagraphs} paragraphs, ${metadata.estimatedPageCount} pages`);

See the npm package documentation for full API reference, React hooks, and usage examples.

Requirements

  • .NET 8.0 or later

License

MIT License - see LICENSE for details.


Built on the shoulders of Open-Xml-PowerTools. Thanks to Eric White, Thomas Barnekow, and all original contributors.

Product Compatible and additional computed target framework versions.
.NET net8.0 is compatible.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed.  net9.0 was computed.  net9.0-android was computed.  net9.0-browser was computed.  net9.0-ios was computed.  net9.0-maccatalyst was computed.  net9.0-macos was computed.  net9.0-tvos was computed.  net9.0-windows was computed.  net10.0 was computed.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
6.4.0 95 5/31/2026
6.3.0 95 5/30/2026
6.2.0 112 5/29/2026
6.1.0 109 5/28/2026
6.0.0 104 5/25/2026
5.5.4 102 5/24/2026
5.5.3 165 3/21/2026
5.5.2 93 3/21/2026
5.5.1 99 3/21/2026
5.5.0 108 3/16/2026
5.4.2 170 1/26/2026
5.4.1 145 1/21/2026
5.4.0 216 12/24/2025
5.3.0 175 12/21/2025
5.2.0 157 12/6/2025
5.1.2 220 12/4/2025