Docx2OC 5.4.2

Global install:
dotnet tool install --global Docx2OC --version 5.4.2

Local install (create a tool manifest first if you are setting up this repo):
dotnet new tool-manifest
dotnet tool install --local Docx2OC --version 5.4.2

Cake:
#tool dotnet:?package=Docx2OC&version=5.4.2

NUKE:
nuke :add-package Docx2OC --version 5.4.2

This package contains a .NET tool you can call from the shell/command line.

docx2oc

A command-line tool to export Word documents (.docx) to the OpenContracts format.

Installation

dotnet tool install --global Docx2OC

Build from Source

cd tools/docx2oc
dotnet build

Usage

# Export with default output filename (input.oc)
docx2oc contract.docx

# Export with custom output filename
docx2oc contract.docx export.json
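
To convert many files, the two invocation forms above can be wrapped in a small script. The following is a minimal sketch in Python, assuming docx2oc is installed and on PATH and that it reports failure with a nonzero exit code (an assumption; exit codes are not documented here). The helper names build_command and convert_all are illustrative and not part of the tool.

```python
import subprocess
from pathlib import Path


def build_command(input_path, output_path=None):
    """Return the argv list for one docx2oc invocation.

    With no output path, the tool chooses its default output filename.
    """
    cmd = ["docx2oc", str(input_path)]
    if output_path is not None:
        cmd.append(str(output_path))
    return cmd


def convert_all(directory, runner=subprocess.run):
    """Run docx2oc on every .docx file directly inside `directory`.

    Each output is written next to its input with a .json suffix.
    Returns the inputs whose run ended with a nonzero exit code
    (assumed here to mean failure).
    """
    failed = []
    for docx in sorted(Path(directory).glob("*.docx")):
        result = runner(build_command(docx, docx.with_suffix(".json")))
        if result.returncode != 0:
            failed.append(docx)
    return failed
```

Calling convert_all("contracts") would process each .docx file in a hypothetical contracts directory. The runner parameter exists so the loop can be exercised with a stub instead of the real tool.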

Output Format

The output is a JSON file in the OpenContracts format, containing:

  • title: Document title from core properties
  • content: Complete extracted text (body, headers, footers, footnotes, endnotes)
  • description: Document description/subject if available
  • pageCount: Estimated page count
  • pawlsFileContent: PAWLS-format page layout with token positions
  • labelledText: Structural annotations (SECTION, PARAGRAPH, TABLE)
  • relationships: Hierarchical relationships between annotations
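
A consumer may want to see which of the fields listed above a given file actually carries before processing it. The sketch below assumes the output is plain JSON with these keys at the top level. Whether every key is always present is not stated here (description is listed as "if available"), so the helper only reports absences and does not reject the file. The names EXPECTED_FIELDS, missing_fields, and report are illustrative.

```python
import json

# Top-level keys named in the field list above.
EXPECTED_FIELDS = (
    "title",
    "content",
    "description",
    "pageCount",
    "pawlsFileContent",
    "labelledText",
    "relationships",
)


def missing_fields(document):
    """Return the expected top-level keys absent from a parsed export."""
    return [name for name in EXPECTED_FIELDS if name not in document]


def report(path):
    """Parse a docx2oc output file and list the expected keys it lacks."""
    with open(path, encoding="utf-8") as handle:
        document = json.load(handle)
    return missing_fields(document)
```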

Example Output

{
  "title": "Sample Contract",
  "content": "This is the document content...",
  "pageCount": 5,
  "pawlsFileContent": [
    {
      "page": { "width": 612, "height": 792, "index": 0 },
      "tokens": [
        { "x": 72, "y": 72, "width": 30, "height": 12, "text": "This" }
      ]
    }
  ],
  "labelledText": [
    {
      "id": "section-0",
      "annotationLabel": "SECTION",
      "structural": true
    }
  ]
}
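
As a rough illustration of reading this structure, the sketch below parses the example above and pulls out the per-page token text and the annotation labels. It relies only on the keys visible in the example; the helper names page_text and labels are hypothetical, and joining tokens with single spaces is a simplification that ignores the original layout.

```python
import json

# The example output from above, embedded so the sketch is self-contained.
SAMPLE = """
{
  "title": "Sample Contract",
  "content": "This is the document content...",
  "pageCount": 5,
  "pawlsFileContent": [
    {
      "page": { "width": 612, "height": 792, "index": 0 },
      "tokens": [
        { "x": 72, "y": 72, "width": 30, "height": 12, "text": "This" }
      ]
    }
  ],
  "labelledText": [
    {
      "id": "section-0",
      "annotationLabel": "SECTION",
      "structural": true
    }
  ]
}
"""


def page_text(entry):
    """Join the token texts of one pawlsFileContent entry with spaces."""
    return " ".join(token["text"] for token in entry["tokens"])


def labels(document):
    """Return (id, annotationLabel) pairs for every labelledText entry."""
    return [(a["id"], a["annotationLabel"]) for a in document.get("labelledText", [])]


document = json.loads(SAMPLE)
for entry in document["pawlsFileContent"]:
    print(entry["page"]["index"], page_text(entry))
print(labels(document))
```

For the example data this prints the page index 0 with the single token "This", followed by the one SECTION annotation.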

OpenContracts Compatibility

This tool produces output compatible with the OpenContracts document analysis platform. The format includes:

  • Complete text extraction from all document parts
  • PAWLS-compatible token positions for NLP/ML pipelines
  • Structural annotations for document understanding
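
Layout-aware pipelines often expect token boxes scaled to the page instead of absolute coordinates. The following is a sketch of that scaling, assuming x and y give one corner of each token box in the same units as the page width and height. The origin and units are not stated here, so treat that as an assumption. The function names are illustrative.

```python
def normalize_token(token, page):
    """Scale one token box to the 0..1 range using the page size."""
    width, height = page["width"], page["height"]
    return {
        "text": token["text"],
        "x0": token["x"] / width,
        "y0": token["y"] / height,
        "x1": (token["x"] + token["width"]) / width,
        "y1": (token["y"] + token["height"]) / height,
    }


def normalize_page(entry):
    """Normalize every token of one pawlsFileContent entry."""
    return [normalize_token(token, entry["page"]) for token in entry["tokens"]]
```

Applied to a token at x=72 with width 30 on a page 612 units wide, the box spans 72/612 to 102/612 horizontally.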

Environment Variables

  • DOCX2OC_DEBUG=1: Show detailed error information including stack traces
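
When the tool is launched from a script, the variable can be set for that one child process only. A minimal sketch, assuming docx2oc is on PATH; debug_env and run_with_debug are hypothetical helper names.

```python
import os
import subprocess


def debug_env(base=None):
    """Return a copy of the environment with DOCX2OC_DEBUG set to "1"."""
    env = dict(os.environ if base is None else base)
    env["DOCX2OC_DEBUG"] = "1"
    return env


def run_with_debug(input_path, runner=subprocess.run):
    """Invoke docx2oc on one file with detailed error output enabled."""
    return runner(["docx2oc", str(input_path)], env=debug_env())
```

Copying the environment keeps the setting out of the parent process, so later commands run without debug output.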

Product: .NET
Compatible: net8.0
Computed (additional target framework versions): net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows

This package has no dependencies.

Version   Downloads   Last Updated
5.4.2     110         1/26/2026
5.4.1     111         1/21/2026
5.4.0     172         12/24/2025
5.3.0     155         12/21/2025
5.2.0     133         12/6/2025
5.1.2     190         12/4/2025