Docx2OC 5.4.2
dotnet tool install --global Docx2OC --version 5.4.2
This package contains a .NET tool you can call from the shell/command line.
dotnet new tool-manifest
dotnet tool install --local Docx2OC --version 5.4.2
#tool dotnet:?package=Docx2OC&version=5.4.2
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
nuke :add-package Docx2OC --version 5.4.2
docx2oc
A command-line tool to export Word documents (.docx) to the OpenContracts format.
Installation
As a .NET Tool (recommended)
dotnet tool install --global Docx2OC
Build from Source
cd tools/docx2oc
dotnet build
Usage
# Export with default output filename (input.oc)
docx2oc contract.docx
# Export with custom output filename
docx2oc contract.docx export.json
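For converting many documents at once, the two invocations above can be wrapped in a short script. The following is a minimal sketch, not part of Docx2OC: `build_command` and `convert_all` are hypothetical helper names, and the sketch assumes only that `docx2oc` is on the PATH, accepts the `input [output]` arguments shown above, and exits with a non-zero status on failure. It always passes an explicit output path, so it does not depend on the default output name.

```python
import subprocess
from pathlib import Path


def build_command(docx_path, out_path=None):
    """Return the argument list for one docx2oc invocation."""
    cmd = ["docx2oc", str(docx_path)]
    if out_path is not None:
        cmd.append(str(out_path))
    return cmd


def convert_all(folder, out_dir):
    """Convert every .docx file in `folder`, writing one JSON file per document."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for docx in sorted(Path(folder).glob("*.docx")):
        target = out_dir / (docx.stem + ".json")
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumed here to be how docx2oc signals failure).
        subprocess.run(build_command(docx, target), check=True)
        written.append(target)
    return written
```

Calling `convert_all("contracts", "exports")` would then run the tool once per document and return the list of output paths.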
Output Format
The output is a JSON file in the OpenContracts format, containing:
- title: Document title from core properties
- content: Complete extracted text (body, headers, footers, footnotes, endnotes)
- description: Document description/subject if available
- pageCount: Estimated page count
- pawlsFileContent: PAWLS-format page layout with token positions
- labelledText: Structural annotations (SECTION, PARAGRAPH, TABLE)
- relationships: Hierarchical relationships between annotations
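To see how these fields fit together, an export can be loaded with any JSON parser. The sketch below uses only Python's standard library; `summarize` and `summarize_file` are hypothetical names, and the code assumes that `labelledText` and `relationships` are JSON arrays and that `description` may be missing.

```python
import json


def summarize(doc):
    """Return a compact summary of a parsed export (a dict)."""
    labels = {}
    for annotation in doc.get("labelledText", []):
        label = annotation.get("annotationLabel", "UNKNOWN")
        labels[label] = labels.get(label, 0) + 1
    return {
        "title": doc.get("title"),
        "description": doc.get("description"),  # None when absent
        "pageCount": doc.get("pageCount"),
        "characters": len(doc.get("content", "")),
        "annotations": labels,  # count of annotations per label
        "relationships": len(doc.get("relationships", [])),
    }


def summarize_file(path):
    """Load an export from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize(json.load(handle))
```

For example, `summarize_file("export.json")` reports the title, page count, text length, and how many SECTION, PARAGRAPH, and TABLE annotations the file contains.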
Example Output
{
  "title": "Sample Contract",
  "content": "This is the document content...",
  "pageCount": 5,
  "pawlsFileContent": [
    {
      "page": { "width": 612, "height": 792, "index": 0 },
      "tokens": [
        { "x": 72, "y": 72, "width": 30, "height": 12, "text": "This" }
      ]
    }
  ],
  "labelledText": [
    {
      "id": "section-0",
      "annotationLabel": "SECTION",
      "structural": true
    }
  ]
}
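A consumer can walk the page layout in the example directly. The sketch below rebuilds the text of each page from its tokens and flags tokens whose box extends past the page. The function names are hypothetical, and the bounds check assumes that token coordinates share the page's units and that `x` and `y` mark the corner of the box closest to the page origin.

```python
def page_texts(pawls_pages):
    """Join token text per page, in listed order, keyed by page index."""
    texts = {}
    for entry in pawls_pages:
        words = [token["text"] for token in entry["tokens"]]
        texts[entry["page"]["index"]] = " ".join(words)
    return texts


def tokens_outside_page(pawls_pages):
    """List (page index, token text) pairs whose box leaves the page."""
    outside = []
    for entry in pawls_pages:
        page = entry["page"]
        for token in entry["tokens"]:
            right = token["x"] + token["width"]
            bottom = token["y"] + token["height"]
            if (token["x"] < 0 or token["y"] < 0
                    or right > page["width"] or bottom > page["height"]):
                outside.append((page["index"], token["text"]))
    return outside


# The pawlsFileContent value from the example output above.
pages = [
    {
        "page": {"width": 612, "height": 792, "index": 0},
        "tokens": [
            {"x": 72, "y": 72, "width": 30, "height": 12, "text": "This"}
        ],
    }
]

print(page_texts(pages))           # {0: 'This'}
print(tokens_outside_page(pages))  # []
```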
OpenContracts Compatibility
This tool produces output compatible with the OpenContracts document analysis platform. The format includes:
- Complete text extraction from all document parts
- PAWLS-compatible token positions for NLP/ML pipelines
- Structural annotations for document understanding
Environment Variables
- DOCX2OC_DEBUG=1: Show detailed error information, including stack traces
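When the tool is launched from a script, the variable can be set for that one process only. The following is a minimal sketch using Python's standard library; `debug_env` and `run_debug` are hypothetical helper names. The README does not say whether the extra detail goes to standard output or standard error, so the sketch captures both.

```python
import os
import subprocess


def debug_env(base=None):
    """Return a copy of the environment with DOCX2OC_DEBUG=1 added."""
    env = dict(os.environ if base is None else base)
    env["DOCX2OC_DEBUG"] = "1"
    return env


def run_debug(docx_path):
    """Run docx2oc with debug output enabled and return the CompletedProcess."""
    return subprocess.run(
        ["docx2oc", str(docx_path)],
        env=debug_env(),      # the parent process environment is not modified
        capture_output=True,  # collect both stdout and stderr
        text=True,
    )
```

After `result = run_debug("contract.docx")`, the captured text is available as `result.stdout` and `result.stderr`, and `result.returncode` holds the exit status.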
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.