Isdoc.Parser
0.1.3
ISDOC Parser
Isdoc.Parser is a robust, high-performance .NET library designed to extract, parse, and validate ISDOC (Information System Document) payloads. It supports electronic invoices, common documents, and manifests in ISDOC/ISDOC.PDF formats.
The library is designed for seamless integration: it bundles all official ISDOC 6.0, 6.0.1, and 6.0.2 schemas as embedded resources (no external file dependencies required), features a secure XML reader to guard against XXE/XML bomb vulnerabilities, and provides both simple standalone APIs and full Dependency Injection support.
Key Features
- Multi-Format Extraction: Extract and parse embedded ISDOC payloads directly from PDF files/streams, standalone XML files, or raw byte arrays.
- Embedded XSD Validation: Automatically resolves and validates incoming XML against bundled schemas (6.0, 6.0.1, and 6.0.2, with or without XMLDSIG digital signatures).
- Strongly-Typed Document Model: Maps parsed XML elements into a rich, structured .NET record hierarchy (IsdocInvoice, IsdocCommonDocument, IsdocManifest).
- Resilient Parsing: Preserves unmapped or custom XML children as XElement nodes via PreservedChildren collections, preventing data loss.
- Convenient Summary: Quick access to essential business fields (DocumentId, SupplierName, PayableAmount, etc.) through a simplified IsdocSummary record.
- Enterprise-Grade Security: Uses a hardened XML reader that disables external entity resolution (XXE) and restricts DTD processing.
- Thread-Safe: The default parser implementation is stateless and thread-safe, so it can be registered as a singleton in web applications.
Installation
Add the NuGet package to your project:
dotnet add package Isdoc.Parser
Quick Start
1. Simple Instantiation (No DI)
For simple console applications, utilities, or quick prototypes, instantiate IsdocParser directly:
using Isdoc.Parser;
// Instantiate the parser (uses default production dependencies)
var parser = new IsdocParser();
// Parse an ISDOC-embedded PDF file
IsdocParseResult result = parser.ParsePdfFile("invoice.pdf");
if (result.IsIsdoc && result.Document != null)
{
    var summary = result.Document.Summary;
    Console.WriteLine("Successfully parsed ISDOC Document!");
    Console.WriteLine($"ID: {summary.DocumentId}");
    Console.WriteLine($"Supplier: {summary.SupplierName}");
    Console.WriteLine($"Payable Amount: {summary.PayableAmount} {summary.CurrencyCode}");
}
else
{
    Console.WriteLine("Failed to parse or validate ISDOC document:");
    foreach (var error in result.Errors)
    {
        Console.WriteLine($"- {error}");
    }
}
2. Dependency Injection Setup
In modern ASP.NET Core or generic host applications, register the parser using the provided extension method:
using Isdoc.Parser;
using Microsoft.Extensions.DependencyInjection;
var services = new ServiceCollection();
// Registers IIsdocParser and all its underlying pipeline services as Singletons
services.AddIsdocParser();
var serviceProvider = services.BuildServiceProvider();
var parser = serviceProvider.GetRequiredService<IIsdocParser>();
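Once the parser is registered, application code would normally receive it through constructor injection rather than resolving it by hand. The following is a minimal sketch of that pattern. InvoiceImporter and the file name are hypothetical, and it assumes ParseXmlFile returns the same IsdocParseResult shown in the Quick Start.

```csharp
using System;
using Isdoc.Parser;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddIsdocParser();
services.AddTransient<InvoiceImporter>(); // hypothetical consumer, defined below

using var provider = services.BuildServiceProvider();
var importer = provider.GetRequiredService<InvoiceImporter>();
importer.Import("invoice.isdoc"); // hypothetical path

// Hypothetical application service that receives the parser via constructor injection.
public sealed class InvoiceImporter
{
    private readonly IIsdocParser _parser;

    public InvoiceImporter(IIsdocParser parser) => _parser = parser;

    public void Import(string xmlPath)
    {
        // Assumption: ParseXmlFile returns IsdocParseResult, like ParsePdfFile.
        IsdocParseResult result = _parser.ParseXmlFile(xmlPath);
        if (!result.IsIsdoc || result.Document is null)
        {
            foreach (var error in result.Errors)
            {
                Console.WriteLine($"- {error}");
            }
            return;
        }

        Console.WriteLine($"Imported {result.Document.Summary.DocumentId}");
    }
}
```

Because the parser is registered as a singleton and described as stateless, the consumer can use any lifetime without capturing per-request state.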
Supported APIs
IIsdocParser provides synchronous and asynchronous methods for flexible integration:
PDF Input
- ParsePdfFile(string pdfPath) / ParsePdfFileAsync(string pdfPath, CancellationToken ct)
- ParsePdfStream(Stream pdfStream) / ParsePdfStreamAsync(Stream pdfStream, CancellationToken ct)
XML / ISDOC Input
- ParseXmlFile(string xmlPath) / ParseXmlFileAsync(string xmlPath, CancellationToken ct)
- ParseXmlBytes(byte[] xmlBytes, string? sourceName = null)
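As a rough illustration of how the asynchronous and in-memory entry points listed above might be combined, the sketch below parses a PDF with a timeout and then parses XML bytes that are already loaded. It assumes the async overloads return a Task<IsdocParseResult>, which the signatures above do not state. The file names are placeholders.

```csharp
using System;
using System.IO;
using System.Threading;
using Isdoc.Parser;

var parser = new IsdocParser();

// Give up if parsing takes longer than 30 seconds.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

// Assumption: the async overloads return Task<IsdocParseResult>.
IsdocParseResult pdfResult = await parser.ParsePdfFileAsync("invoice.pdf", cts.Token);
Console.WriteLine($"PDF contains ISDOC: {pdfResult.IsIsdoc}");

// Parse XML that is already in memory (for example, downloaded or read from storage).
// sourceName is the optional second parameter of ParseXmlBytes.
byte[] xmlBytes = await File.ReadAllBytesAsync("invoice.isdoc", cts.Token);
IsdocParseResult xmlResult = parser.ParseXmlBytes(xmlBytes, sourceName: "invoice.isdoc");

foreach (var error in xmlResult.Errors)
{
    Console.WriteLine($"- {error}");
}
```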
Understanding the Document Model
When parsing succeeds, IsdocParseResult contains an IsdocDocument with three main layers:
- Metadata (IsdocMetadata): Technical payload characteristics.
  - DocumentKind (Invoice, CommonDocument, Manifest, Unknown)
  - Version (e.g., "6.0.2")
  - ContainsXmlSignature (boolean)
- Summary (IsdocSummary): Flattened business representation for easy access.
  - DocumentId, SupplierName, CustomerName, PayableAmount, CurrencyCode, IssueDate, etc.
- Content (IsdocContent): Strongly-typed object models representing the parsed XML.
  - For Invoice, this is IsdocInvoice.
Deep Dive: Accessing Invoice Details
When result.Document.Content is an IsdocInvoice, you can access detailed structures:
// result.Document can be null when parsing fails, so use a null-conditional access
if (result.Document?.Content is IsdocInvoice invoice)
{
    // 1. Header Information
    Console.WriteLine($"Issue Date: {invoice.IssueDate}");
    Console.WriteLine($"Local Currency: {invoice.LocalCurrencyCode}");

    // 2. Party Details (Supplier / Customer)
    if (invoice.AccountingSupplierParty?.Party is IsdocParty supplier)
    {
        Console.WriteLine($"Supplier Name: {supplier.PartyName.Name}");
        Console.WriteLine($"Street: {supplier.PostalAddress.StreetName} {supplier.PostalAddress.BuildingNumber}");
        Console.WriteLine($"City: {supplier.PostalAddress.CityName}");
        Console.WriteLine($"Tax ID (DIC): {supplier.PartyTaxSchemes.FirstOrDefault()?.CompanyId}");
    }

    // 3. Line Items
    Console.WriteLine("Line Items:");
    foreach (var line in invoice.InvoiceLines)
    {
        Console.WriteLine($"- Item: {line.Item?.Description}");
        Console.WriteLine($"  Quantity: {line.InvoicedQuantity?.Value} {line.InvoicedQuantity?.UnitCode}");
        Console.WriteLine($"  Line Amount (Excl. VAT): {line.LineExtensionAmount}");
        Console.WriteLine($"  VAT Rate: {line.ClassifiedTaxCategory.Percent}%");
    }

    // 4. Totals
    if (invoice.LegalMonetaryTotal is IsdocLegalMonetaryTotal totals)
    {
        Console.WriteLine($"Tax Exclusive Amount: {totals.TaxExclusiveAmount}");
        Console.WriteLine($"Tax Inclusive Amount: {totals.TaxInclusiveAmount}");
        Console.WriteLine($"Payable Amount: {totals.PayableAmount}");
    }

    // 5. Payment Details
    if (invoice.PaymentMeans?.Payments.FirstOrDefault() is IsdocPayment payment)
    {
        Console.WriteLine($"Paid Amount: {payment.PaidAmount}");
        Console.WriteLine($"Payment Method: {payment.PaymentMeansCode}");
        if (payment.Details?.BankAccount is IsdocBankAccount bankAccount)
        {
            Console.WriteLine($"IBAN: {bankAccount.Iban}");
            Console.WriteLine($"BIC: {bankAccount.Bic}");
            Console.WriteLine($"Variable Symbol: {payment.Details.VariableSymbol}");
        }
    }
}
Preserving Unmapped Data
If the input XML contains elements that are not yet bound to properties in the IsdocInvoice record model, they are kept in the PreservedChildren collection of their parent, with the original XElement retained. No data is lost when schemas are extended or custom tags are used, and you can query the preserved elements with LINQ to XML:
foreach (var child in invoice.PreservedChildren)
{
    if (child.Element.Name.LocalName == "MyCustomExtension")
    {
        Console.WriteLine(child.Element.Value);
    }
}
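For nested extension content, a small LINQ to XML helper can be more convenient than a manual loop. The sketch below uses only System.Xml.Linq and runs against a stand-in list of XElement instances. The element names, namespace, and the FindValue helper are invented for illustration and are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Stand-in for the XElement instances exposed through PreservedChildren
// (names and namespace are made up for this example).
XNamespace ext = "urn:example:extensions";
var preserved = new List<XElement>
{
    new XElement(ext + "MyCustomExtension",
        new XElement(ext + "CostCenter", "CC-42"),
        new XElement(ext + "Approver", "j.novak")),
    new XElement(ext + "OtherExtension", "ignored"),
};

// Hypothetical helper: matches on LocalName so the lookup works
// regardless of which namespace the extension elements live in.
static string? FindValue(IEnumerable<XElement> elements, string parent, string child) =>
    elements
        .Where(e => e.Name.LocalName == parent)
        .Elements()
        .FirstOrDefault(e => e.Name.LocalName == child)
        ?.Value;

Console.WriteLine(FindValue(preserved, "MyCustomExtension", "CostCenter"));
```

With the library, the input sequence would be built from the preserved entries, for example by projecting each entry to its Element as in the loop above.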
Under the Hood
The parser operates via a pipeline of discrete components registered in the Dependency Injection container:
flowchart TD
    PDF[PDF File/Stream] -->|PdfPigIsdocPdfExtractor| XMLBytes[ISDOC XML Bytes]
    XMLBytes -->|SafeIsdocXmlReader| Doc[XDocument]
    XML[XML File/Bytes] -->|SafeIsdocXmlReader| Doc
    Doc -->|IsdocMetadataReader| Meta[IsdocMetadata]
    Meta -->|IsdocSchemaValidator| Valid[Validated XDocument]
    Valid -->|IsdocSummaryReader| Sum[IsdocSummary]
    Valid -->|IsdocContentReader| Content[IsdocContent]
    subgraph Catalog[Embedded Schema Catalog]
        Valid
    end
- IIsdocPdfExtractor (PdfPigIsdocPdfExtractor): Extracts embedded XML payload attachments from PDF.
- IIsdocXmlReader (SafeIsdocXmlReader): Hardens and reads raw XML bytes safely into an XDocument.
- IIsdocMetadataReader (IsdocMetadataReader): Identifies document kind, version, and signature flag from the root element.
- IIsdocSchemaValidator (IsdocSchemaValidator): Validates the XDocument against the correct schema from IIsdocSchemaCatalog.
- IIsdocSummaryReader (IsdocSummaryReader): Extracts high-level business fields.
- IIsdocContentReader (IsdocContentReader): Recursively parses the validated XML into the final object model.
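To make the hardened-reader step more concrete, here is a minimal sketch of how XML bytes can be loaded into an XDocument with DTD processing and external resolution turned off, using only standard System.Xml APIs. It is not the actual SafeIsdocXmlReader implementation. The ReadXmlSafely helper and the entity limit are assumptions chosen for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper (not the library's code): loads XML with DTDs
// and external resource resolution disabled.
static XDocument ReadXmlSafely(byte[] xmlBytes)
{
    var settings = new XmlReaderSettings
    {
        DtdProcessing = DtdProcessing.Prohibit, // reject any <!DOCTYPE ...> declaration
        XmlResolver = null,                     // never fetch external resources
        MaxCharactersFromEntities = 1024,       // cap entity expansion as a second safeguard
    };

    using var stream = new MemoryStream(xmlBytes);
    using var reader = XmlReader.Create(stream, settings);
    return XDocument.Load(reader);
}

// A well-formed payload without a DTD loads normally.
byte[] safe = Encoding.UTF8.GetBytes("<Invoice><ID>2024-001</ID></Invoice>");
Console.WriteLine(ReadXmlSafely(safe).Root!.Name.LocalName);

// A payload that declares a DTD (the vehicle for XXE and entity bombs) is rejected.
byte[] hostile = Encoding.UTF8.GetBytes(
    "<!DOCTYPE x [<!ENTITY e SYSTEM \"file:///etc/passwd\">]><x>&e;</x>");
try
{
    ReadXmlSafely(hostile);
}
catch (XmlException ex)
{
    Console.WriteLine($"Rejected: {ex.Message}");
}
```

Prohibiting DTDs outright is the strictest option. It is a reasonable fit for this sketch because the sample payloads carry no DOCTYPE declaration.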
License
This project is licensed under the MIT License. See the LICENSE file for details.
| Product | Versions (compatible and additional computed target frameworks) |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net10.0
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 10.0.8)
  - PdfPig (>= 0.1.14)