txt2obj 1.0.4

txt2obj

txt2obj turns unstructured text into strongly typed objects by running a tree of regular expression nodes over the input. Each node can capture data, move it into variables, post-process the value (uppercase, replace, etc.), and finally assign it to a property on your model. This keeps all parsing rules in one declarative structure instead of scattering regex calls throughout your code.

Highlights

  • Tree-based templates - compose complex parsers by nesting Node instances that mirror the layout of the target object graph.
  • Variable plumbing - capture portions of text into named variables, reuse them in descendant nodes, and mutate them with setters.
  • Typed projection - final output is deserialized into any CLR type via Newtonsoft.Json, so primitives, complex objects, and collections are supported out of the box.
  • Processing pipeline - attach dot-chained processors (ToUpper, Replace, ToLower, or custom ones) to normalize data before assignment.
  • Date formatting helpers - standardize captured strings into ISO-8601 DateTime values using the Format property.
  • Collection handling - child nodes can emit repeated matches that are automatically grouped into lists or arrays based on your model.

Installation

dotnet add package txt2obj

The package targets .NET 10.0, .NET Standard 2.0, and .NET Framework 4.5, and depends on Newtonsoft.Json.

Quick Start

Consider parsing a till slip that contains a timestamp and repeating line items. Here is the example slip and the template that extracts it. Each node includes a Comment so templates remain self-documenting:

var rawSlipText = @"DATE: 2020-01-02 16:22:23
LINE ITEMS START --->
Jar of cookies 22 23.55 518.10
Cigarettes 1 10.00 10.00
<--- LINE ITEMS END";

var template = new Node.Node
{
    Comment = "Slip root",
    ChildNodes = new List<Node.Node>
    {
        // Transaction timestamp
        new Node.Node
        {
            Comment = "Capture transaction timestamp",
            Pattern = @"(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}:\d{2}:\d{2})",
            ChildNodes = new List<Node.Node>
            {
                new Node.Node
                {
                    Comment = "Store date portion in timestamp variable (date only)",
                    TargetVariable = "timestamp",
                    FromVariable = "date"
                },
                new Node.Node
                {
                    Comment = "Append time portion",
                    TargetVariable = "timestamp",
                    FromVariable = "time",
                    Setter = "|OLD| |NEW|" // append time to the stored date
                },
                new Node.Node
                {
                    Comment = "Assign parsed timestamp",
                    Target = "TransactionTime",
                    FromVariable = "timestamp",
                    Format = "yyyy-MM-dd HH:mm:ss"
                }
            }
        },
        // Line items collection
        new Node.Node
        {
            Comment = "Capture line items block",
            Pattern = "LINE ITEMS START --->(?<items>.*?)<--- LINE ITEMS END",
            Target = "LineItems",
            ChildNodes = new List<Node.Node>
            {
                new Node.Node
                {
                    Comment = "Parse one line item row",
                    // Lazy description followed by an explicit space, so qty captures the whole number
                    Pattern = @"(?<desc>[^\n\r]+?) (?<qty>\d+) (?<unit>\d+\.\d{2}) (?<total>\d+\.\d{2})",
                    ChildNodes = new List<Node.Node>
                    {
                        new Node.Node { Comment = "Item description", Target = "Description", FromVariable = "desc" },
                        new Node.Node { Comment = "Item quantity", Target = "Quantity", FromVariable = "qty" },
                        new Node.Node { Comment = "Item unit price", Target = "UnitPrice", FromVariable = "unit" },
                        new Node.Node { Comment = "Item line total", Target = "LineTotal", FromVariable = "total" }
                    }
                }
            }
        }
    }
};

var parser = new Parser.Parser();
var result = parser.Text2Object<SlipModel>(template, rawSlipText).Result;

The result variable now holds a fully populated SlipModel, with no manual iteration over regex matches.
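The Quick Start refers to SlipModel without showing it. The following is a minimal sketch of what such a model could look like, assuming the property names simply mirror the Target values used in the template. The class name SlipLineItem and all member types (DateTime, int, decimal, List) are assumptions made for illustration; the model in txt2obj.demo may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model for the Quick Start template; not taken from the package.
public class SlipModel
{
    // Populated by the node with Target = "TransactionTime"
    public DateTime TransactionTime { get; set; }

    // Populated by the collection node with Target = "LineItems"
    public List<SlipLineItem> LineItems { get; set; } = new List<SlipLineItem>();
}

// One element per match of the "Parse one line item row" node (assumed name)
public class SlipLineItem
{
    public string Description { get; set; }
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal LineTotal { get; set; }
}
```

Because the final projection goes through Newtonsoft.Json, the captured strings for Quantity, UnitPrice, and LineTotal would be expected to convert to numeric members, but that depends on the serializer settings the library uses.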

Check the txt2obj.demo project for a runnable version that loads slip1.txt and shows how to build a template.

Templates in YAML

If you prefer to store the node structure outside code, you can serialize it in YAML and load it at runtime.

Comment: Slip root
ChildNodes:
  - Comment: Capture transaction timestamp
    Pattern: "(?<date>\\d{4}-\\d{2}-\\d{2}) (?<time>\\d{2}:\\d{2}:\\d{2})"
    ChildNodes:
      - Comment: Store date portion
        TargetVariable: timestamp
        FromVariable: date
      - Comment: Append time portion
        TargetVariable: timestamp
        FromVariable: time
        Setter: "|OLD| |NEW|"
      - Comment: Assign parsed timestamp
        Target: TransactionTime
        FromVariable: timestamp
        Format: "yyyy-MM-dd HH:mm:ss"
  - Comment: Capture line items block
    Pattern: "LINE ITEMS START --->(?<items>.*?)<--- LINE ITEMS END"
    Target: LineItems
    ChildNodes:
      - Comment: Parse one line item row
        Pattern: "(?<desc>[^\\n\\r]+?) (?<qty>\\d+) (?<unit>\\d+\\.\\d{2}) (?<total>\\d+\\.\\d{2})"
        ChildNodes:
          - Comment: Item description
            Target: Description
            FromVariable: desc
          - Comment: Item quantity
            Target: Quantity
            FromVariable: qty
          - Comment: Item unit price
            Target: UnitPrice
            FromVariable: unit
          - Comment: Item line total
            Target: LineTotal
            FromVariable: total
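
// dotnet add package YamlDotNet
using System.IO;
using YamlDotNet.Serialization;

// Read the YAML template from disk and map it onto the Node tree
var deserializer = new DeserializerBuilder().Build();
var yaml = File.ReadAllText("template.yaml");
// Fully qualified type names resolve regardless of the enclosing namespace
var template = deserializer.Deserialize<txt2obj.Node.Node>(yaml);

// rawSlipText is the slip text from the Quick Start
var parser = new txt2obj.Parser.Parser();
var result = parser.Text2Object<SlipModel>(template, rawSlipText).Result;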

Node Property Reference

  • Pattern - .NET System.Text.RegularExpressions.Regex (compiled with RegexOptions.Singleline) applied to the current input. Named groups are written to variables. When omitted, the node simply forwards the incoming text (or FromVariable).
  • Target - Property or field on the target object to populate. Leave blank to act as a helper node (e.g., for variable manipulation) without writing to the output.
  • ChildNodes - Nested nodes that receive the current node's output text for complex properties. Collection handling has its own rules (see below).
  • FromVariable - Pulls the node input from an existing variable instead of using the latest match. Great for reusing captured values.
  • TargetVariable - Stores the node output into a named variable. Combine with Setter to append/prepend.
  • Setter - Template used when TargetVariable already exists. |OLD| is replaced with the previous value, |NEW| with the current one.
  • Constant - Overrides the node input with the provided literal string (useful for seeding variables).
  • Process - Dot-separated processor chain (e.g., "Replace(hello,goodbye).ToUpper()"). Parameters are comma-separated; spaces are not supported (use 0x20 for space). The syntax matches what StringProcessorHolder.CreateProcessorList expects.
  • Format - Optional input format string used when writing to DateTime members. Values are converted to ISO 8601 prior to assignment.
  • Comment - Documentation-only field; helpful when serializing templates for sharing.
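
The Format entry describes parsing a captured string with a custom input format and normalizing it to ISO 8601. The sketch below shows that idea with standard .NET calls only; it is not the library's code, and the choice of the "s" (sortable) output pattern is an assumption, since the exact ISO 8601 variant used internally is not stated here.

```csharp
using System;
using System.Globalization;

// Captured text and the node's Format value from the Quick Start
string captured = "2020-01-02 16:22:23";
string format = "yyyy-MM-dd HH:mm:ss";

// Parse strictly with the supplied format, independent of the machine's culture
DateTime parsed = DateTime.ParseExact(captured, format, CultureInfo.InvariantCulture);

// "s" is the sortable ISO 8601 pattern (assumed variant, for illustration)
string iso = parsed.ToString("s", CultureInfo.InvariantCulture);
Console.WriteLine(iso); // 2020-01-02T16:22:23
```

DateTime.ParseExact throws a FormatException when the text does not match the format, so a Format value that disagrees with the captured text is likely to surface as a parsing error rather than a silently wrong date.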

Working with Variables

Variables travel down the node tree so that deep descendants can reuse matches captured by their ancestors. Typical workflow:

  1. Capture text with a named group (e.g., (?<sku>ABC\d+)).
  2. Store it by setting TargetVariable = "sku".
  3. Use it later via FromVariable = "sku" to populate multiple targets or keep building with setters.

Node.SetVariable searches up the parent chain, so updating a variable anywhere automatically updates the shared instance.
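
The store-then-append workflow relies on the Setter template. As a rough mental model, it can be pictured as plain placeholder substitution. ApplySetter below is a hypothetical helper written for this illustration, not part of the txt2obj API, and the fallback of returning the new value when no setter or previous value exists is an assumption.

```csharp
using System;

// Hypothetical helper: combines the previous and the incoming variable value
// according to a Setter template. Not part of the txt2obj API.
string ApplySetter(string setter, string oldValue, string newValue)
{
    // Assumed behavior: without a setter or a previous value, the new value wins
    if (string.IsNullOrEmpty(setter) || oldValue == null) return newValue;
    return setter.Replace("|OLD|", oldValue).Replace("|NEW|", newValue);
}

// Mirrors the Quick Start: the date is stored first, then the time is appended
string timestamp = ApplySetter(null, null, "2020-01-02");
timestamp = ApplySetter("|OLD| |NEW|", timestamp, "16:22:23");
Console.WriteLine(timestamp); // 2020-01-02 16:22:23
```

Swapping the placeholders ("|NEW| |OLD|") would prepend instead of append, which is the other use the property reference mentions.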

String Processors

Processors are lightweight classes that implement IStringProcessor. Three ship with the package:

  • ToUpper()
  • ToLower()
  • Replace(old,new)

Attach them using the Process property ("Replace(hello,goodbye).ToUpper()"). Internally the StringProcessorHolder parses the expression, instantiates each processor, injects the parameters (hex escapes such as 0x20 are supported), and runs them sequentially.
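
To make the chain syntax concrete, here is a small stand-alone sketch that parses a Process expression and applies the three built-in operations in order. It does not use or reproduce StringProcessorHolder; RunChain and DecodeHex are hypothetical names, and details such as invariant-culture casing and two-digit hex escapes are assumptions.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical decoder: turns hex escapes such as 0x20 into the character they encode
string DecodeHex(string arg) =>
    Regex.Replace(arg, "0x([0-9A-Fa-f]{2})",
        m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());

// Hypothetical runner: applies each "Name(args)" call in the chain from left to right
string RunChain(string chain, string input)
{
    foreach (Match call in Regex.Matches(chain, @"(?<name>\w+)\((?<args>[^)]*)\)"))
    {
        // Comma-separated parameters, decoded after splitting
        string[] args = call.Groups["args"].Value
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(a => DecodeHex(a))
            .ToArray();

        switch (call.Groups["name"].Value)
        {
            case "ToUpper": input = input.ToUpperInvariant(); break;
            case "ToLower": input = input.ToLowerInvariant(); break;
            case "Replace": input = input.Replace(args[0], args[1]); break;
            default: throw new ArgumentException("Unknown processor: " + call.Groups["name"].Value);
        }
    }
    return input;
}

Console.WriteLine(RunChain("Replace(hello,goodbye).ToUpper()", "hello world")); // GOODBYE WORLD
Console.WriteLine(RunChain("Replace(_,0x20)", "jar_of_cookies"));               // jar of cookies
```

The second call shows why the 0x20 escape exists: a literal space cannot appear in the parameter list, so it is written as a hex code and decoded before the processor runs.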

Custom Processors

public class TrimProcessor : IStringProcessor
{
    public string Name => "Trim";
    public string[] Parameters { get; set; }
    public string Execute(string input) => input.Trim();
}

var parser = new Parser.Parser();
parser.RegisterProcessor(new TrimProcessor());

Once registered, Process = "Trim()" becomes available to every node handled by that parser instance.

Collections & Complex Types

When a node targets a complex property (non-collection), its nested nodes are applied to the node's output text and the results are assigned to that property. If the property is an IEnumerable, array, or List, txt2obj automatically:

  1. Determines the element type via reflection.
  2. Runs each child node independently against the collection node's input text (not the collection node's own Pattern/Process output).
  3. Orders matches by their index in the source text so the items stay aligned with the input sequence.

See txt2obj.test/CollectionTest.cs for multiple patterns that fill lists and arrays from different regex strategies.
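
Step 3 of the list above (ordering by match index) can be illustrated with plain Regex calls. This is a sketch of the idea rather than the library's implementation; the sample text and both patterns are made up for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string text = "apple 3\nSKU-77\nbanana 5\nSKU-12";

// Two hypothetical child patterns, each run independently over the same input
string[] patterns = { @"SKU-\d+", @"[a-z]+ \d+" };

// Merge all matches, then sort them by their position in the source text
var ordered = patterns
    .SelectMany(p => Regex.Matches(text, p).Cast<Match>())
    .OrderBy(m => m.Index)
    .Select(m => m.Value)
    .ToList();

Console.WriteLine(string.Join(" | ", ordered));
// apple 3 | SKU-77 | banana 5 | SKU-12
```

Without the OrderBy step, all SKU matches would come before the fruit matches, because the patterns are evaluated one after another; sorting by Match.Index restores the order in which the items appear in the input.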

Running the Demo & Tests

# Demo (prints parsed slip to the console)
dotnet run --project txt2obj.demo

# Test suite
dotnet test

Version  Downloads  Last Updated
1.0.4    97         12/29/2025
1.0.3    93         12/29/2025
1.0.2    689        7/13/2020
1.0.1    563        7/13/2020
1.0.0    714        4/5/2019