txt2obj 1.0.4
txt2obj
txt2obj turns unstructured text into strongly typed objects by running a tree of regular expression nodes over the input. Each node can capture data, move it into variables, post-process the value (uppercase, replace, etc.), and finally assign it to a property on your model. This keeps all parsing rules in one declarative structure instead of scattering regex calls throughout your code.
Highlights
- Tree-based templates - compose complex parsers by nesting Node instances that mirror the layout of the target object graph.
- Variable plumbing - capture portions of text into named variables, reuse them in descendant nodes, and mutate them with setters.
- Typed projection - final output is deserialized into any CLR type via Newtonsoft.Json, so primitives, complex objects, and collections are supported out of the box.
- Processing pipeline - attach dot-chained processors (ToUpper, Replace, ToLower, or custom ones) to normalize data before assignment.
- Date formatting helpers - standardize captured strings into ISO-8601 DateTime values using the Format property.
- Collection handling - child nodes can emit repeated matches that are automatically grouped into lists or arrays based on your model.
Installation
dotnet add package txt2obj
The package targets .NET 10.0, .NET Standard 2.0, and .NET Framework 4.5, and depends on Newtonsoft.Json.
Quick Start
Consider parsing a till slip that contains a timestamp and repeating line items. Here is the example slip and the template that extracts it. Each node includes a Comment so templates remain self-documenting:
var rawSlipText = @"DATE: 2020-01-02 16:22:23
LINE ITEMS START --->
Jar of cookies 22 23.55 518.10
Cigarettes 1 10.00 10.00
<--- LINE ITEMS END";
var template = new Node.Node
{
    Comment = "Slip root",
    ChildNodes = new List<Node.Node>
    {
        // Transaction timestamp
        new Node.Node
        {
            Comment = "Capture transaction timestamp",
            Pattern = @"(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}:\d{2}:\d{2})",
            ChildNodes = new List<Node.Node>
            {
                new Node.Node
                {
                    Comment = "Store date portion in timestamp variable (date only)",
                    TargetVariable = "timestamp",
                    FromVariable = "date"
                },
                new Node.Node
                {
                    Comment = "Append time portion",
                    TargetVariable = "timestamp",
                    FromVariable = "time",
                    Setter = "|OLD| |NEW|" // append time to the stored date
                },
                new Node.Node
                {
                    Comment = "Assign parsed timestamp",
                    Target = "TransactionTime",
                    FromVariable = "timestamp",
                    Format = "yyyy-MM-dd HH:mm:ss"
                }
            }
        },
        // Line items collection
        new Node.Node
        {
            Comment = "Capture line items block",
            Pattern = "LINE ITEMS START --->(?<items>.*?)<--- LINE ITEMS END",
            Target = "LineItems",
            ChildNodes = new List<Node.Node>
            {
                new Node.Node
                {
                    Comment = "Parse one line item row",
                    // Lazy description group: a greedy one would swallow all but the last digit of the quantity
                    Pattern = @"(?<desc>[^\n\r]+?) (?<qty>\d+) (?<unit>\d+\.\d{2}) (?<total>\d+\.\d{2})",
                    ChildNodes = new List<Node.Node>
                    {
                        new Node.Node { Comment = "Item description", Target = "Description", FromVariable = "desc" },
                        new Node.Node { Comment = "Item quantity", Target = "Quantity", FromVariable = "qty" },
                        new Node.Node { Comment = "Item unit price", Target = "UnitPrice", FromVariable = "unit" },
                        new Node.Node { Comment = "Item line total", Target = "LineTotal", FromVariable = "total" }
                    }
                }
            }
        }
    }
};
var parser = new Parser.Parser();
var result = parser.Text2Object<SlipModel>(template, rawSlipText).Result;
result is a fully populated SlipModel, and you never had to manually iterate the regex matches.
Check the txt2obj.demo project for a runnable version that loads slip1.txt and shows how to build a template.
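The SlipModel type itself is not shown in this README. A minimal sketch, assuming its members simply mirror the Target names used in the template (TransactionTime, LineItems, Description, Quantity, UnitPrice, LineTotal). The LineItem class name and the chosen CLR types are assumptions and may differ from the demo project:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: member names follow the template's Target values.
public class SlipModel
{
    public DateTime TransactionTime { get; set; }
    public List<LineItem> LineItems { get; set; } = new List<LineItem>();
}

// Hypothetical element type for the LineItems collection.
public class LineItem
{
    public string Description { get; set; }
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal LineTotal { get; set; }
}
```

Because the final projection goes through Newtonsoft.Json, property names need to match the Target values for the captured data to land on the model.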
Templates in YAML
If you prefer to store the node structure outside code, you can serialize it in YAML and load it at runtime.
Comment: Slip root
ChildNodes:
  - Comment: Capture transaction timestamp
    Pattern: "(?<date>\\d{4}-\\d{2}-\\d{2}) (?<time>\\d{2}:\\d{2}:\\d{2})"
    ChildNodes:
      - Comment: Store date portion
        TargetVariable: timestamp
        FromVariable: date
      - Comment: Append time portion
        TargetVariable: timestamp
        FromVariable: time
        Setter: "|OLD| |NEW|"
      - Comment: Assign parsed timestamp
        Target: TransactionTime
        FromVariable: timestamp
        Format: "yyyy-MM-dd HH:mm:ss"
  - Comment: Capture line items block
    Pattern: "LINE ITEMS START --->(?<items>.*?)<--- LINE ITEMS END"
    Target: LineItems
    ChildNodes:
      - Comment: Parse one line item row
        # Lazy description group: a greedy one would swallow all but the last digit of the quantity
        Pattern: "(?<desc>[^\\n\\r]+?) (?<qty>\\d+) (?<unit>\\d+\\.\\d{2}) (?<total>\\d+\\.\\d{2})"
        ChildNodes:
          - Comment: Item description
            Target: Description
            FromVariable: desc
          - Comment: Item quantity
            Target: Quantity
            FromVariable: qty
          - Comment: Item unit price
            Target: UnitPrice
            FromVariable: unit
          - Comment: Item line total
            Target: LineTotal
            FromVariable: total
// dotnet add package YamlDotNet
using System.IO;
using YamlDotNet.Serialization;
using txt2obj.Node;
var deserializer = new DeserializerBuilder().Build();
var yaml = File.ReadAllText("template.yaml");
var template = deserializer.Deserialize<Node.Node>(yaml);
var parser = new Parser.Parser();
var result = parser.Text2Object<SlipModel>(template, rawSlipText).Result;
Node Property Reference
| Property | Purpose |
|---|---|
| Pattern | .NET System.Text.RegularExpressions.Regex (compiled with RegexOptions.Singleline) applied to the current input. Named groups are written to variables. When omitted, the node simply forwards the incoming text (or FromVariable). |
| Target | Property or field on the target object to populate. Leave blank to act as a helper node (e.g., for variable manipulation) without writing to the output. |
| ChildNodes | Nested nodes that receive the current node's output text for complex properties. Collection handling has its own rules (see below). |
| FromVariable | Pulls the node input from an existing variable instead of using the latest match. Great for reusing captured values. |
| TargetVariable | Stores the node output into a named variable. Combine with Setter to append/prepend. |
| Setter | Template used when TargetVariable already exists. \|OLD\| is replaced with the previous value, \|NEW\| with the current one. |
| Constant | Overrides the node input with the provided literal string (useful for seeding variables). |
| Process | Dot-separated processor chain (e.g., "Replace(hello,goodbye).ToUpper()"). Parameters are comma-separated; spaces are not supported (use 0x20 for space). The syntax matches what StringProcessorHolder.CreateProcessorList expects. |
| Format | Optional input format string used when writing to DateTime members. Values are converted to ISO 8601 prior to assignment. |
| Comment | Documentation-only field; helpful when serializing templates for sharing. |
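The Setter and Format rows describe simple string transformations. The sketch below reproduces the described behaviour with plain .NET calls only. TemplateSketch, ApplySetter, and ToIso8601 are hypothetical helper names that are not part of txt2obj, and the library's actual implementation may differ:

```csharp
using System;
using System.Globalization;

// Hypothetical helpers that mimic the documented Setter and Format behaviour.
public static class TemplateSketch
{
    // Substitute the previous and the current value into a Setter template.
    public static string ApplySetter(string setter, string oldValue, string newValue)
    {
        return setter.Replace("|OLD|", oldValue).Replace("|NEW|", newValue);
    }

    // Parse text using an explicit input format and emit a sortable ISO 8601 string.
    public static string ToIso8601(string text, string format)
    {
        DateTime parsed = DateTime.ParseExact(text, format, CultureInfo.InvariantCulture);
        return parsed.ToString("s", CultureInfo.InvariantCulture);
    }
}
```

With the Quick Start values, ApplySetter("|OLD| |NEW|", "2020-01-02", "16:22:23") gives "2020-01-02 16:22:23", and passing that string to ToIso8601 with the format "yyyy-MM-dd HH:mm:ss" gives "2020-01-02T16:22:23".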
Working with Variables
Variables travel down the node tree so that deep descendants can reuse matches captured by their ancestors. Typical workflow:
- Capture text with a named group (e.g., (?<sku>ABC\d+)).
- Store it by setting TargetVariable = "sku".
- Use it later via FromVariable = "sku" to populate multiple targets or keep building with setters.
Node.SetVariable searches up the parent chain, so updating a variable anywhere automatically updates the shared instance.
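The parent-chain behaviour can be pictured as a chain of scopes. This is an illustrative sketch only: VariableScope and its members are hypothetical names and do not reflect how Node stores variables internally.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a node's variable storage.
public class VariableScope
{
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();
    private readonly VariableScope _parent;

    public VariableScope(VariableScope parent = null)
    {
        _parent = parent;
    }

    // Walk up the chain until a scope that defines the name is found.
    public bool TryGet(string name, out string value)
    {
        for (VariableScope scope = this; scope != null; scope = scope._parent)
        {
            if (scope._values.TryGetValue(name, out value))
            {
                return true;
            }
        }
        value = null;
        return false;
    }

    // Update the nearest scope that already defines the name; otherwise define it here.
    public void Set(string name, string value)
    {
        for (VariableScope scope = this; scope != null; scope = scope._parent)
        {
            if (scope._values.ContainsKey(name))
            {
                scope._values[name] = value;
                return;
            }
        }
        _values[name] = value;
    }
}
```

In this model, a child scope that sets "sku" after the root has defined it overwrites the root's entry, so every other descendant sees the new value.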
String Processors
Processors are lightweight classes that implement IStringProcessor. Three ship with the package:
- ToUpper()
- ToLower()
- Replace(old,new)
Attach them using the Process property ("Replace(hello,goodbye).ToUpper()"). Internally the StringProcessorHolder parses the expression, instantiates each processor, injects the parameters (hex escapes such as 0x20 are supported), and runs them sequentially.
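To make the chain syntax concrete, here is a rough sketch of how an expression such as "Replace(hello,goodbye).ToUpper()" could be split into calls and applied in order. ProcessChainSketch is a hypothetical class written for illustration; it is not StringProcessorHolder, it handles only the three built-in names, and details such as the culture used for casing are assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical illustration of a dot-chained processor expression.
public static class ProcessChainSketch
{
    // One call per match: a name followed by a parenthesised, comma-separated argument list.
    private static readonly Regex Call = new Regex(@"(?<name>\w+)\((?<args>[^)]*)\)");

    // Hex escapes such as 0x20 are decoded to the character they encode.
    private static readonly Regex HexEscape = new Regex(@"0x(?<hex>[0-9A-Fa-f]{2})");

    // Apply each call in the expression to the input, left to right.
    public static string Run(string expression, string input)
    {
        string value = input;
        foreach (Match call in Call.Matches(expression))
        {
            string rawArgs = call.Groups["args"].Value;
            string[] args = rawArgs.Length == 0
                ? new string[0]
                : Array.ConvertAll(rawArgs.Split(','), Decode);

            switch (call.Groups["name"].Value)
            {
                case "ToUpper": value = value.ToUpperInvariant(); break;
                case "ToLower": value = value.ToLowerInvariant(); break;
                case "Replace": value = value.Replace(args[0], args[1]); break;
                default: throw new ArgumentException("Unknown processor: " + call.Groups["name"].Value);
            }
        }
        return value;
    }

    private static string Decode(string arg)
    {
        return HexEscape.Replace(arg, m => ((char)Convert.ToInt32(m.Groups["hex"].Value, 16)).ToString());
    }
}
```

For example, Run("Replace(hello,goodbye).ToUpper()", "hello world") returns "GOODBYE WORLD", and Run("Replace(_,0x20)", "a_b") returns "a b".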
Custom Processors
public class TrimProcessor : IStringProcessor
{
    public string Name => "Trim";
    public string[] Parameters { get; set; }
    public string Execute(string input) => input.Trim();
}

var parser = new Parser.Parser();
parser.RegisterProcessor(new TrimProcessor());
Once registered, Process = "Trim()" becomes available to every node handled by that parser instance.
Collections & Complex Types
When a node targets a complex property (non-collection), its nested nodes are applied to the node's output text and the results are assigned to that property. If the property is an IEnumerable, array, or List, txt2obj automatically:
- Determines the element type via reflection.
- Runs each child node independently against the collection node's input text (not the collection node's own Pattern/Process output).
- Orders matches by their index in the source text so the items stay aligned with the input sequence.
See txt2obj.test/CollectionTest.cs for multiple patterns that fill lists and arrays from different regex strategies.
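The ordering step can be sketched with the standard regex classes alone. The helper below is hypothetical (CollectionOrderSketch is not a txt2obj type): it runs several patterns independently over the same input and merges the matches by their position in the source, which is the behaviour described above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical illustration of merging matches from several patterns.
public static class CollectionOrderSketch
{
    // Run every pattern over the same input and return all matched values in source order.
    public static List<string> MatchInSourceOrder(string input, params string[] patterns)
    {
        return patterns
            .SelectMany(p => Regex.Matches(input, p, RegexOptions.Singleline).Cast<Match>())
            .OrderBy(m => m.Index)
            .Select(m => m.Value)
            .ToList();
    }
}
```

Calling MatchInSourceOrder("a1 b2 a3", @"a\d", @"b\d") returns "a1", "b2", "a3": the matches from the two patterns are interleaved according to where they occur in the text.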
Running the Demo & Tests
# Demo (prints parsed slip to the console)
dotnet run --project txt2obj.demo
# Test suite
dotnet test
Dependencies
- .NETFramework 4.5: Newtonsoft.Json (>= 13.0.3)
- .NETStandard 2.0: Newtonsoft.Json (>= 13.0.3)
- net10.0: Newtonsoft.Json (>= 13.0.3)