ToonTokenizer 1.0.0.35

dotnet add package ToonTokenizer --version 1.0.0.35
                    
NuGet\Install-Package ToonTokenizer -Version 1.0.0.35
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="ToonTokenizer" Version="1.0.0.35" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="ToonTokenizer" Version="1.0.0.35" />
                    
Directory.Packages.props
<PackageReference Include="ToonTokenizer" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add ToonTokenizer --version 1.0.0.35
                    
#r "nuget: ToonTokenizer, 1.0.0.35"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package ToonTokenizer@1.0.0.35
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=ToonTokenizer&version=1.0.0.35
                    
Install as a Cake Addin
#tool nuget:?package=ToonTokenizer&version=1.0.0.35
                    
Install as a Cake Tool

ToonTokenizer for .NET

NuGet License

A complete .NET library for working with TOON (Token-Oriented Object Notation) - a compact, human-readable text format optimized for LLM prompts and structured data interchange.

What is TOON?

TOON is a line-oriented, indentation-based notation that encodes JSON data with explicit structure and minimal quoting. Think of it as:

  • More compact than JSON for arrays of uniform objects (no repeated keys)
  • More structured than CSV with nesting, types, and field names
  • More deterministic than YAML with explicit array lengths and fixed formatting rules

Perfect for LLM prompts, configuration files, and data interchange where token efficiency and readability matter.

Features

  • Complete lexical analyzer with all TOON token types
  • Full AST parser with resilient error recovery
  • Token-to-AST navigation - easily map between tokens and syntax nodes
  • Tokens included in parse results - no separate tokenization call needed
  • Position tracking for every token and AST node (line, column, span)
  • Resilient parsing - continues after errors, returns partial AST
  • Rich error reporting - collects all errors with precise locations and error codes
  • Standardized error codes - 20+ error codes (TOON1xxx-9xxx) for programmatic handling
  • Context-aware error messages - every error explains what, why, and how to fix
  • Visitor pattern for AST traversal and transformation
  • Extension methods for syntax highlighting and IDE integration
  • TOON spec §6.1 compliance - array size validation (detects size mismatches)
  • Battle-tested with 637 unit tests (100% passing)

Targets: .NET Standard 2.0 (maximum compatibility)

Installation

dotnet add package ToonTokenizer

Or via Package Manager Console:

Install-Package ToonTokenizer

Quick Start

Parse TOON to AST

using ToonTokenizer;

var source = @"
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
";

// Parse returns: Document (AST), Errors (if any), and Tokens
var result = Toon.Parse(source);

if (result.IsSuccess)
{
    // Access the parsed document
    foreach (var property in result.Document.Properties)
    {
        Console.WriteLine($"{property.Key}: {property.Value}");
    }
    
    // Access tokens for syntax highlighting
    foreach (var token in result.Tokens)
    {
        Console.WriteLine($"{token.Type}: '{token.Value}' at {token.Line}:{token.Column}");
    }
}
else
{
    // Resilient parsing: you still get a partial AST + all errors
    Console.WriteLine($"Found {result.Errors.Count} error(s):");
    foreach (var error in result.Errors)
    {
        Console.WriteLine($"  Line {error.Line}: {error.Message}");
    }
}

Validate TOON

if (Toon.TryParse(source, out var result))
{
    if (result.IsSuccess)
        Console.WriteLine("✓ Valid TOON");
    else
        Console.WriteLine($"✗ {result.Errors.Count} error(s) found");
}

Access Tokens Only

// Get tokens without parsing
var tokens = Toon.Tokenize(source);

foreach (var token in tokens)
{
    Console.WriteLine($"{token.Type}: {token.Value}");
}
var source = "name: John\nage: 30";
var result = Toon.Parse(source);

// Get a token and find which AST node it belongs to
var token = result.Tokens.Find(t => t.Value == "30");
var property = token.GetPropertyNode(result.Document);

Console.WriteLine($"Token '{token.Value}' belongs to property: {property.Key}");
// Output: Token '30' belongs to property: age

// Or find property at a specific line/column
var prop = result.GetPropertyAt(line: 2, column: 1);
Console.WriteLine($"Property at line 2: {prop.Key}");
// Output: Property at line 2: age

// Find nested properties by path
var theme = result.FindPropertyByPath("user.settings.theme");
if (theme?.Value is StringValueNode str)
{
    Console.WriteLine($"Theme: {str.Value}");
}

TOON Language Examples

Simple Properties

name: John Doe
age: 30
active: true
email: john@example.com

Nested Objects

user:
  name: Jane Smith
  email: jane@example.com
  settings:
    theme: dark
    notifications: true

Arrays - Inline (Primitives)

colors[3]: red,green,blue
scores[5]: 95,87,92,88,91

Arrays - Tabular (Uniform Objects)

The killer feature! No repeated keys:

users[3]{id,name,email,active}:
  1,Alice,alice@example.com,true
  2,Bob,bob@example.com,false
  3,Charlie,charlie@example.com,true

Compare to JSON:

{
  "users": [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "active": true},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "active": false},
    {"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": true}
  ]
}

60% fewer tokens! 🎉

Real-World Example

context:
  task: Favorite hiking trails
  location: Boulder, CO
  season: Spring 2025

friends[3]: Ana,Luis,Sam

hikes[3]{id,name,distance,elevation,companion,sunny}:
  1,Blue Lake Trail,7.5,320,Ana,true
  2,Ridge Overlook,9.2,540,Luis,false
  3,Wildflower Loop,5.1,180,Sam,true

notes:
  best: Ridge Overlook has amazing views!
  bring: Water, snacks, sunscreen

API Reference

Main Entry Point: Toon

// Parse TOON source (returns Document, Errors, and Tokens)
ToonParseResult Parse(string source)

// Validate and parse (returns true for completed parse, even with errors)
bool TryParse(string source, out ToonParseResult result)

// Tokenize only
List<Token> Tokenize(string source)

Parse Result

public class ToonParseResult
{
    public ToonDocument Document { get; }      // Always available (even with errors)
    public List<ToonError> Errors { get; }     // Empty if no errors
    public List<Token> Tokens { get; }         // All tokens from lexing
    
    public bool IsSuccess => Errors.Count == 0;
    public bool HasErrors => Errors.Count > 0;
}

Errors with Precise Locations

public class ToonError
{
    public string Message { get; }
    public string? Code { get; }      // Error code (e.g., "TOON1001")
    public int Position { get; }      // 0-based character offset
    public int Length { get; }        // Length of error span
    public int Line { get; }          // 1-based line number
    public int Column { get; }        // 1-based column number
    public int EndPosition { get; }   // Position + Length
}

Error Codes for Programmatic Handling

All errors include standardized error codes for programmatic handling and filtering:

var result = Toon.Parse(source);

foreach (var error in result.Errors)
{
    // Errors include descriptive messages with fix suggestions
    Console.WriteLine($"[{error.Code}] {error.Message}");
    
    // Filter by error type
    if (error.Code?.StartsWith("TOON1") == true)
        Console.WriteLine("  → Lexer/tokenization error");
    else if (error.Code?.StartsWith("TOON2") == true)
        Console.WriteLine("  → Parser structural error");
    else if (error.Code?.StartsWith("TOON3") == true)
        Console.WriteLine("  → Validation error");
}

Error Code Categories:

Category Range Description Examples
Lexer TOON1xxx Tokenization errors TOON1001 Unterminated string<br/>TOON1002 Invalid escape sequence<br/>TOON1003 Invalid character
Parser TOON2xxx Structural errors TOON2001 Expected property key<br/>TOON2002 Expected colon<br/>TOON2003 Expected right bracket<br/>TOON2004 Expected value<br/>TOON2005 Expected delimiter
Validation TOON3xxx Semantic errors TOON3001 Array size mismatch<br/>TOON3002 Table array size mismatch<br/>TOON3003 Table row field mismatch
Delimiters TOON4xxx Delimiter issues TOON4001 Mixed delimiters<br/>TOON4002 Delimiter marker misplaced
Indentation TOON5xxx Indentation problems TOON5001 Unexpected indentation<br/>TOON5002 Inconsistent indentation
Internal TOON9xxx Library bugs TOON9001 Infinite loop detected

Context-Aware Error Messages:

Every error includes:

  • What went wrong - Clear description of the problem
  • Why it's wrong - Explanation of the rule that was violated
  • How to fix it - Actionable suggestions for correction

Example error messages:

// Unterminated string
[TOON1001] Unterminated double-quoted string at line 5, column 10. 
String reached end of line without closing " character. 
Fix: Add closing " before the end of the line

// Invalid escape sequence
[TOON1002] Invalid escape sequence '\x' at line 3, column 15. 
Valid escape sequences: \n, \r, \t, \\, \", \'. 
Fix: Use a valid escape sequence or remove the backslash

// Array size mismatch
[TOON3001] Array size mismatch: declared 5 elements, but found 3. 
Missing 2 elements. Check if array is incomplete or elements are on wrong indentation level. 
Fix: Either add 2 more elements or change the size declaration [5]→[3]

// Table size mismatch with helpful hint
[TOON3002] Table array size mismatch: declared 10 rows, but found 8. 
Missing 2 rows. Check if rows are incomplete or have incorrect indentation. 
Fix: Either add 2 more rows or update the size [10]→[8]

Token Types

public enum TokenType
{
    // Values
    String, Number, True, False, Null, Identifier,
    
    // Structure
    Colon, Comma, Pipe,
    LeftBracket, RightBracket,
    LeftBrace, RightBrace,
    
    // Formatting
    Newline, Indent, Dedent, Whitespace,
    
    // Special
    Comment, EndOfFile, Invalid
}

AST Nodes

All inherit from AstNode with position tracking:

// Document root
ToonDocument              // Contains Properties[]

// Structural
PropertyNode              // Key + Value
ObjectNode                // Nested object with Properties[]

// Arrays
ArrayNode                 // Simple array with Elements[]
TableArrayNode            // Tabular with Schema[] and Rows[][]

// Values
StringValueNode           // String literal
NumberValueNode           // Numeric (integer or float)
BooleanValueNode          // true/false
NullValueNode             // null

Every node includes:

int StartLine, StartColumn, StartPosition
int EndLine, EndColumn, EndPosition

Advanced Features

Resilient Parsing

The parser continues after errors, returning a partial AST and all error locations:

var source = @"
name: John
invalid line here
city: Boulder
";

var result = Toon.Parse(source);

// result.Document has 2 valid properties (name, city)
// result.Errors has 1 error (line 3)

Perfect for:

  • IDE integration (IntelliSense on valid parts)
  • Error highlighting (show all errors at once)
  • Language servers
  • Linters and validators

Extension Methods

Token Extensions
using ToonTokenizer;

var tokens = Toon.Tokenize(source);

// Get tokens on specific line
var lineTokens = tokens.GetTokensOnLine(5);

// Find token at position
var token = tokens.GetTokenAt(line: 3, column: 10);

// Filter by type
var strings = tokens.GetTokensByType(TokenType.String);

// Syntax highlighting classification
foreach (var token in tokens)
{
    string cssClass = token.GetClassification();
    // Returns: "keyword", "string", "number", "comment", etc.
}

// Check categories
bool isKeyword = token.IsKeyword();        // true, false, null
bool isStructural = token.IsStructural();  // :, [, ], {, }, ,
bool isValue = token.IsValue();            // strings, numbers, booleans
Token to AST Navigation
using ToonTokenizer;
using ToonTokenizer.Ast;

var result = Toon.Parse(source);

// From token to AST node
var token = result.Tokens.GetTokenAt(line: 5, column: 3);
var node = token.GetAstNode(result.Document);
var property = token.GetPropertyNode(result.Document);

// From parse result directly
var nodeAtPosition = result.GetNodeAtPosition(42);
var nodeForToken = result.GetNodeForToken(myToken);
var propertyAt = result.GetPropertyAt(line: 3, column: 5);

// Get all properties (including nested)
var allProps = result.GetAllProperties();

// Find by path (dot notation)
var theme = result.FindPropertyByPath("user.settings.theme");
var email = result.FindPropertyByPath("user.email");

if (theme?.Value is StringValueNode str)
{
    Console.WriteLine($"Theme: {str.Value}");
}

Visitor Pattern

Implement custom AST processing:

using ToonTokenizer.Ast;

public class MyVisitor : IAstVisitor<string>
{
    public string VisitDocument(ToonDocument node)
    {
        var results = node.Properties.Select(p => p.Accept(this));
        return string.Join(", ", results);
    }
    
    public string VisitProperty(PropertyNode node)
    {
        return $"{node.Key} = {node.Value.Accept(this)}";
    }
    
    public string VisitStringValue(StringValueNode node)
    {
        return $"\"{node.Value}\"";
    }
    
    // ... implement other Visit methods
}

// Use it
var doc = Toon.Parse(source).Document;
var output = doc.Accept(new MyVisitor());

Use Cases

1. Visual Studio Extension (Syntax Highlighting)

public IEnumerable<ClassificationSpan> GetClassificationSpans(SnapshotSpan span)
{
    var source = span.GetText();
    var result = Toon.Parse(source);  // Gets tokens + AST in one call
    
    foreach (var token in result.Tokens)
    {
        var classification = token.GetClassification();
        var tokenSpan = new SnapshotSpan(
            span.Snapshot, 
            token.Position, 
            token.Length
        );
        
        yield return new ClassificationSpan(
            tokenSpan, 
            GetClassificationType(classification)
        );
    }
}

2. Language Server (IntelliSense)

public IEnumerable<Completion> GetCompletions(int line, int column)
{
    var result = Toon.Parse(documentText);
    
    // Find the property we're currently in
    var property = result.GetPropertyAt(line, column);
    
    if (property != null)
    {
        // Context-aware suggestions based on property type
        if (property.Value is ObjectNode)
        {
            // Suggest nested property names
            yield return new Completion("theme");
            yield return new Completion("enabled");
        }
        else if (property.Value is ArrayNode)
        {
            // Suggest array-specific completions
            yield return new Completion("[size]");
        }
    }
    
    var token = result.Tokens.GetTokenAt(line, column);
    if (token?.Type == TokenType.Colon)
    {
        // After colon: suggest value types
        yield return new Completion("true");
        yield return new Completion("false");
        yield return new Completion("null");
    }
}

3. Error Diagnostics

public IEnumerable<Diagnostic> GetDiagnostics()
{
    var result = Toon.Parse(documentText);
    
    foreach (var error in result.Errors)
    {
        yield return new Diagnostic
        {
            Severity = DiagnosticSeverity.Error,
            Message = error.Message,
            Range = new Range(
                error.Line - 1,
                error.Column - 1,
                error.EndPosition
            )
        };
    }
}

4. Code Folding / Outlining

public IEnumerable<FoldingRange> GetFoldingRanges()
{
    var result = Toon.Parse(documentText);
    
    foreach (var property in result.Document.Properties)
    {
        if (property.Value is ObjectNode obj && obj.Properties.Count > 0)
        {
            yield return new FoldingRange
            {
                StartLine = obj.StartLine,
                EndLine = obj.EndLine,
                Kind = FoldingRangeKind.Region
            };
        }
        else if (property.Value is TableArrayNode table && table.Rows.Count > 5)
        {
            yield return new FoldingRange
            {
                StartLine = table.StartLine,
                EndLine = table.EndLine,
                Kind = FoldingRangeKind.Region
            };
        }
    }
}

5. LLM Prompt Optimization

// Convert verbose JSON to compact TOON for token savings
var jsonData = GetDataFromApi();
var toonEncoder = new ToonEncoder();
var compactPrompt = toonEncoder.Encode(jsonData);

// Use in prompt
var prompt = $@"
Analyze this data:
{compactPrompt}

What insights can you provide?
";

Why Choose ToonTokenizer?

✨ Feature Complete

  • Full TOON v3.0 specification support
  • Handles all array types (inline, tabular, nested)
  • Complete delimiter support (comma, tab, pipe)
  • Resilient parsing with error recovery

🎯 Production Ready

  • 637 unit tests covering edge cases (100% passing)
  • Battle-tested on complex real-world data
  • Handles malformed input gracefully
  • Comprehensive error reporting with standardized error codes
  • Context-aware error messages with actionable fix suggestions

🚀 Performance Focused

  • Efficient single-pass lexer
  • Minimal allocations
  • Streaming-friendly design
  • .NET Standard 2.0 for maximum compatibility

🛠️ Developer Friendly

  • Rich IntelliSense support
  • Extensive XML documentation
  • Position tracking on everything
  • Extension methods for common tasks

🏗️ Extensible

  • Visitor pattern for AST traversal
  • Hook points for custom behavior
  • Clean separation of concerns
  • Easy to integrate into larger systems

Specification Compliance

This library implements the TOON v3.0 specification. The full spec is included in spec.md.

Key features:

  • ✅ Deterministic encoding
  • ✅ Lossless round-tripping
  • ✅ Strict and lenient parsing modes
  • ✅ Position tracking for all tokens
  • ✅ Table array detection
  • ✅ Delimiter scoping rules
  • ✅ Escape sequence handling
  • ✅ Array size validation per §6.1 (detects undersized arrays)

Platform Support

Platform Support
.NET Core 2.0+
.NET Framework 4.6.1+
.NET 5, 6, 7, 8, 9, 10
Mono
Xamarin
Unity ✅ (via .NET Standard 2.0)

Performance

Typical parse performance on modern hardware:

Document Size Parse Time Tokens/sec
1 KB < 1 ms 500K
10 KB 2-5 ms 400K
100 KB 20-40 ms 350K
1 MB 200-300 ms 300K

Benchmarks vary based on document structure and hardware.

Documentation

Examples

Check out the Examples directory for:

  • Basic parsing examples
  • Syntax highlighter implementation
  • Error handling patterns
  • AST visitor examples
  • Token manipulation

Contributing

Contributions welcome! Please:

  1. Follow existing code style

    • Use the .editorconfig settings
    • Keep methods focused and well-named
    • Add XML documentation for public APIs
  2. Write tests

    • Add tests for new features
    • Ensure all existing tests pass
    • Aim for high code coverage
  3. Update documentation

    • Update README for user-facing changes
    • Add examples for new features
    • Keep spec compliance notes current

Testing

Run the full test suite:

dotnet test

Test coverage:

  • Lexer: Token generation, escape sequences, position tracking
  • Parser: All node types, error recovery, edge cases
  • Validation: Array size validation, string format validation, number format validation
  • Extensions: Helper methods, visitor pattern
  • Integration: Round-trip encoding/decoding

License

Apache License 2.0 - See LICENSE.txt file for details.

This library is independent from the TOON specification but implements it faithfully. The specification itself is MIT licensed.

Support

Author

Mads Kristensen - GitHub | Twitter

Implementing the TOON specification by Johann Schopplich - @johannschopplich


<p align="center">Made with ❤️ for the .NET community</p>

Product Compatible and additional computed target framework versions.
.NET net5.0 was computed.  net5.0-windows was computed.  net6.0 was computed.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 is compatible.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed.  net9.0 was computed.  net9.0-android was computed.  net9.0-browser was computed.  net9.0-ios was computed.  net9.0-maccatalyst was computed.  net9.0-macos was computed.  net9.0-tvos was computed.  net9.0-windows was computed.  net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
.NET Core netcoreapp2.0 was computed.  netcoreapp2.1 was computed.  netcoreapp2.2 was computed.  netcoreapp3.0 was computed.  netcoreapp3.1 was computed. 
.NET Standard netstandard2.0 is compatible.  netstandard2.1 was computed. 
.NET Framework net461 was computed.  net462 is compatible.  net463 was computed.  net47 was computed.  net471 was computed.  net472 was computed.  net48 was computed.  net481 was computed. 
MonoAndroid monoandroid was computed. 
MonoMac monomac was computed. 
MonoTouch monotouch was computed. 
Tizen tizen40 was computed.  tizen60 was computed. 
Xamarin.iOS xamarinios was computed. 
Xamarin.Mac xamarinmac was computed. 
Xamarin.TVOS xamarintvos was computed. 
Xamarin.WatchOS xamarinwatchos was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
1.0.0.35 178 12/4/2025
1.0.0.29 171 11/26/2025
1.0.0.28 166 11/26/2025
1.0.0.23 165 11/26/2025
1.0.0.18 169 11/25/2025
1.0.0.16 169 11/25/2025