ToonTokenizer 1.0.0.35
dotnet add package ToonTokenizer --version 1.0.0.35
NuGet\Install-Package ToonTokenizer -Version 1.0.0.35
<PackageReference Include="ToonTokenizer" Version="1.0.0.35" />
<PackageVersion Include="ToonTokenizer" Version="1.0.0.35" />
<PackageReference Include="ToonTokenizer" />
paket add ToonTokenizer --version 1.0.0.35
#r "nuget: ToonTokenizer, 1.0.0.35"
#:package ToonTokenizer@1.0.0.35
#addin nuget:?package=ToonTokenizer&version=1.0.0.35
#tool nuget:?package=ToonTokenizer&version=1.0.0.35
ToonTokenizer for .NET
A complete .NET library for working with TOON (Token-Oriented Object Notation) - a compact, human-readable text format optimized for LLM prompts and structured data interchange.
What is TOON?
TOON is a line-oriented, indentation-based notation that encodes JSON data with explicit structure and minimal quoting. Think of it as:
- More compact than JSON for arrays of uniform objects (no repeated keys)
- More structured than CSV with nesting, types, and field names
- More deterministic than YAML with explicit array lengths and fixed formatting rules
Perfect for LLM prompts, configuration files, and data interchange where token efficiency and readability matter.
Features
- ✅ Complete lexical analyzer with all TOON token types
- ✅ Full AST parser with resilient error recovery
- ✅ Token-to-AST navigation - easily map between tokens and syntax nodes
- ✅ Tokens included in parse results - no separate tokenization call needed
- ✅ Position tracking for every token and AST node (line, column, span)
- ✅ Resilient parsing - continues after errors, returns partial AST
- ✅ Rich error reporting - collects all errors with precise locations and error codes
- ✅ Standardized error codes - 20+ error codes (TOON1xxx-9xxx) for programmatic handling
- ✅ Context-aware error messages - every error explains what, why, and how to fix
- ✅ Visitor pattern for AST traversal and transformation
- ✅ Extension methods for syntax highlighting and IDE integration
- ✅ TOON spec §6.1 compliance - array size validation (detects size mismatches)
- ✅ Battle-tested with 637 unit tests (100% passing)
Targets: .NET Standard 2.0 (maximum compatibility)
Installation
dotnet add package ToonTokenizer
Or via Package Manager Console:
Install-Package ToonTokenizer
Quick Start
Parse TOON to AST
using ToonTokenizer;
var source = @"
users[2]{id,name,role}:
1,Alice,admin
2,Bob,user
";
// Parse returns: Document (AST), Errors (if any), and Tokens
var result = Toon.Parse(source);
if (result.IsSuccess)
{
// Access the parsed document
foreach (var property in result.Document.Properties)
{
Console.WriteLine($"{property.Key}: {property.Value}");
}
// Access tokens for syntax highlighting
foreach (var token in result.Tokens)
{
Console.WriteLine($"{token.Type}: '{token.Value}' at {token.Line}:{token.Column}");
}
}
else
{
// Resilient parsing: you still get a partial AST + all errors
Console.WriteLine($"Found {result.Errors.Count} error(s):");
foreach (var error in result.Errors)
{
Console.WriteLine($" Line {error.Line}: {error.Message}");
}
}
Validate TOON
if (Toon.TryParse(source, out var result))
{
if (result.IsSuccess)
Console.WriteLine("✓ Valid TOON");
else
Console.WriteLine($"✗ {result.Errors.Count} error(s) found");
}
Access Tokens Only
// Get tokens without parsing
var tokens = Toon.Tokenize(source);
foreach (var token in tokens)
{
Console.WriteLine($"{token.Type}: {token.Value}");
}
Navigate from Tokens to AST
var source = "name: John\nage: 30";
var result = Toon.Parse(source);
// Get a token and find which AST node it belongs to
var token = result.Tokens.Find(t => t.Value == "30");
var property = token.GetPropertyNode(result.Document);
Console.WriteLine($"Token '{token.Value}' belongs to property: {property.Key}");
// Output: Token '30' belongs to property: age
// Or find property at a specific line/column
var prop = result.GetPropertyAt(line: 2, column: 1);
Console.WriteLine($"Property at line 2: {prop.Key}");
// Output: Property at line 2: age
// Find nested properties by path
var theme = result.FindPropertyByPath("user.settings.theme");
if (theme?.Value is StringValueNode str)
{
Console.WriteLine($"Theme: {str.Value}");
}
TOON Language Examples
Simple Properties
name: John Doe
age: 30
active: true
email: john@example.com
Nested Objects
user:
name: Jane Smith
email: jane@example.com
settings:
theme: dark
notifications: true
Arrays - Inline (Primitives)
colors[3]: red,green,blue
scores[5]: 95,87,92,88,91
Arrays - Tabular (Uniform Objects)
The killer feature! No repeated keys:
users[3]{id,name,email,active}:
1,Alice,alice@example.com,true
2,Bob,bob@example.com,false
3,Charlie,charlie@example.com,true
Compare to JSON:
{
"users": [
{"id": 1, "name": "Alice", "email": "alice@example.com", "active": true},
{"id": 2, "name": "Bob", "email": "bob@example.com", "active": false},
{"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": true}
]
}
60% fewer tokens! 🎉
Real-World Example
context:
task: Favorite hiking trails
location: Boulder, CO
season: Spring 2025
friends[3]: Ana,Luis,Sam
hikes[3]{id,name,distance,elevation,companion,sunny}:
1,Blue Lake Trail,7.5,320,Ana,true
2,Ridge Overlook,9.2,540,Luis,false
3,Wildflower Loop,5.1,180,Sam,true
notes:
best: Ridge Overlook has amazing views!
bring: Water, snacks, sunscreen
API Reference
Main Entry Point: Toon
// Parse TOON source (returns Document, Errors, and Tokens)
ToonParseResult Parse(string source)
// Validate and parse (returns true for completed parse, even with errors)
bool TryParse(string source, out ToonParseResult result)
// Tokenize only
List<Token> Tokenize(string source)
Parse Result
public class ToonParseResult
{
public ToonDocument Document { get; } // Always available (even with errors)
public List<ToonError> Errors { get; } // Empty if no errors
public List<Token> Tokens { get; } // All tokens from lexing
public bool IsSuccess => Errors.Count == 0;
public bool HasErrors => Errors.Count > 0;
}
Errors with Precise Locations
public class ToonError
{
public string Message { get; }
public string? Code { get; } // Error code (e.g., "TOON1001")
public int Position { get; } // 0-based character offset
public int Length { get; } // Length of error span
public int Line { get; } // 1-based line number
public int Column { get; } // 1-based column number
public int EndPosition { get; } // Position + Length
}
Error Codes for Programmatic Handling
All errors include standardized error codes for programmatic handling and filtering:
var result = Toon.Parse(source);
foreach (var error in result.Errors)
{
// Errors include descriptive messages with fix suggestions
Console.WriteLine($"[{error.Code}] {error.Message}");
// Filter by error type
if (error.Code?.StartsWith("TOON1") == true)
Console.WriteLine(" → Lexer/tokenization error");
else if (error.Code?.StartsWith("TOON2") == true)
Console.WriteLine(" → Parser structural error");
else if (error.Code?.StartsWith("TOON3") == true)
Console.WriteLine(" → Validation error");
}
Error Code Categories:
| Category | Range | Description | Examples |
|---|---|---|---|
| Lexer | TOON1xxx | Tokenization errors | TOON1001 Unterminated string<br/>TOON1002 Invalid escape sequence<br/>TOON1003 Invalid character |
| Parser | TOON2xxx | Structural errors | TOON2001 Expected property key<br/>TOON2002 Expected colon<br/>TOON2003 Expected right bracket<br/>TOON2004 Expected value<br/>TOON2005 Expected delimiter |
| Validation | TOON3xxx | Semantic errors | TOON3001 Array size mismatch<br/>TOON3002 Table array size mismatch<br/>TOON3003 Table row field mismatch |
| Delimiters | TOON4xxx | Delimiter issues | TOON4001 Mixed delimiters<br/>TOON4002 Delimiter marker misplaced |
| Indentation | TOON5xxx | Indentation problems | TOON5001 Unexpected indentation<br/>TOON5002 Inconsistent indentation |
| Internal | TOON9xxx | Library bugs | TOON9001 Infinite loop detected |
Context-Aware Error Messages:
Every error includes:
- ✅ What went wrong - Clear description of the problem
- ✅ Why it's wrong - Explanation of the rule that was violated
- ✅ How to fix it - Actionable suggestions for correction
Example error messages:
// Unterminated string
[TOON1001] Unterminated double-quoted string at line 5, column 10.
String reached end of line without closing " character.
Fix: Add closing " before the end of the line
// Invalid escape sequence
[TOON1002] Invalid escape sequence '\x' at line 3, column 15.
Valid escape sequences: \n, \r, \t, \\, \", \'.
Fix: Use a valid escape sequence or remove the backslash
// Array size mismatch
[TOON3001] Array size mismatch: declared 5 elements, but found 3.
Missing 2 elements. Check if array is incomplete or elements are on wrong indentation level.
Fix: Either add 2 more elements or change the size declaration [5]→[3]
// Table size mismatch with helpful hint
[TOON3002] Table array size mismatch: declared 10 rows, but found 8.
Missing 2 rows. Check if rows are incomplete or have incorrect indentation.
Fix: Either add 2 more rows or update the size [10]→[8]
Token Types
public enum TokenType
{
// Values
String, Number, True, False, Null, Identifier,
// Structure
Colon, Comma, Pipe,
LeftBracket, RightBracket,
LeftBrace, RightBrace,
// Formatting
Newline, Indent, Dedent, Whitespace,
// Special
Comment, EndOfFile, Invalid
}
AST Nodes
All inherit from AstNode with position tracking:
// Document root
ToonDocument // Contains Properties[]
// Structural
PropertyNode // Key + Value
ObjectNode // Nested object with Properties[]
// Arrays
ArrayNode // Simple array with Elements[]
TableArrayNode // Tabular with Schema[] and Rows[][]
// Values
StringValueNode // String literal
NumberValueNode // Numeric (integer or float)
BooleanValueNode // true/false
NullValueNode // null
Every node includes:
int StartLine, StartColumn, StartPosition
int EndLine, EndColumn, EndPosition
Advanced Features
Resilient Parsing
The parser continues after errors, returning a partial AST and all error locations:
var source = @"
name: John
invalid line here
city: Boulder
";
var result = Toon.Parse(source);
// result.Document has 2 valid properties (name, city)
// result.Errors has 1 error (line 3)
Perfect for:
- IDE integration (IntelliSense on valid parts)
- Error highlighting (show all errors at once)
- Language servers
- Linters and validators
Extension Methods
Token Extensions
using ToonTokenizer;
var tokens = Toon.Tokenize(source);
// Get tokens on specific line
var lineTokens = tokens.GetTokensOnLine(5);
// Find token at position
var token = tokens.GetTokenAt(line: 3, column: 10);
// Filter by type
var strings = tokens.GetTokensByType(TokenType.String);
// Syntax highlighting classification
foreach (var token in tokens)
{
string cssClass = token.GetClassification();
// Returns: "keyword", "string", "number", "comment", etc.
}
// Check categories
bool isKeyword = token.IsKeyword(); // true, false, null
bool isStructural = token.IsStructural(); // :, [, ], {, }, ,
bool isValue = token.IsValue(); // strings, numbers, booleans
Token to AST Navigation
using ToonTokenizer;
using ToonTokenizer.Ast;
var result = Toon.Parse(source);
// From token to AST node
var token = result.Tokens.GetTokenAt(line: 5, column: 3);
var node = token.GetAstNode(result.Document);
var property = token.GetPropertyNode(result.Document);
// From parse result directly
var nodeAtPosition = result.GetNodeAtPosition(42);
var nodeForToken = result.GetNodeForToken(myToken);
var propertyAt = result.GetPropertyAt(line: 3, column: 5);
// Get all properties (including nested)
var allProps = result.GetAllProperties();
// Find by path (dot notation)
var theme = result.FindPropertyByPath("user.settings.theme");
var email = result.FindPropertyByPath("user.email");
if (theme?.Value is StringValueNode str)
{
Console.WriteLine($"Theme: {str.Value}");
}
Visitor Pattern
Implement custom AST processing:
using ToonTokenizer.Ast;
public class MyVisitor : IAstVisitor<string>
{
public string VisitDocument(ToonDocument node)
{
var results = node.Properties.Select(p => p.Accept(this));
return string.Join(", ", results);
}
public string VisitProperty(PropertyNode node)
{
return $"{node.Key} = {node.Value.Accept(this)}";
}
public string VisitStringValue(StringValueNode node)
{
return $"\"{node.Value}\"";
}
// ... implement other Visit methods
}
// Use it
var doc = Toon.Parse(source).Document;
var output = doc.Accept(new MyVisitor());
Use Cases
1. Visual Studio Extension (Syntax Highlighting)
public IEnumerable<ClassificationSpan> GetClassificationSpans(SnapshotSpan span)
{
var source = span.GetText();
var result = Toon.Parse(source); // Gets tokens + AST in one call
foreach (var token in result.Tokens)
{
var classification = token.GetClassification();
var tokenSpan = new SnapshotSpan(
span.Snapshot,
token.Position,
token.Length
);
yield return new ClassificationSpan(
tokenSpan,
GetClassificationType(classification)
);
}
}
2. Language Server (IntelliSense)
public IEnumerable<Completion> GetCompletions(int line, int column)
{
var result = Toon.Parse(documentText);
// Find the property we're currently in
var property = result.GetPropertyAt(line, column);
if (property != null)
{
// Context-aware suggestions based on property type
if (property.Value is ObjectNode)
{
// Suggest nested property names
yield return new Completion("theme");
yield return new Completion("enabled");
}
else if (property.Value is ArrayNode)
{
// Suggest array-specific completions
yield return new Completion("[size]");
}
}
var token = result.Tokens.GetTokenAt(line, column);
if (token?.Type == TokenType.Colon)
{
// After colon: suggest value types
yield return new Completion("true");
yield return new Completion("false");
yield return new Completion("null");
}
}
3. Error Diagnostics
public IEnumerable<Diagnostic> GetDiagnostics()
{
var result = Toon.Parse(documentText);
foreach (var error in result.Errors)
{
yield return new Diagnostic
{
Severity = DiagnosticSeverity.Error,
Message = error.Message,
Range = new Range(
error.Line - 1,
error.Column - 1,
error.EndPosition
)
};
}
}
4. Code Folding / Outlining
public IEnumerable<FoldingRange> GetFoldingRanges()
{
var result = Toon.Parse(documentText);
foreach (var property in result.Document.Properties)
{
if (property.Value is ObjectNode obj && obj.Properties.Count > 0)
{
yield return new FoldingRange
{
StartLine = obj.StartLine,
EndLine = obj.EndLine,
Kind = FoldingRangeKind.Region
};
}
else if (property.Value is TableArrayNode table && table.Rows.Count > 5)
{
yield return new FoldingRange
{
StartLine = table.StartLine,
EndLine = table.EndLine,
Kind = FoldingRangeKind.Region
};
}
}
}
5. LLM Prompt Optimization
// Convert verbose JSON to compact TOON for token savings
var jsonData = GetDataFromApi();
var toonEncoder = new ToonEncoder();
var compactPrompt = toonEncoder.Encode(jsonData);
// Use in prompt
var prompt = $@"
Analyze this data:
{compactPrompt}
What insights can you provide?
";
Why Choose ToonTokenizer?
✨ Feature Complete
- Full TOON v3.0 specification support
- Handles all array types (inline, tabular, nested)
- Complete delimiter support (comma, tab, pipe)
- Resilient parsing with error recovery
🎯 Production Ready
- 637 unit tests covering edge cases (100% passing)
- Battle-tested on complex real-world data
- Handles malformed input gracefully
- Comprehensive error reporting with standardized error codes
- Context-aware error messages with actionable fix suggestions
🚀 Performance Focused
- Efficient single-pass lexer
- Minimal allocations
- Streaming-friendly design
- .NET Standard 2.0 for maximum compatibility
🛠️ Developer Friendly
- Rich IntelliSense support
- Extensive XML documentation
- Position tracking on everything
- Extension methods for common tasks
🏗️ Extensible
- Visitor pattern for AST traversal
- Hook points for custom behavior
- Clean separation of concerns
- Easy to integrate into larger systems
Specification Compliance
This library implements the TOON v3.0 specification. The full spec is included in spec.md.
Key features:
- ✅ Deterministic encoding
- ✅ Lossless round-tripping
- ✅ Strict and lenient parsing modes
- ✅ Position tracking for all tokens
- ✅ Table array detection
- ✅ Delimiter scoping rules
- ✅ Escape sequence handling
- ✅ Array size validation per §6.1 (detects undersized arrays)
Platform Support
| Platform | Support |
|---|---|
| .NET Core 2.0+ | ✅ |
| .NET Framework 4.6.1+ | ✅ |
| .NET 5, 6, 7, 8, 9, 10 | ✅ |
| Mono | ✅ |
| Xamarin | ✅ |
| Unity | ✅ (via .NET Standard 2.0) |
Performance
Typical parse performance on modern hardware:
| Document Size | Parse Time | Tokens/sec |
|---|---|---|
| 1 KB | < 1 ms | 500K |
| 10 KB | 2-5 ms | 400K |
| 100 KB | 20-40 ms | 350K |
| 1 MB | 200-300 ms | 300K |
Benchmarks vary based on document structure and hardware.
Documentation
- Quick Start: See examples above
- Token Access: Examples/TokensInParseResult.md
- API Documentation: XML docs included in package
Examples
Check out the Examples directory for:
- Basic parsing examples
- Syntax highlighter implementation
- Error handling patterns
- AST visitor examples
- Token manipulation
Contributing
Contributions welcome! Please:
Follow existing code style
- Use the
.editorconfigsettings - Keep methods focused and well-named
- Add XML documentation for public APIs
- Use the
Write tests
- Add tests for new features
- Ensure all existing tests pass
- Aim for high code coverage
Update documentation
- Update README for user-facing changes
- Add examples for new features
- Keep spec compliance notes current
Testing
Run the full test suite:
dotnet test
Test coverage:
- Lexer: Token generation, escape sequences, position tracking
- Parser: All node types, error recovery, edge cases
- Validation: Array size validation, string format validation, number format validation
- Extensions: Helper methods, visitor pattern
- Integration: Round-trip encoding/decoding
License
Apache License 2.0 - See LICENSE.txt file for details.
This library is independent from the TOON specification but implements it faithfully. The specification itself is MIT licensed.
Links
- TOON Specification: https://github.com/toon-format/spec
- Reference Implementation (TypeScript): https://github.com/toon-format/toon
- This Library: https://github.com/madskristensen/ToonTokenizer
- NuGet Package: https://www.nuget.org/packages/ToonTokenizer/
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Spec Questions: TOON Spec Repo
Author
Mads Kristensen - GitHub | Twitter
Implementing the TOON specification by Johann Schopplich - @johannschopplich
<p align="center">Made with ❤️ for the .NET community</p>
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
| .NET Framework | net461 was computed. net462 is compatible. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
-
.NETFramework 4.6.2
- System.Text.Json (>= 9.0.0)
-
.NETStandard 2.0
- System.Text.Json (>= 9.0.0)
-
net10.0
- No dependencies.
-
net8.0
- System.Text.Json (>= 9.0.0)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.