NTokenizers 0.5.0-preview
See the version list below for details.
dotnet add package NTokenizers --version 0.5.0-preview
NuGet\Install-Package NTokenizers -Version 0.5.0-preview
<PackageReference Include="NTokenizers" Version="0.5.0-preview" />
<PackageVersion Include="NTokenizers" Version="0.5.0-preview" />
<PackageReference Include="NTokenizers" />
paket add NTokenizers --version 0.5.0-preview
#r "nuget: NTokenizers, 0.5.0-preview"
#:package NTokenizers@0.5.0-preview
#addin nuget:?package=NTokenizers&version=0.5.0-preview&prerelease
#tool nuget:?package=NTokenizers&version=0.5.0-preview&prerelease
NTokenizers
Collection of tokenizers for Markup, JSON, XML, SQL, TypeScript and C# processing
Overview
NTokenizers is a .NET library written in C# that provides tokenizers for structured text formats such as Markup, JSON, XML, SQL, TypeScript and C#. Its core operation is the Tokenize method, which breaks structured text into meaningful components (tokens) for further processing. Its key feature is stream processing: it handles data as it arrives, which makes it well suited to large files or streaming data because the whole input never has to be loaded into memory at once.
These tokenizers do not validate their input; they are primarily intended for prettifying, formatting, or visualizing structured text. Because they perform no strict validation of the input format, they may produce unexpected results on malformed or invalid XML, JSON, or HTML. Use them with caution on untrusted or poorly formatted input.
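To make the idea of non-validating, streaming tokenization concrete, here is a minimal sketch in plain C#. It is not NTokenizers' implementation and uses none of its API: `SketchTokenizer` and `SketchTokenType` are hypothetical names invented for illustration. The sketch reads one character at a time and hands each token to a callback as soon as it is complete, and it never rejects input, so malformed text still yields tokens.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Usage: the final '>' is missing on purpose. A non-validating tokenizer
// still emits tokens for whatever it sees instead of throwing.
var tokens = new List<string>();
using var reader = new StringReader("<name>Laura Smith</name");
SketchTokenizer.Tokenize(reader, (type, value) => tokens.Add($"{type}:{value}"));
Console.WriteLine(string.Join(" | ", tokens));

// Hypothetical token kinds for this sketch; these are not NTokenizers types.
enum SketchTokenType { Word, Whitespace, Symbol }

static class SketchTokenizer
{
    // Reads one character at a time and invokes onToken as soon as a token
    // is complete, so the caller sees output before the input has ended.
    public static void Tokenize(TextReader reader, Action<SketchTokenType, string> onToken)
    {
        var buffer = new StringBuilder();
        SketchTokenType? current = null;

        int next;
        while ((next = reader.Read()) != -1)
        {
            char c = (char)next;
            SketchTokenType kind =
                char.IsWhiteSpace(c) ? SketchTokenType.Whitespace :
                char.IsLetterOrDigit(c) ? SketchTokenType.Word :
                SketchTokenType.Symbol;

            // Flush the pending token when the kind changes.
            // Symbols are always emitted as single-character tokens.
            if (current is not null && (kind != current || kind == SketchTokenType.Symbol))
            {
                onToken(current.Value, buffer.ToString());
                buffer.Clear();
            }

            buffer.Append(c);
            current = kind;
        }

        // Emit whatever is left once the input ends.
        if (current is not null)
        {
            onToken(current.Value, buffer.ToString());
        }
    }
}
```

Because the sketch only ever holds the current token in memory, the same loop works whether the `TextReader` wraps a string, a large file, or a network stream that delivers data slowly.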
Example
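Here's a simple example that uses the Markup tokenizer to parse a document containing embedded XML, JSON and TypeScript code blocks while it arrives over a slow stream, coloring each token with Spectre.Console: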
using NTokenizers.Json;
using NTokenizers.Markup;
using NTokenizers.Markup.Metadata;
using NTokenizers.Typescript;
using NTokenizers.Xml;
using Spectre.Console;
using System.IO.Pipes;
using System.Text;
class Program
{
    static async Task Main()
    {
        string markup = """
            # NTokenizers Showcase
            ## XML example
            ```xml
            <user id="4821" active="true">
            <name>Laura Smith</name>
            </user>
            ```
            ## JSON example
            ```json
            {
            "name": "Laura Smith",
            "active": true
            }
            ```
            ## TypeScript example
            ```typescript
            const user = {
            name: "Laura Smith",
            active: true
            };
            ```
            """;

        // Create connected streams
        using var pipe = new AnonymousPipeServerStream(PipeDirection.Out);
        using var reader = new AnonymousPipeClientStream(PipeDirection.In, pipe.ClientSafePipeHandle);

        // Start slow writer
        var writerTask = EmitSlowlyAsync(markup, pipe);

        // Parse markup
        MarkupTokenizer.Create().Parse(reader, onToken: token =>
        {
            if (token.Metadata is HeadingMetadata headingMetadata)
            {
                headingMetadata.OnInlineToken = inlineToken =>
                {
                    var value = Markup.Escape(inlineToken.Value);
                    var colored = headingMetadata.Level != 1 ?
                        new Markup($"[bold blue]{value}[/]") :
                        new Markup($"[bold yellow]** {value} **[/]");
                    AnsiConsole.Write(colored);
                };
            }
            else if (token.Metadata is XmlCodeBlockMetadata xmlMetadata)
            {
                xmlMetadata.OnInlineToken = inlineToken =>
                {
                    var value = Markup.Escape(inlineToken.Value);
                    var colored = inlineToken.TokenType switch
                    {
                        XmlTokenType.ElementName => new Markup($"[blue]{value}[/]"),
                        XmlTokenType.EndElement => new Markup($"[blue]{value}[/]"),
                        XmlTokenType.OpeningAngleBracket => new Markup($"[yellow]{value}[/]"),
                        XmlTokenType.ClosingAngleBracket => new Markup($"[yellow]{value}[/]"),
                        XmlTokenType.SelfClosingSlash => new Markup($"[yellow]{value}[/]"),
                        XmlTokenType.AttributeName => new Markup($"[cyan]{value}[/]"),
                        XmlTokenType.AttributeEquals => new Markup($"[yellow]{value}[/]"),
                        XmlTokenType.AttributeQuote => new Markup($"[grey]{value}[/]"),
                        XmlTokenType.AttributeValue => new Markup($"[green]{value}[/]"),
                        XmlTokenType.Text => new Markup($"[white]{value}[/]"),
                        XmlTokenType.Whitespace => new Markup($"[grey]{value}[/]"),
                        _ => new Markup(value)
                    };
                    AnsiConsole.Write(colored);
                };
            }
            else if (token.Metadata is JsonCodeBlockMetadata jsonMetadata)
            {
                jsonMetadata.OnInlineToken = inlineToken =>
                {
                    var value = Markup.Escape(inlineToken.Value);
                    var colored = inlineToken.TokenType switch
                    {
                        JsonTokenType.StartObject => new Markup($"[yellow]{value}[/]"),
                        JsonTokenType.EndObject => new Markup($"[yellow]{value}[/]"),
                        JsonTokenType.StartArray => new Markup($"[yellow]{value}[/]"),
                        JsonTokenType.EndArray => new Markup($"[yellow]{value}[/]"),
                        JsonTokenType.PropertyName => new Markup($"[cyan]{value}[/]"),
                        JsonTokenType.StringValue => new Markup($"[green]{value}[/]"),
                        JsonTokenType.Number => new Markup($"[magenta]{value}[/]"),
                        JsonTokenType.True => new Markup($"[orange1]{value}[/]"),
                        JsonTokenType.False => new Markup($"[orange1]{value}[/]"),
                        JsonTokenType.Null => new Markup($"[grey]{value}[/]"),
                        JsonTokenType.Colon => new Markup($"[yellow]{value}[/]"),
                        JsonTokenType.Comma => new Markup($"[yellow]{value}[/]"),
                        JsonTokenType.Whitespace => new Markup($"[grey]{value}[/]"),
                        _ => new Markup(value)
                    };
                    AnsiConsole.Write(colored);
                };
            }
            else if (token.Metadata is TypeScriptCodeBlockMetadata tsMetadata)
            {
                tsMetadata.OnInlineToken = inlineToken =>
                {
                    var value = Markup.Escape(inlineToken.Value);
                    var colored = inlineToken.TokenType switch
                    {
                        TypescriptTokenType.Identifier => new Markup($"[cyan]{value}[/]"),
                        TypescriptTokenType.Keyword => new Markup($"[blue]{value}[/]"),
                        TypescriptTokenType.StringValue => new Markup($"[green]{value}[/]"),
                        TypescriptTokenType.Number => new Markup($"[magenta]{value}[/]"),
                        TypescriptTokenType.Operator => new Markup($"[yellow]{value}[/]"),
                        TypescriptTokenType.Comment => new Markup($"[grey]{value}[/]"),
                        TypescriptTokenType.Whitespace => new Markup($"[grey]{value}[/]"),
                        _ => new Markup(value)
                    };
                    AnsiConsole.Write(colored);
                };
            }
            else
            {
                // Handle regular markup tokens
                var value = Markup.Escape(token.Value);
                var colored = token.TokenType switch
                {
                    MarkupTokenType.Text => new Markup($"[white]{value}[/]"),
                    MarkupTokenType.Bold => new Markup($"[bold]{value}[/]"),
                    MarkupTokenType.Italic => new Markup($"[italic]{value}[/]"),
                    MarkupTokenType.Heading => new Markup($"[bold blue]{value}[/]"),
                    MarkupTokenType.Link => new Markup($"[blue underline]{value}[/]"),
                    _ => new Markup(value)
                };
                AnsiConsole.Write(colored);
            }

            // Important: wait for inline processing to complete before proceeding
            if (token.Metadata is IInlineMarkupMedata inlineMetadata)
            {
                while (inlineMetadata.IsProcessing)
                {
                    Thread.Sleep(3);
                }
                AnsiConsole.WriteLine();
            }
        });

        await writerTask;
        Console.WriteLine();
        Console.WriteLine("Done.");
    }

    static async Task EmitSlowlyAsync(string markup, Stream output)
    {
        var rng = new Random();
        byte[] bytes = Encoding.UTF8.GetBytes(markup);
        foreach (var b in bytes)
        {
            await output.WriteAsync(new[] { b }.AsMemory(0, 1));
            await output.FlushAsync();
            await Task.Delay(rng.Next(2, 8));
        }
        output.Close(); // EOF
    }
}
Running this writes the showcase to the console, with each heading and code block colorized as its tokens arrive.
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
| .NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.0
  - No dependencies.
NuGet packages (1)
Showing the top 1 NuGet packages that depend on NTokenizers:
| Package | Downloads |
|---|---|
| NTokenizers.Extensions.Spectre.Console: Spectre.Console rendering extensions for NTokenizers (XML, JSON, Markup, TypeScript, C# and SQL), with style-rich console syntax highlighting | |
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.1 | 35 | 12/9/2025 |
| 1.0.0 | 90 | 12/8/2025 |
| 0.8.0-preview | 116 | 12/6/2025 |
| 0.7.0-preview | 111 | 12/6/2025 |
| 0.6.0-preview | 662 | 12/3/2025 |
| 0.5.0-preview | 650 | 12/3/2025 |
| 0.4.0-preview | 650 | 12/2/2025 |
| 0.3.0-preview | 659 | 12/2/2025 |
| 0.2.0-preview | 164 | 11/27/2025 |
| 0.1.0-preview | 380 | 11/19/2025 |
EmitText when encountering a space