TokenizerIda 1.1.0

dotnet add package TokenizerIda --version 1.1.0
                    

TokenizerIda

A tokenizer made in .NET 5.0 with C#.

Description

The Tokenizer class is the tokenizer itself and takes a TokenParser object as a parameter. The TokenParser takes an input string and splits it into tokens based on the grammar the user passed to the TokenParser when creating it. The grammar is an array of TokenType objects, each of which the user creates through the TokenType constructor from a name and a regular expression.

Each time the user moves to the next token with StepToNextToken(), a Token is generated unless it already exists. It can already exist because, once the user has moved back to a previous token with StepToPreviousToken(), the next token is already defined. The user can get the currently active token with GetCurrentToken().

The Tokenizer stores its tokens in a linked list, to which tokens are added as the user moves the active token forward. If the Tokenizer encounters a token that is not defined in the grammar, a LexicalException is thrown.
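The behaviour described above (tokens created lazily, cached in a linked list, and revisited when stepping back) can be illustrated with a minimal, self-contained sketch. It is not the TokenizerIda implementation: SketchTokenType, SketchToken, SketchTokenizer, Next(), and Previous() are hypothetical names chosen for illustration, and a FormatException stands in for the package's LexicalException.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical types for illustration only; these are not the TokenizerIda classes.
public record SketchTokenType(string Name, string Pattern);
public record SketchToken(SketchTokenType Type, string Value);

public class SketchTokenizer
{
    private static readonly SketchTokenType End = new("END", "");
    private readonly SketchTokenType[] _grammar;
    private readonly LinkedList<SketchToken> _tokens = new();
    private LinkedListNode<SketchToken> _current;
    private string _rest;

    public SketchTokenizer(SketchTokenType[] grammar, string input)
    {
        _grammar = grammar;
        _rest = input;
        // The first token is read immediately and becomes the active token.
        _current = _tokens.AddLast(ReadToken());
    }

    public SketchToken Current => _current.Value;

    public SketchTokenizer Next()
    {
        if (_current.Next != null)
        {
            // The next token was created on an earlier pass: reuse it.
            _current = _current.Next;
        }
        else if (!ReferenceEquals(Current.Type, End))
        {
            // Otherwise read one more token from the remaining input.
            _current = _tokens.AddLast(ReadToken());
        }
        // When END is already active, stepping forward stays on END.
        return this;
    }

    public SketchTokenizer Previous()
    {
        if (_current.Previous != null) _current = _current.Previous;
        return this;
    }

    private SketchToken ReadToken()
    {
        _rest = _rest.TrimStart();
        if (_rest.Length == 0) return new SketchToken(End, "");

        foreach (var type in _grammar)
        {
            // Only accept a non-empty match at the very start of the remaining input.
            var match = Regex.Match(_rest, type.Pattern);
            if (match.Success && match.Index == 0 && match.Length > 0)
            {
                _rest = _rest.Substring(match.Length);
                return new SketchToken(type, match.Value);
            }
        }

        // Stand-in for the package's LexicalException.
        throw new FormatException($"No token type matches '{_rest}'");
    }
}
```

Because every step returns the tokenizer itself, calls can be chained, and stepping back and forward again never re-reads the input: the cached linked-list node is reused.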

Usage example

// Create the grammar you want to match tokens with.
var arithmeticGrammar = new TokenType[]
    {
        new TokenType("ADD", @"^\+"),
        new TokenType("NUMBER", @"^[0-9]+(\.([0-9])+)?"),
        new TokenType("MUL", @"^\*")
    };
    
// Create an instance of the token parser by giving it your grammar and an input to tokenize.
var tokenParser = new TokenParser(arithmeticGrammar, "3 + 54 * 4");

// Create an instance of the tokenizer.
var tokenizer = new Tokenizer(tokenParser);

// When stepping through the input's tokens it's possible to chain the calls like this.
// The first token, NUMBER("3"), is already active, so the first step moves to ADD("+").
var token = tokenizer
    .StepToNextToken() // ADD("+")
    .StepToNextToken() // NUMBER("54")
    .StepToNextToken() // MUL("*")
    .StepToPreviousToken() // NUMBER("54")
    .GetCurrentToken();
    
Console.Write("TokenType: " + token.TokenType.Name + " Value: " + token.Value);
// Output: "TokenType: NUMBER Value: 54"

/* ------------------------------------------------------------------------------------------- */

// When you reach the end of the input, an END token is created and returned by GetCurrentToken().
var token = tokenizer
    .StepToNextToken() // ADD("+")
    .StepToNextToken() // NUMBER("54")
    .StepToNextToken() // MUL("*")
    .StepToNextToken() // NUMBER("4")
    .StepToNextToken() // END("")
    .StepToNextToken() // END("")
    .StepToNextToken() // END("")
    .StepToPreviousToken() // NUMBER("4")
    .StepToNextToken() // END("")
    .GetCurrentToken();
    
Console.Write("TokenType: " + token.TokenType.Name + " Value: " + token.Value);
// Output: "TokenType: END Value: "

// You can check whether a token is an END token by calling IsEndToken() on the token itself.
token.IsEndToken(); // true

/* ------------------------------------------------------------------------------------------- */

// The Tokenizer trims away newlines and spaces by default.
// To trim away only spaces, pass 'false' when creating the TokenParser.
var tokenParser = new TokenParser(arithmeticGrammar, "3 + 54 * 4", false);
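To make the difference between the two trimming modes concrete, here is a small self-contained sketch. It is an assumption about the observable behaviour, not the package's code: TrimSketch and SkipLeading are hypothetical names, and the sketch treats only spaces, '\r', and '\n' as trimmable.

```csharp
using System;

// Hypothetical helper for illustration only; not part of TokenizerIda.
public static class TrimSketch
{
    // Drops leading characters before the next token is matched.
    // trimNewlines == true:  spaces and newlines are skipped.
    // trimNewlines == false: only spaces are skipped, so a newline stays in the input.
    public static string SkipLeading(string rest, bool trimNewlines) =>
        trimNewlines ? rest.TrimStart(' ', '\r', '\n') : rest.TrimStart(' ');
}
```

With the input " \n + 4", SkipLeading(..., true) leaves "+ 4", while SkipLeading(..., false) leaves "\n + 4". Under this assumption, a grammar used with newline trimming turned off would presumably need its own token type for newlines (for example a hypothetical new TokenType("NEWLINE", @"^\r?\n")), since an unmatched newline would otherwise be reported as an unknown token.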

Target frameworks
  • net5.0 is compatible and is the framework included in the package.
  • Computed as compatible: net5.0-windows; net6.0, net7.0, net8.0, net9.0, and net10.0, each with its android, ios, maccatalyst, macos, tvos, and windows variants; and the browser variant for net8.0 and later.

Dependencies
  • net5.0: No dependencies.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on TokenizerIda:

Package Downloads
DocumentParserIda

A parser that takes a string as input and parses it into an object containing paragraphs with sentences objects.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version  Downloads  Last Updated
1.1.0    591        10/15/2021
1.0.0    593        9/27/2021

Added option when initializing TokenParser to not trim away newlines from the input. Added public method on Token to see if it is an END token.