TokenizerIda 1.1.0
dotnet add package TokenizerIda --version 1.1.0
TokenizerIda
A tokenizer made in .NET 5.0 with C#.
Description
The Tokenizer class is the tokenizer itself and takes a TokenParser object as a parameter. The TokenParser takes an input string and splits it into tokens based on the grammar the user passed to the TokenParser when creating it. The grammar is an array of TokenType objects, which the user creates through the TokenType constructor.
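As a rough illustration of how a grammar of start-anchored regular expressions can pick the next token from an input string, the sketch below tries each pattern in order against the remaining input. It is not TokenizerIda's source code: `SketchTokenType`, `SketchMatcher` and `MatchFirst` are hypothetical names, and the first-match-wins rule is an assumption rather than documented behaviour.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for a grammar rule: a name plus a start-anchored pattern.
public sealed record SketchTokenType(string Name, string Pattern);

public static class SketchMatcher
{
    // Returns the rule name and matched text for the first rule that matches
    // at the start of `input`, or null when no rule matches.
    // Assumption: rules are tried in array order and the first match wins.
    public static (string Name, string Value)? MatchFirst(SketchTokenType[] grammar, string input)
    {
        foreach (var type in grammar)
        {
            Match match = Regex.Match(input, type.Pattern);
            if (match.Success && match.Length > 0)
            {
                return (type.Name, match.Value);
            }
        }
        return null;
    }
}

public static class MatchDemo
{
    public static void Main()
    {
        var grammar = new[]
        {
            new SketchTokenType("ADD", @"^\+"),
            new SketchTokenType("NUMBER", @"^[0-9]+(\.([0-9])+)?"),
            new SketchTokenType("MUL", @"^\*")
        };
        var result = SketchMatcher.MatchFirst(grammar, "54 * 4");
        Console.WriteLine(result?.Name + " " + result?.Value); // NUMBER 54
    }
}
```

Because every pattern starts with `^`, a rule can only match at the very beginning of the remaining input, so a tokenizer built this way consumes the input from left to right, one token at a time.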
Each time the user moves to the next token with StepToNextToken(), a Token is generated unless it already exists. It can already exist because, after the user has moved back to a previous token with StepToPreviousToken(), the following token has been created earlier. The user can get the currently active token with GetCurrentToken().
Internally, the Tokenizer keeps a linked list to which tokens are added as the user moves the active token forward. If the Tokenizer encounters a token that is not defined in the grammar, a LexicalException is thrown.
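The stepping behaviour described above can be sketched with a cursor over a `LinkedList<string>` that only reads a new token when no next node exists yet. This is a minimal sketch under stated assumptions, not the package's implementation: `LazyTokenCursor`, `ReadCount` and the `"END"` marker string are hypothetical, and staying in place when stepping back from the first token is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a cursor over a lazily grown linked list of tokens.
public sealed class LazyTokenCursor
{
    private const string End = "END"; // assumed marker for "input exhausted"

    private readonly LinkedList<string> _tokens = new LinkedList<string>();
    private readonly Func<string> _readNext; // produces the next token text on demand
    private LinkedListNode<string> _current;

    // Number of times a token was actually read from the source.
    public int ReadCount { get; private set; }

    public LazyTokenCursor(Func<string> readNext)
    {
        _readNext = readNext;
        _current = _tokens.AddLast(Read());
    }

    public string Current => _current.Value;

    public LazyTokenCursor StepToNext()
    {
        // Stepping forward from END stays on END, so END is stored only once.
        if (_current.Value == End) return this;
        // Reuse the node if it was created earlier; otherwise read a new token.
        _current = _current.Next ?? _tokens.AddLast(Read());
        return this;
    }

    public LazyTokenCursor StepToPrevious()
    {
        // Assumption: stepping back from the first token leaves the cursor in place.
        if (_current.Previous != null) _current = _current.Previous;
        return this;
    }

    private string Read()
    {
        ReadCount++;
        return _readNext();
    }
}

public static class CursorDemo
{
    public static void Main()
    {
        var source = new Queue<string>(new[] { "3", "+", "54" });
        var cursor = new LazyTokenCursor(() => source.Count > 0 ? source.Dequeue() : "END");
        cursor.StepToNext().StepToNext().StepToPrevious().StepToNext();
        Console.WriteLine(cursor.Current + " after " + cursor.ReadCount + " reads"); // 54 after 3 reads
    }
}
```

The final `StepToNext()` in the demo does not trigger another read, because the node for `54` was already appended before the cursor stepped back.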
Usage example
// Create the grammar you want to match tokens with.
var arithmeticGrammar = new TokenType[]
{
    new TokenType("ADD", @"^\+"),
    new TokenType("NUMBER", @"^[0-9]+(\.([0-9])+)?"),
    new TokenType("MUL", @"^\*")
};
// Create an instance of the token parser by giving it your grammar and an input to tokenize.
var tokenParser = new TokenParser(arithmeticGrammar, "3 + 54 * 4");
// Create an instance of the tokenizer.
var tokenizer = new Tokenizer(tokenParser);
// When stepping through the input's tokens, the calls can be chained like this.
var token = tokenizer
    .StepToNextToken() // ADD("+")
    .StepToNextToken() // NUMBER("54")
    .StepToNextToken() // MUL("*")
    .StepToPreviousToken() // NUMBER("54")
    .GetCurrentToken();
Console.Write("TokenType: " + token.TokenType.Name + " Value: " + token.Value);
// Output: "TokenType: NUMBER Value: 54"
/* ------------------------------------------------------------------------------------------- */
// If you reach the end of the input, an END token is created and returned by GetCurrentToken().
// Stepping forward from END stays on END. (A fresh Tokenizer over the same input is used here.)
var endToken = new Tokenizer(new TokenParser(arithmeticGrammar, "3 + 54 * 4"))
    .StepToNextToken() // ADD("+")
    .StepToNextToken() // NUMBER("54")
    .StepToNextToken() // MUL("*")
    .StepToNextToken() // NUMBER("4")
    .StepToNextToken() // END("")
    .StepToNextToken() // END("")
    .StepToNextToken() // END("")
    .StepToPreviousToken() // NUMBER("4")
    .StepToNextToken() // END("")
    .GetCurrentToken();
Console.Write("TokenType: " + endToken.TokenType.Name + " Value: " + endToken.Value);
// Output: "TokenType: END Value: "
// You can check whether a token is an END token by calling IsEndToken() on the token itself.
bool isEnd = endToken.IsEndToken(); // true
/* ------------------------------------------------------------------------------------------- */
// The Tokenizer trims away newlines and spaces by default.
// To trim away only spaces, pass 'false' as the third argument when creating the TokenParser.
var spaceOnlyTokenParser = new TokenParser(arithmeticGrammar, "3 + 54 * 4", false);
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net5.0 is compatible. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net5.0: No dependencies.
NuGet packages (1)
Showing the top 1 NuGet packages that depend on TokenizerIda:
| Package | Downloads |
|---|---|
| DocumentParserIda: A parser that takes a string as input and parses it into an object containing paragraphs with sentence objects. | |
Added an option when initializing TokenParser to not trim away newlines from the input. Added a public method on Token to check whether it is an END token.