gg.parse 0.2
dotnet add package gg.parse.script --version 0.2.0
Please note: the code base is under development and changes frequently. The documentation below may be out of date.
gg.parse is a C# project that aims to provide a library for tokenization and parsing, along with EBNF-like scripting tools, to make parsing simple and complex data easy.
License
Goals, use cases and otherwise
The goal of the gg.parse project is to provide a library for tokenization and parsing, along with EBNF-like scripting tools, to make parsing simple and complex data easy, both programmatically and via an interpreted scripting language. The project also aims to be a lightweight framework that is easy to understand and use.
gg.parse is an LL(k) parser. There are many parsers out there but this one is, according to the license, ours as well.
Quickstart
Core concepts:
- A `Rule` implements a function to parse data (text) and create one or more `Annotations`.
- An `Annotation` describes what data is intended to mean, as expressed by a rule. An annotation describes a specific part of the data by way of a `Range`. A range is a position in the data and its length. Furthermore, an annotation is a tree node whose children may give further insight into the details of the data in question.
- A collection of `Rules` makes up a `Rule Graph`. A `Rule Graph` can parse data (commonly, but not necessarily, text) and map the data to one or more `Annotations`. Depending on the use case, the collection of `Annotations` can be used either as `Tokens` or as an `Abstract Syntax Tree`.
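To make the relationship between annotations and ranges concrete, here is a minimal sketch of the data model described above. The types `SketchRange` and `SketchAnnotation` are hypothetical stand-ins written for this illustration; they are not the gg.parse classes and the real API may differ.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; not the gg.parse types.
// A range is a start position in the data plus a length.
public readonly record struct SketchRange(int Start, int Length);

// An annotation names the rule that matched, the range it covers,
// and any child annotations that describe the match in more detail.
public sealed class SketchAnnotation
{
    public string RuleName { get; }
    public SketchRange Range { get; }
    public List<SketchAnnotation> Children { get; } = new();

    public SketchAnnotation(string ruleName, SketchRange range, params SketchAnnotation[] children)
    {
        RuleName = ruleName;
        Range = range;
        Children.AddRange(children);
    }

    // Resolve the annotated slice of the original data.
    public string GetText(string data) => data.Substring(Range.Start, Range.Length);
}

public static class AnnotationSketch
{
    // Hand-built tree for the data "key=value": a "pair" with a "key" and a "value" child.
    public static SketchAnnotation BuildPair() =>
        new("pair", new SketchRange(0, 9),
            new SketchAnnotation("key", new SketchRange(0, 3)),
            new SketchAnnotation("value", new SketchRange(4, 5)));
}
```

Because an annotation stores only a range, the text it refers to is looked up in the original data on demand, which is the same idea as the `GetText(data)` call in the example further down.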
Extended concepts:
- A set of common rules (literal, sequence, not...) to quickly build tokenizers and parsers.
- A tokenizer/parser/compiler which can build a tokenizer and/or parser based on a high-level EBNF-like script.
- A facade-like class, `gg.parse.script.ParserBuilder`, which combines all of the above in a single convenient class.
- Pruning as a first-class citizen to generate a lean syntax tree.
Example
Programmatically create a tokenizer to tokenize (simplified) filenames in a text (see `gg.parse.doc.examples.test\CreateFilenameTokenizer.cs`):
```csharp
public class FilenameTokenizer : CommonTokenizer
{
    public FilenameTokenizer()
    {
        var letter = OneOf(UpperCaseLetter(), LowerCaseLetter());
        var number = InRange('0', '9');
        var specialCharacters = InSet("_-~()[]{}+=@!#$%&'`.".ToArray());
        var separator = InSet("\\/".ToArray());
        var drive = Sequence("drive", letter, Literal(":"), separator);
        var pathPart = OneOrMore("path_part", OneOf(letter, number, specialCharacters));
        var pathChain = ZeroOrMore("#path_chain", Sequence("#path_chain_part", separator, pathPart));
        var path = Sequence("path", pathPart, pathChain);
        var filename = Sequence("filename", drive, path);
        var findFilename = Skip(filename, failOnEoF: false);

        Root = OneOrMore("#filenames", Sequence("#find_filename", findFilename, filename));
    }
}

// ...

var filename = "c:\\users\\text.txt";
var data = $"find the filename {filename} in this line.";
var tokens = new FilenameTokenizer().Tokenize(data);

IsTrue(tokens[0].GetText(data) == filename);
IsTrue(tokens[0] == "filename");
IsTrue(tokens[0][0] == "drive");
IsTrue(tokens[0][1] == "path");
IsTrue(tokens[0][1][0] == "path_part");
IsTrue(tokens[0][1][1] == "path_part");
```
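In the tokenizer above, the rules named with a leading `#` (`#path_chain`, `#path_chain_part`, `#filenames`, `#find_filename`) do not show up in the asserted token tree, while their named children do. Assuming this is the pruning behaviour at work, the idea can be sketched as follows. `SketchNode` and `PruneSketch` are hypothetical names for this illustration, the tree is reduced to the named nodes only, and none of this is the gg.parse implementation.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical node type for illustration only; not the gg.parse Annotation class.
public sealed record SketchNode(string Name, IReadOnlyList<SketchNode> Children)
{
    public static SketchNode Leaf(string name) => new(name, new List<SketchNode>());
    public static SketchNode Of(string name, params SketchNode[] children) => new(name, children);
}

public static class PruneSketch
{
    // Assumption: a node whose name starts with '#' is dropped and
    // replaced by its (already pruned) children; other nodes are kept.
    public static IEnumerable<SketchNode> Prune(SketchNode node)
    {
        var children = node.Children.SelectMany(Prune).ToList();

        if (node.Name.StartsWith('#'))
        {
            return children;
        }

        return new[] { new SketchNode(node.Name, children) };
    }

    // Unpruned shape of the "path" rule from the example, named nodes only.
    public static SketchNode UnprunedPath() =>
        SketchNode.Of("path",
            SketchNode.Leaf("path_part"),
            SketchNode.Of("#path_chain",
                SketchNode.Of("#path_chain_part", SketchNode.Leaf("path_part"))));
}
```

Pruning `UnprunedPath()` yields a single `path` node with two `path_part` children, which matches the shape asserted by `tokens[0][1][0]` and `tokens[0][1][1]`.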
Doing the same using a script (see gg.parse.doc.examples.test\CreateFilenameTokenizer.cs):
```csharp
// note this can also be read from a separate file
public static readonly string _filenameScript =
    "-r filenames = +(find_filename, filename);\n" +
    "-a find_filename = >>> filename;\n" +
    "filename = drive, path;\n" +
    "drive = letter, ':', separator;\n" +
    "path = path_part, *(-a separator, path_part);\n" +
    "path_part = +(letter | number | special_character);\n" +
    "letter = {'a'..'z'} | {'A'..'Z'};\n" +
    "number = {'0'..'9'};\n" +
    "separator = {'\\\\/'};\n" +
    "special_character = {\"_-~()[]{}+=@!#$%&`.'\"};\n";

// ...

var filename = "c:\\users\\text.txt";
var data = $"find the filename {filename} in this line.";
var tokens = new ParserBuilder().From(_filenameScript).Tokenize(data);

IsTrue(tokens[0].GetText(data) == filename);
IsTrue(tokens[0] == "filename");
IsTrue(tokens[0][0] == "drive");
IsTrue(tokens[0][1] == "path");
IsTrue(tokens[0][1][0] == "path_part");
IsTrue(tokens[0][1][1] == "path_part");
```
Project structure
The project consists of three main parts:
- Core: Core classes (e.g. IRule, RuleGraph and Annotation) as well as the basic rules (e.g. Sequence, data matchers).
- Script: Everything related to the scripting framework: parsers, tokenizers and the compiler.
- Examples: Various examples to demonstrate (and test) the framework.
Each of these parts has its own corresponding test project.
Rule References
Data rules
Match Any (.)
Match Data Range ({'a'..'z'})
Match Data Sequence ('abc')
Meta rules
Match Evaluation (a/b/c)
Match Count (*, +, ?)
Rule reference (foo)
Look-ahead rules
Match Condition (if ...)
Match Not (!foo ...)
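As a rough illustration of how a few of the rule kinds listed above behave, the following sketch models a rule as a function from data and a start position to a matched length. The delegate `SketchRule` and the class `RuleSketch` are hypothetical and written only for this example; gg.parse rules produce annotations and are implemented differently.

```csharp
// Hypothetical mini-combinators for illustration only; not the gg.parse implementation.
// A rule returns the number of characters matched at pos, or null when it fails.
public delegate int? SketchRule(string data, int pos);

public static class RuleSketch
{
    // Match Any (.): any single character.
    public static SketchRule Any() =>
        (d, p) => p < d.Length ? 1 : (int?)null;

    // Match Data Range ({'a'..'z'}): one character within [min, max].
    public static SketchRule InRange(char min, char max) =>
        (d, p) => p < d.Length && d[p] >= min && d[p] <= max ? 1 : (int?)null;

    // Match Data Sequence ('abc'): the exact text at the current position.
    public static SketchRule Literal(string text) =>
        (d, p) => p + text.Length <= d.Length
                  && string.CompareOrdinal(d, p, text, 0, text.Length) == 0
            ? text.Length
            : (int?)null;

    // Sequence: every rule must match, one after another.
    public static SketchRule Sequence(params SketchRule[] rules) => (d, p) =>
    {
        var total = 0;
        foreach (var rule in rules)
        {
            var matched = rule(d, p + total);
            if (matched is null) return null;
            total += matched.Value;
        }
        return total;
    };

    // Match Count (*): zero or more repetitions; never fails.
    // The n > 0 guard stops the loop on zero-length matches.
    public static SketchRule ZeroOrMore(SketchRule rule) => (d, p) =>
    {
        var total = 0;
        while (rule(d, p + total) is int n && n > 0) total += n;
        return total;
    };

    // Match Not (!foo): look-ahead that succeeds, consuming nothing, when the rule fails.
    public static SketchRule Not(SketchRule rule) =>
        (d, p) => rule(d, p) is null ? 0 : (int?)null;

    // Example: lowercase word that does not start with "if".
    public static SketchRule WordNotStartingWithIf() =>
        Sequence(Not(Literal("if")), InRange('a', 'z'), ZeroOrMore(InRange('a', 'z')));
}
```

Applied to `"foo1"` at position 0, `WordNotStartingWithIf()` matches three characters; applied to `"if"`, the look-ahead fails and the whole sequence returns null.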
More information
- Rule graph: details on working with rule graphs and on their implementation.
- Pruning: details on how to keep your syntax tree lean.
- Extending parse script: steps required to add a new rule to the script.
- To do list: a list of all planned or unplanned tasks.
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net9.0: No dependencies.
| Version | Downloads | Last Updated |
|---|---|---|
| 0.2.0 | 201 | 10/31/2025 |
| 0.2.0-debug | 173 | 10/31/2025 |
| 0.1.0 | 255 | 10/21/2025 |