gg.parse 0.2.0-debug

This is a prerelease version of gg.parse.
There is a newer version of this package available.
See the version list below for details.
.NET CLI:

    dotnet add package gg.parse --version 0.2.0-debug

Package Manager:

    NuGet\Install-Package gg.parse -Version 0.2.0-debug

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:

    <PackageReference Include="gg.parse" Version="0.2.0-debug" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

Central Package Management (CPM):

For projects that support Central Package Management, copy this XML node into the solution Directory.Packages.props file to version the package:

    <PackageVersion Include="gg.parse" Version="0.2.0-debug" />

Then reference the package in the project file, without a version:

    <PackageReference Include="gg.parse" />

Paket CLI:

    paket add gg.parse --version 0.2.0-debug

Script & Interactive:

    #r "nuget: gg.parse, 0.2.0-debug"

The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

File-based apps:

    #:package gg.parse@0.2.0-debug

The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

Cake, installed as a Cake Addin:

    #addin nuget:?package=gg.parse&version=0.2.0-debug&prerelease

Cake, installed as a Cake Tool:

    #tool nuget:?package=gg.parse&version=0.2.0-debug&prerelease

gg.parse 0.2

dotnet add package gg.parse.script --version 0.2.0

Please note: the code base is under development and changes frequently. The documentation below may be out of date.

gg.parse is a C# project that aims to provide a library for tokenization and parsing, together with EBNF-like scripting tools, to make parsing both simple and complex data easy.

License

MIT

Goals, use cases and otherwise

The goal of the gg.parse project is to provide a library for tokenization and parsing, together with EBNF-like scripting tools, so that parsing simple and complex data is easy to do both programmatically and via an interpreted scripting language. The project also aims to be a lightweight framework that is easy to understand and use.

gg.parse is an LL(k) parser. There are many parsers out there but this one is, according to the license, ours as well.

Quickstart

Core concepts:

  • A Rule implements a function to parse data (text) and create one or more Annotations.
  • An Annotation describes what data is intended to mean, as expressed by a rule. An annotation describes a specific part of the data by way of a Range. A range is a position in the data plus a length. An annotation is also a tree node whose children may give further insight into the details of the data in question.
  • A collection of Rules makes up a Rule Graph. A Rule Graph can parse data (commonly, but not necessarily, text) and map the data to one or more Annotations. Depending on the use case, the collection of Annotations can be used either as Tokens or as an Abstract Syntax Tree.
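
To make the three core concepts concrete, here is a minimal, self-contained sketch. All types in it (DataRange, Annotation, Rule) and the Literal helper are hypothetical stand-ins written for this illustration; they are not the gg.parse classes, and the real signatures may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; these are NOT the gg.parse classes.
var data = "let x";
Rule keyword = Literal("keyword", "let");

Annotation? match = keyword(data, 0);
Console.WriteLine(match is null
    ? "no match"
    : $"{match.Name} at {match.Range.Start}, length {match.Range.Length}");

// Build a rule that matches a fixed piece of text at the given offset.
static Rule Literal(string name, string text) =>
    (input, start) =>
        string.CompareOrdinal(input, start, text, 0, text.Length) == 0
            ? new Annotation(name, new DataRange(start, text.Length))
            : null;

// A range is a start position in the data plus a length.
record DataRange(int Start, int Length);

// An annotation names a range of the data; its children refine the details.
record Annotation(string Name, DataRange Range)
{
    public List<Annotation> Children { get; } = new();
}

// A rule inspects the data at an offset and returns an annotation, or null on failure.
delegate Annotation? Rule(string input, int start);
```

In this sketch a rule graph would simply be a set of such rules that refer to each other; the annotations they return form either a flat token list or a tree, depending on how the rules are nested.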

Extended concepts:

  • A set of common rules (literal, sequence, not...) to quickly build tokenizers and parsers.
  • A tokenizer/parser/compiler which can build a tokenizer and/or parser from a high-level EBNF-like script.
  • A facade-like class, gg.parse.script.ParserBuilder, which combines all of the above in a single convenient class.
  • Pruning as a first-class citizen, to generate lean syntax trees.
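
As a rough illustration of what pruning means for a syntax tree, the sketch below removes marked nodes and promotes their children to the parent. The Node class and the Prune function are hypothetical, and treating a leading '#' in a rule name as the pruning marker is an assumption made for this example; gg.parse's actual pruning rules are described in its pruning documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical node type and pruning pass; NOT the gg.parse implementation.
// Assumption: a leading '#' marks a node whose children are promoted to its parent.
var tree = new Node("path",
    new Node("path_part"),
    new Node("#path_chain",
        new Node("#path_chain_part",
            new Node("path_part"))));

var pruned = Prune(tree).Single();
Console.WriteLine(string.Join(", ", pruned.Children.Select(c => c.Name)));
// prints "path_part, path_part"

// Return the node with pruned children, or only its pruned children
// when the node itself is marked for pruning.
static IEnumerable<Node> Prune(Node node)
{
    var children = node.Children.SelectMany(child => Prune(child)).ToArray();
    return node.Name.StartsWith('#')
        ? children
        : new[] { new Node(node.Name, children) };
}

class Node
{
    public string Name { get; }
    public Node[] Children { get; }

    public Node(string name, params Node[] children)
    {
        Name = name;
        Children = children;
    }
}
```

The pruned tree keeps only the nodes a consumer cares about, so later processing steps do not have to skip over purely structural nodes.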

Example

Programmatically create a tokenizer that tokenizes (simplified) filenames in a text (see gg.parse.doc.examples.test\CreateFilenameTokenizer.cs):

        public class FilenameTokenizer : CommonTokenizer
        {
            public FilenameTokenizer()
            {
                var letter = OneOf(UpperCaseLetter(), LowerCaseLetter());
                var number = InRange('0', '9');
                var specialCharacters = InSet("_-~()[]{}+=@!#$%&'`.".ToArray());
                var separator = InSet("\\/".ToArray());
                var drive = Sequence("drive", letter, Literal(":"), separator);
                var pathPart = OneOrMore("path_part", OneOf(letter, number, specialCharacters));
                var pathChain = ZeroOrMore("#path_chain", Sequence("#path_chain_part", separator, pathPart));
                var path = Sequence("path", pathPart, pathChain);
                var filename = Sequence("filename", drive, path);
                var findFilename = Skip(filename, failOnEoF: false);

                Root = OneOrMore("#filenames", Sequence("#find_filename", findFilename, filename));
            }
        }

        ...

        var filename = "c:\\users\\text.txt";
        var data = $"find the filename {filename} in this line.";           
        var tokens = new FilenameTokenizer().Tokenize(data);
            
        IsTrue(tokens[0].GetText(data) == filename);

        IsTrue(tokens[0] == "filename");
        IsTrue(tokens[0][0] == "drive");
        IsTrue(tokens[0][1] == "path");
        IsTrue(tokens[0][1][0] == "path_part");
        IsTrue(tokens[0][1][1] == "path_part");

Doing the same using a script (see gg.parse.doc.examples.test\CreateFilenameTokenizer.cs):


    // note this can also be read from a separate file
    public static readonly string _filenameScript =
      "-r filenames       = +(find_filename, filename);\n" +
      "-a find_filename   = >>> filename;\n" +
      "filename           = drive, path;\n" +
      "drive              = letter, ':', separator;\n" +
      "path               = path_part, *(-a separator, path_part);\n" +
      "path_part          = +(letter | number | special_character);\n" +
      "letter             = {'a'..'z'} | {'A'..'Z'};\n" +
      "number             = {'0'..'9'};\n" +
      "separator          = {'\\\\/'};\n" +
      "special_character  = {\"_-~()[]{}+=@!#$%&`.'\"};\n";

    ...

    var filename = "c:\\users\\text.txt";
    var data = $"find the filename {filename} in this line.";
    var tokens = new ParserBuilder().From(_filenameScript).Tokenize(data);

    IsTrue(tokens[0].GetText(data) == filename);

    IsTrue(tokens[0] == "filename");
    IsTrue(tokens[0][0] == "drive");
    IsTrue(tokens[0][1] == "path");
    IsTrue(tokens[0][1][0] == "path_part");
    IsTrue(tokens[0][1][1] == "path_part");

Project structure

The project is organized into three main parts:

  1. Core: Core classes (e.g. IRule, RuleGraph and Annotation) as well as the basic rules (e.g. Sequence, data matchers).
  2. Script: Everything related to the scripting framework: parsers, tokenizers and the compiler.
  3. Examples: Various examples to demonstrate (and test) the framework.

Each of these parts has its own corresponding test project.

Rule References

Data rules

Match Any (.)
Match Data Range ({'a'..'z'})
Match Data Sequence ('abc')

Meta rules

Match Evaluation (a/b/c)
Match Count (*, +, ?)
Rule reference (foo)

Look-ahead rules

Match Condition (if ...)
Match Not (!foo ...)
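
The three rule families above can be illustrated with a small set of hand-written combinators. This is a sketch only: the Rule delegate and the functions InRange, Literal, Sequence, OneOf, OneOrMore and Not are hypothetical and defined here for the example; they share names with some gg.parse helpers but are not the library's API and return a plain match length instead of annotations.

```csharp
using System;

// Hypothetical combinators for illustration; NOT the gg.parse API.
// Each rule returns the number of characters it consumed, or -1 on failure.
Rule digit = InRange('0', '9');
Rule number = OneOrMore(digit);
// Look-ahead: a number that is not directly followed by a lower-case letter.
Rule standaloneNumber = Sequence(number, Not(InRange('a', 'z')));
Rule boolean = OneOf(Literal("true"), Literal("false"));

Console.WriteLine(standaloneNumber("123 apples", 0)); // prints 3
Console.WriteLine(standaloneNumber("123abc", 0));     // prints -1
Console.WriteLine(boolean("false!", 0));              // prints 5

// Data rule: match one character inside an inclusive range.
static Rule InRange(char min, char max) =>
    (s, i) => i < s.Length && s[i] >= min && s[i] <= max ? 1 : -1;

// Data rule: match an exact sequence of characters.
static Rule Literal(string text) =>
    (s, i) => string.CompareOrdinal(s, i, text, 0, text.Length) == 0 ? text.Length : -1;

// Meta rule: every rule must match, one after the other.
static Rule Sequence(params Rule[] rules) => (s, i) =>
{
    var pos = i;
    foreach (var rule in rules)
    {
        var n = rule(s, pos);
        if (n < 0) return -1;
        pos += n;
    }
    return pos - i;
};

// Meta rule: try the alternatives in order and take the first that matches.
static Rule OneOf(params Rule[] rules) => (s, i) =>
{
    foreach (var rule in rules)
    {
        var n = rule(s, i);
        if (n >= 0) return n;
    }
    return -1;
};

// Meta rule: repeat a rule as often as possible, requiring at least one match.
static Rule OneOrMore(Rule rule) => (s, i) =>
{
    var pos = i;
    for (var n = rule(s, pos); n > 0; n = rule(s, pos)) pos += n;
    return pos > i ? pos - i : -1;
};

// Look-ahead rule: succeed without consuming input only if the inner rule fails.
static Rule Not(Rule rule) => (s, i) => rule(s, i) < 0 ? 0 : -1;

delegate int Rule(string input, int start);
```

Note that the look-ahead rule reports a match of length 0: it only checks what follows and leaves the position unchanged, which is why the sequence above still reports a length of 3.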

More information

Rule graph: details on working with rule graphs and on their implementation.

Pruning: details on how to keep your syntax tree lean.

Extending parse script: the steps required to add a new rule to the script.

To do list: a list of all planned and unplanned tasks.

Compatible and additional computed target framework versions

  • Compatible target framework (included in the package): net9.0
  • Computed target frameworks: net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows
  • net9.0: no dependencies.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on gg.parse:

Package Downloads
gg.parse.script

gg.parse is a C# project that aims to provide a library for tokenization and parsing, together with EBNF-like scripting tools, to make parsing both simple and complex data easy.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version       Downloads   Last Updated
0.2.0         201         10/31/2025
0.2.0-debug   173         10/31/2025
0.1.0         255         10/21/2025   (deprecated because it is no longer maintained)