DAWG (Directed Acyclic Word Graph) is a data structure for storing and searching large word lists while keeping the memory footprint small and lookups fast. DawgSharp is an open-source C# implementation featuring a linear-time graph reduction algorithm and out-of-the-box persistence support.
The Dawg class is nearly as fast as a HashSet for lookups and far more memory-efficient (factors of 30x to 40x are not uncommon). In a benchmark application it held two million words while consuming only 2 MB of RAM. That's only one byte per word! And it's even less on disk.
The Dawg class can be thought of as a read-only Dictionary<string, Value>: it has a ["string"] indexer and implements IEnumerable<KeyValuePair<string, Value>>. One other very useful feature of Dawg (not found in Dictionary) is the ability to quickly find all words that start with a particular prefix: dawg.MatchPrefix("star") could yield "star", "starch", "start", "starting", etc.
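As a rough illustration of the dictionary-like usage described above, here is a minimal sketch. It assumes the builder API as shown in the project's README (`DawgBuilder<bool>`, `Insert`, `BuildDawg`); none of those three names appear in this description, so check them against the version you install. The word list and the `DawgExample` class are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using DawgSharp;

static class DawgExample
{
    // Builds a small DAWG. The bool payload only marks membership.
    // Assumption: DawgBuilder<T>.Insert and BuildDawg as in the README.
    public static Dawg<bool> Build(IEnumerable<string> words)
    {
        var builder = new DawgBuilder<bool>();
        foreach (string word in words)
        {
            builder.Insert(word, true);
        }
        return builder.BuildDawg();
    }

    static void Main()
    {
        Dawg<bool> dawg = Build(new[] { "star", "starch", "start", "starting", "stop" });

        // Indexer lookup, as with a read-only dictionary.
        Console.WriteLine(dawg["start"]);

        // Prefix search: enumerates the key/value pairs whose key starts with "star".
        foreach (KeyValuePair<string, bool> pair in dawg.MatchPrefix("star"))
        {
            Console.WriteLine(pair.Key);
        }
    }
}
```

A bool payload is enough when the DAWG is used as a plain word set; any other value type can be stored by changing the type argument.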
This package is provided under the terms of the GNU GPL v3. Source code and documentation are available on GitHub: https://github.com/bzaar/DawgSharp. Commercial licenses are also available at http://morpher.co.uk/
Package Manager: Install-Package DawgSharp -Version 1.3.0
.NET CLI: dotnet add package DawgSharp --version 1.3.0
PackageReference: <PackageReference Include="DawgSharp" Version="1.3.0" />
Paket CLI: paket add DawgSharp --version 1.3.0
- XML comments file added.
- New method 'Dawg.GetPrefixes' for splitting prefixes off a word.
- New method 'DawgBuilder.BuildYaleDawg' for building the optimized DAWG without having to save it to disk.
- Support for .NET Standard 1.2 and .NET 3.5 (in addition to .NET 4.0). Apps targeting .NET 4.5+ will work with the .NET 4.0 version.
This release only has additive API changes. The file format has not changed.
- No dependencies.
- No dependencies.
- NETStandard.Library (>= 1.6.1)