DawgSharp 1.2.0

DAWG (Directed Acyclic Word Graph) is a data structure for storing and searching large word lists while keeping the memory footprint small and lookups fast. DawgSharp is an open-source C# implementation featuring a linear-time graph reduction algorithm and out-of-the-box persistence support.

The two main classes in the library are Dawg and DawgBuilder. Dawg is immutable: use DawgBuilder to build a Dawg, save it to a stream, and then call Dawg.Load to rehydrate the data. Once reloaded, the Dawg is backed by a completely different internal data structure (though, oddly, it is still the same class) that is nearly as fast as a HashSet for lookups and far more memory-efficient (factors of 30x to 40x are not uncommon). Note that the Save/Load step is necessary to get the full potential out of the Dawg object. Use a MemoryStream if you do not want to touch the disk.
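The build, save and load cycle described above might look roughly like the sketch below. It assumes that SaveTo and Load accept a stream plus a delegate that writes or reads the payload; the exact overloads can differ between DawgSharp versions, so check them against the version you install. RoundTripExample and BuildAndReload are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using DawgSharp;

public static class RoundTripExample
{
    // Builds a Dawg from the given words, saves it to an in-memory
    // stream and loads it back to get the compact representation.
    public static Dawg<bool> BuildAndReload(string[] words)
    {
        var builder = new DawgBuilder<bool>();
        foreach (string word in words)
        {
            builder.Insert(word, true);
        }

        Dawg<bool> built = builder.BuildDawg();

        using (var stream = new MemoryStream())
        {
            // Assumed overload: stream plus a payload writer delegate.
            built.SaveTo(stream, (writer, value) => writer.Write(value));

            // Rewind before reading back what was just written.
            stream.Position = 0;

            // Assumed overload: stream plus a payload reader delegate.
            return Dawg<bool>.Load(stream, reader => reader.ReadBoolean());
        }
    }

    public static void Main()
    {
        Dawg<bool> dawg = BuildAndReload(new[] { "star", "starch", "start" });
        Console.WriteLine(dawg["start"]);
    }
}
```

A MemoryStream keeps the whole round trip in memory; replacing it with a FileStream would persist the data to disk instead.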

The Dawg class can be thought of as a read-only Dictionary <string, Value>. It has a string indexer (dawg ["key"]) and implements IEnumerable <KeyValuePair <string, Value>>.
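As an illustration of the dictionary-like interface, the sketch below stores each word's length as its value, then reads one value through the indexer and enumerates all pairs. DictionaryLikeExample and BuildLengths are hypothetical names, and the SaveTo and Load overloads with payload delegates are an assumption that may not match every DawgSharp version.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using DawgSharp;

public static class DictionaryLikeExample
{
    // Maps every word to its length and returns the reloaded Dawg.
    public static Dawg<int> BuildLengths(string[] words)
    {
        var builder = new DawgBuilder<int>();
        foreach (string word in words)
        {
            builder.Insert(word, word.Length);
        }

        using (var stream = new MemoryStream())
        {
            // Assumed overloads taking payload writer/reader delegates.
            builder.BuildDawg().SaveTo(stream, (writer, value) => writer.Write(value));
            stream.Position = 0;
            return Dawg<int>.Load(stream, reader => reader.ReadInt32());
        }
    }

    public static void Main()
    {
        Dawg<int> dawg = BuildLengths(new[] { "star", "starch", "start" });

        // Indexer lookup, as with a Dictionary.
        Console.WriteLine(dawg["starch"]);

        // Enumeration yields key/value pairs.
        foreach (KeyValuePair<string, int> pair in dawg)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```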

One other very useful feature of Dawg (not found in Dictionary) is the ability to quickly find all words that start with a particular prefix: dawg.MatchPrefix ("star") might yield "star", "starch", "start", "starting", etc.
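A prefix search could be wrapped as in the sketch below, which collects the keys returned by MatchPrefix into a list. It assumes MatchPrefix yields key/value pairs, in line with the enumeration interface described above, and that SaveTo and Load take payload delegates. PrefixExample and WordsStartingWith are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DawgSharp;

public static class PrefixExample
{
    // Returns all stored words that begin with the given prefix.
    public static List<string> WordsStartingWith(string[] words, string prefix)
    {
        var builder = new DawgBuilder<bool>();
        foreach (string word in words)
        {
            builder.Insert(word, true);
        }

        using (var stream = new MemoryStream())
        {
            // Assumed overloads taking payload writer/reader delegates.
            builder.BuildDawg().SaveTo(stream, (writer, value) => writer.Write(value));
            stream.Position = 0;
            Dawg<bool> dawg = Dawg<bool>.Load(stream, reader => reader.ReadBoolean());

            // Assumed to yield KeyValuePair<string, bool> items.
            return dawg.MatchPrefix(prefix).Select(pair => pair.Key).ToList();
        }
    }

    public static void Main()
    {
        var words = new[] { "moon", "star", "starch", "start", "starting" };
        foreach (string match in WordsStartingWith(words, "star"))
        {
            Console.WriteLine(match);
        }
    }
}
```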

This package is provided under the terms of the GNU GPL v3. Source code and documentation are available on GitHub: https://github.com/bzaar/DawgSharp. Commercial licenses are also available.

There is a newer version of this package available.
See the version list below for details.
Package Manager: Install-Package DawgSharp -Version 1.2.0
.NET CLI: dotnet add package DawgSharp --version 1.2.0
PackageReference (copy this XML node into the project file): <PackageReference Include="DawgSharp" Version="1.2.0" />
Paket CLI: paket add DawgSharp --version 1.2.0

Release Notes

This version has been optimized to use eight times less RAM than the previous version. In a typical benchmark, it used 2.5 MB of RAM to store 2.5 million words (yes, about one byte per word) while maintaining a lookup speed of around one million words per second. The .NET Dictionary object uses 87 MB of RAM under the same conditions.

The new version is fully compatible with the previous version on both source-code and binary levels and it will happily read files produced by its predecessor.
The assembly is now CLS-compliant which ensures it can be used from VB.NET and F#.
The assembly has been signed to allow side-by-side installations.
The SaveTo method no longer closes the stream when it is done.
The license was changed to GPL for this release. If you need a commercial license, please contact the author.

Dependencies

This package has no dependencies.

This package is not used by any popular GitHub repositories.

Version History

Version  Downloads  Last updated
1.3.0    3,985      6/23/2018
1.2.0    12,968     10/8/2015
1.1.1    596        7/10/2015
1.1.0    357        7/10/2015
1.0.7    5,069      1/2/2015
1.0.6    974        10/31/2014
1.0.5    443        6/2/2014
1.0.4    477        5/5/2014
1.0.3    438        5/4/2014
1.0.2    468        4/27/2014