NuGet Gallery Feed for DawgSharp
DAWG (Directed Acyclic Word Graph) is a data structure for storing and searching large word lists while keeping your memory footprint small and lookups fast. DawgSharp is an open-source C# implementation featuring a linear-time graph reduction algorithm and out-of-the-box persistence support.
The Dawg class is nearly as fast as a HashSet for lookups and is much, much more memory-efficient (factors of 30x to 40x are not uncommon). In a benchmark application it held two million words while consuming only 2 MB of RAM. That's only one byte per word! And it's even less on disk.
The Dawg class can be thought of as a read-only Dictionary<string, Value>: it has a ["string"] indexer and implements IEnumerable<KeyValuePair<string, Value>>. One other very useful feature of Dawg (not found in Dictionary) is the ability to quickly find all words that start with a particular prefix: dawg.MatchPrefix("star") could possibly yield "star", "starch", "start", "starting", etc.
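To make the Dictionary comparison concrete, here is a rough baseline for the same prefix query against an ordinary Dictionary<string, bool>. It uses only the .NET base class library, not DawgSharp; the PrefixBaseline class and its MatchPrefix helper are hypothetical names invented for this sketch. It shows the semantics of a prefix query, not how Dawg implements it: the dictionary version has to test every key, whereas a DAWG only needs to follow the graph along the prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixBaseline
{
    // Baseline only: scans every key of the dictionary and keeps those that
    // start with the prefix. Results are sorted so the output is deterministic
    // (Dictionary makes no promise about enumeration order).
    public static IEnumerable<KeyValuePair<string, bool>> MatchPrefix(
        IReadOnlyDictionary<string, bool> words, string prefix)
    {
        return words
            .Where(pair => pair.Key.StartsWith(prefix, StringComparison.Ordinal))
            .OrderBy(pair => pair.Key, StringComparer.Ordinal);
    }

    public static void Main()
    {
        var words = new Dictionary<string, bool>
        {
            ["star"] = true,
            ["starch"] = true,
            ["start"] = true,
            ["starting"] = true,
            ["stop"] = true,
        };

        // Prints star, starch, start, starting (one per line); "stop" is skipped.
        foreach (var pair in MatchPrefix(words, "star"))
            Console.WriteLine(pair.Key);
    }
}
```

The cost of this baseline grows with the total number of keys, which is the work a prefix-aware structure is meant to avoid.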
The two main objects in the library are Dawg and DawgBuilder. Dawg is immutable: use DawgBuilder to build a Dawg, save it to a stream, and then call Dawg.Load to rehydrate the data. Once reloaded, Dawg re-emerges as a completely different data structure (but, oddly, the same class) that is nearly as fast as a HashSet for lookups and far more memory-efficient. Note that the Save/Load step is necessary to get the full potential out of the Dawg object. Use a MemoryStream if disk interaction is not desired.
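The build, save and load cycle might look roughly like the sketch below, which round-trips through a MemoryStream instead of a file. Dawg, DawgBuilder, Dawg.Load, the indexer and MatchPrefix are named in the text above; the generic parameter (bool here) and the Insert, BuildDawg and SaveTo method names are assumptions based on the project's published usage example and may differ between package versions. DawgRoundTrip and BuildAndReload are hypothetical names for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using DawgSharp; // assumes the DawgSharp NuGet package is referenced

public static class DawgRoundTrip
{
    // Builds a Dawg, saves it to memory and loads it back, since the text
    // says the reloaded instance is the compact, fast representation.
    public static Dawg<bool> BuildAndReload(IEnumerable<string> words)
    {
        // Assumed API: the type argument is the value type; keys are strings.
        var builder = new DawgBuilder<bool>();
        foreach (string word in words)
            builder.Insert(word, true);          // assumed method name

        Dawg<bool> built = builder.BuildDawg();  // assumed method name

        byte[] bytes;
        using (var output = new MemoryStream())
        {
            built.SaveTo(output);                // assumed method name
            bytes = output.ToArray();            // copy out the serialized graph
        }

        // Load from a fresh stream positioned at the start of the data.
        return Dawg<bool>.Load(new MemoryStream(bytes));
    }

    public static void Main()
    {
        Dawg<bool> dawg = BuildAndReload(
            new[] { "star", "starch", "start", "starting", "stop" });

        // Dictionary-style lookup through the indexer.
        Console.WriteLine(dawg["start"]);

        // Enumerate every stored word that begins with "star".
        foreach (KeyValuePair<string, bool> pair in dawg.MatchPrefix("star"))
            Console.WriteLine(pair.Key);
    }
}
```

Copying the bytes out with ToArray and loading from a new MemoryStream avoids relying on whether the save call leaves the original stream open or rewound.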
