SpiseMisu.Text.Dstring 0.11.18

.NET CLI:

    dotnet add package SpiseMisu.Text.Dstring --version 0.11.18

Package Manager Console in Visual Studio (this command uses the NuGet module's version of Install-Package):

    NuGet\Install-Package SpiseMisu.Text.Dstring -Version 0.11.18

PackageReference (for projects that support it, copy this XML node into the project file to reference the package):

    <PackageReference Include="SpiseMisu.Text.Dstring" Version="0.11.18" />

Central Package Management (CPM) (for projects that support it, copy this XML node into the solution Directory.Packages.props file to version the package):

    <PackageVersion Include="SpiseMisu.Text.Dstring" Version="0.11.18" />

and reference the package, without a version, in the project file:

    <PackageReference Include="SpiseMisu.Text.Dstring" />

Paket CLI:

    paket add SpiseMisu.Text.Dstring --version 0.11.18

F# Interactive and Polyglot Notebooks (copy this #r directive into the interactive tool or the source code of the script):

    #r "nuget: SpiseMisu.Text.Dstring, 0.11.18"

C# file-based apps, starting in .NET 10 preview 4 (copy this #:package directive into a .cs file before any lines of code):

    #:package SpiseMisu.Text.Dstring@0.11.18

Cake Addin:

    #addin nuget:?package=SpiseMisu.Text.Dstring&version=0.11.18

Cake Tool:

    #tool nuget:?package=SpiseMisu.Text.Dstring&version=0.11.18

SpiseMisu.Text.Dstring

A Danish string (dstring) is a German-string-like implementation for .NET, optimized for managed memory.

A dstring consists of 16 bytes (128 bits) of contiguous memory, where:

  • The first byte stores a bitmask describing the next seven bytes as well as the byte[] pointer

  • The first byte uses a 4-bit bitmask (the low four bits in the diagrams below) to store the length of the dstring prefix, and another 4-bit bitmask (the high four bits) to store flags for format and encoding. Once the upper bound of the prefix length is reached, a 3-bit bitmask with compression flags becomes available:

    # Upperbound length of eight (compression flags are available)
    +--------+
    |▭▭▭▭■□□□|
    +--------+
    # Length of five (compression flags are NOT available)
    +--------+
    |▭▭▭▭□■□■|
    +--------+
    

    and

    # A byte[] (dbytes) aka Extended ASCII
    +--------+
    |□□□□▭▭▭▭| isExtASCII...: Encoded bytes in [0x00 - 0xFF]
    +--------+
    # Format
    +--------+
    |□□□■▭▭▭▭| isBin....: Ex: 1001010101… (log 02. / log 02. = 1.0-bit)  => 08 vals in 01-byte
    +--------+
    |□□■□▭▭▭▭| isDig....: Ex: 0123456789… (log 10. / log 02. = 3.3-bit)  => 09 vals in 03-bytes
    +--------+
    |□□■■▭▭▭▭| isHex....: Ex: AF332EC219… (log 16. / log 02. = 4.0-bit)  => 02 vals in 01-byte
    +--------+
    |□■□□▭▭▭▭| isISO8601 (TBC)
    +--------+
    |□■□■▭▭▭▭| isUUID...: Ex: d6c3ff78-0546-42dd-abc8-24a9e74ccf90      => 36 vals in 16-byte
    +--------+
    |□■■□▭▭▭▭| isF064...: Ex: 1. / 3. = 0.3333333333                    => 01 val  in 08-bytes (fixed)
    +--------+
    |□■■■▭▭▭▭| isD128...: Ex: 1m / 3m = 0.3333333333333333333333333333M => 01 val  in 16-bytes (fixed)
    +--------+
    |■□□□▭▭▭▭| isJSON...: Ex: [{"foo":42}]
    +--------+
    |■□□■▭▭▭▭| isJSONL..: Ex: [{"foo":42}]\n[{"bar":43}]
    +--------+
    # Format and Encoding placeholders
    +--------+
    |■□■□▭▭▭▭| PlaceholderF10 (placeholder for future formats/encodings)
    +--------+
    |■□■■▭▭▭▭| PlaceholderF11 (placeholder for future formats/encodings)
    +--------+
    |■■□□▭▭▭▭| PlaceholderF12 (placeholder for future formats/encodings)
    +--------+
    # Encoding. Default is multi-byte Unicode for optimal storage
    +--------+
    |■■□■▭▭▭▭| isASCII......: Encoded bytes in [0x00 - 0x7F]
    +--------+
    |■■■□▭▭▭▭| isUTF8.......: Encoded bytes as multiple UTF8 single-bytes
    +--------+
    |■■■■▭▭▭▭| isUnicode....: Encoded bytes as multi-byte Unicode
    +--------+
    bit-mask
    

    and

    # Default is uncompressed
    +--------+
    |▭▭▭▭■□□□| Uncompressed
    +--------+
    # Compression algorithms, with streaming support
    +--------+
    |▭▭▭▭■□□■| Deflate
    +--------+
    |▭▭▭▭■□■□| GZip
    +--------+
    |▭▭▭▭■□■■| ZLib
    +--------+
    |▭▭▭▭■■□□| Brotli
    +--------+
    # Compression algorithms placeholders
    +--------+
    |▭▭▭▭■■□■| PlaceholderF05
    +--------+
    |▭▭▭▭■■■□| PlaceholderF06
    +--------+
    |▭▭▭▭■■■■| PlaceholderF07
    +--------+
    bit-mask
    
  • The next seven bytes store the first seven bytes of the dstring. If the dstring is shorter than seven bytes, the remaining bytes are set to the default value of zero

  • Finally, the last eight bytes contain an x64 pointer to a byte[] (on the heap) holding the rest of the bytes in the dstring. If the dstring is shorter than eight bytes, the byte[] is not instantiated (null value)
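
The header-byte rules above can be sketched as a small decoder. This is only an illustration in Python, not the library's F# implementation. It assumes the diagrams are read most-significant bit first with ■ as 1, and that a length of eight means "eight bytes or more". The names `decode_header`, `FORMATS` and `COMPRESSIONS` are hypothetical.

```python
# Hypothetical decoder for the dstring header byte (not the library API).
# Assumption: high nibble = format/encoding flags, low nibble = prefix length,
# and bit 3 of the low nibble marks the upper bound (8), which frees the
# remaining three bits for compression flags.
FORMATS = [
    "isExtASCII", "isBin", "isDig", "isHex", "isISO8601", "isUUID",
    "isF064", "isD128", "isJSON", "isJSONL", "PlaceholderF10",
    "PlaceholderF11", "PlaceholderF12", "isASCII", "isUTF8", "isUnicode",
]
COMPRESSIONS = [
    "Uncompressed", "Deflate", "GZip", "ZLib", "Brotli",
    "PlaceholderF05", "PlaceholderF06", "PlaceholderF07",
]


def decode_header(header: int) -> dict:
    """Split one header byte into format, length and compression."""
    if not 0 <= header <= 0xFF:
        raise ValueError("header must fit in one byte")
    fmt = FORMATS[header >> 4]          # high four bits
    low = header & 0x0F                 # low four bits
    if low & 0b1000:
        # Upper bound reached: the three lowest bits select the compression.
        return {"format": fmt, "length": 8,
                "compression": COMPRESSIONS[low & 0b0111]}
    # Short dstring: the low nibble is the length, no compression flags.
    return {"format": fmt, "length": low, "compression": None}


print(decode_header(0b00000100))  # the "test" example: length 4
print(decode_header(0b11101010))  # isUTF8, upper bound reached, GZip
```

Keeping both lookups as plain tables mirrors the bitmask listings one-to-one, so a change in the flag assignment only touches the corresponding list.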

  1. Example of a 4-byte dstring ("test"). No heap allocation:
+--------+----+----+----+----+----+----+----+----------+
|□□□□□■□□|0x74|0x65|0x73|0x74|0x00|0x00|0x00|  <NULL>  |
+--------+----+----+----+----+----+----+----+----------+
 bit-mask  b0   b1   b2   b3   b4   b5   b6    pointer
           ——   ——   ——   ——
  2. Example of an 8+ byte dstring ("Danish string"), with heap allocation:
                                         0x551A4290 (byte[] on heap)
                                              |
                                              v
+--------+----+----+---+----+----------+      +----+----+---+----+
|□□□□■□□□|0x44|0x61| … |0x20|0x551A4290| ---> |0x73|0x74| … |0x67|
+--------+----+----+---+----+----------+      +----+----+---+----+
 bit-mask  b0   b1   …   b6    pointer          b7   b8   …   bn
           ——   ——       ——    ———————          ——   ——       ——
  3. Example of an array of nine dstrings:
extra allocated byte arrays on heap ----+------------+------------+
                                        |            |            |
                                        v            |            |
                                   0x6796EE96        |            |
+-+----+-----------------------+        |            |            |
|i|memo|   continuous memory   |        v            |            |
+-+----+--------+---+----------+        +---+        v            |
|0|0x00|□□□□■□□□| … |0x6796EE96| -----> | … |   0x53EB31F6        |
+-+----+--------+---+----------+        +---+        |            |
|1|0x10|□□□□□□■□| … |  <NULL>  |                     v            |
+-+----+--------+---+----------+                     +---+        v
|2|0x20|□□□□■□□□| … |0x53EB31F6| ------------------> | … |   0x4A424B5E
+-+----+--------+---+----------+                     +---+        |
|…|0x…0|□□□□□■□■| … |  <NULL>  |                                  v
+-+----+--------+---+----------+                                  +---+
|8|0x80|□□□□■□□□| … |0x4A424B5E| -------------------------------> | … |
+-+----+--------+---+----------+                                  +---+
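
The first two examples above can be reproduced with a minimal model of the 16-byte record. This Python sketch is an assumption-laden illustration rather than the library's code: the `Record` type and the functions `to_record` and `from_record` are hypothetical, the heap `byte[]` pointer is modelled as an optional `bytes` value, and compression is left at "Uncompressed".

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """Hypothetical stand-in for the 16-byte dstring struct."""
    header: int             # bitmask byte
    prefix: bytes           # always seven bytes, zero padded
    heap: Optional[bytes]   # models the byte[] pointer; None is <NULL>


def to_record(data: bytes, fmt: int = 0) -> Record:
    """Split raw bytes into header, seven-byte prefix and heap remainder."""
    if len(data) < 8:
        # Fits inline: the low nibble holds the length, no heap allocation.
        return Record((fmt << 4) | len(data), data.ljust(7, b"\x00"), None)
    # Eight bytes or more: the upper-bound bit is set, the rest goes to the heap.
    return Record((fmt << 4) | 0b1000, data[:7], data[7:])


def from_record(record: Record) -> bytes:
    """Reassemble the original bytes from a record."""
    if record.header & 0b1000:
        return record.prefix + (record.heap or b"")
    return record.prefix[: record.header & 0x0F]


short = to_record(b"test")
long_ = to_record(b"Danish string")
print(f"{short.header:08b}", short.prefix.hex(" "), short.heap)
print(f"{long_.header:08b}", long_.prefix.hex(" "), long_.heap)
```

For "test" the remainder is `None`, matching the `<NULL>` pointer in the first diagram. For "Danish string" the prefix ends in 0x20 (the space) and the remainder starts at 0x73 ("s"), matching the second diagram.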

Project structure

├── SpiseMisu.Text.Dstring
│   ├── lib
│   │   └── utils.fs
│   ├── SpiseMisu.Text.Dstring.fsproj
│   └── dstring.fs
├── SpiseMisu.Text.Dstring.Perfs
│   ├── SpiseMisu.Text.Dstring.Perfs.fsproj
│   └── program.fs
├── SpiseMisu.Text.Dstring.Tests
│   ├── SpiseMisu.Text.Dstring.Tests.fsproj
│   ├── program.fs
│   └── tests.fs
├── demo
│   └── dstring.fsx
├── imgs
│   ├── docs
│   ├── licenses
│   └── nuget
├── SpiseMisu.Text.Dstring.sln
├── global.json
├── license.txt
├── license_cil-bytecode_agpl-3.0-only.txt
├── license_knowhow_cc-by-nc-nd-40.txt
├── readme.md
└── todo.org

Memory layout

Figure: dstring[] hex-dump

Heap dump with dotnet-dump mini-guide

  1. In ./SpiseMisu.Text.Dstring.Perfs/program.fs, inside x.GlobalCleanup (), uncomment System.Threading.Thread.Sleep(15_000 (* 15 secs *)) so that the process stays alive long enough to collect the dump

  2. Execute ./dotnet-cli-pidof.sh and you will see all the dotnet apps running. Look for the ones ending with SpiseMisu.Text.Dstring.Perfs-Job-OVERNF-1/bin/Release/net8.0.

  3. Wait until the job you want to dump reaches the clean-up section: // AfterActualRun

  4. Execute dotnet-dump collect --type Heap --process-id 2456129 and you will see:

// AfterActualRun
WorkloadResult   1: 2 op, 507459083.00 ns, 253.7295 ms/op
// GC:  8 7 0 207217488 2
// Threading:  0 0 2
 
[createdump] Gathering state for process 2456129 dotnet
[createdump] Writing minidump with heap to file ~/…/SpiseMisu.Text.Dstring/core_20251004_170724
[createdump] Written 596156416 bytes (145546 pages) to core file
[createdump] Target process is alive
[createdump] Dump successfully written in 306ms
  5. Investigate by typing: dotnet-dump analyze core_20251004_170724

  6. In the tool, type: dumpheap -stat and you will see:

…
561d22bacde0    13,565     539,936 Free
7f54cec830c0         1   8,000,024 System.Int64[]
7f54cec82ee8         1  16,000,024 SpiseMisu.Text+Dstring[]
7f54cec82010         2  16,000,048 System.Byte[][]
7f54ce9aeb48        34  24,004,640 System.String[]
7f54ce90d7c8 3,000,708 158,772,680 System.String
7f54ceb75950 5,000,005 209,002,292 System.Byte[]
Total 8,015,865 objects, 432,486,422 bytes
  7. See details for a given method table (MT) address: dumpheap -mt 7f54cec82ee8
         Address               MT           Size
    7f14ce800048     7f54cec82ee8     16,000,024
  8. You can now drill down further by typing: dumparray -length 5 7f14ce800048
Name:        SpiseMisu.Text+Dstring[]
MethodTable: 00007f54cec82ee8
EEClass:     00007f54cec82e60
Size:        16000024(0xf42418) bytes
Array:       Rank 1, Number of elements 1000000, Type VALUETYPE
Element Methodtable: 00007f54cec82db0
[0] 00007f14ce800058
[1] 00007f14ce800068
[2] 00007f14ce800078
[3] 00007f14ce800088
[4] 00007f14ce800098
  9. View the contents of some of the (struct) elements in the array by typing: db -c 80 00007f14ce800058 (16-byte element x 5 = 80 bytes):
00007f14ce800058: 30 6b 22 ce 14 7f 00 00 08 73 9a ac 37 c9 be ba  0k"......s..7...
00007f14ce800068: 58 6b 22 ce 14 7f 00 00 08 53 d1 20 a4 46 a1 86  Xk"......S. .F..
00007f14ce800078: 80 6b 22 ce 14 7f 00 00 08 44 8f d6 ea 76 37 34  .k"......D...v74
00007f14ce800088: a8 6b 22 ce 14 7f 00 00 08 5b c1 41 f8 f9 bd 58  .k"......[.A...X
00007f14ce800098: d0 6b 22 ce 14 7f 00 00 08 50 72 ef 42 a5 6a 2a  .k"......Pr.B.j*

which shows a pattern similar to that of the hex dumper (Dstring.Memory.dump):

0112748739DB99|00001000|↔|00007F536E755118|459055102CAE09F54B
01E606DBB4F6FA|00001000|↔|00007F536E754DD8|4BBC8ED0A25F0B8755
07BDEDF50B83AC|00001000|↔|00007F536E754DB0|43A0DFEEA191AEA2A3
0C5FB78013D42F|00001000|↔|00007F536E754CC0|41854A8815FE6E6A3C
1F3A8D9CC33F5E|00001000|↔|00007F536E7550F0|4BA36307910E82AB70

NOTE: In the performance benchmark, GUIDs are byte[]-reversed.

> 0112748739DB99|08|↔|00007F536E755118
  (byte reversed becomes)
> 18 51 75 6E 53 7F 00 00|08|99 DB 39 87 74 12 01
  (and compared to `dotnet-dump`)
< 30 6b 22 ce 14 7f 00 00 08 73 9a ac 37 c9 be ba
  10. Once you are done, clean up the core_[DATESTAMP]_[TIMESTAMP] files
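
The comparison in the NOTE can be checked mechanically. The Python sketch below takes one 16-byte `db` row and prints it in the hex dumper's order. It assumes, as the NOTE suggests, that the dumper shows the element's bytes reversed, so that in raw memory the little-endian pointer comes first, then the bitmask byte, then the reversed prefix. The function name `parse_element` is hypothetical.

```python
def parse_element(hex_row: str) -> dict:
    """Split one 16-byte row of raw memory into pointer, header and prefix."""
    raw = bytes.fromhex(hex_row)
    if len(raw) != 16:
        raise ValueError("a dstring element is exactly 16 bytes")
    return {
        "pointer": int.from_bytes(raw[0:8], "little"),  # x64 pointer
        "header": raw[8],                               # bitmask byte
        "prefix": raw[9:16][::-1],                      # undo the byte reversal
    }


def as_dumper_line(element: dict) -> str:
    """Format as prefix|bitmask|pointer, like the hex dumper rows above."""
    return (f"{element['prefix'].hex().upper()}"
            f"|{element['header']:08b}"
            f"|{element['pointer']:016X}")


# The byte-reversed row from the NOTE above.
row = "18 51 75 6E 53 7F 00 00 08 99 DB 39 87 74 12 01"
print(as_dumper_line(parse_element(row)))
# prints 0112748739DB99|00001000|00007F536E755118
```

Feeding it the first `db` row from `dotnet-dump` (`30 6b 22 ce ...`) yields a different pointer and prefix, since it comes from a different run, but the same `00001000` bitmask.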

Benchmarks

// * Summary *

BenchmarkDotNet v0.15.4, Linux NixOS 25.05 (Warbler)
12th Gen Intel Core i7-12800H 0.40GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 8.0.414
  [Host]     : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT x86-64-v3 DEBUG
  Job-OVERNF : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT x86-64-v3

Job=Job-OVERNF  Runtime=.NET 8.0  IterationCount=1  
LaunchCount=0  WarmupCount=0  Error=NA  

| Method                                             | N       | Mean       | Ratio  | Allocated | Alloc Ratio |
|--------------------------------------------------- |-------- |-----------:|-------:|----------:|------------:|
| 'Array.zeroCreate<string> x.N'                     | 1000000 |   2.183 ms |   1.00 |   7.63 MB |        1.00 |
| 'Array.zeroCreate<dstring> x.N'                    | 1000000 |   5.296 ms |   2.43 |  15.26 MB |        2.00 |
| 'x.guids |> Array.map Encoding.ASCII.GetString'    | 1000000 | 121.282 ms |  55.57 |  61.04 MB |        8.00 |
| 'x.guids |> Array.map Dstring.Bytes.toDstring'     | 1000000 |  63.640 ms |  29.16 |  53.41 MB |        7.00 |
| 'x.sha256s |> Array.map Encoding.ASCII.GetString'  | 1000000 | 215.073 ms |  98.54 |  91.55 MB |       12.00 |
| 'x.sha256s |> Array.map Dstring.Bytes.toDstring'   | 1000000 |  76.005 ms |  34.82 |  68.66 MB |        9.00 |
| 'x.strings |> Array.sort'                          | 1000000 | 264.986 ms | 121.41 |   7.63 MB |        1.00 |
| 'x.strings |> Array.sortDescending'                | 1000000 | 288.462 ms | 132.17 |   7.63 MB |        1.00 |
| 'x.strings |> Array.map Dstring.UTF8.fromString'   | 1000000 | 112.914 ms |  51.74 |  53.41 MB |        7.00 |
| 'x.dstrings |> Array.map Dstring.UTF8.toString'    | 1000000 | 252.340 ms | 115.62 |  98.81 MB |       12.95 |
| 'x.dstrings |> Dstring.Array.sort'                 | 1000000 | 174.879 ms |  80.13 |  15.26 MB |        2.00 |
| 'x.dstrings |> Dstring.Array.sortDescending'       | 1000000 | 180.526 ms |  82.71 |  15.26 MB |        2.00 |
| 'x.dstrings |> Dstring.Array.sortPrefix'           | 1000000 | 155.760 ms |  71.37 |  15.26 MB |        2.00 |
| 'x.dstrings |> Dstring.Array.sortPrefixDescending' | 1000000 | 157.594 ms |  72.21 |  15.26 MB |        2.00 |

// * Hints *
HideColumnsAnalyser
  Summary -> Hidden columns: Error

// * Legends *
  N           : Value of the 'N' parameter
  Mean        : Arithmetic mean of all measurements
  Ratio       : Mean of the ratio distribution ([Current]/[Baseline])
  Allocated   : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  Alloc Ratio : Allocated memory ratio distribution ([Current]/[Baseline])
  1 ms        : 1 Millisecond (0.001 sec)
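
The two `Array.zeroCreate` allocation figures follow directly from the element sizes, which a few lines of Python arithmetic can confirm. The 24-byte array overhead is an assumption inferred from the `dumpheap` sizes above (16,000,024 bytes for one million 16-byte elements). The helper name `array_mib` is hypothetical.

```python
N = 1_000_000
MIB = 1024 * 1024      # BenchmarkDotNet reports with 1KB = 1024B
ARRAY_OVERHEAD = 24    # assumption: 16,000,024 - 16,000,000 from dumpheap


def array_mib(element_size: int, count: int = N) -> float:
    """Size of a managed array in MB, rounded like the benchmark table."""
    return round((count * element_size + ARRAY_OVERHEAD) / MIB, 2)


print(array_mib(8))    # string[]: one 8-byte reference per element  -> 7.63
print(array_mib(16))   # dstring[]: one 16-byte struct per element   -> 15.26
```

This accounts for the Alloc Ratio of 2.00 on the empty arrays: a `dstring[]` is twice the size of a `string[]` up front, but short values then need no further heap allocation.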

Licenses

Source code in this repository is covered ONLY by the Server Side Public License, v 1, while the rest (knowhow, text, media, …) is covered by the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.

Figure: CC BY-NC-ND 4.0

However, it is not permitted to publish a NuGet package under a license that is approved by neither the OSI nor the FSF:

Pushing SpiseMisu.Text.Dstring.0.11.0.nupkg to 'https://www.nuget.org/api/v2/package'...
  PUT https://www.nuget.org/api/v2/package/
  BadRequest https://www.nuget.org/api/v2/package/ 846ms
error: Response status code does not indicate success: 400 (License expression must only contain licenses that are approved by Open Source Initiative or Free Software Foundation. Unsupported licenses: SSPL-1.0.).

The CIL-bytecode content of the NuGet package is therefore dual-licensed under the GNU Affero General Public License v3.0 only, while the rest (knowhow, text, media, …) is covered by the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.

For more info on licenses compatible with NuGet packages, see the SPDX License List.

Version Downloads Last Updated
0.11.18 154 10/8/2025
0.11.17 151 10/7/2025
0.11.16 147 10/7/2025
0.11.15 143 10/5/2025