SourceDocParser.Zensical
0.3.1-alpha
A newer version of this package is available. See the version list below for details.
dotnet add package SourceDocParser.Zensical --version 0.3.1-alpha
NuGet\Install-Package SourceDocParser.Zensical -Version 0.3.1-alpha
<PackageReference Include="SourceDocParser.Zensical" Version="0.3.1-alpha" />
<PackageVersion Include="SourceDocParser.Zensical" Version="0.3.1-alpha" />
<PackageReference Include="SourceDocParser.Zensical" />
paket add SourceDocParser.Zensical --version 0.3.1-alpha
#r "nuget: SourceDocParser.Zensical, 0.3.1-alpha"
#:package SourceDocParser.Zensical@0.3.1-alpha
#addin nuget:?package=SourceDocParser.Zensical&version=0.3.1-alpha&prerelease
#tool nuget:?package=SourceDocParser.Zensical&version=0.3.1-alpha&prerelease
SourceDocParserLib
Roslyn-based .NET assembly walker that turns compiled .dll + .pdb + .xml triples into a strongly-typed API catalog (types, members, signatures, XML docs, inheritdoc, SourceLink) and hands it to a pluggable emitter for rendering.
The catalog is format-neutral. Emitters decide how to render it — Markdown for Zensical / mkdocs Material, or YAML for docfx ManagedReference, with room for other targets.
Packages
| Package | What it does |
|---|---|
| SourceDocParser | Core walker, merger, source-link resolution. Defines IAssemblySource, IDocumentationEmitter, IMetadataExtractor. |
| SourceDocParser.NuGet | IAssemblySource that fetches packages from nuget.org by owner / explicit list and exposes the per-TFM lib/ trees. |
| SourceDocParser.Zensical | IDocumentationEmitter that writes Markdown tuned for Zensical / mkdocs Material (admonitions, content tabs, mermaid). |
| SourceDocParser.Docfx | IDocumentationEmitter that writes docfx ManagedReference YAML pages (drop-in replacement for dotnet docfx metadata output) plus the docfx.json config-file shim that lets an existing docfx site drive the parser pipeline. |
Logging flows through Microsoft.Extensions.Logging.Abstractions source-generated [LoggerMessage] partials, so any host (Serilog, Console, NLog, …) plugs in without the libraries taking a dependency on a specific backend.
Quick start
using Microsoft.Extensions.Logging; // AddConsole also needs the Microsoft.Extensions.Logging.Console package

var loggerFactory = LoggerFactory.Create(b => b.AddConsole());

// Resolve the packages listed in nuget-packages.json and extract their assemblies.
var source = new NuGetAssemblySource(
    rootDirectory: "/path/to/repo",   // contains nuget-packages.json
    apiPath: "/path/to/api",          // where lib/ + refs/ get extracted
    logger: loggerFactory.CreateLogger<NuGetAssemblySource>());

var emitter = new ZensicalDocumentationEmitter();

// Walk, merge, and emit Markdown pages under outputRoot.
var result = await new MetadataExtractor().RunAsync(
    source,
    outputRoot: "/path/to/markdown-output",
    emitter,
    loggerFactory.CreateLogger<MetadataExtractor>());

Console.WriteLine($"Emitted {result.PagesEmitted} pages across {result.CanonicalTypes} types.");
Performance
The pipeline is built around a span-based XML scanner, pooled buffers, eager release of memory-mapped reference DLLs, and a streaming type merger that consumes catalogs as they land. The result is a small, predictable allocation budget and a fast wall-time per assembly.
Benchmark workload. Numbers below are from the BenchmarkDotNet suite under src/benchmarks/, run on a Ryzen 7 5800X / .NET 10. The workload extracts three real NuGet packages from nuget.org — pulling each package's lib/ and ref/ trees and the matching reference assemblies, walking every public symbol across ~19 target-framework groups, parsing the shipped XML doc files for each assembly, resolving <inheritdoc/> chains, and emitting roughly 600 canonical type pages after cross-TFM merge. The local NuGet cache is warmed once during global setup so per-iteration timings measure the walk + merge + emit pipeline, not the network leg.
End-to-end (MetadataExtractor.RunAsync):
| Phase | Wall time | Allocated |
|---|---|---|
| Full pipeline (RunAsync) | ~1.4 s | ~650 MB |
| Discover (NuGet config + cache scan) | ~660 ms | ~240 MB |
| Load + walk (parallel, all groups) | ~1.5 s | ~670 MB |
| Merge (cross-TFM dedup) | 2 ms | ~550 KB |
| Emit (Zensical Markdown) | 79 ms | ~63 MB |
Peak working set is bounded too: per-TFM compilation loaders dispose as soon as their last assembly finishes walking, so the memory-mapped BCL reference views are released eagerly instead of accumulating until RunAsync exits.
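The eager release described above can be sketched as a countdown that disposes a shared resource when the last work item reports completion. This is a minimal illustration under stated assumptions, not the library's loader: `CountdownDisposer` and `SignalOneFinished` are hypothetical names, and the sketch assumes each work item signals exactly once.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of SourceDocParser): owns a disposable resource
// shared by a fixed number of work items and disposes it as soon as the last
// item signals completion.
public sealed class CountdownDisposer
{
    private readonly IDisposable _resource;
    private int _remaining;

    public CountdownDisposer(IDisposable resource, int workItems)
    {
        if (workItems <= 0) throw new ArgumentOutOfRangeException(nameof(workItems));
        _resource = resource ?? throw new ArgumentNullException(nameof(resource));
        _remaining = workItems;
    }

    // Safe to call concurrently from parallel workers: Interlocked.Decrement
    // guarantees that exactly one caller observes zero and performs the dispose.
    // Returns true only for the call that released the resource.
    public bool SignalOneFinished()
    {
        if (Interlocked.Decrement(ref _remaining) != 0)
        {
            return false;
        }

        _resource.Dispose();
        return true;
    }
}
```

With one such counter per group, the resource for a group that finishes early is released while other groups are still walking, which is the property the paragraph above relies on.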
Per-call hotspots:
| Operation | Time | Allocated |
|---|---|---|
| XmlDocToMarkdown.Convert (plain summary) | ~25 ns | 176 B |
| XmlDocToMarkdown.Convert (tagged with <see> / <c> / <paramref>) | ~786 ns | 304 B |
| XmlDocToMarkdown.Convert (code block + bullet list) | ~1.0 µs | 440 B |
| TfmResolver.FindBestRefsTfm (exact match) | ~2 ns | 0 B |
| TfmResolver.FindBestRefsTfm (platform-suffix strip) | ~11 ns | 0 B |
| TfmResolver.FindBestRefsTfm (netstandard fallback) | ~471 ns | 1 KB |
| TypeMerger.Merge (600 types × 3 TFMs) | ~115 µs | 325 KB |
Emitter cost per type page (no I/O, just markup formatting; baseline = Zensical Markdown):
| Workload (types × members/type) | Zensical Markdown (time / allocated) | DocFx YAML (time / allocated) | Time ratio (DocFx vs Zensical) | Alloc ratio (DocFx vs Zensical) |
|---|---|---|---|---|
| 100 × 5 | 78 µs / 420 KB | 305 µs / 1,410 KB | 3.9× | 3.4× |
| 100 × 30 | 288 µs / 1,334 KB | 1,618 µs / 6,184 KB | 5.5× | 4.6× |
| 600 × 5 | 459 µs / 2,522 KB | 1,823 µs / 8,461 KB | 3.9× | 3.4× |
| 600 × 30 | 1,938 µs / 8,006 KB | 10,820 µs / 37,106 KB | 5.7× | 4.6× |
| 2000 × 5 | 1,617 µs / 8,406 KB | 7,443 µs / 28,203 KB | 4.5× | 3.4× |
| 2000 × 30 | 8,528 µs / 26.7 MB | 37,166 µs / 123.7 MB | 4.4× | 4.6× |
DocFx YAML is heavier by design — every member duplicates uid / commentId / parent / name / nameWithType / fullName, and the page-level references: list adds another mapping per cross-referenced type. The emitter still hand-writes its YAML directly via StringBuilder (no YamlDotNet runtime dependency), with a single-allocation fast path for the qualified-name composites (type.Name + "." + member.Name) that round-trips identifiers as plain scalars when escape-safe.
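As an illustration of what a single-allocation composite can look like, the sketch below builds `type.Name + "." + member.Name` with `string.Create`, which writes straight into the final string's buffer. It is an assumption about the general technique, not the emitter's actual code; `QualifiedName.Compose` is a hypothetical name.

```csharp
using System;

// Hypothetical helper (not part of SourceDocParser): composes "Type.Member"
// with exactly one allocation, the resulting string itself.
public static class QualifiedName
{
    public static string Compose(string typeName, string memberName) =>
        string.Create(
            typeName.Length + 1 + memberName.Length,
            (typeName, memberName),
            // static lambda: the inputs travel in the state tuple, so nothing is captured.
            static (destination, state) =>
            {
                state.typeName.AsSpan().CopyTo(destination);
                destination[state.typeName.Length] = '.';
                state.memberName.AsSpan().CopyTo(destination.Slice(state.typeName.Length + 1));
            });
}
```

`string.Concat(typeName, ".", memberName)` also allocates only the result for three inputs; the `string.Create` form is shown because it extends naturally to composites with more parts or inline escaping.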
Side-by-side against dotnet docfx metadata. Two fully isolated standalone benchmark assemblies — benchmarks/Docfx.StandaloneBenchmarks/ (calls DotnetApiCatalog.GenerateManagedReferenceYamlFiles in-process) and benchmarks/SourceDocParser.Docfx.StandaloneBenchmarks/ (drives our pipeline through DocfxYamlEmitter) — both target the same 4 NuGet packages (ReactiveUI, Splat, DynamicData, System.Reactive), measured by BenchmarkDotNet's [ShortRunJob] on the same machine:
| Pipeline | Mean | Allocated |
|---|---|---|
| docfx 2.78.5 (DotnetApiCatalog.GenerateManagedReferenceYamlFiles) | 1.598 s | 6.72 MB |
| SourceDocParser + DocfxYamlEmitter | 2.031 s | 919.6 MB |
The two pipelines aren't strictly walking identical inputs — docfx loads a synthesised Fixture.csproj that pulls the 4 packages as transitive PackageReferences and walks one effective TFM, while our pipeline resolves every shipped lib//ref/ slice across ~19 supported TFMs from nuget-packages.json and merges across them. Working backward from that fixture difference, our per-TFM walk explains both the wall-time delta and the allocation gap (each TFM spins a fresh Roslyn compilation graph, and the cross-TFM merger holds catalogs while it dedupes UIDs). The contract pinned by the comparison is parity output (every T:, M:, P:, E: UID docfx emits, our pipeline emits too) at the per-page emit cost shown in the per-page table above.
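The parity contract amounts to a set-containment check: every UID in the docfx baseline must also appear in this pipeline's output. A minimal sketch of that check follows, assuming the UIDs have already been collected from both outputs as strings; `UidParity.FindMissing` is a hypothetical helper, not something either benchmark project is known to contain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of SourceDocParser): reports baseline UIDs
// that the candidate output failed to emit. An empty result means parity holds.
public static class UidParity
{
    public static IReadOnlyList<string> FindMissing(
        IEnumerable<string> baselineUids,
        IEnumerable<string> candidateUids)
    {
        // Ordinal comparison: UIDs such as "T:Foo" and "M:Foo.Bar" are case-sensitive.
        var candidate = new HashSet<string>(candidateUids, StringComparer.Ordinal);

        return baselineUids
            .Where(uid => !candidate.Contains(uid))
            .Distinct(StringComparer.Ordinal)
            .OrderBy(uid => uid, StringComparer.Ordinal)
            .ToList();
    }
}
```

The check is deliberately one-directional: extra UIDs in the candidate (for example, members that only exist on other TFMs) do not count as a parity failure.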
Strategies the pipeline uses
- Custom span-based XML scanner. Every NuGet package ships an <assembly-name>.xml doc file alongside its .dll, holding the /// doc comments for every public symbol. The walker has to read each member's XML fragment per symbol, render its <see> / <c> / <list> / <inheritdoc> tags into Markdown, and do the same again per <param> / <exception> inside it, for thousands of symbols per assembly. XmlReader works for that, but its XmlTextReaderImpl allocates multi-KB internal buffers (NodeData[], NamespaceManager, char buffers, Entry[]) per construction, which dominates the doc-parse profile. So the pipeline ships a small ref struct DocXmlScanner that walks the doc text directly over ReadOnlySpan<char> and implements just the XML grammar that /// doc comments actually use. Both the per-symbol parser and the Markdown renderer drive the scanner, so per-element XML processing is allocation-free apart from the result string.
- Build-once-then-read-many XmlDocSource. Each .xml doc file is read once via File.ReadAllBytes + Encoding.UTF8.GetString, then indexed by per-member (offset, length) ranges. The substring is only materialised when a consumer calls Get(memberId), and the source is safe for concurrent reads under the parallel walker.
- Eager per-group loader disposal. Each TFM group has its own CompilationLoader with a private MetadataReferenceCache holding memory-mapped views of every reference DLL. As soon as the last assembly in a group finishes its walk, an interlocked counter drops to zero and the loader disposes, so peak working set scales with the slowest-finishing group, not the total number of groups times their references.
- Streaming type merger. The parallel walk feeds ApiCatalogs into StreamingTypeMerger one at a time and immediately drops its reference, instead of accumulating every catalog in a ConcurrentBag until the walk phase finishes.
- Capture-free parallel dispatch. The Parallel.ForEachAsync lambda is static: every dependency it touches is bundled into a WalkContext record attached to each work item, so dispatch never allocates a closure object per assembly.
- Pooled StringBuilder on the converter. XmlDocToMarkdown is per-walk by construction; reusing a single builder across every Convert call eliminates the per-element allocation that would otherwise dominate the renderer.
- Pre-sized buffers. Each nupkg zip entry is sized to its known uncompressed length up front so the backing byte[] is allocated once at the right size instead of doubling-and-copying on every Write. SourceLink URL rewriting fuses the base URL and the line anchor into one interpolated-string handler call so the GitHub / Bitbucket / GitLab / Azure DevOps blob URL is materialised in a single string.
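To make the first two strategies concrete, the sketch below indexes a doc file's text by per-member (offset, length) ranges using span searches, and only materialises a substring on lookup. It is a simplified stand-in, not the library's XmlDocSource or DocXmlScanner: `MemberRangeIndex` and `TryGet` are hypothetical names, and the sketch assumes every `<member name="...">` element has an explicit `</member>` close tag and no nested member elements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in (not part of SourceDocParser): build the index once,
// then serve many lookups without re-parsing the XML.
public sealed class MemberRangeIndex
{
    private const string OpenTag = "<member name=\"";
    private const string CloseTag = "</member>";

    private readonly string _text;
    private readonly Dictionary<string, (int Offset, int Length)> _ranges =
        new Dictionary<string, (int Offset, int Length)>(StringComparer.Ordinal);

    public MemberRangeIndex(string xmlDocText)
    {
        _text = xmlDocText ?? throw new ArgumentNullException(nameof(xmlDocText));
        ReadOnlySpan<char> span = _text.AsSpan();
        int cursor = 0;

        while (true)
        {
            // Locate the next opening tag relative to the cursor.
            int open = span.Slice(cursor).IndexOf(OpenTag.AsSpan(), StringComparison.Ordinal);
            if (open < 0) break;

            int nameStart = cursor + open + OpenTag.Length;
            int nameLength = span.Slice(nameStart).IndexOf('"');
            if (nameLength < 0) break;

            // The element body starts just after the '>' that ends the opening tag.
            int tagEnd = span.Slice(nameStart).IndexOf('>');
            if (tagEnd < 0) break;
            int bodyStart = nameStart + tagEnd + 1;

            int bodyLength = span.Slice(bodyStart).IndexOf(CloseTag.AsSpan(), StringComparison.Ordinal);
            if (bodyLength < 0) break;

            // Only the member id is materialised here; the body stays a range.
            _ranges[span.Slice(nameStart, nameLength).ToString()] = (bodyStart, bodyLength);
            cursor = bodyStart + bodyLength + CloseTag.Length;
        }
    }

    public int Count => _ranges.Count;

    // The fragment string is allocated only when a caller asks for it.
    public bool TryGet(string memberId, out string fragment)
    {
        if (_ranges.TryGetValue(memberId, out var range))
        {
            fragment = _text.Substring(range.Offset, range.Length);
            return true;
        }

        fragment = string.Empty;
        return false;
    }
}
```

Because the index is never mutated after the constructor returns, concurrent `TryGet` calls from parallel workers only read the dictionary and the source string, which matches the read-many usage described in the list.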
Repository layout
SourceDocParserLib/
  src/
    SourceDocParser/
    SourceDocParser.NuGet/
    SourceDocParser.Docfx/
    SourceDocParser.Zensical/
    tests/
      SourceDocParser.Tests/              unit tests (TUnit)
      SourceDocParser.IntegrationTests/   end-to-end + Zensical render-smoke
    Directory.Build.props                 shared lib config
    Directory.Packages.props              central package versions
    SourceDocParserLib.slnx
  Directory.Build.props
  version.json                            Nerdbank.GitVersioning
  .editorconfig
  stylecop.json
dotnet build from src/ packs every non-test project into artifacts/packages/ automatically (<GeneratePackageOnBuild>true</GeneratePackageOnBuild>). Consumers in other repos can wire that directory up as a local feed via nuget.config until the libraries are published.
Acknowledgements
The metadata extraction pipeline is inspired by — and lifts patterns from — dotnet/docfx (MIT licensed). docfx's Roslyn-based assembly walker, inheritdoc resolution, and overall metadata model shaped this library's design. See LICENSE for the original docfx attribution.
Built on:
- Roslyn (Microsoft.CodeAnalysis.CSharp) for compilation + symbol model
- ICSharpCode.Decompiler for transitive reference resolution
- NuGet.Frameworks + NuGet.Versioning for proper TFM compatibility and SemVer ordering
- Polly v8 for HTTP retry/rate-limit pipelines
License
MIT — see LICENSE for the full text and the docfx attribution.
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net10.0
  - ICSharpCode.Decompiler (>= 10.0.0.8330)
  - Microsoft.CodeAnalysis.CSharp (>= 5.3.0)
  - Microsoft.Extensions.Logging.Abstractions (>= 10.0.7)
  - NuGet.Frameworks (>= 7.3.1)
  - Polly.RateLimiting (>= 8.6.6)
  - SourceDocParser (>= 0.3.1-alpha)
  - SourceDocParser.Common (>= 0.3.1-alpha)
  - System.Threading.RateLimiting (>= 10.0.7)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on SourceDocParser.Zensical:
| Package | Downloads |
|---|---|
| NuStreamDocs.CSharpApiGenerator: Generate API reference pages from your .NET assemblies as part of your NuStreamDocs build. Point at NuGet packages or local DLLs and the plugin writes Markdown reference docs into your docs tree, ready to be linked from your handwritten content. | |
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 2.0.0 | 508 | 5/10/2026 |
| 1.4.2 | 774 | 5/2/2026 |
| 1.4.1 | 453 | 4/30/2026 |
| 1.3.1 | 206 | 4/28/2026 |
| 1.2.1 | 108 | 4/28/2026 |
| 1.1.1 | 92 | 4/28/2026 |
| 1.0.5 | 90 | 4/28/2026 |
| 1.0.3 | 98 | 4/28/2026 |
| 0.6.1-alpha | 99 | 4/28/2026 |
| 0.5.1-alpha | 98 | 4/28/2026 |
| 0.4.1-alpha | 89 | 4/28/2026 |
| 0.3.1-alpha | 98 | 4/27/2026 |
| 0.2.1-alpha | 94 | 4/27/2026 |
| 0.1.23-alpha | 98 | 4/25/2026 |