SourceDocParser 0.1.23-alpha
- .NET CLI: dotnet add package SourceDocParser --version 0.1.23-alpha
- Package Manager Console: NuGet\Install-Package SourceDocParser -Version 0.1.23-alpha
- PackageReference: <PackageReference Include="SourceDocParser" Version="0.1.23-alpha" />
- Central Package Management: <PackageVersion Include="SourceDocParser" Version="0.1.23-alpha" /> in Directory.Packages.props, plus <PackageReference Include="SourceDocParser" /> in the project file
- Paket CLI: paket add SourceDocParser --version 0.1.23-alpha
- Script and Interactive: #r "nuget: SourceDocParser, 0.1.23-alpha"
- File-based apps: #:package SourceDocParser@0.1.23-alpha
- Cake addin: #addin nuget:?package=SourceDocParser&version=0.1.23-alpha&prerelease
- Cake tool: #tool nuget:?package=SourceDocParser&version=0.1.23-alpha&prerelease
SourceDocParserLib
Roslyn-based .NET assembly walker that turns compiled .dll + .pdb + .xml triples into a strongly-typed API catalog (types, members, signatures, XML docs, inheritdoc, SourceLink) and hands it to a pluggable emitter for rendering.
The catalog is format-neutral. Emitters decide how to render it — Markdown for Zensical / mkdocs Material today, with room for other targets.
Packages
| Package | What it does |
|---|---|
| SourceDocParser | Core walker, merger, source-link resolution. Defines IAssemblySource, IDocumentationEmitter, IMetadataExtractor. |
| SourceDocParser.NuGet | IAssemblySource that fetches packages from nuget.org by owner / explicit list and exposes the per-TFM lib/ trees. |
| SourceDocParser.Zensical | IDocumentationEmitter that writes Markdown tuned for Zensical / mkdocs Material (admonitions, content tabs, mermaid). |
| SourceDocParser.Docfx | docfx config-file shim: reads + writes docfx.json shapes so an existing docfx site can drive the parser pipeline. No emitter; that ships separately. |
Logging flows through Microsoft.Extensions.Logging.Abstractions source-generated [LoggerMessage] partials, so any host (Serilog, Console, NLog, …) plugs in without the libraries taking a dependency on a specific backend.
Quick start
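// LoggerFactory.Create and AddConsole come from Microsoft.Extensions.Logging (console provider package).
using Microsoft.Extensions.Logging;

var loggerFactory = LoggerFactory.Create(b => b.AddConsole());

// NuGet-backed assembly source.
var source = new NuGetAssemblySource(
    rootDirectory: "/path/to/repo",   // contains nuget-packages.json
    apiPath: "/path/to/api",          // where lib/ + refs/ get extracted
    logger: loggerFactory.CreateLogger<NuGetAssemblySource>());

// Markdown emitter for Zensical / mkdocs Material.
var emitter = new ZensicalDocumentationEmitter();

// Run the full pipeline against the source and write pages under outputRoot.
var result = await new MetadataExtractor().RunAsync(
    source,
    outputRoot: "/path/to/markdown-output",
    emitter,
    loggerFactory.CreateLogger<MetadataExtractor>());

Console.WriteLine($"Emitted {result.PagesEmitted} pages across {result.CanonicalTypes} types.");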
Performance
The pipeline is built around a span-based XML scanner, pooled buffers, eager release of memory-mapped reference DLLs, and a streaming type merger that consumes catalogs as they land. The result is a small, predictable allocation budget and a fast wall-time per assembly.
Benchmark workload. Numbers below are from the BenchmarkDotNet suite under src/benchmarks/, run on a Ryzen 7 5800X / .NET 10. The workload extracts three real NuGet packages from nuget.org — pulling each package's lib/ and ref/ trees and the matching reference assemblies, walking every public symbol across ~19 target-framework groups, parsing the shipped XML doc files for each assembly, resolving <inheritdoc/> chains, and emitting roughly 600 canonical type pages after cross-TFM merge. The local NuGet cache is warmed once during global setup so per-iteration timings measure the walk + merge + emit pipeline, not the network leg.
End-to-end (MetadataExtractor.RunAsync):
| Phase | Wall time | Allocated |
|---|---|---|
| Full pipeline (RunAsync) | ~1.4 s | ~650 MB |
| Discover (NuGet config + cache scan) | ~660 ms | ~240 MB |
| Load + walk (parallel, all groups) | ~1.5 s | ~670 MB |
| Merge (cross-TFM dedup) | 2 ms | ~550 KB |
| Emit (Zensical Markdown) | 79 ms | ~63 MB |
Peak working set is bounded too: per-TFM compilation loaders dispose as soon as their last assembly finishes walking, so the memory-mapped BCL reference views are released eagerly instead of accumulating until RunAsync exits.
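The eager-release idea can be illustrated with a small countdown wrapper. This is a minimal sketch, not the library's CompilationLoader: the names GroupLease and FakeLoader are hypothetical, and the only assumption carried over from the text is that the last finished work item in a group triggers disposal through an interlocked counter.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a per-group loader that owns expensive resources.
public sealed class FakeLoader : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose() => Disposed = true;
}

// Hypothetical helper: disposes the wrapped resource when the last
// outstanding work item reports completion.
public sealed class GroupLease
{
    private readonly IDisposable _resource;
    private int _remaining;

    public GroupLease(IDisposable resource, int workItems)
    {
        if (workItems <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(workItems));
        }

        _resource = resource ?? throw new ArgumentNullException(nameof(resource));
        _remaining = workItems;
    }

    // Safe to call from parallel workers. Interlocked.Decrement returns the
    // new value, so exactly one caller observes zero and runs Dispose.
    public void SignalOneDone()
    {
        if (Interlocked.Decrement(ref _remaining) == 0)
        {
            _resource.Dispose();
        }
    }
}
```

Each worker would call SignalOneDone in a finally block after walking its assembly, so a group's resources are released as soon as that group is finished rather than when the whole run ends.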
Per-call hotspots:
| Operation | Time | Allocated |
|---|---|---|
| XmlDocToMarkdown.Convert (plain summary) | ~26 ns | 176 B |
| XmlDocToMarkdown.Convert (tagged with <see> / <c> / <paramref>) | ~890 ns | 304 B |
| XmlDocToMarkdown.Convert (code block + bullet list) | ~1.1 µs | 440 B |
| TfmResolver.FindBestRefsTfm (exact match) | ~3 ns | 0 B |
| TfmResolver.FindBestRefsTfm (platform-suffix strip) | ~11 ns | 0 B |
| TfmResolver.FindBestRefsTfm (netstandard fallback) | ~500 ns | 1 KB |
| TypeMerger.Merge (600 types × 3 TFMs) | ~120 µs | 330 KB |
Strategies the pipeline uses
- Custom span-based XML scanner. Every NuGet package ships an <assembly-name>.xml doc file alongside its .dll, holding the /// doc comments for every public symbol. The walker has to read each member's XML fragment per symbol, render its <see> / <c> / <list> / <inheritdoc> tags into Markdown, and do the same again per <param> / <exception> inside it, for thousands of symbols per assembly. XmlReader works for that, but its XmlTextReaderImpl allocates multi-KB internal buffers (NodeData[], NamespaceManager, char buffers, Entry[]) per construction, which dominates the doc-parse profile. So the pipeline ships a small ref struct DocXmlScanner that walks the doc text directly over ReadOnlySpan<char> and implements just the XML grammar that /// doc comments actually use. Both the per-symbol parser and the Markdown renderer drive the scanner, so per-element XML processing is allocation-free apart from the result string.
- Build-once-then-read-many XmlDocSource. Each .xml doc file is read once via File.ReadAllBytes + Encoding.UTF8.GetString, then indexed by per-member (offset, length) ranges. The substring is only materialised when a consumer calls Get(memberId), and the source is safe for concurrent reads under the parallel walker.
- Eager per-group loader disposal. Each TFM group has its own CompilationLoader with a private MetadataReferenceCache holding memory-mapped views of every reference DLL. As soon as the last assembly in a group finishes its walk, an interlocked counter drops to zero and the loader disposes, so peak working set scales with the slowest-finishing group, not the total number of groups times their references.
- Streaming type merger. The parallel walk feeds ApiCatalogs into StreamingTypeMerger one at a time and immediately drops its reference, instead of accumulating every catalog in a ConcurrentBag until the walk phase finishes.
- Capture-free parallel dispatch. The Parallel.ForEachAsync lambda is static: every dependency it touches is bundled into a WalkContext record attached to each work item, so dispatch never allocates a closure object per assembly.
- Pooled StringBuilder on the converter. XmlDocToMarkdown is per-walk by construction; reusing a single builder across every Convert call eliminates the per-element allocation that would otherwise dominate the renderer.
- Pre-sized buffers. Each nupkg zip entry is sized to its known uncompressed length up front so the backing byte[] is allocated once at the right size instead of doubling-and-copying on every Write. SourceLink URL rewriting fuses the base URL and the line anchor into one interpolated-string handler call so the GitHub / Bitbucket / GitLab / Azure DevOps blob URL is materialised in a single string.
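To make the first two strategies concrete, here is a minimal sketch of indexing a doc file once by (offset, length) and materialising a member's fragment only on lookup. It is not the library's XmlDocSource or DocXmlScanner; the class name MemberDocIndex and its TryGet method are hypothetical, and the scan is deliberately naive (it assumes well-formed, non-self-closing member elements and ignores entities and CDATA).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: builds a per-member (offset, length) index over the
// doc text using span searches, and only creates a substring on lookup.
public sealed class MemberDocIndex
{
    private const string OpenTag = "<member name=\"";
    private const string CloseTag = "</member>";

    private readonly string _text;
    private readonly Dictionary<string, (int Offset, int Length)> _ranges = new();

    public MemberDocIndex(string xmlText)
    {
        _text = xmlText ?? throw new ArgumentNullException(nameof(xmlText));
        ReadOnlySpan<char> span = _text.AsSpan();
        int pos = 0;

        while (true)
        {
            int open = span.Slice(pos).IndexOf(OpenTag.AsSpan(), StringComparison.Ordinal);
            if (open < 0) break;

            int nameStart = pos + open + OpenTag.Length;
            int nameLen = span.Slice(nameStart).IndexOf('"');
            if (nameLen < 0) break;

            // Skip from the closing quote to the end of the opening tag.
            int tagEnd = span.Slice(nameStart + nameLen).IndexOf('>');
            if (tagEnd < 0) break;

            int bodyStart = nameStart + nameLen + tagEnd + 1;
            int bodyLen = span.Slice(bodyStart).IndexOf(CloseTag.AsSpan(), StringComparison.Ordinal);
            if (bodyLen < 0) break;

            // The member id key is the only string created per member here.
            _ranges[span.Slice(nameStart, nameLen).ToString()] = (bodyStart, bodyLen);
            pos = bodyStart + bodyLen + CloseTag.Length;
        }
    }

    public int Count => _ranges.Count;

    // Materialises the member's inner XML only when it is requested.
    public bool TryGet(string memberId, out string fragment)
    {
        if (_ranges.TryGetValue(memberId, out var range))
        {
            fragment = _text.Substring(range.Offset, range.Length);
            return true;
        }

        fragment = string.Empty;
        return false;
    }
}
```

Because the index is only written in the constructor and lookups just read the dictionary and the immutable source string, an instance like this can be shared by parallel readers once construction has finished.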
Repository layout
SourceDocParserLib/
  src/
    SourceDocParser/
    SourceDocParser.NuGet/
    SourceDocParser.Docfx/
    SourceDocParser.Zensical/
    tests/
      SourceDocParser.Tests/              unit tests (TUnit)
      SourceDocParser.IntegrationTests/   end-to-end + Zensical render-smoke
    Directory.Build.props                 shared lib config
    Directory.Packages.props              central package versions
    SourceDocParserLib.slnx
  Directory.Build.props
  version.json                            Nerdbank.GitVersioning
  .editorconfig
  stylecop.json
dotnet build from src/ packs every non-test project into artifacts/packages/ automatically (<GeneratePackageOnBuild>true</GeneratePackageOnBuild>). Consumers in other repos can wire that directory up as a local feed via nuget.config until the libraries are published.
Acknowledgements
The metadata extraction pipeline is inspired by — and lifts patterns from — dotnet/docfx (MIT licensed). docfx's Roslyn-based assembly walker, inheritdoc resolution, and overall metadata model shaped this library's design. See LICENSE for the original docfx attribution.
Built on:
- Roslyn (Microsoft.CodeAnalysis.CSharp) for compilation + symbol model
- ICSharpCode.Decompiler for transitive reference resolution
- NuGet.Frameworks + NuGet.Versioning for proper TFM compatibility and SemVer ordering
- Polly v8 for HTTP retry/rate-limit pipelines
License
MIT — see LICENSE for the full text and the docfx attribution.
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies (net10.0):
- ICSharpCode.Decompiler (>= 10.0.0.8330)
- Microsoft.CodeAnalysis.CSharp (>= 5.3.0)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.7)
- NuGet.Frameworks (>= 7.3.1)
- Polly.RateLimiting (>= 8.6.6)
- System.Threading.RateLimiting (>= 10.0.7)
NuGet packages (4)
Showing the top 4 NuGet packages that depend on SourceDocParser:
| Package | Description |
|---|---|
| SourceDocParser.NuGet | NuGet-backed IAssemblySource for SourceDocParser. Discovers, fetches and extracts packages, then exposes their per-TFM assemblies to the parser. |
| SourceDocParser.Zensical | Zensical / mkdocs Material emitter for SourceDocParser. Renders the parser's ApiCatalog into a flat tree of Markdown pages tuned for the Zensical theme (admonitions, content tabs, mermaid diagrams). |
| SourceDocParser.Docfx | docfx compatibility for SourceDocParser. Reads and writes docfx.json shapes (metadata + build sections) so an existing docfx site can plug into the parser pipeline, and emits docfx ManagedReference YAML pages so the parser output is consumable by docfx as a drop-in replacement for its own metadata extractor. |
| NuStreamDocs.CSharpApiGenerator | Generate API reference pages from your .NET assemblies as part of your NuStreamDocs build. Point at NuGet packages or local DLLs and the plugin writes Markdown reference docs into your docs tree, ready to be linked from your handwritten content. |
| Version | Downloads | Last Updated |
|---|---|---|
| 2.0.0 | 513 | 5/10/2026 |
| 1.4.2 | 789 | 5/2/2026 |
| 1.4.1 | 487 | 4/30/2026 |
| 1.3.1 | 231 | 4/28/2026 |
| 1.2.1 | 126 | 4/28/2026 |
| 1.1.1 | 116 | 4/28/2026 |
| 1.0.5 | 122 | 4/28/2026 |
| 1.0.3 | 123 | 4/28/2026 |
| 0.6.1-alpha | 117 | 4/28/2026 |
| 0.5.1-alpha | 112 | 4/28/2026 |
| 0.4.1-alpha | 109 | 4/28/2026 |
| 0.3.1-alpha | 119 | 4/27/2026 |
| 0.2.1-alpha | 117 | 4/27/2026 |
| 0.1.23-alpha | 113 | 4/25/2026 |