SourceDocParser 0.1.23-alpha

This is a prerelease version of SourceDocParser.

There is a newer version of this package available.
See the version list below for details.

dotnet add package SourceDocParser --version 0.1.23-alpha

NuGet\Install-Package SourceDocParser -Version 0.1.23-alpha

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="SourceDocParser" Version="0.1.23-alpha" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package:

<PackageVersion Include="SourceDocParser" Version="0.1.23-alpha" />

Then reference the package, without a version, in the project file:

<PackageReference Include="SourceDocParser" />

paket add SourceDocParser --version 0.1.23-alpha

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: SourceDocParser, 0.1.23-alpha"

The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy it into the interactive tool or the script's source code to reference the package.

#:package SourceDocParser@0.1.23-alpha

The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy it into a .cs file before any lines of code to reference the package.

Install as a Cake Addin:

#addin nuget:?package=SourceDocParser&version=0.1.23-alpha&prerelease

Install as a Cake Tool:

#tool nuget:?package=SourceDocParser&version=0.1.23-alpha&prerelease


SourceDocParserLib

Roslyn-based .NET assembly walker that turns compiled .dll + .pdb + .xml triples into a strongly-typed API catalog (types, members, signatures, XML docs, inheritdoc, SourceLink) and hands it to a pluggable emitter for rendering.

The catalog is format-neutral. Emitters decide how to render it — Markdown for Zensical / mkdocs Material today, with room for other targets.

Packages

Package	What it does
`SourceDocParser`	Core walker, merger, source-link resolution. Defines `IAssemblySource`, `IDocumentationEmitter`, `IMetadataExtractor`.
`SourceDocParser.NuGet`	`IAssemblySource` that fetches packages from `nuget.org` by owner / explicit list and exposes the per-TFM `lib/` trees.
`SourceDocParser.Zensical`	`IDocumentationEmitter` that writes Markdown tuned for Zensical / mkdocs Material (admonitions, content tabs, mermaid).
`SourceDocParser.Docfx`	docfx config-file shim — reads + writes `docfx.json` shapes so an existing docfx site can drive the parser pipeline. No emitter; that ships separately.

Logging flows through Microsoft.Extensions.Logging.Abstractions source-generated [LoggerMessage] partials, so any host (Serilog, Console, NLog, …) plugs in without the libraries taking a dependency on a specific backend.
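For readers unfamiliar with the pattern, a source-generated logging partial looks roughly like the sketch below. The class name WalkLog, the method name, the event id, and the message template are all hypothetical and not taken from SourceDocParser; only the [LoggerMessage] attribute and ILogger come from Microsoft.Extensions.Logging.

```csharp
using Microsoft.Extensions.Logging;

// Hypothetical log definitions: the names, event id, and message template
// are illustrative only and are not SourceDocParser's own.
internal static partial class WalkLog
{
    // The logging source generator emits the body of this partial method,
    // including the IsEnabled check and the strongly-typed message formatting.
    [LoggerMessage(
        EventId = 1,
        Level = LogLevel.Information,
        Message = "Walked {AssemblyName} in {ElapsedMs} ms")]
    public static partial void AssemblyWalked(ILogger logger, string assemblyName, long elapsedMs);
}
```

Because the method only takes an ILogger, whichever provider the host registers receives the event; a call such as WalkLog.AssemblyWalked(logger, "Foo.dll", 42) works the same under Console, Serilog, or NLog.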

Quick start

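using Microsoft.Extensions.Logging;   // AddConsole also needs the Microsoft.Extensions.Logging.Console package
// NuGetAssemblySource, ZensicalDocumentationEmitter and MetadataExtractor ship in the
// SourceDocParser* packages listed above; add the matching using directives for them.

// Dispose the factory on exit so queued console log output is flushed.
using var loggerFactory = LoggerFactory.Create(b => b.AddConsole());

// Assembly source: resolves and extracts the packages to document.
var source = new NuGetAssemblySource(
    rootDirectory: "/path/to/repo",   // contains nuget-packages.json
    apiPath:       "/path/to/api",    // where lib/ + refs/ get extracted
    logger:        loggerFactory.CreateLogger<NuGetAssemblySource>());

// Emitter: renders the format-neutral catalog as Zensical Markdown.
var emitter = new ZensicalDocumentationEmitter();

// Walk, merge and emit in one call.
var result = await new MetadataExtractor().RunAsync(
    source,
    outputRoot: "/path/to/markdown-output",
    emitter,
    loggerFactory.CreateLogger<MetadataExtractor>());

Console.WriteLine($"Emitted {result.PagesEmitted} pages across {result.CanonicalTypes} types.");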

Performance

The pipeline is built around a span-based XML scanner, pooled buffers, eager release of memory-mapped reference DLLs, and a streaming type merger that consumes catalogs as they land. The result is a small, predictable allocation budget and low wall time per assembly.

Benchmark workload. Numbers below are from the BenchmarkDotNet suite under src/benchmarks/, run on a Ryzen 7 5800X / .NET 10. The workload extracts three real NuGet packages from nuget.org — pulling each package's lib/ and ref/ trees and the matching reference assemblies, walking every public symbol across ~19 target-framework groups, parsing the shipped XML doc files for each assembly, resolving <inheritdoc/> chains, and emitting roughly 600 canonical type pages after cross-TFM merge. The local NuGet cache is warmed once during global setup so per-iteration timings measure the walk + merge + emit pipeline, not the network leg.

End-to-end (MetadataExtractor.RunAsync):

Phase	Wall time	Allocated
Full pipeline (`RunAsync`)	~1.4 s	~650 MB
Discover (NuGet config + cache scan)	~660 ms	~240 MB
Load + walk (parallel, all groups)	~1.5 s	~670 MB
Merge (cross-TFM dedup)	2 ms	~550 KB
Emit (Zensical Markdown)	79 ms	~63 MB

Peak working set is bounded too: per-TFM compilation loaders dispose as soon as their last assembly finishes walking, so the memory-mapped BCL reference views are released eagerly instead of accumulating until RunAsync exits.

Per-call hotspots:

Operation	Time	Allocated
`XmlDocToMarkdown.Convert` — plain summary	~26 ns	176 B
`XmlDocToMarkdown.Convert` — tagged with `<see>` / `<c>` / `<paramref>`	~890 ns	304 B
`XmlDocToMarkdown.Convert` — code block + bullet list	~1.1 µs	440 B
`TfmResolver.FindBestRefsTfm` — exact match	~3 ns	0 B
`TfmResolver.FindBestRefsTfm` — platform-suffix strip	~11 ns	0 B
`TfmResolver.FindBestRefsTfm` — netstandard fallback	~500 ns	1 KB
`TypeMerger.Merge` — 600 types × 3 TFMs	~120 µs	330 KB

Strategies the pipeline uses

Custom span-based XML scanner. Every NuGet package ships an <assembly-name>.xml doc file alongside its .dll, holding the /// doc comments for every public symbol. The walker has to read each member's XML fragment per symbol, render its <see> / <c> / <list> / <inheritdoc> tags into Markdown, and do the same again per <param> / <exception> inside it — for thousands of symbols per assembly. XmlReader works for that, but its XmlTextReaderImpl allocates multi-KB internal buffers (NodeData[], NamespaceManager, char buffers, Entry[]) per construction, which dominates the doc-parse profile. So the pipeline ships a small ref struct DocXmlScanner that walks the doc text directly over ReadOnlySpan<char> and implements just the XML grammar that /// doc comments actually use. Both the per-symbol parser and the Markdown renderer drive the scanner, so per-element XML processing is allocation-free apart from the result string.
Build-once-then-read-many XmlDocSource. Each .xml doc file is read once via File.ReadAllBytes + Encoding.UTF8.GetString, then indexed by per-member (offset, length) ranges. The substring is only materialised when a consumer calls Get(memberId), and the source is safe for concurrent reads under the parallel walker.
Eager per-group loader disposal. Each TFM group has its own CompilationLoader with a private MetadataReferenceCache holding memory-mapped views of every reference DLL. As soon as the last assembly in a group finishes its walk, an interlocked counter drops to zero and the loader disposes — peak working set scales with the slowest-finishing group, not the total number of groups times their references.
Streaming type merger. The parallel walk feeds each ApiCatalog into StreamingTypeMerger as soon as it is produced and immediately drops the reference to it, instead of accumulating every catalog in a ConcurrentBag until the walk phase finishes.
Capture-free parallel dispatch. The Parallel.ForEachAsync lambda is static — every dependency it touches is bundled into a WalkContext record attached to each work item, so dispatch never allocates a closure object per assembly.
Pooled StringBuilder on the converter. XmlDocToMarkdown is per-walk by construction; reusing a single builder across every Convert call eliminates the per-element allocation that would otherwise dominate the renderer.
Pre-sized buffers. The buffer for each nupkg zip entry is sized to the entry's known uncompressed length up front, so the backing byte[] is allocated once at the right size instead of doubling-and-copying on every Write. SourceLink URL rewriting fuses the base URL and the line anchor into one interpolated-string handler call, so the GitHub / Bitbucket / GitLab / Azure DevOps blob URL is materialised in a single string.
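The span-based scanner idea can be sketched in miniature as follows. This is not the library's DocXmlScanner: MiniDocScanner and MiniDocScannerDemo are hypothetical names, and the sketch only splits input into tags and text runs, ignoring attributes, entities, and CDATA.

```csharp
using System;

// Hypothetical miniature of a span-based doc scanner (not the library's DocXmlScanner).
public ref struct MiniDocScanner
{
    private ReadOnlySpan<char> _rest;

    public MiniDocScanner(ReadOnlySpan<char> text) => _rest = text;

    // Yields the next token: a tag body (without the angle brackets) or a text run.
    // Slicing a span never copies, so scanning itself allocates nothing.
    public bool TryRead(out ReadOnlySpan<char> token, out bool isTag)
    {
        token = default;
        isTag = false;
        if (_rest.IsEmpty)
        {
            return false;
        }

        if (_rest[0] == '<')
        {
            int close = _rest.IndexOf('>');
            if (close < 0)
            {
                // Unterminated tag: hand back the remainder as plain text.
                token = _rest;
                _rest = default;
                return true;
            }

            token = _rest.Slice(1, close - 1);
            _rest = _rest.Slice(close + 1);
            isTag = true;
            return true;
        }

        int open = _rest.IndexOf('<');
        if (open < 0)
        {
            open = _rest.Length;
        }

        token = _rest.Slice(0, open);
        _rest = _rest.Slice(open);
        return true;
    }
}

public static class MiniDocScannerDemo
{
    // Counts tags and text characters without materialising any substring.
    public static (int Tags, int TextChars) Measure(string xml)
    {
        var scanner = new MiniDocScanner(xml.AsSpan());
        int tags = 0;
        int textChars = 0;
        while (scanner.TryRead(out ReadOnlySpan<char> token, out bool isTag))
        {
            if (isTag) tags++; else textChars += token.Length;
        }

        return (tags, textChars);
    }
}
```

A renderer built on this shape would append each token to its output builder directly from the span, which is why only the final result string needs to be allocated.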
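The build-once-then-read-many index might look something like the sketch below. RangeIndexedDocs is a hypothetical stand-in for XmlDocSource, written under two assumptions: the doc text is already decoded into a string, and every member element has an explicit closing tag.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch of an (offset, length) index over one doc file's text
// (not the library's XmlDocSource). Assumes each <member> has a closing tag.
public sealed class RangeIndexedDocs
{
    private const string Open = "<member name=\"";
    private const string Close = "</member>";

    private readonly string _text;
    private readonly Dictionary<string, (int Offset, int Length)> _ranges = new();

    public RangeIndexedDocs(string text)
    {
        _text = text;
        int pos = 0;
        while ((pos = text.IndexOf(Open, pos, StringComparison.Ordinal)) >= 0)
        {
            int nameStart = pos + Open.Length;
            int nameEnd = text.IndexOf('"', nameStart);
            if (nameEnd < 0) break;

            int bodyStart = text.IndexOf('>', nameEnd);
            if (bodyStart < 0) break;
            bodyStart++;

            int bodyEnd = text.IndexOf(Close, bodyStart, StringComparison.Ordinal);
            if (bodyEnd < 0) break;

            // Only the member id is copied here; the body stays as a range.
            string memberId = text.Substring(nameStart, nameEnd - nameStart);
            _ranges[memberId] = (bodyStart, bodyEnd - bodyStart);
            pos = bodyEnd + Close.Length;
        }
    }

    // The body substring is created only when a consumer asks for it.
    public string? Get(string memberId) =>
        _ranges.TryGetValue(memberId, out var range)
            ? _text.Substring(range.Offset, range.Length)
            : null;
}
```

The dictionary is never modified after the constructor returns, which is what makes concurrent Get calls safe without a lock.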
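The eager per-group disposal described above comes down to an interlocked countdown. A minimal sketch, assuming the number of work items per group is known before the walk starts; CountedRelease and DisposeProbe are hypothetical names, not the library's CompilationLoader.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of countdown-based disposal (not the library's CompilationLoader).
public sealed class CountedRelease
{
    private readonly IDisposable _resource;
    private int _remaining;

    public CountedRelease(IDisposable resource, int workItems)
    {
        if (workItems <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(workItems));
        }

        _resource = resource;
        _remaining = workItems;
    }

    // Called once per finished work item, from any thread.
    // Exactly one caller observes zero, so the resource is disposed exactly once.
    public void SignalItemDone()
    {
        if (Interlocked.Decrement(ref _remaining) == 0)
        {
            _resource.Dispose();
        }
    }
}

// Test double that records how many times it was disposed.
public sealed class DisposeProbe : IDisposable
{
    private int _disposeCount;

    public int DisposeCount => Volatile.Read(ref _disposeCount);

    public void Dispose() => Interlocked.Increment(ref _disposeCount);
}
```

With three work items, the first two SignalItemDone calls leave the resource alive and the third disposes it, regardless of which thread makes which call.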
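Capture-free dispatch can be illustrated with a static lambda whose dependencies travel on the work item. The types below (DemoContext, DemoItem, Counter, StaticDispatch) are hypothetical and much simpler than a real walk context; only Parallel.ForEachAsync is the real API.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical shared state and work-item shapes for illustration only.
public sealed class Counter
{
    private int _total;

    public int Total => Volatile.Read(ref _total);

    public void Add(int value) => Interlocked.Add(ref _total, value);
}

public sealed record DemoContext(Counter Sink);

public readonly record struct DemoItem(string AssemblyPath, DemoContext Context);

public static class StaticDispatch
{
    public static async Task<int> RunAsync(IEnumerable<string> paths)
    {
        var context = new DemoContext(new Counter());

        // Attach the shared context to every work item up front.
        var items = new List<DemoItem>();
        foreach (string path in paths)
        {
            items.Add(new DemoItem(path, context));
        }

        await Parallel.ForEachAsync(items, static (item, cancellationToken) =>
        {
            // 'static' makes the compiler reject any captured local or 'this',
            // so everything the body needs must arrive through the item.
            item.Context.Sink.Add(item.AssemblyPath.Length);
            return ValueTask.CompletedTask;
        });

        return context.Sink.Total;
    }
}
```

The static modifier is a compile-time guard: if someone later references a local from inside the lambda, the build fails instead of silently reintroducing a closure.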
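For the pre-sized buffer strategy, a minimal sketch using System.IO.Compression is shown below. ZipEntryReader is a hypothetical helper, and the sketch assumes the archive was opened in read mode (where ZipArchiveEntry.Length is available) and that entries fit in a single array.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration (not the library's extraction code).
public static class ZipEntryReader
{
    public static byte[] ReadExact(ZipArchiveEntry entry)
    {
        // Length is the uncompressed size, so the array is allocated once
        // at its final size instead of growing as bytes arrive.
        var buffer = new byte[checked((int)entry.Length)];

        using Stream stream = entry.Open();
        stream.ReadExactly(buffer);   // throws if the entry is shorter than declared
        return buffer;
    }
}
```

The alternative of copying the entry into a default MemoryStream would start small and double its backing array repeatedly, copying the data each time.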

Repository layout

SourceDocParserLib/
  src/
    SourceDocParser/
    SourceDocParser.NuGet/
    SourceDocParser.Docfx/
    SourceDocParser.Zensical/
    tests/
      SourceDocParser.Tests/             unit tests (TUnit)
      SourceDocParser.IntegrationTests/  end-to-end + Zensical render-smoke
    Directory.Build.props                shared lib config
    Directory.Packages.props             central package versions
    SourceDocParserLib.slnx
  Directory.Build.props
  version.json                           Nerdbank.GitVersioning
  .editorconfig
  stylecop.json

dotnet build from src/ packs every non-test project into artifacts/packages/ automatically (<GeneratePackageOnBuild>true</GeneratePackageOnBuild>). Consumers in other repos can wire that directory up as a local feed via nuget.config until the libraries are published.

Acknowledgements

The metadata extraction pipeline is inspired by — and lifts patterns from — dotnet/docfx (MIT licensed). docfx's Roslyn-based assembly walker, inheritdoc resolution, and overall metadata model shaped this library's design. See LICENSE for the original docfx attribution.

Built on:

Roslyn (Microsoft.CodeAnalysis.CSharp) for compilation + symbol model
ICSharpCode.Decompiler for transitive reference resolution
NuGet.Frameworks + NuGet.Versioning for proper TFM compatibility and SemVer ordering
Polly v8 for HTTP retry/rate-limit pipelines

License

MIT — see LICENSE for the full text and the docfx attribution.

Product	Compatible and additional computed target framework versions.
.NET	net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.


net10.0
- ICSharpCode.Decompiler (>= 10.0.0.8330)
- Microsoft.CodeAnalysis.CSharp (>= 5.3.0)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.7)
- NuGet.Frameworks (>= 7.3.1)
- Polly.RateLimiting (>= 8.6.6)
- System.Threading.RateLimiting (>= 10.0.7)

NuGet packages (4)

Showing the top 4 NuGet packages that depend on SourceDocParser:

Package	Description	Downloads
SourceDocParser.NuGet	NuGet-backed IAssemblySource for SourceDocParser. Discovers, fetches and extracts packages, then exposes their per-TFM assemblies to the parser.	2.9K
SourceDocParser.Zensical	Zensical / mkdocs Material emitter for SourceDocParser. Renders the parser's ApiCatalog into a flat tree of Markdown pages tuned for the Zensical theme (admonitions, content tabs, mermaid diagrams).	2.8K
SourceDocParser.Docfx	docfx compatibility for SourceDocParser. Reads and writes docfx.json shapes (metadata + build sections) so an existing docfx site can plug into the parser pipeline, and emits docfx ManagedReference YAML pages so the parser output is consumable by docfx as a drop-in replacement for its own metadata extractor.	1.2K
NuStreamDocs.CSharpApiGenerator	Generate API reference pages from your .NET assemblies as part of your NuStreamDocs build. Point at NuGet packages or local DLLs and the plugin writes Markdown reference docs into your docs tree, ready to be linked from your handwritten content.	736

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
2.0.0	513	5/10/2026
1.4.2	789	5/2/2026
1.4.1	487	4/30/2026
1.3.1	231	4/28/2026
1.2.1	126	4/28/2026
1.1.1	116	4/28/2026
1.0.5	122	4/28/2026
1.0.3	123	4/28/2026
0.6.1-alpha	117	4/28/2026
0.5.1-alpha	112	4/28/2026
0.4.1-alpha	109	4/28/2026
0.3.1-alpha	119	4/27/2026
0.2.1-alpha	117	4/27/2026
0.1.23-alpha	113	4/25/2026