Mostlylucid.StyloExtract.AspNetCore 1.7.1


Mostlylucid.StyloExtract.AspNetCore

AddStyloExtract() DI extensions for ASP.NET Core and any Microsoft.Extensions.DependencyInjection host, plus opt-in Markdown content negotiation middleware.

What this package is

The canonical way to register StyloExtract in any .NET application that uses IServiceCollection. It depends on Core, Html, Fingerprint, Templates, Heuristics, and Markdown and wires them all up correctly.

Since v1.1.0 it also ships the Markdown content negotiation suite: a global middleware, a per-action MVC attribute, and a Minimal API extension that all transparently return Markdown instead of HTML when a client sends Accept: text/markdown.

When to depend on this directly

This is the package most application code should reference directly. Add it to your web API, worker service, or console application, call AddStyloExtract(), and inject ILayoutExtractor wherever you need it.

dotnet add package Mostlylucid.StyloExtract.AspNetCore

Usage

// Program.cs (ASP.NET Core)
builder.Services.AddStyloExtract(o =>
{
    o.StorePath    = "styloextract-templates.db";
    o.HostHashKey  = Environment.GetEnvironmentVariable("STYLOEXTRACT_HMAC_KEY");
    o.DefaultProfile = ExtractionProfile.RagFull;

    o.Match.FastPathJaccardThreshold = 0.85;
    o.Match.SlowPathCosineThreshold  = 0.75;
    o.Centroid.DriftRefitThreshold   = 0.35;
});

// Inject ILayoutExtractor in a controller, service, or background worker
public class ContentService(ILayoutExtractor extractor)
{
    public async Task<string> GetMarkdownAsync(string html, Uri uri)
    {
        var result = await extractor.ExtractAsync(html, uri);
        return result.Markdown;
    }
}

Version event sink

To receive template version change events, register an ITemplateVersionEventSink before calling AddStyloExtract:

services.AddSingleton<ITemplateVersionEventSink, MyVersionEventSink>();
services.AddStyloExtract(o => { ... });

If no sink is registered, DefaultNoopVersionEventSink is used (events discarded).

Response policy framework (v1.2)

IResponsePolicy is the canonical response-transformation primitive in StyloExtract.AspNetCore. Markdown content negotiation is the first built-in policy instance; cache-hint emission is the second. The framework is modelled on IOutputCachePolicy's three-phase lifecycle.

Three phases

public interface IResponsePolicy
{
    // Pre-pipeline: parse request, configure vary semantics, store per-request state.
    ValueTask OnRequestAsync(ResponsePolicyContext context);

    // Pre-serve: short-circuit the response (e.g. serve from cache) without calling downstream.
    ValueTask OnServeAsync(ResponsePolicyContext context);

    // Post-produce: transform the buffered body, set headers, store in cache.
    ValueTask OnProducedAsync(ResponsePolicyContext context);
}

Setup

Recommended path (new in v1.2): use the fluent AddStyloExtract(Action<ResponsePolicyBuilder>) overload.

// 1. Register the core stack and Markdown negotiation.
builder.Services.AddStyloExtract(o => o.StorePath = "styloextract.db");
builder.Services.AddStyloExtractMarkdownNegotiation(o => { ... });

// 2. Register named policies via the fluent builder (recommended).
builder.Services.AddStyloExtract(b =>
{
    b.AddPolicy("md",    p => p.NegotiateMarkdown());
    b.AddPolicy("cache", p => p.CacheHints(o =>
    {
        o.MaxAge = TimeSpan.FromMinutes(5);
        o.EmitETag = true;
        o.HonorIfNoneMatch = true;
    }));
});

// 3. Wire the middleware (after UseRouting, UseAuthentication, UseAuthorization).
app.UseRouting();
app.UseStyloExtract();

If you need access to the service provider to construct policies manually, register ResponsePolicyOptions yourself through the factory overload of AddSingleton instead:

builder.Services.AddSingleton<ResponsePolicyOptions>(sp =>
{
    var opts = new ResponsePolicyOptions();
    opts.AddPolicy("md", sp.GetRequiredService<MarkdownNegotiationPolicy>());
    opts.AddPolicy("cache", new CacheHintPolicy(new CacheHintOptions { MaxAge = TimeSpan.FromMinutes(5) }));
    return opts;
});

Attaching policies to endpoints

// Minimal API: chain WithResponsePolicy calls in declaration order.
app.MapGet("/article", handler)
    .WithResponsePolicy("md")
    .WithResponsePolicy("cache");

// MVC controller action: use [ResponsePolicy] attribute.
[HttpGet("article")]
[ResponsePolicy("md")]
public IActionResult GetArticle() => Content(html, "text/html");

Composition

Policies run in declaration order. Each policy's OnProducedAsync sees the body as it was left by the preceding policy. When MarkdownNegotiationPolicy runs before CacheHintPolicy, the ETag is computed from the Markdown body, not the original HTML.
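
The ordering rule can be illustrated with plain delegates. This is a minimal sketch, not the package's IResponsePolicy types: the dictionary-based response, the string-replacement "Markdown" step, and the hex-encoded SHA-256 ETag are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical stand-in for a buffered response: a body plus headers in one bag.
var response = new Dictionary<string, string> { ["body"] = "<h1>Title</h1>" };

// Each "policy" is just a function that mutates the response in place.
var policies = new List<Action<Dictionary<string, string>>>
{
    // Stand-in for Markdown negotiation: rewrites the body.
    r => r["body"] = r["body"].Replace("<h1>", "# ").Replace("</h1>", ""),
    // Stand-in for cache hints: hashes whatever body it is handed.
    r => r["etag"] = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(r["body"]))),
};

// Run in declaration order, so the second policy sees the first one's output.
foreach (var policy in policies) policy(response);

var markdownHash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes("# Title")));
Console.WriteLine(response["etag"] == markdownHash); // True: the ETag covers the Markdown body
```

Swapping the two entries in the list would make the hash cover the HTML instead, which is why declaration order matters when chaining WithResponsePolicy calls.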

Backward compat (v1.1 paths still work)

All v1.1 entry points (UseStyloExtractMarkdownNegotiation, [NegotiateMarkdown], WithMarkdownNegotiation, StyloExtractResults.HtmlOrMarkdown) are unchanged and behave exactly as before. The new MarkdownNegotiationPolicy provides equivalent functionality on the IResponsePolicy pipeline. New code should prefer it: register it with services.AddStyloExtract(b => b.AddPolicy("md", p => p.NegotiateMarkdown(...))) and attach it with endpoint.WithResponsePolicy("md").

The framework is purely additive:

  • All v1.1.0 public API signatures are unchanged.
  • Existing AddStyloExtract(Action<StyloExtractOptions>?) signature is unchanged.

Markdown content negotiation

StyloExtract can transparently serve Markdown instead of HTML when a client sends Accept: text/markdown. Three opt-in paths are provided; choose the one that fits your app.

1. Global middleware

Call AddStyloExtractMarkdownNegotiation() in your services and UseStyloExtractMarkdownNegotiation() in your pipeline. Every HTML response on every route is subject to negotiation.

// Program.cs
builder.Services.AddStyloExtract(o => o.StorePath = "styloextract.db");
builder.Services.AddStyloExtractMarkdownNegotiation(o =>
{
    o.DefaultProfile = ExtractionProfile.RagFull;
    o.EmitVaryHeader = true;       // adds Vary: Accept to negotiated responses
    o.MaxBodyBytes   = 4 * 1024 * 1024; // skip bodies larger than 4 MB
});

// ...
app.UseRouting();
app.UseStyloExtractMarkdownNegotiation(); // after UseRouting
app.MapControllers();

A client that sends Accept: text/markdown receives Content-Type: text/markdown; charset=utf-8. All other clients receive the original HTML. With EmitVaryHeader enabled, Vary: Accept is added to negotiated responses so HTTP caches store the HTML and Markdown variants separately.
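
The negotiation decision itself comes down to inspecting the Accept header. The sketch below is a simplified illustration, not the middleware's actual logic: WantsMarkdown is a hypothetical helper, and it deliberately ignores q-values and wildcards such as */*.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true when the Accept header lists text/markdown.
// Simplification: q-values and wildcards are not considered.
static bool WantsMarkdown(string acceptHeader) =>
    !string.IsNullOrWhiteSpace(acceptHeader) &&
    acceptHeader
        .Split(',')
        .Select(part => part.Split(';')[0].Trim()) // drop parameters such as ;q=0.9
        .Any(mediaType => mediaType.Equals("text/markdown", StringComparison.OrdinalIgnoreCase));

Console.WriteLine(WantsMarkdown("text/markdown"));                  // True
Console.WriteLine(WantsMarkdown("text/html, text/markdown;q=0.9")); // True
Console.WriteLine(WantsMarkdown("text/html"));                      // False
```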

2. Per-action MVC attribute

Use [NegotiateMarkdown] on a controller action or controller class when you want per-endpoint control without a global middleware.

[HttpGet("article/{id}")]
[NegotiateMarkdown(ExtractionProfile.AgentNavigation)]
public IActionResult GetArticle(int id)
{
    var html = BuildArticleHtml(id);
    return Content(html, "text/html");
}

The attribute runs as an IAsyncResultFilter. It does not require the global middleware to be registered.

3. Minimal API

Use .WithMarkdownNegotiation() on a route builder to add an endpoint filter, or use StyloExtractResults.HtmlOrMarkdown(...) to produce the right result type in the handler itself.

// Endpoint filter approach
app.MapGet("/article", () => Results.Content(BuildHtml(), "text/html"))
   .WithMarkdownNegotiation(ExtractionProfile.RagFull);

// Inline IResult approach
app.MapGet("/article", (IHttpContextAccessor acc) =>
    StyloExtractResults.HtmlOrMarkdown(BuildHtml()));

StyloExtractResults.HtmlOrMarkdown inspects Accept and calls ILayoutExtractor before the response is written, making it the simplest approach for Minimal API when you control the handler body.

Profile selection

The profile used for extraction is resolved in this order:

  1. X-Stylo-Profile request header (e.g. AgentNavigation)
  2. stylo_profile query string parameter (e.g. ?stylo_profile=RagFull)
  3. MarkdownNegotiationOptions.DefaultProfile (default: RagFull)

The header and query names are configurable via MarkdownNegotiationOptions.ProfileHeaderName and ProfileQueryName.
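
The resolution order can be sketched as a simple fallback chain. ResolveProfile is a hypothetical helper working on plain dictionaries rather than an HttpContext, and it does not validate that the value names a real profile, which the package may well do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resolver mirroring the documented order: header, then query, then default.
static string ResolveProfile(
    IReadOnlyDictionary<string, string> headers,
    IReadOnlyDictionary<string, string> query,
    string defaultProfile = "RagFull")
{
    if (headers.TryGetValue("X-Stylo-Profile", out var fromHeader) && fromHeader.Length > 0)
        return fromHeader;
    if (query.TryGetValue("stylo_profile", out var fromQuery) && fromQuery.Length > 0)
        return fromQuery;
    return defaultProfile;
}

var empty = new Dictionary<string, string>();
var withHeader = new Dictionary<string, string> { ["X-Stylo-Profile"] = "AgentNavigation" };
var withQuery = new Dictionary<string, string> { ["stylo_profile"] = "RagFull" };

Console.WriteLine(ResolveProfile(withHeader, withQuery)); // AgentNavigation (header wins)
Console.WriteLine(ResolveProfile(empty, withQuery));      // RagFull (from the query string)
Console.WriteLine(ResolveProfile(empty, empty));          // RagFull (the default)
```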

Query-string Accept override (v1.1.0+)

Browser clients cannot easily set custom Accept headers. The AcceptOverrideQueryName option (default: "format") maps a query-string value to a virtual Accept header, so ?format=markdown behaves identically to Accept: text/markdown for any browser.

builder.Services.AddStyloExtractMarkdownNegotiation(o =>
{
    o.AcceptOverrideQueryName = "format"; // null to disable
    // Default mappings: markdown/md => text/markdown, html => text/html,
    //                   json => application/json, text => text/plain
});

When the override fires, the response carries X-Stylo-Accept-Override: text/markdown so consumers can see it was applied.
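
As a rough illustration of the mapping, the sketch below turns a ?format= value into the media type to negotiate on. EffectiveAccept is a hypothetical helper; the case-insensitive lookup and the fallback to the real header for unrecognised values are assumptions, not documented behaviour.

```csharp
using System;
using System.Collections.Generic;

// Default mappings as listed in the option comment above.
// Case-insensitive matching is an assumption of this sketch.
var mappings = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["markdown"] = "text/markdown",
    ["md"] = "text/markdown",
    ["html"] = "text/html",
    ["json"] = "application/json",
    ["text"] = "text/plain",
};

// Hypothetical helper: a recognised format value wins, otherwise the real header is used.
string EffectiveAccept(string realAccept, string formatValue) =>
    !string.IsNullOrEmpty(formatValue) && mappings.TryGetValue(formatValue, out var mapped)
        ? mapped
        : realAccept;

Console.WriteLine(EffectiveAccept("text/html", "markdown")); // text/markdown
Console.WriteLine(EffectiveAccept("text/html", ""));         // text/html
```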

Caching (v1.1.0+)

Set Cache.Enabled = true to avoid re-extracting the same URL and profile combination on repeated requests. The implementation uses IDistributedCache (in-memory by default; to use a real distributed cache, register one before calling AddStyloExtractMarkdownNegotiation).

builder.Services.AddStyloExtractMarkdownNegotiation(o =>
{
    o.Cache.Enabled = true;
    o.Cache.AbsoluteExpiration = TimeSpan.FromMinutes(5);
    o.Cache.SlidingExpiration = TimeSpan.FromMinutes(2);
    o.Cache.EnableEtag = true;               // honors If-None-Match; returns 304
    o.Cache.EmitCacheControlHeader = false;  // set true for CDN-friendly Cache-Control: public
});

Cache key shape: sha256(method + "|" + scheme + "|" + host + "|" + path + "|" + sortedQuery(minus override key) + "|" + profile). The Accept override query parameter is excluded from the key so ?format=markdown and a bare Accept: text/markdown request share the same cache slot.
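
A minimal sketch of a key with that shape follows. CacheKey is a hypothetical helper; the "&"-joined key=value form of the sorted query, UTF-8 encoding, ordinal sorting, and uppercase hex output are assumptions, so the exact bytes will not necessarily match what the package stores.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper following the documented key shape.
static string CacheKey(string method, string scheme, string host, string path,
    IEnumerable<KeyValuePair<string, string>> query, string profile, string overrideKey = "format")
{
    // Drop the Accept override parameter, then sort what remains for a stable key.
    var sortedQuery = string.Join("&", query
        .Where(kv => !kv.Key.Equals(overrideKey, StringComparison.OrdinalIgnoreCase))
        .OrderBy(kv => kv.Key, StringComparer.Ordinal)
        .ThenBy(kv => kv.Value, StringComparer.Ordinal)
        .Select(kv => $"{kv.Key}={kv.Value}"));

    var raw = string.Join("|", method, scheme, host, path, sortedQuery, profile);
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(raw)));
}

var withOverride = CacheKey("GET", "https", "example.com", "/article",
    new[] { KeyValuePair.Create("format", "markdown"), KeyValuePair.Create("page", "2") }, "RagFull");
var headerOnly = CacheKey("GET", "https", "example.com", "/article",
    new[] { KeyValuePair.Create("page", "2") }, "RagFull");

Console.WriteLine(withOverride == headerOnly); // True: both requests share one cache slot
```

Because the profile is part of the key, the same URL requested with a different profile lands in a different slot.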

Response headers on Markdown responses:

Header          Value
X-Stylo-Cache   miss or hit
ETag            SHA-256 digest of the Markdown bytes (when EnableEtag = true)
Cache-Control   public, max-age=N (when EmitCacheControlHeader = true)

AOT

This package sets IsAotCompatible=true. The negotiation middleware and attribute use no reflection-based JSON serialization; Markdown output is plain text. IDistributedCache and MemoryDistributedCache are both AOT-safe.


Full documentation and package family

Target frameworks

.NET: net10.0 is compatible. Computed as compatible: net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on Mostlylucid.StyloExtract.AspNetCore:

Package Downloads
Mostlylucid.StyloExtract.StyloBot

Bridge between StyloExtract and StyloBot's IActionPolicy registry. Provides extract-markdown / extract-headers / extract-sidecar / extract-passthrough action policies that operators reference by name from EndpointPolicy rules or [BotAction] attributes.


Version         Downloads  Last Updated
1.8.0-alpha.8   0          6/25/2026
1.8.0-alpha.4   0          6/25/2026
1.8.0-alpha.3   0          6/25/2026
1.8.0-alpha.2   0          6/25/2026
1.8.0-alpha.1   6          6/24/2026
1.7.1           82         6/23/2026
1.7.0           88         6/23/2026
1.6.2           88         6/23/2026
1.6.1           182        6/22/2026
1.6.0           93         6/22/2026
1.5.2           93         6/22/2026
1.4.0           96         6/21/2026
1.3.0           92         6/21/2026
1.2.0           96         6/21/2026
1.1.0           93         6/21/2026
1.0.1           94         6/21/2026
1.0.0           110        6/21/2026

StyloExtract 1.7.1 - 2026-06-23
================================

Patch release. One bug fix to DomMarkdownWalker so heavily-indented
source HTML (typical of Tailwind / HTMX / framework-generated markup)
stops producing markdown that CommonMark parses as indented code blocks.

Bug
---

* DomMarkdownWalker.AppendEscapedInline preserved leading whitespace at
 line-start, so consecutive text-node visits each emitted a single
 space and accumulated to 4+ spaces ahead of links and paragraphs.
 CommonMark then parsed those lines as indented code blocks and the
 resulting markdown rendered as raw `[text](href)` text instead of
 clickable links. Leading whitespace is now skipped at line-start;
 inner-paragraph whitespace still collapses to single spaces as before.
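
The fixed behaviour can be approximated as follows. This is a sketch of the idea only, not the package's DomMarkdownWalker code: AppendInline is a hypothetical stand-in that collapses whitespace runs to one space and drops whitespace entirely when the output is at the start of a line.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the walker's inline append step.
static void AppendInline(StringBuilder output, string text)
{
    foreach (var ch in text)
    {
        if (char.IsWhiteSpace(ch))
        {
            var atLineStart = output.Length == 0 || output[output.Length - 1] == '\n';
            var afterSpace = output.Length > 0 && output[output.Length - 1] == ' ';
            // Emit at most one space, and none at all at line-start.
            if (!atLineStart && !afterSpace) output.Append(' ');
        }
        else
        {
            output.Append(ch);
        }
    }
}

// Several whitespace-only text nodes (typical of indented markup) followed by a link.
var sb = new StringBuilder("First card\n\n");
foreach (var node in new[] { "\n    ", "  ", "\n      ", " " }) AppendInline(sb, node);
AppendInline(sb, "[text](href)");

Console.WriteLine(sb.ToString().EndsWith("\n\n[text](href)", StringComparison.Ordinal)); // True
```

With no spaces accumulated ahead of the link, the line cannot reach the four-space indent that CommonMark treats as an indented code block.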

Real-world repro: lucidVIEW loading mostlylucid.net (HTMX-driven blog
index). Before 1.7.1 every blog-post card after the first collapsed into
a code block; after 1.7.1 each card is a styled link with its summary
as its own paragraph beneath.

----

StyloExtract 1.7.0 - 2026-06-23
================================

Structured markdown output. Previously every classified block flattened
to element.TextContent.Trim() and the renderer emitted a wall of plain
paragraphs with "# " collapsing all six heading levels. This release
makes ExtractedBlock.Markdown carry a real GFM rendition produced by
walking the block's DOM subtree.

Highlights
----------

* Heading levels H1-H6 emit one-through-six "#" characters.
* Inline content preserved: links, **bold**, *italic*, `code`, images,
 hard breaks.
* Lists, fenced code blocks (with language hint), blockquotes (single
 and multi-paragraph following GFM convention), and figures all render
 with their structure intact.
* GFM tables built from a WHATWG slot grid: colspan/rowspan respected,
 caption rendered above as bold paragraph, alignment markers derived
 from align attribute or style="text-align" via majority-vote, pipes
 escaped, newlines converted to <br>. Complex tables (multi-row thead,
 nested tables, block content in a cell) fall back to raw HTML which
 CommonMark passes through.
* Sidebar and RelatedLinks now use the DOM walker. The classic "on this
 page" TOC pattern renders as a proper markdown list with anchor links
 instead of flattening to indented text.
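
Two of the simpler rules above, heading levels and table-cell escaping, can be sketched in a few lines. HeadingPrefix and EscapeCell are hypothetical helpers written for illustration, not the walker's implementation.

```csharp
using System;

// H1-H6 map to one through six '#' characters; out-of-range levels are clamped.
static string HeadingPrefix(int level) => new string('#', Math.Clamp(level, 1, 6)) + " ";

// Pipes are escaped and newlines become <br> so a cell stays on one table row.
static string EscapeCell(string text) =>
    text.Replace("|", "\\|").Replace("\r\n", "<br>").Replace("\n", "<br>");

Console.WriteLine(HeadingPrefix(3) + "Install"); // ### Install
Console.WriteLine(EscapeCell("a|b\nc"));         // a\|b<br>c
```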

Performance
-----------

Walker on Apple M5 / .NET 10, full pipeline numbers in parentheses:

 Small article: 1.3 us / 8 KB    (full pipeline:  370 us /  925 KB)
 Medium doc  : 25.2 us / 72 KB   (full pipeline:  491 us /  823 KB)
 Large doc   : 34.1 us / 114 KB  (full pipeline:  642 us /  843 KB)
 Table-heavy : 69.2 us / 165 KB  (full pipeline:  641 us /  688 KB)

Walker share of ExtractAsync total time fell from 25-55% to 5-11% across
the four scenarios. ExtractAsync continues to sit well under the spec's
15ms p99 budget on a cache hit.

Compatibility
-------------

Backwards-compatible. ExtractedBlock.Text continues to project the
flattened plain-text view unchanged; the new markdown rendition is read
via ExtractedBlock.Markdown. Existing extraction profiles behave
identically; the only observable change is that the markdown emitted by
TypedMarkdownRenderer is now reader-grade rather than flat prose.

Tests
-----

329 tests across 7 projects, all green. 51 unit tests on the new walker
cover inline composition, list and code rendering, and the full GFM
table reconstruction path including the complexity-detection fallback
to raw HTML. Four end-to-end pipeline tests exercise the spec's headline
gaps (heading levels, inline links, lists, GFM tables) through
parse -> clean -> segment -> classify -> render -> SQLite.

See CHANGELOG.md for the full record.