WebReaper.Extraction.Generators 11.3.0

WebReaper.Extraction.Generators

Roslyn source generator that emits a static Schema property and a reflection-free static Materialize method on partial classes marked with [ScrapeSchema]. This is the .NET-native structural differentiator (REPOSITIONING-PLAN §2.3): Pydantic-style typed schemas generated at compile time, which Python's runtime-reflection approach structurally cannot match.

Install

You usually want both packages together (this one is a compile-time analyzer; the attributes ship in a sibling package):

dotnet add package WebReaper.Extraction.Generators
dotnet add package WebReaper.Extraction.Attributes

WebReaper.Extraction.Generators is marked DevelopmentDependency=true, so NuGet references it with PrivateAssets=all. It runs as an analyzer at build time and does not propagate to your project's runtime dependency graph.

What's emitted

For each class marked with [ScrapeSchema], the generator emits:

// Shape of the generated members (signatures only; the generator supplies the bodies).
public partial class Article
{
    public static Schema Schema { get; }
    public static Article Materialize(JsonObject json);
}

Schema is built once at compile time from the [ScrapeField] attributes on the class's properties. Materialize is reflection-free, so an AOT publish can trim and inline it.
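To make "reflection-free" concrete, here is a minimal hand-written sketch of the kind of method a generator could emit. It is not the actual generated output: the ArticleSketch type, the JSON key names ("title", "views", "tags"), and the use of System.Text.Json.Nodes.JsonObject are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

// Hypothetical stand-in for a [ScrapeSchema] class; the real generated body may differ.
public sealed class ArticleSketch
{
    public string? Title { get; set; }
    public int Views { get; set; }
    public List<string> Tags { get; set; } = new();

    // Reflection-free: each property is assigned through an explicit, statically typed read.
    // The key names are assumptions; the real keys depend on the schema.
    public static ArticleSketch Materialize(JsonObject json)
    {
        var result = new ArticleSketch
        {
            Title = json["title"]?.GetValue<string>(),
            Views = json["views"]?.GetValue<int>() ?? 0,
        };

        if (json["tags"] is JsonArray tags)
        {
            foreach (var tag in tags)
            {
                if (tag is not null) result.Tags.Add(tag.GetValue<string>());
            }
        }

        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        var json = new JsonObject
        {
            ["title"] = "Hello",
            ["views"] = 42,
            ["tags"] = new JsonArray("a", "b"),
        };

        var article = ArticleSketch.Materialize(json);
        Console.WriteLine($"{article.Title} {article.Views} {string.Join(",", article.Tags)}");
        // prints "Hello 42 a,b"
    }
}
```

Because every read is an ordinary typed call, nothing has to be discovered through PropertyInfo at runtime, which is the property the trimmer relies on.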

Quick start

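using System;
using System.Collections.Generic;
using WebReaper.Builders;
using WebReaper.Extraction.Attributes;

// If everything lives in one file, top-level statements must precede type declarations (CS8803).
var engine = await ScraperEngineBuilder
    .Crawl("https://example.com/post")
    .Extract(Article.Schema)
    .Subscribe(p => HandleArticle(Article.Materialize(p.Data)))
    .BuildAsync();

// HandleArticle is your own callback; this placeholder only prints the title.
static void HandleArticle(Article article) => Console.WriteLine(article.Title);

[ScrapeSchema]
public partial class Article
{
    [ScrapeField("h1")]
    public string? Title { get; set; }

    [ScrapeField(".views", Type = SchemaFieldType.Integer)]
    public int Views { get; set; }

    [ScrapeField(".tag", IsList = true)]
    public List<string> Tags { get; set; } = new();
}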

v1 scope

Common case only:

  • Single-level schemas
  • Primitive fields (string, int, bool, DateTime, float)
  • List<T> of primitives

Nested [ScrapeSchema] types are explicitly deferred to a future version. The attributes package supports the syntax; the generator does not yet emit code for nested classes.
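The supported primitive kinds all start life as scraped text, so some conversion step has to turn a string into an int, bool, DateTime, or float. The sketch below shows one culture-invariant way to do that by hand. It is an illustration only: PrimitiveCoercion and its method names are hypothetical, and the source does not say how the generator actually performs these conversions.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; not part of WebReaper.
public static class PrimitiveCoercion
{
    // Accepts surrounding whitespace and thousands separators such as "1,234".
    public static int ToInt(string raw) =>
        int.Parse(raw.Trim(), NumberStyles.Integer | NumberStyles.AllowThousands, CultureInfo.InvariantCulture);

    // Invariant culture: "." is always the decimal separator.
    public static float ToFloat(string raw) =>
        float.Parse(raw.Trim(), NumberStyles.Float, CultureInfo.InvariantCulture);

    // bool.Parse is case-insensitive and accepts "true" / "false".
    public static bool ToBool(string raw) => bool.Parse(raw.Trim());

    // RoundtripKind keeps a trailing "Z" as UTC instead of converting to local time.
    public static DateTime ToDateTime(string raw) =>
        DateTime.Parse(raw.Trim(), CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
}

public static class CoercionDemo
{
    public static void Main()
    {
        Console.WriteLine(PrimitiveCoercion.ToInt(" 1,234 "));
        Console.WriteLine(PrimitiveCoercion.ToBool("True"));
        Console.WriteLine(PrimitiveCoercion.ToDateTime("2026-05-26T10:00:00Z").Kind);
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps results stable across machines whose regional settings use a different decimal or date format.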

Dependencies

  • .NETStandard 2.0: no dependencies.

Version   Downloads   Last Updated
11.3.0    91          6/6/2026
11.2.0    93          6/4/2026
11.1.2    98          6/3/2026
11.1.1    102         5/30/2026
11.1.0    97          5/30/2026
11.0.0    89          5/30/2026
10.2.0    97          5/29/2026
10.1.0    96          5/29/2026
10.0.0    99          5/26/2026

10.0.1: NuGet metadata polish. Adds PackageIcon + PackageReadmeFile so the package displays a logo and README on its NuGet page. Removes em-dashes from Description and release notes. No code changes.

10.0.0: initial release. Roslyn IIncrementalGenerator (ADR-0045) emitting a compile-time `static Schema Schema` and a reflection-free `static Materialize(JsonObject)` on partial classes marked with [ScrapeSchema]. AOT-clean: no reflection, no dynamic; the source generator runs at compile time and emits ordinary C# the AOT publish trims and inlines. The .NET-native structural differentiator (REPOSITIONING-PLAN §2.3): Pydantic-parity Python cannot match. v1 ships the common case: single-level schemas, primitive fields, List<T> of primitives. Nested [ScrapeSchema] POCOs are explicitly deferred. Pairs with WebReaper.Extraction.Attributes; requires WebReaper 10.0.0 at runtime.