Compendium.Adapters.HuggingFace 1.0.0-preview.0

This is a prerelease version of Compendium.Adapters.HuggingFace.

dotnet add package Compendium.Adapters.HuggingFace --version 1.0.0-preview.0

NuGet\Install-Package Compendium.Adapters.HuggingFace -Version 1.0.0-preview.0

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="Compendium.Adapters.HuggingFace" Version="1.0.0-preview.0" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

<PackageVersion Include="Compendium.Adapters.HuggingFace" Version="1.0.0-preview.0" />
                    

                            Directory.Packages.props

<PackageReference Include="Compendium.Adapters.HuggingFace" />
                    

                            Project file

For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.

paket add Compendium.Adapters.HuggingFace --version 1.0.0-preview.0

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: Compendium.Adapters.HuggingFace, 1.0.0-preview.0"

#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

#:package Compendium.Adapters.HuggingFace@1.0.0-preview.0

#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

#addin nuget:?package=Compendium.Adapters.HuggingFace&version=1.0.0-preview.0&prerelease
                    

                            Install as a Cake Addin

#tool nuget:?package=Compendium.Adapters.HuggingFace&version=1.0.0-preview.0&prerelease
                    

                            Install as a Cake Tool

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

`compendium-adapter-huggingface`

Hugging Face AI provider adapter for the Compendium event-sourcing framework. Implements IAIProvider from Compendium.Abstractions.AI against the two main Hugging Face inference surfaces — dedicated Inference Endpoints (paid, managed) and the shared Serverless Inference API — through the OpenAI-compatible Messages API and the feature-extraction task.

Built from template-compendium-adapter-dotnet per ADR-0006 (multi-repo adapter split).

Install

dotnet add package Compendium.Adapters.HuggingFace

Quick start — dedicated Inference Endpoint (production)

services.AddCompendiumHuggingFace(opt =>
{
    opt.Mode = HuggingFaceMode.InferenceEndpoint;
    opt.HfToken = Environment.GetEnvironmentVariable("HF_TOKEN")!;
    opt.EndpointUrl = "https://<your-endpoint>.endpoints.huggingface.cloud";
});

var ai = sp.GetRequiredService<IAIProvider>();
var result = await ai.CompleteAsync(new CompletionRequest
{
    Model = string.Empty, // endpoint is bound to one model at deploy time
    Messages = new() { Message.User("What is event sourcing?") }
});

Quick start — Serverless Inference (prototyping)

services.AddCompendiumHuggingFace(opt =>
{
    opt.Mode = HuggingFaceMode.ServerlessInference;
    opt.HfToken = Environment.GetEnvironmentVariable("HF_TOKEN")!;
    opt.DefaultChatModel = "meta-llama/Meta-Llama-3.1-8B-Instruct";
    opt.DefaultEmbeddingModel = "sentence-transformers/all-MiniLM-L6-v2";
});

Mode comparison

Aspect	`InferenceEndpoint`	`ServerlessInference`
URL shape	`https://<name>.endpoints.huggingface.cloud` (one per endpoint)	`https://api-inference.huggingface.co/models/<id>`
Cost model	Per-hour compute (CPU / GPU / accelerator)	Free tier + Pro (per-token, shared org quota)
Rate limits	Your endpoint's compute capacity only	Aggressive shared limits; HTTP 429 common
Cold starts	None (warm) or autoscale-to-zero opt-in (~30 s warm-up)	Frequent — first call to a model often loads it
Latency	Predictable	Variable
Model selection	One model per endpoint, fixed at deploy time	Any model in the Hugging Face Hub
Production fit	Yes (with autoscaling, regions, EU residency)	No — prototyping / spike work only

Options

Option	Required	Default	Notes
`Mode`	yes	`InferenceEndpoint`	One of `InferenceEndpoint`, `ServerlessInference`
`HfToken`	yes	—	A `hf_*` user access token from https://huggingface.co/settings/tokens
`EndpointUrl`	endpoint mode	—	Full URL of your dedicated endpoint
`ServerlessBaseUrl`	serverless mode	`https://api-inference.huggingface.co/models`	Override if HF moves the API
`DefaultChatModel`	—	`meta-llama/Meta-Llama-3.1-8B-Instruct`	Used as fallback when request has no `Model`. In endpoint mode it's sent informationally; the server uses what the endpoint is bound to
`DefaultEmbeddingModel`	—	`sentence-transformers/all-MiniLM-L6-v2`	Same caveat
`DefaultMaxTokens`	—	`1024`	Floor for chat completions
`TimeoutSeconds`	—	`120`	Per-request HTTP timeout
`RetryAttempts`	—	`3`	Applied by `Microsoft.Extensions.Http.Resilience`
`EnableLogging`	—	`false`	Logs request/response bodies at `Debug`

Bind from configuration with services.AddCompendiumHuggingFace(configuration) (binds section HuggingFace).

Model recommendations

For dedicated Inference Endpoints, the deploy-time model choice matters more than anything in this adapter. Good starting points:

Llama-3.1-8B-Instruct — broad chat, tool-calling support via TGI.
Qwen-2.5-7B-Instruct — strong English + Chinese, fast on a single L4.
Mistral-7B-Instruct-v0.3 — small footprint, lower cost-to-serve.
DeepSeek-V3 / R1 — when you need reasoning depth and can afford the compute.

For embeddings, BAAI/bge-large-en-v1.5 (general English) and sentence-transformers/all-MiniLM-L6-v2 (fast, small) cover most needs.

Tool / function calling — caveat

Tool calling is best-effort. The OpenAI-compatible tools + tool_choice body fields are passed through unchanged, and modern TGI-backed endpoints (Llama-3.1+, Qwen-2.5+, Mistral instruct families) generally honor them. Non-TGI endpoints (vision, audio, classification, older runtimes) silently ignore the fields and return a plain assistant message. There is no advance signal of capability — defensive consumers should check response.GetToolCalls().Count > 0 rather than assume.

Production checklist

Pin to a dedicated Inference Endpoint (not serverless) for any user-facing workload.
Enable autoscale-to-zero if traffic is bursty — cuts idle bill to near-zero in exchange for a one-time 30-second warm-up on the first request after idle.
Pick the EU region (region: eu-west-1 at deploy time) if you're subject to GDPR data-residency requirements.
Set a Pro account budget alert — serverless can burst into the paid tier silently.
Cache the typed HttpClient via DI (AddCompendiumHuggingFace does this automatically).
Test the fallback path — your code should still function (degraded) if the endpoint is autoscaled-to-zero during cold start (HF returns HTTP 503 with {"error":"model is currently loading"}).
Watch for HTTP 429 on serverless and back off; AIErrors.RateLimitExceeded is surfaced as a typed Result.Failure for clean retry logic.
Don't ship the HF token in source. Use environment variables, secret managers, or Compendium's secret store.

Repository conventions

Aspect	Choice
Target	.NET 9, C# 13
Test framework	xUnit 2.9.3 + FluentAssertions 6.12.1 + NSubstitute 5.1.0
Coverage	98.4 % line / 94.3 % branch (102 tests) — gate at 90 %
HTTP mocking	`RichardSzalay.MockHttp` 7.0.0
Result pattern	`Result<T>` from `Compendium.Core`
Test naming	`{SUT}Tests` / `{Method}_{Scenario}_{Expected}` + AAA explicit

Build & test locally

dotnet restore
dotnet build -c Release
dotnet test  -c Release --filter "FullyQualifiedName!~IntegrationTests"

Releasing

Tag with a v prefix on main to publish to nuget.org + GitHub Packages:

git tag v1.0.0-preview.0
git push origin v1.0.0-preview.0

(Tagging is done by the orchestrator after merge; do not push tags from feature branches.)

License

Product	Compatible and additional computed target framework versions.
.NET	net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.

Product

.NET

Compatible target framework(s)

Included target framework(s) (in package)

Learn more about Target Frameworks and .NET Standard.

net9.0
- Compendium.Abstractions.AI (>= 1.0.1)
- Compendium.Core (>= 1.0.1)
- Microsoft.Extensions.Configuration.Abstractions (>= 9.0.16)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.16)
- Microsoft.Extensions.Http (>= 9.0.16)
- Microsoft.Extensions.Http.Resilience (>= 9.0.0)
- Microsoft.Extensions.Logging.Abstractions (>= 9.0.16)
- Microsoft.Extensions.Options (>= 9.0.16)
- Microsoft.Extensions.Options.ConfigurationExtensions (>= 9.0.16)
- Microsoft.Extensions.Options.DataAnnotations (>= 9.0.16)
- System.Text.Json (>= 9.0.16)

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
1.0.0-preview.0	69	5/21/2026