Compendium.Adapters.HuggingFace 1.0.0-preview.0

This is a prerelease version of Compendium.Adapters.HuggingFace.
dotnet add package Compendium.Adapters.HuggingFace --version 1.0.0-preview.0
                    
NuGet\Install-Package Compendium.Adapters.HuggingFace -Version 1.0.0-preview.0
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Compendium.Adapters.HuggingFace" Version="1.0.0-preview.0" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="Compendium.Adapters.HuggingFace" Version="1.0.0-preview.0" />
                    
Directory.Packages.props
<PackageReference Include="Compendium.Adapters.HuggingFace" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add Compendium.Adapters.HuggingFace --version 1.0.0-preview.0
                    
#r "nuget: Compendium.Adapters.HuggingFace, 1.0.0-preview.0"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package Compendium.Adapters.HuggingFace@1.0.0-preview.0
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=Compendium.Adapters.HuggingFace&version=1.0.0-preview.0&prerelease
                    
Install as a Cake Addin
#tool nuget:?package=Compendium.Adapters.HuggingFace&version=1.0.0-preview.0&prerelease
                    
Install as a Cake Tool

compendium-adapter-huggingface

Hugging Face AI provider adapter for the Compendium event-sourcing framework. Implements IAIProvider from Compendium.Abstractions.AI against the two main Hugging Face inference surfaces — dedicated Inference Endpoints (paid, managed) and the shared Serverless Inference API — through the OpenAI-compatible Messages API and the feature-extraction task.

Built from template-compendium-adapter-dotnet per ADR-0006 (multi-repo adapter split).

Install

dotnet add package Compendium.Adapters.HuggingFace

Quick start — dedicated Inference Endpoint (production)

services.AddCompendiumHuggingFace(opt =>
{
    opt.Mode = HuggingFaceMode.InferenceEndpoint;
    opt.HfToken = Environment.GetEnvironmentVariable("HF_TOKEN")!;
    opt.EndpointUrl = "https://<your-endpoint>.endpoints.huggingface.cloud";
});

var ai = sp.GetRequiredService<IAIProvider>();
var result = await ai.CompleteAsync(new CompletionRequest
{
    Model = string.Empty, // endpoint is bound to one model at deploy time
    Messages = new() { Message.User("What is event sourcing?") }
});

Quick start — Serverless Inference (prototyping)

services.AddCompendiumHuggingFace(opt =>
{
    opt.Mode = HuggingFaceMode.ServerlessInference;
    opt.HfToken = Environment.GetEnvironmentVariable("HF_TOKEN")!;
    opt.DefaultChatModel = "meta-llama/Meta-Llama-3.1-8B-Instruct";
    opt.DefaultEmbeddingModel = "sentence-transformers/all-MiniLM-L6-v2";
});

Mode comparison

Aspect InferenceEndpoint ServerlessInference
URL shape https://<name>.endpoints.huggingface.cloud (one per endpoint) https://api-inference.huggingface.co/models/<id>
Cost model Per-hour compute (CPU / GPU / accelerator) Free tier + Pro (per-token, shared org quota)
Rate limits Your endpoint's compute capacity only Aggressive shared limits; HTTP 429 common
Cold starts None (warm) or autoscale-to-zero opt-in (~30 s warm-up) Frequent — first call to a model often loads it
Latency Predictable Variable
Model selection One model per endpoint, fixed at deploy time Any model in the Hugging Face Hub
Production fit Yes (with autoscaling, regions, EU residency) No — prototyping / spike work only

Options

Option Required Default Notes
Mode yes InferenceEndpoint One of InferenceEndpoint, ServerlessInference
HfToken yes A hf_* user access token from https://huggingface.co/settings/tokens
EndpointUrl endpoint mode Full URL of your dedicated endpoint
ServerlessBaseUrl serverless mode https://api-inference.huggingface.co/models Override if HF moves the API
DefaultChatModel meta-llama/Meta-Llama-3.1-8B-Instruct Used as fallback when request has no Model. In endpoint mode it's sent informationally; the server uses what the endpoint is bound to
DefaultEmbeddingModel sentence-transformers/all-MiniLM-L6-v2 Same caveat
DefaultMaxTokens 1024 Floor for chat completions
TimeoutSeconds 120 Per-request HTTP timeout
RetryAttempts 3 Applied by Microsoft.Extensions.Http.Resilience
EnableLogging false Logs request/response bodies at Debug

Bind from configuration with services.AddCompendiumHuggingFace(configuration) (binds section HuggingFace).

Model recommendations

For dedicated Inference Endpoints, the deploy-time model choice matters more than anything in this adapter. Good starting points:

  • Llama-3.1-8B-Instruct — broad chat, tool-calling support via TGI.
  • Qwen-2.5-7B-Instruct — strong English + Chinese, fast on a single L4.
  • Mistral-7B-Instruct-v0.3 — small footprint, lower cost-to-serve.
  • DeepSeek-V3 / R1 — when you need reasoning depth and can afford the compute.

For embeddings, BAAI/bge-large-en-v1.5 (general English) and sentence-transformers/all-MiniLM-L6-v2 (fast, small) cover most needs.

Tool / function calling — caveat

Tool calling is best-effort. The OpenAI-compatible tools + tool_choice body fields are passed through unchanged, and modern TGI-backed endpoints (Llama-3.1+, Qwen-2.5+, Mistral instruct families) generally honor them. Non-TGI endpoints (vision, audio, classification, older runtimes) silently ignore the fields and return a plain assistant message. There is no advance signal of capability — defensive consumers should check response.GetToolCalls().Count > 0 rather than assume.

Production checklist

  • Pin to a dedicated Inference Endpoint (not serverless) for any user-facing workload.
  • Enable autoscale-to-zero if traffic is bursty — cuts idle bill to near-zero in exchange for a one-time 30-second warm-up on the first request after idle.
  • Pick the EU region (region: eu-west-1 at deploy time) if you're subject to GDPR data-residency requirements.
  • Set a Pro account budget alert — serverless can burst into the paid tier silently.
  • Cache the typed HttpClient via DI (AddCompendiumHuggingFace does this automatically).
  • Test the fallback path — your code should still function (degraded) if the endpoint is autoscaled-to-zero during cold start (HF returns HTTP 503 with {"error":"model is currently loading"}).
  • Watch for HTTP 429 on serverless and back off; AIErrors.RateLimitExceeded is surfaced as a typed Result.Failure for clean retry logic.
  • Don't ship the HF token in source. Use environment variables, secret managers, or Compendium's secret store.

Repository conventions

Aspect Choice
Target .NET 9, C# 13
Test framework xUnit 2.9.3 + FluentAssertions 6.12.1 + NSubstitute 5.1.0
Coverage 98.4 % line / 94.3 % branch (102 tests) — gate at 90 %
HTTP mocking RichardSzalay.MockHttp 7.0.0
Result pattern Result<T> from Compendium.Core
Test naming {SUT}Tests / {Method}_{Scenario}_{Expected} + AAA explicit

Build & test locally

dotnet restore
dotnet build -c Release
dotnet test  -c Release --filter "FullyQualifiedName!~IntegrationTests"

Releasing

Tag with a v prefix on main to publish to nuget.org + GitHub Packages:

git tag v1.0.0-preview.0
git push origin v1.0.0-preview.0

(Tagging is done by the orchestrator after merge; do not push tags from feature branches.)

License

MIT — Copyright © 2026 Sassy Solutions.

Product Compatible and additional computed target framework versions.
.NET net9.0 is compatible.  net9.0-android was computed.  net9.0-browser was computed.  net9.0-ios was computed.  net9.0-maccatalyst was computed.  net9.0-macos was computed.  net9.0-tvos was computed.  net9.0-windows was computed.  net10.0 was computed.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
1.0.0-preview.0 50 5/21/2026