TokenWarden 1.0.2

TokenWarden

Drop-in spend governor for .NET LLM workloads.
Real-time cost attribution, budget enforcement, and OpenTelemetry signals — via a single DelegatingHandler.

License: MIT


What it does

Every HTTP call your application makes to an LLM provider is intercepted by TokenWarden's DelegatingHandler. For each request it:

  1. Detects the provider (OpenAI, Anthropic, Gemini, xAI, DeepSeek)
  2. Extracts model name from the request
  3. Resolves attribution context (tenant, user, feature) from headers or JWT claims
  4. Checks current budget spend — blocks with 429 ProblemDetails if an Enforce budget is exceeded
  5. Forwards the request (or streams the SSE response)
  6. Parses token usage from the response (including streaming final-chunk usage)
  7. Calculates cost using a live price catalog (refreshed from CDN every 6 hours)
  8. Increments budget counters
  9. Emits OpenTelemetry Activity + Counter/Histogram metrics
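Steps 4 and 5 above can be pictured with a minimal sketch. This is not TokenWarden's source: `BudgetGateHandler`, the `currentSpendUsd` delegate, and the problem body are hypothetical names chosen for illustration. It only shows the general DelegatingHandler shape of short-circuiting with a 429 before the request is forwarded.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler for illustration only; not TokenWarden's implementation.
public sealed class BudgetGateHandler(Func<decimal> currentSpendUsd, decimal limitUsd)
    : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Step 4: if the budget is already spent, answer locally with a 429
        // problem document instead of calling the provider.
        if (currentSpendUsd() >= limitUsd)
        {
            var problem = new { type = "about:blank", title = "Budget exceeded", status = 429 };
            return new HttpResponseMessage(HttpStatusCode.TooManyRequests)
            {
                RequestMessage = request,
                Content = JsonContent.Create(
                    problem, new MediaTypeHeaderValue("application/problem+json"))
            };
        }

        // Step 5: forward the request to the next handler in the chain.
        var response = await base.SendAsync(request, cancellationToken);

        // Steps 6 to 9 (usage parsing, costing, counters, telemetry) would run here.
        return response;
    }
}
```

Because the check happens before `base.SendAsync`, a blocked request never reaches the network, which matches the Enforce behavior described in this README.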

Quick start

dotnet add package TokenWarden

// Program.cs
builder.Services.AddTokenWarden(opts =>
{
    opts.Budgets.AddDailyLimit(50.00m); // $50/day, global, Enforce mode
});

builder.Services.AddHttpClient<MyOpenAiClient>()
    .AddTokenWarden();

Multi-tenant SaaS example

builder.Services.AddHttpContextAccessor(); // required for UseHttpContextClaim

builder.Services.AddTokenWarden(opts =>
{
    // Attribution — read tenant_id and user_id from JWT claims
    opts.Attribution.UseHttpContextClaim("tenant_id", "tenant");
    opts.Attribution.UseHttpContextClaim("user_id",   "user");
    opts.Attribution.UseHeader("X-Feature", "feature");

    // Budget rules
    opts.Budgets
        .ForDimension("tenant")
        .Monthly(limit: 1000.00m, mode: BudgetMode.Enforce);

    opts.Budgets
        .ForDimension("user")
        .Daily(limit: 5.00m, mode: BudgetMode.Warn);

    // Live price catalog
    opts.Catalog
        .RefreshFrom("https://cdn.jsdelivr.net/gh/Aquilaone-Labs/TokenWarden@main/catalog/prices.json")
        .Every(TimeSpan.FromHours(6));

    // Redis for distributed budget tracking
    // (connectionString is your Redis connection string, e.g. read from configuration)
    opts.Store.UseRedis(connectionString);
});

Supported providers

| Provider | Host | Notes |
|---|---|---|
| OpenAI | api.openai.com | Streaming: injects stream_options.include_usage=true |
| Anthropic | api.anthropic.com | Streaming: accumulates message_start + message_delta usage |
| Google Gemini | generativelanguage.googleapis.com | usageMetadata in response |
| xAI (Grok) | api.x.ai | OpenAI-compatible |
| DeepSeek | api.deepseek.com | OpenAI-compatible |
| Azure OpenAI | OpenAI-compatible | Use opts.Attribution.UseHeader(...) to tag tenant |
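To make the OpenAI streaming note concrete: when stream_options.include_usage is true, OpenAI sends a final chunk whose `usage` object carries the token counts, while earlier chunks carry `usage: null`. The sketch below shows one way such a stream could be scanned. `SseUsage` and `FromOpenAiStream` are hypothetical names, and this is not necessarily how TokenWarden parses streams.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for illustration only.
public static class SseUsage
{
    // Returns the token counts from the last chunk that has a non-null "usage"
    // object, or null if no chunk reported usage.
    public static (long Prompt, long Completion)? FromOpenAiStream(IEnumerable<string> sseLines)
    {
        (long Prompt, long Completion)? usage = null;
        foreach (var line in sseLines)
        {
            // Only "data:" lines carry JSON payloads in an SSE stream.
            if (!line.StartsWith("data:", StringComparison.Ordinal)) continue;

            var payload = line.Substring(5).Trim();
            if (payload.Length == 0) continue;
            if (payload == "[DONE]") break; // end-of-stream sentinel

            using var doc = JsonDocument.Parse(payload);
            if (doc.RootElement.TryGetProperty("usage", out var u)
                && u.ValueKind == JsonValueKind.Object)
            {
                usage = (u.GetProperty("prompt_tokens").GetInt64(),
                         u.GetProperty("completion_tokens").GetInt64());
            }
        }
        return usage;
    }
}
```

Anthropic streams would need different handling, since the table above notes that usage is split across message_start and message_delta events.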

Provider examples

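All examples follow the same pattern: register a typed HttpClient and call .AddTokenWarden(). TokenWarden then intercepts each call automatically and identifies the provider from the request host. The typed clients below assume that the System.Net.Http.Json and System.Text.Json namespaces are in scope.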

OpenAI

// Registration
builder.Services.AddHttpClient<OpenAiClient>(c =>
    c.DefaultRequestHeaders.Authorization =
        new("Bearer", Environment.GetEnvironmentVariable("OPENAI_API_KEY")!))
    .AddTokenWarden();

// Typed client
public sealed class OpenAiClient(HttpClient http)
{
    public async Task<string> ChatAsync(string prompt)
    {
        var body = new
        {
            model    = "gpt-4o",
            messages = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.openai.com/v1/chat/completions", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("choices")[0].GetProperty("message").GetProperty("content")
            .GetString() ?? "";
    }
}

Anthropic

// Registration
builder.Services.AddHttpClient<AnthropicClient>(c =>
{
    c.DefaultRequestHeaders.Add("x-api-key", Environment.GetEnvironmentVariable("ANTHROPIC_API_KEY")!);
    c.DefaultRequestHeaders.Add("anthropic-version", "2023-06-01");
})
.AddTokenWarden();

// Typed client
public sealed class AnthropicClient(HttpClient http)
{
    public async Task<string> MessageAsync(string prompt)
    {
        var body = new
        {
            model      = "claude-sonnet-4-6",
            max_tokens = 1024,
            messages   = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.anthropic.com/v1/messages", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("content")[0].GetProperty("text")
            .GetString() ?? "";
    }
}

Google Gemini

// Registration
builder.Services.AddHttpClient<GeminiClient>()
    .AddTokenWarden();

// Typed client — API key goes in the URL
public sealed class GeminiClient(HttpClient http)
{
    private static readonly string ApiKey = Environment.GetEnvironmentVariable("GEMINI_API_KEY")!;

    public async Task<string> GenerateAsync(string prompt)
    {
        var url  = $"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key={ApiKey}";
        var body = new
        {
            contents = new[] { new { parts = new[] { new { text = prompt } } } }
        };
        var res = await http.PostAsJsonAsync(url, body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("candidates")[0].GetProperty("content")
            .GetProperty("parts")[0].GetProperty("text")
            .GetString() ?? "";
    }
}

xAI (Grok)

// Registration
builder.Services.AddHttpClient<GrokClient>(c =>
    c.DefaultRequestHeaders.Authorization =
        new("Bearer", Environment.GetEnvironmentVariable("XAI_API_KEY")!))
    .AddTokenWarden();

// Typed client — OpenAI-compatible API
public sealed class GrokClient(HttpClient http)
{
    public async Task<string> ChatAsync(string prompt)
    {
        var body = new
        {
            model    = "grok-3",
            messages = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.x.ai/v1/chat/completions", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("choices")[0].GetProperty("message").GetProperty("content")
            .GetString() ?? "";
    }
}

DeepSeek

// Registration
builder.Services.AddHttpClient<DeepSeekClient>(c =>
    c.DefaultRequestHeaders.Authorization =
        new("Bearer", Environment.GetEnvironmentVariable("DEEPSEEK_API_KEY")!))
    .AddTokenWarden();

// Typed client — OpenAI-compatible API
public sealed class DeepSeekClient(HttpClient http)
{
    public async Task<string> ChatAsync(string prompt)
    {
        var body = new
        {
            model    = "deepseek-chat",
            messages = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.deepseek.com/v1/chat/completions", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("choices")[0].GetProperty("message").GetProperty("content")
            .GetString() ?? "";
    }
}

Budget modes

| Mode | Behavior |
|---|---|
| BudgetMode.Enforce | Returns 429 application/problem+json before the request leaves the process |
| BudgetMode.Warn | Logs a warning and emits telemetry; request is allowed |
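Callers may want to tell an Enforce block apart from other failures rather than letting EnsureSuccessStatusCode throw. The sketch below is one possible approach under stated assumptions: it reads only the standard problem-details members `title` and `detail`, because the exact fields TokenWarden puts in the body are not documented here. `BudgetResponses` and `DescribeBlockAsync` are hypothetical names.

```csharp
using System.Net;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class BudgetResponses
{
    // Returns a human-readable reason when the response looks like a
    // problem-details 429, otherwise null.
    public static async Task<string?> DescribeBlockAsync(HttpResponseMessage res)
    {
        if (res.StatusCode != HttpStatusCode.TooManyRequests) return null;
        if (res.Content.Headers.ContentType?.MediaType != "application/problem+json") return null;

        using var doc = JsonDocument.Parse(await res.Content.ReadAsStringAsync());
        var root = doc.RootElement;

        // "title" and "detail" are standard problem-details members; both are optional.
        var title = root.TryGetProperty("title", out var t) ? t.GetString() : null;
        var detail = root.TryGetProperty("detail", out var d) ? d.GetString() : null;
        return detail ?? title ?? "Request blocked (429)";
    }
}
```

A typed client could call this before `EnsureSuccessStatusCode()` and surface the returned message to the user. Checking the media type is a heuristic: a provider's own rate-limit 429 may use a different content type, but that is not guaranteed.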

Attribution

opts.Attribution.UseHeader("X-Tenant-Id", "tenant");   // from HTTP header
opts.Attribution.UseHttpContextClaim("sub",  "user");   // from JWT claim (ASP.NET Core)
opts.Attribution.UseHeader("X-Feature", "feature");    // custom dimension
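With header-based attribution, the calling code has to put the header on each outgoing request. A minimal sketch of doing that per request follows. `AttributedRequests` is a hypothetical helper, and the "X-Feature" header name simply mirrors the configuration above. Whether TokenWarden strips the header before forwarding to the provider is not stated in this README.

```csharp
using System.Net.Http;
using System.Net.Http.Json;

// Hypothetical helper for illustration only.
public static class AttributedRequests
{
    // Builds a JSON POST tagged with the feature that triggered the call.
    public static HttpRequestMessage Create(string url, object body, string feature)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, url)
        {
            Content = JsonContent.Create(body)
        };
        // Matches opts.Attribution.UseHeader("X-Feature", "feature").
        request.Headers.Add("X-Feature", feature);
        return request;
    }
}
```

A typed client would then send it with `http.SendAsync(request)` instead of `PostAsJsonAsync`, so that different call sites can report different feature names.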

Budget store

In-memory (default): a ConcurrentDictionary with a per-entry SemaphoreSlim. Budget windows use a hopping-window approximation, which trades a small amount of accuracy for no coordination overhead.
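To illustrate what a hopping-window approximation can look like, here is a minimal single-key sketch. It is an assumption about the general technique, not TokenWarden's store: `HoppingWindowCounter` is a hypothetical type, and it uses a plain lock rather than the ConcurrentDictionary and SemaphoreSlim combination described above. Spend is grouped into fixed-size buckets, and the total is the sum of the buckets that still overlap the window, so it can overcount by up to one bucket width.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical counter for illustration only.
public sealed class HoppingWindowCounter
{
    private readonly long _bucketTicks;
    private readonly int _buckets;
    private readonly Dictionary<long, decimal> _spend = new();
    private readonly object _gate = new();

    public HoppingWindowCounter(TimeSpan window, int buckets)
    {
        if (buckets <= 0) throw new ArgumentOutOfRangeException(nameof(buckets));
        _buckets = buckets;
        _bucketTicks = window.Ticks / buckets;
    }

    // Adds a cost to the bucket that contains "now".
    public void Add(decimal costUsd, DateTimeOffset now)
    {
        lock (_gate)
        {
            var key = now.UtcTicks / _bucketTicks;
            _spend[key] = _spend.GetValueOrDefault(key) + costUsd;
        }
    }

    // Sums the most recent buckets and drops any that fell out of the window.
    public decimal Total(DateTimeOffset now)
    {
        lock (_gate)
        {
            var oldest = now.UtcTicks / _bucketTicks - _buckets + 1;
            decimal total = 0;
            foreach (var key in new List<long>(_spend.Keys))
            {
                if (key < oldest) _spend.Remove(key);
                else total += _spend[key];
            }
            return total;
        }
    }
}
```

More buckets give a tighter approximation at the cost of more entries per key; a one-hour window with six buckets, for example, expires spend in ten-minute steps.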

Redis: activated by opts.Store.UseRedis(connectionString). A Lua script performs the INCRBYFLOAT and the window expiry atomically, which makes this store suitable for multi-instance deployments.

OpenTelemetry

TokenWarden emits to the TokenWarden ActivitySource and TokenWarden Meter:

| Signal | Type | Tags |
|---|---|---|
| llm.request | Activity | gen_ai.system, gen_ai.request.model, tokenwarden.cost_usd, tokenwarden.tenant_id, tokenwarden.user_id, tokenwarden.feature |
| tokenwarden.tokens_consumed | Counter<long> | gen_ai.system, gen_ai.request.model |
| tokenwarden.cost_usd | Counter<double> | gen_ai.system, gen_ai.request.model |
| tokenwarden.request_latency_ms | Histogram<double> | gen_ai.system, gen_ai.request.model |
| tokenwarden.budget_breaches | Counter<long> | tokenwarden.budget_key, tokenwarden.budget_mode |

Subscribe:

builder.Services.AddOpenTelemetry()
    .WithTracing(t => t.AddSource("TokenWarden"))
    .WithMetrics(m => m.AddMeter("TokenWarden"));
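For tests or local debugging without an OpenTelemetry exporter, the same meter can be observed directly with the built-in MeterListener. The sketch below assumes only the meter name, instrument name, and tag key listed in the table above. `CostWatcher` is a hypothetical helper.

```csharp
using System;
using System.Diagnostics.Metrics;

// Hypothetical helper for illustration only.
public static class CostWatcher
{
    // Invokes onCost(model, costUsd) for every tokenwarden.cost_usd measurement.
    // Dispose the returned listener to stop observing.
    public static MeterListener Start(Action<string, double> onCost)
    {
        var listener = new MeterListener();

        // Opt in to the single instrument of interest on the TokenWarden meter.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == "TokenWarden" && instrument.Name == "tokenwarden.cost_usd")
                l.EnableMeasurementEvents(instrument);
        };

        // The counter is a Counter<double>, so register the double callback.
        listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
        {
            var model = "unknown";
            foreach (var tag in tags)
            {
                if (tag.Key == "gen_ai.request.model")
                    model = tag.Value?.ToString() ?? "unknown";
            }
            onCost(model, value);
        });

        listener.Start();
        return listener;
    }
}
```

For example, `using var watcher = CostWatcher.Start((model, usd) => Console.WriteLine($"{model}: {usd}"));` prints one line per recorded cost measurement.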

Price catalog

Prices are embedded in the package and refreshed from the jsDelivr CDN by CatalogRefreshService. On refresh failure, the last-known-good catalog is retained.

The catalog lives at catalog/prices.json in the repository.
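As a worked illustration of how token counts and catalog prices combine into a cost, the sketch below assumes prices are quoted in USD per million tokens, with separate input and output rates. That unit, the `CostMath` helper, and the prices in the example are assumptions for illustration; the actual schema of prices.json is not described here.

```csharp
// Hypothetical helper for illustration only.
public static class CostMath
{
    // cost = (input tokens * input rate + output tokens * output rate) / 1,000,000
    public static decimal CostUsd(
        long inputTokens, long outputTokens,
        decimal inputPerMillionUsd, decimal outputPerMillionUsd)
        => (inputTokens * inputPerMillionUsd + outputTokens * outputPerMillionUsd) / 1_000_000m;
}
```

With placeholder rates of 2.50 USD (input) and 10.00 USD (output) per million tokens, a call that used 1,200 input tokens and 300 output tokens costs (1,200 × 2.50 + 300 × 10.00) / 1,000,000 = 0.006 USD. Using decimal avoids the rounding drift that double would accumulate across many small increments.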

AOT compatibility

TokenWarden targets net8.0, net9.0, and net10.0 and is fully AOT-compatible (IsAotCompatible=true). All JSON deserialization uses System.Text.Json source generators, and Reflection.Emit is not used.

License

MIT — see LICENSE.

