TokenWarden 1.0.2
TokenWarden
Drop-in spend governor for .NET LLM workloads.
Real-time cost attribution, budget enforcement, and OpenTelemetry signals — via a single DelegatingHandler.
What it does
Every HTTP call your application makes to an LLM provider is intercepted by TokenWarden's DelegatingHandler. For each request it:
- Detects the provider (OpenAI, Anthropic, Gemini, xAI, DeepSeek)
- Extracts model name from the request
- Resolves attribution context (tenant, user, feature) from headers or JWT claims
- Checks current budget spend and blocks with 429 ProblemDetails if an Enforce budget is exceeded
- Forwards the request (or streams the SSE response)
- Parses token usage from the response (including streaming final-chunk usage)
- Calculates cost using a live price catalog (refreshed from CDN every 6 hours)
- Increments budget counters
- Emits OpenTelemetry Activity + Counter/Histogram metrics
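The steps above can be sketched as a small DelegatingHandler. This is a simplified illustration, not TokenWarden's actual handler: `SpendGuardHandler`, `CannedHandler`, the single global limit, and the fixed per-million-token prices are all hypothetical names and assumptions, and only OpenAI-style non-streaming `usage` is parsed.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for illustration; not TokenWarden's actual handler.
public sealed class SpendGuardHandler : DelegatingHandler
{
    private readonly object _gate = new();
    private readonly decimal _limitUsd, _inputUsdPerMTok, _outputUsdPerMTok;
    private decimal _spentUsd;

    public SpendGuardHandler(decimal limitUsd, decimal inputUsdPerMTok, decimal outputUsdPerMTok)
    {
        _limitUsd = limitUsd;
        _inputUsdPerMTok = inputUsdPerMTok;   // assumed price per million input tokens
        _outputUsdPerMTok = outputUsdPerMTok; // assumed price per million output tokens
    }

    public decimal SpentUsd { get { lock (_gate) return _spentUsd; } }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // 1. Budget check: short-circuit before the request leaves the process.
        //    (Check and update are not atomic here; concurrent calls may overshoot.)
        if (SpentUsd >= _limitUsd)
        {
            return new HttpResponseMessage(HttpStatusCode.TooManyRequests)
            {
                RequestMessage = request,
                Content = new StringContent(
                    "{\"title\":\"Budget exceeded\",\"status\":429}",
                    Encoding.UTF8, "application/problem+json"),
            };
        }

        // 2. Forward to the next handler in the pipeline.
        var response = await base.SendAsync(request, cancellationToken).ConfigureAwait(false);
        if (!response.IsSuccessStatusCode) return response;

        // 3. Buffer the body so the caller can still read it afterwards.
        await response.Content.LoadIntoBufferAsync().ConfigureAwait(false);
        var bytes = await response.Content.ReadAsByteArrayAsync(cancellationToken).ConfigureAwait(false);

        // 4. Parse OpenAI-style usage and add its cost to the running total.
        try
        {
            using var doc = JsonDocument.Parse(bytes);
            var root = doc.RootElement;
            if (root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("usage", out var usage)
                && usage.ValueKind == JsonValueKind.Object)
            {
                decimal cost = Tokens(usage, "prompt_tokens") / 1_000_000m * _inputUsdPerMTok
                             + Tokens(usage, "completion_tokens") / 1_000_000m * _outputUsdPerMTok;
                lock (_gate) _spentUsd += cost;
            }
        }
        catch (JsonException) { /* non-JSON body: nothing to attribute */ }

        return response;
    }

    private static long Tokens(JsonElement usage, string name) =>
        usage.TryGetProperty(name, out var v) && v.ValueKind == JsonValueKind.Number
            && v.TryGetInt64(out var n) ? n : 0;
}

// Canned upstream used only to exercise the sketch without a network call.
public sealed class CannedHandler(string json) : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken) =>
        Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json"),
        });
}
```

To try it, set `InnerHandler` to a `CannedHandler` and wrap the guard in an `HttpClient`. Once the recorded spend reaches the limit, the next call returns 429 without reaching the inner handler.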
Quick start
dotnet add package TokenWarden
// Program.cs
builder.Services.AddTokenWarden(opts =>
{
    opts.Budgets.AddDailyLimit(50.00m); // $50/day, global, Enforce mode
});

builder.Services.AddHttpClient<MyOpenAiClient>()
    .AddTokenWarden();
Multi-tenant SaaS example
builder.Services.AddHttpContextAccessor(); // required for UseHttpContextClaim

builder.Services.AddTokenWarden(opts =>
{
    // Attribution: read tenant_id and user_id from JWT claims
    opts.Attribution.UseHttpContextClaim("tenant_id", "tenant");
    opts.Attribution.UseHttpContextClaim("user_id", "user");
    opts.Attribution.UseHeader("X-Feature", "feature");

    // Budget rules
    opts.Budgets
        .ForDimension("tenant")
        .Monthly(limit: 1000.00m, mode: BudgetMode.Enforce);
    opts.Budgets
        .ForDimension("user")
        .Daily(limit: 5.00m, mode: BudgetMode.Warn);

    // Live price catalog
    opts.Catalog
        .RefreshFrom("https://cdn.jsdelivr.net/gh/Aquilaone-Labs/TokenWarden@main/catalog/prices.json")
        .Every(TimeSpan.FromHours(6));

    // Redis for distributed budget tracking
    opts.Store.UseRedis(connectionString);
});
Supported providers
| Provider | Host | Notes |
|---|---|---|
| OpenAI | api.openai.com | Streaming: injects stream_options.include_usage=true |
| Anthropic | api.anthropic.com | Streaming: accumulates message_start + message_delta usage |
| Google Gemini | generativelanguage.googleapis.com | usageMetadata in response |
| xAI (Grok) | api.x.ai | OpenAI-compatible |
| DeepSeek | api.deepseek.com | OpenAI-compatible |
| Azure OpenAI | OpenAI-compatible | Use opts.Attribution.UseHeader(...) to tag tenant |
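As a rough illustration of the per-provider differences in the table, the sketch below reads token counts from a non-streaming response body, keyed by host. `UsageParser` and `TokenUsage` are hypothetical names, not part of TokenWarden. The field names follow the providers' public response formats (`usage.prompt_tokens`/`completion_tokens` for OpenAI-compatible APIs, `usage.input_tokens`/`output_tokens` for Anthropic, `usageMetadata.promptTokenCount`/`candidatesTokenCount` for Gemini). Streaming accumulation is not covered.

```csharp
using System;
using System.Text.Json;

// Hypothetical types for illustration; TokenWarden's internal parser may differ.
public readonly record struct TokenUsage(long InputTokens, long OutputTokens);

public static class UsageParser
{
    // Returns null when the response carries no usage block.
    public static TokenUsage? Parse(string host, JsonElement root) => host switch
    {
        "api.anthropic.com" =>
            Read(root, "usage", "input_tokens", "output_tokens"),
        "generativelanguage.googleapis.com" =>
            Read(root, "usageMetadata", "promptTokenCount", "candidatesTokenCount"),
        // OpenAI and OpenAI-compatible hosts (api.x.ai, api.deepseek.com, ...)
        _ => Read(root, "usage", "prompt_tokens", "completion_tokens"),
    };

    private static TokenUsage? Read(JsonElement root, string block, string inKey, string outKey)
    {
        if (root.ValueKind != JsonValueKind.Object
            || !root.TryGetProperty(block, out var usage)
            || usage.ValueKind != JsonValueKind.Object)
        {
            return null;
        }
        return new TokenUsage(Count(usage, inKey), Count(usage, outKey));
    }

    // Missing or non-numeric counters are treated as zero.
    private static long Count(JsonElement usage, string key) =>
        usage.TryGetProperty(key, out var v) && v.ValueKind == JsonValueKind.Number
            && v.TryGetInt64(out var n) ? n : 0;
}
```

Falling back to the OpenAI shape for unknown hosts mirrors the "OpenAI-compatible" rows above; a stricter variant could return null for hosts it does not recognize.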
Provider examples
All examples follow the same pattern: register a typed HttpClient, call .AddTokenWarden(), and TokenWarden intercepts the call automatically based on the host URL.
OpenAI
// Registration
builder.Services.AddHttpClient<OpenAiClient>(c =>
        c.DefaultRequestHeaders.Authorization =
            new("Bearer", Environment.GetEnvironmentVariable("OPENAI_API_KEY")!))
    .AddTokenWarden();

// Typed client (requires using System.Net.Http.Json and System.Text.Json)
public sealed class OpenAiClient(HttpClient http)
{
    public async Task<string> ChatAsync(string prompt)
    {
        var body = new
        {
            model = "gpt-4o",
            messages = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.openai.com/v1/chat/completions", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("choices")[0].GetProperty("message").GetProperty("content")
            .GetString() ?? "";
    }
}
Anthropic
// Registration
builder.Services.AddHttpClient<AnthropicClient>(c =>
    {
        c.DefaultRequestHeaders.Add("x-api-key", Environment.GetEnvironmentVariable("ANTHROPIC_API_KEY")!);
        c.DefaultRequestHeaders.Add("anthropic-version", "2023-06-01");
    })
    .AddTokenWarden();

// Typed client (requires using System.Net.Http.Json and System.Text.Json)
public sealed class AnthropicClient(HttpClient http)
{
    public async Task<string> MessageAsync(string prompt)
    {
        var body = new
        {
            model = "claude-sonnet-4-6",
            max_tokens = 1024,
            messages = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.anthropic.com/v1/messages", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("content")[0].GetProperty("text")
            .GetString() ?? "";
    }
}
Google Gemini
// Registration
builder.Services.AddHttpClient<GeminiClient>()
    .AddTokenWarden();

// Typed client: API key goes in the URL
// (requires using System.Net.Http.Json and System.Text.Json)
public sealed class GeminiClient(HttpClient http)
{
    private static readonly string ApiKey = Environment.GetEnvironmentVariable("GEMINI_API_KEY")!;

    public async Task<string> GenerateAsync(string prompt)
    {
        var url = $"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key={ApiKey}";
        var body = new
        {
            contents = new[] { new { parts = new[] { new { text = prompt } } } }
        };
        var res = await http.PostAsJsonAsync(url, body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("candidates")[0].GetProperty("content")
            .GetProperty("parts")[0].GetProperty("text")
            .GetString() ?? "";
    }
}
xAI (Grok)
// Registration
builder.Services.AddHttpClient<GrokClient>(c =>
        c.DefaultRequestHeaders.Authorization =
            new("Bearer", Environment.GetEnvironmentVariable("XAI_API_KEY")!))
    .AddTokenWarden();

// Typed client: OpenAI-compatible API
// (requires using System.Net.Http.Json and System.Text.Json)
public sealed class GrokClient(HttpClient http)
{
    public async Task<string> ChatAsync(string prompt)
    {
        var body = new
        {
            model = "grok-3",
            messages = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.x.ai/v1/chat/completions", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("choices")[0].GetProperty("message").GetProperty("content")
            .GetString() ?? "";
    }
}
DeepSeek
// Registration
builder.Services.AddHttpClient<DeepSeekClient>(c =>
        c.DefaultRequestHeaders.Authorization =
            new("Bearer", Environment.GetEnvironmentVariable("DEEPSEEK_API_KEY")!))
    .AddTokenWarden();

// Typed client: OpenAI-compatible API
// (requires using System.Net.Http.Json and System.Text.Json)
public sealed class DeepSeekClient(HttpClient http)
{
    public async Task<string> ChatAsync(string prompt)
    {
        var body = new
        {
            model = "deepseek-chat",
            messages = new[] { new { role = "user", content = prompt } }
        };
        var res = await http.PostAsJsonAsync("https://api.deepseek.com/v1/chat/completions", body);
        res.EnsureSuccessStatusCode();
        using var doc = await res.Content.ReadFromJsonAsync<JsonDocument>();
        return doc!.RootElement
            .GetProperty("choices")[0].GetProperty("message").GetProperty("content")
            .GetString() ?? "";
    }
}
Budget modes
| Mode | Behavior |
|---|---|
| BudgetMode.Enforce | Returns 429 application/problem+json before the request leaves the process |
| BudgetMode.Warn | Logs a warning and emits telemetry; request is allowed |
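Because the provider examples call `EnsureSuccessStatusCode()`, an Enforce-mode block would surface as an `HttpRequestException`. A caller that wants to treat budget blocks differently can inspect the response first. The helper below is a minimal sketch: `BudgetResponses` is a hypothetical name, and it assumes that the `application/problem+json` media type is enough to tell a local budget block apart from a provider's own 429 rate-limit response, which the source does not confirm.

```csharp
using System.Net;
using System.Net.Http;

// Hypothetical helper; not part of the TokenWarden API.
public static class BudgetResponses
{
    // True when the response looks like a local budget block:
    // status 429 with an application/problem+json body.
    public static bool IsBudgetBlock(HttpResponseMessage response) =>
        response.StatusCode == HttpStatusCode.TooManyRequests
        && response.Content?.Headers.ContentType?.MediaType == "application/problem+json";
}
```

A typed client could call `BudgetResponses.IsBudgetBlock(res)` before `res.EnsureSuccessStatusCode()` and return a friendly "quota reached" result instead of throwing.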
Attribution
opts.Attribution.UseHeader("X-Tenant-Id", "tenant"); // from HTTP header
opts.Attribution.UseHttpContextClaim("sub", "user"); // from JWT claim (ASP.NET Core)
opts.Attribution.UseHeader("X-Feature", "feature"); // custom dimension
Budget store
In-memory (default): ConcurrentDictionary with per-entry SemaphoreSlim. Hopping-window approximation (small accuracy trade-off, no coordination overhead).
Redis: Activated by opts.Store.UseRedis(connectionString). Lua script ensures atomic INCRBYFLOAT + window expiry. Suitable for multi-instance deployments.
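To make the "hopping-window approximation" idea concrete, here is one possible in-memory shape: the window is split into fixed-size buckets, spend is added to the current bucket, and the total is the sum of the buckets still inside the window. This is a sketch under assumptions, not TokenWarden's store: `HoppingWindowCounter` is a hypothetical name, it uses a plain `lock` rather than a per-entry SemaphoreSlim, and the caller passes the clock in explicitly.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical illustration of a bucketed (hopping) window; not the library's implementation.
public sealed class HoppingWindowCounter
{
    private readonly long _bucketTicks;
    private readonly int _bucketsPerWindow;
    private readonly ConcurrentDictionary<string, SortedDictionary<long, decimal>> _spend = new();

    public HoppingWindowCounter(TimeSpan window, int bucketsPerWindow)
    {
        if (bucketsPerWindow <= 0) throw new ArgumentOutOfRangeException(nameof(bucketsPerWindow));
        _bucketTicks = window.Ticks / bucketsPerWindow;
        if (_bucketTicks <= 0) throw new ArgumentOutOfRangeException(nameof(window));
        _bucketsPerWindow = bucketsPerWindow;
    }

    // Adds spend for a budget key and returns the total inside the current window.
    public decimal Add(string key, decimal amountUsd, DateTimeOffset now)
    {
        long current = now.UtcTicks / _bucketTicks;
        long oldest = current - _bucketsPerWindow + 1;
        var buckets = _spend.GetOrAdd(key, _ => new SortedDictionary<long, decimal>());

        lock (buckets)
        {
            buckets.TryGetValue(current, out var existing);
            buckets[current] = existing + amountUsd;

            // Drop buckets that have fallen out of the window (keys are sorted ascending).
            var expired = new List<long>();
            foreach (var index in buckets.Keys)
            {
                if (index >= oldest) break;
                expired.Add(index);
            }
            foreach (var index in expired) buckets.Remove(index);

            decimal total = 0;
            foreach (var value in buckets.Values) total += value;
            return total;
        }
    }
}
```

The accuracy trade-off shows up at bucket boundaries: spend expires a whole bucket at a time, so the reported total can lag a true sliding window by up to one bucket length.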
OpenTelemetry
TokenWarden emits to the TokenWarden ActivitySource and TokenWarden Meter:
| Signal | Type | Tags |
|---|---|---|
| llm.request | Activity | gen_ai.system, gen_ai.request.model, tokenwarden.cost_usd, tokenwarden.tenant_id, tokenwarden.user_id, tokenwarden.feature |
| tokenwarden.tokens_consumed | Counter<long> | gen_ai.system, gen_ai.request.model |
| tokenwarden.cost_usd | Counter<double> | gen_ai.system, gen_ai.request.model |
| tokenwarden.request_latency_ms | Histogram<double> | gen_ai.system, gen_ai.request.model |
| tokenwarden.budget_breaches | Counter<long> | tokenwarden.budget_key, tokenwarden.budget_mode |
Subscribe:
builder.Services.AddOpenTelemetry()
    .WithTracing(t => t.AddSource("TokenWarden"))
    .WithMetrics(m => m.AddMeter("TokenWarden"));
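For local debugging without an OpenTelemetry exporter, the same meter can in principle be observed with the BCL `MeterListener`. The sketch below sums the `tokenwarden.cost_usd` counter from the table above; `CostTap` is a hypothetical name, and the meter and instrument names are taken from this README rather than verified against the package.

```csharp
using System;
using System.Diagnostics.Metrics;

// Hypothetical helper that totals a double counter in-process.
public sealed class CostTap : IDisposable
{
    private readonly MeterListener _listener = new();
    private readonly object _gate = new();
    private double _totalUsd;

    public CostTap(string meterName = "TokenWarden", string instrumentName = "tokenwarden.cost_usd")
    {
        // Subscribe only to the one instrument we care about.
        _listener.InstrumentPublished = (instrument, listener) =>
        {
            if (instrument.Meter.Name == meterName && instrument.Name == instrumentName)
                listener.EnableMeasurementEvents(instrument);
        };

        // Called synchronously on every Counter<double>.Add for enabled instruments.
        _listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
        {
            lock (_gate) _totalUsd += value;
        });

        _listener.Start();
    }

    public double TotalUsd { get { lock (_gate) return _totalUsd; } }

    public void Dispose() => _listener.Dispose();
}
```

Creating a `CostTap` at startup and reading `TotalUsd` in a health endpoint or test gives a quick view of attributed spend; per-model breakdowns would need the `tags` span to be inspected inside the callback.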
Price catalog
Prices are embedded in the package and refreshed from the jsDelivr CDN by CatalogRefreshService. On refresh failure, the last-known-good catalog is retained.
The catalog lives at catalog/prices.json.
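The "last-known-good" behaviour can be sketched as a holder that swaps in a new catalog only when the downloaded JSON parses cleanly. Everything here is an assumption for illustration: `PriceCatalog` and `ModelPrice` are hypothetical names, the JSON shape (model name mapped to `inputPerMTok`/`outputPerMTok`) is invented and may not match the real prices.json, and the sketch uses reflection-based System.Text.Json for brevity where the package itself states it uses source generators.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical schema: USD per million input/output tokens.
public sealed record ModelPrice(decimal InputPerMTok, decimal OutputPerMTok);

public sealed class PriceCatalog
{
    private static readonly JsonSerializerOptions Options = new() { PropertyNameCaseInsensitive = true };
    private volatile IReadOnlyDictionary<string, ModelPrice> _current;

    // Start from the prices embedded in the application.
    public PriceCatalog(IReadOnlyDictionary<string, ModelPrice> embedded) => _current = embedded;

    // Replaces the catalog only if the payload is valid and non-empty;
    // on any failure the previous catalog stays in place.
    public bool TryRefresh(string json)
    {
        try
        {
            var parsed = JsonSerializer.Deserialize<Dictionary<string, ModelPrice>>(json, Options);
            if (parsed is null || parsed.Count == 0) return false;
            _current = parsed;
            return true;
        }
        catch (JsonException)
        {
            return false;
        }
    }

    // Returns null for models missing from the catalog.
    public decimal? Cost(string model, long inputTokens, long outputTokens) =>
        _current.TryGetValue(model, out var price) && price is not null
            ? inputTokens / 1_000_000m * price.InputPerMTok
              + outputTokens / 1_000_000m * price.OutputPerMTok
            : null;
}
```

With an assumed price of $2.50 input and $10.00 output per million tokens, 1,000 input tokens and 500 output tokens cost 0.0025 + 0.0050 = $0.0075. A failed refresh leaves that figure unchanged.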
AOT compatibility
TokenWarden targets net8.0, net9.0, and net10.0 and is fully AOT-compatible (IsAotCompatible=true). All JSON deserialization uses System.Text.Json source generators, and the library does not use Reflection.Emit.
License
MIT — see LICENSE.
Dependencies
- net8.0
  - OpenTelemetry.Api (>= 1.15.3)
  - StackExchange.Redis (>= 2.8.16)
- net9.0
  - OpenTelemetry.Api (>= 1.15.3)
  - StackExchange.Redis (>= 2.8.16)
- net10.0
  - OpenTelemetry.Api (>= 1.15.3)
  - StackExchange.Redis (>= 2.8.16)
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.2 | 118 | 4/26/2026 |
| 1.0.1 | 100 | 4/26/2026 |
| 0.0.0-alpha.0.39 | 54 | 4/26/2026 |
| 0.0.0-alpha.0.38 | 63 | 4/26/2026 |
| 0.0.0-alpha.0.37 | 57 | 4/26/2026 |