AzureAICommunity.Agent.Middleware.ContextCompressionMiddleware
1.0.0
<div align="center">
🗜️ AzureAICommunity – Context Compression Middleware
A plug-and-play Microsoft.Extensions.AI middleware that automatically summarises older conversation history when the estimated token count approaches your configured limit — preventing context-window overflow while keeping the most recent messages intact.
The Problem
Long-running conversations accumulate history. Every message is sent to the LLM on every turn, so the size of each request grows roughly linearly with the number of turns. Eventually you hit the model's context-window limit and requests start failing.
</div>
The Solution
ContextCompressionMiddleware intercepts the request before it reaches the LLM. When the estimated token count exceeds a configurable trigger threshold, the middleware:
- Splits the history into old and recent segments.
- Calls the LLM to produce a compact summary of the old segment.
- Replaces the old segment with a single `system` summary message.
- Forwards the compressed history to the inner client as normal.
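The four steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's actual implementation: `SketchMessage`, `CompressionSketch`, and the `summarise` delegate are hypothetical stand-ins (the real middleware works on `ChatMessage` instances and calls an LLM to summarise), and the summary prefix text is invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a chat message; the real middleware operates on
// Microsoft.Extensions.AI ChatMessage instances.
public sealed record SketchMessage(string Role, string Text);

public static class CompressionSketch
{
    // Replaces everything except the last `keepLast` messages with a single
    // system summary message. `summarise` stands in for the LLM call.
    public static List<SketchMessage> Compress(
        IReadOnlyList<SketchMessage> history,
        int keepLast,
        Func<IReadOnlyList<SketchMessage>, string> summarise)
    {
        if (history.Count <= keepLast)
            return history.ToList(); // nothing old enough to summarise

        // Step 1: split into old and recent segments.
        int splitIndex = history.Count - keepLast;
        var old = history.Take(splitIndex).ToList();
        var recent = history.Skip(splitIndex);

        // Step 2: summarise the old segment.
        string summary = summarise(old);

        // Step 3: replace the old segment with one system message.
        var compressed = new List<SketchMessage>
        {
            new("system", "Summary of earlier conversation: " + summary)
        };
        compressed.AddRange(recent);

        // Step 4: the caller forwards this list to the inner client.
        return compressed;
    }
}
```

With five messages and `keepLast: 2`, the result holds three messages: the summary followed by the last two originals, unchanged.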
Features
| Feature | Detail |
|---|---|
| Automatic compression | Triggers pre-call when estimated tokens ≥ triggerRatio × maxTokens |
| Configurable threshold | maxTokens + triggerRatio (default 80 %) |
| Recent-message preservation | Last keepLastMessages messages always kept verbatim |
| Tool-pair awareness | Assistant + Tool messages are kept together when splitting |
| Threshold callback | Optional onThresholdReached — return false to block instead of compress |
| Custom token counter | Inject any tokeniser; defaults to charCount / 4 approximation |
| Separate summariser | Optional dedicated IChatClient for summary calls |
| Streaming support | Works with both GetResponseAsync and GetStreamingResponseAsync |
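To make the trigger rule in the table concrete, the sketch below combines the default charCount / 4 approximation with the `triggerRatio × maxTokens` comparison. `ThresholdSketch` and its method names are hypothetical; whether the package counts characters per message or over the whole history is not stated in this README, so treat this as an illustration only.

```csharp
using System;

public static class ThresholdSketch
{
    // Default approximation named in the Features table: characters / 4.
    public static int EstimateTokens(string text) => text.Length / 4;

    // Compression triggers when estimated tokens >= triggerRatio * maxTokens.
    public static bool ShouldCompress(int estimatedTokens, int maxTokens, double triggerRatio)
        => estimatedTokens >= triggerRatio * maxTokens;
}
```

For a 4000-token budget and a ratio of 0.80, the trigger point is 3200 estimated tokens, which corresponds to roughly 12,800 characters under this approximation.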
Installation
dotnet add package AzureAICommunity.Agent.Middleware.ContextCompressionMiddleware
Quick Start
using AzureAICommunity.Agent.Middleware.ContextCompressionMiddleware;
using Microsoft.Extensions.AI;
using OllamaSharp;

IChatClient ollamaClient = new OllamaApiClient(new Uri("http://localhost:11434/"), "llama3.2");

var client = ollamaClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 4000,
        triggerRatio: 0.80,
        keepLastMessages: 6))
    .Build();

// The running conversation; append each user and assistant turn to this list
List<ChatMessage> conversationHistory = [new(ChatRole.User, "Hello!")];

// Use like any other IChatClient; compression is transparent
var response = await client.GetResponseAsync(conversationHistory);
Configuration
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxTokens` | `int` | `8000` | Token budget for the conversation |
| `triggerRatio` | `double` | `0.80` | Compression triggers at this fraction of maxTokens |
| `keepLastMessages` | `int` | `8` | Recent messages kept verbatim after compression |
| `onThresholdReached` | `Func<CompressionInfo, bool>?` | `null` | Optional callback; return false to block |
| `tokenCounter` | `Func<IEnumerable<ChatMessage>, int>?` | `null` | Custom token estimator |
| `summarizerClient` | `IChatClient?` | `null` | Dedicated client for summarisation calls |
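The interaction between `keepLastMessages` and the tool-pair awareness listed under Features can be pictured as choosing a split index. The sketch below is one plausible way to do it, assuming that a tool result at the boundary is pulled into the recent segment together with the assistant message before it; the README does not say which direction the package moves the boundary, and `SplitSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class SplitSketch
{
    // Returns the index of the first message in the "recent" segment.
    // `roles` holds one role string per message ("user", "assistant", "tool", ...).
    public static int FindSplitIndex(IReadOnlyList<string> roles, int keepLast)
    {
        int split = Math.Max(0, roles.Count - keepLast);

        // If the recent segment would start with a tool result, walk back so
        // the assistant message that precedes it stays in the same segment.
        while (split > 0 && split < roles.Count && roles[split] == "tool")
            split--;

        return split;
    }
}
```

For the roles `user, assistant, tool, user, assistant` and `keepLast: 3`, the naive split index 2 points at the tool message, so the sketch moves it back to index 1 and the assistant/tool pair stays together.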
Threshold Callback
var client = ollamaClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 4000,
        onThresholdReached: info =>
        {
            Console.WriteLine($"Threshold hit: {info.TokensUsed} / {info.MaxTokens} tokens");
            return true; // true = compress, false = throw ContextCompressionThresholdException
        }))
    .Build();
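The callback's return value is a policy decision, so it can be factored out and tested on its own. The sketch below shows one possible policy: compress while usage is still within a hard limit, and block once the conversation has already overshot it. `CompressionInfoSketch` is a hypothetical stand-in that mirrors only the two properties used in the snippet above (`TokensUsed`, `MaxTokens`); the real `CompressionInfo` type may differ.

```csharp
using System;

// Hypothetical stand-in exposing the two properties used in the snippet above.
public sealed record CompressionInfoSketch(int TokensUsed, int MaxTokens);

public static class ThresholdPolicySketch
{
    // Returns a callback that allows compression (true) while usage is at or
    // below hardLimitRatio * MaxTokens, and blocks (false) beyond that.
    public static Func<CompressionInfoSketch, bool> CompressUpTo(double hardLimitRatio)
        => info => info.TokensUsed <= hardLimitRatio * info.MaxTokens;
}
```

A policy built with `CompressUpTo(1.0)` returns true for 3500 of 4000 tokens and false for 4500 of 4000 tokens.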
Custom Token Counter
// Plug in a precise tokeniser here; the body below is only a placeholder
// that reproduces the default charCount / 4 approximation
var client = ollamaClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 4000,
        tokenCounter: messages =>
        {
            // supply your own counting logic here
            return messages.Sum(m => m.Text?.Length / 4 ?? 0);
        }))
    .Build();
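If a real tokeniser is not available, a slightly more conservative heuristic than plain charCount / 4 is to round up and add a fixed per-message overhead for role and formatting markers. The sketch below is an assumption-laden illustration: `TokenCounterSketch` is a hypothetical name, the overhead of 4 tokens per message is a guess rather than a measured value, and it takes plain strings so that it has no package dependency.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenCounterSketch
{
    // Assumed fixed cost per message for role and formatting markers.
    private const int PerMessageOverhead = 4;

    // Counts ceil(chars / 4) per message text plus the per-message overhead.
    // Null texts contribute only the overhead.
    public static int Count(IEnumerable<string> texts)
        => texts.Sum(t => PerMessageOverhead + ((t?.Length ?? 0) + 3) / 4);
}
```

It could be wired into the middleware as `tokenCounter: messages => TokenCounterSketch.Count(messages.Select(m => m.Text))`.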
Separate Summariser Client
// Use a cheaper model for summarisation
IChatClient summarizerClient = new OllamaApiClient(new Uri("http://localhost:11434/"), "phi3");

// expensiveClient is your primary IChatClient, created elsewhere
var client = expensiveClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 8000,
        summarizerClient: summarizerClient))
    .Build();
🤝 Contributing
Contributions are welcome! Please open an issue to discuss what you'd like to change before submitting a pull request.
📁 Repository: https://github.com/rvinothrajendran/AgentFramework
- Fork the repository
- Create a feature branch (`git checkout -b feature/my-feature`)
- Commit your changes (`git commit -m 'Add my feature'`)
- Push to the branch (`git push origin feature/my-feature`)
- Open a Pull Request
👤 Author
Built and maintained by Vinoth Rajendran.
- 🐙 GitHub: github.com/rvinothrajendran — follow for more projects!
- 📺 YouTube: youtube.com/@VinothRajendran — subscribe for tutorials and demos!
- 💼 LinkedIn: linkedin.com/in/rvinothrajendran — let's connect!
📄 License
MIT © 2026 Vinoth Rajendran – AzureAICommunity
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net10.0
  - Microsoft.Extensions.AI (>= 10.4.1)
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.0 | 100 | 4/21/2026 |