AzureAICommunity.Agent.Middleware.ContextCompressionMiddleware 1.0.0

<div align="center">

🗜️ AzureAICommunity – Context Compression Middleware

A plug-and-play Microsoft.Extensions.AI middleware that automatically summarises older conversation history when the estimated token count approaches your configured limit — preventing context-window overflow while keeping the most recent messages intact.

Long-running conversations accumulate history. Because the full history is sent to the LLM on every turn, the token count of each request grows with every message. Eventually a request exceeds the model's context-window limit and calls start failing.

</div>

The Solution

ContextCompressionMiddleware intercepts the request before it reaches the LLM. When the estimated token count reaches a configurable trigger threshold (triggerRatio × maxTokens), the middleware:

  1. Splits the history into old and recent segments.
  2. Calls the LLM to produce a compact summary of the old segment.
  3. Replaces the old segment with a single system summary message.
  4. Forwards the compressed history to the inner client as normal.
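
The four steps above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: messages are modelled as hypothetical (Role, Text) tuples instead of ChatMessage, the `Compress` helper and its `summarise` delegate are assumptions made for the example, and the summary prefix text is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: compresses a history of (Role, Text) tuples.
// The summarise delegate stands in for the LLM call made in step 2.
static List<(string Role, string Text)> Compress(
    IReadOnlyList<(string Role, string Text)> history,
    int keepLastMessages,
    Func<IReadOnlyList<(string Role, string Text)>, string> summarise)
{
    // Nothing to compress if the history already fits in the recent window.
    if (history.Count <= keepLastMessages)
        return history.ToList();

    // Step 1: split the history into old and recent segments.
    int splitIndex = history.Count - keepLastMessages;
    var oldSegment = history.Take(splitIndex).ToList();
    var recentSegment = history.Skip(splitIndex);

    // Step 2: produce a compact summary of the old segment.
    string summaryText = summarise(oldSegment);

    // Step 3: replace the old segment with a single system summary message.
    var compressed = new List<(string Role, string Text)>
    {
        ("system", "Summary of earlier conversation: " + summaryText)
    };
    compressed.AddRange(recentSegment);

    // Step 4: the caller forwards this list to the inner client.
    return compressed;
}

var demoHistory = new List<(string Role, string Text)>
{
    ("user", "Hi"),
    ("assistant", "Hello"),
    ("user", "Plan a trip"),
    ("assistant", "Where to?"),
    ("user", "Lisbon")
};

// A fake summariser keeps the example offline and deterministic.
var demoResult = Compress(demoHistory, 2, segment => $"{segment.Count} earlier messages");
Console.WriteLine(demoResult.Count); // 3: one summary message plus the two most recent
```

Passing the summariser as a delegate mirrors the idea of an optional dedicated summariser client: the compression logic does not need to know which model produces the summary.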

Features

| Feature | Detail |
| --- | --- |
| Automatic compression | Triggers before the call when estimated tokens ≥ triggerRatio × maxTokens |
| Configurable threshold | maxTokens + triggerRatio (default 80 %) |
| Recent-message preservation | The last keepLastMessages messages are always kept verbatim |
| Tool-pair awareness | Assistant + Tool messages are kept together when splitting |
| Threshold callback | Optional onThresholdReached; return false to block instead of compress |
| Custom token counter | Inject any tokeniser; defaults to a charCount / 4 approximation |
| Separate summariser | Optional dedicated IChatClient for summary calls |
| Streaming support | Works with both GetResponseAsync and GetStreamingResponseAsync |
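
To illustrate what tool-pair awareness could look like, the sketch below picks a split index so that the recent segment never starts with a tool result whose requesting assistant message was left in the old segment. The `FindSplitIndex` helper, the role strings, and the walk-back rule are all assumptions for illustration; the package's actual splitting logic may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns the index where the "recent" segment begins.
// roles holds one role name per message ("user", "assistant", "tool", ...).
static int FindSplitIndex(IReadOnlyList<string> roles, int keepLastMessages)
{
    // Naive boundary: keep exactly the last keepLastMessages messages.
    int splitIndex = Math.Max(0, roles.Count - keepLastMessages);

    // If the recent segment would start with a tool result, move the boundary
    // back until it starts with the message that precedes the tool results
    // (assumed here to be the assistant message that requested the call).
    while (splitIndex > 0 && splitIndex < roles.Count && roles[splitIndex] == "tool")
        splitIndex--;

    return splitIndex;
}

var demoRoles = new List<string> { "user", "assistant", "tool", "assistant", "user" };

// Keeping 3 messages would start the recent segment at the tool result (index 2),
// so the boundary moves back to the assistant message at index 1.
Console.WriteLine(FindSplitIndex(demoRoles, 3)); // 1
```

The trade-off in this sketch is that the recent segment can end up slightly larger than keepLastMessages, in exchange for never separating a tool call from its result.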

Installation

dotnet add package AzureAICommunity.Agent.Middleware.ContextCompressionMiddleware

Quick Start

using AzureAICommunity.Agent.Middleware.ContextCompressionMiddleware;
using Microsoft.Extensions.AI;
using OllamaSharp;

IChatClient ollamaClient = new OllamaApiClient(new Uri("http://localhost:11434/"), "llama3.2");

var client = ollamaClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 4000,
        triggerRatio: 0.80,
        keepLastMessages: 6))
    .Build();

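// Any list of ChatMessage works; a one-message history is shown for brevity
var conversationHistory = new List<ChatMessage> { new(ChatRole.User, "Hello!") };

// Use like any other IChatClient; compression is transparent
var response = await client.GetResponseAsync(conversationHistory);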

Configuration

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxTokens | int | 8000 | Token budget for the conversation |
| triggerRatio | double | 0.80 | Compression triggers at this fraction of maxTokens |
| keepLastMessages | int | 8 | Recent messages kept verbatim after compression |
| onThresholdReached | Func<CompressionInfo, bool>? | null | Optional callback; return false to block |
| tokenCounter | Func<IEnumerable<ChatMessage>, int>? | null | Custom token estimator |
| summarizerClient | IChatClient? | null | Dedicated client for summarisation calls |
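
As a worked example of how maxTokens and triggerRatio interact: with the defaults, compression triggers at 0.80 × 8000 = 6400 estimated tokens, which under the charCount / 4 approximation is roughly 25,600 characters of history. The sketch below reproduces that arithmetic; `EstimateTokens` and `ShouldCompress` are hypothetical helpers written for this example, not members of the package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: the default-style approximation of one token per four characters,
// applied per message with integer division.
static int EstimateTokens(IEnumerable<string> messageTexts) =>
    messageTexts.Sum(text => text.Length / 4);

// Hypothetical helper: the trigger rule, estimated tokens >= triggerRatio * maxTokens.
static bool ShouldCompress(int estimatedTokens, int maxTokens = 8000, double triggerRatio = 0.80) =>
    estimatedTokens >= triggerRatio * maxTokens;

// Two short messages: 8 characters and 6 characters.
Console.WriteLine(EstimateTokens(new[] { "12345678", "123456" })); // 3 (2 + 1)

// With the defaults the threshold is 6400 tokens.
Console.WriteLine(ShouldCompress(6000)); // False
Console.WriteLine(ShouldCompress(7000)); // True

// With the Quick Start values (maxTokens 4000, ratio 0.80) the threshold is 3200 tokens.
Console.WriteLine(ShouldCompress(3500, maxTokens: 4000)); // True
```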

Threshold Callback

var client = ollamaClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 4000,
        onThresholdReached: info =>
        {
            Console.WriteLine($"Threshold hit: {info.TokensUsed} / {info.MaxTokens} tokens");
            return true; // true = compress, false = throw ContextCompressionThresholdException
        }))
    .Build();

Custom Token Counter

// Example: plug in your own tokeniser (the body below is only a chars / 4 placeholder)
var client = ollamaClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 4000,
        tokenCounter: messages =>
        {
            // supply your own counting logic here
            return messages.Sum(m => m.Text?.Length / 4 ?? 0);
        }))
    .Build();
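
If a real tokeniser is not available, a slightly more conservative heuristic than plain chars / 4 is sometimes enough. The sketch below rounds up instead of down and adds a fixed per-message overhead to account for role and formatting tokens. Both the `CountTokensConservatively` helper and the overhead value of 4 are assumptions chosen for illustration, not measured values or part of the package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical estimator: ceil(chars / 4) per message plus a fixed overhead per message.
static int CountTokensConservatively(IEnumerable<string> messageTexts, int perMessageOverhead = 4) =>
    messageTexts.Sum(text => (text.Length + 3) / 4 + perMessageOverhead);

// "Hello, world!" has 13 characters: ceil(13 / 4) = 4, plus 4 overhead = 8.
// The empty message contributes only its overhead of 4.
Console.WriteLine(CountTokensConservatively(new[] { "Hello, world!", "" })); // 12
```

An estimator like this could be adapted to the tokenCounter parameter by projecting each ChatMessage to its Text first. Over-estimating is the safer direction here, since it makes compression trigger earlier rather than later.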

Separate Summariser Client

// Use a cheaper model for summarisation
IChatClient summarizerClient = new OllamaApiClient(new Uri("http://localhost:11434/"), "phi3");

// expensiveClient is your primary IChatClient (for example, a larger hosted model)
var client = expensiveClient
    .AsBuilder()
    .Use(inner => new ContextCompressionMiddleware(
        inner,
        maxTokens: 8000,
        summarizerClient: summarizerClient))
    .Build();

🤝 Contributing

Contributions are welcome! Please open an issue to discuss what you'd like to change before submitting a pull request.

📁 Repository: https://github.com/rvinothrajendran/AgentFramework

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Commit your changes (git commit -m 'Add my feature')
  4. Push to the branch (git push origin feature/my-feature)
  5. Open a Pull Request

👤 Author

Built and maintained by Vinoth Rajendran.


📄 License

MIT © 2026 Vinoth Rajendran – AzureAICommunity

Compatible target frameworks

.NET: net10.0 is compatible. The following were computed: net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows.

| Version | Downloads | Last Updated |
| --- | --- | --- |
| 1.0.0 | 100 | 4/21/2026 |