AgentEval 0.2.0-beta
This is a prerelease version of AgentEval.
dotnet add package AgentEval --version 0.2.0-beta
NuGet\Install-Package AgentEval -Version 0.2.0-beta
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="AgentEval" Version="0.2.0-beta" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="AgentEval" Version="0.2.0-beta" />
<PackageReference Include="AgentEval" />
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add AgentEval --version 0.2.0-beta
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
#r "nuget: AgentEval, 0.2.0-beta"
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package AgentEval@0.2.0-beta
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=AgentEval&version=0.2.0-beta&prerelease
#tool nuget:?package=AgentEval&version=0.2.0-beta&prerelease
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
AgentEval
The first .NET-native AI agent testing, evaluation, and benchmarking framework
Features
- 🎯 Tool Tracking - Monitor tool/function calls with timing and arguments
- 📊 Performance Metrics - TTFT, latency, tokens, cost estimation
- ✅ Fluent Assertions - Expressive test assertions with rich failure messages, "because" reasons, and assertion scopes
- 🔬 RAG Metrics - Faithfulness, relevance, context precision
- 📈 Benchmarking - Performance and agentic benchmarks
- 🔌 Extensible - Adapter pattern for multiple agent frameworks
- 📋 Trace-First Failure Reporting - Structured failure reports with tool timelines
- 🔧 Testing Infrastructure - FakeChatClient for mocking LLM responses
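To make the performance metrics listed above more concrete, here is a minimal, self-contained sketch of how time to first token (TTFT), total latency, and an estimated cost could be computed for a streamed response. It does not use AgentEval's API and is not how the library implements these metrics: `EstimateCost`, `MeasureStreamAsync`, `FakeStream`, and the per-million-token prices are hypothetical names and values chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper: cost from token counts and per-million-token prices.
static decimal EstimateCost(int inputTokens, int outputTokens,
    decimal inputPricePerMillion, decimal outputPricePerMillion) =>
    (inputTokens * inputPricePerMillion + outputTokens * outputPricePerMillion) / 1_000_000m;

// Hypothetical helper: measures TTFT and total latency of a streamed response.
static async Task<(TimeSpan Ttft, TimeSpan Total, int Chunks)> MeasureStreamAsync(
    IAsyncEnumerable<string> stream)
{
    var clock = Stopwatch.StartNew();
    TimeSpan? ttft = null;
    var chunks = 0;
    await foreach (var _ in stream)
    {
        ttft ??= clock.Elapsed; // the first chunk marks the time to first token
        chunks++;
    }
    var total = clock.Elapsed;
    return (ttft ?? total, total, chunks);
}

// Stand-in for a model's streaming output (assumption: two chunks with small delays).
static async IAsyncEnumerable<string> FakeStream()
{
    await Task.Delay(20);
    yield return "Hello";
    await Task.Delay(20);
    yield return " world";
}

var timing = await MeasureStreamAsync(FakeStream());
var cost = EstimateCost(inputTokens: 1000, outputTokens: 500,
    inputPricePerMillion: 2.00m, outputPricePerMillion: 8.00m); // made-up prices
Console.WriteLine($"chunks={timing.Chunks}, ttft={timing.Ttft}, total={timing.Total}, cost={cost}");
```

Keeping prices as `decimal` avoids binary floating-point rounding in the cost figure, which matters when a test compares the result against a threshold such as `0.10m`.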
Quick Start
using AgentEval;
using AgentEval.MAF;
using AgentEval.Assertions;
// Create test harness
var harness = new MAFTestHarness(evaluatorClient);
// Run test with tool tracking
var result = await harness.RunTestAsync(agent, new TestCase
{
    Name = "Feature Planning Test",
    Input = "Plan a user authentication feature",
    EvaluationCriteria = ["Should include security considerations"]
});
// Assert tool usage with "because" reasons
result.ToolUsage!
    .Should()
    .HaveCalledTool("SecurityTool", because: "auth features require security review")
        .BeforeTool("FeatureTool")
        .WithoutError()
    .And()
    .HaveNoErrors();
// Assert performance
result.Performance!
    .Should()
    .HaveTotalDurationUnder(TimeSpan.FromSeconds(10))
    .HaveEstimatedCostUnder(0.10m);
// Use assertion scopes to collect all failures
using (new AgentEvalScope())
{
    result.ToolUsage!.Should().HaveCalledTool("Tool1");
    result.ToolUsage!.Should().HaveCalledTool("Tool2");
    result.ActualOutput!.Should().Contain("success");
}
// Throws single exception with ALL failures listed
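The idea behind an assertion scope (record every failure, then raise one exception at the end) can be sketched without the library. The following is an illustration of the general pattern only, not AgentEval's implementation: `RunInScope` and the `check` callback are hypothetical, and the real `AgentEvalScope` is a disposable object rather than a callback.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: runs `body`, giving it a `check` callback that records
// failures instead of throwing. After the body finishes, one exception is
// thrown that lists every recorded failure.
static void RunInScope(Action<Action<bool, string>> body)
{
    var failures = new List<string>();
    body((condition, message) => { if (!condition) failures.Add(message); });
    if (failures.Count > 0)
    {
        throw new InvalidOperationException(
            $"{failures.Count} assertion(s) failed:{Environment.NewLine}- " +
            string.Join($"{Environment.NewLine}- ", failures));
    }
}

// Made-up test data: only Tool1 was called and the output lacks "success".
var toolsCalled = new[] { "Tool1" };
var output = "done";
var report = "";
try
{
    RunInScope(check =>
    {
        check(Array.IndexOf(toolsCalled, "Tool1") >= 0, "Expected Tool1 to be called");
        check(Array.IndexOf(toolsCalled, "Tool2") >= 0, "Expected Tool2 to be called");
        check(output.Contains("success"), "Expected output to contain \"success\"");
    });
}
catch (InvalidOperationException ex)
{
    report = ex.Message; // a single exception carrying both failures
}
Console.WriteLine(report);
```

Compared with failing on the first assertion, this style reports every problem from one agent run, which helps when each run is slow or costs money.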
Rich Failure Messages
When assertions fail, you get structured, actionable output:
Expected tool 'SearchTool' to be called, but it was not because user query requires search.
Expected: Tool 'SearchTool' called at least once
Actual: Tools called: [CalculateTool, FormatTool]
Tools called:
• CalculateTool
• FormatTool
Suggestions:
→ Verify the agent has access to the expected tools
→ Check if the prompt clearly requests tool usage
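A report with this shape could be assembled from the expected tool name, the list of tools actually called, and the "because" reason. The sketch below is an assumption about how such a message might be built, not the library's code: `DescribeMissingTool` is a hypothetical name, and the wording only approximates the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical formatter: returns null when the expected tool was called,
// otherwise a structured report with expected/actual lines and suggestions.
static string? DescribeMissingTool(string expected, IReadOnlyList<string> called, string because)
{
    if (called.Contains(expected)) return null;

    var sb = new StringBuilder();
    sb.AppendLine($"Expected tool '{expected}' to be called because {because}, but it was not.");
    sb.AppendLine($"Expected: Tool '{expected}' called at least once");
    sb.AppendLine($"Actual: Tools called: [{string.Join(", ", called)}]");
    sb.AppendLine("Suggestions:");
    sb.AppendLine("→ Verify the agent has access to the expected tools");
    sb.Append("→ Check if the prompt clearly requests tool usage");
    return sb.ToString();
}

var calledTools = new List<string> { "CalculateTool", "FormatTool" };
var missing = DescribeMissingTool("SearchTool", calledTools, "user query requires search");
var present = DescribeMissingTool("FormatTool", calledTools, "output must be formatted");
Console.WriteLine(missing ?? "SearchTool was called");
```

Putting the actual tool list next to the expectation is what makes the message actionable: the reader can see at a glance which tools ran instead.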
Test Coverage
- 2100+ unit tests covering all core functionality
- Tests for all assertions, metrics, models, and adapters
- All tests passing ✅
Installation
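dotnet add package AgentEval --prerelease
AgentEval is currently published only as prerelease versions, so pass --prerelease (or pin the version with --version 0.2.0-beta).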
Documentation
- Getting Started Guide - Quick introduction
- Fluent Assertions Guide - Complete assertion reference
- Architecture Guide - Component overview
- AgentEval Design Document - Full technical documentation
License
MIT License - See LICENSE file for details.
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
net10.0
- Azure.AI.OpenAI (>= 2.5.0-beta.1)
- Microsoft.Agents.AI (>= 1.0.0-preview.251110.2)
- Microsoft.Extensions.AI (>= 10.0.0)
- Microsoft.Extensions.AI.Evaluation.Quality (>= 9.5.0)
- System.Numerics.Tensors (>= 10.0.0)
- YamlDotNet (>= 16.3.0)
net8.0
- Azure.AI.OpenAI (>= 2.5.0-beta.1)
- Microsoft.Agents.AI (>= 1.0.0-preview.251110.2)
- Microsoft.Extensions.AI (>= 10.0.0)
- Microsoft.Extensions.AI.Evaluation.Quality (>= 9.5.0)
- System.Numerics.Tensors (>= 10.0.0)
- YamlDotNet (>= 16.3.0)
net9.0
- Azure.AI.OpenAI (>= 2.5.0-beta.1)
- Microsoft.Agents.AI (>= 1.0.0-preview.251110.2)
- Microsoft.Extensions.AI (>= 10.0.0)
- Microsoft.Extensions.AI.Evaluation.Quality (>= 9.5.0)
- System.Numerics.Tensors (>= 10.0.0)
- YamlDotNet (>= 16.3.0)
| Version | Downloads | Last Updated |
|---|---|---|
| 0.2.0-beta | 30 | 1/18/2026 |
| 0.1.1-alpha | 49 | 1/3/2026 |
| 0.1.0-alpha | 44 | 1/3/2026 |