QuickBench 0.1.0

dotnet add package QuickBench --version 0.1.0

QuickBench

Give it a time budget in seconds and a set of operations; get results to about ±1% (at 30 s, see Accuracy) in machine-independent relative units.

QuickBenchBuilder.Create("MyBench")
    .Add("Sort",   () => Array.Sort(data.Clone() as int[]))
    .Add("Search", () => Array.BinarySearch(sorted, target))
    .Run(seconds: 15)
    .PrintText();
QuickBench — MyBench
======================================================================
Baseline: 92.3 ± 0.2 μs  →  1🍌 = 9.2 ns
Mode: Quick (15s)  Rounds: 152
✅ Baseline stable — results reliable

           |       μs |       🍌 |    KB
-----------+----------+----------+----------
Sort       |     12.4 |    1,345 |     0.10
Search     |     0.03 |      3.3 |     0.00

μs: wall-clock time, comparable within one run.
🍌 (bananas): CPU-normalized units, comparable across runs, power modes, and machines.

API

Single operation

QuickBenchBuilder.Create(() => Array.Sort(data.Clone() as int[]))
    .Run(seconds: 15).PrintText();

Multiple operations with tags

QuickBenchBuilder.Create("Algo")
    .Add("BubbleSort", () => BubbleSort(data))
    .Add("QuickSort",  () => QuickSort(data))
    .Add("MergeSort",  () => MergeSort(data))
    .Run(seconds: 30).PrintText();

One tag → flat table: each tag = row.

Two tags (row × column)

var builder = QuickBenchBuilder.Create("Database");
foreach (var q in simpleQueries) {
    builder.Add("Simple", "Query",     () => db.Execute(q));
    builder.Add("Simple", "Serialize", () => Serialize(q));
}
foreach (var q in complexQueries) {
    builder.Add("Complex", "Query",     () => db.Execute(q));
    builder.Add("Complex", "Serialize", () => Serialize(q));
}
builder.Run(seconds: 60).PrintText();
Absolute (μs per op):
           | Query    | Serialize
-----------+----------+----------
Simple     |     0.84 |     2.10
Complex    |     45.2 |     12.8

Two tags → matrix: first tag = row, second tag = column.

Three tags (section × row × column)

builder.Add("v1", "Simple", "Build", () => BuildV1(s));
builder.Add("v2", "Simple", "Build", () => BuildV2(s));

Three tags → grouped tables: first tag = section header.

Custom operation

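public class HttpBenchOp : IBenchOperation {
    private readonly HttpClient _client = new();
    private readonly string _url = "https://example.com/users"; // placeholder URL, replace with your endpoint
    public string[] Tags => new[] { "API", "GET /users" };
    // Perform is synchronous, so block until the request completes.
    public void Perform() => _client.GetAsync(_url).GetAwaiter().GetResult();
}
builder.Add(new HttpBenchOp());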

JSON output

var report = builder.Run(seconds: 30);
report.SaveJson("bench.json");
string json = report.ToJson();

Algorithm

Pipeline

1. Parallel warmup (2s)    All CPU cores run an arithmetic loop simultaneously.
2. Calibration             Measure each slot's duration, compute batch sizes.
3. JIT warmup              Run all operations ~1s to trigger .NET tiered compilation.
4. Measurement             Shuffled round-robin, baseline interleaved every 10th round.
5. Memory measurement      GC.GetTotalAllocatedBytes before/after, median of 5 runs.
6. Statistics              Trimmed mean, block CI, practical CI (×4), propagation.

1. Parallel warmup

All CPU cores run the baseline arithmetic loop for 2 seconds simultaneously. This:

  • Brings the CPU toward thermal steady state (limits frequency drift during measurement)
  • Encourages the OS scheduler to place the benchmark thread on a performance core
  • Gets past turbo boost transients before timing starts
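
The warmup described above can be sketched roughly as follows. This is not QuickBench's actual code: `Spin` is a hypothetical stand-in for the baseline loop, and using one dedicated thread per logical core is an assumption about how "all CPU cores" are kept busy.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical stand-in for the baseline loop: integer multiply + XOR.
uint Spin(int iterations)
{
    uint x = 1;
    for (int i = 0; i < iterations; i++)
        x = (x * 1664525u) ^ (uint)i;
    return x;
}

// Start one busy thread per logical core and wait until all reach the deadline.
void ParallelWarmup(TimeSpan duration)
{
    long deadline = Stopwatch.GetTimestamp()
                  + (long)(duration.TotalSeconds * Stopwatch.Frequency);
    var threads = new Thread[Environment.ProcessorCount];
    for (int t = 0; t < threads.Length; t++)
    {
        threads[t] = new Thread(() =>
        {
            uint sink = 0;
            while (Stopwatch.GetTimestamp() < deadline)
                sink ^= Spin(80_000);
            GC.KeepAlive(sink); // keep the result observable so the loop is not removed
        });
        threads[t].Start();
    }
    foreach (var thread in threads) thread.Join();
}

ParallelWarmup(TimeSpan.FromSeconds(2));
```

Dedicated threads are used here instead of the thread pool because the pool may start workers gradually, which would leave some cores idle for part of the warmup.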

2. Calibration

For each slot, the engine runs the operation for ~0.5s and measures average duration. From this it computes batch size — how many times to call the operation per measurement to reach ~10ms total. A 10μs operation gets batch=1000; a 100ms operation gets batch=1.
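
As an illustration of the batch-size arithmetic, here is a minimal sketch. The function name, the rounding, and the floor of 1 are assumptions; only the ~10 ms target and the two examples come from the text above.

```csharp
using System;

// Number of calls per measurement so that one batch lasts about targetMs.
int BatchSize(double avgOpMicroseconds, double targetMs = 10.0)
{
    double batch = targetMs * 1000.0 / avgOpMicroseconds;
    return (int)Math.Max(1, Math.Round(batch)); // never fewer than one call
}

Console.WriteLine(BatchSize(10));       // 10 μs operation  -> 1000
Console.WriteLine(BatchSize(100_000));  // 100 ms operation -> 1
```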

3. JIT warmup

All operations run for ~1 second total (scaled by calibrated speed). This triggers .NET's tiered JIT: Tier0 → Tier1 compilation. Minimum 30 iterations per operation (JIT Tier1 threshold).

4. Shuffled round-robin

Operations are grouped into slots by their tag combination. Each round measures all slots once. Between rounds, slot order is shuffled:

Round 1: [Parse] [Baseline] [Build] [Run]
Round 2: [Build] [Run] [Parse] [Baseline]
Round 3: [Baseline] [Parse] [Run] [Build]

Without shuffling, Build always runs after Parse with warm caches. Shuffling randomizes this — each slot sees a mix of warm and cold cache states, producing a realistic average.

Baseline is measured every 10th round at a random position within the round. This tracks CPU frequency changes without polluting L1/L2 cache every round.
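
The scheduling rule can be sketched like this. It is a sketch under assumptions, not the library's code: the function name is hypothetical, the shuffle is a plain Fisher-Yates, and the baseline is added only on rounds divisible by 10, following the "every 10th round" wording.

```csharp
using System;
using System.Collections.Generic;

// Slot order for one round: shuffled slots, plus "Baseline" at a random
// position on every 10th round.
List<string> RoundOrder(string[] slotNames, int round, Random random)
{
    var order = new List<string>(slotNames);
    for (int i = order.Count - 1; i > 0; i--)   // Fisher-Yates shuffle
    {
        int j = random.Next(i + 1);
        (order[i], order[j]) = (order[j], order[i]);
    }
    if (round % 10 == 0)
        order.Insert(random.Next(order.Count + 1), "Baseline");
    return order;
}

var demoRandom = new Random(42);
var demoSlots = new[] { "Parse", "Build", "Run" };
for (int round = 1; round <= 10; round++)
    Console.WriteLine($"Round {round}: {string.Join(" ", RoundOrder(demoSlots, round, demoRandom))}");
```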

5. Memory measurement

After timing completes, each slot runs once more bracketed by GC.GetTotalAllocatedBytes(true). Repeated 5 times, median taken. Reports KB allocated per operation.
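
A minimal sketch of this measurement, assuming a single call per run and a plain middle-element median (the helper name is hypothetical). Note that GC.GetTotalAllocatedBytes counts allocations for the whole process, so other active threads can inflate a sample; taking the median limits that effect.

```csharp
using System;

// Median number of bytes allocated by one call of op, over `runs` repetitions.
long MedianAllocatedBytes(Action op, int runs = 5)
{
    var samples = new long[runs];
    for (int i = 0; i < runs; i++)
    {
        long before = GC.GetTotalAllocatedBytes(true); // true = precise count
        op();
        long after = GC.GetTotalAllocatedBytes(true);
        samples[i] = after - before;
    }
    Array.Sort(samples);
    return samples[runs / 2];
}

long demoBytes = MedianAllocatedBytes(() => GC.KeepAlive(new byte[1024]));
Console.WriteLine($"{demoBytes / 1024.0:F2} KB per call");
```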

6. Statistics

Warmup discard: the first 10% of each slot's samples are dropped to remove residual JIT warmup effects.

Trimmed mean (15%): sort samples, discard bottom 15% and top 15%, mean the remaining 70%. Rejects GC spikes and OS interrupts while preserving more information than median.
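
For illustration, a 15% trimmed mean might look like the following. The function name and the integer-percent parameter are assumptions made for this sketch.

```csharp
using System;
using System.Linq;

// Mean of the samples that remain after dropping trimPercent% from each end.
double TrimmedMean(double[] samples, int trimPercent = 15)
{
    var sorted = samples.OrderBy(x => x).ToArray();
    int cut = sorted.Length * trimPercent / 100;  // samples dropped at each end
    return sorted.Skip(cut).Take(sorted.Length - 2 * cut).Average();
}

// 20 samples: eighteen at 10 μs plus two GC-style spikes.
var demoTimings = Enumerable.Repeat(10.0, 18).Concat(new[] { 250.0, 400.0 }).ToArray();
Console.WriteLine(TrimmedMean(demoTimings));  // 10
Console.WriteLine(demoTimings.Average());     // plain mean, pulled up by the spikes
```

With 20 samples, three are dropped from each end, so both spikes are discarded and the trimmed mean stays at 10 while the plain mean is 41.5.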

Block CI: samples are split into ~10 blocks of equal size. Each block's mean is computed. Standard CI formula applied to block means with t-distribution correction for small N. Blocks account for temporal autocorrelation — consecutive samples within a block may correlate (thermal drift), but block means are approximately independent.
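
The block CI can be sketched as below. Several details are assumptions not stated above: exactly 10 blocks, a 95% confidence level (so the Student t critical value for 9 degrees of freedom, 2.262), and leftover samples at the end being ignored.

```csharp
using System;
using System.Linq;

// Half-width of a 95% confidence interval computed from 10 block means.
double BlockCiHalfWidth(double[] samples)
{
    const int Blocks = 10;
    const double T95 = 2.262; // two-sided 95% t critical value, 9 degrees of freedom
    int size = samples.Length / Blocks;  // leftover samples at the end are ignored
    if (size == 0) throw new ArgumentException("need at least 10 samples");

    var means = Enumerable.Range(0, Blocks)
        .Select(b => samples.Skip(b * size).Take(size).Average())
        .ToArray();
    double grand = means.Average();
    double variance = means.Sum(m => (m - grand) * (m - grand)) / (Blocks - 1);
    return T95 * Math.Sqrt(variance / Blocks);  // t * standard error of the block means
}

var demoSamples = Enumerable.Range(0, 200).Select(i => 10.0 + (i % 7) * 0.01).ToArray();
double demoBlockCi = BlockCiHalfWidth(demoSamples);
Console.WriteLine($"block CI: ±{demoBlockCi:F4}, practical CI (x4): ±{4 * demoBlockCi:F4}");
```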

Practical CI (×4): block CI multiplied by 4. Empirical correction calibrated on Apple M4 Pro: covers ~90% of run-to-run variance on thermally stable systems.

Propagation for bananas: banana CI combines operation uncertainty and baseline uncertainty:

δ(🍌) = 🍌 × √( (δ_op / op)² + (δ_baseline / baseline)² )
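
A small worked example of the propagation formula follows. The baseline figures (92.3 ± 0.2 μs) and the Sort row (12.4 μs, 1,345 🍌) are taken from the sample report above; the ±0.124 μs uncertainty on Sort is an assumed value for illustration, and the function name is hypothetical.

```csharp
using System;

// Combine relative uncertainties of the operation and the baseline in quadrature.
double BananaCi(double bananas, double op, double opCi, double baseline, double baselineCi)
{
    double relOp = opCi / op;
    double relBase = baselineCi / baseline;
    return bananas * Math.Sqrt(relOp * relOp + relBase * relBase);
}

double demoCi = BananaCi(bananas: 1345, op: 12.4, opCi: 0.124, baseline: 92.3, baselineCi: 0.2);
Console.WriteLine($"±{demoCi:F1} bananas");
```

With these inputs the operation contributes 1% and the baseline about 0.22%, giving a combined uncertainty of roughly ±13.8 🍌, so the operation term dominates.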

Stability detection

If baseline practical CI exceeds 3% of baseline mean → warning:

⚠️  Baseline unstable — results unreliable

Indicates background CPU load or thermal instability during measurement.

Baseline and bananas

A fixed arithmetic loop (80K iterations of integer multiply + XOR, marked [MethodImpl(MethodImplOptions.NoInlining)]) runs as a regular slot. It defines bananas:

1 🍌 = baseline_time / 10000

Bananas normalize CPU frequency differences. On an M4 Pro the baseline takes ~60μs in Hi Power mode and ~92μs in Low Power mode, yet the same operation scores ~2,560 🍌 in both.
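
The conversion from microseconds to bananas follows directly from the definition. A minimal sketch, with a hypothetical function name and the timings from the sample report at the top:

```csharp
using System;

// 1 banana = baseline_time / 10000, so bananas = op_time / (baseline_time / 10000).
double ToBananas(double opMicroseconds, double baselineMicroseconds)
    => opMicroseconds / (baselineMicroseconds / 10_000.0);

Console.WriteLine($"{ToBananas(12.4, 92.3):F0}");
```

This yields about 1,343 🍌 for the Sort row, close to the 1,345 in the sample report; the small gap is presumably because the report works from unrounded timings.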

Accuracy

Duration | Practical CI | Rounds | Use case
---------+--------------+--------+------------
7s       | ±3%          | ~80    | Smoke test
15s      | ±1.5%        | ~160   | Quick check
30s      | ±1%          | ~320   | Development
60s      | ±0.5%        | ~650   | Pre-commit
120s+    | ±0.2%        | ~1300  | Release

Requirements

.NET 6.0+. Zero dependencies.

Version | Downloads | Last Updated
--------+-----------+-------------
0.1.0   | 107       | 4/3/2026