braid 0.5.0
braid
Deterministic concurrency testing for .NET libraries using explicit async probe points and replay tokens.
Find the interleaving. Copy the replay token. Keep the race fixed forever.
Tests fork logical workers, workers stop at named probes, and braid controls which worker is released next. When a race is understood, keep the reproducing interleaving as a copyable replay token. Roadmap: docs/roadmap.md.
Install
dotnet add package braid
braid targets .NET 10.
For release validation and consumer smoke-test instructions, see docs/release-process.md.
What braid does
- Runs ordinary async .NET code under a deterministic scheduler.
- Uses explicit probe points instead of runtime rewriting.
- Supports seeded random scheduling while you search for useful interleavings.
- Supports typed replay schedules with BraidSchedule and BraidStep, and text replay schedules via BraidSchedule.Parse/TryParse.
- Reports failures with seed, iteration, schedule, trace, inner exception details, and, when a replay schedule was configured, a copyable replay token (canonical replay text) plus scheduler-state diagnostics.
What braid does not do
- It is not a TaskScheduler replacement.
- It does not intercept every await.
- It does not rewrite binaries.
- It is not a distributed-system test framework.
- It is not exhaustive model checking.
Quick start
using Braid;
using Xunit;

public sealed class BraidQuickStartTests
{
    [Fact]
    public async Task ForkProbeJoinCompletesUnderReplay()
    {
        var workerCompleted = false;
        var options = new BraidOptions
        {
            Iterations = 1,
            Schedule = BraidSchedule.Replay(BraidStep.Hit("worker-1", "ready")),
        };

        await Braid.RunAsync(
            async context =>
            {
                context.Fork(async () =>
                {
                    await BraidProbe.HitAsync("ready");
                    workerCompleted = true;
                });
                await context.JoinAsync();
            },
            options);

        Assert.True(workerCompleted);
    }
}
Outside a braid run, BraidProbe.HitAsync completes immediately. Inside a braid run, it becomes an explicit scheduling point.
Replay schedules
A replay schedule describes the exact worker/probe order to reproduce.
var options = new BraidOptions
{
    Seed = 12345,
    Iterations = 1,
    Schedule = BraidSchedule.Replay(
        BraidStep.Hit("worker-1", "after-read"),
        BraidStep.Hit("worker-2", "after-read"),
        BraidStep.Hit("worker-1", "before-write"),
        BraidStep.Hit("worker-2", "before-write")),
};
Replay matching is ordinal and case-sensitive for both worker ids and probe names.
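To make the schedule above concrete, here is a minimal sketch of code it could drive. The Counter class and the test name are hypothetical, and the sketch assumes that forked workers are named worker-1 and worker-2 in fork order, as the Quick start suggests. It uses only the calls shown elsewhere in this README.

```csharp
using System.Threading.Tasks;
using Braid;
using Xunit;

public sealed class LostUpdateReplayTests
{
    // Hypothetical code under test: an unsynchronized read-modify-write
    // with one named probe after the read and one before the write.
    private sealed class Counter
    {
        public int Value { get; private set; }

        public async Task IncrementAsync()
        {
            var read = Value;
            await BraidProbe.HitAsync("after-read");
            var next = read + 1;
            await BraidProbe.HitAsync("before-write");
            Value = next;
        }
    }

    [Fact]
    public async Task BothWorkersReadBeforeEitherWrites()
    {
        var counter = new Counter();
        var options = new BraidOptions
        {
            Iterations = 1,
            Schedule = BraidSchedule.Replay(
                BraidStep.Hit("worker-1", "after-read"),
                BraidStep.Hit("worker-2", "after-read"),
                BraidStep.Hit("worker-1", "before-write"),
                BraidStep.Hit("worker-2", "before-write")),
        };

        await Braid.RunAsync(
            async context =>
            {
                // Assumption: the first fork is worker-1, the second worker-2.
                context.Fork(() => counter.IncrementAsync());
                context.Fork(() => counter.IncrementAsync());
                await context.JoinAsync();
            },
            options);

        // Neither worker can write until both have passed after-read,
        // so both read 0 and one increment is lost.
        Assert.Equal(1, counter.Value);
    }
}
```

Asserting the lost update (1 rather than 2) documents the race; once Counter is fixed, the same schedule can stay in place with the assertion flipped to the correct total.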
Text replay schedules
Replay schedules can also be parsed from text:
var schedule = BraidSchedule.Parse("""
    hit worker-1 after-read
    hit worker-2 after-read
    arrive worker-1 before-write
    hit worker-2 updated
    release worker-1 before-write
    """);
The text format is line based:
hit <worker> <probe>
arrive <worker> <probe>
release <worker> <probe>
Operation names (hit, arrive, release) are case-insensitive. Worker ids and probe names stay case-sensitive. Empty lines and full-line # comments (after trimming) are ignored; inline # comments are not supported—extra tokens are rejected.
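The rules above can be illustrated with a small stand-alone parser. This is a sketch of the described format only, not braid's implementation: StepKind, ReplayLine, and ReplayTextSketch are hypothetical names, and the sketch does not enforce the minimum-step rule that BraidSchedule.Parse applies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; these are not braid's own.
public enum StepKind { Hit, Arrive, Release }

public readonly record struct ReplayLine(StepKind Kind, string Worker, string Probe);

public static class ReplayTextSketch
{
    public static List<ReplayLine> Parse(string text)
    {
        var steps = new List<ReplayLine>();
        var lineNumber = 0;

        foreach (var rawLine in text.Split('\n'))
        {
            lineNumber++;
            var line = rawLine.Trim();

            // Empty lines and full-line comments are skipped after trimming.
            if (line.Length == 0 || line.StartsWith('#'))
            {
                continue;
            }

            // Exactly three whitespace-separated tokens are allowed, so an
            // inline "# ..." comment counts as extra tokens and is rejected.
            var tokens = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            if (tokens.Length != 3)
            {
                throw new FormatException(
                    $"Line {lineNumber}: expected '<operation> <worker> <probe>'.");
            }

            // Operation names are case-insensitive; worker ids and probe
            // names are kept exactly as written.
            var kind = tokens[0].ToLowerInvariant() switch
            {
                "hit" => StepKind.Hit,
                "arrive" => StepKind.Arrive,
                "release" => StepKind.Release,
                _ => throw new FormatException(
                    $"Line {lineNumber}: unknown operation '{tokens[0]}'."),
            };

            steps.Add(new ReplayLine(kind, tokens[1], tokens[2]));
        }

        return steps;
    }
}
```

Splitting on whitespace is also why worker ids and probe names containing whitespace cannot be represented in the text format.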
Use BraidSchedule.TryParse when you need a non-throwing parse attempt.
ToReplayText() writes the canonical lower-case format accepted by BraidSchedule.Parse(...) for schedules that can be represented (no whitespace inside worker or probe tokens).
var text = schedule.ToReplayText();
var replay = BraidSchedule.Parse(text);
An empty typed schedule exports to an empty string; BraidSchedule.Parse still requires at least one step for non-empty text.
Replay token
A replay token is the canonical replay text Braid emits and BraidSchedule.Parse accepts—the same format as text replay schedules above. There is no separate syntax.
hit worker-1 after-read
hit worker-2 after-read
After a failing run with a typed schedule configured, export the token with TryGetReplayText instead of parsing the exception's ToString() output:
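try
{
    // callback and options stand in for your own test body and BraidOptions.
    await Braid.RunAsync(callback, options);
}
catch (BraidRunException ex)
{
    if (ex.TryGetReplayText(out var token, out _))
    {
        var regression = BraidSchedule.Parse(token);
        // use regression in BraidOptions.Schedule
    }
}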
Workflow (random failure → probes → token → regression test): docs/replay-token-workflow.md.
True interleaving replay
BraidStep.Hit(worker, probe) releases a worker when it is blocked at that probe.
For stricter interleaving assertions, use the two-phase arrive/release flow:
- BraidStep.Arrive(worker, probe) waits until the worker reaches the probe and keeps it blocked.
- BraidStep.Release(worker, probe) releases a worker that was previously held by Arrive.
var options = new BraidOptions
{
    Iterations = 1,
    Schedule = BraidSchedule.Replay(
        BraidStep.Arrive("worker-1", "cache-hit"),
        BraidStep.Hit("worker-2", "mutation-done"),
        BraidStep.Release("worker-1", "cache-hit")),
};
This expresses: worker-1 is already blocked at cache-hit, worker-2 mutates state, then worker-1 resumes.
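A fuller sketch of the same idea, with worker bodies filled in, might look like the following. The test class, the shared variables, and the extra before-mutation probe are assumptions added for illustration: the extra probe holds worker-2 until worker-1 has arrived, so the order does not depend on how far a worker runs before its first probe. Worker naming by fork order is also assumed.

```csharp
using System.Threading.Tasks;
using Braid;
using Xunit;

public sealed class StaleReadReplayTests
{
    [Fact]
    public async Task ReaderResumesWithStaleValueAfterMutation()
    {
        var current = 0;    // shared state mutated by worker-2
        var observed = -1;  // value worker-1 carries past the probe

        var options = new BraidOptions
        {
            Iterations = 1,
            Schedule = BraidSchedule.Replay(
                BraidStep.Arrive("worker-1", "cache-hit"),
                BraidStep.Hit("worker-2", "before-mutation"),
                BraidStep.Hit("worker-2", "mutation-done"),
                BraidStep.Release("worker-1", "cache-hit")),
        };

        await Braid.RunAsync(
            async context =>
            {
                // worker-1 (assumed name): reads, then is held at cache-hit.
                context.Fork(async () =>
                {
                    var cached = current;
                    await BraidProbe.HitAsync("cache-hit");
                    observed = cached;
                });

                // worker-2 (assumed name): mutates only after being released
                // from before-mutation, which follows worker-1's arrival.
                context.Fork(async () =>
                {
                    await BraidProbe.HitAsync("before-mutation");
                    current = 1;
                    await BraidProbe.HitAsync("mutation-done");
                });

                await context.JoinAsync();
            },
            options);

        // Expected under the semantics described above: worker-1 read 0
        // before the mutation and resumed with that stale value afterwards.
        Assert.Equal(0, observed);
        Assert.Equal(1, current);
    }
}
```

Using Arrive rather than Hit for worker-1 is what lets the schedule state that the reader was already parked at cache-hit while the mutation happened.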
Failure reproduction
When a run fails, braid wraps scheduler and callback failures in BraidRunException where appropriate.
When a typed replay schedule was configured, failure reports include a replay token (canonical replay text) you can paste into BraidSchedule.Parse(...) for a stable regression test—unless the schedule cannot be exported (for example worker or probe names containing whitespace). Prefer BraidRunException.TryGetReplayText.
Reports also include scheduler-state diagnostics when the scheduler captured them: last matched replay step, workers waiting at probes, workers held after Arrive, and unused replay steps. Sections are omitted when empty.
For random scheduling only (no replay schedule in options), failure reports do not synthesize a full replay token; use the seed and trace to investigate, then capture a typed or text schedule once you understand the interleaving.
Failure reports include, where available:
- seed;
- iteration;
- replay schedule (when configured);
- replay text and diagnostics as described above;
- execution trace;
- original inner exception, when present.
Use the reported seed to reproduce random scheduling behavior. Once a race is understood, prefer a typed or text replay schedule for stable regression tests.
Example report:
braid run failed.
Seed: 12345
Iteration: 0
Schedule:
1. worker-1 @ after-read
2. worker-2 @ after-read
Replay text:
hit worker-1 after-read
hit worker-2 after-read
Last matched replay step:
2. hit worker-2 after-read
Trace:
1. worker-1 forked
2. worker-2 forked
3. worker-1 hit after-read
4. worker-1 released at after-read
Run lifecycle
- Braid.RunAsync awaits JoinAsync after the callback completes, so an explicit final JoinAsync is optional.
- BraidContext is valid only during the active Braid.RunAsync callback.
- A canceled CancellationToken passed to Braid.RunAsync is honored before the callback runs.
- Empty callbacks complete when no replay schedule is configured.
- Non-empty replay schedules must be fully consumed.
- BraidSchedule.Replay() with no steps is allowed for empty or probe-free runs.
- Nested Braid.RunAsync calls are not supported.
- Only one logical probe wait may be in flight per forked worker (including flowing child tasks on that worker); see docs/runtime-boundaries.md.
- Fork delegates must return a non-null Task.
- Probe names cannot be null, empty, or whitespace.
- Reusing one BraidOptions instance across independent runs is supported.
- Failure reports are scoped to the current run and iteration only.
When to use braid
Use braid when the bug is about the order of a small number of async operations and you can name the important scheduling points:
- cache, CAS, TTL, and state-machine library tests;
- race reproduction after a flaky failure is understood;
- regression tests where a specific interleaving should stay fixed;
- small async scenarios where explicit probes are acceptable.
Stress tests are still useful for discovering that something is flaky. Braid is useful when you want the failure to become deterministic, explainable, and replayable.
Implicit scheduling or runtime interception can cover code without manual probes, but it is harder to explain and is not braid's current model. Braid keeps the boundary explicit: code reaches a named BraidProbe.HitAsync, then the replay schedule decides which worker continues.
Featured examples
- Lost update: a read-modify-write race that fails under a four-step replay token; see examples/lost-update and docs/examples/lost-update.md.
- Cache / CAS race: a versioned cell where a stale compare-and-set must return VersionMismatch; uses Arrive/Hit/Release; see examples/cache-cas-race and docs/examples/cache-cas-race.md.
- Cancellation before observation: cancellation wins before an operation is recorded as observed; see examples/cancellation-before-observation and docs/examples/cancellation-before-observation.md.
Supplementary example: user operation limiter shows an unsafe read/check/write interleaving next to a lock-protected fix.
Current limitations
- Explicit probes are required.
- Await interception is not automatic.
- Braid is not a TaskScheduler replacement, does not rewrite binaries, and is not a distributed-system test framework; determinism stays probe-driven (see What braid does not do above).
- Exhaustive search is not implemented.
- Random-run failures do not automatically include a complete replay schedule unless you configured one in BraidOptions.Schedule.
| Product | Compatible and computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net10.0: No dependencies.