Butters 1.1.49
Butters
Butters is a dotnet tool that runs an autonomous development loop using GitHub Copilot CLI. It reads your project config from butters/butters.config.json and user stories from butters/butters.stories.json, evaluates their complexity, selects the right AI model for each, and iterates through them with an adaptive multi-pass cycle — all displayed in a live terminal dashboard.
You can generate your stories through a conversational AI interview wizard (butters generate) instead of writing them by hand. The wizard uses a two-stage pipeline — an Architect agent drafts a structured project plan, then a Decomposer agent translates that plan into user stories.
Requirements
- .NET 10 SDK
- GitHub Copilot CLI (gh extension install github/gh-copilot)
- A terminal at least 120 × 40 characters
Installation
As a global tool (from NuGet)
dotnet tool install --global Butters
As a local tool (per project)
dotnet new tool-manifest # creates .config/dotnet-tools.json
dotnet tool install --local Butters
From source
git clone https://github.com/DL87/Butters
cd Butters
dotnet pack src/Butters.Console -c Release -o ./nupkg
dotnet tool install --global --add-source ./nupkg Butters
Quick Start
There are two ways to get started: let the AI interview wizard create your stories, or write them by hand.
Option A — AI Story Generator (recommended)
Run this in your project root to launch a conversational story generation wizard:
butters generate
The wizard will:
- Ask how familiar you are with programming (adjusts its questions accordingly)
- Template selection — non-technical users are shown pre-built project templates (CRUD App, REST API, CLI Tool, Library) with guided scope and seed questions
- Import existing documents — optionally import a spec, requirements doc, or notes file as seed context for the interview
- Interview you about your project idea — what you want to build, who will use it, and any constraints
- Automatically pick the right tech stack (language, framework, commands) if you're non-technical
- Plan — an Architect agent drafts a structured project plan (goals, scope, phases, risks, open questions)
- Decompose — a Decomposer agent translates the plan into ordered user stories with dependencies; open questions become Spike stories that run first
- Let you review and refine the plan and stories in a feedback loop (plan-level feedback triggers re-planning; a diff summary shows what changed each round)
- Write butters/butters.plan.json, butters/plan.md, butters/butters.config.json, and butters/butters.stories.json
The wizard saves progress automatically — if interrupted, running butters generate again resumes where you left off.
Then run butters to start the loop.
Option B — Manual setup
1. Initialise a project
Run this in your project root:
butters init
This creates:
butters/
  butters.config.json   ← project context, MCP servers, skills
  butters.stories.json  ← your user stories
  prompt.md             ← instructions sent to the AI agent
  state/
    progress.txt        ← loop state (iteration, current story, pass)
It auto-detects your build and test commands (dotnet build, npm run build, etc.) and your current git branch.
2. Edit butters/butters.config.json and butters/butters.stories.json
Fill in your project config:
// butters/butters.config.json
{
  "branchName": "feature/my-feature",
  "taskDescription": "Build a greeting API",
  "projectContext": {
    "framework": "ASP.NET Core",
    "language": "C#",
    "buildCommand": "dotnet build",
    "testCommand": "dotnet test",
    "runCommand": "dotnet run"
  },
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": { "DATABASE_URL": "postgresql://localhost/mydb" }
    }
  },
  "useServers": ["github", "filesystem"],
  "useSkills": ["code-review"],
  "costMapping": {},
  "parallelLanes": 1,
  "consolidateEveryNStories": 0,
  "autoSplitOnFailures": 0
}
Then add your user stories:
// butters/butters.stories.json
{
  "userStories": [
    {
      "id": "US-001",
      "title": "Add greeting endpoint",
      "description": "Create a GET /greet endpoint that returns Hello, World!",
      "acceptanceCriteria": [
        "Endpoint returns 200 OK",
        "Response body is 'Hello, World!'"
      ],
      "priority": 1,
      "passes": false
    }
  ]
}
The MCP/Skills fields in the config are optional — omit them entirely if you don't need them.
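Before starting the loop, it can help to check which stories in a hand-written stories file are still open. The following is a minimal Python sketch, not part of Butters: the function name incomplete_stories, the extra example stories, and the rule that a lower priority number runs first are all assumptions made for illustration.

```python
import json

def incomplete_stories(stories_json: str) -> list[dict]:
    """Return stories whose "passes" flag is false, sorted by priority.

    Assumption: a lower "priority" number means the story is picked earlier.
    """
    data = json.loads(stories_json)
    pending = [s for s in data["userStories"] if not s.get("passes", False)]
    return sorted(pending, key=lambda s: s.get("priority", 0))

# Hypothetical example data in the butters.stories.json shape shown above.
example = """
{
  "userStories": [
    {"id": "US-002", "title": "Add farewell endpoint", "priority": 2, "passes": false},
    {"id": "US-001", "title": "Add greeting endpoint", "priority": 1, "passes": false},
    {"id": "US-000", "title": "Project skeleton", "priority": 0, "passes": true}
  ]
}
"""

for story in incomplete_stories(example):
    print(story["id"], story["title"])
```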
3. Run
butters
The terminal UI launches and you can start, pause, resume, or stop the loop from within it.
How It Works
Complexity evaluation and dependency detection
Before starting a new story, Butters evaluates it using a lightweight model (Haiku). It classifies the story as Simple, Standard, or Complex to select the right models and pass count — and it also analyses the other incomplete stories to detect which ones this story directly depends on.
Detected dependencies are written to butters.stories.json automatically. If you've already set dependsOn manually, your values are never overwritten. Detected dependencies are respected by the parallel lane scheduler — a story is never assigned to a lane until all its dependencies are complete.
Adaptive pass cycle
| Complexity | Type | Passes | Flow |
|---|---|---|---|
| Simple | Normal | 2 | BUILD → VALIDATE |
| Standard | Normal | 4 | TEST → BUILD → REFINE → VALIDATE |
| Complex | Normal | 5 | TEST → BUILD → REFINE → HARDEN → VALIDATE |
| Any | Spike | 1 | RESEARCH |
| Pass | Purpose |
|---|---|
| TEST | Writes failing tests from the story's acceptance criteria before any implementation (red phase of TDD) |
| BUILD | Implements the story from scratch and verifies it compiles |
| REFINE | Fixes errors, improves quality, runs tests |
| HARDEN | Edge cases, error handling, input validation (complex stories only) |
| VALIDATE | Verifies every acceptance criterion; marks the story complete only if all pass |
| RESEARCH | Investigates a technical question and documents findings in butters/learnings.md (Spike stories only — auto-completes after this pass) |
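The two tables above amount to a lookup from complexity and story type to an ordered list of passes. A minimal Python sketch of that mapping follows; the names PASS_FLOWS and passes_for are illustrative and are not taken from the Butters source.

```python
# Pass sequences copied from the tables above.
PASS_FLOWS = {
    "Simple": ["BUILD", "VALIDATE"],
    "Standard": ["TEST", "BUILD", "REFINE", "VALIDATE"],
    "Complex": ["TEST", "BUILD", "REFINE", "HARDEN", "VALIDATE"],
}

def passes_for(complexity: str, story_type: str = "Normal") -> list[str]:
    """Return the ordered passes for a story.

    Spike stories get a single RESEARCH pass regardless of complexity.
    """
    if story_type == "Spike":
        return ["RESEARCH"]
    return list(PASS_FLOWS[complexity])

print(passes_for("Standard"))
print(passes_for("Complex", "Spike"))
```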
Model escalation on failure
When a pass fails, the next retry automatically uses a more powerful model instead of repeating with the same one. The escalation chain is: haiku → gpt-5-mini → sonnet → opus. On success, escalation resets so the next pass starts with its normal model.
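As a rough illustration of the escalation rule, the Python sketch below picks a model from the chain based on how many times the current pass has failed in a row. It assumes that retries stay on opus once the top of the chain is reached, which the description above does not specify; model_for_attempt is a hypothetical name.

```python
ESCALATION_CHAIN = ["haiku", "gpt-5-mini", "sonnet", "opus"]

def model_for_attempt(normal_model: str, consecutive_failures: int) -> str:
    """Pick the model for the next attempt of a pass.

    Each consecutive failure moves one step up the chain, starting from the
    pass's normal model. Assumption: the top of the chain is sticky.
    With zero failures (e.g. after a success) the normal model is used.
    """
    start = ESCALATION_CHAIN.index(normal_model)
    index = min(start + consecutive_failures, len(ESCALATION_CHAIN) - 1)
    return ESCALATION_CHAIN[index]

for failures in range(5):
    print(failures, model_for_attempt("haiku", failures))
```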
Learnings feedback
Learnings from previous stories (butters/learnings.md) are injected into every prompt, so the agent avoids repeating past mistakes and applies discovered conventions.
MCP servers and skills
Butters supports two ways to give the Copilot agent access to MCP servers and skills:
Inline definitions (mcpServers) — for servers not already installed anywhere. Butters writes these to .vscode/mcp.json in your project before the loop starts (prefixed with butters- so they don't collide with your own entries). gh copilot discovers and starts them automatically.
"mcpServers": {
  "postgres": {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-postgres"],
    "env": { "DATABASE_URL": "postgresql://localhost/mydb" }
  }
}
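The butters- prefix mentioned above is a simple key rename. The Python sketch below shows only that renaming step; it makes no claim about the surrounding layout of .vscode/mcp.json, and with_butters_prefix is a hypothetical helper name.

```python
def with_butters_prefix(mcp_servers: dict) -> dict:
    """Return a copy of the inline server map with 'butters-' prefixed keys."""
    return {f"butters-{name}": config for name, config in mcp_servers.items()}

inline = {
    "postgres": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-postgres"],
        "env": {"DATABASE_URL": "postgresql://localhost/mydb"},
    }
}

print(list(with_butters_prefix(inline)))
```

Because every generated key carries the prefix, an existing entry named postgres in your own configuration would be left alone.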
Name-only references (useServers, useSkills) — for servers and skills already installed globally or at the workspace level. Butters injects these into every prompt so the agent knows to use them.
"useServers": ["github", "filesystem"],
"useSkills": ["code-review"]
All three fields are optional and can be omitted entirely.
Git checkpoints
When running in a git repo, Butters creates a checkpoint commit before each pass. On success, the checkpoint is amended into a clean commit. On failure, it rolls back to the checkpoint so retries start from a clean state.
If a story fails VALIDATE three times in a row, Butters rolls back all its changes and blocks the story automatically. Blocked stories are skipped and can be reviewed or manually unblocked in butters.stories.json.
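The rollback and blocking rule can be pictured as a small counter per story. The following Python sketch is an illustration under assumptions, not the Butters implementation: StoryState, record_validate, and the action strings are invented names, and the default of 3 mirrors the "three times in a row" rule described above.

```python
from dataclasses import dataclass

@dataclass
class StoryState:
    story_id: str
    validate_failures: int = 0
    blocked: bool = False
    complete: bool = False

def record_validate(state: StoryState, passed: bool, max_retries: int = 3) -> str:
    """Update the story after one VALIDATE attempt and return the git action."""
    if passed:
        state.validate_failures = 0
        state.complete = True
        return "amend-checkpoint"            # success: clean commit
    state.validate_failures += 1
    if state.validate_failures >= max_retries:
        state.blocked = True
        return "rollback-all-and-block"      # give up on this story
    return "rollback-to-checkpoint"          # retry from a clean state

state = StoryState("US-001")
for _ in range(3):
    print(record_validate(state, passed=False))
print("blocked:", state.blocked)
```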
Pre-flight checks
Before starting the loop, Butters runs a set of pre-flight checks and logs the results:
| Check | Type | What it verifies |
|---|---|---|
| gh CLI | Critical | gh is installed and on PATH |
| Copilot extension | Critical | gh copilot is available |
| Build command | Critical | The configured build command exits 0 |
| Git repo | Info | Whether checkpoints are available |
| Test command | Warning | The configured test command is runnable |
| MCP servers | Warning | Any configured MCP server commands are on PATH |
Critical failures halt the loop before any work begins. Warnings are logged but do not block execution.
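A minimal Python sketch of the severity rule follows: only failed Critical checks stop the loop. The CheckResult type and both function names are assumptions for illustration, and the sketch covers only the "is the command on PATH" style of check, not the build or test runs.

```python
import shutil
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    severity: str   # "Critical", "Warning" or "Info", as in the table above
    passed: bool

def command_check(name: str, command: str, severity: str) -> CheckResult:
    """Pass if `command` resolves to an executable on PATH."""
    return CheckResult(name, severity, shutil.which(command) is not None)

def may_start(results: list[CheckResult]) -> bool:
    """The loop may start unless a Critical check failed."""
    return not any(r.severity == "Critical" and not r.passed for r in results)

results = [
    command_check("gh CLI", "gh", "Critical"),
    command_check("MCP servers", "npx", "Warning"),
]
for r in results:
    print(f"{r.name}: {'ok' if r.passed else 'FAILED'} ({r.severity})")
print("start loop" if may_start(results) else "halt")
```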
Parallel lanes
Set parallelLanes to 2–4 to work on multiple independent stories simultaneously. Butters assigns each story its own lane and rotates through them in round-robin — so while one story waits for a model response, others keep progressing. Stories with unmet dependencies are never picked until their dependencies are complete.
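The scheduling rule can be sketched in a few lines of Python. This is an illustration under assumptions rather than the actual scheduler: it treats stories as dictionaries with the id and dependsOn fields from butters.stories.json, ignores blocked stories, and uses invented function names.

```python
from itertools import cycle, islice

def assign_lanes(stories: list[dict], completed: set[str], lanes: int) -> list[str]:
    """Pick up to `lanes` open stories whose dependencies are all complete."""
    ready = [
        s["id"]
        for s in stories
        if s["id"] not in completed
        and all(dep in completed for dep in s.get("dependsOn", []))
    ]
    return ready[:lanes]

def round_robin(lane_ids: list[str], steps: int) -> list[str]:
    """Order in which lanes get their next turn."""
    return list(islice(cycle(lane_ids), steps))

stories = [
    {"id": "US-001"},
    {"id": "US-002", "dependsOn": ["US-001"]},
    {"id": "US-003"},
]
lanes = assign_lanes(stories, completed=set(), lanes=2)
print(lanes)
print(round_robin(lanes, 5))
```

US-002 only becomes eligible once US-001 is in the completed set, which is why it is skipped in the first assignment.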
Learnings consolidation
Set consolidateEveryNStories to a positive number (e.g. 5) to automatically consolidate butters/learnings.md every N completed stories. A lightweight model deduplicates and compresses accumulated learnings so the prompt stays focused. The original is backed up to learnings.md.bak.
After consolidation, Butters optionally re-plans any remaining stories using the updated learnings. The Architect and Decomposer re-run with the fresher context; completed stories are never touched. Active lanes are evicted and reassigned from the revised story list. A replanCount field in butters/state/progress.txt tracks how many re-plans have occurred.
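The "every N completed stories" trigger reduces to a modulo check. The Python sketch below shows one plausible reading, with 0 meaning the feature is off; should_consolidate is a hypothetical name and the exact trigger used by Butters may differ.

```python
def should_consolidate(completed_count: int, every_n: int) -> bool:
    """True when a consolidation is due after `completed_count` stories."""
    if every_n <= 0:          # 0 = consolidation disabled
        return False
    return completed_count > 0 and completed_count % every_n == 0

for done in range(1, 11):
    if should_consolidate(done, every_n=5):
        print(f"consolidate after story {done}")
```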
Auto-split on failures
Set autoSplitOnFailures to a positive number (e.g. 3) to have a lightweight model split a story into 2–3 smaller ones once it has failed that many times. The original story is blocked and the new stories are added to the backlog.
Budget guardrails
Each model has a cost weight (opus=10, sonnet=5, gpt-5-mini=2, haiku=1). Set costMapping in your config to track real currency costs per invocation alongside weight-based costs. When a BudgetLimit is set, the loop automatically pauses when the limit is reached — you can resume manually to keep going.
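To make the weight arithmetic concrete, here is a small Python sketch that sums the weights of a run and checks them against a limit. It assumes the limit is compared against the weight total and that the loop pauses as soon as the total reaches it; the function names are illustrative.

```python
# Cost weights as listed above.
COST_WEIGHTS = {"opus": 10, "sonnet": 5, "gpt-5-mini": 2, "haiku": 1}

def total_weight(invocations: list[str]) -> int:
    """Sum the cost weights of the models invoked so far."""
    return sum(COST_WEIGHTS[model] for model in invocations)

def should_pause(invocations: list[str], budget_limit: int) -> bool:
    """Pause once the accumulated weight reaches the limit (0 = unlimited)."""
    return budget_limit > 0 and total_weight(invocations) >= budget_limit

run = ["haiku", "sonnet", "sonnet", "opus"]   # 1 + 5 + 5 + 10
print(total_weight(run))
print(should_pause(run, budget_limit=20))
```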
Changelog auto-generation
When all stories complete, Butters automatically generates a CHANGELOG.md in your project root summarising every completed story. No manual action needed.
Story quick actions (removed in latest release)
Story management (blocking, priority, manual completion) is now handled by editing butters.stories.json directly. The dashboard focuses on monitoring and loop control.
Dashboard
The terminal dashboard shows three live panels at the top:
- Progress — open vs. done story counts, loop status, current story
- Config — active models for each pass tier
- Tools — configured MCP servers and skills
Below that is a scrollable Output log. Use Page Up / Page Down to scroll through the full history.
The action bar at the bottom adapts to the current state:
| State | Buttons shown |
|---|---|
| No stories loaded | [ Create Stories ] |
| Stories exist, loop idle | [ Start ] [ Evaluate Stories ] |
| Loop running | [ Pause ] [ Stop ] |
| Loop paused | [ Resume ] [ Stop ] |
| Evaluating | (status text) [ Stop ] |
[ Create Stories ] opens the AI story generation wizard in a terminal window and disappears once stories are present.
Each pass invokes gh copilot --yolo with your prompt and the current story context. Progress is persisted to butters/state/progress.txt so the loop can be safely stopped and resumed at any point.
Story Generator deep-dive
butters generate is a fully interactive wizard that interviews you and writes both config and stories automatically. It uses a two-stage agent pipeline so that planning and story decomposition are separate, reviewable steps. The wizard saves session state after every step — if you close the terminal, re-running butters generate resumes from where you left off.
Wizard flow
- Skill check: you choose between non-technical (Butters picks the stack) and technical (Butters asks about language, framework, commands)
- Template selection (non-technical only): pick from pre-built project templates to seed the interview with relevant questions and scope hints, or skip to describe your own idea from scratch
- Import existing document (optional): enter a path to a spec, requirements, or notes file; the first 8000 characters are injected as context before the interview begins. Skip to proceed without importing.
- Freeform interview: a multi-turn AI conversation to gather project scope, users, core features, and constraints
- Planning (Architect agent): synthesises the interview into a structured plan: goals, scope (in/out), personas, architecture notes, phases, risks, assumptions, and open questions. Saved as butters/butters.plan.json with a human-readable butters/plan.md rendered alongside it.
- Decomposition (Decomposer agent): translates the plan into ordered user stories with IDs, descriptions, acceptance criteria, priorities, and dependency chains. Stories follow the plan's phase ordering and stay within its scope boundaries. Open questions in the plan are turned into Spike stories that run first.
- Refinement loop: review the draft stories and provide written feedback; repeat as many times as needed. Feedback containing plan-level keywords (e.g. "scope", "architecture", "goal") automatically re-runs both agents; other feedback only re-runs the Decomposer. A plan diff summary is shown each round so you can see what changed.
- Acceptance: type done (or press Accept) to write the final stories to butters/
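Two of the rules in this flow are mechanical enough to sketch in Python: truncating an imported document to its first 8000 characters, and routing feedback by keyword. The keyword tuple below contains only the three examples named above, so the real list is likely longer; the function names are hypothetical.

```python
# Only the example keywords given in the text; the full list is an unknown.
PLAN_KEYWORDS = ("scope", "architecture", "goal")

def seed_context(document_text: str, limit: int = 8000) -> str:
    """Keep the first `limit` characters of an imported document."""
    return document_text[:limit]

def agents_to_rerun(feedback: str) -> list[str]:
    """Plan-level feedback re-runs both agents; anything else only the Decomposer."""
    text = feedback.lower()
    if any(keyword in text for keyword in PLAN_KEYWORDS):
        return ["Architect", "Decomposer"]
    return ["Decomposer"]

print(agents_to_rerun("Please narrow the scope to the admin UI"))
print(agents_to_rerun("Split story 3 into two stories"))
print(len(seed_context("x" * 9000)))
```

A plain substring match like this is deliberately crude; it would also fire on words that merely contain a keyword.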
Domain templates
Non-technical users are offered four built-in templates to choose from:
| Template | Description |
|---|---|
| CRUD App | Create, read, update, delete — task managers, contact books, inventory systems |
| REST API | Backend API with endpoints, authentication, and OpenAPI docs |
| CLI Tool | Command-line program with subcommands, flags, and text output |
| Library / Package | Reusable code packaged for other developers (NuGet, npm) |
Each template pre-loads domain-specific seed questions and scope hints into the interview so the Architect has rich context from the first message.
What gets written
| File | Contents |
|---|---|
| butters/butters.plan.json | Structured project plan (JSON source of truth) |
| butters/plan.md | Human-readable plan (auto-rendered from JSON) |
| butters/butters.config.json | Branch, task description, detected tech stack, build/test/run commands |
| butters/butters.stories.json | All generated user stories |
| butters/prompt.md | Default agent instructions (if not already present) |
| .github/ | Story-generator agent scaffold (if not already present) |
| butters/state/generation_session.json | Auto-saved wizard session (deleted on completion) |
| butters/state/plan.round-N.md | Plan snapshots taken before each re-plan (last 5 kept) |
Existing files are backed up to .bak before being overwritten.
The plan is also referenced during the iteration loop — if butters/plan.md exists, the autonomous agent reads it before each story for architectural alignment.
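The "last 5 kept" rule for plan snapshots can be illustrated with a short Python sketch that decides which snapshot files would be pruned. It assumes snapshots are named plan.round-N.md with a numeric N, as in the table above; snapshots_to_prune is a hypothetical helper and nothing is deleted here.

```python
import re

_ROUND = re.compile(r"^plan\.round-(\d+)\.md$")

def snapshots_to_prune(filenames: list[str], keep: int = 5) -> list[str]:
    """Return the oldest snapshot names beyond the newest `keep` rounds."""
    rounds = sorted(
        (int(m.group(1)), name)
        for name in filenames
        if (m := _ROUND.match(name))
    )
    excess = max(len(rounds) - keep, 0)
    return [name for _, name in rounds[:excess]]

names = [f"plan.round-{n}.md" for n in range(1, 8)] + ["plan.md"]
print(snapshots_to_prune(names))
```

Sorting by the parsed round number rather than by file name keeps round 10 after round 9.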
Configuration
Default values are used unless overridden. To customise, inject AppConfig via DI or edit the defaults in AppConfig.cs:
| Setting | Default | Description |
|---|---|---|
| ModelTier1 | claude-sonnet-4.6 | Model for BUILD pass (standard stories) |
| ModelTier2 | claude-haiku-4.5 | Model for TEST/RESEARCH passes and complexity evaluation |
| ModelTier3 | gpt-5-mini | Model for VALIDATE pass |
| ModelSimple | claude-haiku-4.5 | Model used for all passes on simple stories |
| ModelComplex | claude-opus-4.6 | Model used for BUILD/REFINE/HARDEN on complex stories |
| TimeoutSeconds | 7200 | Hard timeout per pass in seconds (2 hours) |
| MaxConsecutiveFailures | 3 | Stop loop after this many consecutive failures |
| MaxValidateRetries | 3 | Roll back and block a story after this many VALIDATE failures |
| BudgetLimit | 0 | Cost budget (0 = unlimited). Pauses loop when exceeded |
| ContinueOnError | false | Keep running after max consecutive failures |
| DemoMode | false | Simulate passes without calling Copilot CLI |
| ParallelLanes | 1 | Number of stories to work on simultaneously (1–4) |
| ConsolidateEveryNStories | 0 | Deduplicate and compress learnings.md every N completed stories (0 = off) |
| AutoSplitOnFailures | 0 | Auto-split a story into smaller ones after N failures (0 = off) |
Project Structure (source)
src/
  Butters.Core/             ← reusable class library
    Models/                 ← Prd, UserStory (+ StoryType), ProgressState, AppConfig,
                              ModelUsageEntry, GenerationSession, Plan, PreflightResult,
                              DashboardSnapshot, McpServerConfig, UiPreferences, IterationResult
    Services/               ← IterationService, PrdService, ProgressService, CopilotCliService,
                              LearningsService, ModelUsageService, GitService, ExportService,
                              McpConfigService, StoryGenerationService, ChangelogService,
                              PlanService, PreflightService, ReplanService,
                              SessionPersistenceService, UiPreferencesService
    Components/             ← Razor terminal UI (Dashboard, App, GenerateView)
    Assets/                 ← architect.agent.md, decomposer.agent.md,
                              generation-defaults.json (stacks + templates)
    ButtersInitializer.cs   ← scaffolds butters/ folder and story-generator agent
    ButtersPathResolver.cs  ← locates the butters/ folder at runtime
    Extensions/             ← AddButters() DI extension
  Butters.Console/          ← thin dotnet tool host
    Program.cs              ← handles 'init', 'generate', sizes terminal, starts host
Butters.slnx
Commands
| Command | Description |
|---|---|
| butters init | Scaffold butters/ in the current directory |
| butters generate | Launch the AI story generation wizard |
| butters | Launch the terminal dashboard and iteration loop |
| butters --demo | Run with simulated passes (no Copilot CLI calls) |
| butters --free | Force all passes to use the free 0x model |
| butters --version | Print the installed version |
Resetting Progress
Delete or edit butters/state/progress.txt to reset the loop. Set status back to not_started and iteration to 0 to start from the beginning.
iteration: 0
status: not_started
current_story: none
current_pass: 1
completed_stories: []
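If you reset often, the edit can be scripted. The Python sketch below writes exactly the five fields shown above; it assumes the file is plain "key: value" lines and that no other fields (such as a re-plan counter) need to be preserved, so treat it as a starting point rather than a supported tool.

```python
from pathlib import Path

# The reset values shown above, in the same order.
RESET_FIELDS = {
    "iteration": 0,
    "status": "not_started",
    "current_story": "none",
    "current_pass": 1,
    "completed_stories": "[]",
}

def render_progress(fields: dict) -> str:
    """Render fields as one 'key: value' line each."""
    return "".join(f"{key}: {value}\n" for key, value in fields.items())

def reset_progress(path: Path = Path("butters/state/progress.txt")) -> None:
    """Overwrite the progress file with the reset state."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_progress(RESET_FIELDS), encoding="utf-8")

print(render_progress(RESET_FIELDS), end="")
```

Call reset_progress() from your project root to apply it; the script only prints the rendered text when run as shown.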