Engineering Playbook · AI-Assisted Development

The Aigile Workflow

A spec-driven, agent-orchestrated development system that enforces TDD, adversarial review, and quality gates at every step — from idea to merged PR.
Commands: /plan_to_build · /build · /bug_to_pr
Platform: Any IDE · Any Git Host
Stack: Any stack · project.json-driven
Version: February 2026
Introduction

Three Commands, One Pipeline

You describe what to build. The system plans it, builds it with tests, reviews it adversarially, and produces a mergeable PR — while you get coffee.

/plan_to_build → specs/*.md
Investigates the codebase, explores design options with you, and writes an exhaustive implementation spec.

/build → working code
Reads the spec, dispatches specialist agents, enforces TDD, validates every task, and rolls back on failure.

/bug_to_pr → merged PR
Triages the bug, routes it to a specialist, fixes it with the full build pipeline, then runs adversarial review and the merge gate.

For the unfamiliar
Think of this as a self-managing engineering team. You play the role of product owner — you describe what needs to happen and make final decisions. The system handles investigation, planning, coding, testing, reviewing, and creating the PR. Each command produces a real artifact you can inspect and modify.
For the architect
This is a spec-driven, stateless multi-agent orchestration system. The spec file is the coordination artifact — it compensates for the absence of native TaskCreate/resume primitives in Copilot. Quality gates are enforced at three levels: prompt instructions (soft), postToolUse hooks (system), and agentStop hooks (system). The engineering philosophy is embedded non-ignorably in the prompts rather than living in lazy-loaded skill files.

The Full Architecture

End-to-end system overview
flowchart LR
  DEV([👤 You]) -->|describe feature| PTB[/plan_to_build/]
  PTB -->|explores codebase\nasks approach + team| SPEC[(specs/*.md)]
  SPEC -->|spec is airtight| BLD[/build/]
  BLD -->|TDD + validators\nper task| CODE[working code\n+ tests passing]
  DEV -->|describe bug| BTP[/bug_to_pr/]
  BTP -->|triage → route\nplan → build| PR[GitHub PR\n+ artifacts]
  PR -->|adversarial\nreview| GATE{both\napprove?}
  GATE -->|yes + you confirm| MERGE[merged to main]
  GATE -->|no| FIX[fix cycle\nmax 2]
  FIX --> PR
  subgraph HOOKS ["🪝 Always-on guardrails"]
    H1[sessionStart\nauto-install deps]
    H2[postToolUse\nruff · tsc · sections]
    H3[agentStop\nspec completeness]
  end
  style DEV fill:#faf8f3,stroke:#1a1a1a,color:#1a1a1a
  style PTB fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
  style SPEC fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
  style BLD fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
  style CODE fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
  style BTP fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
  style PR fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
  style GATE fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style MERGE fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
  style FIX fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style HOOKS fill:#faf8f3,stroke:#d4cfc2,color:#1a1a1a

Command 01

Plan to Build

/plan_to_build

Turns a requirement into an airtight implementation spec that stateless agents can execute without asking questions.

The most important insight in this system: the spec is the contract. Everything downstream — builders, validators, reviewers — works from the spec and nothing else. If the spec is vague, the build will be wrong. If the spec is exhaustive, the build will be right. /plan_to_build exists to make specs exhaustive.

"The intelligence is in the spec. The build prompt is a mechanical dispatcher." Design principle

What It Does

  • 01
    Explore Before Planning

    For features and enhancements, asks you one multiple-choice question about the technical approach before writing anything. Forces alternatives to be considered. Skipped for bug fixes, chores, and refactors.

  • 02
    Team Composition

    Asks how work should be split: single builder, two builders by layer, or three builders by area. Your answer shapes the task breakdown and which workstreams run independently.

  • 03
    Codebase Investigation

    Reads relevant source files directly to understand existing patterns before designing anything. Agents that don't read the code produce plans that don't fit the codebase.

  • 04
    Exhaustive Spec Writing

    Each task gets a description of at least 50 words covering: what to do (step by step), files to modify (exact paths), code patterns to follow, acceptance criteria (specific, verifiable), and a validation command. One-line task descriptions are forbidden.

  • 05
    Self-Audit + Mandatory Verify

    Before saving, counts builder tasks, validator tasks, descriptions, and assertions. Then runs ls -la specs/file.md to confirm the file exists, and a grep -c check that the section count equals 7. Cannot report done until both pass.
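
The mandatory verify step above can be sketched in Python. This is a minimal illustration rather than the playbook's actual script: the function name `verify_spec` is hypothetical, the required section names are passed in by the caller because the source does not list them, and it assumes sections are written as `## ` headings.

```python
from pathlib import Path


def verify_spec(spec_path: str, required_sections: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the spec passes."""
    path = Path(spec_path)
    if not path.is_file():
        # Equivalent of the `ls -la` existence check.
        return [f"spec file not found: {spec_path}"]

    # Assumption: each section is a Markdown-style "## " heading line.
    headings = {
        line[3:].strip()
        for line in path.read_text(encoding="utf-8").splitlines()
        if line.startswith("## ")
    }
    # Equivalent of the section count check: every required section must appear.
    return [
        f"missing section: {name}"
        for name in required_sections
        if name not in headings
    ]
```

Returning the list of problems, instead of a bare pass/fail flag, gives the planner something concrete to fix before it re-verifies.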

Flow Diagram

plan_to_build internal flow
flowchart TD
  IN([User types /plan_to_build]) --> CHK
  CHK{feature or\nenhancement?}
  CHK -->|yes| BRAIN["Prerequisite 1\nExplore Before Planning\n- one question, multiple choice\n- wait for answer"]
  CHK -->|no: bug/chore| SKIP["skip brainstorming\ndocument why in Notes"]
  BRAIN --> TEAM["Prerequisite 2\nTeam Composition\n(a) single builder\n(b) two builders by layer\n(c) three builders by area"]
  SKIP --> TEAM
  TEAM --> READ["Read codebase\nexisting patterns\narchitecture\nrelevant files"]
  READ --> DESIGN["Design solution\narchitecture decisions\ntask breakdown\ndependency graph"]
  DESIGN --> WRITE["Write spec\nto specs/name.md\n\nEach task:\n- min 50 word description\n- exact file paths\n- code patterns\n- acceptance criteria\n- validation command"]
  WRITE --> AUDIT["Self-audit\ncount builders/validators\ncheck descriptions\ncheck assertions"]
  AUDIT --> VERIFY["Mandatory verify\nls -la specs/name.md\ngrep -c sections == 7"]
  VERIFY -->|fail| FIX_SPEC["fix spec\nand re-verify"]
  FIX_SPEC --> VERIFY
  VERIFY -->|pass| HOOK["postToolUse hook\nvalidates sections\nvalidator frequency\nblocks if incomplete"]
  HOOK -->|pass| REPORT["Report\nfile path · tasks · team\nExecute with /build"]
  REPORT --> EXEC_DIRECTIVE["EXECUTION DIRECTIVE\nin spec file:\nFORBIDDEN: direct implementation\nREQUIRED: use /build"]
  style IN fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
  style CHK fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
  style BRAIN fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
  style TEAM fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
  style SKIP fill:#f5f2eb,stroke:#d4cfc2,color:#7a7468
  style READ fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
  style DESIGN fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
  style WRITE fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style AUDIT fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
  style VERIFY fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
  style FIX_SPEC fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style HOOK fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style REPORT fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
  style EXEC_DIRECTIVE fill:#f7ece8,stroke:#c84b2f,color:#5a1f12

Task Quality Rules

These rules are embedded directly in the prompt, not in a skill file that might not load, so the model cannot avoid reading them.

| Rule | What it prevents | Enforced by |
| --- | --- | --- |
| min 50 word descriptions | Vague tasks that builders can't execute without asking questions | system: hook blocks spec write |
| 2–5 min task size | Tasks too large for a stateless agent to complete reliably | prompt: self-audit step |
| Design assertions | Weak acceptance criteria like "it works" instead of verifiable checks | prompt: task format template |
| Intermediate validators | Regressions in earlier work only caught at the end | system: hook, >5 builders → validators required |
| 7 required sections | Incomplete specs that confuse the build orchestrator | system: postToolUse blocks write |
| EXECUTION DIRECTIVE | Main agent implementing code directly instead of delegating | prompt: embedded in every spec |
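
As a rough illustration of the first rule, a task audit could look like the sketch below. The `Task` dataclass, its field names, and `audit_tasks` are hypothetical names chosen for this example; the source only states that descriptions need at least 50 words and that each task carries a validation command.

```python
from dataclasses import dataclass

MIN_DESCRIPTION_WORDS = 50  # from the "min 50 word descriptions" rule


@dataclass
class Task:
    # Hypothetical shape of a parsed spec task; field names are assumptions.
    task_id: str
    description: str
    validation_command: str


def audit_tasks(tasks: list[Task]) -> list[str]:
    """Return one message per rule violation; empty means every task passes."""
    problems = []
    for task in tasks:
        words = len(task.description.split())
        if words < MIN_DESCRIPTION_WORDS:
            problems.append(
                f"{task.task_id}: description has {words} words, "
                f"need {MIN_DESCRIPTION_WORDS}"
            )
        if not task.validation_command.strip():
            problems.append(f"{task.task_id}: no validation command")
    return problems
```

A word count is a crude proxy for an executable task description, which is presumably why the playbook pairs it with design assertions and a task format template.
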
Command 02

Build

/build

A mechanical dispatcher. Reads the spec, executes tasks in dependency order, enforces TDD on every builder, validates every task, rolls back on failure.

The build prompt makes no decisions. It reads what the spec says and executes it. This is intentional — all intelligence was front-loaded into the spec by /plan_to_build. The build prompt is simple enough to be audited in five minutes and trusted completely.

Execution Loop

build orchestration loop
flowchart TD
  START([/build\nspecs/plan.md]) --> PARSE["Parse spec\nextract tasks\nbuild todo list"]
  PARSE --> LOOP
  subgraph LOOP ["Task loop: sequential, dependency-ordered"]
    TASK["Next unblocked task"] --> BUILDER
    subgraph BUILDER_PHASE ["Builder dispatch"]
      BUILDER["runSubagent('builder')\n+ TDD preamble:\n 1. Write failing test RED\n 2. Implement GREEN\n 3. Refactor\n+ task description verbatim"]
    end
    BUILDER --> VAL["runSubagent('validator')\nrun commands\nshow actual output\nPASS or FAIL"]
    VAL -->|"✅ PASS"| MARK["mark completed\nnext task"]
    MARK --> TASK
    VAL -->|"❌ FAIL cycle 1"| DEBUG["runSubagent('builder')\nDEBUG protocol:\n1. Reproduce\n2. Isolate\n3. Root cause\n4. Fix\n(no random changes)"]
    DEBUG --> VAL2["re-validate"]
    VAL2 -->|"✅ PASS"| MARK
    VAL2 -->|"❌ FAIL cycle 2"| ROLLBACK
    ROLLBACK["runSubagent('builder')\ngit checkout -- files\nverify with git diff\nlog + continue"]
    ROLLBACK --> MARK
  end
  LOOP --> CKPT["Checkpoint every 3 tasks\nreport to user\ncourse-correct?"]
  CKPT --> LOOP
  LOOP --> FINAL["validate-all\nrunSubagent('validator')\nall commands + criteria\nactual output required\nnever say done without proof"]
  FINAL -->|pass| REPORT["Build Complete\ntask table\nactual command output\nfiles changed"]
  style START fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
  style PARSE fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
  style BUILDER fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
  style VAL fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
  style MARK fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
  style TASK fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
  style DEBUG fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
  style VAL2 fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
  style ROLLBACK fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
  style CKPT fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
  style FINAL fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
  style REPORT fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a

The Rules the Build Prompt Never Breaks

| Rule | Why it exists |
| --- | --- |
| NEVER implement code yourself | Orchestrator has no write tools. All code goes through builder subagents. No exceptions. |
| NEVER skip validation | Every builder task is followed by a validator dispatch. No exceptions. |
| NEVER say done without proof | Final report must include actual command output: not "tests passed" but the actual output. |
| NEVER run validation yourself | If validator can't run commands, report the failure. Don't take over validation. Orchestrator running commands is a pipeline integrity failure. |
| Max 2 fix cycles per task | Prevents infinite loops. After 2 failures, rollback and continue; never leave broken code. |
| Rollback on exhausted cycles | git checkout -- files on failure. Verify with git diff. Broken code never reaches the next task. |
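
The per-task control flow described in this section (build, validate, one debug pass, re-validate, rollback) can be sketched as follows. This is an illustrative sketch, not the build prompt itself: `run_task` and its callable parameters are hypothetical stand-ins for the builder and validator subagent dispatches, and the status strings are invented for the example.

```python
from typing import Callable

MAX_FIX_CYCLES = 2  # after the second failed validation the task is rolled back


def run_task(
    task: str,
    build: Callable[[str], None],
    validate: Callable[[str], bool],
    debug: Callable[[str], None],
    rollback: Callable[[str], None],
) -> str:
    """Run one task through build, validate, debug and rollback."""
    build(task)  # builder dispatch with the TDD preamble
    for cycle in range(1, MAX_FIX_CYCLES + 1):
        if validate(task):  # validator dispatch: PASS or FAIL
            return "completed"
        if cycle < MAX_FIX_CYCLES:
            debug(task)  # reproduce, isolate, root cause, fix
    rollback(task)  # git checkout -- files, then verify with git diff
    return "rolled_back"


def run_build(tasks, build, validate, debug, rollback) -> dict[str, str]:
    """Run tasks in the given (dependency) order and record each outcome."""
    return {t: run_task(t, build, validate, debug, rollback) for t in tasks}
```

Passing the four steps in as callables keeps the loop itself free of any code-writing or command-running logic, which mirrors the rule that the orchestrator only dispatches.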

The Builder's TDD Contract

Every builder dispatch includes this preamble — the model cannot skip it:

TDD preamble — injected into every builder prompt
1. Write a FAILING test that covers the acceptance criteria. Run it. Confirm RED.
2. Write the MINIMAL implementation to make it pass. Run it. Confirm GREEN.
3. Refactor if needed. Run again. Confirm still GREEN.
4. For commands >30 seconds, note estimated duration.
5. Never make random changes hoping to fix issues. Understand WHY before changing code.
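
One way to picture the injection is a small helper that prepends the preamble to the task description. The sketch below is an assumption about the mechanics, since the source does not show how the prompt is assembled; `builder_prompt` is a hypothetical name, and the preamble text is copied from the list above.

```python
TDD_PREAMBLE = """\
1. Write a FAILING test that covers the acceptance criteria. Run it. Confirm RED.
2. Write the MINIMAL implementation to make it pass. Run it. Confirm GREEN.
3. Refactor if needed. Run again. Confirm still GREEN.
4. For commands >30 seconds, note estimated duration.
5. Never make random changes hoping to fix issues. Understand WHY before changing code."""


def builder_prompt(task_description: str) -> str:
    """Prepend the TDD preamble; the task description is passed through verbatim."""
    return f"{TDD_PREAMBLE}\n\n{task_description}"
```

Because the preamble is concatenated at dispatch time, every builder receives it regardless of which skill files happen to load.
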
Command 03

Bug to PR

/bug_to_pr

A complete local async pipeline. Describe a bug; receive a reviewed, adversarially-approved, mergeable GitHub PR — with full audit trail attached.

This command is designed to outperform the GitHub async Copilot workflow by running everything locally with your full hooks, TDD enforcement, and nested orchestration, while still producing a proper GitHub PR with attached artifacts. You walk away. The pipeline works. You come back to a decision: merge or reject.

The Six Phases

  • P0
    Setup

    Generates BUG-NNN ID, creates bugs/BUG-NNN/ directory, creates fix/bug-nnn git branch, initialises pipeline state file for crash recovery.

  • P1
    Triage

    bug-creator investigates the codebase, reproduces the bug, writes a JIRA-format report with all 8 required sections (hook-enforced). bug-router reads the report and module registry, routes to the correct specialist fixer.

  • P2
    Fix (nested orchestration)

    Phase 2a: specialist fixer reads bug report and creates a fix spec (the plan_to_build equivalent). Phase 2b: orchestrator runs the full build protocol inline — TDD, validators, fix cycles, rollback. Phase 2c: captures test evidence to bugs/BUG-NNN/test-results.md.

  • P3
    PR Creation

    Commits and pushes the fix branch. Opens PR with structured body. Posts the full bug report as a PR comment — the PR becomes the audit hub.

  • P4
    Adversarial Review

    reviewer-alpha runs independently, verdict held in memory. reviewer-beta runs independently — alpha's file does not exist yet (structural isolation). Only after both complete are review files written to disk and posted as formal PR reviews.

  • P5
    Merge Gate

    Both reviewers must APPROVE. You confirm. Pipeline merges and deletes the branch. If either rejects, presents reasons and offers a retry fix cycle (max 2 total).
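
The P5 merge gate reduces to a small decision function. The sketch below is illustrative only: `merge_gate`, the returned action labels, and the `REQUEST_CHANGES` verdict string are hypothetical, and what happens when both reviewers approve but you have not yet confirmed is an assumption, since the source only says that you confirm before the merge.

```python
def merge_gate(
    alpha: str,
    beta: str,
    user_confirms: bool,
    cycles_used: int,
    max_cycles: int = 2,  # "max 2 total" retry fix cycles
) -> str:
    """Decide the next pipeline action from the two reviewer verdicts."""
    if alpha == "APPROVE" and beta == "APPROVE":
        # Both approvals are necessary but not sufficient: the user confirms.
        return "merge" if user_confirms else "await_confirmation"
    if cycles_used < max_cycles:
        return "retry_fix"  # re-enter the fix phase with rejection feedback
    return "stop"  # report all reasons and stop the pipeline
```

For example, `merge_gate("APPROVE", "REQUEST_CHANGES", True, 1)` returns `"retry_fix"`, while the same verdicts with `cycles_used=2` return `"stop"`.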

Full Pipeline Diagram

bug_to_pr — all six phases
flowchart TD
  IN(["/bug_to_pr\ndescribe the bug"]) --> SETUP
  SETUP["P0: Setup\nGenerate BUG-NNN\ncreate branch fix/bug-nnn\nwrite pipeline-state.json"]
  SETUP --> CREATOR
  CREATOR["P1: Triage\nbug-creator investigates\nreproduces bug\nwrites JIRA report\n8 sections enforced by hook"]
  CREATOR --> ROUTER
  ROUTER["P1: Route\nbug-router reads report\nreturns module + fixer\nhigh or medium confidence"]
  ROUTER --> FIXER
  FIXER["P2a: Fix Plan\nbug-fixer-module\nreads report, investigates\nwrites specs/fix-bug-nnn.md"]
  FIXER --> BUILD
  BUILD["P2b: Build\norchestrator runs build inline\nTDD + validators per task\nfix cycles + rollback on failure"]
  BUILD --> EVIDENCE
  EVIDENCE["P2c: Test Evidence\nrun module test command\ncapture to test-results.md\nverify actual output"]
  EVIDENCE --> PR
  PR["P3: PR + Artifacts\ngh pr create\nbug report posted as PR comment\npipeline-state updated"]
  PR --> ALPHA
  ALPHA["P4: Review Alpha\nindependent 5-point review\nverdict held in memory\nbeta file not yet written"]
  ALPHA --> BETA
  BETA["P4: Review Beta\nindependent 5-point review\nverdict held in memory\nalpha file not yet written"]
  BETA --> WRITE
  WRITE["Write both review files\npost as gh pr review\napprove or request-changes"]
  WRITE --> GATE
  GATE{both\napprove?}
  GATE -->|yes| CONFIRM["User confirms\ngh pr merge\nbranch deleted"]
  GATE -->|no, retry| RETRY["Re-enter fix phase\nwith rejection feedback\nmax 2 cycles"]
  GATE -->|no, exhausted| STOP["Report all reasons\nstop pipeline"]
  RETRY --> FIXER
  style IN fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
  style SETUP fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
  style CREATOR fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
  style ROUTER fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style FIXER fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
  style BUILD fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
  style EVIDENCE fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
  style PR fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
  style ALPHA fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style BETA fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style WRITE fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
  style GATE fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style CONFIRM fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
  style RETRY fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style STOP fill:#ede9df,stroke:#c84b2f,color:#7a7468

Adversarial Review Isolation

Reviewer isolation is enforced structurally rather than by a hook. The orchestrator holds both verdicts in memory and writes nothing to disk until both reviewers have completed:

Isolation protocol
Step 1: Dispatch reviewer-alpha. Verdict returned in response text. Do NOT write to disk.

Step 2: Dispatch reviewer-beta. bugs/BUG-NNN/reviews/alpha.md does not exist yet — it cannot be read even if beta tries.

Step 3: Only after both verdicts are held in orchestrator memory: write both files simultaneously, then post as gh pr review.

What's genuinely lost: no system-level enforcement prevents the orchestrator from accidentally writing alpha's file before dispatching beta. This is protocol discipline, not a hard guarantee.
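
The three steps can be sketched as one function that keeps both verdicts in a dictionary until the end. This is a minimal illustration under assumptions: `run_isolated_reviews` and its callable parameters are hypothetical stand-ins for the reviewer dispatches, and the explicit existence check is an extra safeguard added for the example rather than something the source describes.

```python
from pathlib import Path
from typing import Callable


def run_isolated_reviews(
    reviews_dir: Path,
    review_alpha: Callable[[], str],
    review_beta: Callable[[], str],
) -> dict[str, str]:
    """Collect both verdicts in memory, then write both review files."""
    # Step 1: alpha's verdict is returned as text and kept in memory only.
    verdicts = {"alpha": review_alpha()}

    # Step 2: beta runs while alpha.md does not exist on disk.
    if (reviews_dir / "alpha.md").exists():
        raise RuntimeError("alpha.md exists before reviewer-beta has run")
    verdicts["beta"] = review_beta()

    # Step 3: only now are both files written.
    reviews_dir.mkdir(parents=True, exist_ok=True)
    for name, text in verdicts.items():
        (reviews_dir / f"{name}.md").write_text(text, encoding="utf-8")
    return verdicts
```

The check before step 2 turns an accidental early write into a loud failure, but it remains a convention inside the orchestrator, consistent with the caveat above.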

PR Artifacts

Every bug_to_pr run produces this trail on the GitHub PR:

| Artifact | Where | Posted via |
| --- | --- | --- |
| Bug report | bugs/BUG-NNN/report.md | gh pr comment --body-file |
| Fix spec | specs/fix-bug-nnn.md | committed to PR branch |
| Test evidence | bugs/BUG-NNN/test-results.md | committed to PR branch |
| Alpha verdict | bugs/BUG-NNN/reviews/alpha.md | gh pr review --approve/--request-changes |
| Beta verdict | bugs/BUG-NNN/reviews/beta.md | gh pr review --approve/--request-changes |
| Merge decision | bugs/BUG-NNN/verdict.json | committed to PR branch |

Crash Recovery

Sessions can crash. Pipeline state is written to disk after every phase. On restart, the orchestrator reads bugs/BUG-NNN/pipeline-state.json and resumes from exactly the right phase — no re-running completed work.
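
A resume lookup for this section's state values might look like the sketch below. The layout of pipeline-state.json is not given in the source, so the `"phase"` key is a hypothetical field name, as are `resume_phase` and the "Phase 0: Setup" fallback label for a run that has no state file yet.

```python
import json
from pathlib import Path

# Last completed phase -> phase to resume from, as listed in this section.
RESUME_FROM = {
    "setup": "Phase 1: Triage",
    "triage": "Phase 2a: Fix Planning",
    "fix": "Phase 3: PR Creation",
    "pr": "Phase 4: Adversarial Review",
    "review": "Phase 5: Merge Gate",
}


def resume_phase(state_file: Path) -> str:
    """Return the phase the orchestrator should run next."""
    if not state_file.is_file():
        return "Phase 0: Setup"  # assumption: no state file means a fresh run
    state = json.loads(state_file.read_text(encoding="utf-8"))
    # "phase" is a hypothetical key; an unknown value raises KeyError.
    return RESUME_FROM[state["phase"]]
```

Writing the state after each phase completes, and never mid-phase, is what lets a lookup this simple avoid re-running finished work.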

| State in file | Resumes from |
| --- | --- |
| "setup" | Phase 1: Triage |
| "triage" | Phase 2a: Fix Planning |
| "fix" | Phase 3: PR Creation |
| "pr" | Phase 4: Adversarial Review |
| "review" | Phase 5: Merge Gate |
Infrastructure

Guardrails

Three layers of enforcement: hooks that fire on platform events, rules embedded non-ignorably in prompts, and structural design that removes the opportunity for certain violations.

Hook System

hooks.json — three events
flowchart LR
  subgraph SESSION ["sessionStart"]
    S["setup.sh / setup.ps1\n- pip install -r requirements.txt\n- npm install if no node_modules\n- runs before every session\n- no manual setup ever needed"]
  end
  subgraph POST ["postToolUse: fires on every file write"]
    P1[".py written\n→ ruff check\nblocks on lint failure"]
    P2[".ts/.tsx written\n→ tsc --noEmit\nblocks on type error"]
    P3["specs/*.md written\n→ 7 required sections\n→ validator frequency\nblocks if incomplete"]
    P4["bugs/*/report.md written\n→ 8 required sections\nblocks if incomplete"]
  end
  subgraph STOP ["agentStop: fires when agent finishes"]
    A["validate_spec.py\nfinal gate:\nall 7 sections present\nblocks agent from stopping\nif spec incomplete"]
  end
  style SESSION fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style POST fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style STOP fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
  style S fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
  style P1 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style P2 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style P3 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style P4 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
  style A fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
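
The postToolUse routing in the diagram can be sketched as a lookup from the written file's path to the checks that should run. This is an assumption about how such a hook might dispatch, not the contents of hooks.json; `POST_TOOL_USE_CHECKS` and `checks_for` are hypothetical names, and the check descriptions are labels, not commands the sketch executes.

```python
from fnmatch import fnmatchcase

# (glob pattern, check) pairs mirroring the postToolUse box in the diagram.
POST_TOOL_USE_CHECKS = [
    ("*.py", "ruff check"),
    ("*.ts", "tsc --noEmit"),
    ("*.tsx", "tsc --noEmit"),
    ("specs/*.md", "spec: 7 required sections + validator frequency"),
    ("bugs/*/report.md", "bug report: 8 required sections"),
]


def checks_for(path: str) -> list[str]:
    """Return the checks that apply to a file that was just written."""
    # fnmatchcase lets "*" match across "/" and is case-sensitive on every OS.
    return [
        check
        for pattern, check in POST_TOOL_USE_CHECKS
        if fnmatchcase(path, pattern)
    ]
```

A path that matches no pattern, such as `README.md`, gets an empty list, so unrelated writes are never blocked.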

Enforcement Levels

| Concern | Level | Mechanism |
| --- | --- | --- |
| Spec 7 required sections | system | postToolUse blocks write if any missing |
| Bug report 8 required sections | system | postToolUse blocks write if any missing |
| Python lint on every .py write | system | ruff check; blocks with exact error |
| TypeScript types on every .ts write | system | tsc --noEmit; blocks with exact error |
| Validator frequency (>5 builders) | system | postToolUse: (builders//5)+1 validators required |
| Dependency auto-install | system | sessionStart hook: pip + npm before every session |
| TDD on every builder task | prompt | TDD preamble injected into every builder dispatch |
| Systematic debugging | prompt | reproduce→isolate→root cause→fix, embedded in build |
| No direct implementation | prompt | EXECUTION DIRECTIVE in every spec file |
| Review isolation (alpha/beta) | structural | alpha's file doesn't exist when beta runs |
| Read-only agents (router/reviewer) | prompt | no disallowedTools in Copilot; instructions only |
| Merge requires both approvals | prompt | orchestrator rule + ask_questions confirmation |
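
The validator-frequency formula is easy to misread, so here is a worked sketch. The function name `required_validators` is hypothetical, and returning 0 at or below 5 builders is an assumption: the source only states the requirement for specs with more than 5 builders.

```python
def required_validators(builders: int) -> int:
    """Minimum validator tasks the hook would demand for a spec."""
    if builders <= 5:
        # Assumption: the hook imposes no minimum at or below 5 builders.
        return 0
    # Integer division, as in the rule "(builders//5)+1".
    return builders // 5 + 1
```

Under this reading, 6 builder tasks need at least 2 validators, and both 10 and 12 builder tasks need at least 3.
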
Design Philosophy

Why It Works This Way

The Spec Is the Contract

The build prompt makes no decisions — it reads what the spec says and executes it. All intelligence is front-loaded into the spec by /plan_to_build. This constraint is intentional: a mechanical executor is predictable, auditable, and trustworthy in a way that a decision-making orchestrator is not.

The spec must be airtight because the build prompt is mechanical. This is not a limitation — it is a design choice that produces better specs. When the executor cannot compensate for vagueness, the planner has no choice but to be precise.

"The spec is the contract. Everything downstream — builders, validators, reviewers — works from the spec and nothing else." Design principle

Skills vs Embedded Philosophy

The system has skill files in .github/skills/ covering TDD, systematic debugging, verification before completion, safe rollback, and more. These skills are loaded on relevance — the model decides whether to pull them. That's probabilistic. Under pressure, the model might not load the right skill.

The solution: embed the critical rules directly in the prompts where the model cannot choose not to read them. Skills are the source of truth. Prompts are the enforcement layer.

| Skill | Embedded in | Embedded rule |
| --- | --- | --- |
| brainstorming | plan_to_build → Prerequisite 1 | One question at a time, multiple-choice preferred |
| team-composition | plan_to_build → Prerequisite 2 | Single / two / three builders based on workstream independence |
| writing-plans | plan_to_build → Task Quality Rules | ≥50 word descriptions, 2-5 min task size, design assertions required |
| test-driven-development | build → builder preamble | RED-GREEN-REFACTOR. Test first is mandatory. |
| systematic-debugging | build → fix cycle dispatch | Reproduce → isolate → root cause → fix. No random changes. |
| verification-before-completion | build → validator dispatch | Never say PASS without actual command output |
| safe-rollback | build → exhausted fix cycle | git checkout on 2 failed cycles. Verify with git diff. |
| executing-plans | build → batch checkpoints | Pause every 3 tasks, give user chance to course-correct |
| plan-reviewer | plan_to_build → Self-Audit | Count builders/validators before saving. Fix if wrong. |

Local vs Async — The Comparison

The local pipeline is not a compromise version of async — it is the superior workflow for work that demands quality. You can launch multiple /bug_to_pr sessions simultaneously, one per terminal, each on its own branch, each running the full pipeline autonomously. Go to lunch. Come back to multiple PRs — each with a JIRA bug report, test evidence, and two independent adversarial review verdicts attached. That is genuinely parallel async work, locally, with no quality compromised.

The GitHub async Copilot workflow produces a PR. The local pipeline produces a PR plus everything the async workflow cannot: mandatory TDD on every task, hooks that block bad code at write time, adversarial review by two isolated agents, semantic merge conflict resolution, and a full audit trail from triage through merge gate. The only thing the async workflow adds is that no terminal needs to be running — a marginal advantage when your terminal is already open.

Use the local pipeline for work you care about. Use the async GitHub Copilot workflow only for pure delegation — simple, low-risk tasks you would assign to a junior developer and review later, where quality gates can be lighter. For everything else, the local pipeline wins.

One more advantage worth stating plainly: the local pipeline is completely portable. It runs in any IDE, against any git-compatible repository — GitHub, GitLab, Bitbucket, self-hosted Gitea, whatever your organisation uses. The GitHub async Copilot workflow is locked to GitHub.com. If your team is on GitLab or your enterprise runs an internal git server, the async workflow is simply unavailable. The local pipeline has no such dependency.

| Dimension | Local pipeline | GitHub async Copilot |
| --- | --- | --- |
| Quality gate | Hooks at write time + adversarial review | CI at PR time + human review |
| TDD enforcement | Mandatory preamble in every builder | Copilot's own defaults |
| Review | Two isolated adversarial agents | Human code review |
| Merge conflict | Orchestrator resolves semantically | You resolve manually |
| Audit trail on PR | Bug report + test evidence + two verdicts | PR diff + CI logs |
| Parallel execution | Multiple sessions simultaneously | Multiple issues assigned to @copilot |
| Crash recovery | pipeline-state.json: resume any phase | Stateless: each session fresh |
| Terminal required | Yes: any terminal or IDE | No: runs in GitHub Actions |
| Best for | Work you care about, where quality is non-negotiable | Pure delegation: simple, low-risk tasks |
| Portability | Any IDE · any git host · GitHub, GitLab, Bitbucket, self-hosted | GitHub.com only |