Aigile Workflow — Engineering Playbook

Introduction

Three Commands, One Pipeline

You describe what to build. The system plans it, builds it with tests, reviews it adversarially, and produces a mergeable PR — while you get coffee.

/plan_to_build → specs/*.md
Investigates the codebase, explores design options with you, writes an exhaustive implementation spec.

/build → working code
Reads the spec, dispatches specialist agents, enforces TDD, validates every task, rolls back on failure.

/bug_to_pr → merged PR
Triages the bug, routes to a specialist, fixes with the full pipeline, runs adversarial review, applies the merge gate.

For the unfamiliar

Think of this as a self-managing engineering team. You play the role of product owner — you describe what needs to happen and make final decisions. The system handles investigation, planning, coding, testing, reviewing, and creating the PR. Each command produces a real artifact you can inspect and modify.

For the architect

This is a spec-driven, stateless multi-agent orchestration system. The spec file is the coordination artifact: it compensates for the absence of native TaskCreate/resume primitives in Copilot. Quality gates are enforced at three levels: prompt instructions (soft), system hooks (postToolUse and agentStop), and structural design (reviewer isolation). The engineering philosophy is embedded directly in the prompts, where it is always read, rather than living in lazy-loaded skill files.

The Full Architecture

End-to-end system overview

flowchart LR
    DEV([👤 You]) -->|describe feature| PTB[/plan_to_build/]
    PTB -->|explores codebase\nasks approach + team| SPEC[(specs/*.md)]
    SPEC -->|spec is airtight| BLD[/build/]
    BLD -->|TDD + validators\nper task| CODE[working code\n+ tests passing]
    DEV -->|describe bug| BTP[/bug_to_pr/]
    BTP -->|triage → route\nplan → build| PR[GitHub PR\n+ artifacts]
    PR -->|adversarial\nreview| GATE{both\napprove?}
    GATE -->|yes + you confirm| MERGE[merged to main]
    GATE -->|no| FIX[fix cycle\nmax 2]
    FIX --> PR
    subgraph HOOKS ["🪝 Always-on guardrails"]
        H1[sessionStart\nauto-install deps]
        H2[postToolUse\nruff · tsc · sections]
        H3[agentStop\nspec completeness]
    end
    style DEV fill:#faf8f3,stroke:#1a1a1a,color:#1a1a1a
    style PTB fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
    style SPEC fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
    style BLD fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
    style CODE fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
    style BTP fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
    style PR fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
    style GATE fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style MERGE fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
    style FIX fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style HOOKS fill:#faf8f3,stroke:#d4cfc2,color:#1a1a1a

Command 01

Plan to Build

/plan_to_build

Turns a requirement into an airtight implementation spec that stateless agents can execute without asking questions.

The most important insight in this system: the spec is the contract. Everything downstream — builders, validators, reviewers — works from the spec and nothing else. If the spec is vague, the build will be wrong. If the spec is exhaustive, the build will be right. /plan_to_build exists to make specs exhaustive.

"The intelligence is in the spec. The build prompt is a mechanical dispatcher." Design principle

What It Does

01. Explore Before Planning
For features and enhancements, asks you one multiple-choice question about the technical approach before writing anything, so that alternatives are considered. Skipped for bug fixes, chores, and refactors.

02. Team Composition
Asks how work should be split: single builder, two builders by layer, or three builders by area. Your answer shapes the task breakdown and which workstreams run independently.

03. Codebase Investigation
Reads relevant source files directly to understand existing patterns before designing anything. Agents that don't read the code produce plans that don't fit the codebase.

04. Exhaustive Spec Writing
Each task gets a description of at least 50 words covering: what to do (step by step), files to modify (exact paths), code patterns to follow, acceptance criteria (specific, verifiable), and a validation command. One-line task descriptions are forbidden.

05. Self-Audit + Mandatory Verify
Before saving, counts builder tasks, validator tasks, descriptions, and assertions. Then runs `ls -la specs/file.md` to confirm the file exists and a `grep -c` check that the section count equals 7. Cannot report done until both pass.
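
The mandatory verify step can be sketched roughly as follows. This is a minimal illustration, not the playbook's actual script: it assumes each required section is a level-2 Markdown heading (`## ...`), and the names `count_sections` and `verify_spec` are hypothetical.

```python
import re
from pathlib import Path

REQUIRED_SECTION_COUNT = 7  # from the playbook; the section names are not listed here


def count_sections(spec_text: str) -> int:
    """Count level-2 Markdown headings (assumed to be the section marker)."""
    return len(re.findall(r"^## \S", spec_text, flags=re.MULTILINE))


def verify_spec(path: Path) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    if not path.is_file():
        return [f"{path} does not exist"]
    found = count_sections(path.read_text(encoding="utf-8"))
    if found != REQUIRED_SECTION_COUNT:
        return [f"expected {REQUIRED_SECTION_COUNT} sections, found {found}"]
    return []
```

Returning a list of problems rather than a boolean lets the caller show exactly what to fix before re-verifying.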

Flow Diagram

plan_to_build internal flow

flowchart TD
    IN([User types /plan_to_build]) --> CHK
    CHK{feature or\nenhancement?}
    CHK -->|yes| BRAIN["Prerequisite 1\nExplore Before Planning\n- one question, multiple choice\n- wait for answer"]
    CHK -->|no: bug/chore| SKIP["skip brainstorming\ndocument why in Notes"]
    BRAIN --> TEAM["Prerequisite 2\nTeam Composition\n(a) single builder\n(b) two builders by layer\n(c) three builders by area"]
    SKIP --> TEAM
    TEAM --> READ["Read codebase\nexisting patterns\narchitecture\nrelevant files"]
    READ --> DESIGN["Design solution\narchitecture decisions\ntask breakdown\ndependency graph"]
    DESIGN --> WRITE["Write spec\nto specs/name.md\n\nEach task:\n- min 50 word description\n- exact file paths\n- code patterns\n- acceptance criteria\n- validation command"]
    WRITE --> AUDIT["Self-audit\ncount builders/validators\ncheck descriptions\ncheck assertions"]
    AUDIT --> VERIFY["Mandatory verify\nls -la specs/name.md\ngrep -c sections == 7"]
    VERIFY -->|fail| FIX_SPEC["fix spec\nand re-verify"]
    FIX_SPEC --> VERIFY
    VERIFY -->|pass| HOOK["postToolUse hook\nvalidates sections\nvalidator frequency\nblocks if incomplete"]
    HOOK -->|pass| REPORT["Report\nfile path · tasks · team\nExecute with /build"]
    REPORT --> EXEC_DIRECTIVE["EXECUTION DIRECTIVE\nin spec file:\nFORBIDDEN: direct implementation\nREQUIRED: use /build"]
    style IN fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
    style CHK fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
    style BRAIN fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
    style TEAM fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
    style SKIP fill:#f5f2eb,stroke:#d4cfc2,color:#7a7468
    style READ fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
    style DESIGN fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
    style WRITE fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style AUDIT fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
    style VERIFY fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
    style FIX_SPEC fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style HOOK fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style REPORT fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
    style EXEC_DIRECTIVE fill:#f7ece8,stroke:#c84b2f,color:#5a1f12

Task Quality Rules

These rules are embedded directly in the prompt, not in a skill file that might not load, so the model always sees them. The "Enforced by" column shows which rules are additionally backed by a system hook.

Rule	What it prevents	Enforced by
min 50 word descriptions	Vague tasks that builders can't execute without asking questions	system hook blocks spec write
2–5 min task size	Tasks too large for a stateless agent to complete reliably	prompt self-audit step
Design assertions	Weak acceptance criteria like "it works" instead of verifiable checks	prompt task format template
Intermediate validators	Regressions in earlier work only caught at the end	system hook: >5 builders → validators required
7 required sections	Incomplete specs that confuse the build orchestrator	system postToolUse blocks write
EXECUTION DIRECTIVE	Main agent implementing code directly instead of delegating	prompt embedded in every spec
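
As an illustration of the first rule in the table, a word-count check could look something like the sketch below. It assumes task descriptions have already been extracted from the spec into a dictionary; the function name and data shape are hypothetical, and how the real hook parses the spec is not shown in this playbook.

```python
MIN_DESCRIPTION_WORDS = 50  # threshold stated in the rules table


def short_descriptions(tasks: dict[str, str]) -> dict[str, int]:
    """Map task id -> word count for every description under the minimum."""
    counts = {task_id: len(text.split()) for task_id, text in tasks.items()}
    return {task_id: n for task_id, n in counts.items() if n < MIN_DESCRIPTION_WORDS}
```

An empty result means every description meets the minimum; otherwise the returned counts say which tasks to expand and by how much.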

Command 02

Build

/build

A mechanical dispatcher. Reads the spec, executes tasks in dependency order, enforces TDD on every builder, validates every task, rolls back on failure.

The build prompt makes no decisions. It reads what the spec says and executes it. This is intentional — all intelligence was front-loaded into the spec by /plan_to_build. The build prompt is simple enough to be audited in five minutes and trusted completely.

Execution Loop

build orchestration loop

flowchart TD
    START([/build\nspecs/plan.md]) --> PARSE["Parse spec\nextract tasks\nbuild todo list"]
    PARSE --> LOOP
    subgraph LOOP ["Task loop: sequential, dependency-ordered"]
        TASK["Next unblocked task"] --> BUILDER
        subgraph BUILDER_PHASE ["Builder dispatch"]
            BUILDER["runSubagent('builder')\n+ TDD preamble:\n 1. Write failing test RED\n 2. Implement GREEN\n 3. Refactor\n+ task description verbatim"]
        end
        BUILDER --> VAL["runSubagent('validator')\nrun commands\nshow actual output\nPASS or FAIL"]
        VAL -->|"✅ PASS"| MARK["mark completed\nnext task"]
        MARK --> TASK
        VAL -->|"❌ FAIL cycle 1"| DEBUG["runSubagent('builder')\nDEBUG protocol:\n1. Reproduce\n2. Isolate\n3. Root cause\n4. Fix\n(no random changes)"]
        DEBUG --> VAL2["re-validate"]
        VAL2 -->|"✅ PASS"| MARK
        VAL2 -->|"❌ FAIL cycle 2"| ROLLBACK
        ROLLBACK["runSubagent('builder')\ngit checkout -- files\nverify with git diff\nlog + continue"]
        ROLLBACK --> MARK
    end
    LOOP --> CKPT["Checkpoint every 3 tasks\nreport to user\ncourse-correct?"]
    CKPT --> LOOP
    LOOP --> FINAL["validate-all\nrunSubagent('validator')\nall commands + criteria\nactual output required\nnever say done without proof"]
    FINAL -->|pass| REPORT["Build Complete\ntask table\nactual command output\nfiles changed"]
    style START fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
    style PARSE fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
    style BUILDER fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
    style VAL fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
    style MARK fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
    style TASK fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
    style DEBUG fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
    style VAL2 fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
    style ROLLBACK fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
    style CKPT fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
    style FINAL fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
    style REPORT fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a

The Rules the Build Prompt Never Breaks

Rule	Why it exists
NEVER implement code yourself	Orchestrator has no write tools. All code goes through builder subagents. No exceptions.
NEVER skip validation	Every builder task is followed by a validator dispatch. No exceptions.
NEVER say done without proof	Final report must include actual command output — not "tests passed" but the actual output.
NEVER run validation yourself	If validator can't run commands, report the failure. Don't take over validation. Orchestrator running commands is a pipeline integrity failure.
Max 2 fix cycles per task	Prevents infinite loops. After 2 failures, rollback and continue — never leave broken code.
Rollback on exhausted cycles	`git checkout -- files` on failure. Verify with `git diff`. Broken code never reaches the next task.
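
The per-task control flow implied by these rules and the loop diagram might be sketched as below. The four callables stand in for subagent dispatches (`runSubagent` is not modelled here), and `run_task` is a hypothetical name; treat it as a reading of the diagram, not the actual build prompt.

```python
from typing import Callable

MAX_VALIDATIONS = 2  # the diagram rolls back after the second failed validation


def run_task(
    task: str,
    build: Callable[[str], None],
    validate: Callable[[str], bool],
    debug: Callable[[str], None],
    rollback: Callable[[str], None],
) -> str:
    """Run one task through build -> validate -> debug -> re-validate -> rollback."""
    build(task)
    for attempt in range(1, MAX_VALIDATIONS + 1):
        if validate(task):
            return "passed"
        if attempt < MAX_VALIDATIONS:
            debug(task)  # reproduce, isolate, root cause, fix
    rollback(task)  # e.g. git checkout -- <files>, then confirm with git diff
    return "rolled_back"
```

Because the orchestrator only sequences the callables, it never writes code or runs validation commands itself, which matches the first and fourth rules in the table.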

The Builder's TDD Contract

Every builder dispatch includes this preamble, so no builder starts without it:

TDD preamble — injected into every builder prompt

1. Write a FAILING test that covers the acceptance criteria. Run it. Confirm RED.
2. Write the MINIMAL implementation to make it pass. Run it. Confirm GREEN.
3. Refactor if needed. Run again. Confirm still GREEN.
4. For commands >30 seconds, note estimated duration.
5. Never make random changes hoping to fix issues. Understand WHY before changing code.
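
Injecting the preamble amounts to plain string composition. The sketch below assumes the builder prompt is simply the preamble followed by the task description verbatim, as the loop diagram suggests; `builder_prompt` is a hypothetical helper name.

```python
TDD_PREAMBLE = """\
1. Write a FAILING test that covers the acceptance criteria. Run it. Confirm RED.
2. Write the MINIMAL implementation to make it pass. Run it. Confirm GREEN.
3. Refactor if needed. Run again. Confirm still GREEN.
4. For commands >30 seconds, note estimated duration.
5. Never make random changes hoping to fix issues. Understand WHY before changing code."""


def builder_prompt(task_description: str) -> str:
    """Prepend the preamble and pass the task description through unchanged."""
    return f"{TDD_PREAMBLE}\n\n{task_description}"
```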

Command 03

Bug to PR

/bug_to_pr

A complete local async pipeline. Describe a bug; receive a reviewed, adversarially-approved, mergeable GitHub PR — with full audit trail attached.

This command is designed to beat the GitHub async Copilot workflow: it runs everything locally with your full hooks, TDD enforcement, and nested orchestration, yet still produces a proper GitHub PR with attached artifacts. You walk away. The pipeline works. You come back to a decision: merge or reject.

The Six Phases

P0: Setup
Generates BUG-NNN ID, creates bugs/BUG-NNN/ directory, creates fix/bug-nnn git branch, initialises pipeline state file for crash recovery.

P1: Triage
bug-creator investigates the codebase, reproduces the bug, writes a JIRA-format report with all 8 required sections (hook-enforced). bug-router reads the report and module registry, routes to the correct specialist fixer.

P2: Fix (nested orchestration)
Phase 2a: specialist fixer reads bug report and creates a fix spec (the plan_to_build equivalent). Phase 2b: orchestrator runs the full build protocol inline (TDD, validators, fix cycles, rollback). Phase 2c: captures test evidence to bugs/BUG-NNN/test-results.md.

P3: PR Creation
Commits and pushes the fix branch. Opens PR with structured body. Posts the full bug report as a PR comment, so the PR becomes the audit hub.

P4: Adversarial Review
reviewer-alpha runs independently, verdict held in memory. reviewer-beta runs independently; alpha's file does not exist yet (structural isolation). Only after both complete are review files written to disk and posted as formal PR reviews.

P5: Merge Gate
Both reviewers must APPROVE. You confirm. Pipeline merges and deletes the branch. If either rejects, presents reasons and offers a retry fix cycle (max 2 total).
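
The merge gate in P5 reduces to a small decision function. The sketch below is an assumption-laden illustration: the action names are invented for this example, and the `await_user` outcome (both reviewers approve but you have not confirmed yet) is not spelled out in the playbook.

```python
MAX_FIX_CYCLES = 2  # "max 2 total" per the phase description


def merge_gate(alpha_approves: bool, beta_approves: bool,
               user_confirms: bool, cycles_used: int) -> str:
    """Return the next pipeline action once both reviews are in."""
    if alpha_approves and beta_approves:
        return "merge" if user_confirms else "await_user"
    if cycles_used < MAX_FIX_CYCLES:
        return "retry_fix"  # re-enter the fix phase with the rejection feedback
    return "stop"  # report all rejection reasons and halt
```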

Full Pipeline Diagram

bug_to_pr — all six phases

flowchart TD
    IN(["/bug_to_pr\ndescribe the bug"]) --> SETUP
    SETUP["P0: Setup\nGenerate BUG-NNN\ncreate branch fix/bug-nnn\nwrite pipeline-state.json"]
    SETUP --> CREATOR
    CREATOR["P1: Triage\nbug-creator investigates\nreproduces bug\nwrites JIRA report\n8 sections enforced by hook"]
    CREATOR --> ROUTER
    ROUTER["P1: Route\nbug-router reads report\nreturns module + fixer\nhigh or medium confidence"]
    ROUTER --> FIXER
    FIXER["P2a: Fix Plan\nbug-fixer-module\nreads report, investigates\nwrites specs/fix-bug-nnn.md"]
    FIXER --> BUILD
    BUILD["P2b: Build\norchestrator runs build inline\nTDD + validators per task\nfix cycles + rollback on failure"]
    BUILD --> EVIDENCE
    EVIDENCE["P2c: Test Evidence\nrun module test command\ncapture to test-results.md\nverify actual output"]
    EVIDENCE --> PR
    PR["P3: PR + Artifacts\ngh pr create\nbug report posted as PR comment\npipeline-state updated"]
    PR --> ALPHA
    ALPHA["P4: Review Alpha\nindependent 5-point review\nverdict held in memory\nbeta file not yet written"]
    ALPHA --> BETA
    BETA["P4: Review Beta\nindependent 5-point review\nverdict held in memory\nalpha file not yet written"]
    BETA --> WRITE
    WRITE["Write both review files\npost as gh pr review\napprove or request-changes"]
    WRITE --> GATE
    GATE{both\napprove?}
    GATE -->|yes| CONFIRM["User confirms\ngh pr merge\nbranch deleted"]
    GATE -->|no, retry| RETRY["Re-enter fix phase\nwith rejection feedback\nmax 2 cycles"]
    GATE -->|no, exhausted| STOP["Report all reasons\nstop pipeline"]
    RETRY --> FIXER
    style IN fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
    style SETUP fill:#f5f2eb,stroke:#6a6860,color:#1a1a1a
    style CREATOR fill:#f7ece8,stroke:#c84b2f,color:#5a1f12
    style ROUTER fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style FIXER fill:#e8eef8,stroke:#2a5fa5,color:#1a3060
    style BUILD fill:#e8f4ed,stroke:#2d7a4f,color:#1a4a2e
    style EVIDENCE fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
    style PR fill:#edf2fb,stroke:#2a5fa5,color:#1a1a1a
    style ALPHA fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style BETA fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style WRITE fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
    style GATE fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style CONFIRM fill:#eef7f1,stroke:#2d7a4f,color:#1a1a1a
    style RETRY fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style STOP fill:#ede9df,stroke:#c84b2f,color:#7a7468

Adversarial Review Isolation

Reviewer isolation is enforced structurally rather than by a hook. The orchestrator holds both verdicts in memory and writes nothing to disk until both reviewers have completed:

Isolation protocol

Step 1: Dispatch reviewer-alpha. Verdict returned in response text. Do NOT write to disk.

Step 2: Dispatch reviewer-beta. bugs/BUG-NNN/reviews/alpha.md does not exist yet — it cannot be read even if beta tries.

Step 3: Only after both verdicts are held in orchestrator memory: write both files simultaneously, then post as gh pr review.

What's genuinely lost: no system-level enforcement prevents the orchestrator from accidentally writing alpha's file before dispatching beta. Isolation here rests on protocol discipline, not on a hook-enforced guarantee.
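
The three-step protocol can be sketched as a function that gathers every verdict in memory before touching the disk. Reviewers are modelled as plain callables returning verdict text, which is an assumption for illustration; `run_isolated_reviews` is a hypothetical name. The early-write guard shows one way the gap described above could be narrowed; nothing in the playbook says such a check exists.

```python
from pathlib import Path
from typing import Callable


def run_isolated_reviews(
    reviews_dir: Path,
    reviewers: dict[str, Callable[[], str]],
) -> dict[str, str]:
    """Collect every verdict in memory first, then write all review files."""
    reviews_dir.mkdir(parents=True, exist_ok=True)
    verdicts: dict[str, str] = {}
    for name, review in reviewers.items():
        # No review file may exist while any reviewer is still running.
        if any(reviews_dir.glob("*.md")):
            raise RuntimeError("review file written before all reviewers finished")
        verdicts[name] = review()
    for name, text in verdicts.items():
        (reviews_dir / f"{name}.md").write_text(text, encoding="utf-8")
    return verdicts
```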

PR Artifacts

Every bug_to_pr run produces this trail on the GitHub PR:

Artifact	Where	Posted via
Bug report	bugs/BUG-NNN/report.md	`gh pr comment --body-file`
Fix spec	specs/fix-bug-nnn.md	committed to PR branch
Test evidence	bugs/BUG-NNN/test-results.md	committed to PR branch
Alpha verdict	bugs/BUG-NNN/reviews/alpha.md	`gh pr review --approve/--request-changes`
Beta verdict	bugs/BUG-NNN/reviews/beta.md	`gh pr review --approve/--request-changes`
Merge decision	bugs/BUG-NNN/verdict.json	committed to PR branch

Crash Recovery

Sessions can crash. Pipeline state is written to disk after every phase. On restart, the orchestrator reads bugs/BUG-NNN/pipeline-state.json and resumes from exactly the right phase — no re-running completed work.

State in file	Resumes from
"setup"	Phase 1: Triage
"triage"	Phase 2a: Fix Planning
"fix"	Phase 3: PR Creation
"pr"	Phase 4: Adversarial Review
"review"	Phase 5: Merge Gate

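The resume table above maps directly onto a lookup. The sketch below assumes the state file is a JSON object whose `"state"` key holds the last completed phase; that key name, the `resume_phase` helper, and the fallback to Phase 0 when no file exists are all assumptions for illustration.

```python
import json
from pathlib import Path

# Last completed state -> phase to resume from (taken from the table above).
RESUME_FROM = {
    "setup": "Phase 1: Triage",
    "triage": "Phase 2a: Fix Planning",
    "fix": "Phase 3: PR Creation",
    "pr": "Phase 4: Adversarial Review",
    "review": "Phase 5: Merge Gate",
}


def resume_phase(state_file: Path) -> str:
    """Return the phase to resume from, or Phase 0 if no state exists yet."""
    if not state_file.is_file():
        return "Phase 0: Setup"
    state = json.loads(state_file.read_text(encoding="utf-8"))
    return RESUME_FROM[state["state"]]  # "state" key name is an assumption
```
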
Infrastructure

Guardrails

Three layers of enforcement: hooks that fire on platform events, rules embedded directly in prompts, and structural design that removes the opportunity for a violation (for example, a review file that does not exist yet cannot be read).

Hook System

hooks.json — three events

flowchart LR
    subgraph SESSION ["sessionStart"]
        S["setup.sh / setup.ps1\n- pip install -r requirements.txt\n- npm install if no node_modules\n- runs before every session\n- no manual setup ever needed"]
    end
    subgraph POST ["postToolUse: fires on every file write"]
        P1[".py written\n→ ruff check\nblocks on lint failure"]
        P2[".ts/.tsx written\n→ tsc --noEmit\nblocks on type error"]
        P3["specs/*.md written\n→ 7 required sections\n→ validator frequency\nblocks if incomplete"]
        P4["bugs/*/report.md written\n→ 8 required sections\nblocks if incomplete"]
    end
    subgraph STOP ["agentStop: fires when agent finishes"]
        A["validate_spec.py\nfinal gate:\nall 7 sections present\nblocks agent from stopping\nif spec incomplete"]
    end
    style SESSION fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style POST fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style STOP fill:#f7f0dc,stroke:#b08a2e,color:#5a4010
    style S fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
    style P1 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style P2 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style P3 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style P4 fill:#fdf0ec,stroke:#c84b2f,color:#1a1a1a
    style A fill:#fdf6e3,stroke:#b08a2e,color:#1a1a1a
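
One way to picture the postToolUse routing in the diagram above is a table of path patterns and the check each one triggers. This is only a sketch: the actual hooks.json format is not shown in this playbook, `checks_for` is a hypothetical helper, and the last two entries are descriptions rather than real commands.

```python
from fnmatch import fnmatchcase

# (glob pattern, check) pairs mirroring the postToolUse box in the diagram.
CHECKS = [
    ("*.py", "ruff check"),
    ("*.ts", "tsc --noEmit"),
    ("*.tsx", "tsc --noEmit"),
    ("specs/*.md", "validate 7 required sections + validator frequency"),
    ("bugs/*/report.md", "validate 8 required sections"),
]


def checks_for(path: str) -> list[str]:
    """Return the checks that a write to `path` would trigger."""
    return [check for pattern, check in CHECKS if fnmatchcase(path, pattern)]
```

A path that matches no pattern triggers nothing, so writes to other file types pass through unblocked.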

Enforcement Levels

Concern	Level	Mechanism
Spec 7 required sections	system	postToolUse blocks write if any missing
Bug report 8 required sections	system	postToolUse blocks write if any missing
Python lint on every .py write	system	ruff check — blocks with exact error
TypeScript types on every .ts write	system	tsc --noEmit — blocks with exact error
Validator frequency (>5 builders)	system	postToolUse: (builders//5)+1 validators required
Dependency auto-install	system	sessionStart hook: pip + npm before every session
TDD on every builder task	prompt	TDD preamble injected into every builder dispatch
Systematic debugging	prompt	reproduce→isolate→root cause→fix — embedded in build
No direct implementation	prompt	EXECUTION DIRECTIVE in every spec file
Review isolation (alpha/beta)	structural	alpha's file doesn't exist when beta runs
Read-only agents (router/reviewer)	prompt	no disallowedTools in Copilot — instructions only
Merge requires both approvals	prompt	orchestrator rule + ask_questions confirmation
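
The validator-frequency rule in the table works out as follows. This sketch assumes the hook imposes no minimum at or below 5 builder tasks, which the playbook does not state explicitly; `required_validators` is a hypothetical name.

```python
def required_validators(builder_tasks: int) -> int:
    """Minimum validator tasks in a spec: (builders // 5) + 1 above 5 builders."""
    if builder_tasks <= 5:
        return 0  # assumption: no minimum at or below 5 builders
    return builder_tasks // 5 + 1
```

For example, a spec with 6 builder tasks needs at least 2 validator tasks, and one with 12 needs at least 3.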

Design Philosophy

Why It Works This Way

The Spec Is the Contract

The build prompt makes no decisions — it reads what the spec says and executes it. All intelligence is front-loaded into the spec by /plan_to_build. This constraint is intentional: a mechanical executor is predictable, auditable, and trustworthy in a way that a decision-making orchestrator is not.

The spec must be airtight because the build prompt is mechanical. This is not a limitation — it is a design choice that produces better specs. When the executor cannot compensate for vagueness, the planner has no choice but to be precise.

"The spec is the contract. Everything downstream — builders, validators, reviewers — works from the spec and nothing else." Design principle

Skills vs Embedded Philosophy

The system has skill files in .github/skills/ covering TDD, systematic debugging, verification before completion, safe rollback, and more. These skills are loaded on relevance: the model decides whether to pull them. That's probabilistic. In a long or busy session, the model might not load the right skill.

The solution: embed the critical rules directly in the prompts where the model cannot choose not to read them. Skills are the source of truth. Prompts are the enforcement layer.

Skill	Embedded in	Rule
brainstorming	plan_to_build → Prerequisite 1	One question at a time, multiple-choice preferred
team-composition	plan_to_build → Prerequisite 2	Single / two / three builders based on workstream independence
writing-plans	plan_to_build → Task Quality Rules	≥50 word descriptions, 2-5 min task size, design assertions required
test-driven-development	build → builder preamble	RED-GREEN-REFACTOR. Test first is mandatory.
systematic-debugging	build → fix cycle dispatch	Reproduce → isolate → root cause → fix. No random changes.
verification-before-completion	build → validator dispatch	Never say PASS without actual command output
safe-rollback	build → exhausted fix cycle	git checkout on 2 failed cycles. Verify with git diff.
executing-plans	build → batch checkpoints	Pause every 3 tasks, give user chance to course-correct
plan-reviewer	plan_to_build → Self-Audit	Count builders/validators before saving. Fix if wrong.

Local vs Async — The Comparison

The local pipeline is not a compromise version of async — it is the superior workflow for work that demands quality. You can launch multiple /bug_to_pr sessions simultaneously, one per terminal, each on its own branch, each running the full pipeline autonomously. Go to lunch. Come back to multiple PRs — each with a JIRA bug report, test evidence, and two independent adversarial review verdicts attached. That is genuinely parallel async work, locally, with no quality compromised.

The GitHub async Copilot workflow produces a PR. The local pipeline produces a PR plus everything the async workflow cannot: mandatory TDD on every task, hooks that block bad code at write time, adversarial review by two isolated agents, semantic merge conflict resolution, and a full audit trail from triage through merge gate. The only thing the async workflow adds is that no terminal needs to be running — a marginal advantage when your terminal is already open.

Use the local pipeline for work you care about. Use the async GitHub Copilot workflow only for pure delegation — simple, low-risk tasks you would assign to a junior developer and review later, where quality gates can be lighter. For everything else, the local pipeline wins.

One more advantage worth stating plainly: the local pipeline is completely portable. It runs in any IDE, against any git-compatible repository — GitHub, GitLab, Bitbucket, self-hosted Gitea, whatever your organisation uses. The GitHub async Copilot workflow is locked to GitHub.com. If your team is on GitLab or your enterprise runs an internal git server, the async workflow is simply unavailable. The local pipeline has no such dependency.

Dimension	Local pipeline	GitHub async Copilot
Quality gate	Hooks at write time + adversarial review	CI at PR time + human review
TDD enforcement	Mandatory preamble in every builder	Copilot's own defaults
Review	Two isolated adversarial agents	Human code review
Merge conflict	Orchestrator resolves semantically	You resolve manually
Audit trail on PR	Bug report + test evidence + two verdicts	PR diff + CI logs
Parallel execution	Multiple sessions simultaneously	Multiple issues assigned to @copilot
Crash recovery	pipeline-state.json — resume any phase	Stateless — each session fresh
Terminal required	Yes — any terminal or IDE	No — runs in GitHub Actions
Best for	Work you care about — quality is non-negotiable	Pure delegation — simple, low-risk tasks
Portability	Any IDE · any git host · GitHub, GitLab, Bitbucket, self-hosted	GitHub.com only