---
name: subagent-driven-development
description: "Execute plans via delegate_task subagents (2-stage review)."
version: 1.1.0
author: Hermes Agent (adapted from obra/superpowers)
license: MIT
metadata:
  hermes:
    tags: [delegation, subagent, implementation, workflow, parallel]
    related_skills: [writing-plans, requesting-code-review, test-driven-development]
---

# Subagent-Driven Development

## Overview

Execute implementation plans by dispatching a fresh subagent per task, with a systematic two-stage review after each one.

**Core principle:** Fresh subagent per task + two-stage review (spec, then quality) = high quality, fast iteration.

## When to Use

Use this skill when:

- You have an implementation plan (from the writing-plans skill or from user requirements)
- Tasks are mostly independent
- Quality and spec compliance are important
- You want automated review between tasks

**Advantages over manual execution:**

- Fresh context per task (no confusion from accumulated state)
- Automated review process catches issues early
- Consistent quality checks across all tasks
- Subagents can ask questions before starting work

## The Process

### 1. Read and Parse Plan

Read the plan file. Extract ALL tasks with their full text and context upfront. Create a todo list:

```python
# Read the plan
read_file("docs/plans/feature-plan.md")

# Create todo list with all tasks
todo([
    {"id": "task-1", "content": "Create User model with email field", "status": "pending"},
    {"id": "task-2", "content": "Add password hashing utility", "status": "pending"},
    {"id": "task-3", "content": "Create login endpoint", "status": "pending"},
])
```

**Key:** Read the plan ONCE. Extract everything. Don't make subagents read the plan file; provide the full task text directly in context.

### 2. Per-Task Workflow

For EACH task in the plan:

#### Step 1: Dispatch Implementer Subagent

Use `delegate_task` with complete context:

```python
delegate_task(
    goal="Implement Task 1: Create User model with email and password_hash fields",
    context="""
    TASK FROM PLAN:
    - Create: src/models/user.py
    - Add User class with email (str) and password_hash (str) fields
    - Use bcrypt for password hashing
    - Include __repr__ for debugging

    FOLLOW TDD:
    1. Write failing test in tests/models/test_user.py
    2. Run: pytest tests/models/test_user.py -v (verify FAIL)
    3. Write minimal implementation
    4. Run: pytest tests/models/test_user.py -v (verify PASS)
    5. Run: pytest tests/ -q (verify no regressions)
    6. Commit: git add -A && git commit -m "feat: add User model with password hashing"

    PROJECT CONTEXT:
    - Python 3.11, Flask app in src/app.py
    - Existing models in src/models/
    - Tests use pytest, run from project root
    - bcrypt already in requirements.txt
    """,
    toolsets=['terminal', 'file']
)
```

#### Step 2: Dispatch Spec Compliance Reviewer

After the implementer completes, verify the result against the original spec:

```python
delegate_task(
    goal="Review if implementation matches the spec from the plan",
    context="""
    ORIGINAL TASK SPEC:
    - Create src/models/user.py with User class
    - Fields: email (str), password_hash (str)
    - Use bcrypt for password hashing
    - Include __repr__

    CHECK:
    - [ ] All requirements from spec implemented?
    - [ ] File paths match spec?
    - [ ] Function signatures match spec?
    - [ ] Behavior matches expected?
    - [ ] Nothing extra added (no scope creep)?

    OUTPUT: PASS or list of specific spec gaps to fix.
    """,
    toolsets=['file']
)
```

**If spec issues found:** Have the implementer subagent (or a new one) fix the gaps, then re-run the spec review. Continue only when spec-compliant.
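
The fix-and-re-review loop described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not part of the skill's API: `review` and `fix` are hypothetical callables standing in for the reviewer and implementer dispatches, and the check for a leading `PASS` assumes the reviewer follows the OUTPUT convention from the prompt. The `max_rounds` cap is also an assumption, added so the loop cannot run forever.

```python
from typing import Callable


def review_until_pass(
    review: Callable[[], str],
    fix: Callable[[str], None],
    max_rounds: int = 3,
) -> bool:
    """Review; on failure, fix and review again, for at most max_rounds reviews."""
    for round_number in range(1, max_rounds + 1):
        verdict = review()
        # Assumption: a passing reviewer output starts with "PASS".
        if verdict.strip().upper().startswith("PASS"):
            return True
        # Only request a fix if another review round will check it.
        if round_number < max_rounds:
            fix(verdict)
    return False


# Usage with stand-in callables; a real controller would dispatch subagents here.
verdicts = iter(["Missing: __repr__ not implemented", "PASS"])
fixes_requested = []
approved = review_until_pass(lambda: next(verdicts), fixes_requested.append)
```

Returning `False` instead of raising leaves the escalation decision (new fix subagent, or asking the user) to the controller.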
#### Step 3: Dispatch Code Quality Reviewer

After spec compliance passes:

```python
delegate_task(
    goal="Review code quality for Task 1 implementation",
    context="""
    FILES TO REVIEW:
    - src/models/user.py
    - tests/models/test_user.py

    CHECK:
    - [ ] Follows project conventions and style?
    - [ ] Proper error handling?
    - [ ] Clear variable/function names?
    - [ ] Adequate test coverage?
    - [ ] No obvious bugs or missed edge cases?
    - [ ] No security issues?

    OUTPUT FORMAT:
    - Critical Issues: [must fix before proceeding]
    - Important Issues: [should fix]
    - Minor Issues: [optional]
    - Verdict: APPROVED or REQUEST_CHANGES
    """,
    toolsets=['file']
)
```

**If quality issues found:** Have the implementer subagent (or a new one) fix the issues, then re-review. Continue only when approved.

#### Step 4: Mark Complete

```python
todo([{"id": "task-1", "content": "Create User model with email field", "status": "completed"}], merge=True)
```

### 3. Final Review

After ALL tasks are complete, dispatch a final integration reviewer:

```python
delegate_task(
    goal="Review the entire implementation for consistency and integration issues",
    context="""
    All tasks from the plan are complete. Review the full implementation:
    - Do all components work together?
    - Any inconsistencies between tasks?
    - All tests passing?
    - Ready for merge?
    """,
    toolsets=['terminal', 'file']
)
```

### 4. Verify and Commit

```bash
# Run full test suite
pytest tests/ -q

# Review all changes
git diff --stat

# Final commit if needed
git add -A && git commit -m "feat: complete [feature name] implementation"
```

## Task Granularity

**Each task = 2-5 minutes of focused work.**

**Too big:**

- "Implement user authentication system"

**Right size:**

- "Create User model with email and password fields"
- "Add password hashing function"
- "Create login endpoint"
- "Add JWT token generation"
- "Create registration endpoint"

## Red Flags: Never Do These

- Start implementation without a plan
- Skip reviews (spec compliance OR code quality)
- Proceed with unfixed critical/important issues
- Dispatch multiple implementation subagents for tasks that touch the same files
- Make a subagent read the plan file (provide the full text in context instead)
- Skip scene-setting context (the subagent needs to understand where the task fits)
- Ignore subagent questions (answer before letting them proceed)
- Accept "close enough" on spec compliance
- Skip review loops (reviewer found issues → implementer fixes → review again)
- Let implementer self-review replace actual review (both are needed)
- **Start code quality review before spec compliance is PASS** (wrong order)
- Move to the next task while either review has open issues

## Handling Issues

### If Subagent Asks Questions

- Answer clearly and completely
- Provide additional context if needed
- Don't rush them into implementation

### If Reviewer Finds Issues

- Implementer subagent (or a new one) fixes them
- Reviewer reviews again
- Repeat until approved
- Don't skip the re-review

### If Subagent Fails a Task

- Dispatch a new fix subagent with specific instructions about what went wrong
- Don't try to fix manually in the controller session (context pollution)

## Efficiency Notes

**Why fresh subagent per task:**

- Prevents context pollution from accumulated state
- Each subagent gets clean, focused context
- No confusion from prior tasks' code or reasoning

**Why two-stage review:**

- Spec review catches under/over-building early
- Quality review ensures the implementation is well-built
- Catches issues before they compound across tasks

**Cost trade-off:**

- More subagent invocations (implementer + 2 reviewers per task)
- But issues are caught early, which is cheaper than debugging compounded problems later

## Integration with Other Skills

### With writing-plans

This skill EXECUTES plans created by the writing-plans skill:

1. User requirements → writing-plans → implementation plan
2. Implementation plan → subagent-driven-development → working code

### With test-driven-development

Implementer subagents should follow TDD:

1. Write failing test first
2. Implement minimal code
3. Verify test passes
4. Commit

Include TDD instructions in every implementer context.

### With requesting-code-review

The two-stage review process IS the code review. For the final integration review, use the requesting-code-review skill's review dimensions.

### With systematic-debugging

If a subagent encounters bugs during implementation:

1. Follow systematic-debugging process
2. Find root cause before fixing
3. Write regression test
4. Resume implementation

## Example Workflow

```
[Read plan: docs/plans/auth-feature.md]
[Create todo list with 5 tasks]

--- Task 1: Create User model ---
[Dispatch implementer subagent]
  Implementer: "Should email be unique?"
  You: "Yes, email must be unique"
  Implementer: Implemented, 3/3 tests passing, committed.

[Dispatch spec reviewer]
  Spec reviewer: ✅ PASS (all requirements met)

[Dispatch quality reviewer]
  Quality reviewer: ✅ APPROVED (clean code, good tests)

[Mark Task 1 complete]

--- Task 2: Password hashing ---
[Dispatch implementer subagent]
  Implementer: No questions, implemented, 5/5 tests passing.

[Dispatch spec reviewer]
  Spec reviewer: ❌ Missing: password strength validation (spec says "min 8 chars")

[Implementer fixes]
  Implementer: Added validation, 7/7 tests passing.

[Dispatch spec reviewer again]
  Spec reviewer: ✅ PASS

[Dispatch quality reviewer]
  Quality reviewer: Important: Magic number 8, extract to constant
  Implementer: Extracted MIN_PASSWORD_LENGTH constant
  Quality reviewer: ✅ APPROVED

[Mark Task 2 complete]

... (continue for all tasks)

[After all tasks: dispatch final integration reviewer]
[Run full test suite: all passing]
[Done!]
```

## Remember

```
Fresh subagent per task
Two-stage review every time
Spec compliance FIRST
Code quality SECOND
Never skip reviews
Catch issues early
```

**Quality is not an accident. It's the result of systematic process.**

## Further reading (load when relevant)

When the orchestration involves significant context usage, long review loops, or complex validation checkpoints, load these references for the specific discipline:

- **`references/context-budget-discipline.md`**: Four-tier context degradation model (PEAK / GOOD / DEGRADING / POOR), read-depth rules that scale with context window size, and early warning signs of silent degradation. Load when a run will clearly consume significant context (multi-phase plans, many subagents, large artifacts).
- **`references/gates-taxonomy.md`**: The four canonical gate types (Pre-flight, Revision, Escalation, Abort) with behavior, recovery, and examples. Load when designing or reviewing any workflow that has validation checkpoints; use the vocabulary explicitly so each gate has defined entry, failure behavior, and resumption rules.

Both references adapted from gsd-build/get-shit-done (MIT © 2025 Lex Christopherson).

## Chinese Context Examples

### Implementer Subagent Context Example

```python
delegate_task(
    goal="Implement Task 1: Create User model with email and password fields",
    context="""
    TASK FROM PLAN:
    - Create: src/models/user.py
    - Add User class with email (str) and password_hash (str) fields
    - Use bcrypt for password hashing
    - Include a __repr__ method for debugging

    FOLLOW TDD:
    1. Write failing test in tests/models/test_user.py
    2. Run: pytest tests/models/test_user.py -v (verify FAIL)
    3. Write minimal implementation
    4. Run: pytest tests/models/test_user.py -v (verify PASS)
    5. Run: pytest tests/ -q (verify no regressions)
    6. Commit: git add -A && git commit -m "feat: add User model with password hashing"

    PROJECT CONTEXT:
    - Python 3.11, Flask app in src/app.py
    - Existing models in src/models/
    - Tests use pytest, run from project root
    - bcrypt already in requirements.txt
    """,
    toolsets=['terminal', 'file']
)
```

### Spec Reviewer Context Example

```python
delegate_task(
    goal="Review whether the implementation matches the spec from the plan",
    context="""
    ORIGINAL TASK SPEC:
    - Create src/models/user.py with User class
    - Fields: email (str), password_hash (str)
    - Use bcrypt for password hashing
    - Include a __repr__ method

    CHECK:
    - [ ] Are all requirements from the spec implemented?
    - [ ] Do file paths match the spec?
    - [ ] Do function signatures match the spec?
    - [ ] Does behavior match expectations?
    - [ ] Was anything extra added (scope creep)?

    OUTPUT: PASS or list the specific spec gaps.
    """,
    toolsets=['file']
)
```

### Quality Reviewer Context Example

```python
delegate_task(
    goal="Review code quality for the Task 1 implementation",
    context="""
    FILES TO REVIEW:
    - src/models/user.py
    - tests/models/test_user.py

    CHECK:
    - [ ] Follows project conventions and style?
    - [ ] Is error handling appropriate?
    - [ ] Are variable/function names clear?
    - [ ] Is test coverage adequate?
    - [ ] Any obvious bugs or missed edge cases?
    - [ ] Any security issues?

    OUTPUT FORMAT:
    - Critical Issues: [must fix before proceeding]
    - Important Issues: [should fix]
    - Minor Issues: [optional]
    - Verdict: APPROVED or REQUEST_CHANGES
    """,
    toolsets=['file']
)
```

### Final Integration Reviewer Context Example

```python
delegate_task(
    goal="Review the entire implementation for consistency and integration issues",
    context="""
    All tasks from the plan are complete. Review the full implementation:
    - Do all components work together?
    - Any inconsistencies between tasks?
    - Are all tests passing?
    - Ready for merge?

    Run the full test suite:
    pytest tests/ -q

    And check code style:
    flake8 src/ tests/
    """,
    toolsets=['terminal', 'file']
)
```

### Handling Common Chinese-Language Issues

#### 1. Encoding Issues

```python
# State the encoding explicitly in the context
context = """
File encoding: UTF-8
Make sure all files are saved with UTF-8 encoding
"""
```

#### 2. Path Issues

```python
# Use absolute paths to avoid ambiguity
context = """
Project path: /home/ubuntu/projects/my-project
Tests path: /home/ubuntu/projects/my-project/tests
"""
```

#### 3. Dependency Issues

```python
# Pin dependency versions explicitly
context = """
Python version: 3.11
Dependencies:
- flask==2.3.2
- bcrypt==4.0.1
- pytest==7.4.0
"""
```
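
The three context fragments above (encoding, absolute paths, pinned dependencies) could be assembled by one small helper so that every subagent receives them consistently. This is a minimal sketch under stated assumptions: `build_project_context` is a hypothetical helper, not part of any skill or tool API, and the POSIX-style project path is taken from the example above.

```python
from pathlib import PurePosixPath


def build_project_context(
    project_root: str,
    python_version: str,
    dependencies: dict[str, str],
) -> str:
    """Build a context fragment with encoding, absolute paths, and pinned versions."""
    root = PurePosixPath(project_root)
    # Reject relative paths, since the point is to avoid ambiguity.
    if not root.is_absolute():
        raise ValueError("project_root must be an absolute path")
    lines = [
        "File encoding: UTF-8",
        f"Project path: {root}",
        f"Tests path: {root / 'tests'}",
        f"Python version: {python_version}",
        "Dependencies:",
    ]
    lines += [f"- {name}=={version}" for name, version in dependencies.items()]
    return "\n".join(lines)


# Usage with the values from the examples above.
project_context = build_project_context(
    "/home/ubuntu/projects/my-project",
    "3.11",
    {"flask": "2.3.2", "bcrypt": "4.0.1", "pytest": "7.4.0"},
)
```

The returned string can be appended to the task-specific text passed as `context` to `delegate_task`.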