---
name: subagent-driven-development
description: "Execute plans via delegate_task subagents (2-stage review)."
version: 1.1.0
author: Hermes Agent (adapted from obra/superpowers)
license: MIT
metadata:
  hermes:
    tags: [delegation, subagent, implementation, workflow, parallel]
    related_skills: [writing-plans, requesting-code-review, test-driven-development]
---

# Subagent-Driven Development

## Overview

Execute an implementation plan by dispatching a fresh subagent for each task and running a systematic two-stage review after each one.

**Core principle:** Fresh subagent per task + two-stage review (spec, then quality) = high quality, fast iteration.

## When to Use

Use this skill when:
- You have an implementation plan (from the writing-plans skill or from user requirements)
- Tasks are mostly independent
- Quality and spec compliance are important
- You want automated review between tasks

**Advantages over manual execution:**
- Fresh context per task (no confusion from accumulated state)
- Automated review process catches issues early
- Consistent quality checks across all tasks
- Subagents can ask questions before starting work

## The Process

### 1. Read and Parse Plan

Read the plan file. Extract ALL tasks with their full text and context upfront. Create a todo list:

```python
# Read the plan
read_file("docs/plans/feature-plan.md")

# Create todo list with all tasks
todo([
    {"id": "task-1", "content": "Create User model with email field", "status": "pending"},
    {"id": "task-2", "content": "Add password hashing utility", "status": "pending"},
    {"id": "task-3", "content": "Create login endpoint", "status": "pending"},
])
```

**Key:** Read the plan ONCE. Extract everything. Don't make subagents read the plan file; provide the full task text directly in context.

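The extraction step could be sketched as follows. This is a minimal illustration, assuming the plan marks each task with a heading of the form `### Task N: Title`; the source does not specify a plan format, so that convention and the helper name `parse_plan_tasks` are hypothetical. The `full_text` key is extra bookkeeping for the controller and would be dropped before the items are passed to `todo`.

```python
import re

# Assumed (hypothetical) plan convention: each task starts with "### Task N: Title".
TASK_HEADING = re.compile(r"^### Task (\d+): (.+)$", re.MULTILINE)


def parse_plan_tasks(plan_text: str) -> list[dict]:
    """Split a plan into todo-shaped items, keeping each task's full text."""
    matches = list(TASK_HEADING.finditer(plan_text))
    tasks = []
    for i, match in enumerate(matches):
        # A task body runs until the next task heading (or the end of the plan).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(plan_text)
        tasks.append({
            "id": f"task-{match.group(1)}",
            "content": match.group(2).strip(),
            "status": "pending",
            "full_text": plan_text[match.end():end].strip(),
        })
    return tasks


example_plan = """\
# Auth plan

### Task 1: Create User model with email field
- Create: src/models/user.py

### Task 2: Add password hashing utility
- Use bcrypt
"""

tasks = parse_plan_tasks(example_plan)
```

Keeping `full_text` alongside each item means the controller can paste the task body into a subagent's context later without touching the plan file again.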
### 2. Per-Task Workflow

For EACH task in the plan:

#### Step 1: Dispatch Implementer Subagent

Use `delegate_task` with complete context:

```python
delegate_task(
    goal="Implement Task 1: Create User model with email and password_hash fields",
    context="""
    TASK FROM PLAN:
    - Create: src/models/user.py
    - Add User class with email (str) and password_hash (str) fields
    - Use bcrypt for password hashing
    - Include __repr__ for debugging

    FOLLOW TDD:
    1. Write failing test in tests/models/test_user.py
    2. Run: pytest tests/models/test_user.py -v (verify FAIL)
    3. Write minimal implementation
    4. Run: pytest tests/models/test_user.py -v (verify PASS)
    5. Run: pytest tests/ -q (verify no regressions)
    6. Commit: git add -A && git commit -m "feat: add User model with password hashing"

    PROJECT CONTEXT:
    - Python 3.11, Flask app in src/app.py
    - Existing models in src/models/
    - Tests use pytest, run from project root
    - bcrypt already in requirements.txt
    """,
    toolsets=['terminal', 'file']
)
```

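Because every implementer context repeats the same three sections, it can help to assemble the string from parts. The helper below is a sketch only: `build_implementer_context` and its parameters are hypothetical names, not part of `delegate_task`, and the section labels simply mirror the example above.

```python
def build_implementer_context(
    task_spec: list[str],
    test_file: str,
    commit_message: str,
    project_facts: list[str],
) -> str:
    """Assemble the TASK / TDD / PROJECT CONTEXT sections into one string."""
    tdd_steps = [
        f"Write failing test in {test_file}",
        f"Run: pytest {test_file} -v (verify FAIL)",
        "Write minimal implementation",
        f"Run: pytest {test_file} -v (verify PASS)",
        "Run: pytest tests/ -q (verify no regressions)",
        f'Commit: git add -A && git commit -m "{commit_message}"',
    ]
    lines = [
        "TASK FROM PLAN:",
        *[f"- {item}" for item in task_spec],
        "",
        "FOLLOW TDD:",
        *[f"{n}. {step}" for n, step in enumerate(tdd_steps, start=1)],
        "",
        "PROJECT CONTEXT:",
        *[f"- {fact}" for fact in project_facts],
    ]
    return "\n".join(lines)


context = build_implementer_context(
    task_spec=["Create: src/models/user.py", "Include __repr__ for debugging"],
    test_file="tests/models/test_user.py",
    commit_message="feat: add User model with password hashing",
    project_facts=["Python 3.11, Flask app in src/app.py"],
)
```

The resulting string would be passed as the `context` argument, so no implementer is dispatched without the TDD steps or the project facts.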
#### Step 2: Dispatch Spec Compliance Reviewer

After the implementer completes, verify against the original spec:

```python
delegate_task(
    goal="Review if implementation matches the spec from the plan",
    context="""
    ORIGINAL TASK SPEC:
    - Create src/models/user.py with User class
    - Fields: email (str), password_hash (str)
    - Use bcrypt for password hashing
    - Include __repr__

    CHECK:
    - [ ] All requirements from spec implemented?
    - [ ] File paths match spec?
    - [ ] Function signatures match spec?
    - [ ] Behavior matches expected?
    - [ ] Nothing extra added (no scope creep)?

    OUTPUT: PASS or list of specific spec gaps to fix.
    """,
    toolsets=['file']
)
```

**If spec issues found:** Have the implementer (or a fresh fix subagent) close the gaps, then re-run the spec review. Continue only when the review returns PASS.

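The fix-and-re-review cycle can be sketched as a small loop. In this illustration the `review` and `fix` callables stand in for `delegate_task` calls, and the `max_rounds` cap is an added assumption (the process itself only says to repeat until the review passes); hitting the cap is a signal to involve a human rather than to move on.

```python
from typing import Callable


def run_spec_review_loop(
    review: Callable[[], str],
    fix: Callable[[str], None],
    max_rounds: int = 3,  # assumed safety cap, not part of the original process
) -> int:
    """Re-run the spec review until it returns PASS; return the rounds used."""
    for round_number in range(1, max_rounds + 1):
        verdict = review().strip()
        if verdict == "PASS":
            return round_number
        # Anything other than PASS is treated as the list of spec gaps to fix.
        fix(verdict)
    raise RuntimeError(f"Spec review still failing after {max_rounds} rounds")


# Stand-ins for subagent calls: the first review reports a gap, the second passes.
review_outputs = iter(["- Missing: __repr__", "PASS"])
fixes_requested: list[str] = []
rounds = run_spec_review_loop(lambda: next(review_outputs), fixes_requested.append)
```

Passing the reviewer's gap list straight into `fix` keeps the fix subagent's instructions specific, which matches the guidance on handling failures later in this document.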
#### Step 3: Dispatch Code Quality Reviewer

After spec compliance passes:

```python
delegate_task(
    goal="Review code quality for Task 1 implementation",
    context="""
    FILES TO REVIEW:
    - src/models/user.py
    - tests/models/test_user.py

    CHECK:
    - [ ] Follows project conventions and style?
    - [ ] Proper error handling?
    - [ ] Clear variable/function names?
    - [ ] Adequate test coverage?
    - [ ] No obvious bugs or missed edge cases?
    - [ ] No security issues?

    OUTPUT FORMAT:
    - Critical Issues: [must fix before proceeding]
    - Important Issues: [should fix]
    - Minor Issues: [optional]
    - Verdict: APPROVED or REQUEST_CHANGES
    """,
    toolsets=['file']
)
```

**If quality issues found:** Have the implementer fix them, then re-run the quality review. Continue only when the verdict is APPROVED.

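A controller could turn the reviewer's report into a go/no-go decision along these lines. This is a sketch under two assumptions that the source does not state: the reviewer writes one line per category, and it writes `none` for an empty category. `parse_quality_review` is a hypothetical helper name.

```python
import re

# Matches the four labelled lines from the OUTPUT FORMAT section of the prompt.
FIELD = re.compile(
    r"^-?\s*(Critical Issues|Important Issues|Minor Issues|Verdict):\s*(.*)$"
)


def parse_quality_review(report: str) -> dict:
    """Extract the verdict and any blocking categories from a review report."""
    fields = {}
    for line in report.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    verdict = fields.get("Verdict", "")
    # Critical and important issues both block progress (see Red Flags).
    blocking = [
        name
        for name in ("Critical Issues", "Important Issues")
        if fields.get(name, "").lower() not in ("", "none")
    ]
    return {
        "approved": verdict == "APPROVED" and not blocking,
        "verdict": verdict,
        "blocking": blocking,
    }


report = """\
- Critical Issues: none
- Important Issues: Magic number 8, extract to constant
- Minor Issues: none
- Verdict: REQUEST_CHANGES
"""
result = parse_quality_review(report)
```

Treating a missing or unrecognised verdict as not approved errs on the side of another review round instead of silently moving on.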
#### Step 4: Mark Complete

Once both reviews have passed, update the todo list:

```python
todo([{"id": "task-1", "content": "Create User model with email field", "status": "completed"}], merge=True)
```

### 3. Final Review

After ALL tasks are complete, dispatch a final integration reviewer:

```python
delegate_task(
    goal="Review the entire implementation for consistency and integration issues",
    context="""
    All tasks from the plan are complete. Review the full implementation:
    - Do all components work together?
    - Any inconsistencies between tasks?
    - All tests passing?
    - Ready for merge?
    """,
    toolsets=['terminal', 'file']
)
```

### 4. Verify and Commit

```bash
# Run full test suite
pytest tests/ -q

# Review all changes
git diff --stat

# Final commit if needed
git add -A && git commit -m "feat: complete [feature name] implementation"
```

## Task Granularity

**Each task = 2-5 minutes of focused work.**

**Too big:**
- "Implement user authentication system"

**Right size:**
- "Create User model with email and password fields"
- "Add password hashing function"
- "Create login endpoint"
- "Add JWT token generation"
- "Create registration endpoint"

## Red Flags: Never Do These

- Start implementation without a plan
- Skip reviews (spec compliance OR code quality)
- Proceed with unfixed critical/important issues
- Dispatch multiple implementation subagents for tasks that touch the same files
- Make a subagent read the plan file (provide the full text in context instead)
- Skip scene-setting context (the subagent needs to understand where the task fits)
- Ignore subagent questions (answer before letting them proceed)
- Accept "close enough" on spec compliance
- Skip review loops (reviewer found issues → implementer fixes → review again)
- Let implementer self-review replace actual review (both are needed)
- **Start code quality review before spec compliance is PASS** (wrong order)
- Move to the next task while either review has open issues

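The rule about tasks that touch the same files can be checked mechanically before dispatching anything in parallel. The sketch below assumes the controller has already recorded which files each task is expected to touch; `find_file_conflicts` is a hypothetical helper, and the paths other than `src/models/user.py` are made up for illustration.

```python
from itertools import combinations


def find_file_conflicts(
    task_files: dict[str, set[str]],
) -> list[tuple[str, str, set[str]]]:
    """Return every pair of tasks that share at least one file."""
    conflicts = []
    for (task_a, files_a), (task_b, files_b) in combinations(task_files.items(), 2):
        shared = files_a & files_b
        if shared:
            conflicts.append((task_a, task_b, shared))
    return conflicts


# Illustrative mapping of task id -> files the task is expected to touch.
task_files = {
    "task-1": {"src/models/user.py", "tests/models/test_user.py"},
    "task-2": {"src/auth/hashing.py"},
    "task-3": {"src/models/user.py", "src/routes/login.py"},
}
conflicts = find_file_conflicts(task_files)
```

Any pair reported here would be run one after the other; tasks with no overlap are the only candidates for parallel dispatch.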
## Handling Issues

### If Subagent Asks Questions

- Answer clearly and completely
- Provide additional context if needed
- Don't rush them into implementation

### If Reviewer Finds Issues

- Implementer subagent (or a new one) fixes them
- Reviewer reviews again
- Repeat until approved
- Don't skip the re-review

### If Subagent Fails a Task

- Dispatch a new fix subagent with specific instructions about what went wrong
- Don't try to fix it manually in the controller session (that pollutes the controller's context)

## Efficiency Notes

**Why fresh subagent per task:**
- Prevents context pollution from accumulated state
- Each subagent gets clean, focused context
- No confusion from prior tasks' code or reasoning

**Why two-stage review:**
- Spec review catches under/over-building early
- Quality review ensures the implementation is well-built
- Catches issues before they compound across tasks

**Cost trade-off:**
- More subagent invocations (implementer + 2 reviewers per task)
- But catches issues early (cheaper than debugging compounded problems later)

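To make the cost side concrete, the invocation count can be estimated with simple arithmetic. The sketch assumes each failed review costs one fix dispatch plus one re-review, and that the final integration review runs once; `count_invocations` is an illustrative name, and real runs may differ (for example, if a subagent asks questions or fails outright).

```python
def count_invocations(
    num_tasks: int,
    spec_retries: int = 0,
    quality_retries: int = 0,
) -> int:
    """Estimate total subagent dispatches for one plan."""
    base = num_tasks * 3  # implementer + spec reviewer + quality reviewer
    rework = 2 * (spec_retries + quality_retries)  # one fix + one re-review each
    final_review = 1  # single integration review after all tasks
    return base + rework + final_review


# Five tasks, with one spec retry and one quality retry overall.
total = count_invocations(num_tasks=5, spec_retries=1, quality_retries=1)
```

For that case the estimate is 15 base dispatches, 4 for rework, and 1 final review, so 20 in total, compared with 16 when every review passes the first time.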
## Integration with Other Skills

### With writing-plans

This skill EXECUTES plans created by the writing-plans skill:
1. User requirements → writing-plans → implementation plan
2. Implementation plan → subagent-driven-development → working code

### With test-driven-development

Implementer subagents should follow TDD:
1. Write failing test first
2. Implement minimal code
3. Verify test passes
4. Commit

Include TDD instructions in every implementer context.

### With requesting-code-review

The two-stage review process IS the code review. For the final integration review, use the requesting-code-review skill's review dimensions.

### With systematic-debugging

If a subagent encounters bugs during implementation:
1. Follow the systematic-debugging process
2. Find the root cause before fixing
3. Write a regression test
4. Resume implementation

## Example Workflow

```
[Read plan: docs/plans/auth-feature.md]
[Create todo list with 5 tasks]

--- Task 1: Create User model ---
[Dispatch implementer subagent]
  Implementer: "Should email be unique?"
  You: "Yes, email must be unique"
  Implementer: Implemented, 3/3 tests passing, committed.

[Dispatch spec reviewer]
  Spec reviewer: ✅ PASS (all requirements met)

[Dispatch quality reviewer]
  Quality reviewer: ✅ APPROVED (clean code, good tests)

[Mark Task 1 complete]

--- Task 2: Password hashing ---
[Dispatch implementer subagent]
  Implementer: No questions, implemented, 5/5 tests passing.

[Dispatch spec reviewer]
  Spec reviewer: ❌ Missing: password strength validation (spec says "min 8 chars")

[Implementer fixes]
  Implementer: Added validation, 7/7 tests passing.

[Dispatch spec reviewer again]
  Spec reviewer: ✅ PASS

[Dispatch quality reviewer]
  Quality reviewer: Important: Magic number 8, extract to constant
  Implementer: Extracted MIN_PASSWORD_LENGTH constant
  Quality reviewer: ✅ APPROVED

[Mark Task 2 complete]

... (continue for all tasks)

[After all tasks: dispatch final integration reviewer]
[Run full test suite: all passing]
[Done!]
```

## Remember

```
Fresh subagent per task
Two-stage review every time
Spec compliance FIRST
Code quality SECOND
Never skip reviews
Catch issues early
```

**Quality is not an accident. It's the result of systematic process.**

## Further reading (load when relevant)

When the orchestration involves significant context usage, long review loops, or complex validation checkpoints, load these references for the specific discipline:

- **`references/context-budget-discipline.md`**: Four-tier context degradation model (PEAK / GOOD / DEGRADING / POOR), read-depth rules that scale with context window size, and early warning signs of silent degradation. Load when a run will clearly consume significant context (multi-phase plans, many subagents, large artifacts).
- **`references/gates-taxonomy.md`**: The four canonical gate types (Pre-flight, Revision, Escalation, Abort) with behavior, recovery, and examples. Load when designing or reviewing any workflow that has validation checkpoints; use the vocabulary explicitly so each gate has defined entry, failure behavior, and resumption rules.

Both references adapted from gsd-build/get-shit-done (MIT © 2025 Lex Christopherson).

## Chinese-Language Context Examples

The calls below mirror the English examples above, with `goal` and `context` written in Chinese.

### Implementer Subagent Context Example

```python
delegate_task(
    goal="实现任务 1: 创建用户模型,包含邮箱和密码字段",
    context="""
    来自计划的任务:
    - 创建:src/models/user.py
    - 添加 User 类,包含 email (str) 和 password_hash (str) 字段
    - 使用 bcrypt 进行密码哈希
    - 包含 __repr__ 方法用于调试

    遵循 TDD 流程:
    1. 在 tests/models/test_user.py 中编写失败测试
    2. 运行:pytest tests/models/test_user.py -v(验证失败)
    3. 编写最小实现
    4. 运行:pytest tests/models/test_user.py -v(验证通过)
    5. 运行:pytest tests/ -q(验证无回归)
    6. 提交:git add -A && git commit -m "feat: 添加用户模型,包含密码哈希"

    项目上下文:
    - Python 3.11,Flask 应用在 src/app.py
    - 现有模型在 src/models/
    - 测试使用 pytest,从项目根目录运行
    - bcrypt 已在 requirements.txt 中
    """,
    toolsets=['terminal', 'file']
)
```

### Spec Reviewer Context Example

```python
delegate_task(
    goal="审查实现是否符合计划中的规范",
    context="""
    原始任务规范:
    - 创建 src/models/user.py,包含 User 类
    - 字段:email (str), password_hash (str)
    - 使用 bcrypt 进行密码哈希
    - 包含 __repr__ 方法

    检查项:
    - [ ] 规范中的所有需求是否已实现?
    - [ ] 文件路径是否匹配规范?
    - [ ] 函数签名是否匹配规范?
    - [ ] 行为是否符合预期?
    - [ ] 是否添加了额外内容(范围蔓延)?

    输出:通过或列出具体的规范差距。
    """,
    toolsets=['file']
)
```

### Quality Reviewer Context Example

```python
delegate_task(
    goal="审查任务 1 实现的代码质量",
    context="""
    待审查文件:
    - src/models/user.py
    - tests/models/test_user.py

    检查项:
    - [ ] 是否遵循项目约定和风格?
    - [ ] 错误处理是否恰当?
    - [ ] 变量/函数名是否清晰?
    - [ ] 测试覆盖是否充分?
    - [ ] 是否有明显 bug 或遗漏的边界情况?
    - [ ] 是否有安全问题?

    输出格式:
    - 严重问题:[必须修复才能继续]
    - 重要问题:[应该修复]
    - 次要问题:[可选]
    - 裁定:批准或要求修改
    """,
    toolsets=['file']
)
```

### Final Integration Reviewer Context Example

```python
delegate_task(
    goal="审查整个实现的一致性和集成问题",
    context="""
    计划中的所有任务已完成。审查完整实现:
    - 所有组件是否能协同工作?
    - 任务间是否存在不一致?
    - 所有测试是否通过?
    - 是否可以合并?

    请运行完整测试套件:
    pytest tests/ -q

    并检查代码风格:
    flake8 src/ tests/
    """,
    toolsets=['terminal', 'file']
)
```

### Handling Common Issues in Chinese-Language Projects

#### 1. Encoding Issues

```python
# State the file encoding explicitly in the context
context = """
文件编码:UTF-8
请确保所有文件使用 UTF-8 编码保存
"""
```

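Stating the encoding in the context is a request, not a guarantee, so the controller may also want to verify it after a task completes. The check below is a sketch: `find_non_utf8_files` is a hypothetical helper, and the demo writes two throwaway files (one UTF-8, one GBK) to a temporary directory purely for illustration.

```python
import tempfile
from pathlib import Path


def find_non_utf8_files(paths) -> list[str]:
    """Return the paths whose bytes do not decode as UTF-8."""
    offenders = []
    for path in paths:
        try:
            Path(path).read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            offenders.append(str(path))
    return offenders


# Demo: one file saved as UTF-8, one saved as GBK.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.py"
    bad = Path(tmp) / "bad.py"
    good.write_bytes("# 用户模型\n".encode("utf-8"))
    bad.write_bytes("# 用户模型\n".encode("gbk"))
    offender_names = [Path(p).name for p in find_non_utf8_files([good, bad])]
```

A non-empty result would be handed to a fix subagent as a specific instruction, in line with the failure-handling guidance earlier in this document.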
#### 2. Path Issues

```python
# Use absolute paths to avoid ambiguity
context = """
项目路径:/home/ubuntu/projects/my-project
测试路径:/home/ubuntu/projects/my-project/tests
"""
```

#### 3. Dependency Issues

```python
# Pin dependency versions explicitly
context = """
Python 版本:3.11
依赖:
- flask==2.3.2
- bcrypt==4.0.1
- pytest==7.4.0
"""
```