agent-skills/software-development/subagent-driven-development/SKILL.md
Hermes Agent ccc63d1e70 first commit
2026-05-10 13:52:46 +08:00


name: subagent-driven-development
description: Execute plans via delegate_task subagents (2-stage review).
version: 1.1.0
author: Hermes Agent (adapted from obra/superpowers)
license: MIT
metadata:
  hermes:
    tags: [delegation, subagent, implementation, workflow, parallel]
    related_skills: [writing-plans, requesting-code-review, test-driven-development]

Subagent-Driven Development

Overview

Execute implementation plans by dispatching fresh subagents per task with systematic two-stage review.

Core principle: Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration.

When to Use

Use this skill when:

  • You have an implementation plan (from the writing-plans skill or from user requirements)
  • Tasks are mostly independent
  • Quality and spec compliance are important
  • You want automated review between tasks

Advantages over manual execution:

  • Fresh context per task (no confusion from accumulated state)
  • Automated review process catches issues early
  • Consistent quality checks across all tasks
  • Subagents can ask questions before starting work

The Process

1. Read and Parse Plan

Read the plan file. Extract ALL tasks with their full text and context upfront. Create a todo list:

# Read the plan
read_file("docs/plans/feature-plan.md")

# Create todo list with all tasks
todo([
    {"id": "task-1", "content": "Create User model with email field", "status": "pending"},
    {"id": "task-2", "content": "Add password hashing utility", "status": "pending"},
    {"id": "task-3", "content": "Create login endpoint", "status": "pending"},
])

Key: Read the plan ONCE. Extract everything. Don't make subagents read the plan file — provide the full task text directly in context.
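
The "read once, extract everything" step can be sketched in plain Python. This is a minimal sketch under a loud assumption: the plan marks each task with a "### Task N: Title" heading. That heading format, and the extract_tasks helper itself, are hypothetical and not something the plan format guarantees.

```python
import re

# Hypothetical plan format: each task starts with a "### Task N: Title" heading.
TASK_HEADING = re.compile(r"^### Task (\d+): (.+)$", re.MULTILINE)


def extract_tasks(plan_text):
    """Split a plan into tasks, keeping each task's full text for later dispatch."""
    matches = list(TASK_HEADING.finditer(plan_text))
    tasks = []
    for i, match in enumerate(matches):
        # A task's body runs until the next task heading (or the end of the plan).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(plan_text)
        tasks.append({
            "id": f"task-{match.group(1)}",
            "content": match.group(2).strip(),
            "status": "pending",
            "full_text": plan_text[match.start():end].strip(),
        })
    return tasks


plan = """# Feature plan

### Task 1: Create User model with email field
- Create: src/models/user.py

### Task 2: Add password hashing utility
- Create: src/utils/hashing.py
"""

tasks = extract_tasks(plan)
for task in tasks:
    print(task["id"], "-", task["content"])
```

The id, content, and status keys mirror the todo entries above; full_text is an extra field the controller would keep for itself and paste into each subagent's context, so that no subagent has to open the plan file.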

2. Per-Task Workflow

For EACH task in the plan:

Step 1: Dispatch Implementer Subagent

Use delegate_task with complete context:

delegate_task(
    goal="Implement Task 1: Create User model with email and password_hash fields",
    context="""
    TASK FROM PLAN:
    - Create: src/models/user.py
    - Add User class with email (str) and password_hash (str) fields
    - Use bcrypt for password hashing
    - Include __repr__ for debugging

    FOLLOW TDD:
    1. Write failing test in tests/models/test_user.py
    2. Run: pytest tests/models/test_user.py -v (verify FAIL)
    3. Write minimal implementation
    4. Run: pytest tests/models/test_user.py -v (verify PASS)
    5. Run: pytest tests/ -q (verify no regressions)
    6. Commit: git add -A && git commit -m "feat: add User model with password hashing"

    PROJECT CONTEXT:
    - Python 3.11, Flask app in src/app.py
    - Existing models in src/models/
    - Tests use pytest, run from project root
    - bcrypt already in requirements.txt
    """,
    toolsets=['terminal', 'file']
)

Step 2: Dispatch Spec Compliance Reviewer

After the implementer completes, verify against the original spec:

delegate_task(
    goal="Review if implementation matches the spec from the plan",
    context="""
    ORIGINAL TASK SPEC:
    - Create src/models/user.py with User class
    - Fields: email (str), password_hash (str)
    - Use bcrypt for password hashing
    - Include __repr__

    CHECK:
    - [ ] All requirements from spec implemented?
    - [ ] File paths match spec?
    - [ ] Function signatures match spec?
    - [ ] Behavior matches expected?
    - [ ] Nothing extra added (no scope creep)?

    OUTPUT: PASS or list of specific spec gaps to fix.
    """,
    toolsets=['file']
)

If spec issues are found: have the implementer subagent (or a fresh one) fix the gaps, then re-run the spec review. Continue only when the implementation is spec-compliant.

Step 3: Dispatch Code Quality Reviewer

After spec compliance passes:

delegate_task(
    goal="Review code quality for Task 1 implementation",
    context="""
    FILES TO REVIEW:
    - src/models/user.py
    - tests/models/test_user.py

    CHECK:
    - [ ] Follows project conventions and style?
    - [ ] Proper error handling?
    - [ ] Clear variable/function names?
    - [ ] Adequate test coverage?
    - [ ] No obvious bugs or missed edge cases?
    - [ ] No security issues?

    OUTPUT FORMAT:
    - Critical Issues: [must fix before proceeding]
    - Important Issues: [should fix]
    - Minor Issues: [optional]
    - Verdict: APPROVED or REQUEST_CHANGES
    """,
    toolsets=['file']
)

If quality issues are found: have the implementer subagent (or a fresh one) fix them, then re-run the quality review. Continue only when the verdict is APPROVED.
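
The OUTPUT FORMAT requested from the quality reviewer is regular enough to check mechanically. The following is a minimal sketch, assuming the reviewer returns each labelled field on its own line; parse_quality_review is a hypothetical helper and the sample report is made up for illustration.

```python
import re

# Matches lines such as "- Verdict: APPROVED" (the leading dash is optional).
FIELD = re.compile(
    r"^\s*-?\s*(Critical Issues|Important Issues|Minor Issues|Verdict):\s*(.*)$"
)


def parse_quality_review(report):
    """Collect the labelled lines of a reviewer report into a dict."""
    fields = {}
    for line in report.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    # Anything other than an explicit APPROVED counts as not approved.
    fields["approved"] = fields.get("Verdict") == "APPROVED"
    return fields


report = """
- Critical Issues: none
- Important Issues: magic number 8, extract to constant
- Minor Issues: none
- Verdict: REQUEST_CHANGES
"""

result = parse_quality_review(report)
print(result["approved"], "|", result["Important Issues"])
```

Defaulting to "not approved" when the verdict line is missing or malformed keeps a garbled report from silently passing the gate.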

Step 4: Mark Complete

todo([{"id": "task-1", "content": "Create User model with email field", "status": "completed"}], merge=True)
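
Taken together, Steps 1 to 4 form a small control loop. The sketch below shows one way a controller might order them, assuming three hypothetical callables that wrap delegate_task: implement(task, feedback), review_spec(task), and review_quality(task), where each review returns a (passed, feedback) pair. The max_rounds cap is also an assumption; the workflow itself only says to repeat until approved.

```python
def run_task(task, implement, review_spec, review_quality, max_rounds=3):
    """Run one task through implement -> spec review -> quality review.

    All three callables are hypothetical wrappers around delegate_task.
    """
    implement(task, feedback=None)
    # Spec compliance comes first; quality review starts only after it passes.
    for review in (review_spec, review_quality):
        for _ in range(max_rounds):
            passed, feedback = review(task)
            if passed:
                break
            # Dispatch a fix with the reviewer's findings, then re-review.
            implement(task, feedback=feedback)
        else:
            raise RuntimeError(
                f"{task['id']}: review still failing after {max_rounds} rounds"
            )
    return {**task, "status": "completed"}
```

The returned dict has the same shape as a todo entry, so it could be passed straight to todo([...], merge=True). Raising after max_rounds is one way to surface a stuck task to the human instead of looping indefinitely.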

3. Final Review

After ALL tasks are complete, dispatch a final integration reviewer:

delegate_task(
    goal="Review the entire implementation for consistency and integration issues",
    context="""
    All tasks from the plan are complete. Review the full implementation:
    - Do all components work together?
    - Any inconsistencies between tasks?
    - All tests passing?
    - Ready for merge?
    """,
    toolsets=['terminal', 'file']
)

4. Verify and Commit

# Run full test suite
pytest tests/ -q

# Review all changes
git diff --stat

# Final commit if needed
git add -A && git commit -m "feat: complete [feature name] implementation"

Task Granularity

Each task = 2-5 minutes of focused work.

Too big:

  • "Implement user authentication system"

Right size:

  • "Create User model with email and password fields"
  • "Add password hashing function"
  • "Create login endpoint"
  • "Add JWT token generation"
  • "Create registration endpoint"

Red Flags — Never Do These

  • Start implementation without a plan
  • Skip reviews (spec compliance OR code quality)
  • Proceed with unfixed critical/important issues
  • Dispatch multiple implementation subagents for tasks that touch the same files
  • Make a subagent read the plan file (provide the full task text in context instead)
  • Skip scene-setting context (subagent needs to understand where the task fits)
  • Ignore subagent questions (answer before letting them proceed)
  • Accept "close enough" on spec compliance
  • Skip review loops (reviewer found issues → implementer fixes → review again)
  • Let the implementer's self-review replace an actual review (both are needed)
  • Start code quality review before spec compliance is PASS (wrong order)
  • Move to next task while either review has open issues
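
The rule against dispatching parallel implementers that touch the same files can be checked before dispatch. This is a minimal sketch, assuming the controller has already recorded which files each task will create or modify; the task_files mapping and the paths for task-2 and task-3 are hypothetical.

```python
from itertools import combinations


def conflicting_pairs(task_files):
    """Return (task_a, task_b, shared_files) for every pair of tasks that overlap."""
    conflicts = []
    for (a, files_a), (b, files_b) in combinations(task_files.items(), 2):
        shared = set(files_a) & set(files_b)
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts


# Hypothetical mapping of task id -> files the task is expected to touch.
task_files = {
    "task-1": ["src/models/user.py", "tests/models/test_user.py"],
    "task-2": ["src/utils/hashing.py"],
    "task-3": ["src/app.py", "src/models/user.py"],
}

conflicts = conflicting_pairs(task_files)
print(conflicts)
# → [('task-1', 'task-3', ['src/models/user.py'])]
```

Here task-1 and task-3 would need to run one after the other, while task-2 shares no files with either and could be dispatched alongside one of them.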

Handling Issues

If Subagent Asks Questions

  • Answer clearly and completely
  • Provide additional context if needed
  • Don't rush them into implementation

If Reviewer Finds Issues

  • Implementer subagent (or a new one) fixes them
  • Reviewer reviews again
  • Repeat until approved
  • Don't skip the re-review

If Subagent Fails a Task

  • Dispatch a new fix subagent with specific instructions about what went wrong
  • Don't try to fix manually in the controller session (context pollution)

Efficiency Notes

Why fresh subagent per task:

  • Prevents context pollution from accumulated state
  • Each subagent gets clean, focused context
  • No confusion from prior tasks' code or reasoning

Why two-stage review:

  • Spec review catches under/over-building early
  • Quality review ensures the implementation is well-built
  • Catches issues before they compound across tasks

Cost trade-off:

  • More subagent invocations (implementer + 2 reviewers per task)
  • But catches issues early (cheaper than debugging compounded problems later)
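
The invocation count can be estimated with simple arithmetic. This is a rough sketch, assuming each fix round costs one fix dispatch plus one re-review and that the run ends with a single final integration review; fix_rounds is a hypothetical parameter.

```python
def subagent_invocations(num_tasks, fix_rounds=0):
    """Estimate the number of delegate_task calls for a plan.

    Per task: 1 implementer + 2 reviewers.
    Per fix round: 1 fix dispatch + 1 re-review.
    Per run: 1 final integration review.
    """
    return num_tasks * 3 + fix_rounds * 2 + 1


# Five tasks with two fix rounds in total (one spec fix, one quality fix).
print(subagent_invocations(5, fix_rounds=2))
# → 20
```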

Integration with Other Skills

With writing-plans

This skill EXECUTES plans created by the writing-plans skill:

  1. User requirements → writing-plans → implementation plan
  2. Implementation plan → subagent-driven-development → working code

With test-driven-development

Implementer subagents should follow TDD:

  1. Write failing test first
  2. Implement minimal code
  3. Verify test passes
  4. Commit

Include TDD instructions in every implementer context.
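
One way to make sure the TDD steps are never left out is to assemble every implementer context through a single helper. The sketch below is an assumption about how a controller might do that; build_implementer_context is a hypothetical function, and the section labels simply follow the implementer example in Step 1.

```python
def build_implementer_context(task_text, test_path, project_context):
    """Assemble an implementer context that always carries the TDD steps."""
    tdd_steps = [
        f"Write failing test in {test_path}",
        f"Run: pytest {test_path} -v (verify FAIL)",
        "Write minimal implementation",
        f"Run: pytest {test_path} -v (verify PASS)",
        "Run: pytest tests/ -q (verify no regressions)",
        "Commit the change",
    ]
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(tdd_steps, 1))
    return (
        f"TASK FROM PLAN:\n{task_text}\n\n"
        f"FOLLOW TDD:\n{numbered}\n\n"
        f"PROJECT CONTEXT:\n{project_context}\n"
    )


context = build_implementer_context(
    task_text="- Create: src/models/user.py",
    test_path="tests/models/test_user.py",
    project_context="- Python 3.11, Flask app in src/app.py",
)
print(context)
```

The resulting string would be passed as the context argument of delegate_task, with the goal written separately for each task.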

With requesting-code-review

The two-stage review process IS the code review. For final integration review, use the requesting-code-review skill's review dimensions.

With systematic-debugging

If a subagent encounters bugs during implementation:

  1. Follow systematic-debugging process
  2. Find root cause before fixing
  3. Write regression test
  4. Resume implementation

Example Workflow

[Read plan: docs/plans/auth-feature.md]
[Create todo list with 5 tasks]

--- Task 1: Create User model ---
[Dispatch implementer subagent]
  Implementer: "Should email be unique?"
  You: "Yes, email must be unique"
  Implementer: Implemented, 3/3 tests passing, committed.

[Dispatch spec reviewer]
  Spec reviewer: ✅ PASS — all requirements met

[Dispatch quality reviewer]
  Quality reviewer: ✅ APPROVED — clean code, good tests

[Mark Task 1 complete]

--- Task 2: Password hashing ---
[Dispatch implementer subagent]
  Implementer: No questions, implemented, 5/5 tests passing.

[Dispatch spec reviewer]
  Spec reviewer: ❌ Missing: password strength validation (spec says "min 8 chars")

[Implementer fixes]
  Implementer: Added validation, 7/7 tests passing.

[Dispatch spec reviewer again]
  Spec reviewer: ✅ PASS

[Dispatch quality reviewer]
  Quality reviewer: Important: Magic number 8, extract to constant
  Implementer: Extracted MIN_PASSWORD_LENGTH constant
  Quality reviewer: ✅ APPROVED

[Mark Task 2 complete]

... (continue for all tasks)

[After all tasks: dispatch final integration reviewer]
[Run full test suite: all passing]
[Done!]

Remember

Fresh subagent per task
Two-stage review every time
Spec compliance FIRST
Code quality SECOND
Never skip reviews
Catch issues early

Quality is not an accident. It's the result of systematic process.

Further reading (load when relevant)

When the orchestration involves significant context usage, long review loops, or complex validation checkpoints, load these references for the specific discipline:

  • references/context-budget-discipline.md — Four-tier context degradation model (PEAK / GOOD / DEGRADING / POOR), read-depth rules that scale with context window size, and early warning signs of silent degradation. Load when a run will clearly consume significant context (multi-phase plans, many subagents, large artifacts).
  • references/gates-taxonomy.md — The four canonical gate types (Pre-flight, Revision, Escalation, Abort) with behavior, recovery, and examples. Load when designing or reviewing any workflow that has validation checkpoints — use the vocabulary explicitly so each gate has defined entry, failure behavior, and resumption rules.

Both references adapted from gsd-build/get-shit-done (MIT © 2025 Lex Christopherson).

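Chinese Context Examples

Implementer Subagent Context Example

delegate_task(
    goal="Implement Task 1: Create User model with email and password fields",
    context="""
    TASK FROM PLAN:
    - Create: src/models/user.py
    - Add User class with email (str) and password_hash (str) fields
    - Use bcrypt for password hashing
    - Include a __repr__ method for debugging

    FOLLOW TDD:
    1. Write a failing test in tests/models/test_user.py
    2. Run: pytest tests/models/test_user.py -v (verify FAIL)
    3. Write the minimal implementation
    4. Run: pytest tests/models/test_user.py -v (verify PASS)
    5. Run: pytest tests/ -q (verify no regressions)
    6. Commit: git add -A && git commit -m "feat: add User model with password hashing"

    PROJECT CONTEXT:
    - Python 3.11, Flask app in src/app.py
    - Existing models in src/models/
    - Tests use pytest, run from the project root
    - bcrypt is already in requirements.txt
    """,
    toolsets=['terminal', 'file']
)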

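Spec Reviewer Context Example

delegate_task(
    goal="Review whether the implementation matches the spec from the plan",
    context="""
    ORIGINAL TASK SPEC:
    - Create src/models/user.py with a User class
    - Fields: email (str), password_hash (str)
    - Use bcrypt for password hashing
    - Include a __repr__ method

    CHECK:
    - [ ] Are all requirements from the spec implemented?
    - [ ] Do file paths match the spec?
    - [ ] Do function signatures match the spec?
    - [ ] Does the behavior match what is expected?
    - [ ] Was anything extra added (scope creep)?

    OUTPUT: PASS or a list of specific spec gaps.
    """,
    toolsets=['file']
)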

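Quality Reviewer Context Example

delegate_task(
    goal="Review code quality for the Task 1 implementation",
    context="""
    FILES TO REVIEW:
    - src/models/user.py
    - tests/models/test_user.py

    CHECK:
    - [ ] Follows project conventions and style?
    - [ ] Appropriate error handling?
    - [ ] Clear variable/function names?
    - [ ] Adequate test coverage?
    - [ ] Any obvious bugs or missed edge cases?
    - [ ] Any security issues?

    OUTPUT FORMAT:
    - Critical Issues: [must fix before proceeding]
    - Important Issues: [should fix]
    - Minor Issues: [optional]
    - Verdict: APPROVED or REQUEST_CHANGES
    """,
    toolsets=['file']
)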

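Final Integration Reviewer Context Example

delegate_task(
    goal="Review the entire implementation for consistency and integration issues",
    context="""
    All tasks from the plan are complete. Review the full implementation:
    - Do all components work together?
    - Are there any inconsistencies between tasks?
    - Are all tests passing?
    - Is it ready to merge?

    Run the full test suite:
    pytest tests/ -q

    And check code style:
    flake8 src/ tests/
    """,
    toolsets=['terminal', 'file']
)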

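Handling Common Chinese-Language Issues

1. Encoding Issues

# State the encoding explicitly in the context
context = """
File encoding: UTF-8
Make sure all files are saved with UTF-8 encoding
"""

2. Path Issues

# Use absolute paths to avoid ambiguity
context = """
Project path: /home/ubuntu/projects/my-project
Test path: /home/ubuntu/projects/my-project/tests
"""

3. Dependency Issues

# Pin dependency versions explicitly
context = """
Python version: 3.11
Dependencies:
- flask==2.3.2
- bcrypt==4.0.1
- pytest==7.4.0
"""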