agent-skills/software-development/hermes-agent-skill-authoring/references/skill-routing-optimization.md
Hermes Agent ccc63d1e70 first commit
2026-05-10 13:52:46 +08:00

# Skill Description Optimization for Routing
Based on [SkillRouter (arXiv:2603.22455)](https://arxiv.org/abs/2603.22455) methodology.
## Core Finding
In large, overlapping skill pools, **the full skill text is the critical routing signal**, not just the name and metadata. Hiding the skill body causes a 31-44 percentage-point drop in routing accuracy at 80K scale. For Hermes, at ~120 skills, the impact is smaller but still meaningful for overlapping clusters.
## Description Writing Rules
### 1. Trigger Words (Required)
Every description must include explicit trigger words — the exact phrases users would say.
```
Bad: "Generates professional infographics."
Good: "Generates infographics. Trigger words: infographic, 信息图, 可视化, visual summary."
```
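A description can be checked mechanically against the phrases users are expected to say. The sketch below is illustrative only: `missing_trigger_words` is a hypothetical helper, not part of Hermes, and it assumes plain case-insensitive substring matching is good enough.

```python
def missing_trigger_words(description: str, trigger_words: list[str]) -> list[str]:
    """Return the trigger words that do not appear in the description.

    Hypothetical helper: matching is a case-insensitive substring test,
    so "infographic" also matches "infographics".
    """
    lowered = description.casefold()
    return [word for word in trigger_words if word.casefold() not in lowered]
```

An empty result means every expected phrase is present; anything returned is a phrase to add to the description.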
### 2. Negative Boundaries ("Don't use for")
For skills in overlapping domains, specify what they DON'T cover.
```
Good: "Trigger words: 学术论文 (academic papers), 文献调研 (literature survey). Not for: general search (use web_search)."
```
### 3. No Competitive Recommendations
Never recommend skill B inside skill A's description.
```
Bad: "For multi-source search, prefer sn-search-academic over arxiv."
Good: Each skill describes itself independently.
```
### 4. No Implementation Details
Use user-facing concepts, not internal names.
```
Bad: "Requires SN_API_KEY via sn-image-base's sn_agent_runner.py."
Good: "Requires SenseNova API."
```
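Leaked implementation details tend to have a recognizable shape, so a rough lint is possible. The following is a minimal sketch under stated assumptions: the two regular expressions are hypothetical heuristics (ALL_CAPS names containing an underscore look like environment variables, and tokens ending in `.py`, `.js`, `.ts`, or `.sh` look like internal script names), and they will miss other kinds of leaks.

```python
import re

# Heuristic, not exhaustive: ALL_CAPS tokens with an underscore (env-var style).
ENV_VAR = re.compile(r"\b[A-Z][A-Z0-9]*_[A-Z0-9_]+\b")
# Heuristic, not exhaustive: tokens ending in a common script extension.
SCRIPT_FILE = re.compile(r"\b[\w-]+\.(?:py|js|ts|sh)\b")


def implementation_leaks(description: str) -> list[str]:
    """Return tokens in the description that look like internal names."""
    return ENV_VAR.findall(description) + SCRIPT_FILE.findall(description)
```

Applied to the "Bad" example above, this flags `SN_API_KEY` and `sn_agent_runner.py`; the "Good" example produces no hits.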
### 5. Pipeline Relationships (for sub-skills)
If a skill is part of a pipeline, label its stage.
```
Good: "[sn-deep-research sub-stage] Runs a single-dimension search according to plan.json."
Good: "[sn-deep-research final stage] Writes the final report based on synthesis.md."
```
### 6. Differentiation Over Function Listing
When multiple skills serve similar goals, describe what makes THIS one distinct.
```
Bad: "Generates infographics" (both sn-infographic and baoyu-infographic say this)
Good: sn-infographic: "87 layouts, with multi-round automated review and refinement."
      baoyu-infographic: "21 layouts, with an interactive user confirmation flow."
```
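One way to spot descriptions that only list a shared function is to group skills whose descriptions are identical after normalization. This is a sketch, not an existing Hermes tool: `undifferentiated_groups` and its input mapping (skill name to description) are hypothetical, and it catches only exact matches after case, whitespace, and trailing-period normalization, not paraphrases.

```python
from collections import defaultdict


def undifferentiated_groups(descriptions: dict[str, str]) -> list[list[str]]:
    """Group skill names whose normalized descriptions are identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, text in descriptions.items():
        # Normalize case and whitespace, then drop trailing periods.
        key = " ".join(text.casefold().split()).rstrip(".。")
        groups[key].append(name)
    # Only groups with more than one skill indicate missing differentiation.
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Each returned group is a set of skills that need a distinguishing detail, such as layout count or review flow in the example above.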
## Overlap Detection
"Overlap" = same user intent AND same implementation approach. Two skills are **complementary** (keep both) when:
- Same output type, different tech stack (Python vs Node.js)
- Same domain, different complexity level (lightweight vs full-featured)
- Same tool, different workflow (quick vs QA-heavy)
Examples of complementary pairs that should NOT be merged:
- `pptx-generator` (python-pptx) + `powerpoint` (pptxgenjs)
- `WeChat-article-reader` (Python/Markdown) + `wechat-article-extractor` (Node.js/JSON)
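
The overlap definition above can be written as a small decision rule. The sketch below assumes each skill has been hand-labeled with an intent and an approach; `SkillProfile`, its fields, and `relation` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillProfile:
    name: str
    intent: str    # hand-assigned label for what the user wants
    approach: str  # hand-assigned label: tech stack, complexity level, or workflow


def relation(a: SkillProfile, b: SkillProfile) -> str:
    """Classify a pair as 'overlap', 'complementary', or 'unrelated'."""
    if a.intent != b.intent:
        return "unrelated"
    # Same intent: only an identical approach counts as true overlap.
    return "overlap" if a.approach == b.approach else "complementary"
```

With these labels, `pptx-generator` (python-pptx) and `powerpoint` (pptxgenjs) share an intent but differ in approach, so the pair is classified as complementary and both are kept.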
## Usage Measurement
To find which skills are actually used:
1. Search the `messages` table in `~/.hermes/state.db` for `skill_view` tool results
2. Search `~/.hermes/sessions/*.jsonl` for `skill_view` function calls
3. `.json` files in sessions/ are request dumps — no message history
4. Auto-loaded skills (via system prompt matching) don't generate `skill_view` calls — counts are lower bounds
```sql
-- Find skill_view results in SQLite
SELECT content FROM messages
WHERE role = 'tool'
AND content LIKE '%"skill_dir"%'
AND content LIKE '%"success": true%';
```
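For the session logs, a tally of `skill_view` calls can be sketched in Python. The JSONL schema is not documented here, so this is a guess: it assumes a call appears somewhere in each record as an object with `"name": "skill_view"` and an `"arguments"` field (a dict or a JSON-encoded string) whose `"name"` key holds the skill name. Both key names and the helper functions are assumptions; adjust them to the actual log format.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Iterable, Iterator


def iter_dicts(node: Any) -> Iterator[dict]:
    """Yield every dict nested anywhere inside a parsed JSON value."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_dicts(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_dicts(item)


def count_skill_views(lines: Iterable[str]) -> Counter:
    """Count skill_view calls per skill across JSONL lines (assumed schema)."""
    counts: Counter = Counter()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or truncated lines
        for obj in iter_dicts(record):
            if obj.get("name") != "skill_view":
                continue
            args = obj.get("arguments", {})
            if isinstance(args, str):  # arguments may be a JSON-encoded string
                try:
                    args = json.loads(args)
                except json.JSONDecodeError:
                    args = {}
            # "name" as the skill-name key is an assumption about the log format.
            skill = args.get("name", "<unknown>") if isinstance(args, dict) else "<unknown>"
            counts[skill] += 1
    return counts


def count_in_sessions(sessions_dir: Path) -> Counter:
    """Sum skill_view counts over every *.jsonl file in a sessions directory."""
    total: Counter = Counter()
    for path in sessions_dir.glob("*.jsonl"):
        with path.open(encoding="utf-8") as handle:
            total += count_skill_views(handle)
    return total
```

Calling `count_in_sessions(Path.home() / ".hermes" / "sessions")` and then `.most_common()` on the result gives a ranked list. As noted above, auto-loaded skills never appear, so the counts remain lower bounds.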
## Pool Size vs Description Quality
At Hermes's current scale (~120 skills):
- **Reducing pool size** (removing unused skills) has the highest impact
- **Improving descriptions** helps for the remaining overlapping clusters
- **Code-level changes** (prompt restructuring) are NOT worth the complexity
The optimal strategy: delete genuinely unused skills → fix descriptions for overlapping pairs → stop.