ephron_ren/agent-skills

Hermes Agent ccc63d1e70 first commit

2026-05-10 13:52:46 +08:00

name: sn-infographic

description: Generates professional infographics. 87 layouts x 66 styles, with multi-round automatic review and optimization. Requires the SenseNova API. Trigger words: infographic, 信息图, 可视化, visual summary. Suited to scenarios that need automated quality control. Without the SenseNova API, use baoyu-infographic.

metadata:

project	tier	category	priority	user_visible
SenseNova-Skills	1	scene	9	true

triggers:

infographic
information graphic
infographics generation
visual summary
data visualization
visual explanation
diagram
生成信息图
信息图生成
生成 infographic
信息图表
图表生成
数据可视化
图解

sn-infographic

Infographic generation scene skill (tier 1). It relies on the sn-image-generate, sn-image-recognize, and sn-text-optimize tools provided by sn-image-base (tier 0).

Features:

Evaluation of prompt quality (auto mode)
Prompt expansion (force/auto mode)
Multiple rounds of image generation and VLM review
Output the best result based on quality ranking

Input Specification

Parameter	Type	Default Value	Description
`user_prompt`	string	Required	The user's original request
`max_rounds`	int	`1`	Maximum number of generation rounds
`output_mode`	string	`friendly`	Output mode: friendly / verbose
`prompts_expand_mode`	string	`auto`	Prompt expansion strategy: auto / force / disable
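The defaults in the table can be applied in one normalization step. The following is a minimal sketch rather than the skill's actual code: `SkillParams` and `normalize_params` are hypothetical names, and the `max_rounds >= 1` check is an assumption the table does not state.

```python
from dataclasses import dataclass

VALID_OUTPUT_MODES = {"friendly", "verbose"}
VALID_EXPAND_MODES = {"auto", "force", "disable"}


@dataclass(frozen=True)
class SkillParams:
    """Hypothetical container for the four documented parameters."""
    user_prompt: str
    max_rounds: int = 1
    output_mode: str = "friendly"
    prompts_expand_mode: str = "auto"


def normalize_params(raw: dict) -> SkillParams:
    """Apply the documented defaults and reject values outside the table."""
    prompt = str(raw.get("user_prompt", "")).strip()
    if not prompt:
        raise ValueError("user_prompt is required")
    params = SkillParams(
        user_prompt=prompt,
        max_rounds=int(raw.get("max_rounds", 1)),
        output_mode=raw.get("output_mode", "friendly"),
        prompts_expand_mode=raw.get("prompts_expand_mode", "auto"),
    )
    if params.max_rounds < 1:  # assumption: at least one round is required
        raise ValueError("max_rounds must be >= 1")
    if params.output_mode not in VALID_OUTPUT_MODES:
        raise ValueError(f"unknown output_mode: {params.output_mode}")
    if params.prompts_expand_mode not in VALID_EXPAND_MODES:
        raise ValueError(f"unknown prompts_expand_mode: {params.prompts_expand_mode}")
    return params
```

Rejecting unknown values early keeps a malformed request from starting a Worker Agent.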

API Configuration

All API calls in this skill are executed through sn_agent_runner.py of the sn-image-base skill. Authentication parameters are resolved with the default precedence (CLI > environment variables > built-in defaults), so they do not need to be passed explicitly.

Call Type	Tool	Authentication Parameters	Description
LLM	sn-text-optimize (evaluation/expansion)	By default reads `SN_TEXT_API_KEY` -> `SN_CHAT_API_KEY` -> `SN_API_KEY`	Built-in default points to the SenseNova internal network service
VLM	sn-image-recognize (image review)	By default reads `SN_VISION_API_KEY` -> `SN_CHAT_API_KEY` -> `SN_API_KEY`	Built-in default points to the SenseNova internal network service
Image Generation	sn-image-generate	By default reads `SN_IMAGE_GEN_API_KEY` -> `SN_API_KEY`; `SN_IMAGE_GEN_API_KEY` is only needed for an image-specific override	By default uses the image generation configuration of `sn-image-base`
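The lookup order in the table can be read as a simple fallback chain. The sketch below illustrates that order only. It is not the code in sn_agent_runner.py, and `KEY_CHAINS` and `resolve_api_key` are hypothetical names.

```python
import os

# Environment variable order per call type, copied from the table above.
KEY_CHAINS = {
    "llm": ("SN_TEXT_API_KEY", "SN_CHAT_API_KEY", "SN_API_KEY"),
    "vlm": ("SN_VISION_API_KEY", "SN_CHAT_API_KEY", "SN_API_KEY"),
    "image": ("SN_IMAGE_GEN_API_KEY", "SN_API_KEY"),
}


def resolve_api_key(call_type, cli_value=None, env=None):
    """Return the first key found: CLI value first, then the env chain."""
    env = os.environ if env is None else env
    if cli_value:
        return cli_value
    for name in KEY_CHAINS[call_type]:
        value = env.get(name)
        if value:
            return value
    return None  # caller would then rely on the built-in defaults
```

With only `SN_API_KEY` set, all three call types resolve to the same key, which matches the note that the image-specific variable is only needed as an override.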

If you encounter MissingApiKeyError or need to specify a model, pass the values explicitly via CLI parameters. For the parameter reference, see $SN_IMAGE_BASE/references/api_spec.md.

$SN_IMAGE_BASE path explanation: $SN_IMAGE_BASE is the installation directory of the sn-image-base skill (the directory where its SKILL.md exists). The agent can locate this path by looking up the skill name sn-image-base in the list of installed skills.

Architecture: Main Agent + Worker Agent

This skill uses a two-tier agent architecture:

Role	Responsibility
Main Agent	Receive user request, normalize parameters, send preflight, start Worker, collect results, send text and images to user
Worker Agent	Execute orchestration loop (expand → multiple rounds of generation + review → sort), return structured JSON

Responsibility Boundaries:

Worker Agent does not send any messages to the user directly, only returns structured JSON
Main Agent is responsible for sending all user-visible messages
Worker Agent's last message must be exactly the JSON string defined in the Return Contract, and nothing else
Worker Agent's internal VLM calls always execute directly, without spawning subagents

Workflow

Main Agent Workflow

Extract user_prompt, max_rounds (default 1), output_mode (default friendly), and prompts_expand_mode (default auto) from user request
Send uniform preflight message: "Using sn-infographic skill to generate infographic, please wait..."
Start Worker Agent (Sub-Agent), passing in complete parameters and working directory
When Worker Agent returns status=ok and need_main_agent_send=true:
- max_rounds = 1: Send a one-sentence description of the image content, then send the rank=1 single image
- max_rounds > 1, friendly mode: Generate a one-sentence natural language description based on result and violations, send the evaluation text, then send the rank=1 single image
- max_rounds > 1, verbose mode: Send complete text summary message, then send all images in rank order to the user
If Worker Agent returns status=error, report the actual content of the error field to the user
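The dispatch rules above can be sketched as a function that plans which messages to send. This is an illustration under assumptions: `plan_messages` is a hypothetical name, the text payloads are placeholder tags rather than real message text, and `rounds` is assumed to arrive already sorted by rank.

```python
def plan_messages(worker_result: dict, max_rounds: int) -> list:
    """Return an ordered list of (kind, payload) tuples for the user."""
    if worker_result.get("status") == "error":
        # Relay the real error text unchanged.
        return [("text", worker_result.get("error", ""))]
    if worker_result.get("status") != "ok" or not worker_result.get("need_main_agent_send"):
        return []
    rounds = worker_result["rounds"]  # assumed sorted, rank 1 first
    best_image = ("image", rounds[0]["image"])
    if max_rounds == 1:
        return [("text", "one_sentence_description"), best_image]
    if worker_result.get("output_mode") == "verbose":
        # Full summary first, then every image in rank order.
        return [("text", "full_summary")] + [("image", r["image"]) for r in rounds]
    return [("text", "one_sentence_evaluation"), best_image]
```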

Worker Agent Workflow

Worker Agent receives user_prompt, max_rounds, output_mode, prompts_expand_mode, and the working directory of this skill (SN_IMAGE_INFOG).

Step 0 — Initialization

Generate task_id (using timestamp, format YYYYMMDD_HHMMSS)
Create a uniform temporary directory: /tmp/openclaw/sn-infographic/<task_id>/ as TEMP_DIR
Initialize an empty rounds list
Infer aspect_ratio (default 16:9) and image_size (default 2k) from user_prompt based on the rules in $SKILL_DIR/references/runtime-parameters.md
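As a rough illustration of the task_id format and the temporary directory layout, the helpers below derive both from a timestamp. `task_id_for` and `temp_dir_for` are hypothetical names. The sketch only computes the path, and creating it is left to the caller.

```python
from datetime import datetime
from pathlib import PurePosixPath

TEMP_ROOT = "/tmp/openclaw/sn-infographic"  # root directory named in Step 0


def task_id_for(now: datetime) -> str:
    """Format a timestamp as YYYYMMDD_HHMMSS."""
    return now.strftime("%Y%m%d_%H%M%S")


def temp_dir_for(task_id: str, root: str = TEMP_ROOT) -> PurePosixPath:
    """Build the per-task TEMP_DIR path under the shared root."""
    return PurePosixPath(root) / task_id
```

For example, `temp_dir_for(task_id_for(datetime.now()))` gives the path that the worker would then create, for instance with `os.makedirs(path, exist_ok=True)`.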

Step 1 — `prompts_expand_mode` Processing

disable mode:

Skip expand, directly use user_prompt as expanded_prompt

Assign variable and write to temporary directory:

EXPANDED_PROMPT="$USER_PROMPT"
echo "$EXPANDED_PROMPT" > "$TEMP_DIR/expanded-prompt.txt"

Record prompts_expand_skipped = true

force mode:

Directly execute Step 2

auto mode:

Call sn-text-optimize for evaluation
Parse JSON, extract required_results and optional_results
Determine logic:
- required_pass: every answer in required_results is "yes"
- optional_pass: (number of entries in optional_results with answer="yes") / (total entries) ≥ 0.6
- should_expand = not (required_pass and optional_pass)
If JSON parsing fails, default to should_expand = true (conservative strategy)
If should_expand = false: skip Step 2, assign the variable, write it to the temporary directory, and record prompts_expand_skipped = true:
```
EXPANDED_PROMPT="$USER_PROMPT"
echo "$EXPANDED_PROMPT" > "$TEMP_DIR/expanded-prompt.txt"
```
If should_expand = true: Execute Step 2
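The auto-mode decision can be sketched as follows. The shape of the evaluation JSON is an assumption here: `required_results` and `optional_results` are taken to be lists of objects with an `answer` field. Treating an empty `optional_results` list as not passing is also an assumption, chosen to stay on the conservative side.

```python
import json


def should_expand(evaluation_stdout: str, threshold: float = 0.6) -> bool:
    """Decide whether Step 2 (prompt expansion) should run in auto mode."""
    try:
        data = json.loads(evaluation_stdout)
        required = data["required_results"]
        optional = data["optional_results"]
        required_pass = all(item["answer"] == "yes" for item in required)
        yes_count = sum(1 for item in optional if item["answer"] == "yes")
        # Assumption: an empty optional list counts as not passing.
        optional_pass = bool(optional) and yes_count / len(optional) >= threshold
    except (ValueError, KeyError, TypeError):
        # Unparseable or unexpected output: expand anyway (conservative strategy).
        return True
    return not (required_pass and optional_pass)
```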

Evaluation Call (using sn-image-base's sn-text-optimize tool):

python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-text-optimize \
  --system-prompt-path "$SKILL_DIR/references/evaluation-standard.md" \
  --user-prompt "$USER_PROMPT" \
  --output-format json

Step 2 — Content Analysis + Layout & Style Selection + Prompt Expansion

2.0 Content Analysis (using sn-image-base's sn-text-optimize tool):

ANALYSIS=$(python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-text-optimize \
  --system-prompt-path "$SKILL_DIR/references/analysis-framework.md" \
  --user-prompt "$USER_PROMPT" \
  --output-format json)

Save the analysis stdout to $TEMP_DIR/analysis.json:

echo "$ANALYSIS" > "$TEMP_DIR/analysis.json"

2.1 Layout & Style Selection

Read the analysis result from $TEMP_DIR/analysis.json:

ANALYSIS=$(cat "$TEMP_DIR/analysis.json")

Using data_type, tone, and audience from the analysis, select a layout and style according to the rules in $SKILL_DIR/references/layout-style-selection.md.
Read the layout and style definition files:

LAYOUT_DEF=$(cat "$SKILL_DIR/references/layouts/<layout>.md")
STYLE_DEF=$(cat "$SKILL_DIR/references/styles/<style>.md")

If a definition file does not exist, fall back to hub-spoke + corporate-memphis.

Save the selection result to $TEMP_DIR/layout-style.json.

Format of layout-style.json:

{
  "layout": "<layout>",
  "style": "<style>"
}
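The fallback rule can be sketched as below. One point is an assumption: the source does not say whether a single missing file replaces only that half of the selection, so this sketch falls back to the hub-spoke + corporate-memphis pair as a whole. `resolve_selection` is a hypothetical name, and the `is_file` parameter exists only so the check can be swapped out.

```python
import os

FALLBACK = {"layout": "hub-spoke", "style": "corporate-memphis"}


def resolve_selection(skill_dir, layout, style, is_file=os.path.isfile):
    """Return the layout-style.json content, using the fallback pair if needed."""
    layout_path = os.path.join(skill_dir, "references", "layouts", f"{layout}.md")
    style_path = os.path.join(skill_dir, "references", "styles", f"{style}.md")
    if is_file(layout_path) and is_file(style_path):
        return {"layout": layout, "style": style}
    # Assumption: either file missing triggers the whole fallback pair.
    return dict(FALLBACK)
```

The returned dictionary has the same two keys as layout-style.json, so it can be written out with `json.dump` unchanged.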

2.2 Structured Content Generation

Read the analysis result and the structured content template, then convert user_prompt into design-ready structured content according to the template rules:

ANALYSIS=$(cat "$TEMP_DIR/analysis.json")
LAYOUT_STYLE=$(cat "$TEMP_DIR/layout-style.json")
STRUCTURED_CONTENT_TEMPLATE=$(cat "$SKILL_DIR/references/structured-content-template.md")

Follow the three phases defined in the template (High-Level Outline → Section Development → Data Integrity Check), combine the learning objectives, visual opportunities, and key data in analysis.json, generate structured content, and save it to the temporary directory:

cat > "$TEMP_DIR/structured-content.md" << 'EOF'
<Content generated based on structured-content-template.md format>
EOF

Rules: All data must be preserved exactly. Do not rewrite. Do not add information that is not in the source.

2.3 Prompt Expansion (using sn-image-base's sn-text-optimize tool):

Read structured content and layout/style selection from temporary directory, dynamically concatenate system prompt, and write to temporary file:

STRUCTURED_CONTENT=$(cat "$TEMP_DIR/structured-content.md")
LAYOUT_STYLE=$(cat "$TEMP_DIR/layout-style.json")
LAYOUT=$(echo "$LAYOUT_STYLE" | jq -r '.layout')
STYLE=$(echo "$LAYOUT_STYLE" | jq -r '.style')
LAYOUT_DEF=$(cat "$SKILL_DIR/references/layouts/${LAYOUT}.md")
STYLE_DEF=$(cat "$SKILL_DIR/references/styles/${STYLE}.md")

cat > "$TEMP_DIR/expand-system-prompt.md" << EOF
$(cat "$SKILL_DIR/references/prompts-expand-system.md")

---

## Selected Layout: $LAYOUT

$LAYOUT_DEF

---

## Selected Style: $STYLE

$STYLE_DEF

---

## Output Template Reference

$(cat "$SKILL_DIR/references/base-prompt.md")
EOF

Use the content of structured-content.md as user-prompt, read system prompt from temporary file and call sn-text-optimize:

EXPAND_JSON=$(python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-text-optimize \
  --system-prompt-path "$TEMP_DIR/expand-system-prompt.md" \
  --user-prompt "$STRUCTURED_CONTENT" \
  --output-format json)

Parse the JSON stdout, extract the result field as expanded_prompt, and write it to the temporary directory:

# Extract the result field from the captured JSON before writing it out
EXPANDED_PROMPT=$(echo "$EXPAND_JSON" | jq -r '.result')
echo "$EXPANDED_PROMPT" > "$TEMP_DIR/expanded-prompt.txt"

If parsing fails or truncation is suspected (the returned content is incomplete), terminate the workflow and return status=error with the actual error so that the Main Agent can notify the user.

Step 3 — Image Generation Loop

Run rounds 1 through max_rounds sequentially:

Generate Image (using sn-image-base's sn-image-generate tool):

python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-image-generate \
  --prompt "$EXPANDED_PROMPT" \
  --image-size "$IMAGE_SIZE" \
  --aspect-ratio "$ASPECT_RATIO" \
  --save-path "$TEMP_DIR/round_<N>.png" \
  -o json

Review Image (only executed when max_rounds > 1):

VLM configuration requirements:

When max_rounds > 1, call VLM for review
Select VLM model from OpenClaw configuration as parameter for image recognition
If no suitable VLM model exists in OpenClaw configuration:
- Notify user that current parameter combination cannot be executed
- Suggest adding VLM configuration or changing max_rounds to 1 to avoid VLM calls
If the VLM call times out or fails: do not fall back, report the real error directly

python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-image-recognize \
  --system-prompt-path "$SN_IMAGE_INFOG/references/prompts-critic-system.md" \
  --user-prompt "Evaluate the diagram in the image against the rules. Output your assessment." \
  --images "$TEMP_DIR/round_<N>.png" \
  --output-format json

System prompt comes from references/prompts-critic-system.md, user prompt is provided directly.

Save Round Result:

{
  "round": 1,
  "image": "$TEMP_DIR/round_1.png",
  "result": "PASS|FAIL",
  "violations_count": 0,
  "violations": [],
  "reasoning": "<Reasoning process, empty string when max_rounds=1>",
  "timing": {
    "image_generation": { "elapsed_seconds": 12.34, "model": "sn_image_model" },
    "vlm_review": { "elapsed_seconds": 5.67, "model": "sensenova-6.7-flash-lite" }
  }
}

Note: elapsed_seconds is read from the --output-format json return of each CLI call; image_generation.model is fixed to the hardcoded placeholder "sn_image_model" (sn-image-generate does not return the model field); vlm_review.model is read from the JSON return of sn-image-recognize. timing.vlm_review is omitted when max_rounds=1.

Early Termination Check (only executed when max_rounds > 1):

If result=PASS, immediately exit the loop, do not continue generating
If result=FAIL, continue to the next round (if there are remaining rounds)
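The loop and its early-termination rule can be sketched with the two tool calls injected as plain callables. Several details are assumptions: `run_rounds`, `generate`, and `review` are hypothetical names, `review` is assumed to return a dict with `result`, `violations`, and `reasoning`, `result` is left as `None` when no review runs because the source does not say which value is used, and a PASS on the final round is not counted as early termination.

```python
def run_rounds(max_rounds, generate, review):
    """Generate up to max_rounds images, stopping at the first PASS."""
    rounds = []
    early_terminated = False
    for n in range(1, max_rounds + 1):
        record = {
            "round": n,
            "image": generate(n),
            "result": None,  # assumption: unset when no review runs
            "violations_count": 0,
            "violations": [],
            "reasoning": "",
        }
        if max_rounds > 1:  # review only runs in multi-round mode
            verdict = review(record["image"])
            violations = list(verdict["violations"])
            record.update(
                result=verdict["result"],
                violations=violations,
                violations_count=len(violations),
                reasoning=verdict.get("reasoning", ""),
            )
        rounds.append(record)
        if max_rounds > 1 and record["result"] == "PASS":
            early_terminated = n < max_rounds
            break
    return rounds, early_terminated
```

In the real workflow, `generate` and `review` would wrap the sn-image-generate and sn-image-recognize calls shown above.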

Step 4 — Image Quality Ranking

Sort images by violations_count ascending, then by round ascending to break ties, and return the structured JSON to Main Agent.
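The ranking is a two-key sort, which can be sketched with a tuple key. `rank_rounds` is a hypothetical name, and the records are assumed to carry the `violations_count` and `round` fields from the round result shown in Step 3.

```python
def rank_rounds(rounds):
    """Order rounds best first: fewest violations, then earliest round."""
    return sorted(rounds, key=lambda r: (r["violations_count"], r["round"]))
```

The first element of the returned list is the rank=1 image.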

Return Contract

After Worker Agent completes, its last message must be exactly the following JSON string and nothing else (bare JSON, no code fences, no preceding or trailing text).

Normal Flow:

{
  "status": "ok",
  "need_main_agent_send": true,
  "output_mode": "friendly|verbose",
  "expanded_prompt": "<always contains when output_mode=verbose; value is original user_prompt when prompts_expand_skipped=true, otherwise is expanded result>",
  "prompts_expand_skipped": true,
  "early_terminated": true,
  "timing": {
    "total_elapsed_seconds": 35.12,
    "prompt_detection": { "elapsed_seconds": 2.11, "model": "sensenova-6.7-flash-lite" },
    "content_analysis": { "elapsed_seconds": 3.22, "model": "sensenova-6.7-flash-lite" },
    "prompt_expand": { "elapsed_seconds": 8.45, "model": "sensenova-6.7-flash-lite" }
  },
  "rounds": [
    {
      "round": 1,
      "image": "$TEMP_DIR/round_1.png",
      "result": "PASS|FAIL",
      "violations_count": 0,
      "violations": [],
      "reasoning": "<Reasoning process, empty string when max_rounds=1>",
      "timing": {
        "image_generation": { "elapsed_seconds": 12.34, "model": "sn_image_model" },
        "vlm_review": { "elapsed_seconds": 5.67, "model": "sensenova-6.7-flash-lite" }
      }
    }
  ]
}

Error Flow:

{
  "status": "error",
  "error": "<Actual error information>"
}

Rules:

status=ok must include need_main_agent_send: true
expanded_prompt must be present when output_mode=verbose; its value is the original user_prompt when prompts_expand_skipped=true
prompts_expand_skipped must be present (value true) when expansion is not executed; this covers two cases: prompts_expand_mode=disable, and prompts_expand_mode=auto where the evaluation passes and expansion is skipped
early_terminated must be present (value true) on early termination, and is omitted when execution completes normally
violations is an array of strings taken from the review results
reasoning is an empty string when max_rounds=1
Top-level timing contains:
- total_elapsed_seconds: Worker Agent's wall time from Step 0 to returning JSON, calculated by Worker Agent itself
- prompt_detection: Step 1 evaluation call, containing elapsed_seconds and model (read from sn-text-optimize JSON return); omitted when prompts_expand_mode=disable
- content_analysis: Step 2.0 content analysis call, containing elapsed_seconds and model (read from sn-text-optimize JSON return); omitted when expand is skipped
- prompt_expand: Step 2.3 prompt expansion call, containing elapsed_seconds and model (read from sn-text-optimize JSON return); omitted when expand is skipped
rounds[].timing.image_generation.model is fixed to the hardcoded placeholder "sn_image_model"
rounds[].timing.vlm_review is omitted when max_rounds=1
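A Main Agent could check the worker's last message against a subset of these rules before using it. The sketch below is illustrative only: `contract_problems` is a hypothetical name, and it covers only the rules that can be checked from the payload and `max_rounds` alone.

```python
import json


def contract_problems(last_message: str, max_rounds: int) -> list:
    """Return a list of rule violations; an empty list means no problem found."""
    try:
        payload = json.loads(last_message)  # fences or extra text fail here
    except ValueError:
        return ["last message is not bare JSON"]
    if not isinstance(payload, dict):
        return ["top-level value is not an object"]
    if payload.get("status") == "error":
        return [] if payload.get("error") else ["status=error without an error field"]
    if payload.get("status") != "ok":
        return ["status must be ok or error"]
    problems = []
    if payload.get("need_main_agent_send") is not True:
        problems.append("need_main_agent_send must be true")
    if payload.get("output_mode") == "verbose" and "expanded_prompt" not in payload:
        problems.append("expanded_prompt missing in verbose mode")
    for r in payload.get("rounds", []):
        timing = r.get("timing", {})
        if timing.get("image_generation", {}).get("model") != "sn_image_model":
            problems.append(f"round {r.get('round')}: unexpected image model")
        if max_rounds == 1 and "vlm_review" in timing:
            problems.append(f"round {r.get('round')}: vlm_review must be omitted")
        if max_rounds == 1 and r.get("reasoning") != "":
            problems.append(f"round {r.get('round')}: reasoning must be empty")
    return problems
```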

Output Format

friendly mode (default)

Text Summary:

when max_rounds = 1: Generate a one-sentence description of the image content based on expanded_prompt, no more than 50 characters
when max_rounds > 1: Generate a one-sentence description of the image content based on result and violations, no more than 50 characters:
- result=PASS: Describe in a positive tone
- result=FAIL (1-2 violations): Gently point out specific issues
- result=FAIL (3 or more): Objectively summarize the main issues

Image: rank=1 best single image
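The tone rules for the friendly summary can be sketched as a small selector plus a length check. `summary_guidance` and `within_limit` are hypothetical names, the returned strings are illustrative labels rather than final wording, and a FAIL with zero violations (a case the rules do not cover) is grouped with the 1-2 violation branch as an assumption.

```python
MAX_SUMMARY_CHARS = 50  # the summary must not exceed 50 characters


def summary_guidance(max_rounds, result=None, violations=()):
    """Pick the writing guidance for the one-sentence friendly summary."""
    if max_rounds == 1:
        return "describe the image content from expanded_prompt"
    if result == "PASS":
        return "positive tone"
    if len(violations) <= 2:  # assumption: also covers FAIL with no violations
        return "gently point out the specific issues"
    return "objectively summarize the main issues"


def within_limit(text: str) -> bool:
    """Check the 50-character limit on the generated sentence."""
    return len(text) <= MAX_SUMMARY_CHARS
```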

verbose mode

Quality ranking result (high -> low)
---
Expanded prompt: [expanded | not expanded, using original prompt]
<expanded_prompt>
---
#1 round=<n> result=<PASS|FAIL> violations=<n> [early terminated]
#2 round=<n> result=<PASS|FAIL> violations=<n>
...
---
Time statistics: Total <total>s | Prompt evaluation <t>s | Content analysis <t>s | Prompt expansion <t>s | Image generation <t>s×<n> rounds | VLM review <t>s×<n> rounds
---
Images (sent in rank order)

Call Relationship

Bottom-level dependency: sn-image-base → sn-image-generate, sn-image-recognize, sn-text-optimize

References

references/analysis-framework.md - Analysis methodology
references/base-prompt.md - Prompt template
references/evaluation-standard.md - Evaluation standard
references/layout-style-selection.md - Layout and style selection rules
references/prompts-expand-system.md - Prompt expansion system prompt
references/prompts-critic-system.md - Critic system prompt for the VLM image review
references/runtime-parameters.md - Runtime parameters
references/structured-content-template.md - Structured content template
references/layouts/<layout>.md - Layout definitions (87 layouts)
references/styles/<style>.md - Style definitions (66 styles)

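⚠️ Vendor lock-in: this skill depends on sn-image-base (the SenseNova image generation API) and cannot be switched to other models. If SenseNova is no longer free or you have no plan, this skill will be unusable.

Dependencies: SN_API_KEY (SenseNova platform API key), sn-image-base, Pillow. Configuration reference: references/sensenova-config.md (see sn-image-base). Alternative: comfyui (local image generation, but without infographic templates or VLM review).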
