first commit

2026-05-10 13:52:46 +08:00
commit ccc63d1e70
4583 changed files with 584341 additions and 0 deletions
--- a/devops/gitea-code-sync/SKILL.md
+++ b/devops/gitea-code-sync/SKILL.md
@@ -0,0 +1,409 @@
+---
+name: gitea-code-sync
+description: Code-sync workflow via a Gitea repository. The agent writes code in the cloud and pushes it to the repository; the user pulls it locally
+version: 1.0.0
+author: Hermes Agent
+license: MIT
+metadata:
+  hermes:
+    tags: [git, gitea, workflow, deployment]
+    platforms: [qqbot]
+---
+
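+# Gitea Code Sync Workflow
+
+## Background
+The agent runs on a cloud server; the user works on a local machine. Simple projects are handed entirely to the agent, which pushes the finished work to a Gitea repository for the user to pull locally.
+
+## Gitea Credentials
+- **Platform**: https://gitea.ephron.ren
+- **User**: Elaina
+- **Token**: stored in `~/.netrc`
+- **Config**: `git config --global credential.helper store`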
+
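+## Workflow
+
+### Agent side (cloud)
+1. On receiving a development task, create the project under `/home/ubuntu/projects/`
+2. When development is done, initialize git (if not already initialized):
+   ```bash
+   cd /home/ubuntu/projects/<project_name>
+   git init
+   git remote add origin https://gitea.ephron.ren/Elaina/<repo_name>.git
+   git add .
+   git commit -m "Initial commit"
+   git branch -M main   # make sure the local branch is named main before pushing
+   git push -u origin main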
+   ```
+
+### Create a repository via the Gitea API
+```bash
+# Read the token from ~/.netrc (scans from the machine line to the first password line)
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)
+curl -s -u "token:$TOKEN" "https://gitea.ephron.ren/api/v1/user/repos" \
+  -X POST -H "Content-Type: application/json" \
+  -d '{"name": "repo_name", "private": true, "description": "Project description"}'
+```
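The same repository-creation request can be assembled from Python. This is a minimal sketch that only builds the request object and never sends it; the function name `build_create_repo_request` and its parameters are illustrative assumptions, while the endpoint, header, and JSON fields follow the curl example.

```python
import json
import urllib.request


def build_create_repo_request(base_url, token, name, private=True, description=""):
    """Build (but do not send) a POST request for /api/v1/user/repos."""
    payload = {"name": name, "private": private, "description": description}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/user/repos",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


# "<token>" is a placeholder; pass the value read from ~/.netrc in real use.
req = build_create_repo_request("https://gitea.ephron.ren", "<token>", "repo_name")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the payload easy to inspect before anything is created.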
+
+### User side (local)
+```bash
+# Clone the repository
+git clone https://gitea.ephron.ren/Elaina/<repo_name>.git
+
+# Subsequent updates
+git pull origin main
+```
+
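+## Determining the Deliverable Type
+
+When the user says "push to the repository", first determine the deliverable type; do not default to pushing source code:
+
+| User's wording | Expected deliverable | What to push |
+|---------|-----------|---------|
+| Fix proposal, analysis report, design document | Markdown document (problem description + root cause + diff + verification) | `.md` file to a new repository |
+| Code, implementation, development | Source code | Project code to the repository |
+| Test results, test report | Test report document | `.md` file to the repository |
+
+**Lesson learned**: when the user says "push the fix proposal to a new repository", they mean a fix-proposal **document** (analysis + proposal), not the modified source code.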
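The decision in the table can be sketched as a small keyword check. This is only an illustration: the keyword lists and the function name `deliverable_type` are hypothetical, and real requests will need broader matching or a clarifying question.

```python
# Hypothetical keyword lists; extend them to match how the user actually phrases requests.
DOCUMENT_HINTS = ("fix proposal", "analysis report", "design document",
                  "test result", "test report")
CODE_HINTS = ("code", "implementation", "development")


def deliverable_type(request: str) -> str:
    """Return "document", "source", or "ask" for a push request."""
    text = request.lower()
    # Check document wording first so "fix proposal for the login code"
    # is not mistaken for a source-code push.
    if any(hint in text for hint in DOCUMENT_HINTS):
        return "document"
    if any(hint in text for hint in CODE_HINTS):
        return "source"
    return "ask"  # ambiguous: confirm with the user instead of guessing
```

Checking the document hints before the code hints encodes the lesson above: a fix proposal is a document even when the request also mentions code.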
+
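+## Repository Privacy Rules (user preference)
+Choose visibility based on how sensitive the content is when creating a repository:
+- **Private**: internal testing, sensitive data, personal projects, QA testing
+- **Public**: open-source projects, public documentation, showcase content
+
+**Adding collaborators to a private repository**: after creation, add the user's Gitea username as a collaborator with write permission. Confirm the username through the API first (for example, `curl -u "token:TOKEN" "https://gitea.ephron.ren/api/v1/repos/{owner}/{repo}/collaborators"` lists existing collaborators for reference). Confirmed valid Gitea username: `ephron_ren`.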
+
+## Gitea API Operations
+
+### Inventory all repositories (including private)
+
+```bash
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)
+for user in Elaina ephron_ren; do
+  echo "=== $user ==="
+  curl -s -H "Authorization: token $TOKEN" \
+    "https://gitea.ephron.ren/api/v1/users/$user/repos?limit=100" \
+    | jq -r '.[] | "\(.full_name) | \(.private) | \(.updated_at[:10])"'
+done
+```
+
+⚠️ `/api/v1/repos/search` does not return private repositories by default; an inventory must use `/api/v1/user/repos` or `/api/v1/users/{username}/repos`. See `references/repo-inventory.md`.
+
+### Change repository visibility
+```bash
+# Make private
+curl -s -X PATCH -H "Authorization: token $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"private": true}' \
+  "https://gitea.ephron.ren/api/v1/repos/{owner}/{repo}"
+
+# Make public
+curl -s -X PATCH -H "Authorization: token $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"private": false}' \
+  "https://gitea.ephron.ren/api/v1/repos/{owner}/{repo}"
+```
+
+### Add a collaborator
+```bash
+curl -s -X PUT -H "Authorization: token $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"permission": "write"}' \
+  "https://gitea.ephron.ren/api/v1/repos/{owner}/{repo}/collaborators/{username}"
+```
+
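+### Common errors
+- `remote: Push to create is not enabled for users.` + HTTP 403: the remote repository does not exist and must first be created through the API. See the repository-creation section above
+- `user should be an owner or a collaborator with admin write`: the token's user is not the repository owner; the owner must perform the operation, or fork the repository into your own account
+- `user does not exist`: the username is misspelled; Gitea usernames are case-sensitive
+- `User permission denied for writing`: the current token has no write access to the target repository (for example, `ephron_ren/ephron.ren` is read-only for Elaina)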
+
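+### Troubleshooting a rejected push
+
+When `git push` returns 403, first check whether the remote repository exists:
+
+```bash
+# Check whether the repository exists
+curl -s -o /dev/null -w "%{http_code}" \
+  -H "Authorization: token $TOKEN" \
+  "https://gitea.ephron.ren/api/v1/repos/{owner}/{repo}"
+# 404 = repository does not exist → create it
+# 200 = repository exists → permission problem
+```
+
+If the repository does not exist, create it through the API and push again. If it exists but the push is still rejected, check collaborator permissions.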
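The branch on the status code can be written down explicitly. A minimal sketch, assuming the status code has already been obtained from the repository lookup; the function name `diagnose_push_403` and the returned strings are illustrative.

```python
def diagnose_push_403(repo_status_code: int) -> str:
    """Map the HTTP status of GET /api/v1/repos/{owner}/{repo} to a next step."""
    if repo_status_code == 404:
        return "create the repository through the API, then push again"
    if repo_status_code == 200:
        return "repository exists: check collaborator permissions"
    # Anything else (for example 401) is outside the two documented cases.
    return f"unexpected status {repo_status_code}: inspect the token and URL"
```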
+
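+### What to do when push permission is denied
+
+When `git push` returns `User permission denied for writing`:
+
+**Option A: create a new repository (recommended)**
+```bash
+# Create a new repository under the Elaina account
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)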
+curl -s -X POST "https://gitea.ephron.ren/api/v1/user/repos" \
+  -H "Authorization: Token $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "<new-repo-name>",
+    "description": "<description>",
+    "private": false,
+    "auto_init": true,
+    "default_branch": "main"
+  }'
+
+# Then push to the new repository
+cd /path/to/project
+git remote set-url origin https://gitea.ephron.ren/Elaina/<new-repo-name>.git
+# If the remote has an auto_init README, pull and merge it first
+git pull origin main --allow-unrelated-histories --no-rebase
+git push -u origin main
+```
+
+**Option B: ask the repository owner to grant collaborator access**
+- The `ephron_ren` user must grant Elaina write permission in Gitea
+- Suited to long-term collaboration
+
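+## Notes
+- Keep project directories under `/home/ubuntu/projects/`
+- The token has write permission: it can push and create repositories
+- Simple projects go straight to the agent; the user does not need to be involved at the code level
+- **Permissions**: only the repository owner or a collaborator with admin permission can change repository settings and add collaborators. The agent account (Elaina) can only manage repositories it created itself.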
+
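+## ⚠️ Destructive operations require confirmation first
+
+**Deleting repositories, force-pushing, overwriting branches, and other destructive operations must be confirmed with the user first; never run them unilaterally.**
+
+The user has explicitly asked: "When cleaning up unused repositories, get my confirmation before doing anything this dangerous."
+
+Correct procedure:
+1. List the repositories or branches to be deleted or modified
+2. Show the list to the user and ask for confirmation
+3. Execute only after the user confirms
+
+❌ Wrong: running `curl -X DELETE ...` to delete a repository directly
+✅ Right: ask first, "Found old repository X. Should it be deleted? Please confirm."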
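The three-step procedure amounts to a confirmation gate in front of the destructive call. The sketch below is an assumption about how such a gate could look; `run_destructive`, `confirm`, and `execute` are hypothetical names, and `confirm` stands in for whatever channel is used to ask the user.

```python
def run_destructive(targets, confirm, execute):
    """Run `execute` on each target only after `confirm` approves the full list."""
    if not targets:
        return []
    # Steps 1-2: present the complete list and ask once.
    if not confirm(list(targets)):
        return []  # nothing is touched without explicit approval
    # Step 3: execute only after confirmation.
    return [execute(target) for target in targets]
```

Asking once for the whole list, rather than per item, matches the procedure above: the user sees everything that would be affected before anything runs.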
+
+## Feature Branch Workflow
+
+When developing a complex feature, use a feature branch to avoid affecting the main branch:
+
+### Create a feature branch
+```bash
+cd /home/ubuntu/projects/<repo>
+git checkout -b feature/<feature-name>
+```
+
+### Develop and commit
+```bash
+# After development is complete
+git add .
+git commit -m "feat: <feature description>"
+git push origin feature/<feature-name>
+```
+
+### Merge into main
+```bash
+# Option 1: merge directly (simple projects)
+git checkout main
+git merge feature/<feature-name>
+git push origin main
+
+# Option 2: create a pull request through the Gitea API (recommended)
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)
+curl -s -X POST "https://gitea.ephron.ren/api/v1/repos/{owner}/{repo}/pulls" \
+  -H "Authorization: token $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "title": "feat: <feature description>",
+    "head": "feature/<feature-name>",
+    "base": "main",
+    "body": "## Changes\n\n- Change 1\n- Change 2"
+  }'
+```
+
+### Clean up the feature branch
+```bash
+# Delete the local branch after merging
+git branch -d feature/<feature-name>
+
+# Delete the remote branch
+git push origin --delete feature/<feature-name>
+```
+
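+## Common Patterns
+
+For the detailed API reference and redaction patterns, see `references/gitea-api.md`, `references/redaction-patterns.md`, and `references/repo-inventory.md` (repository inventory and enumeration patterns).
+
+### Clone an existing repository
+```bash
+# Option 1: the target directory does not exist yet, or exists but is empty
+git clone https://gitea.ephron.ren/Elaina/<repo>.git /home/ubuntu/projects/<repo>
+# If the directory already exists and is not empty, git clone fails
+# ("already exists and is not an empty directory") → handle it manually
+
+# Option 2: the directory already holds a checkout; work inside it
+cd /home/ubuntu/projects/<repo>
+git pull origin master  # make sure it is up to date
+# Then add/commit/push as usual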
+```
+
+### Incremental commits (after each module is finished)
+```bash
+cd /home/ubuntu/projects/<repo>
+git add <changed_file>
+git commit -m "Module N tests complete: X cases passed, Y issues found"
+git push origin master
+```
+
+### Push Hermes core files to a private repository
+```bash
+# Create the repository ($TOKEN is read from ~/.netrc as in the earlier examples)
+curl -s -u "token:$TOKEN" "https://gitea.ephron.ren/api/v1/user/repos" \
+  -X POST -H "Content-Type: application/json" \
+  -d '{"name":"hermes-core","private":true,"description":"Hermes Agent core configuration"}'
+
+# Package the core files (including SOUL.md, config.yaml, memories, scripts)
+cd /home/ubuntu/projects/hermes-core
+git init
+# ... add and commit ...
+git push -u origin master
+```
+
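+### Hermes core file backup (full version)
+
+For full recovery after complete loss of server data. Back up all core configuration files, storing sensitive information in redacted form.
+
+**Core files to back up:**
+| File | Description | Sensitivity |
+|------|------|--------|
+| `SOUL.md` | Persona definition | Low |
+| `memories/MEMORY.md` | Persistent memory | Low |
+| `memories/USER.md` | User preferences | Low |
+| `config.yaml` | Main configuration | Medium |
+| `.env` | Environment variables (including API keys) | High |
+| `auth.json` | Credential pool | High |
+| `providers/*.json` | Model provider configuration | Low |
+| `scripts/mimo_*.py` | Capability scripts | Low |
+| `channel_directory.json` | Channel configuration | Low |
+| `gateway_state.json` | Gateway runtime state | Low |
+| `models_dev_cache.json` | Model cache information | Low |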
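+**Redaction approach (run before backing up):**
+
+❌ Do not rely on plain regex string replacement; it misses fields and corrupts JSON formatting.
+
+✅ Correct approach:
+```python
+import re, json
+
+# 1. .env: process line by line in Python
+# Key-name fragments that mark a value as sensitive. The WEIXIN_* identifier
+# fields carry account IDs and user openids, so they are redacted as well.
+SENSITIVE = ['API_KEY', 'SECRET', 'TOKEN', 'PASSWORD',
+             'ACCOUNT_ID', 'ALLOWED_USERS', 'HOME_CHANNEL']
+with open('.env', 'r') as f:
+    content = f.read()
+lines = []
+for line in content.split('\n'):
+    stripped = line.lstrip()
+    if stripped.startswith('#'):
+        # Comment lines are kept verbatim, so check them separately for pasted keys
+        lines.append(line)
+        continue
+    m = re.match(r'^([A-Z0-9_]+)=(.+)$', line)
+    if m and any(s in m.group(1) for s in SENSITIVE):
+        lines.append(f"{m.group(1)}=***")
+    else:
+        lines.append(line)
+with open('.env', 'w') as f:
+    f.write('\n'.join(lines))
+
+# 2. auth.json: serialize with the json module so the format stays valid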
+with open('auth.json', 'r') as f:
+    auth = json.load(f)
+for provider, creds in auth['credential_pool'].items():
+    for c in creds:
+        c['access_token'] = '***'
+with open('auth.json', 'w') as f:
+    json.dump(auth, f, indent=2, ensure_ascii=False)
+```
+
+⚠️ Easily missed fields (must be covered):
+- `WEIXIN_TOKEN`, `WEIXIN_ACCOUNT_ID`
+- `WEIXIN_ALLOWED_USERS`, `WEIXIN_HOME_CHANNEL` (contain user openids)
+- `QQ_CLIENT_SECRET`
+- Full tokens with a `tp-` / `sk-` prefix in `access_token`
+
+**Verify the redaction is clean:**
+```bash
+grep -v "^#" .env | grep -E "tp-[a-zA-Z0-9]{20,}|sk-[a-zA-Z0-9-]{20,}|bq6New[A-Za-z0-9]+|[a-f0-9]{30,}|o9cq" && echo "leak found" || echo "clean"
+```
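The same check can be run from Python when line numbers are useful for follow-up. A minimal sketch modelled on the grep command: the pattern names and the function `find_leaks` are illustrative assumptions, and the `sk-` pattern allows hyphens so that keys shaped like `sk-cp-...` are caught.

```python
import re

# Patterns modelled on the grep command; the names on the left are illustrative.
LEAK_PATTERNS = {
    "tp-token": re.compile(r"tp-[a-zA-Z0-9]{20,}"),
    "sk-key": re.compile(r"sk-[a-zA-Z0-9-]{20,}"),
    "qq-secret": re.compile(r"bq6New[A-Za-z0-9]+"),
    "long-hex": re.compile(r"[a-f0-9]{30,}"),
    "wechat-openid": re.compile(r"o9cq"),
}


def find_leaks(text: str):
    """Return (line_number, pattern_name) pairs for non-comment lines that match."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("#"):
            continue  # mirrors `grep -v "^#"`
        for name, pattern in LEAK_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits
```

An empty result corresponds to the "clean" branch of the shell check. Comment lines are skipped here as well, so they still need a separate look.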
+
+**Fields to restore by hand (record them in RESTORE.md):**
+- `.env`: `XIAOMI_API_KEY`, `MINIMAX_CODING_API_KEY`, `QQ_CLIENT_SECRET`, `WEIXIN_TOKEN`, `WEIXIN_ACCOUNT_ID`, `WEIXIN_ALLOWED_USERS`, `WEIXIN_HOME_CHANNEL`
+- `auth.json`: the `access_token` of each provider in `credential_pool`
+- `~/.netrc`: Gitea access token (machine/login/password)
+
+**RESTORE.md template:**
+````markdown
+# Restore Instructions
+
+The repository contains redacted configuration files. When restoring, fill in the following sensitive values by hand:
+
+## .env
+
+| Field | Description | Where to get it |
+|------|------|----------|
+| `XIAOMI_API_KEY` | Xiaomi MiMo API Key | https://platform.xiaomimimo.com |
+| `MINIMAX_CODING_API_KEY` | MiniMax coding API key | https://api.minimaxi.com |
+| `QQ_CLIENT_SECRET` | QQ bot client secret | https://connect.qq.com |
+| `WEIXIN_TOKEN` | WeChat bot token | WeChat Open Platform |
+| `WEIXIN_ACCOUNT_ID` | WeChat bot account ID | WeChat Open Platform |
+| `WEIXIN_ALLOWED_USERS` | Allowed WeChat users (openid) | - |
+| `WEIXIN_HOME_CHANNEL` | WeChat home channel ID | - |
+
+## auth.json
+
+Fill the `access_token` field of each provider in `credential_pool` with the real API key.
+
+## .netrc
+
+`~/.netrc` holds the Gitea access token. Reconfigure it after restoring:
+```
+echo 'machine gitea.ephron.ren
+login <token>
+password <token>' > ~/.netrc
+chmod 600 ~/.netrc
+```
+````
+
+**Full backup procedure:**
+```bash
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)
+# 1. Create the repository
+curl -s -u "token:$TOKEN" "https://gitea.ephron.ren/api/v1/user/repos" \
+  -X POST -H "Content-Type: application/json" \
+  -d '{"name":"hermes-core","private":true,"description":"Hermes Agent core configuration"}'
+
+# 2. Package and redact
+mkdir -p /home/ubuntu/projects/hermes-core
+cd /home/ubuntu/projects/hermes-core
+git init
+git remote add origin https://gitea.ephron.ren/Elaina/hermes-core.git
+cp ~/.hermes/SOUL.md .
+cp ~/.hermes/config.yaml .
+cp ~/.hermes/memories/MEMORY.md .
+cp ~/.hermes/memories/USER.md .
+cp ~/.hermes/.env . && python3 redact_env.py .env
+cp ~/.hermes/auth.json . && python3 redact_auth.py auth.json
+cp -r ~/.hermes/providers .
+cp -r ~/.hermes/scripts .
+cp ~/.hermes/channel_directory.json .
+cp ~/.hermes/gateway_state.json .
+
+# 3. Add RESTORE.md
+# ... write the restore instructions ...
+
+# 4. Push
+git add -A && git commit -m "Backup $(date +%Y-%m-%d)" && git push -u origin master
+```
+
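+### Pushing QA reports to the repository (recommended workflow)
+```bash
+# Push after each module is finished; no need to wait for everything
+git add test-results-v3.md
+git commit -m "Module X tests complete: N/M cases"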
+git push origin master
+```
--- a/devops/gitea-code-sync/references/gitea-api.md
+++ b/devops/gitea-code-sync/references/gitea-api.md
@@ -0,0 +1,170 @@
+# Gitea API Reference
+
+## Authentication
+```bash
+# Read the token from ~/.netrc
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)
+
+# Option 1: Basic auth (token as password), recommended
+-u "token:$TOKEN"
+
+# Option 2: Token header
+-H "Authorization: token $TOKEN"
+```
+
+## Repository Operations
+
+### Search repositories
+```bash
+GET /api/v1/repos/search?limit=10&sort=updated
+```
+
+### Get repository information
+```bash
+GET /api/v1/repos/{owner}/{repo}
+```
+
+### Create a repository
+```bash
+POST /api/v1/user/repos
+{
+  "name": "repo_name",
+  "description": "Description",
+  "private": false,
+  "auto_init": true,
+  "default_branch": "main"
+}
+```
+
+### Modify repository settings
+```bash
+PATCH /api/v1/repos/{owner}/{repo}
+{
+  "private": true,                    # change visibility
+  "description": "New description",   # change the description
+  "default_branch": "main"            # change the default branch
+}
+```
+
+### Delete a repository
+```bash
+DELETE /api/v1/repos/{owner}/{repo}
+```
+
+## Collaborator Management
+
+### List collaborators
+```bash
+GET /api/v1/repos/{owner}/{repo}/collaborators
+```
+
+### Add a collaborator
+```bash
+PUT /api/v1/repos/{owner}/{repo}/collaborators/{username}
+{
+  "permission": "write"  # read, write, admin
+}
+```
+
+### Remove a collaborator
+```bash
+DELETE /api/v1/repos/{owner}/{repo}/collaborators/{username}
+```
+
+### Change a collaborator's permission
+```bash
+# Same PUT endpoint as adding: it updates the permission of an existing collaborator
+PUT /api/v1/repos/{owner}/{repo}/collaborators/{username}
+{
+  "permission": "admin"
+}
+```
+
+## Branch Operations
+
+### List branches
+```bash
+GET /api/v1/repos/{owner}/{repo}/branches
+```
+
+### Create a branch
+```bash
+POST /api/v1/repos/{owner}/{repo}/branches
+{
+  "new_branch_name": "feature-branch",
+  "old_branch_name": "main"
+}
+```
+
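+## File Operations
+
+### Get file contents
+```bash
+GET /api/v1/repos/{owner}/{repo}/contents/{filepath}
+```
+
+### Create / update a file
+```bash
+# POST creates a new file; to update an existing file, send PUT to the same path and include the file's current "sha"
+POST /api/v1/repos/{owner}/{repo}/contents/{filepath}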
+{
+  "message": "commit message",
+  "content": "base64_encoded_content",
+  "branch": "main"
+}
+```
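The `content` field carries the file body Base64-encoded. A minimal sketch of preparing that request body in Python; the function name `file_create_body` is an assumption, and the three JSON fields are the ones shown in the request above.

```python
import base64
import json


def file_create_body(text: str, message: str, branch: str = "main") -> str:
    """Return the JSON body for creating a file through the contents endpoint."""
    # The API expects the file content as Base64 text, not raw bytes.
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return json.dumps({"message": message, "content": encoded, "branch": branch})
```

Encoding from UTF-8 bytes keeps non-ASCII file content intact on the round trip.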
+
+### Delete a file
+```bash
+DELETE /api/v1/repos/{owner}/{repo}/contents/{filepath}
+{
+  "message": "delete message",
+  "sha": "file_sha"
+}
+```
+
+## Release Operations
+
+### List releases
+```bash
+GET /api/v1/repos/{owner}/{repo}/releases
+```
+
+### Create a release
+```bash
+POST /api/v1/repos/{owner}/{repo}/releases
+{
+  "tag_name": "v1.0.0",
+  "name": "Release 1.0.0",
+  "body": "Release notes",
+  "draft": false,
+  "prerelease": false
+}
+```
+
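+## Common Query Parameters
+
+| Parameter | Description | Example |
+|------|------|------|
+| limit | Number of results | `?limit=10` |
+| page | Page number | `?page=2` |
+| sort | Sort field | `?sort=updated` |
+| order | Sort direction | `?order=desc` |
+| q | Search keyword | `?q=keyword` |
+
+## Response Format
+Successful responses usually return a JSON object or array. Error responses: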
+```json
+{
+  "message": "error description",
+  "url": "https://gitea.ephron.ren/api/swagger"
+}
+```
+
+## Token Permissions
+- **read**: read-only access
+- **write**: read and write access
+- **admin**: full administrative access
+
+## Current Environment
+- **Platform**: https://gitea.ephron.ren
+- **Agent user**: Elaina (token in ~/.netrc)
+- **Primary user**: ephron_ren
--- a/devops/gitea-code-sync/references/redaction-patterns.md
+++ b/devops/gitea-code-sync/references/redaction-patterns.md
@@ -0,0 +1,62 @@
+# Sensitive Data Redaction Pattern Reference
+
+## Common Key Formats
+
+| Format | Example | Matching regex |
+|------|------|----------|
+| MiniMax token | `tp-spf...2tid` | `tp-[a-zA-Z0-9]{20,}` |
+| MiniMax API key | `sk-cp-...faRA` | `sk-cp-[a-zA-Z0-9]+` |
+| QQ client secret | `bq6New...vQvR` | `bq6New[A-Za-z0-9]+` |
+| WeChat openid | `o9cq801H7rXH9zNHTu-xaa29Hbuk@im.wechat` | `o9cq[a-zA-Z0-9@.-]+` |
+| WeChat token | `2fc2d0...8d1b` | `2fc2d0[A-Za-z0-9]+` |
+| Generic hex (30+) | various | `[a-f0-9]{30,}` |
+
+## .env redaction pitfalls
+
+Examples in comment lines can match too: `# KIMI_BASE_URL=https://api.kimi.com/coding/v1` contains `api.kimi.com`, which is not a key, but `# OPENROUTER_API_KEY=sk-or-...` contains a key in its full format.
+
+```bash
+# Verify that non-comment lines in .env contain no leaks
+grep -v "^#" .env | grep -E "tp-[a-zA-Z0-9]{20,}|sk-[a-zA-Z0-9-]{20,}|bq6New[A-Za-z0-9]+|[a-f0-9]{30,}|o9cq" && echo "leak found" || echo "clean"
+```
+
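+## auth.json redaction pitfalls
+
+Plain regex replacement misses nested structures and easily corrupts the JSON (for example, leaving a stray trailing `"`). Use Python's `json` module instead: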
+
+```python
+import json
+
+with open('auth.json', 'r') as f:
+    auth = json.load(f)
+
+for provider, creds in auth['credential_pool'].items():
+    for c in creds:
+        c['access_token'] = '***'
+
+with open('auth.json', 'w') as f:
+    json.dump(auth, f, indent=2, ensure_ascii=False)
+
+# Verify the result is valid JSON
+with open('auth.json', 'r') as f:
+    json.load(f)  # parses cleanly if the format is valid
+```
+
+## Sensitive fields that must be covered
+
+### .env
+- `XIAOMI_API_KEY`
+- `MINIMAX_CODING_API_KEY`
+- `QQ_CLIENT_SECRET`
+- `WEIXIN_TOKEN`
+- `WEIXIN_ACCOUNT_ID`
+- `WEIXIN_ALLOWED_USERS` (contains user openids)
+- `WEIXIN_HOME_CHANNEL`
+- `QQ_APP_ID` (application identifier; not a secret, but worth checking)
+
+### auth.json
+- `credential_pool.{provider}[].access_token`
+
+### Other channel configuration that may be missed
+- User IDs in the WeChat `channel_directory.json`
+- Process information in `gateway_state.json` (generally not sensitive)
--- a/devops/gitea-code-sync/references/repo-inventory.md
+++ b/devops/gitea-code-sync/references/repo-inventory.md
@@ -0,0 +1,77 @@
+# Gitea Repository Inventory and Enumeration
+
+## Enumerate the current user's repositories (including private)
+
+```bash
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)
+curl -s -H "Authorization: token $TOKEN" \
+  "https://gitea.ephron.ren/api/v1/user/repos?limit=100&sort=updated" \
+  | jq '[.[] | {name: .full_name, private: .private, desc: (.description // ""), updated: .updated_at[:10]}]'
+```
+
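+⚠️ **Note**: `/api/v1/repos/search` does not return private repositories by default. To inventory repositories, use `/api/v1/user/repos` (current user) or `/api/v1/users/{username}/repos` (a given user).
+
+## Enumerate a given user's repositories
+
+```bash
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)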
+curl -s -H "Authorization: token $TOKEN" \
+  "https://gitea.ephron.ren/api/v1/users/{username}/repos?limit=100" \
+  | jq '[.[] | {name: .full_name, private: .private, desc: (.description // ""), updated: .updated_at[:10]}]'
+```
+
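+Known user accounts:
+- `Elaina`: agent-managed account (hermes-core, files, ephron-ren-qa)
+- `ephron_ren`: the user's primary account (ephron.ren, model_evaluation, QQbot, LocalAgent)
+
+## Full inventory command (all accounts in one pass)
+
+```bash
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)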
+for user in Elaina ephron_ren; do
+  echo "=== $user ==="
+  curl -s -H "Authorization: token $TOKEN" \
+    "https://gitea.ephron.ren/api/v1/users/$user/repos?limit=100" \
+    | jq -r '.[] | "\(.full_name) | \(.private | if . then "🔒private" else "🌐public" end) | \(.description // "-") | \(.updated_at[:10])"'
+done
+```
+
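+## Common jq pitfalls
+
+### ❌ jq has no `? :` ternary operator
+```bash
+# This fails with a jq syntax error: `?` is jq's error-suppression operator,
+# not the start of a C-style ternary expression:
+jq '.data[] | "\(.full_name) | \(.private ? "private" : "public")"'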
+```
+
+### ✅ Correct approach: extract an object and format it externally
+```bash
+jq '[.[] | {name: .full_name, private: .private, desc: (.description // ""), updated: .updated_at[:10]}]'
+```
+
+Or use `-r` with simple interpolation:
+```bash
+jq -r '.[] | "\(.full_name) | \(.private) | \(.updated_at[:10])"'
+```
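When the formatting logic grows beyond what is comfortable in a jq one-liner, the "format externally" step can move to Python. A minimal sketch, assuming the input is the JSON array returned by the user-repository endpoints; the function name `format_repos` and the output layout are illustrative.

```python
import json


def format_repos(raw_json: str):
    """Render one 'full_name | visibility | description | date' line per repository."""
    lines = []
    for repo in json.loads(raw_json):
        visibility = "private" if repo["private"] else "public"
        description = repo.get("description") or "-"  # empty or missing becomes "-"
        updated = repo["updated_at"][:10]              # keep only YYYY-MM-DD
        lines.append(f'{repo["full_name"]} | {visibility} | {description} | {updated}')
    return lines
```

Piping the curl output into a script built around this function avoids quoting questions entirely, since the conditional lives in Python rather than inside a shell string.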
+
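+## Notes on organization repositories
+
+- `GET /api/v1/orgs/{org}/repos` returns a 404 error if the organization does not exist
+- In Gitea, usernames and organization names may differ; if the API reports `user redirect does not exist`, the organization does not exist or has been deleted
+- Known organization: `OpenClaw` (existed previously; no longer found when queried in 2026-05)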
+
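+## .netrc token parsing pitfalls
+
+### Common problem
+```bash
+# ❌ grep -A1 only looks one line past the machine line, so when machine/login/password
+#    sit on separate lines it never reaches the password line
+TOKEN=$(grep -A1 'gitea.ephron.ren' ~/.netrc | grep password | awk '{print $2}')
+
+# ✅ More reliable: scan forward from the machine line to the first password line
+TOKEN=$(awk '/gitea.ephron.ren/{found=1} found && /password/{print $2; exit}' ~/.netrc)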
+```
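Python's standard-library `netrc` module parses the file format itself, which avoids line-position assumptions altogether. A minimal sketch; the function name `gitea_token` is an assumption, and the demo writes a throwaway file with a placeholder host and token rather than touching `~/.netrc`.

```python
import netrc
import os
import tempfile


def gitea_token(host: str, netrc_path: str) -> str:
    """Return the password field stored for `host` in the given netrc file."""
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        raise KeyError(f"no entry for {host} in {netrc_path}")
    _login, _account, password = entry
    return password


# Demo with a temporary file; host and token are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "netrc")
    with open(path, "w") as f:
        f.write("machine gitea.example\nlogin abc123\npassword abc123\n")
    demo_token = gitea_token("gitea.example", path)
```

For the real file, pass `os.path.expanduser("~/.netrc")` as `netrc_path`.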
+
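+### Token format
+- A Gitea API token is a long hex string (40 hexadecimal characters); never paste a real token into documentation or scripts
+- The login and password fields store the same token value
+- Both authentication methods work: `-u "token:$TOKEN"` or `-H "Authorization: token $TOKEN"`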
--- a/devops/kanban-orchestrator/SKILL.md
+++ b/devops/kanban-orchestrator/SKILL.md
@@ -0,0 +1,170 @@
+---
+name: kanban-orchestrator
+description: Decomposition playbook + specialist-roster conventions + anti-temptation rules for an orchestrator profile routing work through Kanban. The "don't do the work yourself" rule and the basic lifecycle are auto-injected into every kanban worker's system prompt; this skill is the deeper playbook when you're specifically playing the orchestrator role.
+version: 2.0.0
+metadata:
+  hermes:
+    tags: [kanban, multi-agent, orchestration, routing]
+    related_skills: [kanban-worker]
+---
+
+# Kanban Orchestrator — Decomposition Playbook
+
+> The **core worker lifecycle** (including the `kanban_create` fan-out pattern and the "decompose, don't execute" rule) is auto-injected into every kanban process via the `KANBAN_GUIDANCE` system-prompt block. This skill is the deeper playbook when you're an orchestrator profile whose whole job is routing.
+
+## Setup & Activation
+
+Kanban is built into Hermes Agent (v0.12+). Verify it's ready:
+
+```bash
+hermes kanban stats          # shows per-status counts (works even when empty)
+hermes kanban init           # create kanban.db if missing (idempotent)
+```
+
+Config (`~/.hermes/config.yaml`):
+```yaml
+kanban:
+  dispatch_in_gateway: true        # auto-dispatch tasks in gateway process
+  dispatch_interval_seconds: 60    # how often dispatcher checks for new work
+```
+
+No separate activation step — once the config exists and gateway is running, the dispatcher loop handles claim → spawn → heartbeat → complete automatically.
+
+## When to use the board (vs. just doing the work)
+
+Create Kanban tasks when any of these are true:
+
+1. **Multiple specialists are needed.** Research + analysis + writing is three profiles.
+2. **The work should survive a crash or restart.** Long-running, recurring, or important.
+3. **The user might want to interject.** Human-in-the-loop at any step.
+4. **Multiple subtasks can run in parallel.** Fan-out for speed.
+5. **Review / iteration is expected.** A reviewer profile loops on drafter output.
+6. **The audit trail matters.** Board rows persist in SQLite forever.
+
+If *none* of those apply — it's a small one-shot reasoning task — use `delegate_task` instead or answer the user directly.
+
+## The anti-temptation rules
+
+Your job description says "route, don't execute." The rules that enforce that:
+
+- **Do not execute the work yourself.** Your restricted toolset usually doesn't even include terminal/file/code/web for implementation. If you find yourself "just fixing this quickly" — stop and create a task for the right specialist.
+- **For any concrete task, create a Kanban task and assign it.** Every single time.
+- **If no specialist fits, ask the user which profile to create.** Do not default to doing it yourself under "close enough."
+- **Decompose, route, and summarize — that's the whole job.**
+
+## The standard specialist roster (convention)
+
+Unless the user's setup has customized profiles, assume these exist. Adjust to whatever the user actually has — ask if you're unsure.
+
+| Profile | Does | Typical workspace |
+|---|---|---|
+| `researcher` | Reads sources, gathers facts, writes findings | `scratch` |
+| `analyst` | Synthesizes, ranks, de-dupes. Consumes multiple `researcher` outputs | `scratch` |
+| `writer` | Drafts prose in the user's voice | `scratch` or `dir:` into their Obsidian vault |
+| `reviewer` | Reads output, leaves findings, gates approval | `scratch` |
+| `backend-eng` | Writes server-side code | `worktree` |
+| `frontend-eng` | Writes client-side code | `worktree` |
+| `ops` | Runs scripts, manages services, handles deployments | `dir:` into ops scripts repo |
+| `pm` | Writes specs, acceptance criteria | `scratch` |
+
+## Decomposition playbook
+
+### Step 1 — Understand the goal
+
+Ask clarifying questions if the goal is ambiguous. Cheap to ask; expensive to spawn the wrong fleet.
+
+### Step 2 — Sketch the task graph
+
+Before creating anything, draft the graph out loud (in your response to the user). Example for "Analyze whether we should migrate to Postgres":
+
+```
+T1  researcher        research: Postgres cost vs current
+T2  researcher        research: Postgres performance vs current
+T3  analyst           synthesize migration recommendation       parents: T1, T2
+T4  writer            draft decision memo                       parents: T3
+```
+
+Show this to the user. Let them correct it before you create anything.
+
+### Step 3 — Create tasks and link
+
+```python
+import os
+
+t1 = kanban_create(
+    title="research: Postgres cost vs current",
+    assignee="researcher",
+    body="Compare estimated infrastructure costs, migration costs, and ongoing ops costs over a 3-year window. Sources: AWS/GCP pricing, team time estimates, current Postgres bills from peers.",
+    tenant=os.environ.get("HERMES_TENANT"),
+)["task_id"]
+
+t2 = kanban_create(
+    title="research: Postgres performance vs current",
+    assignee="researcher",
+    body="Compare query latency, throughput, and scaling characteristics at our expected data volume (~500GB, 10k QPS peak). Sources: benchmark papers, public case studies, pgbench results if easy.",
+    tenant=os.environ.get("HERMES_TENANT"),
+)["task_id"]
+
+t3 = kanban_create(
+    title="synthesize migration recommendation",
+    assignee="analyst",
+    body="Read the findings from T1 (cost) and T2 (performance). Produce a 1-page recommendation with explicit trade-offs and a go/no-go call.",
+    parents=[t1, t2],
+    tenant=os.environ.get("HERMES_TENANT"),
+)["task_id"]
+
+t4 = kanban_create(
+    title="draft decision memo",
+    assignee="writer",
+    body="Turn the analyst's recommendation into a 2-page memo for the CTO. Match the tone of previous decision memos in the team's knowledge base.",
+    parents=[t3],
+    tenant=os.environ.get("HERMES_TENANT"),
+)["task_id"]
+```
+
+`parents=[...]` gates promotion — children stay in `todo` until every parent reaches `done`, then auto-promote to `ready`. No manual coordination needed; the dispatcher and dependency engine handle it.
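The gating rule can be modelled in a few lines. This is an illustrative model only, not the Hermes dependency engine: the board layout (a dict of task id to status and parents) and the function name `promote_ready` are assumptions made for the sketch.

```python
def promote_ready(tasks):
    """Move `todo` tasks to `ready` once every parent is `done`.

    `tasks` maps task id -> {"status": str, "parents": [task ids]}.
    Returns the ids promoted in this pass.
    """
    promoted = []
    for task_id, task in tasks.items():
        if task["status"] != "todo":
            continue
        # A child is promoted only when all of its parents have finished.
        if all(tasks[parent]["status"] == "done" for parent in task["parents"]):
            task["status"] = "ready"
            promoted.append(task_id)
    return promoted
```

With the T1-T4 graph above, T3 is promoted only after both T1 and T2 are `done`, and T4 keeps waiting until T3 itself finishes.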
+
+### Step 4 — Complete your own task
+
+If you were spawned as a task yourself (e.g. `planner` profile was assigned `T0: "investigate Postgres migration"`), mark it done with a summary of what you created:
+
+```python
+kanban_complete(
+    summary="decomposed into T1-T4: 2 researchers parallel, 1 analyst on their outputs, 1 writer on the recommendation",
+    metadata={
+        "task_graph": {
+            "T1": {"assignee": "researcher", "parents": []},
+            "T2": {"assignee": "researcher", "parents": []},
+            "T3": {"assignee": "analyst", "parents": ["T1", "T2"]},
+            "T4": {"assignee": "writer", "parents": ["T3"]},
+        },
+    },
+)
+```
+
+### Step 5 — Report back to the user
+
+Tell them what you created in plain prose:
+
+> I've queued 4 tasks:
+> - **T1** (researcher): cost comparison
+> - **T2** (researcher): performance comparison, in parallel with T1
+> - **T3** (analyst): synthesizes T1 + T2 into a recommendation
+> - **T4** (writer): turns T3 into a CTO memo
+>
+> The dispatcher will pick up T1 and T2 now. T3 starts when both finish. You'll get a gateway ping when T4 completes. Use the dashboard or `hermes kanban tail <id>` to follow along.
+
+## Common patterns
+
+**Fan-out + fan-in (research → synthesize):** N `researcher` tasks with no parents, one `analyst` task with all of them as parents.
+
+**Pipeline with gates:** `pm → backend-eng → reviewer`. Each stage's `parents=[previous_task]`. Reviewer blocks or completes; if reviewer blocks, the operator unblocks with feedback and respawns.
+
+**Same-profile queue:** 50 tasks, all assigned to `translator`, no dependencies between them. Dispatcher serializes — translator processes them in priority order, accumulating experience in their own memory.
+
+**Human-in-the-loop:** Any task can `kanban_block()` to wait for input. Dispatcher respawns after `/unblock`. The comment thread carries the full context.
+
+## Pitfalls
+
+**Reassignment vs. new task.** If a reviewer blocks with "needs changes," create a NEW task linked from the reviewer's task — don't re-run the same task with a stern look. The new task is assigned to the original implementer profile.
+
+**Argument order for links.** `kanban_link(parent_id=..., child_id=...)` — parent first. Mixing them up demotes the wrong task to `todo`.
+
+**Don't pre-create the whole graph if the shape depends on intermediate findings.** If T3's structure depends on what T1 and T2 find, let T3 exist as a "synthesize findings" task whose own first step is to read parent handoffs and plan the rest. Orchestrators can spawn orchestrators.
+
+**Tenant inheritance.** If `HERMES_TENANT` is set in your env, pass `tenant=os.environ.get("HERMES_TENANT")` on every `kanban_create` call so child tasks stay in the same namespace.
--- a/devops/kanban-worker/SKILL.md
+++ b/devops/kanban-worker/SKILL.md
@@ -0,0 +1,160 @@
+---
+name: kanban-worker
+description: Pitfalls, examples, and edge cases for Hermes Kanban workers. The lifecycle itself is auto-injected into every worker's system prompt as KANBAN_GUIDANCE (from agent/prompt_builder.py); this skill is what you load when you want deeper detail on specific scenarios.
+version: 2.0.0
+metadata:
+  hermes:
+    tags: [kanban, multi-agent, collaboration, workflow, pitfalls]
+    related_skills: [kanban-orchestrator]
+---
+
+# Kanban Worker — Pitfalls and Examples
+
+> You're seeing this skill because the Hermes Kanban dispatcher spawned you as a worker with `--skills kanban-worker` — it's loaded automatically for every dispatched worker. The **lifecycle** (6 steps: orient → work → heartbeat → block/complete) also lives in the `KANBAN_GUIDANCE` block that's auto-injected into your system prompt. This skill is the deeper detail: good handoff shapes, retry diagnostics, edge cases.
+
+## Workspace handling
+
+Your workspace kind determines how you should behave inside `$HERMES_KANBAN_WORKSPACE`:
+
+| Kind | What it is | How to work |
+|---|---|---|
+| `scratch` | Fresh tmp dir, yours alone | Read/write freely; it gets GC'd when the task is archived. |
+| `dir:<path>` | Shared persistent directory | Other runs will read what you write. Treat it like long-lived state. Path is guaranteed absolute (the kernel rejects relative paths). |
+| `worktree` | Git worktree at the resolved path | If `.git` doesn't exist, run `git worktree add <path> <branch>` from the main repo first, then cd and work normally. Commit work here. |
+
+## Tenant isolation
+
+If `$HERMES_TENANT` is set, the task belongs to a tenant namespace. When reading or writing persistent memory, prefix memory entries with the tenant so context doesn't leak across tenants:
+
+- Good: `business-a: Acme is our biggest customer`
+- Bad (leaks): `Acme is our biggest customer`
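The prefixing convention is easy to apply consistently with a small helper. A minimal sketch: the function name `tenant_scoped` is hypothetical, and it assumes the `tenant: entry` shape shown in the "Good" example.

```python
import os


def tenant_scoped(entry: str, tenant=None) -> str:
    """Prefix a memory entry with the tenant name when one is set."""
    if tenant is None:
        tenant = os.environ.get("HERMES_TENANT")
    if not tenant:
        return entry  # no tenant namespace in effect
    prefix = f"{tenant}: "
    # Avoid double-prefixing entries that are already scoped.
    return entry if entry.startswith(prefix) else prefix + entry
```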
+
+## Good summary + metadata shapes
+
+The `kanban_complete(summary=..., metadata=...)` handoff is how downstream workers read what you did. Patterns that work:
+
+**Coding task:**
+```python
+kanban_complete(
+    summary="shipped rate limiter — token bucket, keys on user_id with IP fallback, 14 tests pass",
+    metadata={
+        "changed_files": ["rate_limiter.py", "tests/test_rate_limiter.py"],
+        "tests_run": 14,
+        "tests_passed": 14,
+        "decisions": ["user_id primary, IP fallback for unauthenticated requests"],
+    },
+)
+```
+
+**Research task:**
+```python
+kanban_complete(
+    summary="3 competing libraries reviewed; vLLM wins on throughput, SGLang on latency, TensorRT-LLM on memory efficiency",
+    metadata={
+        "sources_read": 12,
+        "recommendation": "vLLM",
+        "benchmarks": {"vllm": 1.0, "sglang": 0.87, "trtllm": 0.72},
+    },
+)
+```
+
+**Review task:**
+```python
+kanban_complete(
+    summary="reviewed PR #123; 2 blocking issues found (SQL injection in /search, missing CSRF on /settings)",
+    metadata={
+        "pr_number": 123,
+        "findings": [
+            {"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
+            {"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
+        ],
+        "approved": False,
+    },
+)
+```
+
+Shape `metadata` so downstream parsers (reviewers, aggregators, schedulers) can use it without re-reading your prose.
+
+## Claiming cards you actually created
+
+If your run produced new kanban tasks (via `kanban_create`), pass the ids in `created_cards` on `kanban_complete`. The kernel verifies each id exists and was created by your profile; any phantom id blocks the completion with an error listing what went wrong, and the rejected attempt is permanently recorded on the task's event log. **Only list ids you captured from a successful `kanban_create` return value — never invent ids from prose, never paste ids from earlier runs, never claim cards another worker created.**
+
+```python
+# GOOD — capture return values, then claim them.
+c1 = kanban_create(title="remediate SQL injection", assignee="security-worker")
+c2 = kanban_create(title="fix CSRF middleware", assignee="web-worker")
+
+kanban_complete(
+    summary="Review done; spawned remediations for both findings.",
+    metadata={"pr_number": 123, "approved": False},
+    created_cards=[c1["task_id"], c2["task_id"]],
+)
+```
+
+```python
+# BAD — claiming ids you don't have captured return values for.
+kanban_complete(
+    summary="Created remediation cards t_a1b2c3d4, t_deadbeef",  # hallucinated
+    created_cards=["t_a1b2c3d4", "t_deadbeef"],                   # → gate rejects
+)
+```
+
+If a `kanban_create` call fails (exception, tool_error), the card was NOT created — do not include a phantom id for it. Retry the create, or omit the id and mention the failure in your summary. The prose-scan pass also catches `t_<hex>` references in your free-form summary that don't resolve; these don't block the completion but show up as advisory warnings on the task in the dashboard.
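One way to keep phantom ids out of `created_cards` is to collect ids only from results that actually carry a `task_id`. The sketch below assumes a failed create is represented either as `None` or as a dict without `task_id`; that representation and the helper name `collect_created_ids` are hypothetical, not part of the Kanban tool API.

```python
def collect_created_ids(results):
    """Split (title, result) pairs into created task ids and failed titles."""
    created, failures = [], []
    for title, result in results:
        task_id = result.get("task_id") if isinstance(result, dict) else None
        if task_id:
            created.append(task_id)
        else:
            failures.append(title)  # mention these in the summary instead
    return created, failures
```

The first list is what would go into `created_cards`; the second names the creates to retry or to report as failed in the summary.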
+
+## Block reasons that get answered fast
+
+Bad: `"stuck"` — the human has no context.
+
+Good: one sentence naming the specific decision you need. Leave longer context as a comment instead.
+
+```python
+import os
+
+kanban_comment(
+    task_id=os.environ["HERMES_KANBAN_TASK"],
+    body="Full context: I have user IPs from Cloudflare headers but some users are behind NATs with thousands of peers. Keying on IP alone causes false positives.",
+)
+kanban_block(reason="Rate limit key choice: IP (simple, NAT-unsafe) or user_id (requires auth, skips anonymous endpoints)?")
+```
+
+The block message is what appears in the dashboard / gateway notifier. The comment is the deeper context a human reads when they open the task.
+
+## Heartbeats worth sending
+
+Good heartbeats name progress: `"epoch 12/50, loss 0.31"`, `"scanned 1.2M/2.4M rows"`, `"uploaded 47/120 videos"`.
+
+Bad heartbeats: `"still working"`, empty notes, sub-second intervals. Send at most one every few minutes, and skip them entirely for tasks under ~2 minutes.
+
+## Retry scenarios
+
+If you open the task and `kanban_show` returns `runs: [...]` with one or more closed runs, you're a retry. The prior runs' `outcome` / `summary` / `error` tell you what didn't work. Don't repeat that path. Typical retry diagnostics:
+
+- `outcome: "timed_out"` — the previous attempt hit `max_runtime_seconds`. You may need to chunk the work or shorten it.
+- `outcome: "crashed"` — OOM or segfault. Reduce memory footprint.
+- `outcome: "spawn_failed"` + `error: "..."` — usually a profile config issue (missing credential, bad PATH). Ask the human via `kanban_block` instead of retrying blindly.
+- `outcome: "reclaimed"` + `summary: "task archived..."` — operator archived the task out from under the previous run; you probably shouldn't be running at all, check status carefully.
+- `outcome: "blocked"` — a previous attempt blocked; the unblock comment should be in the thread by now.
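+
+The diagnostics above can be sketched as a small lookup. This is only an illustration: the run shape (dicts carrying `outcome`, `summary`, `error`) is assumed from the field names mentioned here, the helper name `retry_hints` is hypothetical, and treating a run without an `outcome` as still open is an assumption.
+
+```python
+# Hypothetical helper; the run dict shape is assumed, not a documented schema.
+RETRY_HINTS = {
+    "timed_out": "chunk or shorten the work",
+    "crashed": "reduce memory footprint",
+    "spawn_failed": "ask the human via kanban_block",
+    "reclaimed": "check task status before doing anything",
+    "blocked": "read the unblock comment in the thread",
+}
+
+
+def retry_hints(runs):
+    """Return one (outcome, hint) pair per closed prior run, in order."""
+    hints = []
+    for run in runs:
+        outcome = run.get("outcome")
+        if outcome is None:  # assumption: an open run has no outcome yet
+            continue
+        hints.append((outcome, RETRY_HINTS.get(outcome, "read summary/error manually")))
+    return hints
+
+
+prior = [
+    {"outcome": "timed_out", "summary": "hit max_runtime_seconds"},
+    {"outcome": None},
+]
+print(retry_hints(prior))
+```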
+
+## Do NOT
+
+- Call `delegate_task` as a substitute for `kanban_create`. `delegate_task` is for short reasoning subtasks inside YOUR run; `kanban_create` is for cross-agent handoffs that outlive one API loop.
+- Modify files outside `$HERMES_KANBAN_WORKSPACE` unless the task body says to.
+- Create follow-up tasks assigned to yourself — assign to the right specialist.
+- Complete a task you didn't actually finish. Block it instead.
+
+## Pitfalls
+
+**Task state can change between dispatch and your startup.** Between the dispatcher claiming the task and your process actually booting, the task may have been blocked, reassigned, or archived. Always `kanban_show` first. If it reports `blocked` or `archived`, stop — you shouldn't be running.
+
+**Workspace may have stale artifacts.** Especially `dir:` and `worktree` workspaces can have files from previous runs. Read the comment thread — it usually explains why you're running again and what state the workspace is in.
+
+**Don't rely on the CLI when the tools are available.** The `kanban_*` tools work across all terminal backends (Docker, Modal, SSH). `hermes kanban <verb>` from your terminal tool will fail in containerized backends because the CLI isn't installed there. When in doubt, use the tool.
+
+## CLI fallback (for scripting)
+
+Every tool has a CLI equivalent for human operators and scripts:
+- `kanban_show` ↔ `hermes kanban show <id> --json`
+- `kanban_complete` ↔ `hermes kanban complete <id> --summary "..." --metadata '{...}'`
+- `kanban_block` ↔ `hermes kanban block <id> "reason"`
+- `kanban_create` ↔ `hermes kanban create "title" --assignee <profile> [--parent <id>]`
+- etc.
+
+Use the tools from inside an agent; the CLI exists for the human at the terminal.
--- a/devops/playwright-browser-install/SKILL.md
+++ b/devops/playwright-browser-install/SKILL.md
@@ -0,0 +1,172 @@
+---
+name: playwright-browser-install
+description: Install, diagnose, and recover Playwright browser binaries (Chromium, Firefox, WebKit) — partial downloads, resume, cache management, and sandbox issues.
+triggers:
+  - "playwright install chromium"
+  - "playwright install firefox"
+  - "playwright install webkit"
+  - "playwright browser download failed"
+  - "playwright chromium not found"
+  - "chrome binary missing"
+  - "playwright download interrupted"
+---
+
+# Playwright Browser Install — Diagnosis & Recovery
+
+## Pitfalls
+
+### Chrome sandbox fails in containers/VMs
+**Symptom**: `browser_navigate` fails with "No usable sandbox! If you are running on Ubuntu 23.10+ or another Linux distro that has disabled unprivileged user namespaces with AppArmor"
+
+**Root cause**: Chrome's sandbox mechanism is unavailable in the container or VM environment.
+
+**Workaround**:
+```bash
+# Add to ~/.bashrc
+echo 'export PLAYWRIGHT_CHROMIUM_ARGS="--no-sandbox --disable-setuid-sandbox"' >> ~/.bashrc
+source ~/.bashrc
+
+# Or call playwright-core directly
+cd ~/.hermes/hermes-agent && node -e "
+const { chromium } = require('playwright-core');
+(async () => {
+  const browser = await chromium.launch({
+    args: ['--no-sandbox', '--disable-setuid-sandbox']
+  });
+  const page = await browser.newPage();
+  await page.goto('https://example.com');
+  const text = await page.textContent('body');
+  console.log(text);
+  await browser.close();
+})().catch(e => console.error(e.message));
+"
+```
+
+**Note**: The `browser_navigate` tool may not support custom Chrome arguments; in that case, call playwright-core directly as shown above.
+
+## Quick Diagnosis
+
+```bash
+# Check which browsers are installed
+ls ~/.cache/ms-playwright/
+
+# Check if a specific browser process is running
+ps aux | grep playwright | grep -v grep
+
+# Check download temp dirs (partial downloads accumulate here)
+ls -la /tmp/playwright-download-*/
+
+# Find largest partial zip (likely the interrupted download)
+# (glob in the for-list: a quoted glob in a variable assignment would not expand)
+for f in /tmp/playwright-download-*/playwright-download-chromium-*.zip; do
+  test -f "$f" && echo "$(du -sh "$f" | cut -f1)  $(dirname "$f")"
+done | sort -rh | head
+```
+
+## Finding the Actual CDN Download URL
+
+Playwright CDN redirects through several hosts. The final destination (which supports Range/resume) is **storage.googleapis.com**.
+
+```bash
+# Option 1: run the playwright CLI with verbose logging and read the URL from its output
+# Option 2: read browserVersion and revision from browsers.json (entries live under the top-level "browsers" array)
+node -e "const b=require('$HOME/.hermes/hermes-agent/node_modules/playwright-core/browsers.json'); const c=b.browsers.find(x=>x.name==='chromium'); console.log(c.browserVersion, c.revision)"
+
+# Then construct the URL:
+# https://storage.googleapis.com/chrome-for-testing-public/{browserVersion}/linux64/chrome-linux64.zip
+```
+
+For Chromium revision 1217 → browserVersion **147.0.7727.15**
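+
+As a minimal sketch, the URL construction can be written as a small function. The helper name `cft_urls` and the `platform` parameter are illustrative assumptions; only the `linux64` URL patterns quoted in this document are known to apply.
+
+```python
+BASE = "https://storage.googleapis.com/chrome-for-testing-public"
+
+
+def cft_urls(browser_version, platform="linux64"):
+    """Build the chrome and headless-shell zip URLs for one browserVersion.
+
+    Hypothetical helper: other platform names are not covered by this document.
+    """
+    prefix = f"{BASE}/{browser_version}/{platform}"
+    return {
+        "chrome": f"{prefix}/chrome-{platform}.zip",
+        "headless_shell": f"{prefix}/chrome-headless-shell-{platform}.zip",
+    }
+
+
+for name, url in cft_urls("147.0.7727.15").items():
+    print(name, url)
+```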
+
+## Resume a Partial Download (curl)
+
+1. Identify the largest partial zip in `/tmp/playwright-download-*/`
+2. Copy it to a safe location
+3. Use `curl -C -` for automatic resume (reads existing file size and sends `Range` header):
+
+```bash
+PARTIAL_ZIP="/tmp/playwright-download-XRveVR/playwright-download-chromium-ubuntu24.04-x64-1217.zip"
+OUTPUT="/tmp/playwright-chromium-resume.zip"
+URL="https://storage.googleapis.com/chrome-for-testing-public/147.0.7727.15/linux64/chrome-linux64.zip"
+
+cp "$PARTIAL_ZIP" "$OUTPUT"
+nohup curl -L -C - -o "$OUTPUT" "$URL" &
+```
+
+**Important:** On its own, `curl -o` truncates the output file when it starts. Adding `-C -` makes curl read the existing file's size and resume from that offset, so the partial file must already exist at the `-o` path (hence the `cp` above). `-L` follows redirects, which is needed when the CDN URL 307-redirects to storage.googleapis.com.
+
+## Manual Installation from Downloaded Zip
+
+If Playwright's install process keeps overwriting your manual extraction:
+
+```bash
+BROWSER_DIR="$HOME/.cache/ms-playwright/chromium-1217"
+mkdir -p "$BROWSER_DIR"
+unzip -q /path/to/downloaded.zip -d "$BROWSER_DIR"
+touch "$BROWSER_DIR/.ready"
+ln -sf chrome-linux64 "$BROWSER_DIR/chrome"
+```
+
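+The `.ready` marker file tells Playwright the browser is already installed. Playwright's own installer marks a finished install with an `INSTALLATION_COMPLETE` file in the same directory, so if `.ready` is not honored by your version, create that marker as well.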
+
+## Headless Shell — Separate Installation
+
+Playwright needs **both** `chromium-1217/` (chrome binary) and `chromium_headless_shell-1217/` (headless shell). The headless shell is a separate ~112MB download:
+
+```bash
+# URL format (same browserVersion as chromium):
+# https://storage.googleapis.com/chrome-for-testing-public/{browserVersion}/linux64/chrome-headless-shell-linux64.zip
+
+BROWSER_VER="147.0.7727.15"  # from browsers.json
+HEADLESS_ZIP="/tmp/chrome-headless-shell.zip"
+HEADLESS_DIR="$HOME/.cache/ms-playwright/chromium_headless_shell-1217"
+
+curl -L -o "$HEADLESS_ZIP" "https://storage.googleapis.com/chrome-for-testing-public/${BROWSER_VER}/linux64/chrome-headless-shell-linux64.zip"
+mkdir -p "$HEADLESS_DIR"
+unzip -q "$HEADLESS_ZIP" -d "$HEADLESS_DIR"
+touch "$HEADLESS_DIR/.ready"
+ln -sf chrome-headless-shell-linux64 "$HEADLESS_DIR/chrome-headless-shell"
+```
+
+## Playwright Overwrites Cache Directory
+
+⚠️ **Critical:** When you run `playwright install chromium`, it creates a **fresh temp directory** each time (using `mkdtemp`), downloads there, then moves to the cache. If you manually extracted to `~/.cache/ms-playwright/chromium-1217/` and then run `playwright install`, it will **delete and recreate that directory** — wiping your manual extraction.
+
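+**Workaround:** Let `playwright install` run to completion whenever possible. If it is interrupted and you fall back to manual extraction, repeat the extraction after any later `playwright install` run, because that run wipes the directory again.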
+
+## Verify Installation
+
+```bash
+# Via Playwright API
+node -e "const {chromium}=require('playwright-core'); console.log(chromium.executablePath())"
+
+# Direct binary test
+"$HOME/.cache/ms-playwright/chromium-1217/chrome-linux64/chrome" --version
+```
+
+## Common Failure Modes
+
+| Symptom | Cause | Fix |
+|---------|-------|-----|
+| `playwright install` hangs at 0% | Network/DNS issue, bad CDN mirror | Set `PLAYWRIGHT_DOWNLOAD_HOST` env var |
+| Download starts but never completes | Server-side timeout, partial file left in `/tmp/` | Resume from partial using `curl -C -` |
+| "Chrome binary not found" after install | `.ready` marker missing or wrong dir name | Create marker, check `chromium-{revision}` dir name |
+| `chrome: command not found` | Sandbox/suffix issues | Check `chrome-linux64/chrome` exists and is executable |
+| Playwright reinstalls even though browser exists | `.ready` marker missing | `touch "$BROWSER_DIR/.ready"` |
+| Headless shell not found | Headless shell is a **separate install** from chromium — both are needed | Install headless shell manually from `chrome-headless-shell-linux64.zip` (same browserVersion as chromium) |
+
+## Environment Variables
+
+```bash
+PLAYWRIGHT_CHROMIUM_DOWNLOAD_HOST   # Override CDN for Chromium
+PLAYWRIGHT_FIREFOX_DOWNLOAD_HOST    # Override CDN for Firefox
+PLAYWRIGHT_DOWNLOAD_HOST            # Override CDN for all browsers
+PLAYWRIGHT_DOWNLOAD_CONNECTION_TIMEOUT  # Socket timeout in ms
+```
+
+## Key Paths
+
+- Browser cache: `~/.cache/ms-playwright/`
+- Temp downloads: `/tmp/playwright-download-*/` (deleted on system reboot)
+- Playwright node_modules: `$HOME/.hermes/hermes-agent/node_modules/playwright-core/`
+- browsers.json: `$HOME/.hermes/hermes-agent/node_modules/playwright-core/browsers.json`
--- a/devops/ssh-server-setup/SKILL.md
+++ b/devops/ssh-server-setup/SKILL.md
@@ -0,0 +1,185 @@
+---
+name: ssh-server-setup
+description: "Set up SSH key authentication, configure firewalls, harden SSH, and audit server security. Use when: (1) user needs SSH access to a server, (2) SSH connection fails with permission errors, (3) setting up key-based auth for new users, (4) enabling/configuring UFW firewall, (5) analyzing SSH attacks, (6) hardening SSH config."
+---
+
+# SSH Server Setup & Troubleshooting
+
+## Quick Setup (New Key Pair)
+
+### 1. Generate Key Pair
+```bash
+# On client machine (or use Xshell's key generator)
+ssh-keygen -t rsa -b 2048 -f ~/.ssh/id_rsa -C "user@host"
+```
+
+### 2. Install Public Key on Server
+```bash
+# On server
+mkdir -p /home/<user>/.ssh
+echo '<public-key-content>' >> /home/<user>/.ssh/authorized_keys
+chmod 700 /home/<user>/.ssh
+chmod 600 /home/<user>/.ssh/authorized_keys
+chmod 755 /home/<user>       # CRITICAL: must NOT be 777 or group-writable
+chown -R <user>:<user> /home/<user>/.ssh
+```
+
+### 3. Verify SSH Config Allows Key Auth
+```bash
+grep -E "^(PubkeyAuthentication|AuthorizedKeysFile)" /etc/ssh/sshd_config
+# Should show:
+# PubkeyAuthentication yes
+# AuthorizedKeysFile .ssh/authorized_keys
+```
+
+## ⚠️ Critical Pitfall: Home Directory Permissions
+
+**Symptom**: SSH logs show `Authentication refused: bad ownership or modes for directory /home/<user>`
+
+**Root cause**: SSH (OpenSSH) refuses public key authentication if the user's home directory has group or other write permissions (e.g., 777, 775).
+
+**Fix**:
+```bash
+chmod 755 /home/<user>
+```
+
+**Why**: OpenSSH considers a writable home directory a security risk — other users could manipulate `~/.ssh/authorized_keys`. The directory must be owned by the user and not writable by group/others.
+
+**Debugging**:
+```bash
+# Check current permissions
+ls -la /home/ | grep <user>
+# Should show drwxr-xr-x (755), NOT drwxrwxrwx (777) or drwxrwxr-x (775)
+
+# Check SSH logs for the exact error
+tail -20 /var/log/auth.log | grep -i "ssh\|publickey"
+# Or on systemd systems:
+journalctl -u ssh --no-pager -n 20
+```
+
+## Permission Checklist
+
+| Path | Owner | Permissions | Why |
+|------|-------|-------------|-----|
+| `/home/<user>` | `<user>` | `755` | SSH refuses auth if group/other writable |
+| `~/.ssh/` | `<user>` | `700` | Only owner should access SSH config |
+| `~/.ssh/authorized_keys` | `<user>` | `600` | Only owner should read/write keys |
+| `~/.ssh/id_rsa` (private) | `<user>` | `600` | Private key must be restricted |
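+
+The mode column of the checklist can be checked mechanically. The sketch below is an assumption-laden illustration: the `CHECKS` list and function names are hypothetical, it only compares permission bits (not ownership), and it runs against the current user's home directory rather than `/home/<user>`.
+
+```python
+import os
+import stat
+
+# Maximum allowed mode per path, mirroring the checklist above.
+CHECKS = [
+    ("~", 0o755),
+    ("~/.ssh", 0o700),
+    ("~/.ssh/authorized_keys", 0o600),
+    ("~/.ssh/id_rsa", 0o600),
+]
+
+
+def too_permissive(mode, allowed):
+    """True if mode grants any permission bit that `allowed` does not."""
+    return bool(stat.S_IMODE(mode) & ~allowed)
+
+
+def audit(checks=CHECKS):
+    """Return a message for every existing path whose mode is too open."""
+    problems = []
+    for raw_path, allowed in checks:
+        path = os.path.expanduser(raw_path)
+        try:
+            mode = os.stat(path).st_mode
+        except OSError:
+            continue  # missing or unreadable paths are skipped, not reported
+        if too_permissive(mode, allowed):
+            problems.append(f"{path}: {stat.S_IMODE(mode):o} exceeds {allowed:o}")
+    return problems
+
+
+for problem in audit():
+    print(problem)
+```
+
+Stricter modes pass the check, so a `750` home directory is accepted while `775` or `777` is reported.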
+
+## Xshell-Specific Notes
+
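+1. **Generate key**: Tools (工具) → User Key Manager (用户密钥管理器) → Generate (新建) → RSA 2048
+2. **Import key**: Tools (工具) → User Key Manager (用户密钥管理器) → Import (导入)
+3. **Connection settings**:
+   - Protocol (协议): SSH
+   - User Authentication (用户身份验证): set Method to **Public Key**, not Password
+   - User Name (用户名): `ubuntu` (or whatever the server user is)
+   - User Key (用户密钥): select the imported key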
+
+## Common Errors
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| `bad ownership or modes for directory` | Home dir writable by group/others | `chmod 755 /home/<user>` |
+| `bad ownership or modes for file` | authorized_keys wrong perms | `chmod 600 ~/.ssh/authorized_keys` |
+| `Permission denied (publickey)` | Key not in authorized_keys | Add public key to file |
+| `Connection closed by foreign host` | Auth failed, server disconnects | Check logs for specific reason |
+| `所选的用户密钥未在远程主机上注册` (Xshell: "The selected user key is not registered on the remote host") | Public key not installed on server | Add public key to authorized_keys |
+
+---
+
+## UFW Firewall Setup
+
+When enabling SSH access, always set up UFW in this order to avoid lockout:
+
+```bash
+# 1. Allow SSH FIRST (before enabling firewall)
+sudo ufw allow 22/tcp comment "SSH"
+
+# 2. Set default policies
+sudo ufw default deny incoming
+sudo ufw default allow outgoing
+
+# 3. Enable (use --force for non-interactive)
+sudo ufw --force enable
+
+# 4. Verify
+sudo ufw status verbose
+```
+
+**Opening additional ports later:**
+```bash
+sudo ufw allow 80/tcp comment "HTTP"
+sudo ufw allow 443/tcp comment "HTTPS"
+sudo ufw allow from <specific-ip> to any port 22  # Restrict SSH to specific IP
+```
+
+---
+
+## SSH Attack Analysis
+
+**Check attack patterns:**
+```bash
+# Failed/disconnected attempts with IP counts
+journalctl -u ssh --no-pager --since "2026-05-01" | \
+  grep -i "failed\|invalid\|refused\|disconnected.*preauth" | \
+  grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort | uniq -c | sort -rn | head -15
+
+# Successful logins only
+journalctl -u ssh --no-pager --since "2026-05-01" | grep "Accepted" | \
+  grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort | uniq -c | sort -rn
+```
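+
+For logs that have already been exported to a file, the same counting can be sketched in Python. The function name and the sample log lines are illustrative assumptions (the addresses come from documentation-reserved ranges); the keyword filter mirrors the `grep` pattern above.
+
+```python
+import re
+from collections import Counter
+
+IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
+FAIL_RE = re.compile(r"failed|invalid|refused|disconnected.*preauth", re.IGNORECASE)
+
+
+def count_failed_ips(lines):
+    """Count IPv4 addresses on lines that look like failed auth attempts."""
+    counts = Counter()
+    for line in lines:
+        if FAIL_RE.search(line):
+            counts.update(IP_RE.findall(line))
+    return counts
+
+
+# Made-up sample lines, shaped like typical sshd journal entries.
+sample = [
+    "sshd[101]: Failed password for root from 203.0.113.7 port 4022 ssh2",
+    "sshd[102]: Invalid user admin from 203.0.113.7 port 4023",
+    "sshd[103]: Accepted publickey for ubuntu from 198.51.100.4 port 5050 ssh2",
+    "sshd[104]: Disconnected from 192.0.2.9 port 6000 [preauth]",
+]
+print(count_failed_ips(sample).most_common(15))
+```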
+
+**IP geolocation lookup:**
+```bash
+# ip-api.com (free, no key needed, rate limited 45/min)
+curl -s "http://ip-api.com/json/<IP>?fields=country,regionName,isp,org"
+```
+
+**Typical attack sources:** Cloud provider IPs (Tencent Cloud, Alibaba Cloud, OVH, Hetzner) — these are botnet/scanner nodes, not targeted attacks.
+
+---
+
+## SSH Hardening (sshd_config)
+
+Add to `/etc/ssh/sshd_config`:
+
+```bash
+# Limit auth attempts per connection
+MaxAuthTries 3
+# Timeout for auth (seconds)
+LoginGraceTime 30
+# Send keepalive every 5 min
+ClientAliveInterval 300
+# Disconnect after 2 missed keepalives
+ClientAliveCountMax 2
+# Limit concurrent sessions per connection
+MaxSessions 3
+# Disable unless needed
+AllowAgentForwarding no
+# Disable unless needed
+AllowTcpForwarding no
+```
+
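+Apply: validate the file with `sudo sshd -t`, then run `sudo systemctl restart ssh` (the unit is named `sshd` on RHEL-family distributions).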
+
+---
+
+## Server Security Audit Checklist
+
+```bash
+# 1. SSH config
+cat /etc/ssh/sshd_config | grep -v "^#" | grep -v "^$"
+
+# 2. Firewall status
+sudo ufw status verbose
+
+# 3. fail2ban status (if installed)
+sudo fail2ban-client status sshd
+
+# 4. Auto-updates
+cat /etc/apt/apt.conf.d/20auto-upgrades
+
+# 5. Listening ports
+ss -tlnp | grep -v "127.0.0.1" | grep -v "::1"
+
+# 6. System resources
+free -h && df -h / && uptime
+
+# 7. Swap config
+swapon --show
+cat /proc/sys/vm/swappiness
+```
--- a/devops/webhook-subscriptions/SKILL.md
+++ b/devops/webhook-subscriptions/SKILL.md
@@ -0,0 +1,203 @@
+---
+name: webhook-subscriptions
+description: "Webhook subscriptions: event-driven agent runs."
+version: 1.1.0
+metadata:
+  hermes:
+    tags: [webhook, events, automation, integrations, notifications, push]
+---
+
+# Webhook Subscriptions
+
+Create dynamic webhook subscriptions so external services (GitHub, GitLab, Stripe, CI/CD, IoT sensors, monitoring tools) can trigger Hermes agent runs by POSTing events to a URL.
+
+## Setup (Required First)
+
+The webhook platform must be enabled before subscriptions can be created. Check with:
+```bash
+hermes webhook list
+```
+
+If it says "Webhook platform is not enabled", set it up:
+
+### Option 1: Setup wizard
+```bash
+hermes gateway setup
+```
+Follow the prompts to enable webhooks, set the port, and set a global HMAC secret.
+
+### Option 2: Manual config
+Add to `~/.hermes/config.yaml`:
+```yaml
+platforms:
+  webhook:
+    enabled: true
+    extra:
+      host: "0.0.0.0"
+      port: 8644
+      secret: "generate-a-strong-secret-here"
+```
+
+### Option 3: Environment variables
+Add to `~/.hermes/.env`:
+```bash
+WEBHOOK_ENABLED=true
+WEBHOOK_PORT=8644
+WEBHOOK_SECRET=generate-a-strong-secret-here
+```
+
+After configuration, start (or restart) the gateway:
+```bash
+hermes gateway run
+# Or if using systemd:
+systemctl --user restart hermes-gateway
+```
+
+Verify it's running:
+```bash
+curl http://localhost:8644/health
+```
+
+## Commands
+
+All management is via the `hermes webhook` CLI command:
+
+### Create a subscription
+```bash
+hermes webhook subscribe <name> \
+  --prompt "Prompt template with {payload.fields}" \
+  --events "event1,event2" \
+  --description "What this does" \
+  --skills "skill1,skill2" \
+  --deliver telegram \
+  --deliver-chat-id "12345" \
+  --secret "optional-custom-secret"
+```
+
+Returns the webhook URL and HMAC secret. The user configures their service to POST to that URL.
+
+### List subscriptions
+```bash
+hermes webhook list
+```
+
+### Remove a subscription
+```bash
+hermes webhook remove <name>
+```
+
+### Test a subscription
+```bash
+hermes webhook test <name>
+hermes webhook test <name> --payload '{"key": "value"}'
+```
+
+## Prompt Templates
+
+Prompts support `{dot.notation}` for accessing nested payload fields:
+
+- `{issue.title}` — GitHub issue title
+- `{pull_request.user.login}` — PR author
+- `{data.object.amount}` — Stripe payment amount
+- `{sensor.temperature}` — IoT sensor reading
+
+If no prompt is specified, the full JSON payload is dumped into the agent prompt.
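+
+To make the dot-notation lookup concrete, here is a minimal sketch of how such a template could be rendered. It is not the Hermes implementation: the `render` function is hypothetical, and leaving unresolved placeholders untouched is an assumption about behavior this document does not specify.
+
+```python
+import re
+
+PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")
+
+
+def render(template, payload):
+    """Replace {a.b.c} placeholders with values from a nested dict."""
+
+    def lookup(match):
+        value = payload
+        for key in match.group(1).split("."):
+            if not isinstance(value, dict) or key not in value:
+                return match.group(0)  # assumption: keep unresolved placeholders as-is
+            value = value[key]
+        return str(value)
+
+    return PLACEHOLDER.sub(lookup, template)
+
+
+payload = {"action": "opened", "issue": {"number": 7, "title": "Crash on start"}}
+print(render("Issue #{issue.number} {action}: {issue.title}", payload))
+```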
+
+## Common Patterns
+
+### GitHub: new issues
+```bash
+hermes webhook subscribe github-issues \
+  --events "issues" \
+  --prompt "New GitHub issue #{issue.number}: {issue.title}\n\nAction: {action}\nAuthor: {issue.user.login}\nBody:\n{issue.body}\n\nPlease triage this issue." \
+  --deliver telegram \
+  --deliver-chat-id "-100123456789"
+```
+
+Then in GitHub repo Settings → Webhooks → Add webhook:
+- Payload URL: the returned webhook_url
+- Content type: application/json
+- Secret: the returned secret
+- Events: "Issues"
+
+### GitHub: PR reviews
+```bash
+hermes webhook subscribe github-prs \
+  --events "pull_request" \
+  --prompt "PR #{pull_request.number} {action}: {pull_request.title}\nBy: {pull_request.user.login}\nBranch: {pull_request.head.ref}\n\n{pull_request.body}" \
+  --skills "github-code-review" \
+  --deliver github_comment
+```
+
+### Stripe: payment events
+```bash
+hermes webhook subscribe stripe-payments \
+  --events "payment_intent.succeeded,payment_intent.payment_failed" \
+  --prompt "Payment {data.object.status}: {data.object.amount} cents from {data.object.receipt_email}" \
+  --deliver telegram \
+  --deliver-chat-id "-100123456789"
+```
+
+### CI/CD: build notifications
+```bash
+hermes webhook subscribe ci-builds \
+  --events "pipeline" \
+  --prompt "Build {object_attributes.status} on {project.name} branch {object_attributes.ref}\nCommit: {commit.message}" \
+  --deliver discord \
+  --deliver-chat-id "1234567890"
+```
+
+### Generic monitoring alert
+```bash
+hermes webhook subscribe alerts \
+  --prompt "Alert: {alert.name}\nSeverity: {alert.severity}\nMessage: {alert.message}\n\nPlease investigate and suggest remediation." \
+  --deliver origin
+```
+
+### Direct delivery (no agent, zero LLM cost)
+
+For use cases where you just want to push a notification through to a user's chat — no reasoning, no agent loop — add `--deliver-only`. The rendered `--prompt` template becomes the literal message body and is dispatched directly to the target adapter.
+
+Use this for:
+- External service push notifications (Supabase/Firebase webhooks → Telegram)
+- Monitoring alerts that should forward verbatim
+- Inter-agent pings where one agent is telling another agent's user something
+- Any webhook where an LLM round trip would be wasted effort
+
+```bash
+hermes webhook subscribe antenna-matches \
+  --deliver telegram \
+  --deliver-chat-id "123456789" \
+  --deliver-only \
+  --prompt "🎉 New match: {match.user_name} matched with you!" \
+  --description "Antenna match notifications"
+```
+
+The POST returns `200 OK` on successful delivery, `502` on target failure — so upstream services can retry intelligently. HMAC auth, rate limits, and idempotency still apply.
+
+Requires `--deliver` to be a real target (telegram, discord, slack, github_comment, etc.) — `--deliver log` is rejected because log-only direct delivery is pointless.
+
+## Security
+
+- Each subscription gets an auto-generated HMAC-SHA256 secret (or provide your own with `--secret`)
+- The webhook adapter validates signatures on every incoming POST
+- Static routes from config.yaml cannot be overwritten by dynamic subscriptions
+- Subscriptions persist to `~/.hermes/webhook_subscriptions.json`
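+
+When testing by hand, it can help to compute a signature yourself. The sketch below follows GitHub's `X-Hub-Signature-256` convention (`sha256=` followed by the hex HMAC of the raw body); whether the Hermes adapter expects exactly this header format for other senders is an assumption, and the function names are illustrative.
+
+```python
+import hashlib
+import hmac
+
+
+def sign(secret, body):
+    """Return a GitHub-style signature header value for a raw request body."""
+    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
+    return f"sha256={digest}"
+
+
+def verify(secret, body, header_value):
+    """Constant-time comparison of the expected and received signatures."""
+    return hmac.compare_digest(sign(secret, body), header_value)
+
+
+body = b'{"key": "value"}'
+header = sign("example-secret", body)
+print(header)
+print(verify("example-secret", body, header))
+```
+
+The signature must be computed over the exact bytes that are sent; re-serializing the JSON before signing changes the digest.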
+
+## How It Works
+
+1. `hermes webhook subscribe` writes to `~/.hermes/webhook_subscriptions.json`
+2. The webhook adapter hot-reloads this file on each incoming request (mtime-gated, negligible overhead)
+3. When a POST arrives matching a route, the adapter formats the prompt and triggers an agent run
+4. The agent's response is delivered to the configured target (Telegram, Discord, GitHub comment, etc.)
+
+## Troubleshooting
+
+If webhooks aren't working:
+
+1. **Is the gateway running?** Check with `systemctl --user status hermes-gateway` or `ps aux | grep gateway`
+2. **Is the webhook server listening?** `curl http://localhost:8644/health` should return `{"status": "ok"}`
+3. **Check gateway logs:** `grep webhook ~/.hermes/logs/gateway.log | tail -20`
+4. **Signature mismatch?** Verify the secret in your service matches the one from `hermes webhook list`. GitHub sends `X-Hub-Signature-256`, GitLab sends `X-Gitlab-Token`.
+5. **Firewall/NAT?** The webhook URL must be reachable from the service. For local development, use a tunnel (ngrok, cloudflared).
+6. **Wrong event type?** Check `--events` filter matches what the service sends. Use `hermes webhook test <name>` to verify the route works.