init: consolidate all ephron.ren PRDs and docs
477
prd-qqbot-media-support.md
Normal file
@@ -0,0 +1,477 @@
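# PRD: Fix Media Attachment Support for QQ Bot send_message

## 1. Problem Description

The `_send_qqbot()` function in `tools/send_message_tool.py` (line 1677) sends only plain-text messages (`msg_type: 0`) and **completely ignores the `media_files` parameter**.

When an AI agent uses the `send_message` tool to send a message with media attachments (images, audio, video, documents) to the QQ Bot platform, the attachments are silently dropped and the user receives only the text. Worse, if the message contains only media attachments and no text, the system returns an error stating that QQ Bot media delivery is not supported.

Meanwhile, the `QQAdapter` class in the gateway adapter `gateway/platforms/qqbot/adapter.py` already has full media-sending capability:

| Method | Line | Purpose |
|--------|------|---------|
| `send_image()` | 2601 | Send an image (URL or local file) |
| `send_image_file()` | 2627 | Send a local image file |
| `send_voice()` | 2641 | Send a voice message |
| `send_video()` | 2655 | Send a video |
| `send_document()` | 2669 | Send a file/document |
| `_send_media()` | 2690 | Low-level media upload (direct transfer for HTTP URLs, chunked upload for local files) |

**Core mismatch: the adapter already has full media capability, but the routing layer of the `send_message` tool is not wired to it.**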
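
## 2. Scope of Impact

### Direct impact

- Every scenario that sends a message with media attachments through the QQ Bot platform is affected
- This covers all media types: images, voice, video, and documents/PDFs

### Indirect impact

- When the AI agent needs to send screenshots, generated images, files, or similar content to QQ users or groups, the content never arrives
- The QQ Bot platform's user experience is incomplete as a result

### Platforms involved

- `Platform.QQBOT` (QQ Bot Open Platform)

### Not affected

- Other platforms whose media support is already wired correctly (Telegram, Discord, Matrix, WeChat, Signal, Yuanbao, Feishu)
- Inbound message handling in the QQ Bot gateway adapter (receiving media messages works)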

## 3. Root Cause Analysis

### 3.1 Call Path Analysis

The current QQ Bot send path:

```
send_message_tool()
  → split the message into chunks
  → lines 658-659: _send_qqbot(pconfig, chat_id, chunk)   ← no media_files argument
    → _send_qqbot() issues the REST request directly via httpx
      → payload = {"content": message, "msg_type": 0}     ← hard-coded plain text
```

For comparison, the correct path on the Yuanbao platform:

```
send_message_tool()
  → lines 586-598: YUANBAO + media_files detected
    → _send_yuanbao(chat_id, chunk, media_files=media_files)
      → get_active_adapter() fetches the running gateway adapter
        → send_yuanbao_direct(adapter, chat_id, message, media_files=media_files)
          → the adapter handles media upload and delivery
```

### 3.2 Three Break Points

1. **The `_send_qqbot()` signature has no `media_files` parameter** (line 1677)
2. **The call site does not pass `media_files`** (line 659)
3. **`QQAdapter` has no `get_active()` singleton registry**, so the tool layer cannot obtain the gateway adapter

### 3.3 Missing Singleton Registration

The Yuanbao adapter `YuanbaoAdapter` has a complete singleton pattern (lines 4392-4404):

- `_active_instance` class variable
- `get_active()` class method
- `set_active()` class method

`QQAdapter` lacks this mechanism, so the `send_message` tool cannot obtain the running gateway adapter instance.
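
## 4. Fix

### 4.1 Change Overview

**2 files** need to be modified, with **5 changes** in total:

| File | Change | Notes |
|------|--------|-------|
| `gateway/platforms/qqbot/adapter.py` | Add the singleton pattern | `get_active()` / `set_active()` plus the connect/disconnect lifecycle hooks |
| `gateway/platforms/qqbot/adapter.py` | Add a module-level `get_active_adapter()` | Imported by the tool layer |
| `tools/send_message_tool.py` | Change the `_send_qqbot()` signature and implementation | Accept `media_files` and route through the gateway adapter |
| `tools/send_message_tool.py` | Add a QQBOT + media routing branch | Media handling in the main function, similar to the Yuanbao/Feishu branches |
| `tools/send_message_tool.py` | Update the supported-platform messages | Add `qqbot` to the error and warning text that lists media-capable platforms |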

### 4.2 Code Diff

#### 4.2.1 `gateway/platforms/qqbot/adapter.py`: Add the Singleton Pattern

Add the singleton registration mechanism inside the `QQAdapter` class definition (after line 155):

```diff
 class QQAdapter(BasePlatformAdapter):
     """QQ Bot adapter backed by the official QQ Bot WebSocket Gateway + REST API."""

     # QQ Bot API does not support editing sent messages.
     SUPPORTS_MESSAGE_EDITING = False
     MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
     _TYPING_INPUT_SECONDS = 60
     _TYPING_DEBOUNCE_SECONDS = 50

+    # -- Active instance registry (class-level singleton) ---
+
+    _active_instance: ClassVar[Optional["QQAdapter"]] = None
+
+    @classmethod
+    def get_active(cls) -> Optional["QQAdapter"]:
+        """Return the currently connected QQAdapter, or None."""
+        return cls._active_instance
+
+    @classmethod
+    def set_active(cls, adapter: Optional["QQAdapter"]) -> None:
+        """Register (or clear) the active adapter instance."""
+        cls._active_instance = adapter
+
     @property
     def _log_tag(self) -> str:
```

Register the active instance in `connect()` (after `_mark_connected()` on line 301):

```diff
         # 4. Start listeners
         self._listen_task = asyncio.create_task(self._listen_loop())
         self._heartbeat_task = asyncio.create_task(self._heartbeat_loop())
         self._mark_connected()
+        QQAdapter.set_active(self)
         logger.info("[%s] Connected", self._log_tag)
         return True
```

Clear the active instance in `disconnect()` (after `_mark_disconnected()` on line 314):

```diff
     async def disconnect(self) -> None:
         """Close all connections and stop listeners."""
         self._running = False
         self._mark_disconnected()
+        if QQAdapter.get_active() is self:
+            QQAdapter.set_active(None)
```
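
The `is self` guard in `disconnect()` matters when an adapter is restarted: without it, a stale instance that disconnects late would clear the registration of its replacement. The following is a minimal, self-contained sketch of that behavior. `FakeAdapter` is a hypothetical stand-in for `QQAdapter` that models only the registry logic, not the real connection handling.

```python
from typing import ClassVar, Optional


class FakeAdapter:
    """Hypothetical stand-in for QQAdapter; only the registry logic is modeled."""

    _active_instance: ClassVar[Optional["FakeAdapter"]] = None

    @classmethod
    def get_active(cls) -> Optional["FakeAdapter"]:
        return cls._active_instance

    @classmethod
    def set_active(cls, adapter: Optional["FakeAdapter"]) -> None:
        cls._active_instance = adapter

    def connect(self) -> None:
        # Mirrors the proposed connect(): register once the connection is up.
        FakeAdapter.set_active(self)

    def disconnect(self) -> None:
        # Mirrors the proposed disconnect(): clear only our own registration.
        if FakeAdapter.get_active() is self:
            FakeAdapter.set_active(None)


old, new = FakeAdapter(), FakeAdapter()
old.connect()
new.connect()      # a restart replaces the registered instance
old.disconnect()   # the stale instance must not clear the new registration
still_registered = FakeAdapter.get_active() is new
new.disconnect()
cleared = FakeAdapter.get_active() is None
```

With the guard in place, `still_registered` and `cleared` should both be true; dropping the guard would leave the registry empty after `old.disconnect()`.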

Add a module-level delegate function at the end of the file:

```diff
+# ---------------------------------------------------------------------------
+# Module-level thin delegates (preserve import compatibility for send_message tool)
+# ---------------------------------------------------------------------------
+
+
+def get_active_adapter() -> Optional["QQAdapter"]:
+    """Delegate to ``QQAdapter.get_active()``."""
+    return QQAdapter.get_active()
```

Make sure `ClassVar` is imported at the top of the file (add it if it is not already there):

```diff
-from typing import Any, ClassVar, Dict, List, Optional
+from typing import Any, ClassVar, Dict, List, Optional  # confirm ClassVar is imported
```

#### 4.2.2 `tools/send_message_tool.py`: Add QQ Bot Media Support

**Change 1: modify the `_send_qqbot()` signature and implementation** (line 1677)

```diff
-async def _send_qqbot(pconfig, chat_id, message):
-    """Send via QQBot using the REST API directly (no WebSocket needed).
-
-    Uses the QQ Bot Open Platform REST endpoints to get an access token
-    and post a message. Supports guild channels, C2C (private) chats,
-    and group chats by trying the appropriate endpoints.
-    """
-    try:
-        import httpx
-    except ImportError:
-        return _error("QQBot direct send requires httpx. Run: pip install httpx")
-
-    extra = pconfig.extra or {}
-    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
-    secret = (pconfig.token or extra.get("client_secret")
-              or os.getenv("QQ_CLIENT_SECRET", ""))
-    if not appid or not secret:
-        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
-
-    try:
-        async with httpx.AsyncClient(timeout=15) as client:
-            # ... (rest of function unchanged until payload line)
-
-            payload = {"content": message[:4000], "msg_type": 0}
-
-            # ... (endpoint attempts unchanged)
-    except Exception as e:
-        return _error(f"QQBot send failed: {e}")
+async def _send_qqbot(pconfig, chat_id, message, media_files=None):
+    """Send via QQBot using the running gateway adapter.
+
+    When media_files are present, routes through the gateway adapter's
+    native media upload pipeline (send_image, send_document, etc.).
+    Falls back to REST API for text-only messages when the adapter
+    is not running.
+    """
+    # If we have media files, try the gateway adapter first
+    if media_files:
+        try:
+            from gateway.platforms.qqbot.adapter import get_active_adapter
+        except ImportError:
+            return _error("QQBot adapter module not available.")
+
+        adapter = get_active_adapter()
+        if adapter is None:
+            return _error(
+                "QQBot adapter is not running. "
+                "Start the gateway with qqbot platform enabled first "
+                "to send media attachments."
+            )
+
+        # Send text first (if any)
+        if message.strip():
+            text_result = await adapter.send(chat_id=chat_id, content=message)
+            if not text_result.success:
+                return {"error": f"QQBot text send failed: {text_result.error}"}
+
+        # Send each media file, picking the adapter method from the MIME type
+        import mimetypes
+        last_result = None
+        for file_path, _is_url in media_files:
+            mime, _ = mimetypes.guess_type(file_path)
+            mime = (mime or "").lower()
+
+            if mime.startswith("image/"):
+                result = await adapter.send_image(chat_id, file_path)
+            elif mime.startswith("video/"):
+                result = await adapter.send_video(chat_id, file_path)
+            elif mime.startswith("audio/"):
+                result = await adapter.send_voice(chat_id, file_path)
+            else:
+                result = await adapter.send_document(chat_id, file_path)
+
+            if not result.success:
+                return {"error": f"QQBot media send failed: {result.error}"}
+            last_result = result
+
+        return {
+            "success": True,
+            "platform": "qqbot",
+            "chat_id": chat_id,
+            "media_sent": len(media_files),
+        }
+
+    # Text-only: use REST API directly (no gateway needed)
+    try:
+        import httpx
+    except ImportError:
+        return _error("QQBot direct send requires httpx. Run: pip install httpx")
+
+    extra = pconfig.extra or {}
+    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
+    secret = (pconfig.token or extra.get("client_secret")
+              or os.getenv("QQ_CLIENT_SECRET", ""))
+    if not appid or not secret:
+        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
+
+    try:
+        async with httpx.AsyncClient(timeout=15) as client:
+            # (existing REST API logic unchanged)
+            # Step 1: Get access token
+            token_resp = await client.post(
+                "https://bots.qq.com/app/getAppAccessToken",
+                json={"appId": str(appid), "clientSecret": str(secret)},
+            )
+            if token_resp.status_code != 200:
+                return _error(f"QQBot token request failed: {token_resp.status_code}")
+            token_data = token_resp.json()
+            access_token = token_data.get("access_token")
+            if not access_token:
+                return _error("QQBot: no access_token in response")
+
+            headers = {
+                "Authorization": f"QQBot {access_token}",
+                "Content-Type": "application/json",
+            }
+            payload = {"content": message[:4000], "msg_type": 0}
+
+            # Try channel endpoint first
+            url = f"https://api.sgroup.qq.com/channels/{chat_id}/messages"
+            resp = await client.post(url, json=payload, headers=headers)
+            if resp.status_code in (200, 201):
+                data = resp.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try C2C endpoint
+            url_c2c = f"https://api.sgroup.qq.com/v2/users/{chat_id}/messages"
+            resp_c2c = await client.post(url_c2c, json=payload, headers=headers)
+            if resp_c2c.status_code in (200, 201):
+                data = resp_c2c.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try group endpoint
+            url_group = f"https://api.sgroup.qq.com/v2/groups/{chat_id}/messages"
+            resp_group = await client.post(url_group, json=payload, headers=headers)
+            if resp_group.status_code in (200, 201):
+                data = resp_group.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            return _error(f"QQBot send failed: channel={resp.status_code} c2c={resp_c2c.status_code} group={resp_group.status_code}")
+    except Exception as e:
+        return _error(f"QQBot send failed: {e}")
```
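
The three endpoint attempts in the text-only path share the same shape, so they could be folded into a loop. The sketch below illustrates that idea only and is not part of the proposed diff. `post_first_success` and `fake_post` are hypothetical names, and the fake transport stands in for `httpx.AsyncClient.post` so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

# (label, URL template) pairs in the order the diff above tries them.
ENDPOINTS = [
    ("channel", "https://api.sgroup.qq.com/channels/{chat_id}/messages"),
    ("c2c", "https://api.sgroup.qq.com/v2/users/{chat_id}/messages"),
    ("group", "https://api.sgroup.qq.com/v2/groups/{chat_id}/messages"),
]

# Hypothetical transport signature: (url, payload) -> (status_code, json_body)
PostFn = Callable[[str, dict], Awaitable[Tuple[int, dict]]]


async def post_first_success(post: PostFn, chat_id: str, payload: dict) -> dict:
    """Try each endpoint in order; return on the first 200/201 response."""
    statuses: Dict[str, int] = {}
    for label, template in ENDPOINTS:
        status, data = await post(template.format(chat_id=chat_id), payload)
        if status in (200, 201):
            return {"success": True, "platform": "qqbot", "chat_id": chat_id,
                    "message_id": data.get("id")}
        statuses[label] = status
    # Every endpoint rejected the message: report all status codes.
    detail = " ".join(f"{label}={code}" for label, code in statuses.items())
    return {"error": f"QQBot send failed: {detail}"}


async def fake_post(url: str, payload: dict) -> Tuple[int, dict]:
    """Hypothetical transport: only the C2C endpoint accepts the message."""
    if "/v2/users/" in url:
        return 200, {"id": "msg-1"}
    return 404, {}


result = asyncio.run(
    post_first_success(fake_post, "u123", {"content": "hi", "msg_type": 0})
)
```

Here the channel attempt fails and the C2C attempt succeeds, so the group endpoint is never called. The explicit three-block version in the diff behaves the same way; the loop only removes the repetition.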

**Change 2: add a QQBOT + media routing branch to the main function** (after lines 585-598, before the Feishu branch at line 600)

```diff
+    # --- QQBot: native media attachment support via running gateway adapter ---
+    if platform == Platform.QQBOT and media_files:
+        last_result = None
+        for i, chunk in enumerate(chunks):
+            is_last = (i == len(chunks) - 1)
+            result = await _send_qqbot(
+                pconfig,
+                chat_id,
+                chunk,
+                media_files=media_files if is_last else None,
+            )
+            if isinstance(result, dict) and result.get("error"):
+                return result
+            last_result = result
+        return last_result
+
     # --- Feishu: native media attachment support via adapter ---
     if platform == Platform.FEISHU and media_files:
```
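
The routing branch attaches media only to the final chunk. A minimal sketch of that loop shows which calls carry attachments. `fake_send` is a hypothetical recorder used in place of `_send_qqbot`, and the `(path, is_url)` tuple shape is assumed from the unit tests later in this document.

```python
import asyncio
from typing import List, Optional, Tuple

MediaFiles = List[Tuple[str, bool]]  # (path or URL, is_url)
calls: List[Tuple[str, Optional[MediaFiles]]] = []


async def fake_send(chunk: str, media_files: Optional[MediaFiles] = None) -> dict:
    """Hypothetical stand-in for _send_qqbot that records its arguments."""
    calls.append((chunk, media_files))
    return {"success": True}


async def send_chunks(chunks: List[str], media_files: MediaFiles) -> Optional[dict]:
    last_result = None
    for i, chunk in enumerate(chunks):
        is_last = i == len(chunks) - 1
        # Earlier chunks are text-only; attachments ride on the last one.
        result = await fake_send(chunk, media_files if is_last else None)
        if result.get("error"):
            return result  # stop at the first failure
        last_result = result
    return last_result


media = [("/tmp/test.png", False)]
final = asyncio.run(send_chunks(["part 1", "part 2", "part 3"], media))
```

After the run, `calls` holds three entries and only the third carries `media`. In the real branch this means the earlier chunks take the REST text path and only the last chunk requires the running gateway adapter.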

**Change 3: update the error messages that list media-capable platforms** (lines 618-630)

```diff
     if media_files and not message.strip():
         return {
             "error": (
-                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu; "
+                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot; "
                 f"target {platform.value} had only media attachments"
             )
         }
     warning = None
     if media_files:
         warning = (
             f"MEDIA attachments were omitted for {platform.value}; "
-            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu"
+            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot"
         )
```
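
### 4.3 Design Decisions

| Decision | Rationale |
|----------|-----------|
| Send text and media separately | In the QQ Bot API, `msg_type=0` (text) and `msg_type=7` (media/rich media) are distinct message types and cannot be combined in one message |
| Send media attachments only with the last chunk | Matches the Yuanbao/Feishu pattern and avoids resending the media after every text chunk |
| Keep the direct REST path for text-only messages | Backward compatibility: plain text can still be sent when the gateway is not running, so there is no hard dependency on the gateway |
| Infer the media type from the MIME type | Reuses the Python standard library `mimetypes`, so no extra dependency is needed |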
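
The MIME-based dispatch from the last row can be sketched in isolation. `pick_sender` is a hypothetical helper that returns the name of the adapter method the diff would call. Stripping the URL query string with `urlparse` before guessing is an addition in this sketch and is not part of the proposed diff.

```python
import mimetypes
from urllib.parse import urlparse


def pick_sender(path_or_url: str) -> str:
    """Return the name of the adapter method suggested by the MIME type."""
    # Guess from the path component only, so a URL query string is ignored.
    path = urlparse(path_or_url).path or path_or_url
    mime, _ = mimetypes.guess_type(path)
    mime = (mime or "").lower()
    if mime.startswith("image/"):
        return "send_image"
    if mime.startswith("video/"):
        return "send_video"
    if mime.startswith("audio/"):
        return "send_voice"
    # Unknown or non-media types fall back to a plain file upload.
    return "send_document"


samples = [
    "/tmp/a.png",
    "https://example.com/clip.mp4?sig=abc",
    "/tmp/voice.mp3",
    "/tmp/report.pdf",
    "/tmp/noext",
]
chosen = {name: pick_sender(name) for name in samples}
```

Because the guess relies on the file extension alone, a file without an extension or with a misleading one is sent as a document. That matches the `else` branch in the diff, but it may be worth noting for generated files that lack a suffix.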

## 5. Verification

### 5.1 Unit Tests

```python
# Test 1: QQAdapter singleton pattern
async def test_qqbot_adapter_singleton():
    """Verify that set_active/get_active register and clear the instance correctly."""
    from gateway.platforms.qqbot.adapter import QQAdapter

    assert QQAdapter.get_active() is None

    # Build an adapter instance from a mock config
    mock_config = PlatformConfig(platform=Platform.QQBOT, ...)
    adapter = QQAdapter(mock_config)
    QQAdapter.set_active(adapter)
    assert QQAdapter.get_active() is adapter

    QQAdapter.set_active(None)
    assert QQAdapter.get_active() is None


# Test 2: _send_qqbot media routing
async def test_send_qqbot_with_media(mocker):
    """Verify that calls with media_files are routed through the gateway adapter."""
    mock_adapter = mocker.MagicMock()
    mock_adapter.send = mocker.AsyncMock(return_value=SendResult(success=True))
    mock_adapter.send_image = mocker.AsyncMock(return_value=SendResult(success=True))

    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=mock_adapter,
    )

    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="caption",
        media_files=[("/tmp/test.png", False)],
    )

    assert result["success"] is True
    assert result["media_sent"] == 1
    mock_adapter.send.assert_called_once()
    mock_adapter.send_image.assert_called_once()


# Test 3: _send_qqbot error handling when the gateway is not running
async def test_send_qqbot_media_no_adapter(mocker):
    """Verify that a clear error is returned when the gateway is not running."""
    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=None,
    )

    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="",
        media_files=[("/tmp/test.png", False)],
    )

    assert "error" in result
    assert "adapter is not running" in result["error"]


# Test 4: text-only messages keep their existing behavior
async def test_send_qqbot_text_only_unchanged(mocker):
    """Verify that text-only messages are unaffected and still go through the REST API."""
    # (mock httpx and verify REST API path is taken)
```
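
### 5.2 Integration Tests

1. **Start the gateway**: make sure `platforms.qq.enabled: true` is set in `config.yaml` and the gateway connects to QQ Bot successfully
2. **Send plain text**: send a text-only message to a QQ user through send_message and verify it arrives
3. **Send an image**: send a message with an image attachment to a QQ user through send_message and verify the image is uploaded natively and displayed
4. **Send a document**: send a PDF/file attachment and verify the file can be downloaded
5. **Send voice/video**: send audio and video files and verify they play natively
6. **Fallback without the gateway**: stop the gateway, send plain text, and verify it is still delivered directly through the REST API
7. **Media without the gateway**: stop the gateway, send a message with media, and verify a clear error is returned

### 5.3 Regression Tests

```bash
# Run the existing test suite to make sure nothing regresses
cd /home/ubuntu/.hermes/hermes-agent
source .venv/bin/activate
python -m pytest tests/ -x -q

# Run the QQ Bot related tests (if any)
python -m pytest tests/ -k "qqbot" -v
```

### 5.4 Manual Verification Checklist

- [ ] The QQ Bot gateway connects normally (the log shows `QQBot: Connected`)
- [ ] Text-only messages are sent successfully
- [ ] Messages with image attachments are sent successfully (the user sees a native image)
- [ ] Messages with document attachments are sent successfully (the user can download the file)
- [ ] Messages with voice attachments are sent successfully
- [ ] Messages with video attachments are sent successfully
- [ ] The text-only fallback works when the gateway is not running
- [ ] Media attachments return a clear error when the gateway is not running
- [ ] Sending on other platforms (Telegram, Discord, etc.) is unaffected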