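# PRD: Fix media attachment support in QQ Bot send_message
## 1. Problem Description
The `_send_qqbot()` function in `tools/send_message_tool.py` (line 1677) sends only plain-text messages (`msg_type: 0`) and **ignores `media_files` entirely**.

When an AI Agent uses the `send_message` tool to send a message with media attachments (images, audio, video, documents) to the QQ Bot platform, the attachments are silently dropped and the user receives only the text. Worse, if the message contains only media attachments and no text, the system returns an error saying that QQ Bot media delivery is not supported.

Meanwhile, the `QQAdapter` class in the gateway adapter `gateway/platforms/qqbot/adapter.py` already has full media-sending capability:
| Method | Line | Purpose |
|------|------|------|
| `send_image()` | 2601 | Send an image (URL or local file) |
| `send_image_file()` | 2627 | Send a local image file |
| `send_voice()` | 2641 | Send a voice message |
| `send_video()` | 2655 | Send a video |
| `send_document()` | 2669 | Send a file/document |
| `_send_media()` | 2690 | Low-level media upload (direct upload for HTTP URLs, chunked upload for local files) |

**Core mismatch: the adapter already has full media capability, but the routing layer of the `send_message` tool is not wired to it.**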
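## 2. Scope of Impact
### Direct impact
- Every scenario that sends a message with media attachments through the QQ Bot platform is affected
- This covers all media types: images, voice, video, and documents/PDFs
### Indirect impact
- When an AI Agent needs to send screenshots, generated images, files, or similar content to QQ users or groups, the user never receives them
- The user experience on the QQ Bot platform is incomplete as a result
### Platforms involved
- `Platform.QQBOT` (QQ Bot Open Platform)
### Not affected
- Other platforms that already support media correctly (Telegram, Discord, Matrix, WeChat, Signal, Yuanbao, Feishu)
- Inbound message handling in the QQ Bot gateway adapter (receiving media messages works normally)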
## 3. Root Cause Analysis
### 3.1 Call path analysis
The current send path for QQ Bot messages:
```
send_message_tool()
→ split the message into chunks
→ lines 658-659: _send_qqbot(pconfig, chat_id, chunk) ← no media_files argument
→ _send_qqbot() sends the REST request directly with httpx
→ payload = {"content": message, "msg_type": 0} ← hard-coded plain text
```
For comparison, the correct path on the Yuanbao platform:
```
send_message_tool()
→ lines 586-598: detects YUANBAO + media_files
→ _send_yuanbao(chat_id, chunk, media_files=media_files)
→ get_active_adapter() fetches the running gateway adapter
→ send_yuanbao_direct(adapter, chat_id, message, media_files=media_files)
→ the adapter handles media upload and sending
```
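### 3.2 Three gaps
1. **The `_send_qqbot()` signature has no `media_files` parameter** (line 1677)
2. **The call site does not pass `media_files`** (line 659)
3. **`QQAdapter` has no `get_active()` singleton accessor**, so the tool layer cannot obtain the gateway adapter
### 3.3 Missing singleton registration
The Yuanbao adapter `YuanbaoAdapter` has a complete singleton pattern (lines 4392-4404):
- `_active_instance` class variable
- `get_active()` class method
- `set_active()` class method

`QQAdapter` lacks this mechanism, so the `send_message` tool cannot obtain the running gateway adapter instance.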
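## 4. Fix
### 4.1 Overview of changes
**2 files** need to be modified, with **4 changes** in total:
| File | Change | Description |
|------|------|------|
| `gateway/platforms/qqbot/adapter.py` | Add singleton pattern | `get_active()` / `set_active()` plus connect/disconnect lifecycle hooks |
| `gateway/platforms/qqbot/adapter.py` | Add module-level `get_active_adapter()` | Importable by the tool layer |
| `tools/send_message_tool.py` | Change the `_send_qqbot()` signature and implementation | Accept `media_files` and route through the gateway adapter |
| `tools/send_message_tool.py` | Add a QQBOT + media routing branch | Add media handling to the main function, similar to Yuanbao/Feishu |
### 4.2 Code Diff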
#### 4.2.1 `gateway/platforms/qqbot/adapter.py`: add the singleton pattern
Add the singleton registry to the `QQAdapter` class definition (after line 155):
```diff
 class QQAdapter(BasePlatformAdapter):
     """QQ Bot adapter backed by the official QQ Bot WebSocket Gateway + REST API."""
     # QQ Bot API does not support editing sent messages.
     SUPPORTS_MESSAGE_EDITING = False
     MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
     _TYPING_INPUT_SECONDS = 60
     _TYPING_DEBOUNCE_SECONDS = 50
+    # -- Active instance registry (class-level singleton) ---
+
+    _active_instance: ClassVar[Optional["QQAdapter"]] = None
+
+    @classmethod
+    def get_active(cls) -> Optional["QQAdapter"]:
+        """Return the currently connected QQAdapter, or None."""
+        return cls._active_instance
+
+    @classmethod
+    def set_active(cls, adapter: Optional["QQAdapter"]) -> None:
+        """Register (or clear) the active adapter instance."""
+        cls._active_instance = adapter
+
     @property
     def _log_tag(self) -> str:
```
Register the active instance in `connect()` (after `_mark_connected()` on line 301):
```diff
         # 4. Start listeners
         self._listen_task = asyncio.create_task(self._listen_loop())
         self._heartbeat_task = asyncio.create_task(self._heartbeat_loop())
         self._mark_connected()
+        QQAdapter.set_active(self)
         logger.info("[%s] Connected", self._log_tag)
         return True
```
Clear the active instance in `disconnect()` (after `_mark_disconnected()` on line 314):
```diff
     async def disconnect(self) -> None:
         """Close all connections and stop listeners."""
         self._running = False
         self._mark_disconnected()
+        if QQAdapter.get_active() is self:
+            QQAdapter.set_active(None)
```
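
The identity check in `disconnect()` matters when more than one adapter instance exists during the life of the process, for example when a new instance registers before an old one finishes shutting down. The following is a minimal, self-contained sketch of that lifecycle. `DemoAdapter` is a hypothetical stand-in for `QQAdapter`; it models only the registry logic and omits all networking.

```python
from typing import ClassVar, Optional


class DemoAdapter:
    """Hypothetical stand-in for QQAdapter; only the registry logic is modeled."""

    _active_instance: ClassVar[Optional["DemoAdapter"]] = None

    @classmethod
    def get_active(cls) -> Optional["DemoAdapter"]:
        return cls._active_instance

    @classmethod
    def set_active(cls, adapter: Optional["DemoAdapter"]) -> None:
        cls._active_instance = adapter

    def connect(self) -> None:
        # Register this instance once it is ready to send.
        DemoAdapter.set_active(self)

    def disconnect(self) -> None:
        # Clear the registry only if it still points at this instance,
        # so a stale instance cannot unregister a newer one.
        if DemoAdapter.get_active() is self:
            DemoAdapter.set_active(None)


old, new = DemoAdapter(), DemoAdapter()
old.connect()
new.connect()       # the newer instance replaces the older one
old.disconnect()    # no effect: `old` is no longer the active instance
still_active = DemoAdapter.get_active()
new.disconnect()    # now the registry is cleared
after_all = DemoAdapter.get_active()
```

Without the `is self` check, the late `old.disconnect()` would leave the registry empty even though `new` is still connected, and media sends would fail with the "adapter is not running" error.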
Add a module-level delegate function at the end of the file:
```diff
+# ---------------------------------------------------------------------------
+# Module-level thin delegates (preserve import compatibility for send_message tool)
+# ---------------------------------------------------------------------------
+
+
+def get_active_adapter() -> Optional["QQAdapter"]:
+    """Delegate to ``QQAdapter.get_active()``."""
+    return QQAdapter.get_active()
```
Make sure `ClassVar` is included in the `typing` import at the top of the file (add it if it is not already imported):
```diff
-from typing import Any, ClassVar, Dict, List, Optional
+from typing import Any, ClassVar, Dict, List, Optional  # confirm ClassVar is imported
```
#### 4.2.2 `tools/send_message_tool.py`: add QQ Bot media support
**Change 1: Update the `_send_qqbot()` signature and implementation** (line 1677)
```diff
-async def _send_qqbot(pconfig, chat_id, message):
-    """Send via QQBot using the REST API directly (no WebSocket needed).
-
-    Uses the QQ Bot Open Platform REST endpoints to get an access token
-    and post a message. Supports guild channels, C2C (private) chats,
-    and group chats by trying the appropriate endpoints.
-    """
-    try:
-        import httpx
-    except ImportError:
-        return _error("QQBot direct send requires httpx. Run: pip install httpx")
-
-    extra = pconfig.extra or {}
-    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
-    secret = (pconfig.token or extra.get("client_secret")
-              or os.getenv("QQ_CLIENT_SECRET", ""))
-    if not appid or not secret:
-        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
-
-    try:
-        async with httpx.AsyncClient(timeout=15) as client:
-            # ... (rest of function unchanged until payload line)
-
-            payload = {"content": message[:4000], "msg_type": 0}
-
-            # ... (endpoint attempts unchanged)
-    except Exception as e:
-        return _error(f"QQBot send failed: {e}")
+async def _send_qqbot(pconfig, chat_id, message, media_files=None):
+    """Send via QQBot, routing media through the running gateway adapter.
+
+    When media_files are present, routes through the gateway adapter's
+    native media upload pipeline (send_image, send_document, etc.).
+    Text-only messages use the REST API directly, so they still work
+    when the adapter is not running.
+    """
+    # If we have media files, the running gateway adapter is required
+    if media_files:
+        try:
+            from gateway.platforms.qqbot.adapter import get_active_adapter
+        except ImportError:
+            return _error("QQBot adapter module not available.")
+
+        adapter = get_active_adapter()
+        if adapter is None:
+            return _error(
+                "QQBot adapter is not running. "
+                "Start the gateway with qqbot platform enabled first "
+                "to send media attachments."
+            )
+
+        # Send text first (if any)
+        if message.strip():
+            text_result = await adapter.send(chat_id=chat_id, content=message)
+            if not text_result.success:
+                return {"error": f"QQBot text send failed: {text_result.error}"}
+
+        # Send each media file, choosing the adapter method by MIME type
+        import mimetypes
+        last_result = None
+        for file_path, _is_url in media_files:
+            mime, _ = mimetypes.guess_type(file_path)
+            mime = (mime or "").lower()
+
+            if mime.startswith("image/"):
+                result = await adapter.send_image(chat_id, file_path)
+            elif mime.startswith("video/"):
+                result = await adapter.send_video(chat_id, file_path)
+            elif mime.startswith("audio/"):
+                result = await adapter.send_voice(chat_id, file_path)
+            else:
+                result = await adapter.send_document(chat_id, file_path)
+
+            if not result.success:
+                return {"error": f"QQBot media send failed: {result.error}"}
+            last_result = result
+
+        return {
+            "success": True,
+            "platform": "qqbot",
+            "chat_id": chat_id,
+            "media_sent": len(media_files),
+        }
+
+    # Text-only: use REST API directly (no gateway needed)
+    try:
+        import httpx
+    except ImportError:
+        return _error("QQBot direct send requires httpx. Run: pip install httpx")
+
+    extra = pconfig.extra or {}
+    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
+    secret = (pconfig.token or extra.get("client_secret")
+              or os.getenv("QQ_CLIENT_SECRET", ""))
+    if not appid or not secret:
+        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
+
+    try:
+        async with httpx.AsyncClient(timeout=15) as client:
+            # (existing REST API logic unchanged)
+            # Step 1: Get access token
+            token_resp = await client.post(
+                "https://bots.qq.com/app/getAppAccessToken",
+                json={"appId": str(appid), "clientSecret": str(secret)},
+            )
+            if token_resp.status_code != 200:
+                return _error(f"QQBot token request failed: {token_resp.status_code}")
+            token_data = token_resp.json()
+            access_token = token_data.get("access_token")
+            if not access_token:
+                return _error("QQBot: no access_token in response")
+
+            headers = {
+                "Authorization": f"QQBot {access_token}",
+                "Content-Type": "application/json",
+            }
+            payload = {"content": message[:4000], "msg_type": 0}
+
+            # Try channel endpoint first
+            url = f"https://api.sgroup.qq.com/channels/{chat_id}/messages"
+            resp = await client.post(url, json=payload, headers=headers)
+            if resp.status_code in (200, 201):
+                data = resp.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try C2C endpoint
+            url_c2c = f"https://api.sgroup.qq.com/v2/users/{chat_id}/messages"
+            resp_c2c = await client.post(url_c2c, json=payload, headers=headers)
+            if resp_c2c.status_code in (200, 201):
+                data = resp_c2c.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try group endpoint
+            url_group = f"https://api.sgroup.qq.com/v2/groups/{chat_id}/messages"
+            resp_group = await client.post(url_group, json=payload, headers=headers)
+            if resp_group.status_code in (200, 201):
+                data = resp_group.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            return _error(f"QQBot send failed: channel={resp.status_code} c2c={resp_c2c.status_code} group={resp_group.status_code}")
+    except Exception as e:
+        return _error(f"QQBot send failed: {e}")
```
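
The text-only path tries the channel, C2C, and group endpoints in turn and, if none accepts the message, reports all three status codes. That fallback order can be sketched in isolation as below. This is an illustration only, not a proposed change to the diff: `post_first_accepted` and the injected `post` callable are hypothetical names, and the fake transports stand in for the real `httpx` calls.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional, Tuple

# URL templates taken from the diff above; the label is used in the error text.
ENDPOINTS = (
    ("channel", "https://api.sgroup.qq.com/channels/{chat_id}/messages"),
    ("c2c", "https://api.sgroup.qq.com/v2/users/{chat_id}/messages"),
    ("group", "https://api.sgroup.qq.com/v2/groups/{chat_id}/messages"),
)

# Hypothetical transport: takes a URL, returns (status_code, message_id or None).
PostFn = Callable[[str], Awaitable[Tuple[int, Optional[str]]]]


async def post_first_accepted(chat_id: str, post: PostFn) -> Dict[str, object]:
    """Try each endpoint in order and stop at the first 200/201 response."""
    statuses = []
    for label, template in ENDPOINTS:
        status, message_id = await post(template.format(chat_id=chat_id))
        if status in (200, 201):
            return {"success": True, "platform": "qqbot",
                    "chat_id": chat_id, "message_id": message_id}
        statuses.append(f"{label}={status}")
    return {"error": "QQBot send failed: " + " ".join(statuses)}


async def fake_post(url: str) -> Tuple[int, Optional[str]]:
    # Stand-in transport: only the group endpoint accepts the message.
    return (200, "m-1") if "/v2/groups/" in url else (404, None)


async def always_reject(url: str) -> Tuple[int, Optional[str]]:
    # Stand-in transport: every endpoint rejects the message.
    return (404, None)


accepted = asyncio.run(post_first_accepted("12345", fake_post))
rejected = asyncio.run(post_first_accepted("12345", always_reject))
```

Because the same `chat_id` is tried against all three endpoints, a message to a group costs two failed requests before it succeeds; the diff keeps this existing behavior unchanged.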
**Change 2: Add a QQBOT + media routing branch to the main function** (after lines 585-598, before the Feishu branch on line 600)
```diff
+    # --- QQBot: native media attachment support via running gateway adapter ---
+    if platform == Platform.QQBOT and media_files:
+        last_result = None
+        for i, chunk in enumerate(chunks):
+            is_last = (i == len(chunks) - 1)
+            result = await _send_qqbot(
+                pconfig,
+                chat_id,
+                chunk,
+                media_files=media_files if is_last else None,
+            )
+            if isinstance(result, dict) and result.get("error"):
+                return result
+            last_result = result
+        return last_result
+
     # --- Feishu: native media attachment support via adapter ---
     if platform == Platform.FEISHU and media_files:
```
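
The loop above passes `media_files` only with the final chunk, so a long message split into several chunks uploads each attachment once. A self-contained sketch of that behavior follows; `fake_send` is a hypothetical stand-in for `_send_qqbot` that records what each call receives, and `send_chunks` mirrors the loop from the diff.

```python
import asyncio
from typing import List, Optional, Tuple

MediaFiles = List[Tuple[str, bool]]  # (path_or_url, is_url), as in the diff above
calls: List[Tuple[str, Optional[MediaFiles]]] = []


async def fake_send(chunk: str, media_files: Optional[MediaFiles] = None) -> dict:
    # Hypothetical stand-in for _send_qqbot: record the arguments and succeed.
    calls.append((chunk, media_files))
    return {"success": True}


async def send_chunks(chunks: List[str], media_files: MediaFiles) -> Optional[dict]:
    last_result = None
    for i, chunk in enumerate(chunks):
        is_last = i == len(chunks) - 1
        result = await fake_send(chunk, media_files=media_files if is_last else None)
        if isinstance(result, dict) and result.get("error"):
            return result  # stop at the first failure
        last_result = result
    return last_result


media = [("/tmp/test.png", False)]
final = asyncio.run(send_chunks(["part 1", "part 2", "part 3"], media))
```

In this sketch the first two calls receive `None` for `media_files` and only the third receives the attachment list, which is the pattern the routing branch relies on.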
**Change 3: Update the error messages that list the platforms with media support** (lines 618-630)
```diff
     if media_files and not message.strip():
         return {
             "error": (
-                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu; "
+                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot; "
                 f"target {platform.value} had only media attachments"
             )
         }
     warning = None
     if media_files:
         warning = (
             f"MEDIA attachments were omitted for {platform.value}; "
-            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu"
+            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot"
         )
```
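### 4.3 Design decisions
| Decision | Rationale |
|------|------|
| Send text and media separately | In the QQ Bot API, `msg_type=0` (text) and `msg_type=7` (media/rich media) are different message types and cannot be mixed in one message |
| Send media attachments only with the last chunk | Follows the same pattern as Yuanbao/Feishu and avoids re-sending media after every text chunk |
| Keep the direct REST API path for plain-text messages | Backward compatibility: plain-text messages can still be sent when the gateway is not running, with no hard dependency on the gateway |
| Detect the media type automatically from the MIME type | Reuses Python's standard-library `mimetypes`, so no extra dependency is needed |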
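
The MIME-based dispatch from the last row can be tried out on its own. The sketch below mirrors the branching in `_send_qqbot()` but returns the name of the adapter method instead of calling it; `pick_send_method` is a hypothetical helper introduced only for this illustration.

```python
import mimetypes


def pick_send_method(file_path: str) -> str:
    """Return the name of the adapter method that would handle this file."""
    mime, _ = mimetypes.guess_type(file_path)
    mime = (mime or "").lower()
    if mime.startswith("image/"):
        return "send_image"
    if mime.startswith("video/"):
        return "send_video"
    if mime.startswith("audio/"):
        return "send_voice"
    # Unknown or non-media types fall back to a generic file upload.
    return "send_document"


chosen = {
    path: pick_send_method(path)
    for path in ("shot.png", "clip.mp4", "note.mp3", "report.pdf", "README")
}
```

`mimetypes.guess_type` looks only at the file name, not the file contents, so a media file with a missing or unusual extension is sent as a document.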
## 5. Verification
### 5.1 Unit tests
```python
import pytest  # async tests assume pytest-asyncio; `mocker` comes from pytest-mock

from tools.send_message_tool import _send_qqbot
# PlatformConfig, Platform, and SendResult must also be imported from the
# project's own modules (their import paths are not given in this PRD).


# Test 1: QQAdapter singleton pattern
@pytest.mark.asyncio
async def test_qqbot_adapter_singleton():
    """Verify that set_active/get_active register and clear the instance."""
    from gateway.platforms.qqbot.adapter import QQAdapter
    assert QQAdapter.get_active() is None
    # Build an adapter instance from a mock config (remaining fields elided)
    mock_config = PlatformConfig(platform=Platform.QQBOT, ...)
    adapter = QQAdapter(mock_config)
    QQAdapter.set_active(adapter)
    assert QQAdapter.get_active() is adapter
    QQAdapter.set_active(None)
    assert QQAdapter.get_active() is None


# Test 2: _send_qqbot media routing
@pytest.mark.asyncio
async def test_send_qqbot_with_media(mocker):
    """Verify that messages with media_files are routed through the gateway adapter."""
    mock_adapter = mocker.MagicMock()
    mock_adapter.send = mocker.AsyncMock(return_value=SendResult(success=True))
    mock_adapter.send_image = mocker.AsyncMock(return_value=SendResult(success=True))
    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=mock_adapter,
    )
    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="caption",
        media_files=[("/tmp/test.png", False)],
    )
    assert result["success"] is True
    assert result["media_sent"] == 1
    mock_adapter.send.assert_called_once()
    mock_adapter.send_image.assert_called_once()


# Test 3: _send_qqbot error handling when the gateway is not running
@pytest.mark.asyncio
async def test_send_qqbot_media_no_adapter(mocker):
    """Verify that a clear error is returned when the gateway is not running."""
    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=None,
    )
    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="",
        media_files=[("/tmp/test.png", False)],
    )
    assert "error" in result
    assert "adapter is not running" in result["error"]


# Test 4: plain-text messages keep their existing behavior
@pytest.mark.asyncio
async def test_send_qqbot_text_only_unchanged(mocker):
    """Verify that plain-text messages are unaffected and still go through the REST API."""
    pytest.skip("TODO: mock httpx and verify the REST API path is taken")
```
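### 5.2 Integration tests
1. **Start the gateway**: make sure `platforms.qq.enabled: true` is set in `config.yaml` and the gateway connects to QQ Bot successfully
2. **Send plain text**: send plain text to a QQ user via send_message and verify it arrives
3. **Send an image**: send a message with an image attachment to a QQ user via send_message and verify the image is uploaded natively and displayed
4. **Send a document**: send a PDF/file attachment and verify the file can be downloaded
5. **Send voice/video**: send audio and video files and verify native playback
6. **Text fallback without the gateway**: stop the gateway, send plain text, and verify it is still delivered directly via the REST API
7. **Media without the gateway**: stop the gateway, send a message with media, and verify a clear error is returned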
### 5.3 Regression tests
```bash
# Run the existing test suite to make sure nothing regresses
cd /home/ubuntu/.hermes/hermes-agent
source .venv/bin/activate
python -m pytest tests/ -x -q
# Run the QQ Bot tests (if any)
python -m pytest tests/ -k "qqbot" -v
```
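### 5.4 Manual verification checklist
- [ ] QQ Bot gateway connects normally (log shows `QQBot: Connected`)
- [ ] Plain-text message is sent successfully
- [ ] Image attachment is sent successfully (user sees a native image)
- [ ] Document attachment is sent successfully (user can download it)
- [ ] Voice attachment is sent successfully
- [ ] Video attachment is sent successfully
- [ ] Plain-text fallback works when the gateway is not running
- [ ] Media attachments return a clear error when the gateway is not running
- [ ] Sending on other platforms (Telegram, Discord, etc.) is unaffected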