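PRD: Fixing Media Attachment Support in QQ Bot send_message
1. Problem Description
The _send_qqbot() function in tools/send_message_tool.py (line 1677) sends only plain-text messages (msg_type: 0) and ignores media_files entirely.
When the AI Agent uses the send_message tool to send a message with media attachments (images, audio, video, documents) to the QQ Bot platform, the attachments are silently dropped and the user receives only the text. Worse, if the message contains only media attachments and no text, the system returns an error saying that QQ Bot media sending is not supported.
Meanwhile, the QQAdapter class in the gateway adapter gateway/platforms/qqbot/adapter.py already has full media-sending capability: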
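| Method | Line | Purpose |
|---|---|---|
| send_image() | 2601 | Send an image (URL or local file) |
| send_image_file() | 2627 | Send a local image file |
| send_voice() | 2641 | Send a voice message |
| send_video() | 2655 | Send a video |
| send_document() | 2669 | Send a file/document |
| _send_media() | 2690 | Low-level media upload (direct HTTP URL transfer and chunked upload of local files) |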
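Core contradiction: the adapter already has full media capability, but the routing layer of the send_message tool is not wired to it.
2. Impact Scope
Direct impact
- Every scenario that sends a message with media attachments through the QQ Bot platform is affected
- This covers all media types: images, voice, video, documents/PDFs
Indirect impact
- When the AI Agent needs to send screenshots, generated images, files, and similar content to QQ users or groups, the recipients never receive them
- The QQ Bot platform's user experience is incomplete as a result
Affected platform
Platform.QQBOT (QQ Bot Open Platform)
Not affected
- Other platforms that already have working media support (Telegram, Discord, Matrix, Weixin/WeChat, Signal, Yuanbao, Feishu)
- Inbound message handling in the QQ Bot gateway adapter (receiving media messages works normally)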
3. Root Cause Analysis
3.1 Call Chain Analysis
The current QQ Bot send path:
send_message_tool()
  → split the message into chunks
  → lines 658-659: _send_qqbot(pconfig, chat_id, chunk)  ← no media_files argument
  → _send_qqbot() sends the REST request directly with httpx
  → payload = {"content": message, "msg_type": 0}  ← hard-coded plain text
For comparison, the correct path on the Yuanbao platform:
send_message_tool()
  → lines 586-598: YUANBAO + media_files detected
  → _send_yuanbao(chat_id, chunk, media_files=media_files)
  → get_active_adapter() fetches the running gateway adapter
  → send_yuanbao_direct(adapter, chat_id, message, media_files=media_files)
  → the adapter handles media upload and sending
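3.2 Three Breakpoints
- The _send_qqbot() function signature has no media_files parameter (line 1677)
- The call site does not pass media_files (line 659)
- QQAdapter has no get_active() singleton pattern, so the tool layer cannot obtain the gateway adapter
3.3 Missing Singleton Registration
The Yuanbao adapter, YuanbaoAdapter, has a complete singleton pattern (lines 4392-4404):
- _active_instance class variable
- get_active() class method
- set_active() class method
QQAdapter lacks this mechanism, so the send_message tool cannot obtain the running gateway adapter instance.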
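The registry described above can be sketched in isolation. This is a minimal illustration only: DemoAdapter and lookup_from_tool_layer are hypothetical stand-ins, not the real YuanbaoAdapter or QQAdapter, and the sketch assumes nothing about their internals beyond the three members listed above.

```python
from typing import ClassVar, Optional


class DemoAdapter:
    """Hypothetical stand-in for a platform adapter (not the real QQAdapter)."""

    # Class-level slot shared by all instances: holds the one "active" adapter.
    _active_instance: ClassVar[Optional["DemoAdapter"]] = None

    def __init__(self, name: str) -> None:
        self.name = name

    @classmethod
    def get_active(cls) -> Optional["DemoAdapter"]:
        """Return the registered adapter, or None if nothing is registered."""
        return cls._active_instance

    @classmethod
    def set_active(cls, adapter: Optional["DemoAdapter"]) -> None:
        """Register an adapter, or clear the slot by passing None."""
        cls._active_instance = adapter


def lookup_from_tool_layer() -> str:
    """Hypothetical tool-layer code: it only needs the class, not the instance."""
    adapter = DemoAdapter.get_active()
    return adapter.name if adapter is not None else "no adapter running"


before = lookup_from_tool_layer()
DemoAdapter.set_active(DemoAdapter("gateway-1"))
after = lookup_from_tool_layer()
DemoAdapter.set_active(None)
```

The point of the pattern is that the tool layer never has to be handed the adapter object; it can ask the class whether one is currently registered and degrade gracefully when none is.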
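4. Fix
4.1 Change Overview
Two files need to be modified, with four changes in total:
| File | Change | Notes |
|---|---|---|
| gateway/platforms/qqbot/adapter.py | Add singleton pattern | get_active() / set_active() + connect/disconnect lifecycle |
| gateway/platforms/qqbot/adapter.py | Add module-level get_active_adapter() | Imported by the tool layer |
| tools/send_message_tool.py | Change the _send_qqbot() signature and implementation | Supports media_files, routed through the gateway adapter |
| tools/send_message_tool.py | Add a QQBOT + media routing branch | Adds media handling to the main function, similar to Yuanbao/Feishu |
4.2 Code Diff
4.2.1 gateway/platforms/qqbot/adapter.py: add the singleton pattern
Add the singleton registration mechanism to the QQAdapter class definition (after line 155):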
 class QQAdapter(BasePlatformAdapter):
     """QQ Bot adapter backed by the official QQ Bot WebSocket Gateway + REST API."""
     # QQ Bot API does not support editing sent messages.
     SUPPORTS_MESSAGE_EDITING = False
     MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
     _TYPING_INPUT_SECONDS = 60
     _TYPING_DEBOUNCE_SECONDS = 50
+    # -- Active instance registry (class-level singleton) ---
+
+    _active_instance: ClassVar[Optional["QQAdapter"]] = None
+
+    @classmethod
+    def get_active(cls) -> Optional["QQAdapter"]:
+        """Return the currently connected QQAdapter, or None."""
+        return cls._active_instance
+
+    @classmethod
+    def set_active(cls, adapter: Optional["QQAdapter"]) -> None:
+        """Register (or clear) the active adapter instance."""
+        cls._active_instance = adapter
+
     @property
     def _log_tag(self) -> str:
Register the active instance in connect() (after _mark_connected() on line 301):
         # 4. Start listeners
         self._listen_task = asyncio.create_task(self._listen_loop())
         self._heartbeat_task = asyncio.create_task(self._heartbeat_loop())
         self._mark_connected()
+        QQAdapter.set_active(self)
         logger.info("[%s] Connected", self._log_tag)
         return True
Clear the active instance in disconnect() (after _mark_disconnected() on line 314):
     async def disconnect(self) -> None:
         """Close all connections and stop listeners."""
         self._running = False
         self._mark_disconnected()
+        if QQAdapter.get_active() is self:
+            QQAdapter.set_active(None)
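The identity check in disconnect() matters when more than one adapter instance exists over the process lifetime, for example if a replacement instance connects before the old one finishes shutting down. The sketch below illustrates that case under stated assumptions: DemoAdapter is a hypothetical stand-in with trivial connect()/disconnect() bodies, not the real QQAdapter lifecycle.

```python
import asyncio
from typing import ClassVar, Optional, Tuple


class DemoAdapter:
    """Hypothetical adapter with the same register/clear lifecycle as the diff."""

    _active_instance: ClassVar[Optional["DemoAdapter"]] = None

    def __init__(self, name: str) -> None:
        self.name = name

    async def connect(self) -> None:
        # Register this instance as the active one once it is connected.
        DemoAdapter._active_instance = self

    async def disconnect(self) -> None:
        # Only clear the slot if it still points at this instance; a stale
        # adapter must not unregister the instance that replaced it.
        if DemoAdapter._active_instance is self:
            DemoAdapter._active_instance = None


async def demo() -> Tuple[str, Optional[DemoAdapter]]:
    old, new = DemoAdapter("old"), DemoAdapter("new")
    await old.connect()
    await new.connect()      # the replacement instance takes over the slot
    await old.disconnect()   # must leave "new" registered
    still_active = DemoAdapter._active_instance
    assert still_active is not None
    await new.disconnect()   # now the slot is cleared
    return still_active.name, DemoAdapter._active_instance


result = asyncio.run(demo())
```

Without the `is self` guard, the late disconnect of the old instance would clear the slot and media sends would fail with "adapter is not running" even though a live adapter exists.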
Add a module-level delegate function at the end of the file:
+# ---------------------------------------------------------------------------
+# Module-level thin delegates (preserve import compatibility for send_message tool)
+# ---------------------------------------------------------------------------
+
+
+def get_active_adapter() -> Optional["QQAdapter"]:
+    """Delegate to ``QQAdapter.get_active()``."""
+    return QQAdapter.get_active()
ClassVar must be imported at the top of the file (add it if it is not already there):
-from typing import Any, ClassVar, Dict, List, Optional
+from typing import Any, ClassVar, Dict, List, Optional  # confirm ClassVar is imported
4.2.2 tools/send_message_tool.py: add QQ Bot media support
Change 1: update the _send_qqbot() signature and implementation (line 1677)
-async def _send_qqbot(pconfig, chat_id, message):
-    """Send via QQBot using the REST API directly (no WebSocket needed).
-
-    Uses the QQ Bot Open Platform REST endpoints to get an access token
-    and post a message. Supports guild channels, C2C (private) chats,
-    and group chats by trying the appropriate endpoints.
-    """
-    try:
-        import httpx
-    except ImportError:
-        return _error("QQBot direct send requires httpx. Run: pip install httpx")
-
-    extra = pconfig.extra or {}
-    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
-    secret = (pconfig.token or extra.get("client_secret")
-              or os.getenv("QQ_CLIENT_SECRET", ""))
-    if not appid or not secret:
-        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
-
-    try:
-        async with httpx.AsyncClient(timeout=15) as client:
-            # ... (rest of function unchanged until payload line)
-
-            payload = {"content": message[:4000], "msg_type": 0}
-
-            # ... (endpoint attempts unchanged)
-    except Exception as e:
-        return _error(f"QQBot send failed: {e}")
+async def _send_qqbot(pconfig, chat_id, message, media_files=None):
+    """Send via QQBot using the running gateway adapter.
+
+    When media_files are present, routes through the gateway adapter's
+    native media upload pipeline (send_image, send_document, etc.).
+    Falls back to REST API for text-only messages when the adapter
+    is not running.
+    """
+    # If we have media files, try the gateway adapter first
+    if media_files:
+        import mimetypes
+
+        try:
+            from gateway.platforms.qqbot.adapter import get_active_adapter
+        except ImportError:
+            return _error("QQBot adapter module not available.")
+
+        adapter = get_active_adapter()
+        if adapter is None:
+            return _error(
+                "QQBot adapter is not running. "
+                "Start the gateway with qqbot platform enabled first "
+                "to send media attachments."
+            )
+
+        # Send text first (if any)
+        if message.strip():
+            text_result = await adapter.send(chat_id=chat_id, content=message)
+            if not text_result.success:
+                return {"error": f"QQBot text send failed: {text_result.error}"}
+
+        # Send each media file, picking the adapter method from the MIME type
+        for file_path, _is_url in media_files:
+            mime, _ = mimetypes.guess_type(file_path)
+            mime = (mime or "").lower()
+
+            if mime.startswith("image/"):
+                result = await adapter.send_image(chat_id, file_path)
+            elif mime.startswith("video/"):
+                result = await adapter.send_video(chat_id, file_path)
+            elif mime.startswith("audio/"):
+                result = await adapter.send_voice(chat_id, file_path)
+            else:
+                result = await adapter.send_document(chat_id, file_path)
+
+            if not result.success:
+                return {"error": f"QQBot media send failed: {result.error}"}
+
+        return {
+            "success": True,
+            "platform": "qqbot",
+            "chat_id": chat_id,
+            "media_sent": len(media_files),
+        }
+
+    # Text-only: use REST API directly (no gateway needed)
+    try:
+        import httpx
+    except ImportError:
+        return _error("QQBot direct send requires httpx. Run: pip install httpx")
+
+    extra = pconfig.extra or {}
+    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
+    secret = (pconfig.token or extra.get("client_secret")
+              or os.getenv("QQ_CLIENT_SECRET", ""))
+    if not appid or not secret:
+        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
+
+    try:
+        async with httpx.AsyncClient(timeout=15) as client:
+            # (existing REST API logic unchanged)
+            # Step 1: Get access token
+            token_resp = await client.post(
+                "https://bots.qq.com/app/getAppAccessToken",
+                json={"appId": str(appid), "clientSecret": str(secret)},
+            )
+            if token_resp.status_code != 200:
+                return _error(f"QQBot token request failed: {token_resp.status_code}")
+            token_data = token_resp.json()
+            access_token = token_data.get("access_token")
+            if not access_token:
+                return _error("QQBot: no access_token in response")
+
+            headers = {
+                "Authorization": f"QQBot {access_token}",
+                "Content-Type": "application/json",
+            }
+            payload = {"content": message[:4000], "msg_type": 0}
+
+            # Try channel endpoint first
+            url = f"https://api.sgroup.qq.com/channels/{chat_id}/messages"
+            resp = await client.post(url, json=payload, headers=headers)
+            if resp.status_code in (200, 201):
+                data = resp.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try C2C endpoint
+            url_c2c = f"https://api.sgroup.qq.com/v2/users/{chat_id}/messages"
+            resp_c2c = await client.post(url_c2c, json=payload, headers=headers)
+            if resp_c2c.status_code in (200, 201):
+                data = resp_c2c.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try group endpoint
+            url_group = f"https://api.sgroup.qq.com/v2/groups/{chat_id}/messages"
+            resp_group = await client.post(url_group, json=payload, headers=headers)
+            if resp_group.status_code in (200, 201):
+                data = resp_group.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            return _error(f"QQBot send failed: channel={resp.status_code} c2c={resp_c2c.status_code} group={resp_group.status_code}")
+    except Exception as e:
+        return _error(f"QQBot send failed: {e}")
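The text-only path in the diff tries three endpoints in a fixed order (channel, then C2C, then group) and reports all three status codes if none accepts the message. A compact sketch of that fallback is shown below. It is an illustration rather than the project's code: send_text_with_fallback, the injected post callable, and both fake transports are hypothetical, introduced so the order of attempts can be exercised without httpx or network access; the token step is omitted.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

# Endpoint templates copied from the diff, in the order they are tried.
ENDPOINTS: List[Tuple[str, str]] = [
    ("channel", "https://api.sgroup.qq.com/channels/{chat_id}/messages"),
    ("c2c", "https://api.sgroup.qq.com/v2/users/{chat_id}/messages"),
    ("group", "https://api.sgroup.qq.com/v2/groups/{chat_id}/messages"),
]

# Hypothetical transport signature: (url, payload) -> (status_code, json_body)
PostFn = Callable[[str, Dict], Awaitable[Tuple[int, Dict]]]


async def send_text_with_fallback(post: PostFn, chat_id: str, message: str) -> Dict:
    """Try each endpoint in order; return on the first 200/201 response."""
    payload = {"content": message[:4000], "msg_type": 0}
    statuses: Dict[str, int] = {}
    for label, template in ENDPOINTS:
        status, data = await post(template.format(chat_id=chat_id), payload)
        if status in (200, 201):
            return {"success": True, "platform": "qqbot",
                    "chat_id": chat_id, "message_id": data.get("id")}
        statuses[label] = status
    detail = " ".join(f"{label}={status}" for label, status in statuses.items())
    return {"error": f"QQBot send failed: {detail}"}


async def group_only_post(url: str, payload: Dict) -> Tuple[int, Dict]:
    """Fake transport: only the group endpoint accepts the message."""
    if "/v2/groups/" in url:
        return 200, {"id": "msg-1"}
    return 404, {}


async def always_fail_post(url: str, payload: Dict) -> Tuple[int, Dict]:
    """Fake transport: every endpoint rejects the message."""
    return 500, {}


fallback_result = asyncio.run(send_text_with_fallback(group_only_post, "42", "hello"))
failed_result = asyncio.run(send_text_with_fallback(always_fail_post, "42", "hello"))
```

Expressing the attempts as a table plus a loop is a design choice of this sketch only; the diff keeps the three attempts written out explicitly so that the existing REST logic stays byte-for-byte unchanged.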
Change 2: add a QQBOT + media routing branch to the main function (after lines 585-598, before the Feishu branch on line 600)
+    # --- QQBot: native media attachment support via running gateway adapter ---
+    if platform == Platform.QQBOT and media_files:
+        last_result = None
+        for i, chunk in enumerate(chunks):
+            is_last = (i == len(chunks) - 1)
+            result = await _send_qqbot(
+                pconfig,
+                chat_id,
+                chunk,
+                media_files=media_files if is_last else None,
+            )
+            if isinstance(result, dict) and result.get("error"):
+                return result
+            last_result = result
+        return last_result
+
     # --- Feishu: native media attachment support via adapter ---
     if platform == Platform.FEISHU and media_files:
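The effect of `media_files if is_last else None` can be seen by listing what each chunk is sent with. The helper below is a hypothetical illustration (plan_chunk_sends does not exist in the codebase); it only reproduces the pairing logic of the loop in the diff.

```python
from typing import List, Optional, Sequence, Tuple

# Same shape as the media_files entries used in this PRD: (path_or_url, is_url).
MediaFile = Tuple[str, bool]


def plan_chunk_sends(
    chunks: Sequence[str], media_files: Sequence[MediaFile]
) -> List[Tuple[str, Optional[List[MediaFile]]]]:
    """Pair each text chunk with the media it is sent with (last chunk only)."""
    plan = []
    for i, chunk in enumerate(chunks):
        is_last = i == len(chunks) - 1
        plan.append((chunk, list(media_files) if is_last else None))
    return plan


plan = plan_chunk_sends(["part 1", "part 2"], [("/tmp/test.png", False)])
empty_plan = plan_chunk_sends([], [("/tmp/test.png", False)])
```

Note that with an empty chunk list the loop never runs, so nothing is sent at all; whether the main function can produce an empty chunk list for a media-only message is not shown in this document and is worth checking against the chunking code.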
Change 3: update the error messages that list the platforms supporting media (lines 618-630)
     if media_files and not message.strip():
         return {
             "error": (
-                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu; "
+                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot; "
                 f"target {platform.value} had only media attachments"
             )
         }
     warning = None
     if media_files:
         warning = (
             f"MEDIA attachments were omitted for {platform.value}; "
-            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu"
+            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot"
         )
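4.3 Design Decisions
| Decision | Rationale |
|---|---|
| Send text and media separately | In the QQ Bot API, msg_type=0 (text) and msg_type=7 (media/rich media) are different message types and cannot be mixed in a single message |
| Send media attachments only with the last chunk | Same pattern as Yuanbao/Feishu; avoids resending the media after every text chunk |
| Keep the direct REST API path for text-only messages | Backward compatibility: text-only messages can still be sent when the gateway is not running, with no hard dependency on the gateway |
| Detect the media type automatically from the MIME type | Reuses the Python standard library mimetypes module; no extra dependency |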
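The MIME-based decision in the last row of the table can be tried on its own. The sketch below mirrors the branch order used in _send_qqbot() but returns the name of the adapter method instead of calling it; pick_send_method and the sample paths are hypothetical and exist only for illustration.

```python
import mimetypes


def pick_send_method(file_path: str) -> str:
    """Map a path or URL to the adapter method name, by guessed MIME type."""
    mime, _ = mimetypes.guess_type(file_path)
    mime = (mime or "").lower()
    if mime.startswith("image/"):
        return "send_image"
    if mime.startswith("video/"):
        return "send_video"
    if mime.startswith("audio/"):
        return "send_voice"
    # Unknown or non-media types fall back to a generic file upload.
    return "send_document"


samples = [
    "/tmp/test.png",
    "clip.mp4",
    "note.mp3",
    "report.pdf",
    "https://example.com/a.jpg",
    "blob",  # no extension, so the type cannot be guessed
]
choices = {path: pick_send_method(path) for path in samples}
```

Because mimetypes.guess_type() works from the file extension only, a file with a missing or misleading extension is routed to send_document(); content sniffing would be needed to do better, which this PRD does not propose.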
5. Verification
5.1 Unit Tests
# Test 1: QQAdapter singleton pattern
async def test_qqbot_adapter_singleton():
    """Verify that set_active/get_active register and clear the instance correctly."""
    from gateway.platforms.qqbot.adapter import QQAdapter
    assert QQAdapter.get_active() is None
    # Simulated adapter instance
    mock_config = PlatformConfig(platform=Platform.QQBOT, ...)
    adapter = QQAdapter(mock_config)
    QQAdapter.set_active(adapter)
    assert QQAdapter.get_active() is adapter
    QQAdapter.set_active(None)
    assert QQAdapter.get_active() is None

# Test 2: _send_qqbot media routing
async def test_send_qqbot_with_media(mocker):
    """Verify that messages with media_files are routed through the gateway adapter."""
    mock_adapter = mocker.MagicMock()
    mock_adapter.send = mocker.AsyncMock(return_value=SendResult(success=True))
    mock_adapter.send_image = mocker.AsyncMock(return_value=SendResult(success=True))
    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=mock_adapter,
    )
    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="caption",
        media_files=[("/tmp/test.png", False)],
    )
    assert result["success"] is True
    assert result["media_sent"] == 1
    mock_adapter.send.assert_called_once()
    mock_adapter.send_image.assert_called_once()

# Test 3: _send_qqbot error handling when the gateway is not running
async def test_send_qqbot_media_no_adapter(mocker):
    """Verify that a clear error is returned when the gateway is not running."""
    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=None,
    )
    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="",
        media_files=[("/tmp/test.png", False)],
    )
    assert "error" in result
    assert "adapter is not running" in result["error"]

# Test 4: text-only messages keep their existing behavior
async def test_send_qqbot_text_only_unchanged(mocker):
    """Verify that text-only messages are unaffected and still go through the REST API."""
    # (mock httpx and verify REST API path is taken)
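5.2 Integration Tests
- Start the gateway: make sure platforms.qq.enabled: true is set in config.yaml and the gateway connects to QQ Bot successfully
- Send plain text: send a text-only message to a QQ user via send_message and verify it arrives
- Send an image: send a message with an image attachment to a QQ user via send_message and verify the image is uploaded natively and displayed
- Send a document: send a PDF/file attachment and verify the file can be downloaded
- Send voice/video: send audio and video files and verify they play natively
- Fallback without the gateway: stop the gateway, send plain text, and verify it is still delivered directly through the REST API
- Media without the gateway: stop the gateway, send a message with media, and verify a clear error is returned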
5.3 Regression Tests
# Run the existing test suite to make sure nothing regresses
cd /home/ubuntu/.hermes/hermes-agent
source .venv/bin/activate
python -m pytest tests/ -x -q
# Run the QQ Bot related tests (if any)
python -m pytest tests/ -k "qqbot" -v
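5.4 Manual Verification Checklist
- The QQ Bot gateway connects normally (the log shows QQBot: Connected)
- Plain-text message sent successfully
- Message with an image attachment sent successfully (the user sees a native image)
- Message with a document attachment sent successfully (the user can download it)
- Message with a voice attachment sent successfully
- Message with a video attachment sent successfully
- Plain-text fallback works when the gateway is not running
- Media attachments return a clear error when the gateway is not running
- Sending on other platforms (Telegram, Discord, etc.) is unaffected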