# PRD: QQ Bot send_message Media Attachment Support Fix

## 1. Problem Description

The `_send_qqbot()` function in `tools/send_message_tool.py` (line 1677) sends plain-text messages only (`msg_type: 0`) and **completely ignores the `media_files` parameter**.

When an AI Agent uses the `send_message` tool to send a message with media attachments (images, audio, video, documents) to the QQ Bot platform, the attachments are silently dropped and the user receives only the text. Worse, if the message contains only media attachments and no text, the system returns an error stating that QQ Bot media sending is not supported.

Meanwhile, the `QQAdapter` class in the gateway adapter `gateway/platforms/qqbot/adapter.py` already has full media-sending capability:

| Method | Line | Function |
|------|------|------|
| `send_image()` | 2601 | Send an image (URL or local file) |
| `send_image_file()` | 2627 | Send a local image file |
| `send_voice()` | 2641 | Send a voice message |
| `send_video()` | 2655 | Send a video |
| `send_document()` | 2669 | Send a file/document |
| `_send_media()` | 2690 | Low-level media upload (direct HTTP URL transfer and chunked upload of local files) |

**Core contradiction: the adapter already has full media capability, but the routing layer of the `send_message` tool is not wired to it.**

## 2. Scope of Impact

### Direct impact

- Every scenario that sends a message with media attachments through the QQ Bot platform is affected
- This covers all media types: images, voice, video, documents/PDF

### Indirect impact

- When the AI Agent needs to send screenshots, generated images, files, and similar content to a QQ user or group, the user never receives them
- The QQ Bot platform's user experience is incomplete

### Platform involved

- `Platform.QQBOT` (QQ Bot Open Platform)

### Not affected

- Other platforms that already have media support wired in correctly (Telegram, Discord, Matrix, WeChat, Signal, Yuanbao, Feishu)
- Inbound message handling in the QQ Bot gateway adapter (receiving media messages works)

## 3. Root Cause Analysis

### 3.1 Call chain analysis

Current send path for QQ Bot:

```
send_message_tool()
  → split the message into chunks
  → lines 658-659: _send_qqbot(pconfig, chat_id, chunk)   ← no media_files argument
    → _send_qqbot() sends a REST request directly via httpx
      → payload = {"content": message, "msg_type": 0}   ← hard-coded plain text
```

For comparison, the correct path used by the Yuanbao platform:

```
send_message_tool()
  → lines 586-598: detects YUANBAO + media_files
    → _send_yuanbao(chat_id, chunk, media_files=media_files)
      → get_active_adapter() fetches the running gateway adapter
      → send_yuanbao_direct(adapter, chat_id, message, media_files=media_files)
        → the adapter handles media upload and sending
```

### 3.2 Three break points

1. **The `_send_qqbot()` signature has no `media_files` parameter** (line 1677)
2. **The call site does not pass `media_files`** (line 659)
3. **`QQAdapter` has no `get_active()` singleton pattern**, so the tool layer cannot obtain the gateway adapter

### 3.3 Missing singleton registration

The Yuanbao adapter `YuanbaoAdapter` has a complete singleton pattern (lines 4392-4404):

- `_active_instance` class variable
- `get_active()` class method
- `set_active()` class method

`QQAdapter` lacks this mechanism, so the `send_message` tool cannot obtain the running gateway adapter instance.

## 4. Fix Plan

### 4.1 Change overview

**2 files** need to be modified, with **5 changes** in total:

| File | Change | Notes |
|------|------|------|
| `gateway/platforms/qqbot/adapter.py` | Add the singleton pattern | `get_active()` / `set_active()` plus the connect/disconnect lifecycle |
| `gateway/platforms/qqbot/adapter.py` | Add a module-level `get_active_adapter()` | Imported by the tool layer |
| `tools/send_message_tool.py` | Change the `_send_qqbot()` signature and implementation | Supports `media_files`, routed through the gateway adapter |
| `tools/send_message_tool.py` | Add a QQBOT + media routing branch | Media handling in the main function, similar to Yuanbao/Feishu |
| `tools/send_message_tool.py` | Update the supported-platform error messages | Adds `qqbot` to the list of platforms with native media delivery |

### 4.2 Code Diff

#### 4.2.1 `gateway/platforms/qqbot/adapter.py`: add the singleton pattern

Add the singleton registry inside the `QQAdapter` class definition (after line 155):

```diff
 class QQAdapter(BasePlatformAdapter):
     """QQ Bot adapter backed by the official QQ Bot WebSocket Gateway + REST API."""

     # QQ Bot API does not support editing sent messages.
     SUPPORTS_MESSAGE_EDITING = False
     MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
     _TYPING_INPUT_SECONDS = 60
     _TYPING_DEBOUNCE_SECONDS = 50

+    # -- Active instance registry (class-level singleton) ---
+
+    _active_instance: ClassVar[Optional["QQAdapter"]] = None
+
+    @classmethod
+    def get_active(cls) -> Optional["QQAdapter"]:
+        """Return the currently connected QQAdapter, or None."""
+        return cls._active_instance
+
+    @classmethod
+    def set_active(cls, adapter: Optional["QQAdapter"]) -> None:
+        """Register (or clear) the active adapter instance."""
+        cls._active_instance = adapter
+
     @property
     def _log_tag(self) -> str:
```

Register the active instance in `connect()` (after `_mark_connected()` on line 301):

```diff
         # 4. Start listeners
         self._listen_task = asyncio.create_task(self._listen_loop())
         self._heartbeat_task = asyncio.create_task(self._heartbeat_loop())
         self._mark_connected()
+        QQAdapter.set_active(self)
         logger.info("[%s] Connected", self._log_tag)
         return True
```

Clear the active instance in `disconnect()` (after `_mark_disconnected()` on line 314):

```diff
     async def disconnect(self) -> None:
         """Close all connections and stop listeners."""
         self._running = False
         self._mark_disconnected()
+        if QQAdapter.get_active() is self:
+            QQAdapter.set_active(None)
```

Add a module-level delegate function at the end of the file:

```diff
+# ---------------------------------------------------------------------------
+# Module-level thin delegates (preserve import compatibility for send_message tool)
+# ---------------------------------------------------------------------------
+
+
+def get_active_adapter() -> Optional["QQAdapter"]:
+    """Delegate to ``QQAdapter.get_active()``."""
+    return QQAdapter.get_active()
```

`ClassVar` must be imported at the top of the file (if it is not already):

```diff
-from typing import Any, ClassVar, Dict, List, Optional
+from typing import Any, ClassVar, Dict, List, Optional  # confirm ClassVar is imported
```

#### 4.2.2 `tools/send_message_tool.py`: add QQ Bot media support

**Change 1: modify the `_send_qqbot()` signature and implementation** (line 1677)

```diff
-async def _send_qqbot(pconfig, chat_id, message):
-    """Send via QQBot using the REST API directly (no WebSocket needed).
-
-    Uses the QQ Bot Open Platform REST endpoints to get an access token
-    and post a message. Supports guild channels, C2C (private) chats,
-    and group chats by trying the appropriate endpoints.
-    """
-    try:
-        import httpx
-    except ImportError:
-        return _error("QQBot direct send requires httpx. Run: pip install httpx")
-
-    extra = pconfig.extra or {}
-    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
-    secret = (pconfig.token or extra.get("client_secret")
-              or os.getenv("QQ_CLIENT_SECRET", ""))
-    if not appid or not secret:
-        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
-
-    try:
-        async with httpx.AsyncClient(timeout=15) as client:
-            # ... (rest of function unchanged until payload line)
-
-            payload = {"content": message[:4000], "msg_type": 0}
-
-            # ... (endpoint attempts unchanged)
-    except Exception as e:
-        return _error(f"QQBot send failed: {e}")
+async def _send_qqbot(pconfig, chat_id, message, media_files=None):
+    """Send via QQBot using the running gateway adapter.
+
+    When media_files are present, routes through the gateway adapter's
+    native media upload pipeline (send_image, send_document, etc.).
+    Falls back to REST API for text-only messages when the adapter
+    is not running.
+    """
+    # If we have media files, try the gateway adapter first
+    if media_files:
+        try:
+            from gateway.platforms.qqbot.adapter import get_active_adapter
+        except ImportError:
+            return _error("QQBot adapter module not available.")
+
+        adapter = get_active_adapter()
+        if adapter is None:
+            return _error(
+                "QQBot adapter is not running. "
+                "Start the gateway with qqbot platform enabled first "
+                "to send media attachments."
+            )
+
+        # Send text first (if any)
+        if message.strip():
+            text_result = await adapter.send(chat_id=chat_id, content=message)
+            if not text_result.success:
+                return {"error": f"QQBot text send failed: {text_result.error}"}
+
+        # Send each media file, dispatching on the guessed MIME type
+        import mimetypes
+        for file_path, _is_url in media_files:
+            mime, _ = mimetypes.guess_type(file_path)
+            mime = (mime or "").lower()
+
+            if mime.startswith("image/"):
+                result = await adapter.send_image(chat_id, file_path)
+            elif mime.startswith("video/"):
+                result = await adapter.send_video(chat_id, file_path)
+            elif mime.startswith("audio/"):
+                result = await adapter.send_voice(chat_id, file_path)
+            else:
+                result = await adapter.send_document(chat_id, file_path)
+
+            if not result.success:
+                return {"error": f"QQBot media send failed: {result.error}"}
+
+        return {
+            "success": True,
+            "platform": "qqbot",
+            "chat_id": chat_id,
+            "media_sent": len(media_files),
+        }
+
+    # Text-only: use REST API directly (no gateway needed)
+    try:
+        import httpx
+    except ImportError:
+        return _error("QQBot direct send requires httpx. Run: pip install httpx")
+
+    extra = pconfig.extra or {}
+    appid = extra.get("app_id") or os.getenv("QQ_APP_ID", "")
+    secret = (pconfig.token or extra.get("client_secret")
+              or os.getenv("QQ_CLIENT_SECRET", ""))
+    if not appid or not secret:
+        return _error("QQBot: QQ_APP_ID / QQ_CLIENT_SECRET not configured.")
+
+    try:
+        async with httpx.AsyncClient(timeout=15) as client:
+            # (existing REST API logic unchanged)
+            # Step 1: Get access token
+            token_resp = await client.post(
+                "https://bots.qq.com/app/getAppAccessToken",
+                json={"appId": str(appid), "clientSecret": str(secret)},
+            )
+            if token_resp.status_code != 200:
+                return _error(f"QQBot token request failed: {token_resp.status_code}")
+            token_data = token_resp.json()
+            access_token = token_data.get("access_token")
+            if not access_token:
+                return _error("QQBot: no access_token in response")
+
+            headers = {
+                "Authorization": f"QQBot {access_token}",
+                "Content-Type": "application/json",
+            }
+            payload = {"content": message[:4000], "msg_type": 0}
+
+            # Try channel endpoint first
+            url = f"https://api.sgroup.qq.com/channels/{chat_id}/messages"
+            resp = await client.post(url, json=payload, headers=headers)
+            if resp.status_code in (200, 201):
+                data = resp.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try C2C endpoint
+            url_c2c = f"https://api.sgroup.qq.com/v2/users/{chat_id}/messages"
+            resp_c2c = await client.post(url_c2c, json=payload, headers=headers)
+            if resp_c2c.status_code in (200, 201):
+                data = resp_c2c.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            # Try group endpoint
+            url_group = f"https://api.sgroup.qq.com/v2/groups/{chat_id}/messages"
+            resp_group = await client.post(url_group, json=payload, headers=headers)
+            if resp_group.status_code in (200, 201):
+                data = resp_group.json()
+                return {"success": True, "platform": "qqbot", "chat_id": chat_id,
+                        "message_id": data.get("id")}
+
+            return _error(f"QQBot send failed: channel={resp.status_code} c2c={resp_c2c.status_code} group={resp_group.status_code}")
+    except Exception as e:
+        return _error(f"QQBot send failed: {e}")
```

**Change 2: add a QQBOT + media routing branch in the main function** (after lines 585-598, before the Feishu branch on line 600)

```diff
+    # --- QQBot: native media attachment support via running gateway adapter ---
+    if platform == Platform.QQBOT and media_files:
+        last_result = None
+        for i, chunk in enumerate(chunks):
+            is_last = (i == len(chunks) - 1)
+            result = await _send_qqbot(
+                pconfig,
+                chat_id,
+                chunk,
+                media_files=media_files if is_last else None,
+            )
+            if isinstance(result, dict) and result.get("error"):
+                return result
+            last_result = result
+        return last_result
+
     # --- Feishu: native media attachment support via adapter ---
     if platform == Platform.FEISHU and media_files:
```

**Change 3: update the error messages that list the platforms supporting media** (lines 618-630)

```diff
     if media_files and not message.strip():
         return {
             "error": (
-                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu; "
+                "send_message MEDIA delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot; "
                 f"target {platform.value} had only media attachments"
             )
         }
     warning = None
     if media_files:
         warning = (
             f"MEDIA attachments were omitted for {platform.value}; "
-            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao and feishu"
+            "native send_message media delivery is currently only supported for telegram, discord, matrix, weixin, signal, yuanbao, feishu and qqbot"
         )
```

### 4.3 Design decisions

| Decision | Rationale |
|------|------|
| Text and media are sent separately | In the QQ Bot API, `msg_type=0` (text) and `msg_type=7` (media/rich media) are different message types and cannot be mixed in one message |
| Media attachments are sent only with the last chunk | Matches the Yuanbao/Feishu pattern and avoids re-sending the media after every text chunk |
| Plain-text messages keep the direct REST API path | Backward compatibility: when the gateway is not running, plain-text messages can still be sent without a hard dependency on the gateway |
| Media type is detected automatically from the MIME type | Reuses the Python standard library `mimetypes`, so no extra dependency is needed |

## 5. Verification

### 5.1 Unit tests

```python
# Test 1: QQAdapter singleton pattern
async def test_qqbot_adapter_singleton():
    """set_active/get_active register and clear the instance correctly."""
    from gateway.platforms.qqbot.adapter import QQAdapter

    assert QQAdapter.get_active() is None

    # Simulated adapter instance (the remaining config fields are elided)
    mock_config = PlatformConfig(platform=Platform.QQBOT, ...)
    adapter = QQAdapter(mock_config)

    QQAdapter.set_active(adapter)
    assert QQAdapter.get_active() is adapter

    QQAdapter.set_active(None)
    assert QQAdapter.get_active() is None


# Test 2: _send_qqbot media routing
async def test_send_qqbot_with_media(mocker):
    """With media_files present, the send is routed through the gateway adapter."""
    mock_adapter = mocker.MagicMock()
    mock_adapter.send = mocker.AsyncMock(return_value=SendResult(success=True))
    mock_adapter.send_image = mocker.AsyncMock(return_value=SendResult(success=True))

    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=mock_adapter,
    )

    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="caption",
        media_files=[("/tmp/test.png", False)],
    )

    assert result["success"] is True
    assert result["media_sent"] == 1
    mock_adapter.send.assert_called_once()
    mock_adapter.send_image.assert_called_once()


# Test 3: _send_qqbot error handling when no gateway is running
async def test_send_qqbot_media_no_adapter(mocker):
    """A clear error is returned when the gateway is not running."""
    mocker.patch(
        "gateway.platforms.qqbot.adapter.QQAdapter.get_active",
        return_value=None,
    )

    result = await _send_qqbot(
        pconfig=...,
        chat_id="test_chat",
        message="",
        media_files=[("/tmp/test.png", False)],
    )

    assert "error" in result
    assert "adapter is not running" in result["error"]


# Test 4: plain-text messages keep their existing behavior
async def test_send_qqbot_text_only_unchanged(mocker):
    """Plain-text messages are unaffected and still go through the REST API."""
    # (mock httpx and verify REST API path is taken)
```

### 5.2 Integration tests

1. **Start the gateway**: make sure `platforms.qq.enabled: true` is set in `config.yaml` and the gateway connects to QQ Bot successfully
2. **Send plain text**: send plain text to a QQ user via send_message and verify that it arrives
3. **Send an image**: send a message with an image attachment to a QQ user via send_message and verify that the image is uploaded natively and displayed
4. **Send a document**: send a PDF/file attachment and verify that the file can be downloaded
5. **Send voice/video**: send audio and video files and verify native playback
6. **Fallback without the gateway**: stop the gateway, send plain text, and verify that it is still delivered directly through the REST API
7. **Media without the gateway**: stop the gateway, send a message with media, and verify that a clear error is returned

### 5.3 Regression tests

```bash
# Run the existing test suite to make sure nothing regressed
cd /home/ubuntu/.hermes/hermes-agent
source .venv/bin/activate
python -m pytest tests/ -x -q

# Run the QQ Bot related tests (if any)
python -m pytest tests/ -k "qqbot" -v
```

### 5.4 Manual verification checklist

- [ ] The QQ Bot gateway connects normally (the log shows `QQBot: Connected`)
- [ ] Plain-text message sent successfully
- [ ] Image attachment sent successfully (the user sees a native image)
- [ ] Document attachment sent successfully (the user can download it)
- [ ] Voice attachment sent successfully
- [ ] Video attachment sent successfully
- [ ] Plain-text fallback works when the gateway is not running
- [ ] Media attachments return a clear error when the gateway is not running
- [ ] Sending on other platforms (Telegram, Discord, etc.) is unaffected
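
The MIME-based dispatch described in section 4.3 can be checked in isolation, without a running adapter. The sketch below is illustrative only: `pick_send_method` is a hypothetical helper that is not part of the codebase, and it returns the name of the `QQAdapter` method that the branch logic in `_send_qqbot()` would select for a given path.

```python
import mimetypes


def pick_send_method(file_path: str) -> str:
    """Return the adapter method name chosen for file_path.

    Hypothetical helper mirroring the if/elif chain in _send_qqbot();
    anything that is not image, video, or audio falls back to send_document.
    """
    mime, _ = mimetypes.guess_type(file_path)
    mime = (mime or "").lower()
    if mime.startswith("image/"):
        return "send_image"
    if mime.startswith("video/"):
        return "send_video"
    if mime.startswith("audio/"):
        return "send_voice"
    return "send_document"


if __name__ == "__main__":
    samples = ("/tmp/chart.png", "/tmp/clip.mp4", "/tmp/note.mp3",
               "/tmp/report.pdf", "/tmp/blob")
    for path in samples:
        print(path, "->", pick_send_method(path))
```

Because `mimetypes.guess_type()` works from the file extension, a file with a missing or unrecognized extension is routed to `send_document()`. That is worth keeping in mind for audio or image files saved without a standard suffix.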
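
The identity check in the `disconnect()` change from section 4.2.1 (`if QQAdapter.get_active() is self`) matters when one adapter instance is replaced by another, for example on a reconnect. The following sketch models only the registry logic with a hypothetical `DemoAdapter` class; it is not the real `QQAdapter`, and its `connect()`/`disconnect()` do no network work.

```python
from typing import ClassVar, Optional


class DemoAdapter:
    """Hypothetical stand-in for QQAdapter; only the registry is modeled."""

    _active_instance: ClassVar[Optional["DemoAdapter"]] = None

    @classmethod
    def get_active(cls) -> Optional["DemoAdapter"]:
        return cls._active_instance

    @classmethod
    def set_active(cls, adapter: Optional["DemoAdapter"]) -> None:
        cls._active_instance = adapter

    def connect(self) -> None:
        # Register this instance as the active one.
        DemoAdapter.set_active(self)

    def disconnect(self) -> None:
        # Clear the registry only if this instance is still the registered one,
        # so a stale instance cannot unregister its replacement.
        if DemoAdapter.get_active() is self:
            DemoAdapter.set_active(None)


old, new = DemoAdapter(), DemoAdapter()
old.connect()
new.connect()      # a newer instance takes over the registry
old.disconnect()   # the stale instance shuts down late
still_registered = DemoAdapter.get_active() is new

new.disconnect()
cleared = DemoAdapter.get_active() is None
```

Without the identity check, the late `old.disconnect()` would clear the registry while `new` is still connected, and media sends would fail with the "adapter is not running" error even though a gateway is up.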