Hermes Agent ccc63d1e70 first commit
2026-05-10 13:52:46 +08:00

---
name: playwright-browser-install
description: Install, diagnose, and recover Playwright browser binaries (Chromium, Firefox, WebKit) — partial downloads, resume, cache management, and sandbox issues.
triggers:
- "playwright install chromium"
- "playwright install firefox"
- "playwright install webkit"
- "playwright browser download failed"
- "playwright chromium not found"
- "chrome binary missing"
- "playwright download interrupted"
---
# Playwright Browser Install — Diagnosis & Recovery
## Pitfalls
### Chrome sandbox fails in containers/VMs
**Symptom**: `browser_navigate` fails with "No usable sandbox! If you are running on Ubuntu 23.10+ or another Linux distro that has disabled unprivileged user namespaces with AppArmor"
**Root cause**: Chrome's sandbox mechanism is unavailable in the container or VM environment.
**Workaround**:
```bash
# Add to ~/.bashrc (only helps if the tool that launches the browser reads this variable)
echo 'export PLAYWRIGHT_CHROMIUM_ARGS="--no-sandbox --disable-setuid-sandbox"' >> ~/.bashrc
source ~/.bashrc
# Or call playwright-core directly and pass the flags at launch
cd ~/.hermes/hermes-agent && node -e "
const { chromium } = require('playwright-core');
(async () => {
  // Launch Chromium with the sandbox disabled
  const browser = await chromium.launch({
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  const text = await page.textContent('body');
  console.log(text);
  await browser.close();
})().catch(e => console.error(e.message));
"
```
**Note**: The `browser_navigate` tool may not support custom Chrome arguments; in that case, call playwright-core directly as shown above.
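Before disabling the sandbox, it can help to confirm that restricted user namespaces are really the cause. The following is a minimal sketch that reads two kernel settings which commonly control this on Debian/Ubuntu-family systems. The function name `userns_status` and its optional directory argument are illustrative, and other distros may expose neither file.

```bash
# Report whether unprivileged user namespaces look restricted.
# $1: directory holding the sysctl files (defaults to /proc/sys/kernel).
userns_status() {
  dir="${1:-/proc/sys/kernel}"
  # Ubuntu 23.10+: 1 means AppArmor restricts unprivileged user namespaces.
  if [ -r "$dir/apparmor_restrict_unprivileged_userns" ] &&
     [ "$(cat "$dir/apparmor_restrict_unprivileged_userns")" = "1" ]; then
    echo "restricted-by-apparmor"
    return 0
  fi
  # Debian/Ubuntu kernels: 0 means unprivileged user namespaces are disabled.
  if [ -r "$dir/unprivileged_userns_clone" ] &&
     [ "$(cat "$dir/unprivileged_userns_clone")" = "0" ]; then
    echo "disabled"
    return 0
  fi
  echo "no-restriction-detected"
}

userns_status
```

If this prints `no-restriction-detected` and the sandbox error persists, the cause is probably elsewhere (for example, a container runtime that blocks the required syscalls).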
## Quick Diagnosis
```bash
# Check which browsers are installed
ls ~/.cache/ms-playwright/
# Check whether a Playwright install/download process is still running
ps aux | grep playwright | grep -v grep
# Check download temp dirs (partial downloads accumulate here)
ls -la /tmp/playwright-download-*/
# Find the largest partial zip (likely the interrupted download)
for f in /tmp/playwright-download-*/playwright-download-chromium-*.zip; do
  # test -f skips the unexpanded pattern when nothing matches
  test -f "$f" && echo "$(du -h "$f" | cut -f1) $f"
done | sort -rh | head
```
## Finding the Actual CDN Download URL
Playwright's CDN redirects through several hosts. The final destination, which supports `Range` requests and therefore resume, is **storage.googleapis.com**.
```bash
# Option 1: run the playwright CLI with verbose output to see the URL
# Option 2: read the version from browsers.json (entries live under the top-level "browsers" key)
node -e "const b=require('$HOME/.hermes/hermes-agent/node_modules/playwright-core/browsers.json'); const c=b.browsers.find(x=>x.name==='chromium'); console.log(c.browserVersion, c.revision)"
# Then construct the URL:
# https://storage.googleapis.com/chrome-for-testing-public/{browserVersion}/linux64/chrome-linux64.zip
```
For example, Chromium revision 1217 corresponds to browserVersion **147.0.7727.15**.
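The URL pattern above can be wrapped in a small helper so the version is typed only once. This is a sketch that assumes the linux64 layout shown in this section; the function name `cft_url` is hypothetical.

```bash
# Build a Chrome for Testing download URL for linux64.
# $1: browserVersion from browsers.json
# $2: artifact name, "chrome" (default) or "chrome-headless-shell"
cft_url() {
  version="$1"
  artifact="${2:-chrome}"
  echo "https://storage.googleapis.com/chrome-for-testing-public/${version}/linux64/${artifact}-linux64.zip"
}

cft_url "147.0.7727.15"
cft_url "147.0.7727.15" chrome-headless-shell
```

The second call produces the headless shell URL, which uses the same browserVersion as the main Chromium build.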
## Resume a Partial Download (curl)
1. Identify the largest partial zip in `/tmp/playwright-download-*/`
2. Copy it to a safe location
3. Use `curl -C -` for automatic resume (reads existing file size and sends `Range` header):
```bash
PARTIAL_ZIP="/tmp/playwright-download-XRveVR/playwright-download-chromium-ubuntu24.04-x64-1217.zip"
OUTPUT="/tmp/playwright-chromium-resume.zip"
URL="https://storage.googleapis.com/chrome-for-testing-public/147.0.7727.15/linux64/chrome-linux64.zip"
cp "$PARTIAL_ZIP" "$OUTPUT"
nohup curl -L -C - -o "$OUTPUT" "$URL" &
```
**Important:** Without `-C -`, the `-o` flag overwrites the output file when the transfer starts. With `-C -`, curl reads the size of the existing output file and resumes from that offset, so the partial file must already be at `$OUTPUT` before curl runs (hence the `cp`). The `-L` flag follows redirects (needed because the CDN URL 307-redirects to storage.googleapis.com).
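Resume only works if the server honors `Range` requests, and a resumed archive is only useful if it is intact. The sketch below shows one way to check both. The helper name `supports_ranges` is hypothetical, and because `curl -IL` prints the headers of every redirect hop, a match on an intermediate hop is possible.

```bash
# Succeeds if the HTTP response headers on stdin advertise byte-range support.
supports_ranges() {
  # Strip carriage returns, then look for "Accept-Ranges: bytes" (case-insensitive).
  tr -d '\r' | grep -qi '^accept-ranges:[[:space:]]*bytes'
}

# Usage (network required), with URL and OUTPUT set as in the block above:
#   curl -sIL "$URL" | supports_ranges && echo "server advertises range support"
# After the download finishes, test the archive before extracting it:
#   unzip -tq "$OUTPUT" && echo "zip OK"
```

If `unzip -tq` reports errors, the partial file probably did not match the remote file byte for byte, and a fresh download is the safer option.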
## Manual Installation from Downloaded Zip
If Playwright's install process keeps overwriting your manual extraction:
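```bash
BROWSER_DIR="$HOME/.cache/ms-playwright/chromium-1217"
mkdir -p "$BROWSER_DIR"
unzip -q /path/to/downloaded.zip -d "$BROWSER_DIR"
# Playwright's installer writes INSTALLATION_COMPLETE as its marker; .ready is kept from the original recipe
touch "$BROWSER_DIR/INSTALLATION_COMPLETE" "$BROWSER_DIR/.ready"
ln -sf chrome-linux64 "$BROWSER_DIR/chrome"
```
The marker file tells Playwright the browser is already installed. Playwright's own installer names this file `INSTALLATION_COMPLETE`, so create it alongside `.ready`.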
## Headless Shell — Separate Installation
Playwright needs **both** `chromium-1217/` (chrome binary) and `chromium_headless_shell-1217/` (headless shell). The headless shell is a separate ~112MB download:
```bash
# URL format (same browserVersion as chromium):
# https://storage.googleapis.com/chrome-for-testing-public/{browserVersion}/linux64/chrome-headless-shell-linux64.zip
BROWSER_VER="147.0.7727.15" # from browsers.json
HEADLESS_ZIP="/tmp/chrome-headless-shell.zip"
HEADLESS_DIR="$HOME/.cache/ms-playwright/chromium_headless_shell-1217"
curl -L -o "$HEADLESS_ZIP" "https://storage.googleapis.com/chrome-for-testing-public/${BROWSER_VER}/linux64/chrome-headless-shell-linux64.zip"
mkdir -p "$HEADLESS_DIR"
unzip -q "$HEADLESS_ZIP" -d "$HEADLESS_DIR"
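touch "$HEADLESS_DIR/INSTALLATION_COMPLETE" "$HEADLESS_DIR/.ready"  # INSTALLATION_COMPLETE is the marker Playwright's installer writes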
ln -sf chrome-headless-shell-linux64 "$HEADLESS_DIR/chrome-headless-shell"
```
## Playwright Overwrites Cache Directory
⚠️ **Critical:** Each `playwright install chromium` run creates a **fresh temp directory** (via `mkdtemp`), downloads there, then moves the result into the cache. If you manually extracted to `~/.cache/ms-playwright/chromium-1217/` and then run `playwright install`, it can **delete and recreate that directory**, wiping your manual extraction.
**Workaround:** Either let `playwright install` run to completion before touching the cache, or, if it keeps getting interrupted, re-extract your zip after each `playwright install` run.
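One way to avoid triggering the overwrite by accident is to check for the binary before invoking the installer at all. This is a minimal sketch assuming the `chrome-linux64/chrome` layout used in this document; the function name `chromium_present` is hypothetical.

```bash
# Succeeds if the Chromium binary for the given revision is already in place.
# $1: revision (e.g. 1217); $2: cache root (defaults to ~/.cache/ms-playwright).
chromium_present() {
  root="${2:-$HOME/.cache/ms-playwright}"
  [ -x "$root/chromium-$1/chrome-linux64/chrome" ]
}

# Only suggest running the installer when the binary is missing.
if chromium_present 1217; then
  echo "chromium-1217 already present, skipping playwright install"
else
  echo "chromium-1217 missing, run: npx playwright install chromium"
fi
```

The check looks only at the binary, not at any marker file, so it guards scripts of your own rather than changing what `playwright install` itself decides.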
## Verify Installation
```bash
# Via Playwright API
node -e "const {chromium}=require('playwright-core'); console.log(chromium.executablePath())"
# Direct binary test
"$HOME/.cache/ms-playwright/chromium-1217/chrome-linux64/chrome" --version
```
## Common Failure Modes
| Symptom | Cause | Fix |
|---------|-------|-----|
| `playwright install` hangs at 0% | Network/DNS issue, bad CDN mirror | Set `PLAYWRIGHT_DOWNLOAD_HOST` env var |
| Download starts but never completes | Server-side timeout, partial file left in `/tmp/` | Resume from partial using `curl -C -` |
| "Chrome binary not found" after install | Install marker (`INSTALLATION_COMPLETE`; this guide also creates `.ready`) missing, or wrong dir name | Create the marker, check the `chromium-{revision}` dir name |
| `chrome: command not found` | Binary is not at the expected path or lacks the execute bit | Check `chrome-linux64/chrome` exists and is executable |
| Playwright reinstalls even though browser exists | Install marker missing | `touch "$BROWSER_DIR/INSTALLATION_COMPLETE" "$BROWSER_DIR/.ready"` |
| Headless shell not found | Headless shell is a **separate install** from chromium — both are needed | Install headless shell manually from `chrome-headless-shell-linux64.zip` (same browserVersion as chromium) |
## Environment Variables
```bash
PLAYWRIGHT_CHROMIUM_DOWNLOAD_HOST # Override CDN for Chromium
PLAYWRIGHT_FIREFOX_DOWNLOAD_HOST # Override CDN for Firefox
PLAYWRIGHT_DOWNLOAD_HOST # Override CDN for all browsers
PLAYWRIGHT_DOWNLOAD_CONNECTION_TIMEOUT # Socket timeout in ms
```
## Key Paths
- Browser cache: `~/.cache/ms-playwright/`
- Temp downloads: `/tmp/playwright-download-*/` (usually cleared on reboot, depending on how the system cleans `/tmp`)
- Playwright node_modules: `$HOME/.hermes/hermes-agent/node_modules/playwright-core/`
- browsers.json: `$HOME/.hermes/hermes-agent/node_modules/playwright-core/browsers.json`
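
The cache path above is the Linux default. Playwright also honors the `PLAYWRIGHT_BROWSERS_PATH` environment variable, so scripts that hard-code `~/.cache/ms-playwright` can look in the wrong place. A small sketch for resolving the directory follows; the function name `pw_cache_dir` is hypothetical, and the special value `0` (browsers stored inside `node_modules`) is deliberately not handled.

```bash
# Print the directory where Playwright browsers are expected to live.
# Uses PLAYWRIGHT_BROWSERS_PATH if set, otherwise the Linux default cache.
pw_cache_dir() {
  echo "${PLAYWRIGHT_BROWSERS_PATH:-$HOME/.cache/ms-playwright}"
}

# List installed browsers, or say where the cache was expected.
ls "$(pw_cache_dir)" 2>/dev/null || echo "no browser cache at $(pw_cache_dir)"
```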