---
name: playwright-browser-install
description: Install, diagnose, and recover Playwright browser binaries (Chromium, Firefox, WebKit), covering partial downloads, resume, cache management, and sandbox issues.
triggers:
  - "playwright install chromium"
  - "playwright install firefox"
  - "playwright install webkit"
  - "playwright browser download failed"
  - "playwright chromium not found"
  - "chrome binary missing"
  - "playwright download interrupted"
---

# Playwright Browser Install: Diagnosis & Recovery

## Pitfalls

### Chrome sandbox fails in containers/VMs

**Symptom**: `browser_navigate` fails with "No usable sandbox! If you are running on Ubuntu 23.10+ or another Linux distro that has disabled unprivileged user namespaces with AppArmor"

**Root cause**: Chrome's sandbox mechanism is unavailable in the container or VM environment.

**Workaround**:

```bash
# Add to ~/.bashrc
echo 'export PLAYWRIGHT_CHROMIUM_ARGS="--no-sandbox --disable-setuid-sandbox"' >> ~/.bashrc
source ~/.bashrc

# Or call playwright-core directly
cd ~/.hermes/hermes-agent && node -e "
const { chromium } = require('playwright-core');
(async () => {
  const browser = await chromium.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox'] });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  const text = await page.textContent('body');
  console.log(text);
  await browser.close();
})().catch(e => console.error(e.message));
"
```

**Note**: The `browser_navigate` tool may not support custom Chrome arguments; in that case, call playwright-core directly as shown above.

## Quick Diagnosis

```bash
# Check which browsers are installed
ls ~/.cache/ms-playwright/

# Check whether a Playwright process (e.g. an install) is still running
ps aux | grep playwright | grep -v grep

# Check download temp dirs (partial downloads accumulate here)
ls -la /tmp/playwright-download-*/

# Find the largest partial zip (likely the interrupted download).
# The glob goes in the for-loop so the shell actually expands it.
for f in /tmp/playwright-download-*/playwright-download-chromium-*.zip; do
  test -f "$f" && echo "$(du -sh "$f" | cut -f1) $f"
done | sort -rh | head
```

## Finding the Actual CDN Download URL

The Playwright CDN redirects
through several hosts. The final destination (which supports Range/resume) is **storage.googleapis.com**.

```bash
# Method: use the playwright CLI with verbose to see the URL
# Or extract from browsers.json (the entries live under the top-level "browsers" key)
node -e "const b=require('$HOME/.hermes/hermes-agent/node_modules/playwright-core/browsers.json'); const c=b.browsers.find(x=>x.name==='chromium'); console.log(c.browserVersion, c.revision)"

# Then construct the URL:
# https://storage.googleapis.com/chrome-for-testing-public/{browserVersion}/linux64/chrome-linux64.zip
```

For Chromium revision 1217, `browserVersion` is **147.0.7727.15**.

## Resume a Partial Download (curl)

1. Identify the largest partial zip in `/tmp/playwright-download-*/`
2. Copy it to a safe location
3. Use `curl -C -` for automatic resume (it reads the existing file size and sends a `Range` header):

```bash
PARTIAL_ZIP="/tmp/playwright-download-XRveVR/playwright-download-chromium-ubuntu24.04-x64-1217.zip"
OUTPUT="/tmp/playwright-chromium-resume.zip"
URL="https://storage.googleapis.com/chrome-for-testing-public/147.0.7727.15/linux64/chrome-linux64.zip"

cp "$PARTIAL_ZIP" "$OUTPUT"
nohup curl -L -C - -o "$OUTPUT" "$URL" &
```

**Important:** Without `-C -`, `curl -o` truncates the output file on start. With `-C -`, curl reads the existing file's size and resumes from that offset, so the partial file must already exist at `$OUTPUT` before curl starts (hence the `cp` above). `-L` follows redirects, which is needed when starting from a CDN URL that 307-redirects to storage.googleapis.com.

## Manual Installation from Downloaded Zip

If Playwright's install process keeps overwriting your manual extraction:

```bash
BROWSER_DIR="$HOME/.cache/ms-playwright/chromium-1217"
mkdir -p "$BROWSER_DIR"
unzip -q /path/to/downloaded.zip -d "$BROWSER_DIR"
touch "$BROWSER_DIR/.ready"
ln -sf chrome-linux64 "$BROWSER_DIR/chrome"
```

The `.ready` marker file tells Playwright the browser is already installed.

## Headless Shell: Separate Installation

Playwright needs **both** `chromium-1217/` (chrome binary) and `chromium_headless_shell-1217/` (headless shell).
The headless shell is a separate ~112 MB download:

```bash
# URL format (same browserVersion as chromium):
# https://storage.googleapis.com/chrome-for-testing-public/{browserVersion}/linux64/chrome-headless-shell-linux64.zip

BROWSER_VER="147.0.7727.15"  # from browsers.json
HEADLESS_ZIP="/tmp/chrome-headless-shell.zip"
HEADLESS_DIR="$HOME/.cache/ms-playwright/chromium_headless_shell-1217"

curl -L -o "$HEADLESS_ZIP" "https://storage.googleapis.com/chrome-for-testing-public/${BROWSER_VER}/linux64/chrome-headless-shell-linux64.zip"
mkdir -p "$HEADLESS_DIR"
unzip -q "$HEADLESS_ZIP" -d "$HEADLESS_DIR"
touch "$HEADLESS_DIR/.ready"
ln -sf chrome-headless-shell-linux64 "$HEADLESS_DIR/chrome-headless-shell"
```

## Playwright Overwrites Cache Directory

⚠️ **Critical:** When you run `playwright install chromium`, it creates a **fresh temp directory** each time (using `mkdtemp`), downloads there, then moves the result to the cache. If you manually extracted to `~/.cache/ms-playwright/chromium-1217/` and then run `playwright install`, it will **delete and recreate that directory**, wiping your manual extraction.

**Workaround:** Run `playwright install` first and let it complete. If it is interrupted, re-extract your manual download after each `playwright install` run.
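Because `playwright install` can wipe a manually extracted directory, one way to avoid repeating the extraction is to keep a copy outside the cache and restore it afterwards. The following is a minimal sketch under that assumption: the helper names `backup_browser_dir` and `restore_browser_dir` and the backup location are hypothetical and not part of Playwright, and the `.ready` check simply follows the marker convention used in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of Playwright): keep a copy of a manually
# extracted browser directory outside the Playwright cache.

# backup_browser_dir SRC_DIR BACKUP_ROOT
# Copies SRC_DIR (e.g. ~/.cache/ms-playwright/chromium-1217) into BACKUP_ROOT,
# replacing any older copy with the same name.
backup_browser_dir() {
  local src="$1" backup_root="$2"
  [ -d "$src" ] || { echo "missing: $src" >&2; return 1; }
  mkdir -p "$backup_root"
  rm -rf "${backup_root:?}/$(basename "$src")"
  cp -a "$src" "$backup_root/"
}

# restore_browser_dir BACKUP_ROOT NAME CACHE_DIR
# Restores NAME from BACKUP_ROOT into CACHE_DIR unless the cached copy
# still has its .ready marker.
restore_browser_dir() {
  local backup_root="$1" name="$2" cache_dir="$3"
  [ -d "$backup_root/$name" ] || { echo "no backup for $name" >&2; return 1; }
  [ -e "$cache_dir/$name/.ready" ] && return 0   # cached copy looks intact
  mkdir -p "$cache_dir"
  rm -rf "${cache_dir:?}/$name"
  cp -a "$backup_root/$name" "$cache_dir/"
}
```

A possible usage: call `backup_browser_dir "$HOME/.cache/ms-playwright/chromium-1217" "$HOME/ms-playwright-backup"` before running `playwright install`, then `restore_browser_dir "$HOME/ms-playwright-backup" chromium-1217 "$HOME/.cache/ms-playwright"` if the install was interrupted.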
## Verify Installation

```bash
# Via Playwright API
node -e "const {chromium}=require('playwright-core'); console.log(chromium.executablePath())"

# Direct binary test
"$HOME/.cache/ms-playwright/chromium-1217/chrome-linux64/chrome" --version
```

## Common Failure Modes

| Symptom | Cause | Fix |
|---------|-------|-----|
| `playwright install` hangs at 0% | Network/DNS issue, bad CDN mirror | Set `PLAYWRIGHT_DOWNLOAD_HOST` env var |
| Download starts but never completes | Server-side timeout, partial file left in `/tmp/` | Resume from partial using `curl -C -` |
| "Chrome binary not found" after install | `.ready` marker missing or wrong dir name | Create marker, check `chromium-{revision}` dir name |
| `chrome: command not found` | Sandbox/suffix issues | Check `chrome-linux64/chrome` exists and is executable |
| Playwright reinstalls even though browser exists | `.ready` marker missing | `touch "$BROWSER_DIR/.ready"` |
| Headless shell not found | Headless shell is a **separate install** from chromium; both are needed | Install headless shell manually from `chrome-headless-shell-linux64.zip` (same browserVersion as chromium) |

## Environment Variables

```bash
PLAYWRIGHT_CHROMIUM_DOWNLOAD_HOST       # Override CDN for Chromium
PLAYWRIGHT_FIREFOX_DOWNLOAD_HOST        # Override CDN for Firefox
PLAYWRIGHT_DOWNLOAD_HOST                # Override CDN for all browsers
PLAYWRIGHT_DOWNLOAD_CONNECTION_TIMEOUT  # Socket timeout in ms
```

## Key Paths

- Browser cache: `~/.cache/ms-playwright/`
- Temp downloads: `/tmp/playwright-download-*/` (deleted on system reboot)
- Playwright node_modules: `$HOME/.hermes/hermes-agent/node_modules/playwright-core/`
- browsers.json: `$HOME/.hermes/hermes-agent/node_modules/playwright-core/browsers.json`
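
Several rows of the failure-mode table come down to the same three checks: the browser directory exists, the `.ready` marker is present, and the binary exists and is executable. A minimal sketch that bundles them is shown here; the function name `check_browser_install` is hypothetical (not a Playwright command), and the directory layout is assumed to match the paths used in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Playwright).
# check_browser_install BROWSER_DIR RELATIVE_EXECUTABLE
# Prints one line per problem found and returns non-zero if any check fails.
check_browser_install() {
  local dir="$1" exe="$2" status=0

  # Without the directory nothing else can be checked.
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir"
    return 1
  fi

  # Marker file, following the .ready convention described in this document.
  if [ ! -e "$dir/.ready" ]; then
    echo "missing marker: $dir/.ready"
    status=1
  fi

  # The binary must exist and carry an execute bit.
  if [ ! -f "$dir/$exe" ]; then
    echo "missing binary: $dir/$exe"
    status=1
  elif [ ! -x "$dir/$exe" ]; then
    echo "binary not executable: $dir/$exe"
    status=1
  fi

  [ "$status" -eq 0 ] && echo "ok: $dir"
  return "$status"
}
```

For example, `check_browser_install "$HOME/.cache/ms-playwright/chromium-1217" chrome-linux64/chrome` covers the Chromium install, and the same call with `chromium_headless_shell-1217` and the headless shell's relative path covers the second directory.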