| name | browser-automation |
| description | This skill should be used when working on frontend code, debugging UI issues, verifying visual changes, scraping web pages, testing web features, or inspecting page state. Also triggers on "open browser", "take screenshot", "navigate to URL", "check cookies", "extract page content", or any web automation task. Use proactively during frontend development to verify changes visually. |
Multi-Session Browser Server
HTTP server providing isolated browser contexts for multi-agent browser automation. Each session has its own cookies, localStorage, and continuous screencast stream.
When to Use Proactively
Use without being asked when:
- Verifying frontend changes visually
- Debugging UI issues or layout problems
- Scraping or extracting content from web pages
- Testing web features across multiple isolated sessions
- Multiple agents need simultaneous browser access without interference
Key advantage: Sessions are fully isolated. Each agent gets its own browser context with separate cookies, storage, and authentication state.
Session Lifecycle
Keep sessions alive during the conversation. Do not delete a session after a single operation - the user may want to:
- Navigate to additional pages
- Take more screenshots
- Interact with the page further
Only delete sessions when:
- The user explicitly says they're done with the browser
- The conversation/task is clearly complete
- Starting fresh with a new context is needed
Sessions are cleaned up automatically when the server hits its idle timeout, so leaving them open is safe.
Prerequisites
Install dependencies once:
cd ${CLAUDE_PLUGIN_ROOT} && npm install
Architecture
Chrome (headless) :9222
↓ CDP
browser-server.js :9223
↓ HTTP API
[Agent 1] [Agent 2] [Agent 3]
session-a session-b session-c
- Chrome runs on port 9222 with CDP enabled
- Browser server connects to Chrome and exposes HTTP API on port 9223
- Each session creates an isolated browser context via browser.createBrowserContext()
- Continuous screencast keeps the latest frame always ready (~15ms retrieval)
Starting the Server
node ${CLAUDE_PLUGIN_ROOT}/scripts/browser-server.js --headless
Options:
- --headless: Run Chrome without a visible window (recommended for agents)
- --profile: Copy the default Chrome profile (includes cookies, logins)
The server automatically starts Chrome if not already running. Output:
Starting Chrome on :9222 (headless)...
Connected to Chrome
Browser server running on http://localhost:9223
Keep the server running. It manages sessions for all agents and auto-shuts down after 5 minutes idle.
API Reference
Base URL: http://localhost:9223
Session Management
Create session:
curl -X POST http://localhost:9223/session
# {"id":"abc12345"}
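Every later call needs the ID returned by POST /session, so it can be convenient to capture it in a shell variable. The following is a minimal sketch, assuming the response has the {"id":"..."} shape shown above. The function names extract_session_id and create_session are hypothetical, and the sed-based parsing is a convenience for scripts, not something the server provides.

```shell
# Hypothetical helper: pull the "id" value out of a response like {"id":"abc12345"}.
# Assumes the ID contains no quotes or escaped characters.
extract_session_id() {
  printf '%s\n' "$1" | sed -n 's/.*"id":"\([^"]*\)".*/\1/p'
}

# Hypothetical wrapper: create a session and print its ID.
# Requires the browser server to be running on localhost:9223.
create_session() {
  extract_session_id "$(curl -s -X POST http://localhost:9223/session)"
}

# Example (not run here): SESSION_ID=$(create_session)
```

If jq is installed, `jq -r .id` is a sturdier way to read the field than sed.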
List sessions:
curl http://localhost:9223/sessions
# {"sessions":[{"id":"abc12345","frames":42,"lastFrame":"..."}],"count":1}
Destroy session:
curl -X DELETE http://localhost:9223/session/abc12345
# {"ok":true}
Session Operations
All operations require a valid session ID.
Navigate to URL:
curl -X POST http://localhost:9223/session/abc12345/navigate \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com"}'
# {"ok":true,"url":"https://example.com"}
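When the URL comes from a variable rather than a literal, quoting the JSON body by hand gets error-prone. Here is a small sketch of one way to wrap the navigate call; navigate_payload and navigate are hypothetical names, and the escaping covers only backslashes and double quotes, not control characters.

```shell
# Hypothetical helper: build the JSON body for /navigate, escaping backslashes
# and double quotes so the URL cannot break out of the JSON string.
navigate_payload() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"url":"%s"}\n' "$escaped"
}

# Hypothetical wrapper. Usage: navigate <session-id> <url>
navigate() {
  curl -s -X POST "http://localhost:9223/session/$1/navigate" \
    -H "Content-Type: application/json" \
    -d "$(navigate_payload "$2")"
}
```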
Get cached frame (fast ~15ms):
curl http://localhost:9223/session/abc12345/frame > frame.jpg
Returns JPEG from continuous screencast. Always ready, no rendering delay.
Get full screenshot (accurate ~100ms):
curl http://localhost:9223/session/abc12345/screenshot > screenshot.png
Returns PNG with full resolution. Better for text readability and OCR.
Get session status:
curl http://localhost:9223/session/abc12345/status
# {"id":"abc12345","url":"https://example.com","frames":42,"lastFrame":"...","age":1234}
Server Status
curl http://localhost:9223/status
# {"connected":true,"sessions":3}
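A script may want to confirm the server is reachable before creating sessions. The sketch below checks the /status response for "connected":true, assuming the response shape shown above. The function names are hypothetical, and the 2-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: succeed if a /status response reports "connected":true.
status_is_connected() {
  case "$1" in
    *'"connected":true'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: query the server. If curl cannot connect, its output
# is empty and the check fails.
server_ready() {
  status_is_connected "$(curl -s --max-time 2 http://localhost:9223/status)"
}

# Example (not run here): server_ready || echo "browser server is not running"
```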
Typical Workflows
Single Agent Scraping
Run each command separately. The session ID from step 1 is used in subsequent steps.
# Step 1: Create session - note the ID from response
curl -s -X POST http://localhost:9223/session
# Response: {"id":"abc12345"}
# Step 2: Navigate (replace abc12345 with actual ID)
curl -X POST http://localhost:9223/session/abc12345/navigate \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com"}'
# Step 3: Get screenshot for visual inspection
curl http://localhost:9223/session/abc12345/screenshot > /tmp/page.png
# Step 4: Clean up when done
curl -X DELETE http://localhost:9223/session/abc12345
Multi-Agent Parallel Testing
Each agent creates its own session. Sessions are isolated - cookies set in one session don't affect others.
# Agent 1: Create session, note the ID
curl -s -X POST http://localhost:9223/session
# Use returned ID for all Agent 1 operations
# Agent 2: Create separate session, note the ID
curl -s -X POST http://localhost:9223/session
# Use returned ID for all Agent 2 operations
# Sessions run simultaneously without interference
Authenticated Sessions
Use --profile to copy your default Chrome profile with existing logins:
node ${CLAUDE_PLUGIN_ROOT}/scripts/browser-server.js --profile --headless
Sessions will have access to cookies from your normal Chrome profile.
Screenshot Strategy
Two methods available with different trade-offs:
| Method | Endpoint | Speed | Format | Use Case |
|---|---|---|---|---|
| Frame | /frame | ~15ms | JPEG | Quick visual checks, animations |
| Screenshot | /screenshot | ~100ms | PNG | Text reading, OCR, archiving |
Recommendations:
- Use /frame for rapid iteration during development
- Use /screenshot when reading text content or precision matters
- Frame is from screencast (already rendered); screenshot triggers fresh render
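The choice between the two endpoints can be encoded in a small helper so callers state a purpose instead of remembering the trade-off. This is only a sketch: the purpose keywords (quick, animation, text, ocr, archive) and the function names are invented for illustration and mirror the table above.

```shell
# Hypothetical helper: map a capture purpose to an endpoint name.
capture_endpoint() {
  case "$1" in
    quick|animation) echo frame ;;        # cached JPEG, ~15ms
    text|ocr|archive) echo screenshot ;;  # fresh PNG, ~100ms
    *) echo screenshot ;;                 # default to the more accurate option
  esac
}

# Hypothetical wrapper. Usage: capture <session-id> <purpose> <output-file>
capture() {
  curl -s "http://localhost:9223/session/$1/$(capture_endpoint "$2")" > "$3"
}
```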
Error Handling
"Failed to connect to Chrome after starting it"
- Chrome failed to start
- Check Chrome is installed at expected path
- On macOS: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
"Session not found"
- Session ID invalid or already destroyed
- Create a new session with POST /session
"No frame available yet"
- Page just navigated, screencast hasn't captured yet
- Wait briefly or use /screenshot instead
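The "No frame available yet" case can be handled with a short retry followed by a fallback to /screenshot. The sketch below rests on an assumption the text does not confirm: that the server returns an HTTP error status for this condition, which curl -f turns into a non-zero exit code. The names retry and fetch_frame_or_screenshot are hypothetical, as are the attempt count and delay.

```shell
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical wrapper: try the cached frame a few times, then fall back to a
# full screenshot. Writes <prefix>.jpg or <prefix>.png.
# Usage: fetch_frame_or_screenshot <session-id> <output-prefix>
fetch_frame_or_screenshot() {
  base="http://localhost:9223/session/$1"
  retry 3 1 curl -sf "$base/frame" -o "$2.jpg" ||
    curl -sf "$base/screenshot" -o "$2.png"
}
```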
Performance Notes
- Persistent CDP connection: ~100ms for screenshot vs ~5s with fresh connection (50x faster)
- Continuous screencast: frames always ready, no rendering wait
- Session creation: ~200ms (creates new browser context)
- Memory: each session holds one page; destroy sessions when done
Cleanup
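Sessions persist until they are explicitly destroyed or the server shuts down after its idle timeout. Once the task is complete, destroy the sessions you no longer need: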
# Destroy specific session (replace with actual ID)
curl -X DELETE http://localhost:9223/session/abc12345
# Check remaining sessions
curl http://localhost:9223/sessions
Stop the server with Ctrl+C - it gracefully closes all sessions.
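When a single agent owns every session on the server, the per-session DELETE call can be looped over the IDs reported by /sessions. This is a sketch that assumes the /sessions response shape shown in the API reference. The function names are hypothetical, and grep -o is a common extension rather than strict POSIX.

```shell
# Hypothetical helper: print each session ID from a /sessions response,
# one per line. Assumes IDs contain no quotes.
session_ids() {
  printf '%s\n' "$1" | grep -o '"id":"[^"]*"' | sed 's/^"id":"//; s/"$//'
}

# Hypothetical wrapper: destroy every session the server currently holds.
destroy_all_sessions() {
  for id in $(session_ids "$(curl -s http://localhost:9223/sessions)"); do
    curl -s -X DELETE "http://localhost:9223/session/$id"
  done
}
```

Avoid this on a server shared with other agents, since it would also remove their sessions.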