What is Browser Automation?
Browser automation lets agents interact with websites:
- Navigate pages
- Fill forms
- Click buttons
- Extract data
- Complete workflows
Browser Tool Basics
Starting Browser
browser(action="start", profile="clawd")
Opening URLs
browser(action="open", targetUrl="https://example.com")
Taking Snapshots
See the current page state:
browser(action="snapshot")
Screenshots
Capture the page as an image:
browser(action="screenshot")
Navigation
Go to URL
browser(action="navigate", targetUrl="https://example.com/page")
Refresh
browser(action="navigate", targetUrl="current") // Refresh current
Interacting with Elements
Click
browser(action="act", request={
  kind: "click",
  ref: "button[Submit]"  // Element reference from snapshot
})
Type Text
browser(action="act", request={
  kind: "type",
  ref: "textbox[Search]",
  text: "search query"
})
Fill Form
browser(action="act", request={
  kind: "fill",
  fields: [
    { ref: "textbox[Email]", text: "user@example.com" },
    { ref: "textbox[Password]", text: "password123" }
  ]
})
Press Keys
browser(action="act", request={
  kind: "press",
  key: "Enter"
})
Reading Page Content
Snapshot
Get page structure:
browser(action="snapshot")
// Returns a structured representation of the page
Screenshot
Get visual representation:
browser(action="screenshot")
Extract Text
To extract text, read the element content from the snapshot output.
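Reading content out of a snapshot can be sketched as a small parser. This is a minimal illustration, assuming the snapshot is available as text and that elements appear in the role[Label] form used for references in this guide. The real snapshot format may differ, and find_refs and the sample text are hypothetical.

```python
import re

# Matches references written as role[Label], the form used in this guide.
# ASSUMPTION: the real snapshot output may be structured differently.
REF_PATTERN = re.compile(r"(?P<role>[A-Za-z]+)\[(?P<label>[^\]]+)\]")


def find_refs(snapshot_text, role=None):
    """Return (role, label) pairs found in snapshot text, optionally filtered by role."""
    refs = [(m.group("role"), m.group("label"))
            for m in REF_PATTERN.finditer(snapshot_text)]
    if role is not None:
        refs = [pair for pair in refs if pair[0] == role]
    return refs


# Hypothetical snapshot text, for illustration only.
sample = """heading[Sign in]
textbox[Email]
textbox[Password]
button[Sign In]"""

textboxes = find_refs(sample, role="textbox")
```

Filtering by role is a quick way to list the inputs a page exposes before deciding which references to act on.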
Common Patterns
Login Flow
// Navigate to login
browser(action="open", targetUrl="https://app.example.com/login")
// Fill credentials (placeholder values; load real ones from a secure source)
browser(action="act", request={
  kind: "fill",
  fields: [
    { ref: "textbox[Email]", text: "user@example.com" },
    { ref: "textbox[Password]", text: "password" }
  ]
})
// Submit
browser(action="act", request={
  kind: "click",
  ref: "button[Sign In]"
})
// Verify success
browser(action="snapshot")
Search and Extract
// Go to search page
browser(action="open", targetUrl="https://example.com")
// Enter search
browser(action="act", request={
  kind: "type",
  ref: "textbox[Search]",
  text: "query"
})
// Submit search
browser(action="act", request={
  kind: "press",
  key: "Enter"
})
// Get results
browser(action="snapshot")
Form Filling
browser(action="act", request={
  kind: "fill",
  fields: [
    { ref: "textbox[Name]", text: "John Doe" },
    { ref: "textbox[Email]", text: "john@example.com" },
    { ref: "combobox[Country]", values: ["United States"] }
  ]
})
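When the form data already lives in a program, the fill request can be assembled from it. This is a minimal sketch, assuming the request is passed as a plain Python dict with the same keys as the example above. build_fill_request is a hypothetical helper, not part of the browser tool.

```python
def build_fill_request(text_fields, select_fields=None):
    """Build a "fill" request from {ref: text} and optional {ref: [values]} mappings.

    ASSUMPTION: the tool accepts a dict shaped like the request in this guide.
    """
    fields = [{"ref": ref, "text": text} for ref, text in text_fields.items()]
    for ref, values in (select_fields or {}).items():
        fields.append({"ref": ref, "values": list(values)})
    return {"kind": "fill", "fields": fields}


request = build_fill_request(
    {"textbox[Name]": "John Doe", "textbox[Email]": "john@example.com"},
    {"combobox[Country]": ["United States"]},
)
```

Keeping text inputs and selections in separate mappings mirrors the text and values keys used in the request above.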
Waiting
Wait for Element
browser(action="act", request={
  kind: "wait",
  ref: "button[Submit]",
  timeMs: 5000
})
Wait for Text
browser(action="act", request={
  kind: "wait",
  textGone: "Loading..."  // Continue once this text disappears
})
Profiles
Isolated Browser
browser(action="start", profile="clawd")
Starts a fresh, isolated browser instance with no saved state.
Chrome Extension
browser(action="start", profile="chrome")
Attaches to your existing Chrome session, so the agent works with the pages and logins you already have.
Best Practices
Use Snapshots First
Before interacting, understand the page:
browser(action="snapshot")
// Examine structure
// Then interact
Handle Loading
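Give pages time to load before interacting. A fixed delay is the simplest option, but waiting on an element or on text (see Waiting) is more reliable when load times vary:
browser(action="act", request={
  kind: "wait",
  timeMs: 2000
})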
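Where the built-in wait kinds do not fit, a condition can be polled with a deadline instead of sleeping for a fixed time. This is a generic sketch, not part of the browser tool. wait_until and page_ready are hypothetical, and the readiness check is simulated so the example runs on its own.

```python
import time


def wait_until(condition, timeout_s=5.0, interval_s=0.25,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or timeout_s elapses.

    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Simulated check: the "page" becomes ready on the third poll.
polls = {"count": 0}


def page_ready():
    polls["count"] += 1
    return polls["count"] >= 3


ready = wait_until(page_ready, timeout_s=1.0, sleep=lambda seconds: None)
```

In real use, the condition would take a snapshot and look for the element or text that signals the page has loaded.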
Verify Actions
After actions, verify they worked:
browser(action="act", request={kind: "click", ref: "..."})
browser(action="snapshot") // Verify state changed
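The act-then-verify pattern can be wrapped in a helper. This is a minimal sketch, assuming the tool is exposed as a Python callable that takes the same keyword arguments and that a snapshot can be treated as text. FakeBrowser and act_and_verify are hypothetical names, included only so the example runs on its own.

```python
class FakeBrowser:
    """Stand-in for the real tool (hypothetical), so the sketch is runnable."""

    def __init__(self):
        self.page = "button[Submit]"
        self.calls = []

    def __call__(self, action, **kwargs):
        self.calls.append(action)
        # Simulate a page change after a click.
        if action == "act" and kwargs["request"]["kind"] == "click":
            self.page = "heading[Thank you]"
        return self.page


def act_and_verify(browser, request, expected_text):
    """Perform an action, then confirm expected_text appears in a fresh snapshot."""
    browser(action="act", request=request)
    snapshot = browser(action="snapshot")
    return expected_text in str(snapshot)


fake = FakeBrowser()
succeeded = act_and_verify(
    fake, {"kind": "click", "ref": "button[Submit]"}, "Thank you"
)
```

Checking for text that only appears after success is stricter than checking that the snapshot merely changed.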
Handle Errors
Pages can fail to load or behave unexpectedly:
- Check for error states
- Handle timeouts
- Retry if appropriate
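Retrying on timeouts can be sketched with exponential backoff. This is a generic pattern rather than a feature of the browser tool. with_retries and flaky_step are hypothetical, and using TimeoutError as the retryable error is an assumption about how a failed step would surface.

```python
import time


def with_retries(operation, attempts=3, base_delay_s=0.5,
                 retry_on=(TimeoutError,), sleep=time.sleep):
    """Run operation(), retrying on the given exceptions with exponential backoff.

    The final failure is re-raised so the caller can handle it.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))


# Simulated step that times out twice before succeeding.
state = {"tries": 0}
delays = []


def flaky_step():
    state["tries"] += 1
    if state["tries"] < 3:
        raise TimeoutError("page did not respond")
    return "ok"


result = with_retries(flaky_step, sleep=delays.append)
```

Only retry steps that are safe to repeat. Resubmitting a payment form, for example, is rarely appropriate.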
Security Considerations
Credentials
Don't hardcode passwords:
- Use environment variables
- Store securely
- Don't log credentials
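The points above can be sketched as follows. The variable names APP_EMAIL and APP_PASSWORD and both helper functions are hypothetical, and the request is assumed to be a plain dict shaped like the fill examples in this guide.

```python
import os


def login_fields_from_env(env=os.environ, email_var="APP_EMAIL",
                          password_var="APP_PASSWORD"):
    """Build a "fill" request from environment variables instead of literals."""
    missing = [name for name in (email_var, password_var) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "kind": "fill",
        "fields": [
            {"ref": "textbox[Email]", "text": env[email_var]},
            {"ref": "textbox[Password]", "text": env[password_var]},
        ],
    }


def redact(request):
    """Return a copy of the request that is safe to log: text values are masked."""
    fields = [
        {**field, "text": "***"} if "text" in field else dict(field)
        for field in request.get("fields", [])
    ]
    return {**request, "fields": fields}


# Demonstration with an explicit mapping; real use would rely on os.environ.
example_request = login_fields_from_env(
    {"APP_EMAIL": "user@example.com", "APP_PASSWORD": "s3cret"}
)
safe_to_log = redact(example_request)
```

Failing early on a missing variable avoids submitting a half-empty login form.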
Sensitive Data
Be careful when extracting:
- Personal information
- Financial data
- Private content
Session Management
Close browsers when done:
browser(action="stop")
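To make sure the browser is stopped even when a step fails, the start and stop calls can be paired in a context manager. This is a minimal sketch, assuming the tool is exposed as a Python callable with the same keyword arguments. browser_session and fake_browser are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(browser, profile="clawd"):
    """Start the browser, hand it to the caller, and always stop it afterwards."""
    browser(action="start", profile=profile)
    try:
        yield browser
    finally:
        browser(action="stop")


# Recording stand-in for the real tool, so the sketch is runnable.
calls = []


def fake_browser(action, **kwargs):
    calls.append(action)


try:
    with browser_session(fake_browser) as session:
        session(action="open", targetUrl="https://example.com")
        raise RuntimeError("page failed")  # Simulated failure mid-workflow
except RuntimeError:
    pass
```

The stop call runs in the finally clause, so a failure in the middle of a workflow does not leave the session open.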
Limitations
What Works Well
- Simple navigation
- Form filling
- Data extraction
- Clicking buttons
What's Harder
- Complex JavaScript apps
- CAPTCHAs
- Anti-automation measures
- Real-time interactions
Alternative: web_fetch
For simple data extraction, web_fetch is often the lighter option:
web_fetch(url="https://example.com")
It returns readable content without the overhead of a browser.
Conclusion
Browser automation extends agent capabilities:
- Access web applications
- Automate workflows
- Extract data
- Complete tasks
Use it thoughtfully: not every task needs a browser.
Next: Working with Nodes - Device and remote system integration