Building Browser-Powered AI Agents with OpenClaw
Browser automation isn't just for QA teams anymore. AI agents can leverage browser control to interact with web applications, extract data, fill forms, and monitor changes across the internet. OpenClaw makes this accessible without the usual complexity of traditional automation frameworks.
Let's explore the practical use cases where browser-powered agents shine.
Why Agents Need Browsers
Not everything has an API. Many useful services, tools, and data sources are only accessible through web interfaces. For agents to be truly autonomous, they need to interact with the web the way humans do: clicking buttons, filling forms, reading pages.
OpenClaw's browser integration bridges this gap. Instead of hoping for API access, your agent can directly control a browser and get the job done.
Use Case 1: Web Scraping
Web scraping is the most common use case. You need data from a website that doesn't offer an API.
Example: Scraping Job Listings
Let's build an agent that scrapes job listings from a career site:
async function scrapeJobs(url: string) {
  const jobs = [];
  // Open the listings page
  await browser.open({ url, profile: "openclaw" });
  // Take snapshot to see page structure
  const snapshot = await browser.snapshot({ refs: "aria" });
  // Find job cards (this is simplified)
  const jobCards = parseJobCards(snapshot.content);
  for (const card of jobCards) {
    jobs.push({
      title: card.title,
      company: card.company,
      location: card.location,
      url: card.link
    });
  }
  return jobs;
}
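The `parseJobCards` helper above is deliberately left abstract, since its shape depends entirely on how the target site's content appears in the snapshot. As a hedged sketch, suppose each job card is flattened into one pipe-delimited line like `JOB | title | company | location | url` (a made-up format purely for illustration; a real parser would walk the site's actual ARIA structure):

```typescript
interface JobCard {
  title: string;
  company: string;
  location: string;
  link: string;
}

// Hypothetical parser: assumes each job card was flattened into a single
// pipe-delimited line of the snapshot text. Real snapshots need
// site-specific parsing.
function parseJobCards(content: string): JobCard[] {
  return content
    .split("\n")
    .filter((line) => line.startsWith("JOB |"))
    .map((line) => {
      const [, title, company, location, link] = line
        .split("|")
        .map((part) => part.trim());
      return { title, company, location, link };
    });
}
```

The point is the shape, not the regexes: keep parsing logic in small, testable functions separate from the browser-control code.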
Handling Pagination
Most listing pages have pagination. Handle it by clicking "Next" until there are no more pages:
const allJobs = [];
let hasNextPage = true;
while (hasNextPage) {
  const snapshot = await browser.snapshot();
  const jobs = parseJobCards(snapshot.content);
  allJobs.push(...jobs);
  // Look for next button
  const nextButton = findElement(snapshot.content, "Next");
  if (nextButton) {
    await browser.act({ kind: "click", ref: nextButton.ref });
    await browser.act({ kind: "wait", text: "Page loaded" });
  } else {
    hasNextPage = false;
  }
}
Scraping Dynamic Content
Single-page apps (SPAs) load content dynamically. Wait for the content before scraping:
await browser.open({ url: "https://spa-site.com/data" });
// Wait for loading spinner to disappear
await browser.act({ kind: "wait", textGone: "Loading..." });
// Now snapshot will have the full content
const snapshot = await browser.snapshot();
const data = extractData(snapshot.content);
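When the built-in wait isn't enough (say you need an arbitrary condition rather than text appearing or disappearing), a generic polling helper covers the gap. This is a general-purpose pattern, not an OpenClaw API:

```typescript
// Poll an async (or sync) condition until it returns true or the timeout
// elapses. Resolves to true on success, false on timeout.
async function pollUntil(
  condition: () => Promise<boolean> | boolean,
  timeoutMs = 10000,
  intervalMs = 250
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await condition()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

Usage might look like `await pollUntil(async () => (await browser.snapshot()).content.includes("Results"))`, which re-snapshots until the content you care about shows up.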
Use Case 2: Form Filling
Automating form submissions saves time and enables agents to interact with services that require user input.
Example: Automated Job Applications
An agent that applies to jobs on your behalf:
async function applyToJob(jobUrl: string, resume: ResumeData) {
  await browser.open({ url: jobUrl });
  // Click "Apply Now"
  const snapshot1 = await browser.snapshot();
  const applyButton = findElement(snapshot1.content, "Apply Now");
  await browser.act({ kind: "click", ref: applyButton.ref });
  // Wait for application form
  await browser.act({ kind: "wait", text: "Application Form" });
  // Fill all fields
  const snapshot2 = await browser.snapshot({ refs: "aria" });
  await browser.act({
    kind: "fill",
    fields: [
      { selector: "input[name=fullName]", text: resume.name },
      { selector: "input[name=email]", text: resume.email },
      { selector: "input[name=phone]", text: resume.phone },
      { selector: "textarea[name=coverLetter]", text: resume.coverLetter }
    ]
  });
  // Upload resume
  await browser.upload({
    selector: "input[type=file]",
    paths: [resume.pdfPath]
  });
  // Submit
  await browser.act({ kind: "click", selector: "button[type=submit]" });
  // Verify submission
  await browser.act({ kind: "wait", text: "Application submitted" });
}
Multi-Step Forms
Some forms span multiple pages. Handle them by tracking progress:
const formSteps = [
  { name: "Personal Info", fields: [...] },
  { name: "Experience", fields: [...] },
  { name: "Review", fields: [] }
];
for (const step of formSteps) {
  await fillStep(step);
  const snapshot = await browser.snapshot();
  const nextButton = findElement(snapshot.content, "Next");
  if (nextButton) {
    await browser.act({ kind: "click", ref: nextButton.ref });
    await browser.act({ kind: "wait", textGone: "Loading" });
  }
}
// Final submit
await browser.act({ kind: "click", selector: "button.submit-final" });
Use Case 3: Monitoring and Alerts
Agents can watch websites for changes and alert you when something important happens.
Example: Price Drop Monitor
Watch a product page and notify when the price drops:
async function monitorPrice(productUrl: string, targetPrice: number) {
  while (true) {
    await browser.open({ url: productUrl });
    const snapshot = await browser.snapshot();
    const priceElement = findElement(snapshot.content, "$");
    const currentPrice = parsePrice(priceElement.text);
    console.log(`Current price: ${currentPrice}`);
    if (currentPrice <= targetPrice) {
      await sendNotification(
        `Price drop alert!`,
        `${productUrl} is now ${currentPrice}`
      );
      break;
    }
    // Check every hour
    await sleep(3600000);
  }
}
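`parsePrice` is doing real work here: price text on retail pages comes wrapped in currency symbols, thousands separators, and surrounding copy. A minimal sketch, assuming US-style formatting like `$1,299.99`:

```typescript
// Extract the first US-style dollar amount from a text fragment.
// Returns NaN when no price is found, so callers can skip bad reads.
function parsePrice(text: string): number {
  const match = text.match(/\$\s*([\d,]+(?:\.\d{1,2})?)/);
  if (!match) return NaN;
  // Strip thousands separators before converting
  return parseFloat(match[1].replace(/,/g, ""));
}
```

Returning `NaN` rather than throwing keeps a long-running monitor alive when a page occasionally renders without a price.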
Example: Dashboard Monitoring
Monitor a metrics dashboard and alert on anomalies:
async function monitorDashboard(dashboardUrl: string) {
  await browser.open({ url: dashboardUrl });
  // Login if needed (reuse session with chrome profile)
  const isLoggedIn = await checkIfLoggedIn();
  if (!isLoggedIn) {
    await performLogin();
  }
  while (true) {
    const snapshot = await browser.snapshot();
    const metrics = extractMetrics(snapshot.content);
    // Check thresholds
    if (metrics.errorRate > 0.05) {
      await sendAlert("High error rate detected: " + metrics.errorRate);
    }
    if (metrics.responseTime > 500) {
      await sendAlert("Slow response time: " + metrics.responseTime);
    }
    await sleep(300000); // Check every 5 minutes
  }
}
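The `extractMetrics` helper is where the dashboard-specific knowledge lives. As a hedged sketch, assume the dashboard's text snapshot contains lines like `Error rate: 4.2%` and `Response time: 312 ms` (your dashboard's labels will differ, so treat the regexes as placeholders):

```typescript
interface DashboardMetrics {
  errorRate: number;    // fraction, e.g. 0.042 for 4.2%
  responseTime: number; // milliseconds
}

// Hypothetical extractor: the label patterns are assumptions about how
// this particular dashboard renders as text.
function extractMetrics(content: string): DashboardMetrics {
  const errorMatch = content.match(/Error rate:\s*([\d.]+)%/);
  const latencyMatch = content.match(/Response time:\s*([\d.]+)\s*ms/);
  return {
    errorRate: errorMatch ? parseFloat(errorMatch[1]) / 100 : NaN,
    responseTime: latencyMatch ? parseFloat(latencyMatch[1]) : NaN
  };
}
```

Note the unit conversion: the dashboard shows a percentage, while the threshold check above compares against a fraction (`0.05`), so the extractor divides by 100.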
Use Case 4: Automated Testing
Agents can run automated tests on web applications, verifying functionality and catching regressions.
Example: E2E Testing
Test a checkout flow:
async function testCheckoutFlow() {
  // Start on homepage
  await browser.open({ url: "https://store.example.com" });
  // Search for product
  let snapshot = await browser.snapshot();
  const searchBox = findElement(snapshot.content, "Search");
  await browser.act({ kind: "type", ref: searchBox.ref, text: "laptop" });
  await browser.act({ kind: "press", key: "Enter" });
  // Wait for results
  await browser.act({ kind: "wait", text: "Results" });
  // Click first product
  snapshot = await browser.snapshot();
  const firstProduct = findFirstProduct(snapshot.content);
  await browser.act({ kind: "click", ref: firstProduct.ref });
  // Add to cart
  await browser.act({ kind: "wait", text: "Add to Cart" });
  snapshot = await browser.snapshot();
  const addButton = findElement(snapshot.content, "Add to Cart");
  await browser.act({ kind: "click", ref: addButton.ref });
  // Go to cart
  await browser.act({ kind: "wait", text: "View Cart" });
  snapshot = await browser.snapshot();
  const cartButton = findElement(snapshot.content, "View Cart");
  await browser.act({ kind: "click", ref: cartButton.ref });
  // Verify item in cart
  snapshot = await browser.snapshot();
  const hasLaptop = snapshot.content.includes("laptop");
  assert(hasLaptop, "Product not in cart");
  console.log("✅ Checkout flow test passed");
}
Example: Visual Regression Testing
Capture screenshots and compare:
async function visualRegressionTest(url: string, baselinePath: string) {
  await browser.open({ url });
  // Take screenshot
  const screenshot = await browser.screenshot({ type: "png", fullPage: true });
  // Compare with baseline
  const diff = await compareImages(screenshot, baselinePath);
  if (diff > 0.01) {
    console.log(`⚠️ Visual regression detected: ${diff * 100}% difference`);
  } else {
    console.log("✅ Visual regression test passed");
  }
}
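`compareImages` is the interesting part, and it's assumed above. Production visual diffing should decode both PNGs and compare pixels with a perceptual library such as pixelmatch; as a crude stand-in, here's a byte-level difference ratio over two buffers. It over-reports change because PNG data is compressed, but it illustrates the shape of the API:

```typescript
// Naive byte-level difference ratio between two buffers, in [0, 1].
// A real implementation should decode the images and compare pixels;
// compressed bytes exaggerate small visual changes.
function compareBuffers(a: Uint8Array, b: Uint8Array): number {
  const len = Math.max(a.length, b.length);
  if (len === 0) return 0;
  let differing = 0;
  for (let i = 0; i < len; i++) {
    // Out-of-range reads yield undefined, which counts as a difference,
    // so buffers of unequal length are penalized for the extra bytes.
    if (a[i] !== b[i]) differing++;
  }
  return differing / len;
}
```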
Use Case 5: Data Enrichment
Combine multiple sources to enrich data:
async function enrichCompanyData(companies: Company[]) {
  for (const company of companies) {
    // Search for company
    await browser.open({
      url: `https://search.example.com?q=${encodeURIComponent(company.name)}`
    });
    const snapshot = await browser.snapshot();
    // Extract additional info
    company.website = extractWebsite(snapshot.content);
    company.description = extractDescription(snapshot.content);
    company.industry = extractIndustry(snapshot.content);
    // Visit company website for more details
    if (company.website) {
      await browser.open({ url: company.website });
      const siteSnapshot = await browser.snapshot();
      company.contactEmail = extractEmail(siteSnapshot.content);
    }
  }
  return companies;
}
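The `extractEmail` helper can be a simple regex scan over the page text. A minimal sketch; note that sites which obfuscate addresses (e.g. `name [at] example.com`) or render them as images will slip past this:

```typescript
// Pull the first plausible email address out of page text.
// Returns null when none is found.
function extractEmail(content: string): string | null {
  const match = content.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/);
  return match ? match[0] : null;
}
```

Returning `null` rather than throwing lets the enrichment loop record "unknown" and move on to the next company.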
Advanced Patterns
Using Chrome Profile for Authenticated Sessions
Many use cases require staying logged in. Use the chrome profile to reuse your existing sessions:
// One-time setup: attach Chrome tab with Browser Relay extension
// Then use chrome profile in automation
await browser.snapshot({ profile: "chrome" });
// Agent operates on your already-logged-in session
await browser.act({ kind: "click", ref: authenticatedActionRef });
This is perfect for:
- Social media automation
- Internal dashboards
- Enterprise apps with SSO
- Any site with complex auth
Combining Browser with APIs
Sometimes you need both:
async function hybridScrapeAndAPI(productId: string) {
  // Get public data from API
  const response = await fetch(`https://api.store.com/products/${productId}`);
  const apiData = await response.json();
  // Get data only available on web page (reviews, Q&A)
  await browser.open({ url: `https://store.com/product/${productId}` });
  const snapshot = await browser.snapshot();
  const reviews = extractReviews(snapshot.content);
  return {
    ...apiData,
    reviews
  };
}
Handling CAPTCHAs
CAPTCHAs are designed to block automation, and there is no reliable way to solve them programmatically (nor should your agent try). Practical options:
- Use the chrome profile so the agent runs in an already-authenticated, trusted session, which tends to trigger fewer challenges in the first place.
- Slow down. Realistic delays and respect for rate limits make anti-bot systems less likely to escalate to a CAPTCHA.
- Detect the challenge, notify a human to solve it, then resume automation.
- Prefer an official API when one exists; a CAPTCHA is a strong signal the site doesn't want automated access.
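One workable pattern is detecting the challenge and pausing for a human rather than trying to solve it. The keyword list here is illustrative and should be tuned to the sites you actually target:

```typescript
// Heuristic check for a CAPTCHA challenge in snapshot text.
// The signal phrases are examples, not an exhaustive list.
function looksLikeCaptcha(content: string): boolean {
  const signals = ["captcha", "verify you are human", "i'm not a robot"];
  const lower = content.toLowerCase();
  return signals.some((signal) => lower.includes(signal));
}
```

In a monitoring loop, a `looksLikeCaptcha` hit would trigger a notification to a human and poll until the page content no longer matches, then resume.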
Error Handling
Always handle failures gracefully:
async function robustScrape(url: string) {
  try {
    await browser.open({ url, timeoutMs: 30000 });
  } catch (error) {
    console.error("Failed to load page:", error);
    return null;
  }
  try {
    const snapshot = await browser.snapshot();
    return extractData(snapshot.content);
  } catch (error) {
    console.error("Failed to extract data:", error);
    // Take screenshot for debugging
    const screenshot = await browser.screenshot({ type: "png" });
    await saveDebugScreenshot(screenshot);
    return null;
  }
}
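Many failures here are transient: a slow load or flaky network often succeeds on a second attempt. A small retry-with-backoff wrapper, shown as a generic sketch, keeps that logic out of the scraping code itself:

```typescript
// Retry an async operation with exponential backoff.
// Re-throws the last error once attempts are exhausted.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Waits baseDelayMs, then 2x, 4x, ... between attempts
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError;
}
```

Usage would look like `const snapshot = await withRetry(() => browser.snapshot());`, wrapping only the steps that are safe to repeat.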
Best Practices
1. Respect Rate Limits
Don't hammer websites. Add delays:
for (const url of urls) {
  await scrapeUrl(url);
  await sleep(2000); // 2 second delay between requests
}
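The `sleep` helper used throughout these examples isn't built into JavaScript; it's a one-liner over `setTimeout`:

```typescript
// Promise-based delay: `await sleep(2000)` pauses for about 2 seconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```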
2. Use Selectors Wisely
Prefer semantic selectors over brittle ones:
// Good: semantic, stable
selector: "button[aria-label='Submit application']"
// Bad: fragile, breaks easily
selector: "div.container > div:nth-child(3) > button.btn-primary"
3. Take Screenshots for Debugging
When something fails, a screenshot is worth a thousand logs:
try {
  await riskyStep(); // whatever automation step might fail
} catch (error) {
  const screenshot = await browser.screenshot({ fullPage: true });
  await fs.writeFile(`./debug-${Date.now()}.png`, screenshot);
  throw error;
}
4. Keep Sessions Alive
For long-running monitors, reuse browser instances:
// Don't open/close browser each time
for (let i = 0; i < iterations; i++) {
  await browser.open({ url: urls[i] }); // Reuses same browser
  await processPage();
}
5. Log Everything
For autonomous agents, logging is critical:
console.log(`[${new Date().toISOString()}] Navigating to ${url}`);
const snapshot = await browser.snapshot();
console.log(`[${new Date().toISOString()}] Snapshot captured, ${snapshot.content.length} chars`);
Putting It All Together
Here's a real-world agent that monitors competitor prices:
async function competitorPriceMonitor() {
  const competitors = [
    { name: "CompetitorA", url: "https://competitor-a.com/product/123" },
    { name: "CompetitorB", url: "https://competitor-b.com/item/456" }
  ];
  const results = [];
  for (const competitor of competitors) {
    try {
      await browser.open({ url: competitor.url });
      await browser.act({ kind: "wait", textGone: "Loading" });
      const snapshot = await browser.snapshot();
      const price = extractPrice(snapshot.content);
      results.push({
        competitor: competitor.name,
        price,
        timestamp: new Date().toISOString()
      });
      console.log(`${competitor.name}: ${price}`);
      await sleep(3000); // Polite delay
    } catch (error) {
      console.error(`Failed to check ${competitor.name}:`, error);
    }
  }
  // Save to database
  await savePriceHistory(results);
  // Alert if prices changed significantly
  await checkForPriceChanges(results);
}
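`checkForPriceChanges` needs a definition of "significantly". One simple approach, sketched here with hypothetical shapes for the stored data, is a percentage threshold against the previous observation:

```typescript
// Percent change from a previous price to a current one.
// Returns 0 when there's no meaningful baseline to compare against.
function percentChange(previous: number, current: number): number {
  if (previous === 0) return 0;
  return ((current - previous) / previous) * 100;
}

// Flag competitors whose price moved more than thresholdPct percent.
// Competitors without a prior observation are skipped (no baseline).
function significantChanges(
  previous: Record<string, number>,
  current: Record<string, number>,
  thresholdPct = 5
): string[] {
  return Object.keys(current).filter((name) => {
    const before = previous[name];
    if (before === undefined) return false;
    return Math.abs(percentChange(before, current[name])) > thresholdPct;
  });
}
```

With the price history in the database, `checkForPriceChanges` would load the last run's prices, call `significantChanges`, and alert on whatever it returns.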
Conclusion
Browser automation unlocks a huge range of capabilities for AI agents. Whether you're scraping data, filling forms, monitoring changes, or testing applications, OpenClaw's browser integration makes it straightforward.
The key is choosing the right tool for the job:
- API available? Use the API.
- Simple data extraction? Try web_fetch first.
- Complex interaction needed? Use browser automation.
Start small with a single use case, get it working reliably, then expand. Browser automation is powerful, but it's also more fragile than APIs. Build in error handling, logging, and monitoring from day one.
Your agents can now interact with any web interface. The entire internet just became your API.