Browser Use

What it is

Browser Use is an open-source framework that allows LLMs to interact with real browsers, enabling them to perform web-based tasks like form-filling, scraping, and application navigation.

What problem it solves

It bridges the gap between static scraping (which fails on dynamic, JS-heavy sites) and manual browser automation, allowing agents to "see" and "interact" with the web just like a human would.

Where it fits in the stack

Infrastructure / Framework. It provides the interface for agents to drive browsers via Playwright or similar drivers.

Typical use cases

Complex Scraping: Extracting data from authenticated or multi-step web processes.
Workflow Automation: Automating tasks on web apps that lack official APIs.
Agent Testing: Verifying browser-based agent behaviors.

Example company use cases

Finance ops: log into a supplier portal, download monthly statements, and hand them to a document pipeline.
Lead generation: gather structured data from directories or websites that do not expose a practical API.
QA and support: reproduce user-reported UI issues or verify that a browser-based workflow still works after changes.

Example workflow shape

Find target page -> authenticate -> navigate multi-step flow -> extract result -> store structured output
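The workflow shape above can be sketched as an ordered list of steps that stops at the first failure and emits structured output. This is only an illustration: the step functions, the `StepResult` record, and the example URL and file name are hypothetical stand-ins, not part of Browser Use. In a real run, each step would be carried out by the agent in the browser.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StepResult:
    step: str
    ok: bool
    detail: str


def run_workflow(steps):
    """Run (name, function) pairs in order and stop at the first failure."""
    results = []
    for name, fn in steps:
        try:
            results.append(StepResult(name, True, fn()))
        except Exception as exc:
            results.append(StepResult(name, False, str(exc)))
            break
    return results


# Hypothetical stand-ins for work the agent would do in the browser.
def find_target_page():
    return "https://example.com/portal"


def authenticate():
    return "session established"


def navigate_flow():
    return "reached statements page"


def extract_result():
    return "statement_2024_01.pdf"


STEPS = [
    ("find target page", find_target_page),
    ("authenticate", authenticate),
    ("navigate multi-step flow", navigate_flow),
    ("extract result", extract_result),
]

results = run_workflow(STEPS)
# "Store structured output": here the results are simply serialized to JSON.
output = json.dumps([asdict(r) for r in results], indent=2)
print(output)
```

Recording each step separately makes it easier to see where a run broke, which matters because browser workflows tend to fail partway through rather than all at once.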

Strengths

Native MCP Support: Can be used as an MCP server with Claude Desktop.
High Success Rate: High reported accuracy on web-agent benchmarks such as WebVoyager.
Multi-LLM: Works with the major LLM providers through standard client integrations.
Active Community: Large and fast-growing GitHub following (78k+ stars at the time of writing).

Limitations

Overhead: Driving a real browser is slower and more resource-intensive than calling an API.
Cost: Vision-based or detailed DOM-reasoning tasks consume many tokens, because page state is sent to the model at each step.
Fragility: Major UI changes can still break a workflow, although it is more robust than automation built on hard-coded XPath selectors.
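To make the cost point concrete, a rough back-of-the-envelope estimate can be sketched as below. Every number here (step count, tokens per step, prices per million tokens) is a made-up placeholder for illustration, not a measured Browser Use figure or a real provider price. Substitute values from your own runs and your provider's price list.

```python
def estimate_run_cost(steps, input_tokens_per_step, output_tokens_per_step,
                      usd_per_m_input, usd_per_m_output):
    """Estimate the LLM cost of one agent run in USD."""
    input_cost = steps * input_tokens_per_step * usd_per_m_input / 1_000_000
    output_cost = steps * output_tokens_per_step * usd_per_m_output / 1_000_000
    return round(input_cost + output_cost, 4)


# Hypothetical run: 20 steps, 5,000 input and 300 output tokens per step.
cost = estimate_run_cost(
    steps=20,
    input_tokens_per_step=5_000,
    output_tokens_per_step=300,
    usd_per_m_input=2.50,    # placeholder price
    usd_per_m_output=10.00,  # placeholder price
)
print(f"Estimated cost per run: ${cost:.2f}")
```

With these placeholder inputs, the input side dominates (100,000 input tokens against 6,000 output tokens), which is why reducing the page state sent per step usually matters more than shortening the model's replies.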

When to use it

When an application has no API but still needs to be automated.
When in-depth web research requires multi-tab navigation or interactive sessions.

When not to use it

When a fast, stable REST API is available for the same task.
For high-frequency, low-latency data extraction.

Selection comments

Prefer APIs first, browser automation second.
Browser Use is strongest when the workflow is interactive, stateful, and human-like.
Pair it with n8n for scheduling and retries, and with mem0 if the agent must remember prior interactions.
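The "APIs first, browser automation second" rule, combined with retries, can be sketched as a small fallback wrapper. This is a minimal in-process illustration: `with_retries`, `fetch_report`, and the two fetch callables are hypothetical names, and in practice an orchestrator such as n8n would own scheduling and retry policy instead of this loop.

```python
import time


def with_retries(fn, attempts=3, delay_s=0.0):
    """Call fn until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


def fetch_report(api_fetch, browser_fetch):
    """Prefer the API; fall back to the browser path if the API keeps failing."""
    if api_fetch is not None:
        try:
            return "api", with_retries(api_fetch)
        except RuntimeError:
            pass  # API exhausted its retries; use the slower browser path
    return "browser", with_retries(browser_fetch)


calls = {"api": 0, "browser": 0}


def failing_api():
    # Stand-in for an API that is down or missing the needed endpoint.
    calls["api"] += 1
    raise ConnectionError("API unavailable")


def browser_path():
    # Stand-in for a Browser Use agent run that returns the same data.
    calls["browser"] += 1
    return {"report": "monthly.pdf"}


source, data = fetch_report(failing_api, browser_path)
print(source, data, calls)
```

Returning the source alongside the data lets downstream steps log how often the browser fallback is actually used, which is a useful signal for deciding whether the API integration is worth fixing.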

Getting started

Installation

pip install browser-use

Basic Usage

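import asyncio

from browser_use import Agent
# Needs the langchain-openai package and OPENAI_API_KEY set in the environment.
# LLM import paths have changed between browser-use releases; check your installed version.
from langchain_openai import ChatOpenAI


async def main():
    agent = Agent(
        task="Go to Hacker News and find the top story about AI.",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    # run() drives the browser until the task finishes and returns the run history
    result = await agent.run()
    print(result)


if __name__ == "__main__":
    asyncio.run(main())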

CLI examples

# Run a simple browser-use task from the CLI
python -m browser_use "Search for the latest news on SpaceX"

# Start the browser-use web UI for interactive task creation
python -m browser_use --ui

# List all available browser-use agent configurations
python -m browser_use --list-agents

API examples

import asyncio

from browser_use import Agent, Browser, BrowserConfig
from langchain_anthropic import ChatAnthropic

# Run with a visible (non-headless) browser window, useful for watching or debugging a run
browser = Browser(config=BrowserConfig(headless=False))

agent = Agent(
    task="Log in to my dashboard and download the last 3 reports",
    llm=ChatAnthropic(model="claude-3-5-sonnet-20240620"),
    browser=browser,
)


async def run_task():
    try:
        history = await agent.run()
        # Assumes the returned history object exposes its per-step entries as .history
        print(f"Task completed. Steps taken: {len(history.history)}")
    finally:
        # Close the browser even if the run fails
        await browser.close()


asyncio.run(run_task())
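
Because a browser-driven run can stall on a slow or unexpected page, it can help to put an upper bound on its duration. The sketch below uses only the standard library's `asyncio.wait_for`; `run_with_timeout` and the two fake coroutines are hypothetical stand-ins for a call such as `agent.run()`, so the example runs without a browser or an API key.

```python
import asyncio


async def run_with_timeout(coro_factory, timeout_s):
    """Await the coroutine, returning None if it exceeds timeout_s seconds."""
    try:
        return await asyncio.wait_for(coro_factory(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the underlying task before raising
        return None


async def fast_agent_run():
    # Stand-in for an agent run that finishes quickly.
    await asyncio.sleep(0.01)
    return "done"


async def slow_agent_run():
    # Stand-in for an agent run that hangs on a page.
    await asyncio.sleep(1.0)
    return "done"


fast = asyncio.run(run_with_timeout(fast_agent_run, timeout_s=0.5))
slow = asyncio.run(run_with_timeout(slow_agent_run, timeout_s=0.05))
print(fast, slow)
```

A timed-out run should still release its browser, so in real code the timeout wrapper belongs inside a `try`/`finally` that closes the browser.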
