# browser - Browser

> Control a browser to access and interact with web pages

Control a Chromium browser for web navigation, element interaction and content extraction. Supports JavaScript-rendered pages and uses a compact DOM snapshot so the Agent can efficiently understand page structure.

## Installation

<Tabs>
  <Tab title="CLI install (recommended)">
    ```bash theme={null}
    cow install-browser
    ```

    This command will:

    * Install the `playwright` Python package (with auto-fallback for older systems)
    * Install system dependencies on Linux
    * Download the Chromium browser (Linux servers automatically use the headless build)
    * Detect China-mainland networks and use mirror acceleration
  </Tab>

  <Tab title="Manual install">
    ```bash theme={null}
    pip install playwright
    playwright install chromium
    ```

    On Linux servers, install system dependencies as well:

    ```bash theme={null}
    sudo playwright install-deps chromium
    ```

    On older systems (e.g. Ubuntu 18.04, glibc \< 2.28), install a compatible version:

    ```bash theme={null}
    pip install playwright==1.28.0
    python -m playwright install chromium
    ```

    To accelerate the Chromium download from China:

    ```bash theme={null}
    export PLAYWRIGHT_DOWNLOAD_HOST=https://registry.npmmirror.com/-/binary/playwright
    python -m playwright install chromium
    ```
  </Tab>
</Tabs>

<Note>
  1. Supported on Ubuntu 20.04+, Debian 10+, macOS and Windows. With the CLI install, older systems such as Ubuntu 18.04 fall back to a compatible Playwright version automatically.
  2. The browser tool has heavy dependencies (\~300MB) and is optional. For lightweight web content retrieval, use the `web_fetch` tool.
</Note>
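The version choice in the manual-install notes can be sketched in Python. This is only an illustration of the glibc 2.28 threshold and the `1.28.0` pin mentioned above; the helper name `playwright_requirement` is hypothetical, and the check that `cow install-browser` actually performs is not documented here.

```python
import platform


def playwright_requirement(libc: str, version: str) -> str:
    """Return the pip requirement suggested by the manual-install notes."""
    if libc == "glibc" and version:
        # Compare (major, minor) against the 2.28 threshold from the notes.
        parts = tuple(int(part) for part in version.split(".")[:2])
        if parts < (2, 28):
            return "playwright==1.28.0"
    # Non-glibc platforms (macOS, Windows) and newer glibc use the latest release.
    return "playwright"


# platform.libc_ver() returns empty strings where glibc is not detected.
libc, version = platform.libc_ver()
print(playwright_requirement(libc, version))
```

Passing the printed value to `pip install` gives the same result as choosing between the two manual-install commands by hand.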

## Workflow

A typical browser workflow for the Agent:

1. **`navigate`** — Open the target URL
2. **`snapshot`** — Get a compact DOM with auto-numbered interactive elements (`ref`)
3. **`click` / `fill` / `select`** — Operate elements by `ref`
4. **`snapshot`** — Snapshot again to verify the result

## Supported Actions

| Action             | Description                            | Key parameters                   |
| ------------------ | -------------------------------------- | -------------------------------- |
| `navigate`         | Open URL                               | `url`                            |
| `snapshot`         | Get structured page text (primary way) | `selector` (optional)            |
| `click`            | Click an element                       | `ref` or `selector`              |
| `fill`             | Fill text into an input                | `ref` or `selector`, `text`      |
| `select`           | Select a dropdown option               | `ref` or `selector`, `value`     |
| `scroll`           | Scroll the page                        | `direction` (up/down/left/right) |
| `screenshot`       | Save a screenshot to the workspace     | `full_page`                      |
| `wait`             | Wait for an element or timeout         | `selector`, `timeout`            |
| `press`            | Press a key (Enter, Tab, etc.)         | `key`                            |
| `back` / `forward` | Browser back / forward                 | -                                |
| `get_text`         | Get an element's text content          | `selector`                       |
| `evaluate`         | Run JavaScript                         | `script`                         |
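To make the workflow concrete, the sketch below writes a login sequence as a list of action payloads and checks each one against the key parameters in the table. The `action` key, the dict layout, the integer `ref` values and the `missing_params` helper are assumptions for illustration; the exact call format the Agent uses is not specified on this page.

```python
from __future__ import annotations

# Required parameters per action, taken from the Supported Actions table.
# Each tuple is a group of alternatives: any one of its keys satisfies it.
REQUIRED_PARAMS = {
    "navigate": [("url",)],
    "snapshot": [],
    "click": [("ref", "selector")],
    "fill": [("ref", "selector"), ("text",)],
    "select": [("ref", "selector"), ("value",)],
}


def missing_params(step: dict) -> list[str]:
    """Describe each required parameter group that the step lacks."""
    groups = REQUIRED_PARAMS[step["action"]]
    return [" or ".join(group) for group in groups if not any(k in step for k in group)]


# Hypothetical login flow; real refs would come from the preceding snapshot.
workflow = [
    {"action": "navigate", "url": "https://example.com/login"},
    {"action": "snapshot"},
    {"action": "fill", "ref": 3, "text": "alice"},
    {"action": "click", "ref": 5},
    {"action": "snapshot"},
]

for step in workflow:
    problems = missing_params(step)
    print(step["action"], "ok" if not problems else f"missing: {problems}")
```

The second `snapshot` matters because `ref` numbers are assigned per snapshot, so a page change after `click` is only visible once the page is snapshotted again.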

## Use Cases

* Access a URL to retrieve dynamic page content
* Fill in forms and log in
* Operate web elements (click buttons, select options, etc.)
* Verify that a deployed web page renders and behaves as expected
* Scrape content that requires JS rendering

## Run Mode

The browser picks a mode based on the runtime environment:

| Environment                  | Mode                            |
| ---------------------------- | ------------------------------- |
| macOS / Windows              | Headed (browser window visible) |
| Linux desktop (with DISPLAY) | Headed                          |
| Linux server (no DISPLAY)    | Headless                        |

You can override it in `config.json`:

```json theme={null}
{
  "tools": {
    "browser": {
      "headless": true
    }
  }
}
```
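The selection rule can be written as a small function: an explicit `headless` value in `config.json` wins, otherwise the table above applies. This is a sketch of the documented behavior, not the tool's actual code; the function name `resolve_headless` and its arguments are hypothetical.

```python
import os
import sys


def resolve_headless(config: dict, platform: str = sys.platform, env=os.environ) -> bool:
    """Pick headless mode: explicit config first, then the environment table."""
    override = config.get("tools", {}).get("browser", {}).get("headless")
    if override is not None:
        return bool(override)
    # Only Linux without a DISPLAY variable defaults to headless.
    return platform.startswith("linux") and not env.get("DISPLAY")


# A Linux server with no DISPLAY and no override runs headless.
print(resolve_headless({}, platform="linux", env={}))
```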

## Persistent Login

**Log in to a target site once and the Agent can keep using that session.** Two approaches are supported:

### Option 1: Persistent mode (default)

Works out of the box. Login state is saved under `~/.cow/browser_profile`. No configuration needed.

To disable persistence and start with a clean environment every time:

```json theme={null}
{
  "tools": {
    "browser": {
      "persistent": false
    }
  }
}
```
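In Playwright terms, the difference between the two settings roughly corresponds to a persistent context versus a fresh one. The sketch below assumes the tool maps `persistent` onto Playwright's `launch_persistent_context`, which this page does not confirm; `open_context` and `demo` are hypothetical names.

```python
from pathlib import Path

# Profile location named in the text above.
DEFAULT_PROFILE = Path.home() / ".cow" / "browser_profile"


def open_context(playwright, persistent=True, profile_dir=DEFAULT_PROFILE, headless=True):
    """Return a Playwright browser context, persistent or throwaway."""
    if persistent:
        # Cookies and local storage are written to profile_dir and reloaded next run.
        profile_dir.mkdir(parents=True, exist_ok=True)
        return playwright.chromium.launch_persistent_context(
            str(profile_dir), headless=headless
        )
    # A fresh context keeps nothing once it is closed.
    return playwright.chromium.launch(headless=headless).new_context()


def demo():
    """Example use; requires the playwright package and a Chromium download."""
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        context = open_context(p)
        page = context.new_page()
        page.goto("https://example.com")
        print(page.title())
        context.close()
```

Calling `demo()` twice with `persistent=True` would reuse any cookies set during the first run, which is what keeps a login alive between Agent sessions.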

### Option 2: CDP mode (attach to real Chrome)

Have the Agent connect to a separately launched copy of real Chrome (instead of the Chromium bundled with Playwright), so that sites see a genuine browser fingerprint. This is useful for sites with strict bot detection.

Launch Chrome with a debugging port and a dedicated user data directory:

<Tabs>
  <Tab title="macOS">
    ```bash theme={null}
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
      --remote-debugging-port=9222 \
      --user-data-dir="$HOME/.cow/chrome-cdp"
    ```
  </Tab>

  <Tab title="Linux">
    ```bash theme={null}
    google-chrome \
      --remote-debugging-port=9222 \
      --user-data-dir="$HOME/.cow/chrome-cdp"
    ```
  </Tab>

  <Tab title="Windows">
    ```powershell theme={null}
    & "C:\Program Files\Google\Chrome\Application\chrome.exe" `
      --remote-debugging-port=9222 `
      --user-data-dir="$env:USERPROFILE\.cow\chrome-cdp"
    ```
  </Tab>
</Tabs>

Then point the Agent at the endpoint in `config.json`:

```json theme={null}
{
  "tools": {
    "browser": {
      "cdp_endpoint": "http://localhost:9222"
    }
  }
}
```
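Before starting the Agent, it can help to confirm that Chrome is actually listening on the configured endpoint. The sketch below queries `/json/version`, an HTTP route that Chrome serves on the remote debugging port; the helper names `version_url` and `probe` are hypothetical and are not part of the browser tool.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def version_url(cdp_endpoint: str) -> str:
    """Build the /json/version URL for a CDP HTTP endpoint."""
    return urljoin(cdp_endpoint.rstrip("/") + "/", "json/version")


def probe(cdp_endpoint: str, timeout: float = 3.0) -> dict:
    """Fetch browser metadata; raises URLError/OSError if Chrome is not listening."""
    with urlopen(version_url(cdp_endpoint), timeout=timeout) as resp:
        return json.load(resp)


print(version_url("http://localhost:9222"))
```

With Chrome running as launched above, `probe("http://localhost:9222")` returns a dictionary whose `Browser` and `webSocketDebuggerUrl` fields identify the instance. A connection error usually means the port or the `--user-data-dir` flag is missing from the launch command.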

<Note>
  Chrome 137+ requires `--remote-debugging-port` to be paired with a dedicated `--user-data-dir`. As a result, the CDP-launched Chrome **cannot directly reuse the login state of your everyday Chrome profile**; you'll need to log in once inside this dedicated profile.
</Note>
