# browser-cli
Control your real, running browser from the terminal or the Python SDK: no headless browser, no Playwright, no virtual display. Your actual open tabs, windows, and tab groups respond to your commands.

---

## What it does
You have 40 tabs open. You want to close all the duplicates, group the GitHub ones, save your session before a meeting, and open a few URLs into a specific group, all from a script. That is what browser-cli is for.

It works by pairing a small browser extension with a Python package that provides both a CLI and SDK. The extension has full access to your browser's tabs, windows, groups, and page DOM. The CLI and SDK talk to it in real time over a local IPC channel.

---

## How it works
```
terminal / python script
        │
        │  Local IPC (Unix socket on Linux/macOS, named pipe on Windows)
        ▼
Native Messaging Host (Python process, launched by the browser)
        │
        │  Native Messaging Protocol (stdin/stdout, 4-byte length prefix + JSON)
        ▼
Browser Extension (background worker/page)
        │
        │  extension APIs
        ▼
Your running browser
```

1. The extension calls `chrome.runtime.connectNative('com.browsercli.host')` on startup.
2. The browser launches the native host Python process (registered in the OS).
3. The native host opens a local IPC endpoint for the CLI.
4. CLI commands connect to that endpoint, send a JSON command, and wait for the result.
5. The native host relays the command to the extension via stdout, receives the result via stdin, and sends it back to the CLI.

No server needs to be running beforehand. The browser manages the native host's lifecycle.

**Message format**

Every command is a JSON object:
```json
{ "id": "uuid", "command": "tabs.list", "args": {} }
```
Every response:
```json
{ "id": "uuid", "success": true, "data": [...] }
```

---

## Installation

**Requirements:** Python 3.10+, [uv](https://github.com/astral-sh/uv), Chrome, Chromium, Brave, Edge, Vivaldi, or Firefox

### Install with uv
Once published on PyPI, install the CLI as a uv tool:

```sh
uv tool install real-browser-cli
browser-cli --version
browser-cli install brave   # or: chrome, chromium, edge, vivaldi, firefox
```

The PyPI package is named `real-browser-cli`; the installed command is still `browser-cli`.

For better remote-response compression, install the optional `fast` extra:

```sh
uv tool install "real-browser-cli[fast]"
```

To upgrade later:

```sh
uv tool upgrade real-browser-cli
```

### Install from source
```sh
git clone <repo>
cd browser-cli
uv sync
uv run browser-cli install brave   # or: chrome, chromium, edge, vivaldi, firefox
```

The `install` command will:
1. Ask you to load the browser-specific extension package
2. Show the stable extension ID used by that browser family
3. Write the native messaging manifest to your OS so the browser can find the host
4. Copy the native host into an internal `libexec` directory and create a small wrapper outside your `PATH`

After install, **fully restart your browser** (quit and reopen, not just close the window). The extension will connect to the native host automatically on startup.

Only the `browser-cli` command needs to be on your `PATH`. The browser launches the native host wrapper directly from its absolute path in the native messaging manifest, and that wrapper imports the installed `browser_cli.native.host` entry point. On Windows the install command also registers the host in the current user's Registry for the selected browser.

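For orientation, here is a sketch of what a native messaging manifest generally contains. The keys follow the Chromium and Firefox native messaging formats (`allowed_origins` for Chromium-family browsers, `allowed_extensions` for Firefox). The host path, description, and extension IDs are placeholders, and `build_manifest` is a hypothetical helper, not the installer's code.

```python
import json


def build_manifest(host_path: str, extension_id: str, firefox: bool = False) -> dict:
    """Build a native messaging manifest for the com.browsercli.host name."""
    manifest = {
        "name": "com.browsercli.host",
        "description": "browser-cli native messaging host",  # placeholder text
        "path": host_path,  # absolute path to the host wrapper
        "type": "stdio",
    }
    if firefox:
        # Firefox lists add-on IDs.
        manifest["allowed_extensions"] = [extension_id]
    else:
        # Chromium-family browsers list extension origins.
        manifest["allowed_origins"] = [f"chrome-extension://{extension_id}/"]
    return manifest


chromium_manifest = build_manifest("/path/to/host-wrapper", "EXTENSION_ID_PLACEHOLDER")
firefox_manifest = build_manifest("/path/to/host-wrapper", "addon@example.invalid", firefox=True)
print(json.dumps(chromium_manifest, indent=2))
```
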
---

## Project structure

```text
browser-cli/
├── browser_cli/
│   ├── __init__.py          # Python SDK: BrowserCLI class and SDK entry point
│   ├── cli.py               # Click CLI entry point
│   ├── client/              # Client-side command routing used by CLI and SDK
│   │   ├── core.py          # send_command and remote command routing
│   │   ├── targets.py       # Browser target discovery and socket resolution
│   │   ├── auth.py          # Remote auth fields and key lookup
│   │   └── messages.py      # Request/response helpers
│   ├── models.py            # Tab and Group helper models
│   ├── native/              # Native messaging host internals
│   │   ├── host.py          # Browser-launched native host entry point
│   │   ├── local_server.py  # Local CLI IPC server
│   │   └── protocol.py      # Chrome Native Messaging framing
│   ├── remote/              # Client-side remote browser support
│   │   ├── transport.py     # TCP/TLS remote transport
│   │   └── registry.py      # Saved remote endpoints/keys
│   └── commands/
│       ├── navigate.py      # nav open/reload/back/forward/focus
│       ├── search.py        # search engine shortcuts
│       ├── tabs.py          # tab management
│       ├── groups.py        # tab group management
│       ├── windows.py       # window management
│       ├── dom.py           # DOM querying and interaction
│       ├── extract.py       # content extraction
│       └── session.py       # session save/load
├── extension/
│   ├── manifest.json        # MV3 extension manifest
│   ├── content.js           # Content-script helpers
│   └── src/                 # TypeScript source split by command area
│       ├── index.ts         # Builds generated extension/background.js
│       └── content/         # Builds generated extension/content-dispatch.js
├── examples/
│   ├── demo.py              # Python SDK walkthrough
│   └── demo.sh              # Bash CLI walkthrough
├── tests/
│   ├── conftest.py          # shared pytest fixtures
│   ├── test_api.py
│   ├── test_cli.py
│   ├── test_dom.py
│   ├── test_extract.py
│   ├── test_groups.py
│   ├── test_nav.py
│   ├── test_session.py
│   ├── test_tabs.py
│   └── test_windows.py
├── com.browsercli.host.json # native messaging manifest template
├── pyproject.toml           # package metadata and CLI entry point
└── uv.lock                  # locked dependencies for uv
```

---

## CLI reference

Commands take the form `browser-cli [--browser ALIAS] <command>`. From a source checkout, prefix them with `uv run`.

If exactly one browser instance is connected, commands auto-target it. Use `--browser ALIAS` when multiple browser instances are connected. `tabs list`, `tabs count`, `groups list`, `groups count`, `windows list`, and `session list` aggregate across all active browsers when `--browser` is omitted; in that mode they show the source browser alias or UUID. You can inspect the active instances with `browser-cli clients` and assign a persistent profile alias to a browser instance with `browser-cli clients rename --browser <current-alias> <new-alias>`. Closed browsers are removed from the client registry automatically.

Important: profile aliases are browser-instance aliases, not window aliases. Window aliases created with `windows rename` are only for targeting windows in commands like `nav open --window work`. If a browser instance has no explicit profile alias set, the native host gives it a generated UUID alias so multiple unaliased browsers stay distinct.

### Navigation (`nav`)

```sh
# Open a URL (no focus stealing by default)
browser-cli nav open https://example.com
browser-cli nav open https://example.com --focus            # bring opened tab/window forward
browser-cli nav open https://example.com --window work      # into a named window
browser-cli nav open https://example.com --group research   # into a tab group (name or ID)

# Reload
browser-cli nav reload             # reload active tab
browser-cli nav reload 1234        # reload tab by ID
browser-cli nav hard-reload        # bypass cache

# Navigate history
browser-cli nav back
browser-cli nav forward 1234       # forward in specific tab

# Jump to a tab by URL pattern
browser-cli nav focus github       # focuses first tab whose URL contains "github"
```

### Search

Each search command opens the search results in your browser using the same flags as `nav open`.

```sh
browser-cli search google openai api
browser-cli search brave rust iterators
browser-cli search ddg tab groups --window work
browser-cli search youtube browser automation
browser-cli search yt lo fi
browser-cli search spotify aphex twin
browser-cli search amazon mechanical keyboard
browser-cli search ecosia native messaging
browser-cli search furaffinity dragons
browser-cli search fa dragons
browser-cli search bing browser cli
browser-cli search github browser-cli
browser-cli search wikipedia native messaging
browser-cli search wiki native messaging
browser-cli search reddit chrome extensions
browser-cli search stackoverflow click choices
browser-cli search so click choices
```

### Tabs

```sh
browser-cli tabs list                    # list all open tabs (all windows)
browser-cli tabs count                   # count all tabs
browser-cli tabs count youtube           # count tabs matching URL pattern
browser-cli tabs filter youtube          # list tabs matching URL pattern
browser-cli tabs query "pull request"    # search tabs by URL or title

browser-cli tabs active 1234             # switch browser focus to tab
browser-cli tabs html                    # print full HTML of active tab
browser-cli tabs html 1234               # print HTML of specific tab

browser-cli tabs close 1234              # close specific tab
browser-cli tabs close --inactive        # close all inactive tabs
browser-cli tabs close --duplicates      # close duplicate URLs (keep first)
browser-cli tabs dedupe                  # same as close --duplicates

browser-cli tabs move 1234 --window 2    # move tab to another window
browser-cli tabs move 1234 --group 42    # move tab into a group

browser-cli tabs sort --by domain        # sort tabs within each window
browser-cli tabs sort --by title
browser-cli tabs sort --by time

browser-cli tabs merge-windows           # pull all tabs into the current window
```

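To make "close duplicate URLs (keep first)" and "sort by domain" concrete, here is a sketch over plain dictionaries. It assumes tabs compare by exact URL string and sort by hostname; the extension's real matching and ordering rules may differ, and the `id`/`url` keys and function names are illustrative.

```python
from urllib.parse import urlsplit


def duplicate_tab_ids(tabs: list[dict]) -> list[int]:
    """Return IDs of tabs whose URL already appeared earlier in the list."""
    seen = set()
    duplicates = []
    for tab in tabs:
        if tab["url"] in seen:
            duplicates.append(tab["id"])  # later copy: candidate for closing
        else:
            seen.add(tab["url"])  # first occurrence is kept
    return duplicates


def sort_by_domain(tabs: list[dict]) -> list[dict]:
    """Return tabs ordered by hostname; ties keep their original order."""
    return sorted(tabs, key=lambda tab: urlsplit(tab["url"]).hostname or "")


sample_tabs = [
    {"id": 1, "url": "https://b.example/x"},
    {"id": 2, "url": "https://a.example/y"},
    {"id": 3, "url": "https://b.example/x"},
]
to_close = duplicate_tab_ids(sample_tabs)
ordered_ids = [tab["id"] for tab in sort_by_domain(sample_tabs)]
```
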
### Tab groups

```sh
browser-cli groups list                  # list all tab groups
browser-cli groups count                 # count groups
browser-cli groups query "work"          # search groups by name
browser-cli groups tabs 42               # list tabs inside group ID 42

browser-cli groups create "research"                      # create a new group
browser-cli groups add-tab research                       # open a blank tab in the group
browser-cli groups add-tab research https://example.com   # open URL in the group
browser-cli groups add-tab 42 https://example.com         # by group ID

browser-cli groups close 42                  # ungroup the group
browser-cli groups move research --forward   # move group right
browser-cli groups move research --right     # same as --forward
browser-cli groups move research -r          # short right alias
browser-cli groups move 42 --backward        # move group left
browser-cli groups move 42 --left            # same as --backward
browser-cli groups move 42 -l                # short left alias
```

### Windows

```sh
browser-cli windows list                        # list all windows
browser-cli windows open                        # open a new window
browser-cli windows open https://example.com    # open a new window on a URL
browser-cli windows rename 1 "work"             # give a window a local alias
browser-cli windows close 1                     # close a window
```

### DOM

These commands run on the **active tab**. The tab must be on a regular `http://` or `https://` page, not a browser internal page like `brave://newtab`.

```sh
browser-cli dom query "h1"               # return elements matching CSS selector
browser-cli dom text "h1"                # get text content of matching elements
browser-cli dom attr "a" href            # get attribute value from elements
browser-cli dom exists ".modal-banner"   # exits 0 if found, 1 if not
browser-cli dom click ".accept-button"   # click an element
browser-cli dom type "#search" "hello"   # type text into an input
```

### Extract

```sh
browser-cli extract links                # all <a href> links on the page
browser-cli extract images               # all <img> tags (src + alt)
browser-cli extract text                 # all visible text (innerText)
browser-cli extract json "#data"         # parse JSON inside a CSS selector
browser-cli extract html                 # full HTML of the active tab
browser-cli extract markdown             # main page content as Markdown
browser-cli extract markdown --selector "article"   # specific DOM subtree as Markdown
```

### Sessions

A session is a snapshot of all open tab URLs, stored inside the extension via `chrome.storage.local`. Sessions survive browser restarts but are lost if the extension is uninstalled or extension data is cleared.

```sh
browser-cli session save before-meeting        # save current tabs as a named session
browser-cli session load before-meeting        # reopen all saved tabs
browser-cli session list                       # list all saved sessions (name, tab count, date)
browser-cli session remove before-meeting      # delete a saved session
browser-cli session diff session-a session-b   # show which URLs were added / removed
browser-cli session auto-save on               # auto-save after every tab change
browser-cli session auto-save off
```

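Since a session is a list of URLs, `session diff` amounts to a set comparison. The sketch below assumes "added" means present in the second session but not the first, and "removed" the reverse; that direction and the function name are assumptions, not confirmed behaviour.

```python
def diff_sessions(urls_a: list[str], urls_b: list[str]) -> dict:
    """Compare two URL lists, preserving order and dropping repeats."""
    set_a, set_b = set(urls_a), set(urls_b)
    return {
        # dict.fromkeys keeps first-seen order while removing duplicates.
        "added": list(dict.fromkeys(u for u in urls_b if u not in set_a)),
        "removed": list(dict.fromkeys(u for u in urls_a if u not in set_b)),
    }


session_a = ["https://example.com/a", "https://example.com/b"]
session_b = ["https://example.com/b", "https://example.com/c"]
changes = diff_sessions(session_a, session_b)
```
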
### Misc

```sh
browser-cli clients                                  # show connected browser info from the registry
browser-cli clients rename --browser abcd1234 work   # rename one connected browser instance
browser-cli --browser abcd1234 clients rename work   # equivalent global form
browser-cli install brave                            # (re)register the native host
browser-cli completion zsh                           # print setup instructions
browser-cli completion zsh --script                  # output raw completion script
```

---

## Python SDK

```python
from browser_cli import AsyncBrowserCLI, BrowserCLI

b = BrowserCLI()
```

Commands are grouped into namespaces on the client (`b.tabs`, `b.dom`, `b.session`, ...). Each sync call blocks until the browser responds and returns the data directly as a Python object. For asyncio programs, `AsyncBrowserCLI` exposes the same namespaces as native awaitable methods over async Unix/TCP transport.

```python
# Navigation ── b.nav
b.nav.open("https://example.com")
b.nav.open("https://example.com", background=True)
b.nav.open("https://example.com", window="work")
b.nav.reload()
b.nav.hard_reload()
b.nav.back()
b.nav.forward(tab_id=1234)
b.nav.focus("github")
b.nav.to(1234, "https://example.com")  # navigate a specific tab in place
b.nav.search("google", "python asyncio")

# Tabs ── b.tabs
tabs = b.tabs.list()                       # list[Tab]; in multi-browser mode each tab.browser is set
tab = b.tabs.open("https://example.com")   # returns a bound Tab object
tab = b.tabs.open("https://example.com", wait=True, timeout=10)
active = b.tabs.active()                   # active Tab object
tab = b.tabs.get(1234)                     # tab by ID
tab = b.tabs.first("github")               # first matching tab or None
b.tabs.activate(1234)
b.tabs.close(1234)
b.tabs.close(tab_ids=[1, 2, 3])            # close many in one round-trip (IDs or Tab objects)
b.tabs.close_inactive()
b.tabs.close_duplicates()
b.tabs.filter("youtube")                   # list of matching tabs
b.tabs.query("pull request")
counts = b.tabs.count("github")            # int, or BrowserCounts(total=..., by_browser=...) in multi-browser mode
html = b.tabs.html()                       # full HTML string of active tab
b.tabs.sort(by="domain")
b.tabs.merge_windows()
b.tabs.dedupe()

# Bound Tab helpers
tab = b.tabs.active()
tab.pin()
tab.screenshot()
tab.refresh()
tab.wait_for_load(timeout=10)
tab.watch_url(r"/done$")

# Tab groups ── b.groups
groups = b.groups.list()                   # list[Group]; in multi-browser mode each group.browser is set
b.groups.create("research")                # creates group, returns Group
b.groups.close(42)
b.groups.tabs(42)                          # tabs inside a group
b.groups.add_tab(42, "https://example.com")
b.groups.count()                           # int, or BrowserCounts(...) in multi-browser mode

# Windows ── b.windows
windows = b.windows.list()                 # in multi-browser mode each dict has a "browser" key
b.windows.rename(1, "work")
b.windows.open()
b.windows.open("https://example.com")
b.windows.close(1)

# DOM ── b.dom (active tab must be http/https)
elements = b.dom.query("h2")               # list of { tag, text, attrs }
texts = b.dom.text(".article p")           # list of strings
attrs = b.dom.attr("a", "href")            # list of strings
exists = b.dom.exists(".modal-banner")     # bool
b.dom.click(".accept-button")
b.dom.type("#search", "hello world")
b.dom.wait_for("#results", visible=True, timeout=10)
b.dom.eval("document.title")

# Extract ── b.extract
links = b.extract.links()                  # list of { text, href }
images = b.extract.images()                # list of { alt, src }
text = b.extract.text()                    # string
data = b.extract.json("#app-data")         # parsed Python object
md = b.extract.markdown("article")

# Page / storage
info = b.page.info()
b.storage.set("token", "abc")
val = b.storage.get("token")

# Sessions ── b.session
b.session.save("before-meeting")
b.session.load("before-meeting")
sessions = b.session.list()                # [{ name, tabs, savedAt }, ...]
b.session.remove("before-meeting")
diff = b.session.diff("session-a", "session-b")
# diff = { "added": [...urls], "removed": [...urls] }
b.session.auto_save(True)

# Performance + extension
b.perf.status()
b.perf.set_profile("gentle")
b.extension.reload()

# Workflow decorators ── b.decorators
@b.decorators.new_tab("https://example.com", wait=True, close=True)
def scrape(*, tab):
    return b.extract.markdown("article")

@b.decorators.wait_for_selector("#ready", visible=True)
def run_after_page_ready():
    return b.dom.text("#ready")

@b.decorators.performance_profile("ultra")
def restore_big_session():
    return b.session.load("work", lazy=True)

@b.decorators.retry(times=3, delay=1)
@b.decorators.save_session_before("before-risky-step")
def risky_workflow():
    b.tabs.close_duplicates()

# Async SDK: same namespaces, native awaitable methods
async def async_example():
    ab = AsyncBrowserCLI()
    tabs = await ab.tabs.list()

    @ab.decorators.new_tab("https://example.com", wait=True, close=True)
    async def scrape(*, tab):
        return await ab.extract.markdown("article")

    return tabs, await scrape()

# Misc
clients = b.clients()
raw = b.command("tabs.count", {"pattern": "github"})  # escape hatch for raw commands
```

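The `retry(times=3, delay=1)` decorator in the listing above is not explained further. As a rough mental model only, a generic retry decorator with the same parameter names could look like the sketch below. It is not the SDK's implementation, and which exceptions the real decorator retries on is unknown.

```python
import functools
import time


def retry(times: int = 3, delay: float = 1.0):
    """Call the wrapped function up to `times` times, sleeping between failures."""
    if times < 1:
        raise ValueError("times must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == times:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)

        return wrapper

    return decorator


calls = []


@retry(times=3, delay=0)
def flaky():
    """Fail twice, then succeed, to exercise the retry loop."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = flaky()
```
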
**Error handling**

```python
from browser_cli import BrowserCLI, BrowserNotConnected

b = BrowserCLI()
try:
    tabs = b.tabs.list()
except BrowserNotConnected:
    print("Browser is not running or extension is not loaded")
except RuntimeError as e:
    print(f"Browser returned an error: {e}")
```

In multi-browser mode, each tab carries its source browser and counts may come back as a `BrowserCounts` object instead of an `int`:

```python
from browser_cli import BrowserCLI, BrowserCounts

b = BrowserCLI()

tabs = b.tabs.list()
for tab in tabs:
    print(tab.browser, tab.title)

counts = b.tabs.count()
if isinstance(counts, BrowserCounts):
    print(counts.total)
    print(counts.by_browser)
```

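When a script only needs a single number, the two return shapes of `count()` can be folded into one. The helper below is a hypothetical convenience that relies only on the `total` attribute mentioned in the SDK listing; `SimpleNamespace` stands in for a real `BrowserCounts` so the snippet runs without a browser.

```python
from types import SimpleNamespace


def total_count(value) -> int:
    """Return an int whether value is a plain int or has a `.total` attribute."""
    return int(getattr(value, "total", value))


# Stand-in for a multi-browser result; the aliases and numbers are made up.
fake_counts = SimpleNamespace(total=7, by_browser={"work": 4, "home": 3})

single_browser_total = total_count(5)
multi_browser_total = total_count(fake_counts)
```
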
---

## Example scripts

See `examples/demo.py` (Python) and `examples/demo.sh` (Bash) for full walkthroughs covering tabs, groups, DOM extraction, and session management.

```sh
uv run python examples/demo.py
bash examples/demo.sh
```

---

## Development

```sh
npm ci
npm run check:extension   # type-check, build extension bundles, syntax-check bundle
uv run pytest -q
```

On NixOS or hosts without global Node/npm:

```sh
nix-shell                 # automatically runs npm ci when node_modules is missing/outdated
npm run check:extension
```

The extension source lives in `extension/src/`. `extension/background.js` and `extension/content-dispatch.js` are generated and ignored by git. Run `npm run build:extension` before using `Load unpacked` with `extension/`. On NixOS, use `nix-shell` first if npm is not installed globally.

Packaging:

```bash
npm run package:extension            # testing/unpacked zip, keeps manifest.key for stable Chromium native-messaging ID
npm run package:extension:webstore   # Chrome Web Store zip, strips manifest.key
npm run package:extension:firefox    # Firefox zip, strips manifest.key and Firefox-incompatible permissions
```

Chrome Web Store rejects `manifest.key`, so upload the `*-webstore-*` zip from `dist/`. For Firefox, use the `*-firefox-*` zip.

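The manifest transforms themselves are done by the npm scripts. Purely to illustrate the idea in Python, stripping `key` and filtering permissions might look like the sketch below. The function name, the sample manifest, and the `examplePermission` entry are all made up; the README does not say which permissions are removed for Firefox.

```python
import copy


def strip_for_store(manifest: dict, drop_permissions: frozenset[str] = frozenset()) -> dict:
    """Return a copy of the manifest without `key` and without the given permissions."""
    result = copy.deepcopy(manifest)  # leave the source manifest untouched
    result.pop("key", None)
    if drop_permissions and "permissions" in result:
        result["permissions"] = [
            p for p in result["permissions"] if p not in drop_permissions
        ]
    return result


# Illustrative manifest; not the project's real extension/manifest.json.
source = {
    "manifest_version": 3,
    "name": "example-extension",
    "key": "PLACEHOLDER_KEY",
    "permissions": ["tabs", "nativeMessaging", "examplePermission"],
}
webstore = strip_for_store(source)
firefox = strip_for_store(source, frozenset({"examplePermission"}))
```
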
---

## Limitations

- **Browser internal pages** (`chrome://`, `brave://`, `edge://`, `about:`) cannot be scripted. DOM and extract commands only work on regular `http://` and `https://` pages.
- **Multiple browser instances can be auto-distinguished, but generated aliases are temporary**. Unaliased browsers get UUID aliases from the native host, which avoids collisions but is less ergonomic than setting a stable alias with `browser-cli clients rename --browser <current-alias> <new-alias>`.
- **Supported install targets are explicit, not “all Chromium browsers”**. The installer currently supports Chrome, Chromium, Brave, Edge, Vivaldi, and Firefox. Other Chromium-based browsers may use different or shared native messaging manifest locations, so they need browser-specific verification before being added safely.
- **Firefox support is experimental**. Basic tab/window/navigation/native-messaging support is wired up, including tab-group APIs on supported Firefox versions.

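A script can check the first limitation up front before issuing DOM or extract commands. A minimal sketch, assuming the rule is exactly "scheme is `http` or `https`" (`is_scriptable` is a hypothetical helper, not part of the SDK):

```python
from urllib.parse import urlsplit


def is_scriptable(url: str) -> bool:
    """Return True for regular web pages, False for browser internal pages."""
    return urlsplit(url).scheme in {"http", "https"}


checks = {
    url: is_scriptable(url)
    for url in ("https://example.com", "brave://newtab", "about:blank")
}
```
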
---

## License

PolyForm Noncommercial License 1.0.0. See [LICENSE](LICENSE).

Commercial use is not permitted under this license. For commercial licensing, contact the project maintainer.