feat: group multi-browser output by source

- Add browser source grouping metadata to SDK-created tabs, groups,
  list results, and aggregate count results.
- Render grouped local/remote browser tables consistently for clients,
  tabs, groups, windows, sessions, and remote status output.
- Document remote control, auth, HTTP gateway usage, and the refreshed
  project structure in the README.
- Add coverage for grouped output and BrowserCounts browser_groups.
- Bump the Python package, extension manifest, and lockfile to 0.15.6.
- Add a just publish helper for building and publishing release artifacts.
2026-06-18 00:52:04 +02:00
parent 479a0f1964
commit 8dece7800f
19 changed files with 540 additions and 270 deletions
+58 -54
@@ -1,20 +1,21 @@
# browser-cli
Control your real, running browser from the terminal or the Python SDK — no headless browser, no Playwright, no virtual display. Your actual open tabs, windows, and tab groups respond to your commands.
Control your real, running browser from the terminal, Python SDK, or a trusted remote client — no headless browser, no Playwright, no virtual display. Your actual open tabs, windows, and tab groups respond to your commands.
---
## What it does
You have 40 tabs open. You want to close all the duplicates, group the GitHub ones, save your session before a meeting, and open a few URLs into a specific group — all from a script. That is what browser-cli is for.
It works by pairing a small browser extension with a Python package that provides both a CLI and SDK. The extension has full access to your browser's tabs, windows, groups, and page DOM. The CLI and SDK talk to it in real time over a local IPC channel.
It works by pairing a small browser extension with a Python package that provides both a CLI and SDK. The extension has full access to your browser's tabs, windows, groups, and page DOM. The CLI and SDK talk to it in real time over a local IPC channel, or through the optional authenticated TCP remote bridge.
---
## How it works
```
terminal / python script
terminal / python script / remote client
│ Local IPC (Unix socket on Linux/macOS, named pipe on Windows)
│ or TCP remote bridge (Ed25519 auth, optional compression)
Native Messaging Host (Python process, launched by the browser)
@@ -33,7 +34,7 @@ terminal / python script
4. CLI commands connect to that socket, send a JSON command, and wait for the result.
5. The native host relays the command to the extension via stdout, receives the result via stdin, and sends it back to the CLI.
No server needs to be running beforehand. The browser manages the native host's lifecycle.
No local server needs to be running beforehand. The browser manages the native host's lifecycle. For cross-machine control, `browser-cli serve` starts an explicit TCP listener protected by Ed25519 public-key authentication unless you opt out with `--no-auth`.
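Step 5 relays commands over the browser's Native Messaging channel, which frames each JSON message with a 32-bit length prefix in native byte order. A minimal sketch of that framing follows. The payload shape (`command`, `args`) and the helper names are illustrative assumptions, not taken from the project's `protocol.py`.

```python
import io
import json
import struct
from typing import Optional


def encode_message(payload: dict) -> bytes:
    """Frame a JSON payload: 4-byte length prefix followed by the UTF-8 body."""
    body = json.dumps(payload).encode("utf-8")
    # "=I" packs an unsigned 32-bit integer in native byte order.
    return struct.pack("=I", len(body)) + body


def decode_message(stream) -> Optional[dict]:
    """Read one framed message from a binary stream; return None on EOF."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    return json.loads(stream.read(length).decode("utf-8"))


# Hypothetical payload, for illustration only.
framed = encode_message({"command": "tabs.list", "args": {}})
roundtrip = decode_message(io.BytesIO(framed))
```

In a real host the stream would be the binary form of stdin/stdout rather than an in-memory buffer.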
**Message format**
@@ -53,7 +54,7 @@ Every response:
**Requirements:** Python 3.10+, [uv](https://github.com/astral-sh/uv), Chrome, Chromium, Brave, Edge, Vivaldi, or Firefox
### Install with uv
Once published on PyPI, install the CLI as a uv tool:
Install the CLI from PyPI as a uv tool:
```sh
uv tool install real-browser-cli
@@ -100,62 +101,42 @@ Only the `browser-cli` command needs to be on your `PATH`. The browser launches
```text
browser-cli/
├── browser_cli/
│ ├── __init__.py # Python SDK BrowserCLI class and SDK entry point
│ ├── cli.py # Click CLI entry point
│ ├── client/ # Client-side command routing used by CLI and SDK
│ ├── core.py # send_command and remote command routing
│ ├── targets.py # Browser target discovery and socket resolution
│ ├── auth.py # Remote auth fields and key lookup
│ └── messages.py # Request/response helpers
│ ├── models.py # Tab and Group helper models
│ ├── native/ # Native messaging host internals
│ ├── host.py # Browser-launched native host entry point
│ ├── local_server.py # Local CLI IPC server
│ └── protocol.py # Chrome Native Messaging framing
│ ├── remote/ # Client-side remote browser support
│ │ ├── transport.py # TCP/TLS remote transport
│ │ └── registry.py # Saved remote endpoints/keys
│ └── commands/
│ ├── navigate.py # nav open/reload/back/forward/focus
│ ├── search.py # search engine shortcuts
│ ├── tabs.py # tab management
│ ├── groups.py # tab group management
│ ├── windows.py # window management
│ ├── dom.py # DOM querying and interaction
│ ├── extract.py # content extraction
│ └── session.py # session save/load
│ ├── __init__.py # Public sync SDK: BrowserCLI and namespace wiring
│ ├── async_sdk.py # AsyncBrowserCLI
│ ├── cli.py # Click root command and native-host entry point
│ ├── client/ # send_command path, local/remote routing, message helpers
│ ├── sdk/ # SDK namespaces: nav, tabs, groups, windows, dom, session, ...
│ ├── commands/ # CLI presentation layer over the SDK namespaces
│ ├── native/ # Browser-launched Native Messaging host + local IPC server
│ ├── remote/ # TCP remote client transport and saved endpoint registry
│ ├── serve/ # Authenticated TCP server runtime
│ ├── transport/ # JSON/msgpack response encoding and compression helpers
│ ├── markdown/ # HTML-to-Markdown extraction helpers
│ ├── auth/ # Ed25519 keys, signing, SSH-agent/YubiKey helpers, PQ KEX
│ └── models.py # Tab, Group, BrowserCounts dataclasses
├── extension/
│ ├── manifest.json # MV3 extension manifest
│ ├── content.js # Content-script helpers
│ └── src/ # TypeScript source split by command area
│ ├── index.ts # Builds generated extension/background.js
│ └── content/ # Builds generated extension/content-dispatch.js
├── examples/
│ ├── demo.py # Python SDK walkthrough
│ └── demo.sh # Bash CLI walkthrough
├── tests/
│ ├── conftest.py # shared pytest fixtures
│ ├── test_api.py
│ ├── test_cli.py
│ ├── test_dom.py
│ ├── test_extract.py
│ ├── test_groups.py
│ ├── test_nav.py
│ ├── test_session.py
│ ├── test_tabs.py
│ └── test_windows.py
├── com.browsercli.host.json # native messaging manifest template
├── pyproject.toml # package metadata and CLI entry point
└── uv.lock # locked dependencies for uv
│ ├── manifest.json # Chromium MV3 manifest
│ └── src/ # TypeScript WebExtension source
│ ├── index.ts # Background/service-worker bundle entry
│ ├── content-dispatch.ts
│ ├── commands/ # Browser-side command implementations
│ ├── content/ # DOM/extract/Markdown logic injected into pages
│ └── core/ # Shared extension helpers
├── examples/ # Python and shell walkthroughs
├── scripts/ # Packaging and release helper scripts
├── tests/ # pytest suite
├── package.json # Extension build/test/package scripts
├── pyproject.toml # Python package metadata
└── uv.lock # locked Python dependencies
```
---
## CLI reference
All commands are run with `uv run browser-cli [--browser ALIAS] <command>`.
During source development, commands are usually run as `uv run browser-cli [--browser ALIAS] <command>`. After tool installation, use `browser-cli ...` directly. Add `--remote HOST[:PORT]` and optionally `--key PATH` to target a browser exposed by `browser-cli serve`.
If exactly one browser instance is connected, commands auto-target it. Use `--browser ALIAS` when multiple browser instances are connected. `tabs list`, `tabs count`, `groups list`, `groups count`, `windows list`, and `session list` aggregate across all active browsers when `--browser` is omitted; in that mode they show the source browser alias or UUID. You can inspect the active instances with `browser-cli clients` and assign a persistent profile alias from inside the target browser with `browser-cli clients rename --browser <current-alias> <new-alias>`. Closed browsers are removed from the client registry automatically.
If exactly one browser instance is connected, commands auto-target it. Use `--browser ALIAS` when multiple browser instances are connected. `tabs list`, `tabs count`, `groups list`, `groups count`, `windows list`, and `session list` aggregate across all active browsers when `--browser` is omitted; in that mode they show the source browser alias or UUID. When local and saved remote browsers are mixed, tables group rows by source (`local` or the remote endpoint) and indent the browser profile below that group. You can inspect active instances with `browser-cli clients` and assign a persistent profile alias from inside the target browser with `browser-cli clients rename --browser <current-alias> <new-alias>`. Closed browsers are removed from the client registry automatically.
Important: profile aliases are browser-instance aliases, not window aliases. Window aliases created with `windows rename` are only for targeting windows in commands like `nav open --window work`. If a browser instance has no explicit profile alias set, the native host gives it a generated UUID alias so multiple unaliased browsers stay distinct.
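The grouped layout described above for mixed local and remote browsers can be sketched as follows. This is only an illustration of the grouping idea: the row keys (`source`, `browser`, `tabs`) and the `render_grouped` helper are hypothetical, and the real CLI output format may differ.

```python
from collections import defaultdict


def render_grouped(rows):
    """Group rows by source and indent each browser profile under its source."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["source"]].append(row)

    lines = []
    # Show "local" first, then remote endpoints in alphabetical order.
    for source in sorted(grouped, key=lambda s: (s != "local", s)):
        lines.append(source)
        for row in grouped[source]:
            lines.append(f"  {row['browser']}  {row['tabs']}")
    return "\n".join(lines)


# Hypothetical aggregate rows from one local and one remote browser source.
rows = [
    {"source": "local", "browser": "work", "tabs": 12},
    {"source": "browser-host.example:8765", "browser": "work", "tabs": 3},
    {"source": "local", "browser": "personal", "tabs": 5},
]
table = render_grouped(rows)
```

Grouping by source first keeps two profiles with the same alias (here `work`) distinguishable when one is local and the other is remote.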
@@ -315,6 +296,27 @@ browser-cli completion zsh # print setup instructions
browser-cli completion zsh --script # output raw completion script
```
### Remote control, auth, and gateways
```sh
# On the machine with the browser
browser-cli auth keygen --output ~/.config/browser-cli/client.key
PUBKEY=$(browser-cli auth show --key ~/.config/browser-cli/client.key | tail -n1)
browser-cli auth trust "$PUBKEY"
browser-cli serve --host 0.0.0.0 --port 8765 --authorized-keys ~/.config/browser-cli/authorized_keys
# From another machine
browser-cli --remote browser-host.example:8765 --key ~/.config/browser-cli/client.key tabs list
browser-cli remote trust browser-host.example:8765 ~/.config/browser-cli/client.key
browser-cli --remote browser-host.example:8765 clients
# Local HTTP JSON gateway for small integrations
browser-cli serve-http --port 8766
curl -H "Authorization: Bearer <token>" http://127.0.0.1:8766/tabs
```
Remote auth uses Ed25519 challenge/response. A `--remote` value given as a bare domain defaults to port 443; explicit `host:port` endpoints are also supported. Saved remote endpoints participate in aggregate list/count commands, where output is grouped by endpoint.
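The port-defaulting rule for `--remote` values might look roughly like this. `parse_remote` is a hypothetical helper written for illustration, not the project's parser, and it deliberately ignores IPv6 literals.

```python
def parse_remote(endpoint: str, default_port: int = 443) -> tuple:
    """Split HOST[:PORT] into (host, port), falling back to default_port."""
    host, sep, port = endpoint.rpartition(":")
    if sep and host and port.isdigit():
        return host, int(port)
    # No usable ":PORT" suffix, so treat the whole string as the host.
    return endpoint, default_port


bare = parse_remote("browser-host.example")
explicit = parse_remote("browser-host.example:8765")
```

Here `bare` resolves to port 443 and `explicit` keeps port 8765, matching the behaviour described above.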
---
## Python SDK
@@ -325,7 +327,7 @@ from browser_cli import AsyncBrowserCLI, BrowserCLI
b = BrowserCLI()
```
Commands are grouped into namespaces on the client (`b.tabs`, `b.dom`, `b.session`, ...). Each sync call blocks until the browser responds and returns the data directly as a Python object. For asyncio programs, `AsyncBrowserCLI` exposes the same namespaces as native awaitable methods over async Unix/TCP transport.
Commands are grouped into namespaces on the client (`b.tabs`, `b.dom`, `b.session`, ...). Each sync call blocks until the browser responds and returns the data directly as a Python object. Create `BrowserCLI(remote="host:8765", key="client.key")` to target a remote server. For asyncio programs, `AsyncBrowserCLI` exposes the same namespaces as native awaitable methods over async Unix/TCP transport.
```python
# Navigation ── b.nav
@@ -480,6 +482,7 @@ counts = b.tabs.count()
if isinstance(counts, BrowserCounts):
print(counts.total)
print(counts.by_browser)
print(counts.browser_groups) # e.g. {"local:work": "local", "remote:work": "remote"}
```
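One possible use of `browser_groups` is to roll per-browser counts up to their source. The sketch below uses plain dictionaries instead of a live `BrowserCounts` object, and it assumes that `by_browser` maps the same keys as `browser_groups` to integer counts; that shape is an assumption based on the example above, not confirmed by the source.

```python
from collections import Counter


def counts_by_source(by_browser, browser_groups):
    """Sum per-browser counts into per-source totals."""
    totals = Counter()
    for browser, count in by_browser.items():
        # Browsers missing from the mapping fall into an "unknown" bucket.
        totals[browser_groups.get(browser, "unknown")] += count
    return dict(totals)


# Hypothetical values shaped like the BrowserCounts fields shown above.
by_browser = {"local:work": 12, "remote:work": 3}
browser_groups = {"local:work": "local", "remote:work": "remote"}
per_source = counts_by_source(by_browser, browser_groups)
```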
---
@@ -515,6 +518,7 @@ The extension source lives in `extension/src/`. `extension/background.js` and `e
Packaging:
```bash
just publish # build to /tmp/dist-browser-cli and publish with .env credentials
npm run package:extension # testing/unpacked zip, keeps manifest.key for stable Chromium native-messaging ID
npm run package:extension:webstore # Chrome Web Store zip, strips manifest.key
npm run package:extension:webstore:verified # Chrome Web Store CRX signed for verified uploads