feat: add n8n serve node and harden remote access

- Add the n8n community node package with credentials, command mapping, direct serve TCP client, and browser-cli protocol crypto helpers.
- Cover Ed25519 signing, canonical JSON, PQ transport encryption, request mapping, and security behavior with unit tests.
- Harden serve-http with per-address rate limiting, an 8 MB request body cap, and clear warnings when binding plain HTTP beyond loopback.
- Stop one-shot --key overrides from being persisted automatically; document explicit remote trust and keep key-management behind the keys policy tier.
- Make HTML-to-Markdown conversion safer by bounding tree depth and dropping unsafe link/image URL schemes.
- Bump package and extension release metadata to 0.16.3.
@@ -37,7 +37,6 @@ terminal / python script / remote client
No local server needs to be running beforehand. The browser manages the native host's lifecycle. For cross-machine control, `browser-cli serve` starts an explicit TCP listener protected by Ed25519 public-key authentication unless you opt out with `--no-auth`.

**Message format**

Every command is a JSON object:
```json
{ "id": "uuid", "command": "tabs.list", "args": {} }
@@ -50,7 +49,6 @@ Every response:
---

## Installation

**Requirements:** Python 3.10+, [uv](https://github.com/astral-sh/uv), Chrome, Chromium, Brave, Edge, Vivaldi, or Firefox

browser-cli has two parts: the **CLI / native host** (a Python package) and the **browser extension** (published on the public stores).
@@ -109,7 +107,6 @@ Only the `browser-cli` command needs to be on your `PATH`. The browser launches
---

## Project structure

```text
browser-cli/
├── browser_cli/
@@ -145,7 +142,6 @@ browser-cli/
---

## CLI reference

During source development, commands are usually run as `uv run browser-cli [--browser ALIAS] <command>`. After tool installation, use `browser-cli ...` directly. Add `--remote HOST[:PORT]` and optionally `--key PATH` to target a browser exposed by `browser-cli serve`.

If exactly one browser instance is connected, commands auto-target it. Use `--browser ALIAS` when multiple browser instances are connected. `tabs list`, `tabs count`, `groups list`, `groups count`, `windows list`, and `session list` aggregate across all active browsers when `--browser` is omitted; in that mode they show the source browser alias or UUID. When local and saved remote browsers are mixed, tables group rows by source (`local` or the remote endpoint) and indent the browser profile below that group. You can inspect active instances with `browser-cli clients` and assign a persistent profile alias from inside the target browser with `browser-cli clients rename --browser <current-alias> <new-alias>`. Closed browsers are removed from the client registry automatically.
@@ -153,7 +149,6 @@ If exactly one browser instance is connected, commands auto-target it. Use `--br
Important: profile aliases are browser-instance aliases, not window aliases. Window aliases created with `windows rename` are only for targeting windows in commands like `nav open --window work`. If a browser instance has no explicit profile alias set, the native host gives it a generated UUID alias so multiple unaliased browsers stay distinct.

### Navigation (`nav`)

```sh
# Open a URL (no focus stealing by default)
browser-cli nav open https://example.com
@@ -175,7 +170,6 @@ browser-cli nav focus github # focuses first tab whose URL contains "
```

### Search

Each search command opens the search results in your browser using the same flags as `nav open`.

```sh
@@ -199,7 +193,6 @@ browser-cli search so click choices
```

### Tabs

```sh
browser-cli tabs list                    # list all open tabs (all windows)
browser-cli tabs count                   # count all tabs
@@ -227,7 +220,6 @@ browser-cli tabs merge-windows # pull all tabs into the current wi
```

### Tab groups

```sh
browser-cli groups list                  # list all tab groups
browser-cli groups count                 # count groups
@@ -249,7 +241,6 @@ browser-cli groups move 42 -l # short left alias
```

### Windows

```sh
browser-cli windows list                 # list all windows
browser-cli windows open                 # open a new window
@@ -259,7 +250,6 @@ browser-cli windows close 1 # close a window
```

### DOM

These commands run on the **active tab**. The tab must be on a regular `http://` or `https://` page, not a browser internal page like `brave://newtab`.

```sh
@@ -272,7 +262,6 @@ browser-cli dom type "#search" "hello" # type text into an input
```

### Extract

```sh
browser-cli extract links                # all <a href> links on the page
browser-cli extract images               # all <img> tags (src + alt)
@@ -284,7 +273,6 @@ browser-cli extract markdown --selector "article" # specific DOM subtree as Ma
```

### Sessions

A session is a snapshot of all open tab URLs, stored inside the extension via `chrome.storage.local`. Sessions survive browser restarts but are lost if the extension is uninstalled or extension data is cleared.

```sh
@@ -298,7 +286,6 @@ browser-cli session auto-save off
```

### Misc

```sh
browser-cli clients                      # show connected browser info from the registry
browser-cli clients rename --browser abcd1234 work  # rename one connected browser instance
@@ -309,7 +296,6 @@ browser-cli completion zsh --script # output raw completion script
```

### Remote control, auth, and gateways

```sh
# On the machine with the browser
browser-cli auth keygen --output ~/.config/browser-cli/client.key
@@ -334,22 +320,24 @@ browser-cli serve-http --port 8766
curl -H "Authorization: Bearer <token>" http://127.0.0.1:8766/tabs
```

Remote auth uses Ed25519 challenge/response. `--remote` domains default to port 443; explicit `host:port` endpoints are also supported. Use `browser-cli remote trust ENDPOINT KEY` to remember a key for later calls. Saved remote endpoints participate in aggregate list/count commands, where output is grouped by endpoint.

#### n8n integration
browser-cli can't be installed inside an n8n container, so the [`n8n-nodes-browser-cli`](n8n-nodes-browser-cli/) community node talks to a remote `serve-http` gateway over HTTP(S). Run the gateway on the browser machine (behind TLS), drop its token into the node's credential, and drive tabs/DOM/extraction/raw commands from a workflow. See [`n8n-nodes-browser-cli/README.md`](n8n-nodes-browser-cli/README.md).
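
Any HTTP client can reach the gateway the same way the `curl` example does. The following is a minimal sketch, not part of browser-cli: `build_gateway_request` and `LOOPBACK_HOSTS` are hypothetical names, the `/tabs` path is borrowed from the `curl` example, and the refusal to send a token over plain HTTP beyond loopback is this sketch's own guard rather than documented gateway behaviour.

```python
from urllib.parse import urlsplit
from urllib.request import Request

# Hosts this sketch treats as loopback (an assumption, not a browser-cli list).
LOOPBACK_HOSTS = {"127.0.0.1", "::1", "localhost"}


def build_gateway_request(base_url: str, path: str, token: str) -> Request:
    """Build a bearer-authenticated request for a serve-http gateway."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Plain HTTP exposes the bearer token, so only allow it on loopback.
    if parts.scheme == "http" and parts.hostname not in LOOPBACK_HOSTS:
        raise ValueError("refusing to send a bearer token over plain HTTP beyond loopback")
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Build (but do not send) the equivalent of the curl example.
request = build_gateway_request("http://127.0.0.1:8766", "/tabs", "<token>")
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the response format is not specified here, so the sketch stops at building the request.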

#### Security model

- **`serve` (TCP)** authenticates every connection with an Ed25519 signature over a fresh server nonce and, for modern clients, wraps the transport in an ML-KEM-768 (post-quantum) AEAD channel. Commands are gated by a **safe-only policy by default**: even a trusted key can only run read-only status/listing commands until you open more with `--allow-read-page`, `--allow-control`, `--allow-dangerous`, `--allow-keys`, or `--allow-all` (full control, including `dom.eval`/`storage.*`). `--no-auth` is rejected on non-loopback hosts.
- **Per-key authorization:** a key in `authorized_keys` can carry an optional `allow:` token (`<pubkey> <name> allow:read-page,control`) listing its categories (`all`, `safe`, `read-page`, `control`, `dangerous`, `keys`). That key uses its own policy, overriding the server-wide `--allow-*` default; keys without a token fall back to the default. Set it with `auth trust <pubkey> --allow-control …` when adding a key, or change it later with `auth policy <pubkey|name> …` (interactive picker when run with no args; `--safe`/`--server-default`/`--allow-*` for scripting). Both work locally and over `--remote`; `auth keys` shows each key's policy.
- **Key-management is its own category:** listing keys, trusting keys, and changing key policies (`auth keys`/`auth trust`/`auth policy` over `--remote`) requires the `keys` category. A key trusted only for browsing, even with full `control`+`dangerous`, cannot manage the trust store unless granted `allow:keys` (or `allow:all`). This prevents a compromised browser key from escalating by trusting a key of its own.
- **Rate limiting:** `--rate-limit N` caps commands/second per client key (token bucket, default `100`, `0` disables) so a compromised key can't hammer the browser.
- **Audit logging:** request logs include the acting key (its name from `authorized_keys` plus a short pubkey), not just the client address.
- **`serve-http`** is a convenience gateway with the inverse trade-off: commands are gated by the same `--allow-*` policy (safe-only by default) and requests are throttled per client address (`--rate-limit`, default `100`/s) with an 8 MB body cap, but the bearer token travels in **clear text over plain HTTP**. It binds to loopback by default; `--no-auth` is only permitted there, and binding beyond loopback prints a loud cleartext warning. If you must expose it, put it behind a TLS-terminating reverse proxy; never send the token over an untrusted network unencrypted, and prefer `serve` (encrypted) for real remote use.
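
The per-key rules in the list above can be pictured with a small parser. This is a hedged sketch, not browser-cli's implementation: `KeyEntry`, `parse_entry`, and `is_allowed` are hypothetical names, the whitespace-separated line layout follows the `<pubkey> <name> allow:…` example, and treating `safe` as always granted is an assumption of the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass

# Category names as listed in the security model; "all" expands to every one.
CATEGORIES = frozenset({"safe", "read-page", "control", "dangerous", "keys"})


@dataclass(frozen=True)
class KeyEntry:
    pubkey: str
    name: str
    allow: frozenset[str] | None  # None means "fall back to the server default"


def parse_entry(line: str) -> KeyEntry:
    """Parse one `<pubkey> <name> [allow:cat1,cat2]` line."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least '<pubkey> <name>'")
    pubkey, name, *rest = fields
    allow = None
    for field in rest:
        if field.startswith("allow:"):
            names = {c for c in field[len("allow:"):].split(",") if c}
            unknown = names - CATEGORIES - {"all"}
            if unknown:
                raise ValueError(f"unknown categories: {sorted(unknown)}")
            # Assumption: "safe" is always included alongside listed categories.
            allow = frozenset(CATEGORIES if "all" in names else names | {"safe"})
    return KeyEntry(pubkey, name, allow)


def is_allowed(entry: KeyEntry, category: str, server_default: set[str]) -> bool:
    """A key's own token overrides the server-wide default; otherwise use it."""
    if entry.allow is not None:
        return category in entry.allow
    return category in (frozenset(server_default) | {"safe"})


laptop = parse_entry("AAAAexamplekey laptop allow:read-page,control")
print(is_allowed(laptop, "control", {"safe"}))  # own token grants control
print(is_allowed(laptop, "keys", {"safe"}))     # key management needs allow:keys
```

Under these assumptions, a key with `allow:read-page,control` can drive the browser but cannot touch the trust store, which is the escalation barrier the `keys` category describes.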

For low latency, an authenticated encrypted remote connection is kept open and reused for further commands in the same process, so SDK scripts and multi-browser fan-out avoid repeating the TCP/TLS/challenge handshake on every command. Aggregate commands also fan out to remote targets concurrently. Both degrade gracefully against older servers that handle one command per connection.

---

## Python SDK

```python
from browser_cli import AsyncBrowserCLI, BrowserCLI
@@ -485,7 +473,6 @@ raw = b.command("tabs.count", {"pattern": "github"}) # escape hatch for raw com
```

**Error handling**

```python
from browser_cli import BrowserCLI, BrowserNotConnected
@@ -517,7 +504,6 @@ if isinstance(counts, BrowserCounts):
---

## Example scripts

See `examples/demo.py` (Python) and `examples/demo.sh` (Bash) for full walkthroughs covering tabs, groups, DOM extraction, and session management.

```sh
@@ -528,7 +514,6 @@ bash examples/demo.sh
---

## Development

```sh
npm ci
npm run check:extension      # type-check, build extension bundles, syntax-check bundle
@@ -569,7 +554,6 @@ For Firefox temporary testing via `about:debugging#/runtime/this-firefox`, run `
---

## Limitations

- **Browser internal pages** (`chrome://`, `brave://`, `edge://`, `about:`) cannot be scripted. DOM and extract commands only work on regular `http://` and `https://` pages.
- **Multiple browser instances can be auto-distinguished, but generated aliases are temporary**. Unaliased browsers get UUID aliases from the native host, which avoids collisions but is less ergonomic than setting a stable alias with `browser-cli clients rename --browser <current-alias> <new-alias>`.
- **Supported install targets are explicit, not “all Chromium browsers”**. The installer currently supports Chrome, Chromium, Brave, Edge, Vivaldi, and Firefox. Other Chromium-based browsers may use different or shared native messaging manifest locations, so they need browser-specific verification before being added safely.
@@ -578,7 +562,6 @@ For Firefox temporary testing via `about:debugging#/runtime/this-firefox`, run `
---

## License

PolyForm Noncommercial License 1.0.0. See [LICENSE](LICENSE).

Commercial use is not permitted under this license. For commercial licensing, contact the project maintainer.