feat: add n8n serve node and harden remote access

- Add the n8n community node package with credentials, command mapping, direct serve TCP client, and browser-cli protocol crypto helpers.

- Cover Ed25519 signing, canonical JSON, PQ transport encryption, request mapping, and security behavior with unit tests.

- Harden serve-http with per-address rate limiting, an 8 MB request body cap, and clear warnings when binding plain HTTP beyond loopback.

- Stop one-shot --key overrides from being persisted automatically; document explicit remote trust and keep key-management behind the keys policy tier.

- Make HTML-to-Markdown conversion safer by bounding tree depth and dropping unsafe link/image URL schemes.

- Bump package and extension release metadata to 0.16.3.
2026-06-19 10:00:23 +02:00
parent 7fe0e27fec
commit cea8a7e994
28 changed files with 3687 additions and 164 deletions
+8 -25
@@ -37,7 +37,6 @@ terminal / python script / remote client
No local server needs to be running beforehand. The browser manages the native host's lifecycle. For cross-machine control, `browser-cli serve` starts an explicit TCP listener protected by Ed25519 public-key authentication unless you opt out with `--no-auth`.
**Message format**
Every command is a JSON object:
```json
{ "id": "uuid", "command": "tabs.list", "args": {} }
@@ -50,7 +49,6 @@ Every response:
---
## Installation
**Requirements:** Python 3.10+, [uv](https://github.com/astral-sh/uv), Chrome, Chromium, Brave, Edge, Vivaldi, or Firefox
browser-cli has two parts: the **CLI / native host** (a Python package) and the **browser extension** (published on the public stores).
@@ -109,7 +107,6 @@ Only the `browser-cli` command needs to be on your `PATH`. The browser launches
---
## Project structure
```text
browser-cli/
├── browser_cli/
@@ -145,7 +142,6 @@ browser-cli/
---
## CLI reference
During source development, commands are usually run as `uv run browser-cli [--browser ALIAS] <command>`. After tool installation, use `browser-cli ...` directly. Add `--remote HOST[:PORT]` and optionally `--key PATH` to target a browser exposed by `browser-cli serve`.
If exactly one browser instance is connected, commands auto-target it. Use `--browser ALIAS` when multiple browser instances are connected. `tabs list`, `tabs count`, `groups list`, `groups count`, `windows list`, and `session list` aggregate across all active browsers when `--browser` is omitted; in that mode they show the source browser alias or UUID. When local and saved remote browsers are mixed, tables group rows by source (`local` or the remote endpoint) and indent the browser profile below that group. You can inspect active instances with `browser-cli clients` and assign a persistent profile alias from inside the target browser with `browser-cli clients rename --browser <current-alias> <new-alias>`. Closed browsers are removed from the client registry automatically.
@@ -153,7 +149,6 @@ If exactly one browser instance is connected, commands auto-target it. Use `--br
Important: profile aliases are browser-instance aliases, not window aliases. Window aliases created with `windows rename` are only for targeting windows in commands like `nav open --window work`. If a browser instance has no explicit profile alias set, the native host gives it a generated UUID alias so multiple unaliased browsers stay distinct.
### Navigation (`nav`)
```sh
# Open a URL (no focus stealing by default)
browser-cli nav open https://example.com
@@ -175,7 +170,6 @@ browser-cli nav focus github # focuses first tab whose URL contains "
```
### Search
Each search command opens the search results in your browser using the same flags as `nav open`.
```sh
@@ -199,7 +193,6 @@ browser-cli search so click choices
```
### Tabs
```sh
browser-cli tabs list # list all open tabs (all windows)
browser-cli tabs count # count all tabs
@@ -227,7 +220,6 @@ browser-cli tabs merge-windows # pull all tabs into the current wi
```
### Tab groups
```sh
browser-cli groups list # list all tab groups
browser-cli groups count # count groups
@@ -249,7 +241,6 @@ browser-cli groups move 42 -l # short left alias
```
### Windows
```sh
browser-cli windows list # list all windows
browser-cli windows open # open a new window
@@ -259,7 +250,6 @@ browser-cli windows close 1 # close a window
```
### DOM
These commands run on the **active tab**. The tab must be on a regular `http://` or `https://` page — not a browser internal page like `brave://newtab`.
```sh
@@ -272,7 +262,6 @@ browser-cli dom type "#search" "hello" # type text into an input
```
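The scheme restriction on DOM commands can be checked on the client before a command is sent. The following is a minimal sketch, assuming a hypothetical helper `is_scriptable_url` that is not part of browser-cli; it only mirrors the rule stated above that the active tab must be on a regular `http://` or `https://` page.

```python
from urllib.parse import urlsplit

# Hypothetical helper, not part of browser-cli. It encodes the documented
# rule that DOM commands only work on regular http:// and https:// pages.
SCRIPTABLE_SCHEMES = {"http", "https"}


def is_scriptable_url(url: str) -> bool:
    """Return True if the URL uses a scheme that DOM commands can target."""
    return urlsplit(url).scheme.lower() in SCRIPTABLE_SCHEMES
```

A script could call this on the active tab's URL and skip `dom` commands for internal pages such as `brave://newtab`.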
### Extract
```sh
browser-cli extract links # all <a href> links on the page
browser-cli extract images # all <img> tags (src + alt)
@@ -284,7 +273,6 @@ browser-cli extract markdown --selector "article" # specific DOM subtree as Ma
```
### Sessions
A session is a snapshot of all open tab URLs, stored inside the extension via `chrome.storage.local`. Sessions survive browser restarts but are lost if the extension is uninstalled or extension data is cleared.
```sh
@@ -298,7 +286,6 @@ browser-cli session auto-save off
```
### Misc
```sh
browser-cli clients # show connected browser info from the registry
browser-cli clients rename --browser abcd1234 work # rename one connected browser instance
@@ -309,7 +296,6 @@ browser-cli completion zsh --script # output raw completion script
```
### Remote control, auth, and gateways
```sh
# On the machine with the browser
browser-cli auth keygen --output ~/.config/browser-cli/client.key
@@ -334,22 +320,24 @@ browser-cli serve-http --port 8766
curl -H "Authorization: Bearer <token>" http://127.0.0.1:8766/tabs
```
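The `curl` call above can be reproduced from Python with the standard library. This is a sketch only: the helper names `build_gateway_request` and `fetch_text` are illustrative assumptions, and the response body is returned as raw text because its shape is not shown here.

```python
import urllib.request


def build_gateway_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the bearer token, like the curl example."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_text(request: urllib.request.Request, timeout: float = 5.0) -> str:
    """Send the request and return the response body decoded as UTF-8."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a gateway running locally, `fetch_text(build_gateway_request("http://127.0.0.1:8766", "/tabs", "<token>"))` would issue the same request as the `curl` line.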
Remote auth uses Ed25519 challenge/response. `--remote` domains default to port 443; explicit `host:port` endpoints are also supported. Saved remote endpoints participate in aggregate list/count commands, where output is grouped by endpoint.
Remote auth uses Ed25519 challenge/response. `--remote` domains default to port 443; explicit `host:port` endpoints are also supported. Use `browser-cli remote trust ENDPOINT KEY` to remember a key for later calls. Saved remote endpoints participate in aggregate list/count commands, where output is grouped by endpoint.
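The endpoint rule (bare domains default to port 443, explicit `host:port` is honoured) can be sketched as follows. `parse_remote` is a hypothetical function written for illustration, not the project's parser, and it deliberately ignores IPv6 literals.

```python
DEFAULT_REMOTE_PORT = 443  # documented default for --remote domains


def parse_remote(endpoint: str) -> tuple[str, int]:
    """Split HOST[:PORT] into (host, port), defaulting the port to 443."""
    host, sep, port = endpoint.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the host.
        return endpoint, DEFAULT_REMOTE_PORT
    if not host or not port.isdigit():
        raise ValueError(f"invalid remote endpoint: {endpoint!r}")
    port_number = int(port)
    if not 0 < port_number < 65536:
        raise ValueError(f"port out of range: {port_number}")
    return host, port_number
```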
#### n8n integration
browser-cli can't be installed inside an n8n container, so the [`n8n-nodes-browser-cli`](n8n-nodes-browser-cli/) community node talks to a remote `serve-http` gateway over HTTP(S). Run the gateway on the browser machine (behind TLS), drop its token into the node's credential, and drive tabs/DOM/extraction/raw commands from a workflow. See [`n8n-nodes-browser-cli/README.md`](n8n-nodes-browser-cli/README.md).
#### Security model
- **`serve` (TCP)** authenticates every connection with an Ed25519 signature over a fresh server nonce and, for modern clients, wraps the transport in an ML-KEM-768 (post-quantum) AEAD channel. Commands are gated by a **safe-only policy by default** — even a trusted key can only run read-only status/listing commands until you open more with `--allow-read-page`, `--allow-control`, `--allow-dangerous`, or `--allow-all` (full control, including `dom.eval`/`storage.*`). `--no-auth` is rejected on non-loopback hosts.
- **Per-key authorization:** a key in `authorized_keys` can carry an optional `allow:` token (`<pubkey> <name> allow:read-page,control`) listing its categories (`all`, `safe`, `read-page`, `control`, `dangerous`). That key uses its own policy, overriding the server-wide `--allow-*` default; keys without a token fall back to the default. Set it with `auth trust <pubkey> --allow-control …` (works locally and over `--remote`); `auth keys` shows each key's policy.
- **`serve` (TCP)** authenticates every connection with an Ed25519 signature over a fresh server nonce and, for modern clients, wraps the transport in an ML-KEM-768 (post-quantum) AEAD channel. Commands are gated by a **safe-only policy by default** — even a trusted key can only run read-only status/listing commands until you open more with `--allow-read-page`, `--allow-control`, `--allow-dangerous`, `--allow-keys`, or `--allow-all` (full control, including `dom.eval`/`storage.*`). `--no-auth` is rejected on non-loopback hosts.
- **Per-key authorization:** a key in `authorized_keys` can carry an optional `allow:` token (`<pubkey> <name> allow:read-page,control`) listing its categories (`all`, `safe`, `read-page`, `control`, `dangerous`, `keys`). That key uses its own policy, overriding the server-wide `--allow-*` default; keys without a token fall back to the default. Set it with `auth trust <pubkey> --allow-control …` when adding a key, or change it later with `auth policy <pubkey|name> …` (interactive picker when run with no args; `--safe`/`--server-default`/`--allow-*` for scripting). Both work locally and over `--remote`; `auth keys` shows each key's policy.
- **Key management is its own category:** listing keys, trusting keys, or changing a key's policy (`auth keys`/`auth trust`/`auth policy` over `--remote`) requires the `keys` category. A key trusted only for browsing, even with full `control`+`dangerous`, cannot manage the trust store unless granted `allow:keys` (or `allow:all`). This prevents a compromised browser key from escalating its privileges by trusting a key of its own.
- **Rate limiting:** `--rate-limit N` caps commands/second per client key (token bucket, default `100`, `0` disables) so a compromised key can't hammer the browser.
- **Audit logging:** request logs include the acting key (its name from `authorized_keys` plus a short pubkey), not just the client address.
- **`serve-http`** is a convenience gateway with the inverse trade-off: commands are gated by the same `--allow-*` policy (safe-only by default), but the bearer token travels in **clear text over plain HTTP**. It binds to loopback by default; `--no-auth` is only permitted there. If you must expose it beyond loopback, put it behind a TLS-terminating reverse proxy — never send the token over an untrusted network unencrypted.
- **`serve-http`** is a convenience gateway with the inverse trade-off: commands are gated by the same `--allow-*` policy (safe-only by default) and requests are throttled per client address (`--rate-limit`, default `100`/s) with an 8 MB body cap, but the bearer token travels in **clear text over plain HTTP**. It binds to loopback by default; `--no-auth` is only permitted there, and binding beyond loopback prints a loud cleartext warning. If you must expose it, put it behind a TLS-terminating reverse proxy — never send the token over an untrusted network unencrypted, and prefer `serve` (encrypted) for real remote use.
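The per-key authorization and key-management rules above can be illustrated with a small parser for `authorized_keys` lines. This is a sketch under stated assumptions, not the project's implementation: the function names are hypothetical, and it assumes the name field is always present and that fields are separated by whitespace.

```python
from typing import Optional

# Categories listed in the security model above.
KNOWN_CATEGORIES = frozenset({"all", "safe", "read-page", "control", "dangerous", "keys"})


def parse_authorized_key(line: str) -> tuple[str, str, Optional[frozenset]]:
    """Return (pubkey, name, policy); policy is None when no allow: token is set."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("expected '<pubkey> <name> [allow:...]'")
    pubkey, name = parts[0], parts[1]
    policy = None
    for token in parts[2:]:
        if token.startswith("allow:"):
            categories = frozenset(c for c in token[len("allow:"):].split(",") if c)
            unknown = categories - KNOWN_CATEGORIES
            if unknown:
                raise ValueError(f"unknown categories: {sorted(unknown)}")
            policy = categories
    return pubkey, name, policy


def effective_policy(key_policy: Optional[frozenset], server_default: frozenset) -> frozenset:
    """A per-key policy overrides the server default; absent tokens fall back to it."""
    return server_default if key_policy is None else key_policy


def may_manage_keys(policy: frozenset) -> bool:
    """Key management needs the keys category (or all), regardless of other grants."""
    return "keys" in policy or "all" in policy
```

The last function shows the separation of duties: a policy of `control` plus `dangerous` still fails the check, which is what stops a browsing key from editing the trust store.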
For low latency, an authenticated encrypted remote connection is kept open and reused for further commands in the same process — so SDK scripts and multi-browser fan-out avoid repeating the TCP/TLS/challenge handshake on every command. Aggregate commands also fan out to remote targets concurrently. Both degrade gracefully against older servers that handle one command per connection.
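The `--rate-limit` behaviour in the security model is described as a token bucket per client key (or per client address for `serve-http`). A minimal sketch of that technique follows. The class names are hypothetical, and the burst capacity being equal to the rate is an assumption, since the source does not state the bucket size.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Token bucket refilled at `rate` tokens per second; rate 0 disables limiting."""

    def __init__(self, rate: float, clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = rate  # assumption: burst size equals the per-second rate
        self.tokens = rate
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        if self.rate <= 0:
            return True  # mirrors "0 disables"
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerClientLimiter:
    """Keeps one bucket per client identifier (key fingerprint or address)."""

    def __init__(self, rate: float, clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str) -> bool:
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = self.buckets[client_id] = TokenBucket(self.rate, self.clock)
        return bucket.allow()
```

Keeping a separate bucket per identifier means one noisy client exhausts only its own budget, which matches the stated goal that a compromised key cannot hammer the browser.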
---
## Python SDK
```python
from browser_cli import AsyncBrowserCLI, BrowserCLI
@@ -485,7 +473,6 @@ raw = b.command("tabs.count", {"pattern": "github"}) # escape hatch for raw com
```
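The raw-command escape hatch lines up with the JSON message format shown earlier (`id`, `command`, `args`). As an illustration, a message of that shape could be built like this; `build_command` is a hypothetical helper, and generating the `id` with `uuid4` is an assumption based on the `"uuid"` placeholder in the format example.

```python
import json
import uuid
from typing import Optional


def build_command(command: str, args: Optional[dict] = None) -> str:
    """Serialize a command as {"id": ..., "command": ..., "args": {...}}."""
    message = {
        "id": str(uuid.uuid4()),  # assumption: any unique UUID string is acceptable
        "command": command,
        "args": args or {},
    }
    return json.dumps(message)
```

For example, `build_command("tabs.count", {"pattern": "github"})` yields a message with the same command name and arguments as the `b.command(...)` call in the SDK snippet.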
**Error handling**
```python
from browser_cli import BrowserCLI, BrowserNotConnected
@@ -517,7 +504,6 @@ if isinstance(counts, BrowserCounts):
---
## Example scripts
See `examples/demo.py` (Python) and `examples/demo.sh` (Bash) for full walkthroughs covering tabs, groups, DOM extraction, and session management.
```sh
@@ -528,7 +514,6 @@ bash examples/demo.sh
---
## Development
```sh
npm ci
npm run check:extension # type-check, build extension bundles, syntax-check bundle
@@ -569,7 +554,6 @@ For Firefox temporary testing via `about:debugging#/runtime/this-firefox`, run `
---
## Limitations
- **Browser internal pages** (`chrome://`, `brave://`, `edge://`, `about:`) cannot be scripted. DOM and extract commands only work on regular `http://` and `https://` pages.
- **Multiple browser instances can be auto-distinguished, but generated aliases are temporary**. Unaliased browsers get UUID aliases from the native host, which avoids collisions but is less ergonomic than setting a stable alias with `browser-cli clients rename --browser <current-alias> <new-alias>`.
- **Supported install targets are explicit, not “all Chromium browsers”**. The installer currently supports Chrome, Chromium, Brave, Edge, Vivaldi, and Firefox. Other Chromium-based browsers may use different or shared native messaging manifest locations, so they need browser-specific verification before being added safely.
@@ -578,7 +562,6 @@ For Firefox temporary testing via `about:debugging#/runtime/this-firefox`, run `
---
## License
PolyForm Noncommercial License 1.0.0. See [LICENSE](LICENSE).
Commercial use is not permitted under this license. For commercial licensing, contact the project maintainer.