ADR-0008: Plugins are subprocesses over JSON-RPC on stdio

Status: Accepted
Date: 2026-05-21
Last amended: 2026-05-25 (system plugins)
Deciders: @karasu
Supersedes: —
Superseded by: —

Context

kaged sits in front of adjacent operator tooling and other ecosystems (ADR-0001). It expresses that relationship via plugins — adapters that expose external capabilities (oh-my-pi presets, ollama model management, agentic-harness session bridges, custom operator extensions) to the kaged daemon.

Some plugins are first-party (we maintain them). Others will be community-written. The plugin model is a public contract.

The constraint set:

Plugins are not subagents. Subagents do work for the LLM inside a project session, in a [CAGED] sandbox. Plugins serve capabilities to the daemon — query a catalog, install a package, fetch a list of models. Different trust boundary, different lifecycle.
Plugins are not in the daemon's trust boundary. If a plugin crashes, only that plugin's process goes down; the daemon does not. A plugin cannot read other plugins' state or the daemon's internal data.
Language-agnostic. Some adapters will be shell. Some operators may write Python adapters. Some may write Rust. The plugin contract must not assume TypeScript.
Debuggable. When a plugin misbehaves, the operator wants logs they can read, requests they can replay, errors they can pinpoint. The transport must be inspectable.
Versioned. Plugins declare which kaged API version they target; the daemon refuses incompatible ones.
Capability-scoped. A plugin that only needs to read a preset library should not have arbitrary filesystem or network access. The daemon enforces a capability allowlist per plugin.
No telemetry, no remote loading. Plugins are local files installed by the operator. The daemon does not fetch them from a registry on its own.

The viable transport options:

Subprocess + JSON-RPC over stdio. Plugin is a separate process; daemon talks to it via line-delimited JSON.
Subprocess + gRPC. Same isolation, but with protobuf schemas and bidirectional streaming.
In-process via a TypeScript module SDK. Plugin is a Bun-compatible JS/TS module loaded by the daemon.
WASM components. Plugin is a WASM binary the daemon executes.
HTTP API. Plugin is a long-running HTTP server the daemon calls.

Decision

Plugins are separate subprocesses spawned by the daemon. Communication is JSON-RPC 2.0 over line-delimited stdio (\n-terminated JSON messages on stdin/stdout). Plugin stderr is captured for operator logs. Plugin manifests declare a kaged API version, a capability allowlist, and a startup command. The daemon will refuse to load plugins with mismatched or unsupported manifests.

Transport details

Wire format: JSON-RPC 2.0 (https://www.jsonrpc.org/specification). One JSON object per line on stdout (plugin → daemon) and stdin (daemon → plugin). UTF-8.
Stderr is for logs. Plugin authors may write structured logs (one JSON object per line) or freeform text; the daemon captures both verbatim into the audit log.
Bidirectional. Daemon can call plugin methods. Plugins can emit notifications (server-initiated push, no response expected) for events like "catalog updated."
Initialization handshake: The first message from the daemon is initialize; the plugin responds with its capabilities and the API version it implements. The daemon sends no other request until initialize succeeds.
Graceful shutdown: The daemon sends shutdown; the plugin acknowledges and exits within a configurable timeout (default 5s). If the plugin is still running after that timeout, the daemon sends SIGTERM, then SIGKILL after a second timeout.
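
The plugin side of this transport can be sketched in a few lines of Python. This is a minimal illustration, not the SDK: the method table, the shape of the initialize result (api_version, methods), and the helper names are assumptions made for the example. Only the JSON-RPC 2.0 envelope and its standard error codes come from the specification.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple

API_VERSION = 1  # mirrors kaged_api in the manifest

# Hypothetical method table for a toy plugin; result payloads are placeholders.
METHODS = {
    "initialize": lambda params: {"api_version": API_VERSION,
                                  "methods": ["presets.list"]},
    "presets.list": lambda params: {"presets": []},
    "shutdown": lambda params: None,
}


def error(request_id, code: int, message: str) -> dict:
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


def handle(line: str) -> Tuple[Optional[dict], bool]:
    """Return (response, stop). The response is None for notifications."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return error(None, -32700, "Parse error"), False
    # Anything that is not a single request object is rejected, batches included.
    if not isinstance(request, dict) or not isinstance(request.get("method"), str):
        return error(None, -32600, "Invalid Request"), False
    method = request["method"]
    stop = method == "shutdown"
    if "id" not in request:
        return None, stop  # notification: never answered
    handler = METHODS.get(method)
    if handler is None:
        return error(request["id"], -32601, "Method not found"), False
    result = handler(request.get("params"))
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}, stop


def respond(lines: Iterable[str]) -> Iterator[str]:
    """Yield one serialized response per request until shutdown or EOF."""
    for line in lines:
        if not line.strip():
            continue
        response, stop = handle(line)
        if response is not None:
            yield json.dumps(response)  # no embedded newlines by default
        if stop:
            return
```

A real entry point would feed sys.stdin into respond, write each yielded string followed by "\n" to stdout with a flush, and send every diagnostic to stderr.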

Plugin manifest

Each plugin ships a kaged-plugin.yaml manifest in its install directory:

name: oh-my-pi
version: 1.4.2
kaged_api: 1
description: Expose the oh-my-pi preset library and installer to kaged.
author: <name>
license: MIT
command: ["/usr/bin/bash", "/opt/kaged/plugins/oh-my-pi/run.sh"]
capabilities:
  - read:fs:/opt/oh-my-pi
  - exec:bash:/opt/oh-my-pi
  # no net:<host:port> entry is declared, so the plugin runs without network access
methods:
  - presets.list
  - presets.search
  - preset.install
  - preset.uninstall

The manifest is parsed and validated at plugin install time and at daemon startup. The daemon enforces the capability allowlist — a plugin claiming only read:fs:/opt/oh-my-pi and attempting to write elsewhere is killed and an audit event is logged.
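
As a rough illustration of the load-time check, the sketch below validates a manifest that has already been parsed into a dictionary (YAML parsing is left to whichever library the daemon uses). It is hand-rolled for brevity; the ADR's implementation notes call for JSON Schema. The set of supported API versions and the error strings are assumptions.

```python
from typing import Any, Dict, List

SUPPORTED_API_VERSIONS = {1}  # assumption: this daemon build only speaks API 1


def validate_manifest(manifest: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the manifest may load."""
    problems: List[str] = []

    for field in ("name", "version"):
        value = manifest.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"{field}: required non-empty string")

    api = manifest.get("kaged_api")
    if type(api) is not int:  # excludes booleans, which are ints in Python
        problems.append("kaged_api: required integer")
    elif api not in SUPPORTED_API_VERSIONS:
        problems.append(f"kaged_api: version {api} is not supported")

    command = manifest.get("command")
    if (not isinstance(command, list) or not command
            or not all(isinstance(arg, str) for arg in command)):
        problems.append("command: required non-empty list of strings")

    # Only the container type is checked here; the capability grammar
    # itself is a separate concern.
    for field in ("capabilities", "methods"):
        if not isinstance(manifest.get(field), list):
            problems.append(f"{field}: required list")

    return problems
```

Returning every problem at once, rather than failing on the first, lets kaged plugin install report a complete list to the operator in one pass.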

Plugin install layout

${KAGED_HOME}/plugins/
  oh-my-pi/
    kaged-plugin.yaml
    run.sh
    ... (plugin's own files)
  ollama/
    kaged-plugin.yaml
    main.py
    ...

Plugins are installed by the operator (copy a directory, run kaged plugin install <path>). The daemon does not download plugins from a registry. A plugin registry may exist as a list (a kaged.dev-curated index of known-good plugins), but installs are manual or scripted by the operator.
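
Given this layout, finding installed plugins is a directory scan. The sketch below assumes that a plugin counts as installed when its directory contains a kaged-plugin.yaml; the function name and return shape are illustrative only.

```python
from pathlib import Path
from typing import Dict

MANIFEST_NAME = "kaged-plugin.yaml"


def discover_plugins(kaged_home: Path) -> Dict[str, Path]:
    """Map each plugin directory name to the path of its manifest."""
    plugins_dir = kaged_home / "plugins"
    if not plugins_dir.is_dir():
        return {}
    found: Dict[str, Path] = {}
    for entry in sorted(plugins_dir.iterdir()):
        manifest = entry / MANIFEST_NAME
        # Directories without a manifest and stray files are skipped.
        if entry.is_dir() and manifest.is_file():
            found[entry.name] = manifest
    return found
```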

Capability allowlist

The capability strings the manifest may declare (initial set, extensible):

read:fs:<path> — read access to a host filesystem path.
write:fs:<path> — write access (rare; most plugins should be read-only).
exec:<binary>:<path> — permission to execute a specific binary on a path.
net:<host:port> — permission to reach a network endpoint (e.g., for an LLM-provider plugin).
kaged:storage:read / kaged:storage:write — access to a plugin-scoped slice of the daemon's SQLite store (each plugin gets its own logical schema).

The daemon enforces these via a thin sandbox on the plugin process (see Implementation notes). A plugin without net:* capability runs in a netns with no network. A plugin without read:fs:/foo cannot see /foo.
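
One way to picture the mapping from capability strings to a sandbox is the sketch below. It parses the grammar listed above and emits a bwrap argument vector using --ro-bind, --bind, --unshare-net, and --die-with-parent. The exact flag mapping is an assumption for illustration; the cage compiler in ADR-0009 is authoritative. Per-endpoint network filtering, exec semantics, and the system mounts an interpreter needs are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Capability:
    kind: str         # "read", "write", "exec", "net", or "storage"
    target: str       # path, host:port, or storage mode
    binary: str = ""  # only set for exec capabilities


def parse_capability(text: str) -> Capability:
    """Parse one capability string; raise ValueError if it is not recognized."""
    head, _, rest = text.partition(":")
    if head in ("read", "write"):
        scope, _, path = rest.partition(":")
        if scope == "fs" and path.startswith("/"):
            return Capability(head, path)
    elif head == "exec":
        binary, _, path = rest.partition(":")
        if binary and path.startswith("/"):
            return Capability("exec", path, binary)
    elif head == "net":
        host, _, port = rest.rpartition(":")
        if host and port.isdigit():
            return Capability("net", rest)
    elif head == "kaged" and rest in ("storage:read", "storage:write"):
        return Capability("storage", rest.partition(":")[2])
    raise ValueError(f"unrecognized capability: {text!r}")


def bwrap_argv(capabilities: List[str], command: List[str]) -> List[str]:
    """Compile a capability allowlist into a bwrap command line (sketch)."""
    caps = [parse_capability(text) for text in capabilities]
    argv = ["bwrap", "--die-with-parent"]
    for cap in caps:
        if cap.kind == "read":
            argv += ["--ro-bind", cap.target, cap.target]
        elif cap.kind == "write":
            argv += ["--bind", cap.target, cap.target]
    # No declared network capability means no network namespace access.
    if not any(cap.kind == "net" for cap in caps):
        argv.append("--unshare-net")
    return argv + list(command)
```

Rejecting unknown strings at parse time keeps the allowlist fail-closed: a typo in a manifest produces a load error instead of a silently wider sandbox.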

Trust boundaries

Plugin → daemon: Untrusted. The daemon validates every JSON-RPC payload against the method's schema, rejects bad inputs, and treats plugin output as untrusted strings.
Daemon → plugin: Trusted by the plugin. Plugins assume the daemon's calls are well-formed; the daemon is responsible for not corrupting plugins.
Plugin → plugin: No direct path. Plugins cannot talk to other plugins. If two plugins need to coordinate, they do it through daemon-mediated state or events.
Plugin → host: Mediated by the plugin's capability allowlist. The daemon spawns the plugin under bwrap (the same mechanism as cages; see ADR-0009), with an argv compiled from the manifest's capabilities.
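
To make the plugin → daemon boundary concrete, here is a sketch of how the daemon might check a single response line before using it. It covers only the JSON-RPC 2.0 envelope; per-method schema validation would sit on top of it and is not shown. The names parse_response and PluginError are hypothetical.

```python
import json
from typing import Any


class PluginError(Exception):
    """A well-formed JSON-RPC error returned by the plugin."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


def parse_response(line: str, expected_id: int) -> Any:
    """Return the result of a well-formed response.

    Raises ValueError for anything malformed (json.JSONDecodeError is a
    ValueError subclass) and PluginError for a valid error response.
    """
    payload = json.loads(line)
    if not isinstance(payload, dict):
        raise ValueError("response must be a JSON object")
    if payload.get("jsonrpc") != "2.0":
        raise ValueError("missing or wrong jsonrpc version")
    if payload.get("id") != expected_id:
        raise ValueError("response id does not match the request")
    has_result, has_error = "result" in payload, "error" in payload
    if has_result == has_error:
        raise ValueError("exactly one of result or error is required")
    if has_error:
        err = payload["error"]
        if (not isinstance(err, dict) or type(err.get("code")) is not int
                or not isinstance(err.get("message"), str)):
            raise ValueError("malformed error object")
        raise PluginError(err["code"], err["message"])
    return payload["result"]
```

Keeping malformed output (ValueError) distinct from a legitimate plugin error (PluginError) lets the daemon treat the former as a protocol violation worth an audit event and the latter as an ordinary failed call.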

Consequences

What this commits us to

A plugin SDK package. TypeScript SDK first (matches daemon's language; easiest to keep in sync). Reference implementations in Python and shell follow. The SDK handles the JSON-RPC framing, the initialize handshake, manifest authoring helpers, and the capability-allowlist contract.
A wire-protocol spec. docs/specs/plugin-host.md will be normative. Methods, error codes, notification taxonomy.
Per-plugin sandbox via bwrap. Same mechanism as cages. The plugin manifest's capability allowlist compiles to a bwrap argv (mostly read-only mounts, no network unless declared).
Per-plugin SQLite slice. Plugins get a logical schema in the daemon's database. The daemon brokers; plugins call kaged:storage methods.
Manifest schema versioning. kaged_api: 1 is the first contract. Bumping to 2 requires every plugin to update its manifest. Compatibility windows are documented per release.
A kaged plugin CLI surface. kaged plugin list, kaged plugin install <path>, kaged plugin enable/disable <name>, kaged plugin logs <name>.

What this forecloses

No in-process plugins for v0. A plugin cannot import daemon internals or run in the daemon's address space. The plugin SDK does not expose internal types beyond the JSON-RPC schema.
No automatic plugin updates from a registry. kaged doesn't phone home for plugin manifests; operators are responsible for updating plugins.
No plugin-to-plugin direct calls. All cross-plugin coordination is daemon-mediated. Keeps the topology tractable; prevents a plugin from impersonating another plugin's caller.
No "trusted" plugin tier. Every plugin is sandboxed, even first-party ones. The oh-my-pi adapter is no more trusted than a random community plugin. This is intentional.

What becomes easier

Language-agnostic plugin authoring. Bash + jq is enough to write a simple plugin. Python with a small SDK is enough for a real one. We don't gatekeep on TypeScript.
Plugin debugging. A plugin's stdin/stdout is a stream of JSON. tee, jq, replay scripts, all just work. We document this for plugin authors.
Plugin testing. Test a plugin in isolation by piping JSON-RPC requests to its stdin and asserting on stdout. No daemon needed.
Plugin crashes are recoverable. Plugin SIGSEGV → daemon notices EOF on the plugin's stdout → daemon restarts the plugin (with backoff) → operator sees an audit event. No cascade into the daemon.
Audit story. Every plugin call lives in the daemon's audit log with the user_id, the plugin name, the method, the payload (redacted if marked sensitive). Operators can answer "what did plugin X do this week?"
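
The isolated-testing claim can be sketched as a small harness that pipes requests into a plugin's stdin and parses what comes back on stdout. The inline toy plugin and the run_plugin helper are hypothetical stand-ins; a real test would point the command at the plugin's own startup command.

```python
import json
import subprocess
import sys
from typing import List

# A toy plugin, inlined so the example is self-contained.
PLUGIN_SOURCE = r'''
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    result = {"api_version": 1} if req["method"] == "initialize" else None
    print(json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result}),
          flush=True)
    if req["method"] == "shutdown":
        break
'''

DEMO_COMMAND = [sys.executable, "-c", PLUGIN_SOURCE]


def run_plugin(command: List[str], requests: List[dict],
               timeout: float = 5.0) -> List[dict]:
    """Send requests as line-delimited JSON and return the parsed responses."""
    payload = "".join(json.dumps(request) + "\n" for request in requests)
    proc = subprocess.run(command, input=payload, capture_output=True,
                          text=True, timeout=timeout)
    # stderr (proc.stderr) carries the plugin's logs and is ignored here.
    return [json.loads(line) for line in proc.stdout.splitlines() if line.strip()]
```

Calling run_plugin(DEMO_COMMAND, [...]) with an initialize request followed by shutdown returns two response objects, which a test can compare against expected values without a daemon running.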

What becomes harder

Performance for chatty plugins. Plugin call overhead is process-boundary + JSON-RPC parse cost. Plugins that need many small calls per second feel this. For the v0 plugin profile (preset browsing, ollama model listing, occasional install/run), it's irrelevant. If chattier use cases emerge, we add an in-process trusted-plugin tier later — but only with deliberate scope.
Schema discipline. Every plugin method's request and response is described by a JSON Schema. Drift between SDK types and plugin implementations is a class of bugs, mitigated by code-gen and CI on the schema.
Capability allowlist complexity. The allowlist grammar must be expressive enough for real plugins and restrictive enough to be a meaningful boundary. We'll iterate.
Plugin authors learn a new contract. Not Express, not FastAPI, not just-call-this-function — they implement JSON-RPC methods. SDKs reduce friction; documentation makes or breaks adoption.

Implementation notes (not normative)

JSON-RPC parsing: Strict mode. Reject batched requests on stdin (we never need them, they complicate buffering). One request per line. Plugin must not write to stdout outside the JSON-RPC channel.
Stdout discipline: Plugin authors who print(...) to stdout for "debugging" will break the protocol. SDK provides a log() helper that goes to stderr.
Restart policy: Daemon supervises plugins with exponential backoff (1s, 2s, 4s, 8s, capped at 60s). After N consecutive failures, plugin is marked failed and disabled; operator can re-enable via CLI/UI.
Manifest validation: JSON Schema for the manifest, like the project DSL (ADR-0006). Same validation discipline.
Sandbox composition: The plugin's bwrap argv is generated by the same cage compiler used for subagents (DRY win). Capabilities map to bwrap flags exactly as cage policies do.
Plugin-scoped SQLite: Each plugin gets a SQL schema prefix (plugin_pi_apps_*). The daemon mediates; plugins call kaged:storage:exec with their statements. No raw connection handed to the plugin.
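
The restart policy above reduces to a small pure function. The sketch below reproduces the stated schedule (1s, 2s, 4s, 8s, capped at 60s); the failure limit stands in for the unspecified N, and the value 10 is a placeholder, not a decision.

```python
from typing import Tuple

BASE_DELAY_S = 1
MAX_DELAY_S = 60
MAX_CONSECUTIVE_FAILURES = 10  # the note's "N"; 10 is only a placeholder


def restart_delay(consecutive_failures: int) -> int:
    """Seconds to wait before restarting after the given number of failures."""
    if consecutive_failures < 1:
        raise ValueError("consecutive_failures must be at least 1")
    return min(MAX_DELAY_S, BASE_DELAY_S * 2 ** (consecutive_failures - 1))


def next_action(consecutive_failures: int) -> Tuple[str, int]:
    """Decide whether to restart (and after how long) or disable the plugin."""
    if consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
        return ("disable", 0)
    return ("restart", restart_delay(consecutive_failures))
```

A supervisor would reset the failure counter once a plugin completes initialize and stays healthy, so an occasional crash does not accumulate toward the disable threshold. That reset rule is an assumption, not something the note specifies.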

Alternatives considered

Alternative A — In-process TypeScript module SDK

Why tempting: Lowest call overhead. Type safety end-to-end. Trivial code.

Why rejected: Erases the trust boundary. A plugin bug is a daemon crash. A plugin's require('child_process').exec() is a sandbox bypass. Forces every plugin author into TypeScript. The performance argument doesn't hold for our usage profile.

Alternative B — Subprocess + gRPC

Why tempting: Strong schemas (protobuf), bidirectional streaming, multiple mature implementations across languages.

Why rejected: Protobuf is the wrong default for self-hosted, hackable software. Operators inspect plugin payloads with jq; with gRPC they need grpcurl + a .proto file + reflection enabled. Debugging gets harder. The wire-format learning curve for plugin authors is higher. We can add a gRPC transport in v1.x for plugins that need streaming throughput; v0 is JSON-RPC.

Alternative C — WASM components

Why tempting: Strong sandbox (WASM runtime), language-agnostic via WIT bindings, modern and trendy.

Why rejected: Tooling immature for many plugin-author languages (especially bash, Python). WASI capability model is still settling. Many likely v0 adapters are shell scripts today; making them WASM is a complete rewrite. We can revisit when WASM components have a Python and shell on-ramp.

Alternative D — HTTP server plugins

Why tempting: Familiar pattern, easy to write in any language, easy to test with curl.

Why rejected: Plugins as long-running HTTP servers consume ports and require lifecycle management (the daemon has to know each plugin's port, retry on failure, manage TLS or trust localhost). Subprocess + stdio has the same isolation with simpler operational shape. HTTP plugins are also harder to sandbox — they'd want their own netns or a unix socket, which is back to subprocess-with-extra-steps.

Alternative E — No plugin model in v0

Why tempting: Defer entirely. Build the daemon, add plugins later.

Why rejected: The oh-my-pi and ollama adapters are explicitly cited in ADR-0001 and the architecture doc as how kaged engages the prior ecosystem. Shipping v0 without a plugin model means shipping v0 without that engagement. The plugin contract being a public surface means we want operator feedback on it sooner, not later.

Amendments

2026-05-21 — Project plugins and install-on-load

ADR-0011 established that projects must be shareable. A project may declare plugins it needs; the operator receiving the project may not have them installed. This amendment defines the two-tier plugin model that resolves the tension between "projects can ask for plugins" and "operators control what runs on their machine."

Two plugin scopes:

Local plugins. Installed and configured by the operator for their own use, in their local config. Available to any project that wants to use them (subject to the project's plugins: list). The operator's own customizations — a personal kubernetes-helper adapter, a logging-into-grafana plugin — live here. Never travel with projects.
Project plugins. Declared in the project DSL's plugins: list (project-dsl.md). When the project is loaded by a daemon, each declared plugin is checked against the local plugin store. Missing plugins prompt the operator: "this project wants plugin X version Y. Install?" If accepted, the plugin is installed into the local store and activated for this project only. If declined, the project loads in pending state and refuses sessions until resolved.

Install location is the same. Whether a plugin came from operator-local-install (kaged plugin install <path>) or from project-driven install (operator approved the install prompt while loading a project), it ends up in the same local plugin store under ${KAGED_HOME}/plugins/. There is no separate "project plugin directory."

Activation is per-project. A plugin installed via a project-load prompt is, by default, activated only for that project. Other projects that don't list it in their DSL don't see it. The operator can promote it to local-plugin status (active for any project) via kaged plugin promote <name> or in local config.

Install sources for project plugins:

A project DSL's plugins: entry may carry a source field that tells kaged where to fetch the plugin from if it isn't already installed:

plugins:
  - name: oh-my-pi
    source: "https://github.com/oh-my-pi/kaged-adapter"  # git URL
    version: "1.4.2"
    config:
      preset_dir: ./oh-my-pi-presets
  - name: custom-thing
    source: "./plugins/custom-thing"                      # relative path inside the project

Sources are advisory: kaged shows them to the operator at the install prompt, but the operator decides whether to trust the source. There is no auto-installation from the internet without explicit operator consent on the install prompt. The operator may also pre-install the plugin manually and skip the prompt entirely.

Trust at install time:

Project plugins are not auto-trusted. The install prompt shows:

The plugin name, version, and source.
The capability allowlist from its manifest (read:fs:..., exec:..., net:...).
A diff if a different version of the same plugin is already installed.

The operator approves or declines. Declining is fine — the project loads pending. There is no "install everything" shortcut.
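
The load-time check can be sketched as a comparison between the project's plugins: list and the local plugin store. Everything named here is illustrative: the InstallPrompt shape, exact-match version comparison, and the "ready" state name are assumptions (only "pending" comes from this amendment). The capability allowlist shown in the real prompt comes from the plugin's manifest and is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class InstallPrompt:
    name: str
    wanted: Optional[str]     # version requested by the project, if any
    source: Optional[str]     # advisory only; shown to the operator
    installed: Optional[str]  # version already in the local store, if any


def pending_prompts(declared: List[dict],
                    installed: Dict[str, str]) -> List[InstallPrompt]:
    """List the plugins the operator must be asked about."""
    prompts: List[InstallPrompt] = []
    for entry in declared:
        name, wanted = entry["name"], entry.get("version")
        have = installed.get(name)
        if have is not None and wanted in (None, have):
            continue  # already satisfied by the local store
        prompts.append(InstallPrompt(name, wanted, entry.get("source"), have))
    return prompts


def project_state(prompts: List[InstallPrompt], approved: Set[str]) -> str:
    """A project stays pending until every prompt has been approved."""
    if all(prompt.name in approved for prompt in prompts):
        return "ready"
    return "pending"
```

Each prompt is approved individually by name, which matches the rule that there is no "install everything" shortcut.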

Plugin lifecycle in per-project activation:

Spawned when the activating project's first session starts.
Restart/health/shutdown semantics unchanged from this ADR's main body.
Multiple projects activating the same plugin share one process (it's the same install in the local store). The plugin sees calls from all activating projects; the daemon passes the project ID with each request so plugins that care can scope behavior.
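
The shared-process rule can be illustrated with a small bookkeeping class: the first activation of a plugin triggers a spawn, later activations reuse the process, and every request carries the calling project's ID. The class name and the project_id parameter name are assumptions; the wire-protocol spec defines the real field.

```python
import json
from typing import Dict, Set


class SharedPluginPool:
    """Track which projects have activated which plugin (sketch)."""

    def __init__(self) -> None:
        self._activations: Dict[str, Set[str]] = {}
        self._next_id = 0

    def activate(self, plugin: str, project_id: str) -> bool:
        """Record an activation; return True if the plugin must be spawned."""
        projects = self._activations.setdefault(plugin, set())
        first_activation = not projects
        projects.add(project_id)
        return first_activation

    def request(self, plugin: str, project_id: str,
                method: str, params: dict) -> str:
        """Serialize a request line tagged with the calling project's ID."""
        if project_id not in self._activations.get(plugin, set()):
            raise PermissionError(f"{plugin} is not active for {project_id}")
        self._next_id += 1
        return json.dumps({
            "jsonrpc": "2.0",
            "id": self._next_id,
            "method": method,
            # "project_id" is a hypothetical parameter name.
            "params": {**params, "project_id": project_id},
        })
```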

Local config schema for plugins:

See local-config.md for the exact shape. Briefly:

[plugins.oh-my-pi]
installed = "1.4.2"
local = true                    # available to any project, not just project-activated
config = { preset_dir = "/opt/oh-my-pi/presets" }

[plugins.custom-internal]
installed = "0.3.0"
local = true
# (no `config` block — plugin defaults apply)

Plugins installed via project-load prompts start with local = false. The operator promotes them by setting local = true (via kaged plugin promote or by editing local config).

What does not change:

The plugin SDK and JSON-RPC contract.
The capability allowlist and sandbox.
The plugin manifest format (kaged-plugin.yaml).
The supervisor restart and health behavior.

This amendment only changes how plugins arrive on a machine and which projects they're active for. The runtime model is unchanged.

2026-05-25 — System plugins: trusted in-process tier

The original decision states "No 'trusted' plugin tier" and "No in-process plugins for v0." This amendment introduces a deliberate exception: system plugins — operator-installed TypeScript packages that run in the daemon process with full trust.

Why the exception:

Project plugins (the original model) are untrusted code from external sources, so sandboxing via subprocess + JSON-RPC is correct. But a class of daemon-level integrations must hook into daemon internals that are impractical to expose over JSON-RPC:

Auth lifecycle hooks (nonce generation, launch URL availability, cookie issuance)
Daemon boot/shutdown lifecycle
Transport-level events (WebSocket connections, HTTP request lifecycle)
Future: alternative transport adapters (Matrix bridge, IRC, etc.)

These hooks touch private daemon state. Exposing them over JSON-RPC would mean inventing a daemon-internal event bus, serializing transient state, and trusting a sandboxed process with security-critical auth data. The complexity cost exceeds the security benefit — the operator installing a system plugin is the same operator who runs the daemon binary.

Two plugin tiers (updated model):

System plugins (new). TypeScript (Bun-compatible) packages, dynamically imported by the daemon at startup. Run in-process. No sandbox. Operator installs them explicitly; they are never auto-installed from project DSL. Configured in local config under [system_plugins.<name>]. The daemon exposes a typed hook API; system plugins register callbacks. System plugins are the operator's own extensions to the daemon — they have the same trust level as the daemon itself.
Project plugins (unchanged). Subprocess + JSON-RPC + sandbox. Everything in the original decision and the 2026-05-21 amendment applies unchanged. Project plugins remain untrusted, sandboxed, and language-agnostic.

What does NOT change:

Project plugins: subprocess model, JSON-RPC contract, capability allowlist, manifest schema, supervisor behavior — all unchanged.
"Every plugin is sandboxed" becomes "every project plugin is sandboxed." System plugins are explicitly trusted.
The plugin SDK (@kaged/plugin-sdk) remains the project-plugin SDK. System plugins use a separate, smaller API surface (the daemon hook types).
--no-sandbox flag still only affects subagent cages, not project plugins.

Why this is not scope creep:

System plugins are deliberately minimal. They are:

TypeScript only (same language as the daemon — no language-agnostic contract needed).
In-process only (no subprocess, no JSON-RPC, no manifest, no capability grammar).
Hook-based only (they register callbacks on daemon lifecycle events — they don't declare methods or expose an API surface).
Operator-local only (never declared in project DSL, never travel with projects, never auto-installed).

The first system plugin (webhook-notify) demonstrates the pattern: it hooks onLaunchUrlReady and POSTs the URL to a configured webhook endpoint. ~30 lines of TypeScript. No sandbox overhead. No JSON-RPC framing. Just a callback.

Permission model (deferred):

System plugins run with full daemon trust in v0. A capability-scoped permission model for system plugins (e.g. "this plugin may only hook auth events, not storage events") is important but deferred. The spec documents this as a required future addition. See docs/specs/plugins/system-plugins.md.

Spec location:

System plugin specs live in docs/specs/plugins/ alongside individual plugin specs. The existing plugin-host.md remains the normative spec for project plugins (the subprocess model). See:

docs/specs/plugins/README.md — plugin ecosystem overview
docs/specs/plugins/system-plugins.md — system plugin framework
docs/specs/plugins/webhook-notify.md — first system plugin

References

docs/02-architecture.md — the plugin host component
docs/03-glossary.md — plugin, adapter, distinction from subagent
ADR-0001 — the "in-front" position that requires a plugin model
ADR-0004 — the runtime hosting the plugin host
ADR-0005 — the storage layer plugins access via brokered methods
ADR-0009 — the sandbox mechanism plugins are spawned under
ADR-0011 — project portability, the reason for the local/project split
docs/specs/local-config.md — plugin store schema in local config
JSON-RPC 2.0: https://www.jsonrpc.org/specification
Original discussion: design conversation with colleagues, 2026-05-21
Amendment (project plugins): colleagues, 2026-05-21