Spec: Project Terminals

Purpose

This spec defines the daemon-internal wiring that makes project terminals work end-to-end: from the operator clicking "Run" in the UI, through the daemon spawning a process (raw PTY or tmux), to terminal output streaming through the WebSocket to xterm.js in the browser.

This spec fills the gap identified in app-shell-plan.md §A9: the task-runner package has a state machine and backend resolution logic, but nothing executes the side effects those machines emit. This spec is the bridge.

This document is normative for:

  • The PTY broker — daemon-internal registry that owns PTY file descriptors (or tmux control-mode connections), buffers output, and routes I/O to WebSocket subscribers.
  • The tmux runtime driver — how the daemon interacts with tmux: session creation, control-mode attachment, keystroke relay, resize, and daemon-restart recovery.
  • The side-effect interpreter — how @kaged/task-runner side effects (allocate_pty, create_tmux_window, stream_output, etc.) become real I/O operations in the daemon.
  • The xterm.js component — the React component in packages/ui/ that wraps xterm.js, handles WebSocket attachment, resize, and reconnection.
  • The npm dependency contract — which xterm.js packages to install and how they're loaded.

It is not normative for:

  • The task model, task lifecycle, or DSL tasks: block (that's task-runner.md).
  • The HTTP API endpoints for launching, listing, or stopping tasks (that's task-runner.md §API surface and http-api.md).
  • The WebSocket frame structure or PTY channel protocol (that's http-api.md §PTY channel).
  • The xterm.js theme or brand-specific terminal config (that's ui/README.md §Terminal rendering — already implemented in packages/ui/src/terminal-theme.ts).
  • The tab-strip integration or terminal naming in the shell (that's ui/app-shell.md §Project terminals).
  • The sandbox mechanism (tasks are not caged — per task-runner.md).

Constraints (from ADRs)

Constraint | Source
Web UI is the product; terminal output renders in xterm.js in the browser | ADR-0002
Runtime is Bun; process spawning uses Bun.spawn; no node-pty, no Node.js APIs | ADR-0004, AGENTS.bun.md
Task metadata persisted to SQLite/Postgres | ADR-0005
Works identically in per-user and system-wide deployment modes | ADR-0010
Task definitions use project-relative paths | ADR-0011

Bun-specific constraint (binding)

Per AGENTS.bun.md and ADR-0004:

  • No node-pty. The daemon uses Bun's built-in PTY support. Bun.spawn with stdio: ["pipe", "pipe", "pipe"] provides raw process I/O. For true PTY allocation (terminal escape sequences, SIGWINCH, job control), the daemon uses the PTY functionality exposed by Bun's native API or falls back to spawning through tmux (which handles the PTY internally).
  • No ws package. WebSocket is Bun.serve() built-in.
  • No child_process. Bun.spawn is the only process-spawning API.

The tmux backend sidesteps the raw PTY question entirely: tmux owns the PTY, and the daemon communicates with tmux via control mode (tmux -C), which is plain text over stdin/stdout of a Bun.spawn child.


Architecture overview

Operator's browser                         Daemon process                              Host OS
┌─────────────┐                   ┌──────────────────────────────────┐     ┌──────────────────┐
│  xterm.js   │◀── WebSocket ───▶│  PTY Broker                      │     │                  │
│  component  │    (binary        │  ┌──────────────────────────────┐│     │  tmux server     │
│             │     frames)       │  │ PTY Registry                 ││     │  ┌────────────┐  │
│  FitAddon   │                   │  │  task-id → PtyHandle         ││     │  │ kaged-proj │  │
│  WebglAddon │                   │  │  ┌─────────┐ ┌────────────┐ ││     │  │  window 0  │  │
└─────────────┘                   │  │  │ output  │ │ scrollback │ ││     │  │  window 1  │  │
                                  │  │  │ ring    │ │ buffer     │ ││     │  └────────────┘  │
                                  │  │  └─────────┘ └────────────┘ ││     │                  │
                                  │  └──────────────────────────────┘│     │  Shell processes │
                                  │                                  │     │  (bash, zsh, sh) │
                                  │  Side-Effect Interpreter         │     └──────────────────┘
                                  │  ┌──────────────────────────────┐│
                                  │  │ TaskRunner state machine     ││
                                  │  │  → allocate_pty              ││
                                  │  │  → create_tmux_window        ││
                                  │  │  → stream_output             ││
                                  │  │  → send_sigterm              ││
                                  │  │  → destroy_pty               ││
                                  │  └──────────────────────────────┘│
                                  └──────────────────────────────────┘

Three layers, each with a single responsibility:

  1. Side-effect interpreter — receives TaskSideEffect values from the task-runner state machine and translates them into PTY broker calls and tmux driver commands.
  2. PTY broker — owns the registry of active PTY handles, buffers output in a ring buffer (10,000 lines), routes I/O between WebSocket subscribers and the underlying PTY or tmux pane.
  3. tmux runtime driver — manages the tmux server connection via control mode, parses structured output, maps pane IDs to task IDs, and handles daemon restart recovery.

PTY broker

The PTY broker is a daemon-internal registry. It does not expose an HTTP API — it is consumed by the side-effect interpreter and the WebSocket handler.

PtyHandle interface

interface PtyHandle {
  /** Task instance ID (ULID). */
  taskId: string;
  /** Project this task belongs to. */
  projectId: string;
  /** The backend kind. */
  backend: "pty" | "tmux";
  /** Write bytes to the process's stdin. */
  write(data: Uint8Array): void;
  /** Resize the terminal. */
  resize(cols: number, rows: number): void;
  /** Kill the process (SIGTERM, then SIGKILL after grace). */
  kill(): Promise<void>;
  /** Register a listener for output bytes. */
  onData(cb: (data: Uint8Array) => void): void;
  /** Register a listener for process exit. */
  onExit(cb: (exitCode: number) => void): void;
  /** Remove all listeners and release resources. */
  dispose(): void;
  /** Current terminal dimensions. */
  cols: number;
  rows: number;
}

PtyBroker interface

interface PtyBroker {
  /** Spawn a process and register it. Returns the handle. */
  spawn(opts: PtySpawnOptions): Promise<PtyHandle>;
  /** Attach to an existing tmux pane (for daemon-restart recovery). */
  reattach(opts: PtyReattachOptions): Promise<PtyHandle>;
  /** Look up a handle by task ID. */
  get(taskId: string): PtyHandle | undefined;
  /** List all active handles for a project. */
  listByProject(projectId: string): PtyHandle[];
  /** Total active handles. */
  count(): number;
  /** Shut down all handles (daemon shutdown). */
  disposeAll(): Promise<void>;
}

interface PtySpawnOptions {
  taskId: string;
  projectId: string;
  command: string;
  cwd: string;
  env?: Record<string, string>;
  backend: "pty" | "tmux";
  /** tmux-specific: session and window names. */
  tmuxSession?: string;
  tmuxWindow?: string;
  /** Initial terminal dimensions (from the first subscriber, or defaults). */
  cols?: number;
  rows?: number;
}

interface PtyReattachOptions {
  taskId: string;
  projectId: string;
  tmuxSession: string;
  tmuxWindow: string;
}

Output ring buffer

Each PtyHandle maintains a ring buffer of recent output. This is the scrollback that gets replayed when an operator reconnects.

  • Capacity: 10,000 lines (matches task-runner.md §Operating limits and ui/README.md §xterm.js integration scrollback: 10000).
  • Implementation: use @kaged/utils Ring (fixed-capacity circular buffer) with raw Uint8Array chunks. Line counting is approximate (count \n bytes) for the limit; the buffer stores raw bytes, not parsed lines.
  • Replay on subscribe: when a WebSocket client subscribes to a PTY channel (pty:task:<task_id>), the broker sends the ring buffer contents as an initial burst before switching to live streaming.
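
The buffering rules above can be sketched as follows. This is a minimal illustration, not the @kaged/utils Ring API: the OutputScrollback class and its method names are hypothetical, and it evicts whole chunks, so the line limit is approximate in the same sense the spec describes.

```typescript
// Hypothetical stand-in for a scrollback built on @kaged/utils Ring.
// Stores raw byte chunks and counts 0x0A bytes to approximate lines.
class OutputScrollback {
  private readonly maxLines: number;
  private chunks: Uint8Array[] = [];
  private lineCounts: number[] = [];
  private totalLines = 0;

  constructor(maxLines = 10000) {
    this.maxLines = maxLines;
  }

  /** Append a raw chunk, then evict the oldest chunks while over the limit. */
  push(chunk: Uint8Array): void {
    let lines = 0;
    chunk.forEach((byte) => {
      if (byte === 0x0a) lines += 1;
    });
    this.chunks.push(chunk);
    this.lineCounts.push(lines);
    this.totalLines += lines;
    // Always keep the newest chunk, even if it alone exceeds the limit.
    while (this.totalLines > this.maxLines && this.chunks.length > 1) {
      this.chunks.shift();
      this.totalLines -= this.lineCounts.shift() ?? 0;
    }
  }

  /** Chunks to send as the initial burst when a client subscribes. */
  replay(): Uint8Array[] {
    return this.chunks.slice();
  }

  approximateLines(): number {
    return this.totalLines;
  }
}
```

A subscriber would receive replay() first and only then be added to the live onData fan-out, so no chunk is delivered twice.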

Concurrency enforcement

The broker enforces the per-project PTY limit:

  • Default: 8 concurrent PTYs per project (from task-runner.md §Operating limits, aligned with session-manager.md PTY limits).
  • Shared limit: project terminals and subagent PTYs count against the same pool (per ui/app-shell.md §Limits).
  • Enforcement: spawn() checks listByProject(projectId).length before allocating. If at limit, it throws a typed error that the side-effect interpreter translates to a 429 rate_limited API response.
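
A minimal sketch of the enforcement step, assuming a hypothetical PtyLimitError class and helper names (the spec only requires "a typed error"); the count passed in would come from listByProject(projectId).length and must include subagent PTYs, since they share the pool.

```typescript
// Hypothetical typed error; the class name and code string are assumptions.
class PtyLimitError extends Error {
  readonly code = "pty_limit_reached";
  readonly projectId: string;
  readonly limit: number;

  constructor(projectId: string, limit: number) {
    super(`project ${projectId} already has ${limit} active PTYs`);
    this.name = "PtyLimitError";
    this.projectId = projectId;
    this.limit = limit;
  }
}

/** Throws when the project is already at its PTY limit (default 8). */
function assertPtyCapacity(projectId: string, activeCount: number, limit = 8): void {
  if (activeCount >= limit) {
    throw new PtyLimitError(projectId, limit);
  }
}

/** How the interpreter might translate the typed error into an API error. */
function toApiError(err: unknown): { status: number; code: string } | undefined {
  if (err instanceof PtyLimitError) {
    return { status: 429, code: "rate_limited" };
  }
  return undefined;
}
```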

tmux runtime driver

When the resolved backend is tmux, the daemon manages tmux sessions via the tmux CLI. The driver is a stateful singleton created at daemon startup.

tmux session model

Per task-runner.md §tmux session naming:

  • One tmux session per project: kaged-<project_slug> (e.g., kaged-music-site).
  • One tmux window per task: task-<task_slug_or_id> (e.g., task-test, task-01HXAB).
  • The daemon owns these sessions. It creates them on first task launch per project and destroys them when the project is unloaded.
  • All daemon tmux traffic runs on a dedicated private tmux server. Every tmux invocation MUST use -L <socketName> -f /dev/null, where socketName is derived from the daemon home directory (config.daemon.home) via a stable short hash. The daemon MUST NOT touch the operator's default tmux server or personal tmux config.
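
One way to derive the socket name and the mandatory argument prefix is sketched below. The spec only asks for "a stable short hash"; the choice of 32-bit FNV-1a, the kaged- prefix on the socket name, and the function names are all assumptions made for illustration.

```typescript
/** 32-bit FNV-1a over the UTF-8 bytes of the input, as 8 hex digits. */
function fnv1a32Hex(input: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  new TextEncoder().encode(input).forEach((byte) => {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime, kept unsigned
  });
  return hash.toString(16).padStart(8, "0");
}

/** Hypothetical naming scheme: the same daemon home always maps to the same socket. */
function tmuxSocketName(daemonHome: string): string {
  return `kaged-${fnv1a32Hex(daemonHome)}`;
}

/** Prepends the -L <socketName> -f /dev/null prefix that every tmux invocation must carry. */
function tmuxArgv(socketName: string, ...subcommand: string[]): string[] {
  return ["tmux", "-L", socketName, "-f", "/dev/null", ...subcommand];
}
```

Routing every call through a single helper like tmuxArgv makes it hard to accidentally address the operator's default server.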

Control mode connection

Per task-runner.md §tmux attachment mode, the daemon connects to tmux via control mode (tmux -C), not raw tmux attach. This keeps tmux chrome (status bar, key bindings, prefix) invisible to the operator's xterm.js.

interface TmuxDriver {
  /** Ensure the tmux session exists. Creates it if not. */
  ensureSession(sessionName: string, cwd: string, cols?: number, rows?: number): Promise<void>;
  /** Create a new window in an existing session and run a command. */
  createWindow(sessionName: string, windowName: string, command: string, cwd: string, env?: Record<string, string>, cols?: number, rows?: number): Promise<TmuxPaneInfo>;
  /** Attach to a session in control mode. Returns a handle for I/O. */
  attachControlMode(sessionName: string): Promise<TmuxControlConnection>;
  /** Send keystrokes to a specific pane. */
  sendKeys(sessionName: string, paneId: string, data: string): Promise<void>;
  /** Resize a pane. */
  resizePane(sessionName: string, paneId: string, cols: number, rows: number): Promise<void>;
  /** Resize a task window authoritatively. */
  resizeWindow(sessionName: string, windowName: string, cols: number, rows: number): Promise<void>;
  /** Kill a specific window (graceful: C-c first, then kill-window). */
  killWindow(sessionName: string, windowName: string): Promise<void>;
  /** Kill an entire session. */
  killSession(sessionName: string): Promise<void>;
  /** List existing kaged-* sessions for recovery. */
  listKagedSessions(): Promise<TmuxSessionInfo[]>;
  /** List windows in a session. */
  listWindows(sessionName: string): Promise<TmuxWindowInfo[]>;
  /** Read the scrollback buffer of a pane. */
  capturePane(sessionName: string, paneId: string, lines: number): Promise<string>;
}

interface TmuxControlConnection {
  /** Stream of per-pane output events. */
  onPaneOutput(cb: (paneId: string, data: Uint8Array) => void): void;
  /** Stream of window/pane lifecycle events. */
  onEvent(cb: (event: TmuxControlEvent) => void): void;
  /** Close the control-mode connection. */
  close(): void;
}

interface TmuxPaneInfo {
  paneId: string;
  windowName: string;
  pid: number;
}

interface TmuxSessionInfo {
  name: string;
  windowCount: number;
  created: number;
}

interface TmuxWindowInfo {
  name: string;
  paneId: string;
  pid: number;
  active: boolean;
}

type TmuxControlEvent =
  | { type: "window-add"; windowName: string; paneId: string }
  | { type: "window-close"; windowName: string }
  | { type: "pane-exited"; paneId: string; exitCode: number }
  | { type: "session-closed"; sessionName: string };

Control mode implementation

The daemon spawns tmux -L <socketName> -f /dev/null -C attach-session -t <session> -f ignore-size via Bun.spawn. The -L <socketName> selector binds the daemon to its dedicated tmux server, the first -f /dev/null ignores the operator's personal tmux config (per task-runner.md open question #6), and the trailing -f ignore-size is required because tmux control-mode clients otherwise participate in window-size arbitration.

Every other tmux call uses the same prefix ordering before the subcommand:

tmux -L <socketName> -f /dev/null new-session ...
tmux -L <socketName> -f /dev/null set-option ...
tmux -L <socketName> -f /dev/null list-sessions ...

This isolation is load-bearing: server-global tmux options such as set-option -g exit-empty off, along with the per-session window-size manual and default-size settings described below, must apply only to the daemon's own tmux server, never to the operator's shared default server.

Command ordering and option scope (required): because the daemon's tmux server is private and starts empty, ensureSession MUST issue new-session before any set-option. A set-option against a socket whose server is not yet running fails with no server running / error connecting to <socket>. Order: has-session check → new-session (if absent) → set-option -g exit-empty off → set-option -t <session> window-size manual → set-option -t <session> default-size <cols>x<rows>.

Three tmux behaviors are load-bearing here (verified against tmux 3.6b):

  • exit-empty off (server-global) keeps the private server alive when a window's process exits, so the server survives between commands and across daemon restarts. Without it, the server tears itself down ("server exited unexpectedly").
  • window-size/default-size are set PER-SESSION (-t), never globally (-g). A global window-size manual causes the subsequent new-window to abort the server ("server exited unexpectedly") on tmux 3.6b. The per-session option is honored and is sufficient for operator-driven resize-window to stick.
  • new-window targets <session>: (trailing colon), which selects the next free window index. Plain -t <session> resolves to index 0 and collides with the session's default window (create window failed: index 0 in use).
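
The ordering and scoping rules above can be expressed as a pure planning step that returns tmux subcommands in the order they must run. This is a sketch: the function names are hypothetical, the has-session result is passed in by the caller, and the -d (detached) flag on new-session is an assumption, on the grounds that the daemon has no terminal to attach. Each subcommand would still need the -L <socketName> -f /dev/null prefix.

```typescript
/** Subcommands for ensureSession, in required order. */
function planEnsureSession(
  sessionExists: boolean,
  session: string,
  cwd: string,
  cols: number,
  rows: number,
): string[][] {
  const plan: string[][] = [];
  if (!sessionExists) {
    // new-session must come first: it is what starts the private server.
    plan.push(["new-session", "-d", "-s", session, "-c", cwd, "-x", String(cols), "-y", String(rows)]);
  }
  // Server-global: keep the server alive when a window's process exits.
  plan.push(["set-option", "-g", "exit-empty", "off"]);
  // Per-session (-t), never -g; applied even when the session already exists.
  plan.push(["set-option", "-t", session, "window-size", "manual"]);
  plan.push(["set-option", "-t", session, "default-size", `${cols}x${rows}`]);
  return plan;
}

/** new-window targets "<session>:" so tmux picks the next free window index. */
function planNewWindow(session: string, windowName: string, cwd: string, command: string): string[] {
  return ["new-window", "-t", `${session}:`, "-n", windowName, "-c", cwd, command];
}
```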

The driver forces window-size manual per session (applied even when the session already exists):

tmux set-option -t <session> window-size manual
tmux set-option -t <session> default-size 120x40

This must NOT be set globally (-g window-size manual): on tmux 3.6b that makes the next new-window abort the server. Without the per-session option, the control-mode client clamps the session toward its own terminal size and browser-driven resizes don't stick, leaving full-screen TUIs in a stale 80-column corner with ghosted redraws.

Control-mode output decoding (byte-correct)

The control-mode reader MUST parse tmux -C output as raw bytes, never via a UTF-8 TextDecoder. Per tmux control.c (control_append_data), in a %output/%extended-output line tmux escapes only bytes 0x00–0x1F and 0x5C (\) as \ddd (3-digit octal); all other bytes — including 0x7F and the entire high range 0x80–0xFF — are passed through raw. Consequences for the parser:

  • Read the child's stdout into a byte buffer and split records on raw byte 0x0A. tmux escapes \n/\r inside payloads as \012/\015, so 0x0A only ever terminates a record — line-splitting on it is safe.
  • Decode the %output payload byte-wise: \ddd → one byte; copy every other byte unchanged. This preserves multibyte UTF-8 (e.g. Nerd Font Private-Use-Area glyphs like U+F07B = EF 81 BB), ESC/CSI sequences, and arbitrary 8-bit data.
  • A literal backslash is emitted by tmux as \134, never \\.

Decoding the stream as a UTF-8 string (or applying charCodeAt to a decoded string) corrupts the byte stream: invalid UTF-8 becomes U+FFFD, and any code point > 0xFF truncates to 8 bits (U+F07B → 0x7B). That manifests as garbled Nerd Font glyphs and lost clear/alternate-screen escape sequences (TUI ghosting) in the browser terminal: xterm.js renders the byte stream faithfully, but the stream was already corrupt before it left the daemon.
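
The byte-wise rules can be sketched as two small functions: one splits the child's stdout buffer into records on raw 0x0A, the other unescapes a %output payload. The function names are hypothetical, and the sketch copies a backslash unchanged when it is not followed by three octal digits, which is an assumption about how to treat input tmux is not expected to produce.

```typescript
/** Split on raw 0x0A; returns complete records plus the unterminated tail. */
function splitRecords(buffer: Uint8Array): { records: Uint8Array[]; rest: Uint8Array } {
  const records: Uint8Array[] = [];
  let start = 0;
  for (let i = 0; i < buffer.length; i++) {
    if (buffer[i] === 0x0a) {
      records.push(buffer.subarray(start, i));
      start = i + 1;
    }
  }
  return { records, rest: buffer.subarray(start) };
}

const isOctalDigit = (b: number): boolean => b >= 0x30 && b <= 0x37;

/** Decode \ddd (3-digit octal) to one byte; copy every other byte unchanged. */
function decodeTmuxOutputPayload(payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(payload.length); // decoded output is never longer
  let written = 0;
  for (let i = 0; i < payload.length; i++) {
    const b = payload[i];
    if (
      b === 0x5c && // backslash
      i + 3 < payload.length &&
      isOctalDigit(payload[i + 1]) &&
      isOctalDigit(payload[i + 2]) &&
      isOctalDigit(payload[i + 3])
    ) {
      const value =
        ((payload[i + 1] - 0x30) << 6) | ((payload[i + 2] - 0x30) << 3) | (payload[i + 3] - 0x30);
      out[written++] = value & 0xff;
      i += 3; // skip the three digits just consumed
    } else {
      out[written++] = b;
    }
  }
  return out.subarray(0, written);
}
```

Because neither function ever builds a string, high bytes such as EF 81 BB pass through untouched.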

The per-session default-size set by the driver is a fallback that covers windows born before the browser sends its first pty.resize frame.

Control-mode output is a structured text stream. Each notification line starts with %. The daemon parses:

Line prefix | Meaning | Action
%output <pane_id> <data> | Pane produced output | Route data (octal-unescaped byte-wise, as described above) to the corresponding PtyHandle.onData listeners
%window-add <window_id> | A window was created | Update internal window→pane mapping
%window-close <window_id> | A window was closed | Fire PtyHandle.onExit for the associated task
%pane-exited <pane_id> <exit_code> | Pane process exited | Fire PtyHandle.onExit with the exit code
%session-closed | The tmux session was destroyed | Mark all tasks in this session as failed with error: "tmux_session_closed"

Input to a pane is sent via the control-mode stdin: send-keys -t <pane_id> <key_data>.

The send-keys command uses -l to send literal characters (not tmux key names) and -- to end option parsing. Neither stops tmux from treating ; as a command separator at the CLI level, so the driver MUST escape standalone ; characters as \; before passing them to send-keys. With -l, tmux's command parser interprets \; as a single literal ; rather than a \ followed by ;, so the keystroke reaches the pane intact instead of splitting the invocation into multiple tmux commands.
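
A minimal sketch of the escaping rule, assuming the keystrokes are passed as a single argv element. It reads "standalone" as an argument consisting solely of ;, and whether a trailing ; inside a longer string also needs escaping is not settled by the text above. The function names are hypothetical.

```typescript
/** Escape an argument that is exactly ";" so tmux does not treat it as a separator. */
function escapeSendKeysLiteral(data: string): string {
  return data === ";" ? "\\;" : data;
}

/** Build the send-keys subcommand: -l for literal keys, -- to end option parsing. */
function sendKeysCommand(paneId: string, data: string): string[] {
  return ["send-keys", "-t", paneId, "-l", "--", escapeSendKeysLiteral(data)];
}
```

For example, sendKeysCommand("%3", ";") yields a final argument of \; (two characters), while ordinary input such as "ls -la" passes through unchanged.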

Resize is authoritative at the tmux window level and may also be mirrored at the pane level:

tmux resize-window -t <session>:<window> -x <cols> -y <rows>
tmux resize-pane -t <pane_id> -x <cols> -y <rows>

For kaged-managed task terminals, each task is one tmux window with one pane, so resize-window is the authoritative size operation. resize-pane may still be issued as a belt-and-suspenders update, but it is not sufficient on its own when tmux is still arbitrating window size against an attached control client.

Initial terminal dimensions flow from the task-launch request when the client knows them. For the tmux backend, the driver uses those dimensions in two places:

  1. new-session -x <cols> -y <rows> when creating the project's first tmux session.
  2. resize-window -x <cols> -y <rows> immediately after new-window, so a new window in an already-existing session still starts at the operator's requested size.

Each new task window MUST also disable aggressive-resize:

tmux set-window-option -t <session>:<window> aggressive-resize off

This keeps full-screen TUIs from participating in their own resize tug-of-war inside kaged-managed one-pane task windows.

If launch-time dimensions are absent, the daemon falls back to the same default as the PTY broker (80x24) until the browser subscribes and sends the first pty.resize frame.
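
The sizing steps for a new task window can be sketched as below. The function names are hypothetical, and treating non-positive or non-integer dimensions the same as absent ones is an assumption of this sketch rather than something the spec states.

```typescript
interface TerminalSize {
  cols: number;
  rows: number;
}

const FALLBACK_SIZE: TerminalSize = { cols: 80, rows: 24 };

/** Use launch-time dimensions when both are usable, otherwise fall back to 80x24. */
function resolveInitialSize(cols?: number, rows?: number): TerminalSize {
  const usable = (n: number | undefined): n is number =>
    n !== undefined && Number.isInteger(n) && n > 0;
  return usable(cols) && usable(rows) ? { cols, rows } : { ...FALLBACK_SIZE };
}

/** Subcommands to run immediately after new-window for a task window. */
function newWindowSizingCommands(session: string, windowName: string, size: TerminalSize): string[][] {
  const target = `${session}:${windowName}`;
  return [
    ["resize-window", "-t", target, "-x", String(size.cols), "-y", String(size.rows)],
    ["set-window-option", "-t", target, "aggressive-resize", "off"],
  ];
}
```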

Fallback: pipe mode

If control mode is unavailable (tmux < 3.2 slipped through, unusual build), the driver falls back to pipe mode per task-runner.md §Fallback:

  1. Output: tmux pipe-pane -t <pane> 'cat > <fifo>' — daemon reads the FIFO.
  2. Input: tmux send-keys -t <pane> <data>.
  3. Resize: tmux resize-pane -t <pane> -x <cols> -y <rows>.

Pipe mode is less efficient (polling, no structured events) but functional. The daemon logs task_runner.tmux_pipe_fallback at startup.

Raw PTY backend

When the backend is pty (no tmux), the daemon spawns the process directly via Bun.spawn:

const proc = Bun.spawn(["sh", "-c", command], {
  cwd,
  env: { ...Bun.env, ...taskEnv },
  stdin: "pipe",
  stdout: "pipe",
  stderr: "pipe",
});

The PtyHandle wraps this process:

  • write() → proc.stdin.write(data)
  • onData() → reads from proc.stdout and proc.stderr (merged into a single stream for the terminal)
  • resize() → no-op for raw pipe mode (true PTY resize requires the tmux backend or Bun's native PTY API when available)
  • kill() → proc.kill("SIGTERM"), then proc.kill("SIGKILL") after 5 seconds
  • onExit() → proc.exited.then(code => cb(code))

Limitation: raw Bun.spawn with pipe stdio does not allocate a real PTY. Programs that check isatty() will get false, colors may be suppressed, and SIGWINCH-based resize won't work. This is acceptable for the pty fallback backend — the tmux backend is the recommended default precisely because it handles real PTY allocation. The backend resolution table in task-runner.md §Backend resolution defaults to tmux when available.


Side-effect interpreter

The interpreter is a daemon-internal module (packages/daemon/src/runtime/interpret-task-effects.ts) that receives TaskSideEffect[] from the task-runner state machine and executes them against the PTY broker, tmux driver, storage adapter, and WebSocket registry.

Effect → action mapping

Side effect | Action
persist_instance | Write TaskInstanceRecord to SQLite via StorageAdapter
emit_task_launched | Publish task.launched on the project's task WebSocket events channel
allocate_pty | Call PtyBroker.spawn() with backend: "pty"
create_tmux_window | Call TmuxDriver.ensureSession(cols?, rows?) then TmuxDriver.createWindow(cols?, rows?), then PtyBroker.spawn() with backend: "tmux"
stream_output | Register PtyHandle.onData() → publish bytes on WebSocket pty:task:<task_id> channel
persist_exit | Update TaskInstanceRecord in storage (state, exit_code, exited_at, duration_ms)
persist_transcript | Read the PtyHandle's ring buffer, write to task_transcripts table
send_sigterm | Call PtyHandle.kill() (which sends SIGTERM then SIGKILL)
send_sigkill | (Handled by PtyHandle.kill()'s escalation timer)
destroy_pty | Call PtyHandle.dispose(), which removes the handle from the broker registry and releases the fd/connection
emit_task_exited | Publish task.exited on the project's task WebSocket events channel
emit_task_stopped | Publish task.stopped on the project's task WebSocket events channel
persist_stop | Update TaskInstanceRecord in storage (state → stopped)
persist_failure | Update TaskInstanceRecord in storage (state → failed, error message)
emit_task_state | Publish task.state on the project's task WebSocket events channel

Execution order

Side effects from a single state transition are executed in array order. If an effect fails, the interpreter logs the error and continues with remaining effects — partial completion is better than rolling back a PTY spawn because a WebSocket publish failed.

The one exception: if allocate_pty or create_tmux_window fails, the interpreter stops and feeds a spawn_failed event back to the state machine, which transitions the task to failed.
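
The execution-order rules can be sketched as a small loop. This is a simplified illustration: SketchEffect stands in for the real TaskSideEffect type, the handler and logger are injected, and the handlers are synchronous here, whereas real handlers would presumably be awaited one at a time.

```typescript
// Simplified stand-in for @kaged/task-runner's TaskSideEffect.
interface SketchEffect {
  type: string;
}

interface InterpretResult {
  executed: string[];
  failed: string[];
  /** Event to feed back into the state machine, if a spawn effect failed. */
  feedback?: { type: "spawn_failed"; reason: string };
}

const SPAWN_EFFECTS = new Set(["allocate_pty", "create_tmux_window"]);

function interpretEffects(
  effects: SketchEffect[],
  handle: (effect: SketchEffect) => void,
  log: (message: string) => void,
): InterpretResult {
  const result: InterpretResult = { executed: [], failed: [] };
  for (const effect of effects) {
    try {
      handle(effect); // array order is execution order
      result.executed.push(effect.type);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      log(`effect ${effect.type} failed: ${reason}`);
      result.failed.push(effect.type);
      if (SPAWN_EFFECTS.has(effect.type)) {
        // The one exception: stop and report spawn_failed.
        result.feedback = { type: "spawn_failed", reason };
        break;
      }
      // Any other failure: keep going with the remaining effects.
    }
  }
  return result;
}
```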


WebSocket integration

Task terminal I/O flows through the existing WebSocket infrastructure defined in http-api.md §PTY channel. This spec adds task-specific addressing.

Task WebSocket endpoint

Per task-runner.md §Task WebSocket:

GET /api/v1/projects/:id/tasks/socket → WebSocket upgrade

This socket uses the same frame structure as the session socket (channel, seq, type, payload) and supports the same control-channel commands (hello, subscribe, ping).

PTY channel addressing for tasks

Task PTYs are addressed as pty:task:<task_id> (distinct from subagent PTYs which use pty:<invocation_id>).

Client subscribes:

{ "channel": "control", "type": "subscribe", "payload": { "channels": ["pty:task:01HXAB..."] } }

Server streams binary frames with the same format as subagent PTYs:

  • 0x01 prefix + raw terminal bytes (stdout/stderr from the task process)

Client sends input:

  • Binary frame: 0x01 prefix + keystroke bytes

Client sends resize:

{ "channel": "control", "type": "pty.resize", "payload": { "task_id": "01HXAB...", "cols": 120, "rows": 40 } }
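
The addressing and framing above can be sketched with a few helpers. The helper names are hypothetical; the frame shapes follow the examples in this section (the authoritative frame structure is in http-api.md §PTY channel).

```typescript
const PTY_DATA_PREFIX = 0x01;

/** Channel name for a task PTY. */
function taskPtyChannel(taskId: string): string {
  return `pty:task:${taskId}`;
}

/** Binary frame: 0x01 prefix followed by the raw bytes. */
function encodePtyFrame(data: Uint8Array): Uint8Array {
  const frame = new Uint8Array(data.length + 1);
  frame[0] = PTY_DATA_PREFIX;
  frame.set(data, 1);
  return frame;
}

/** Returns the payload bytes, or undefined if the frame is not a PTY data frame. */
function decodePtyFrame(frame: Uint8Array): Uint8Array | undefined {
  if (frame.length === 0 || frame[0] !== PTY_DATA_PREFIX) return undefined;
  return frame.subarray(1);
}

/** JSON control frame announcing new terminal dimensions. */
function resizeControlFrame(taskId: string, cols: number, rows: number): string {
  return JSON.stringify({
    channel: "control",
    type: "pty.resize",
    payload: { task_id: taskId, cols, rows },
  });
}
```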

Scrollback replay on subscribe

When a client subscribes to a task PTY channel:

  1. The broker reads the ring buffer for that task.
  2. Sends the buffered output as an initial burst of binary frames.
  3. Switches to live streaming.

This handles the reconnection case: operator closes browser, reopens, subscribes to the same task — they see the recent output immediately.

Events channel for tasks

The task socket's events channel emits task lifecycle events:

type | payload | When
task.launched | { "task_id": "...", "task_name": "test", "command": "bun test", "backend": "tmux" } | Task started
task.state | { "task_id": "...", "state": "running", "from": "starting" } | Task state changed
task.exited | { "task_id": "...", "exit_code": 0, "duration_ms": 12345 } | Process exited
task.stopped | { "task_id": "...", "stopped_by": "operator" } | Operator stopped the task

These events invalidate TanStack Query caches in the UI (same pattern as session run.started/run.ended events).


Daemon lifecycle

Startup (tmux recovery)

During the daemon's self_check phase (per daemon.md Phase 2), when the tmux backend is resolved:

  1. Scan for existing sessions: TmuxDriver.listKagedSessions() finds all kaged-* tmux sessions on the daemon's private tmux server.
  2. Cross-reference with storage: for each session, list its windows via TmuxDriver.listWindows() and match against task_instances records in the database.
  3. Re-attach running tasks: for tasks in running state whose tmux window still exists:
    • Call PtyBroker.reattach() to create a new PtyHandle connected to the existing tmux pane.
    • Update the task_instances record with the current PID (it may have changed if the shell respawned within tmux).
    • Log task_runner.reattached: task_id=<id> tmux_session=<name> tmux_window=<window>.
  4. Reconcile exited tasks: for tasks in running state whose tmux window no longer exists:
    • Read the exit status from tmux's dead-pane metadata (if available) or mark as failed with error: "exited_while_daemon_down".
    • Capture whatever scrollback remains via TmuxDriver.capturePane() and persist the transcript.
    • Update the task_instances record.
  5. Handle orphaned sessions: kaged-* sessions with no matching project in the registry are logged as task_runner.orphaned_session and left alone. The operator cleans them up manually.
  6. Connect control mode: for each active session that has re-attached tasks, start a control-mode connection for live I/O routing.
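
The cross-referencing in steps 1 to 6 can be sketched as a pure planning function that decides what to do before any I/O happens. The record shapes and names here are hypothetical simplifications of TmuxSessionInfo, TmuxWindowInfo, and the task_instances rows.

```typescript
interface SketchTmuxSession {
  name: string;
  windows: string[]; // window names found via listWindows()
}

interface SketchRunningTask {
  taskId: string;
  tmuxSession: string;
  tmuxWindow: string;
}

interface RecoveryPlan {
  reattach: SketchRunningTask[]; // window still exists
  reconcile: SketchRunningTask[]; // window is gone
  orphanedSessions: string[]; // no matching project; log and leave alone
  controlModeSessions: string[]; // sessions that need a control-mode connection
}

function planTmuxRecovery(
  sessions: SketchTmuxSession[],
  runningTasks: SketchRunningTask[],
  projectSessionNames: Set<string>,
): RecoveryPlan {
  const windowsBySession = new Map<string, Set<string>>();
  for (const session of sessions) {
    windowsBySession.set(session.name, new Set(session.windows));
  }
  const plan: RecoveryPlan = { reattach: [], reconcile: [], orphanedSessions: [], controlModeSessions: [] };
  const needsControl = new Set<string>();
  for (const task of runningTasks) {
    const windows = windowsBySession.get(task.tmuxSession);
    if (windows !== undefined && windows.has(task.tmuxWindow)) {
      plan.reattach.push(task);
      needsControl.add(task.tmuxSession);
    } else {
      plan.reconcile.push(task);
    }
  }
  for (const session of sessions) {
    if (!projectSessionNames.has(session.name)) plan.orphanedSessions.push(session.name);
  }
  plan.controlModeSessions = Array.from(needsControl);
  return plan;
}
```

Keeping the decision separate from the tmux and storage calls also makes the restart-recovery test in the testing notes easier to write without a live tmux server.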

Startup (raw PTY)

When the backend is pty, tasks do not survive daemon restart:

  1. Scan task_instances for records in running state.
  2. Mark them all as failed with error: "daemon_restart".
  3. Log task_runner.orphaned_pty_tasks: count=<N>.

Shutdown

On daemon shutdown (SIGTERM received):

  1. tmux backend: close all control-mode connections. Do NOT kill the tmux sessions — running tasks continue in tmux. The daemon is disposable; tmux is the persistence layer.
  2. Raw PTY backend: send SIGTERM to all child processes. Wait up to 5 seconds. Send SIGKILL to survivors. Mark all as stopped in storage.
  3. Drain WebSocket: send closing { code: "server_shutdown" } on all task sockets. This matches the session socket behaviour.

Project unload

When a project is unloaded via DELETE /api/v1/projects/:id:

  1. Stop all running tasks for the project (SIGTERM → SIGKILL).
  2. If tmux backend: TmuxDriver.killSession("kaged-<project_slug>").
  3. Remove all PtyHandle entries from the broker.
  4. Persist final states to storage.

xterm.js UI component

npm dependencies

Added to packages/ui/package.json:

Package | Version (pinned, exact) | Purpose
@xterm/xterm | 6.1.0-beta.274 | Core terminal emulator
@xterm/addon-fit | 0.12.0-beta.274 | Auto-resize terminal to container
@xterm/addon-webgl | 0.20.0-beta.273 | GPU-accelerated rendering
@xterm/addon-web-fonts | 0.2.0-beta.188 | Synchronize the custom mono webfont before first glyph measurement

These are the renamed packages (formerly xterm, xterm-addon-fit, xterm-addon-webgl). The @xterm/* scope is the current upstream. There is no @xterm/addon-canvas in v6 — the canvas renderer was removed upstream; the only renderers are the built-in DOM renderer (automatic fallback) and the WebGL renderer.

The @xterm/addon-web-fonts addon is required because 'JetBrains Mono' is served asynchronously from the CDN (cdn.kaged.dev), while xterm.js measures glyph metrics synchronously and caches them on first render. Without it, the terminal can render before the font lands, measure a fallback font, and lock in the wrong cell size for the session.

Why the beta pins. @xterm/addon-web-fonts only reached a usable loadFonts/WebFontsAddon API on the @xterm/xterm 6.1.0-beta line (its stable 0.1.0 peer-requires ^6.1.0-beta.86). To get the addon we pin the whole @xterm/* set to the aligned beta.274 pre-release. Versions are pinned exactly (no ^) because beta channels do not follow semver ranges. This is a deliberate, justified pre-release dependency (per AGENTS.md "justify any npm dep"); revisit when the corresponding @xterm/xterm GA release ships.

Loading strategy

Per ui/README.md §Performance budget, xterm.js must be loaded on demand (< 100 KB gzipped, dynamic import on first terminal open). The component uses React lazy():

const TerminalView = lazy(() => import("./TerminalView.tsx"));

The dynamic import pulls @xterm/xterm, @xterm/addon-fit, @xterm/addon-webgl, and @xterm/addon-web-fonts into a separate chunk. Vite's code splitting handles this automatically.

TerminalView component

Location: packages/ui/src/components/terminal/TerminalView.tsx

interface TerminalViewProps {
  /** Task instance ID to connect to. */
  taskId: string;
  /** Project ID (for the task WebSocket URL). */
  projectId: string;
  /** Called when the terminal reports a resize. */
  onResize?: (cols: number, rows: number) => void;
  /** Called when the terminal's underlying process exits. */
  onExit?: (exitCode: number) => void;
}

Lifecycle:

  1. Mount: create the Terminal instance with config from terminal-theme.ts (KAGED_TERMINAL_THEME + DEFAULT_TERMINAL_CONFIG). Load FitAddon and WebFontsAddon, then open() the terminal into the container div, then load WebglAddon (after open() — it otherwise defers to onWillOpen). Call fitAddon.fit() for an initial size. When the webfont resource resolves, call webFontsAddon.relayout() (it re-inserts 'JetBrains Mono' into fontFamily and forces a remeasure) then fitAddon.fit() again so cols/rows match the corrected cell metrics. WebFontsAddon tolerates a missing/failed webfont — the terminal stays usable on the Consolas/monospace fallback.
  2. Connect: open/reuse the project task WebSocket (/api/v1/projects/:id/tasks/socket). Send subscribe { channels: ["pty:task:<taskId>"] } on the control channel.
  3. Data flow — server → terminal: binary frames from pty:task:<taskId> channel → terminal.write(data).
  4. Data flow — terminal → server: terminal.onData(data => ws.send(binaryFrame(data))).
  5. Resize: FitAddon.fit() on a ResizeObserver callback (container resize, window resize, panel drag), coalesced through requestAnimationFrame to avoid thrashing during continuous drags. After fitting, send a pty.resize control frame with the new cols/rows.
  6. Unmount: unsubscribe from PTY channel. Dispose Terminal, FitAddon, WebglAddon. Do not close the WebSocket (it may be shared with other terminal tabs).
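
The coalescing in step 5 can be sketched with an injectable scheduler, so the same helper works with requestAnimationFrame in the browser and with a fake queue in tests. The helper name coalesceFrames is hypothetical.

```typescript
type FrameScheduler = (callback: () => void) => void;

/**
 * Returns a trigger that runs `work` at most once per scheduled frame,
 * no matter how many times the trigger fires in between.
 */
function coalesceFrames(work: () => void, schedule: FrameScheduler): () => void {
  let pending = false;
  return () => {
    if (pending) return; // a frame is already queued; drop this call
    pending = true;
    schedule(() => {
      pending = false;
      work();
    });
  };
}

// Browser usage (assumes fitAddon and sendControl from the surrounding component):
//   const onContainerResize = coalesceFrames(
//     () => {
//       fitAddon.fit();
//       sendControl("pty.resize", { task_id: taskId, cols: terminal.cols, rows: terminal.rows });
//     },
//     (cb) => requestAnimationFrame(cb),
//   );
//   new ResizeObserver(onContainerResize).observe(container);
```

During a continuous panel drag, the ResizeObserver may fire many times per frame, but only one fit() and one pty.resize frame go out per animation frame.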

lineHeight is 1.0. Per ui/README.md §Terminal rendering, the cell grid must use lineHeight: 1.0; any larger value breaks box-drawing alignment for TUIs (k9s, htop, watch).

WebGL fallback: if WebglAddon fails to initialize (no GPU, context lost), the terminal falls back to the built-in DOM renderer (xterm.js v6 has no canvas renderer). Log the fallback to console, do not crash.

Reduced motion: when prefers-reduced-motion is active, disable cursor blink (cursorBlink: false). The terminal itself has no other animations.

useTaskSocket hook

Location: packages/ui/src/hooks/useTaskSocket.ts

A thin wrapper around the existing useSessionSocket pattern, connecting to the task-specific WebSocket:

function useTaskSocket(projectId: string): {
  socket: WebSocket | null;
  connected: boolean;
  subscribe: (channel: string) => void;
  unsubscribe: (channel: string) => void;
  send: (data: ArrayBuffer) => void;
  sendControl: (type: string, payload: unknown) => void;
}

The hook manages connection lifecycle, reconnection (same 10-minute buffer window as session sockets), and the hello/welcome handshake.

Component file structure

packages/ui/src/components/terminal/
├── index.ts              — barrel export
├── TerminalView.tsx      — the xterm.js wrapper (≤ 300 LOC)
├── useTaskSocket.ts      — WebSocket hook for task PTY
└── types.ts              — TerminalViewProps, etc.

The existing terminal-theme.ts at packages/ui/src/terminal-theme.ts stays where it is: it is not specific to the terminal component and may be imported by other consumers.


Testing notes

Per ADR-0003:

PTY broker tests

  • Spawn + output: spawn a task (raw PTY backend) → assert handle returned → write data to stdin → assert process receives it → assert output arrives via onData.
  • Ring buffer: spawn a task that produces >10,000 lines → assert ring buffer does not exceed capacity → subscribe → assert replay starts from ring buffer head.
  • Concurrency limit: spawn 8 tasks for one project → assert 9th throws limit error.
  • Dispose: spawn a task → dispose the handle → assert handle removed from registry → assert fd/process cleaned up.
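
The ring-buffer behaviour these tests assert (bounded capacity, oldest output evicted first, replay from the head) can be sketched as below. This is an illustrative sketch only: `LineRingBuffer` is a hypothetical name, and it assumes the buffer is counted in lines; the broker's real buffer type and API are not defined in this section.

```typescript
// Hypothetical sketch; not the broker's actual buffer implementation.
class LineRingBuffer {
  private readonly capacity: number;
  private readonly lines: string[];
  private head = 0; // index of the oldest retained line
  private count = 0; // number of retained lines

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.lines = new Array<string>(capacity);
  }

  get size(): number {
    return this.count;
  }

  push(line: string): void {
    const tail = (this.head + this.count) % this.capacity;
    this.lines[tail] = line;
    if (this.count < this.capacity) {
      this.count++;
    } else {
      // Full: the slot just written held the oldest line, so advance the head.
      this.head = (this.head + 1) % this.capacity;
    }
  }

  // Retained lines in oldest-to-newest order, as a new subscriber would receive them.
  replay(): string[] {
    const out: string[] = [];
    for (let i = 0; i < this.count; i++) {
      out.push(this.lines[(this.head + i) % this.capacity] as string);
    }
    return out;
  }
}
```

With a capacity of 10,000, pushing 10,050 lines leaves `size` at 10,000 and makes the 51st line pushed the first line replayed.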

tmux driver tests (integration)

  • Session creation: ensureSession("kaged-test", "/tmp") → assert tmux has-session -t kaged-test succeeds.
  • Window creation: create a window → assert tmux list-windows -t kaged-test includes the window name.
  • Control mode I/O: attach control mode → send keystrokes → assert the pane receives input → assert %output events arrive.
  • Daemon restart recovery: create session + window → kill daemon (simulate) → restart → assert listKagedSessions finds the session → assert reattach succeeds → assert task state is consistent.
  • Session cleanup: killSession("kaged-test") → assert session no longer exists.

Side-effect interpreter tests

  • Happy path: create a task via state machine → interpret effects → assert PTY spawned, instance persisted, event emitted.
  • Spawn failure: mock PTY broker to throw → assert spawn_failed event fed back → assert task state is failed → assert error persisted.
  • Stop flow: running task → stop event → interpret effects → assert SIGTERM sent, transcript persisted, PTY destroyed.
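
The spawn-failure case can be illustrated with a small interpreter loop. Everything in this sketch is hypothetical except the `allocate_pty` effect and `spawn_failed` event names, which come from this spec: the `Effect` and `TaskEvent` shapes, the `persist_instance` effect, the `spawned` event, and the `Broker` and `Store` interfaces are stand-ins for the real @kaged/task-runner and daemon types. Stopping after a failed spawn is also an assumption of the sketch.

```typescript
// Hypothetical stand-ins for the real effect, event, broker and store types.
type Effect =
  | { type: "allocate_pty"; taskId: string; command: string }
  | { type: "persist_instance"; taskId: string };

type TaskEvent =
  | { type: "spawned"; taskId: string }
  | { type: "spawn_failed"; taskId: string; error: string };

interface Broker {
  spawn(taskId: string, command: string): void;
}

interface Store {
  persist(taskId: string): void;
}

// Executes effects in order and returns the events to feed back into the state machine.
function interpret(effects: Effect[], broker: Broker, store: Store): TaskEvent[] {
  const events: TaskEvent[] = [];
  for (const effect of effects) {
    switch (effect.type) {
      case "allocate_pty":
        try {
          broker.spawn(effect.taskId, effect.command);
          events.push({ type: "spawned", taskId: effect.taskId });
        } catch (err) {
          const error = err instanceof Error ? err.message : String(err);
          events.push({ type: "spawn_failed", taskId: effect.taskId, error });
          // Assumption: remaining effects are skipped once the spawn has failed.
          return events;
        }
        break;
      case "persist_instance":
        store.persist(effect.taskId);
        break;
    }
  }
  return events;
}
```

A test along the lines of the "Spawn failure" bullet would pass a broker whose `spawn` throws and assert that the only returned event is `spawn_failed`.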

xterm.js component tests (pure logic, no DOM)

  • Config: assert TerminalView uses KAGED_TERMINAL_THEME and DEFAULT_TERMINAL_CONFIG values.
  • WebSocket subscription: assert subscribe called with pty:task:<taskId> on mount, unsubscribe on unmount.
  • Resize: assert pty.resize control frame sent when container dimensions change.

Integration tests (Playwright, deferred)

  • End-to-end: launch a task via UI → terminal renders → type a command → see output → stop the task → see exit badge.
  • Reconnect: open task terminal → kill WebSocket → reconnect → scrollback replayed.
  • Daemon restart (tmux): launch a tmux-backed task → restart daemon → reopen UI → task terminal reconnects with scrollback.

Failure modes

| Failure | Detection | Behavior |
| --- | --- | --- |
| Bun.spawn fails (command not found, permission denied) | Bun.spawn throws or process exits immediately | Side-effect interpreter feeds spawn_failed event → task marked failed |
| tmux not on PATH at startup | resolveBackend() in @kaged/task-runner | If config says "tmux": daemon refuses to start (exit 16). If "auto": falls back to pty backend |
| tmux server crashes mid-session | Control-mode connection drops (EOF on stdout) | All tasks in that tmux session marked failed with error: "tmux_crash". Log critical. Tasks can be relaunched |
| Control-mode parse error | Unexpected line format in tmux output | Log warning, skip the line. Do not crash the driver |
| WebSocket subscriber slow (back-pressure) | Per-channel buffer exceeds 1 MB for PTY | Close the socket with closing { code: "backpressure" } (per http-api.md) |
| Ring buffer overflow | Output exceeds ring capacity | Oldest bytes evicted. Reconnecting subscriber sees only the most recent 10,000 lines |
| Project unloaded while tasks running | handleDeleteProject flow | All tasks stopped (SIGTERM → SIGKILL), tmux session killed, handles removed |
| Daemon restart (raw PTY) | Startup scan | Running PTY tasks marked failed with error: "daemon_restart" |
| Daemon restart (tmux) | Startup scan | tmux tasks re-attached. See Startup (tmux recovery) |
| WebGL context lost in browser | WebglAddon fires its onContextLoss event | Terminal falls back to the built-in DOM renderer (v6 has no canvas renderer). No data loss |

Open questions

  1. Bun native PTY API. Bun has been working on native PTY support (Bun.spawn with pty: true). If and when this ships, the raw PTY backend should use it instead of pipe mode. This would give raw-backend tasks real PTY features (isatty, SIGWINCH, job control) without requiring tmux. Monitor the Bun PTY tracking issue. The tmux backend remains the better choice for daemon-restart survival regardless.

  2. Shared WebSocket vs dedicated. The current design uses a dedicated task WebSocket (/api/v1/projects/:id/tasks/socket) separate from the session WebSocket. An alternative is multiplexing task PTYs onto the session socket. Dedicated is simpler for v0 — no coordination between session and task lifecycle on a single socket.

  3. Terminal scrollback persistence. The ring buffer is in-memory only. Should the daemon persist the ring buffer to disk periodically (for crash recovery of the raw PTY backend)? v0: no. The tmux backend doesn't need it (tmux has its own scrollback), and the raw PTY backend already doesn't survive restarts.

  4. Multiple operators on one terminal. v0 is single-operator per daemon. When multi-operator lands (v2), should two operators be able to attach to the same task terminal simultaneously? Deferred — same open question as session socket multiplexing.


Amendments

2026-06-03 — tmux manual window sizing and launch-time dimensions

Fixes the load-bearing tmux sizing bug that left TUIs trapped in a stale 80x24 pane even when xterm.js was much larger.

  1. window-size manual is now mandatory for every kaged-managed tmux session. The spec now requires tmux set-option -t <session> window-size manual after session creation or re-use so operator-driven pane resizes stick.
  2. Launch-time cols / rows are now part of the spawn path. PtySpawnOptions already carried them; this amendment makes the tmux driver consume them for both new-session and newly created windows.
  3. Fresh task terminals may start at the client size before WebSocket subscribe. When the launch request includes dimensions, the daemon seeds the tmux pane with them immediately; when it does not, the existing pty.resize on subscribe remains the fallback path.

2026-06-03 — byte-correct tmux control-mode output decoding

Fixes garbled Nerd Font glyphs and ghosted TUI redraws. Root cause: the control-mode reader decoded tmux -C output through a UTF-8 TextDecoder and applied charCodeAt to the decoded string, corrupting the raw PTY byte stream before it reached xterm.js (so the renderer faithfully drew corrupt bytes — not an xterm/font bug).

  1. #pumpStdout now buffers and splits the control stream as raw bytes (Uint8Array, split on byte 0x0A), with no TextDecoder. tmux escapes \n/\r inside payloads, so 0x0A is always a record terminator.
  2. decodeTmuxOctalEscapes and parsePaneOutput operate on Uint8Array. Per tmux control.c, only 0x00–0x1F and 0x5C are octal-escaped; 0x7F and 0x80–0xFF pass raw. The decoder now emits \ddd as one byte and copies every other byte unchanged, preserving multibyte UTF-8 (Nerd Font PUA glyphs like U+F07B = EF 81 BB) and escape sequences exactly.
  3. Corrected octal handling: fixed an off-by-one that failed to decode an octal escape at end-of-line, and removed the bogus \\ (double-backslash) case — tmux emits a literal backslash as \134.
  4. Non-%output control lines (ASCII, tmux-generated) are still parsed as strings; only the binary %output payload is kept as raw bytes.
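
The byte-level handling described in items 1 to 3 can be sketched as below. This is a minimal illustration of the technique, not the driver's code: `splitRecords` and `decodeOctalEscapes` are hypothetical names (the latter illustrates what decodeTmuxOctalEscapes is described as doing), and copying a malformed backslash sequence through unchanged is an assumption.

```typescript
const LF = 0x0a;
const BACKSLASH = 0x5c;

// Splits a raw control-mode buffer on byte 0x0A. `rest` is the trailing partial
// record, to be prepended to the next chunk read from stdout.
function splitRecords(buffer: Uint8Array): { records: Uint8Array[]; rest: Uint8Array } {
  const records: Uint8Array[] = [];
  let start = 0;
  for (let i = 0; i < buffer.length; i++) {
    if (buffer[i] === LF) {
      records.push(buffer.subarray(start, i));
      start = i + 1;
    }
  }
  return { records, rest: buffer.subarray(start) };
}

// Returns the byte encoded by a \ddd escape starting at index i, or null if
// there is no complete escape there. An escape ending on the last byte is valid.
function octalValueAt(input: Uint8Array, i: number): number | null {
  if (input[i] !== BACKSLASH || i + 3 >= input.length) return null;
  let value = 0;
  for (let k = 1; k <= 3; k++) {
    const digit = input[i + k] - 0x30;
    if (digit < 0 || digit > 7) return null;
    value = value * 8 + digit;
  }
  return value <= 0xff ? value : null;
}

// Emits each \ddd escape as one byte and copies every other byte unchanged,
// so multibyte UTF-8 sequences pass through untouched.
function decodeOctalEscapes(input: Uint8Array): Uint8Array {
  const out = new Uint8Array(input.length); // output is never longer than input
  let n = 0;
  let i = 0;
  while (i < input.length) {
    const value = octalValueAt(input, i);
    if (value !== null) {
      out[n++] = value;
      i += 4;
    } else {
      out[n++] = input[i];
      i += 1;
    }
  }
  return out.subarray(0, n);
}
```

For example, the payload bytes for `\033[` followed by EF 81 BB and `\134` decode to 1B 5B EF 81 BB 5C: the two escapes collapse to one byte each and the three-byte glyph is preserved.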

2026-06-03 — dedicated tmux socket isolation for daemon-owned terminals

Fixes the severe cross-session bug where daemon-owned tmux operations could mutate or destroy the operator's unrelated tmux sessions.

  1. The daemon now talks to a dedicated private tmux server only. The server is selected with -L <socketName> -f /dev/null on every tmux invocation, with socketName derived from config.daemon.home via a stable short hash so tasks survive daemon restarts and remain recoverable.
  2. Control-mode attach now includes both -f flags in distinct roles. The required argv is tmux -L <socketName> -f /dev/null -C attach-session -t <session> -f ignore-size; the first -f is the tmux client config override, the second belongs to attach-session and disables size arbitration.
  3. Startup recovery scans only the daemon's dedicated server. list-sessions still filters to kaged-*, but the scan now runs against the private socket; "no server running" on that socket is treated as an empty recovery state, not an error.
  4. The daemon never runs kill-server. Normal shutdown leaves the dedicated tmux server alive so task sessions survive daemon restarts. Destructive operations remain limited to kaged-* session/window names as defense-in-depth.
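
The argv shape from items 1 and 2 can be sketched as below. The function names are hypothetical, and the concrete socket-name scheme (SHA-256, a `kaged-` prefix, 12 hex characters) is an assumption for illustration; the spec only requires a stable short hash of config.daemon.home.

```typescript
import { createHash } from "node:crypto";

// Assumption: the hash algorithm, prefix and length are illustrative choices.
function socketNameFor(daemonHome: string): string {
  const digest = createHash("sha256").update(daemonHome).digest("hex");
  return `kaged-${digest.slice(0, 12)}`;
}

// Client-level flags placed on every tmux invocation: private socket, no user config.
function tmuxBaseArgs(socketName: string): string[] {
  return ["-L", socketName, "-f", "/dev/null"];
}

// Full argv for the control-mode attach. The first -f is the tmux client's
// config-file flag; the second belongs to attach-session.
function controlModeAttachArgv(socketName: string, session: string): string[] {
  return [
    "tmux",
    ...tmuxBaseArgs(socketName),
    "-C",
    "attach-session",
    "-t",
    session,
    "-f",
    "ignore-size",
  ];
}
```

Because the socket name depends only on the daemon home path, a restarted daemon derives the same name and reaches the same tmux server.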

2026-06-03 — tmux control-mode clients no longer arbitrate terminal size

Fixes the narrower-than-pane rendering and ghosted redraw bug seen with full-screen TUIs under tmux control mode.

  1. Control-mode attach now requires -f ignore-size. The daemon's control client must stop participating in tmux size arbitration; otherwise its own 80-column terminal view can keep clamping the managed window.
  2. window-size manual is now required both globally and per session. The global setting protects the very first tmux window created for a project; the per-session setting remains mandatory on session reuse.
  3. resize-window is now the authoritative resize operation. Because each task maps to one window and one pane, the daemon resizes the window itself on launch and on every pty.resize, optionally mirroring with resize-pane.
  4. Task windows disable aggressive-resize and get a sane default-size fallback. This prevents tmux from reintroducing resize contention before the browser has sent its first dimensions.
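
The sizing commands in items 2 and 3 can be sketched as argv builders. This is an illustration under assumptions: the function names are hypothetical, the arrays omit the shared tmux client flags, and the `session:window` target form is one common way to address a window rather than something this section prescribes.

```typescript
// Each command is the argv that follows the shared tmux client flags.
type TmuxCommand = string[];

// window-size manual, applied globally and again for the specific session.
function sizingSetupCommands(session: string): TmuxCommand[] {
  return [
    ["set-option", "-g", "window-size", "manual"],
    ["set-option", "-t", session, "window-size", "manual"],
  ];
}

// resize-window is the authoritative operation; resize-pane is an optional mirror.
function resizeCommands(
  target: string,
  cols: number,
  rows: number,
  mirrorPane = false,
): TmuxCommand[] {
  if (!Number.isInteger(cols) || !Number.isInteger(rows) || cols <= 0 || rows <= 0) {
    throw new RangeError("cols and rows must be positive integers");
  }
  const size = ["-x", String(cols), "-y", String(rows)];
  const commands: TmuxCommand[] = [["resize-window", "-t", target, ...size]];
  if (mirrorPane) {
    commands.push(["resize-pane", "-t", target, ...size]);
  }
  return commands;
}
```

The same `resizeCommands` output would serve both the launch path (seeding the window with the requested dimensions) and each later pty.resize.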

2026-06-03 — xterm.js v6 alignment: addon versions, web-fonts dependency, lineHeight 1.0, renderer fallback

Reconciles this spec with the implemented packages/ui on xterm.js v6 and fixes broken TUI rendering.

  1. npm dependency table pinned to the aligned @xterm/* v6.1 beta set. @xterm/xterm 6.1.0-beta.274, @xterm/addon-fit 0.12.0-beta.274, @xterm/addon-webgl 0.20.0-beta.273, and the newly added @xterm/addon-web-fonts 0.2.0-beta.188 (synchronizes the async CDN webfont with xterm.js's synchronous glyph measurement). Pinned exactly because addon-web-fonts' usable API only exists on the 6.1.0-beta line; revisit at 6.1.0 GA.
  2. @xterm/addon-canvas clarified as nonexistent in v6. The canvas renderer was removed upstream; the two renderers are the built-in DOM renderer (automatic fallback) and the WebGL renderer. The failure-modes "WebGL context lost" row and the TerminalView fallback note now say "DOM renderer" instead of "canvas renderer."
  3. TerminalView mount lifecycle rewritten to the correct, race-free order: load FitAddon + WebFontsAddon → open() → load WebglAddon (after open()) → fit() → webFontsAddon.relayout() + refit once the webfont resolves. ResizeObserver fits are coalesced via requestAnimationFrame.
  4. lineHeight: 1.0 made explicit here too, cross-referencing ui/README.md §Terminal rendering — the cell grid breaks for TUIs at any value > 1.0.

2026-06-03 — tmux send-keys semicolon escape

Fixes the bug where typing ; directly into the terminal had no effect (pasting a string containing ; worked fine). The root cause is that tmux's CLI parser treats ; as a command separator even after the -- flag that ends option parsing.

  1. The sendKeys driver method now replaces ; with \; before passing the input string to tmux send-keys. With the -l (literal) flag, \; is interpreted by tmux's command parser as a single literal semicolon — not a backslash followed by a semicolon. This is the only special character that requires escaping when using -l -- (tmux's -- stops flag parsing, but not command-separator parsing).
  2. Regression test added. The mock-based sendKeys test verifies that standalone ;, x;x, and hello\;world are all correctly escaped before the runTmux call.
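
The escape from item 1 amounts to a single replacement before the argv is built. A minimal sketch, assuming hypothetical helper names `escapeSemicolons` and `sendKeysArgs`; the real sendKeys method and its runTmux call are not reproduced here.

```typescript
// Replaces every ";" with "\;" so tmux's command parser treats it as a literal
// semicolon rather than a command separator.
function escapeSemicolons(input: string): string {
  return input.replace(/;/g, "\\;");
}

// Argv for a literal send-keys: -l sends the text literally, -- ends flag parsing.
function sendKeysArgs(target: string, input: string): string[] {
  return ["send-keys", "-t", target, "-l", "--", escapeSemicolons(input)];
}
```

For example, a lone `;` keystroke becomes the two-character argument `\;`, and `x;x` becomes `x\;x`; input without a semicolon is passed through unchanged.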

References