Spec: Task Runner

Purpose

This spec defines the task runner: the daemon subsystem that lets operators launch, observe, and manage project-scoped commands from the web UI. Build scripts, test suites, dev servers, linters, deploy commands — anything the operator would run in a terminal while working on a project.

The task runner is not the agent orchestration layer. Agents dispatch subagents through the DSL; the task runner dispatches shell commands at the operator's explicit request. The relationship is: agents think, the task runner does the manual labor the operator points it at.

This document is normative for:

  • The task model — what a task is, its lifecycle, how it relates to projects and sessions.
  • The backend strategy — raw PTY by default, tmux as an optional multiplexer backend for persistent and multi-window task sessions.
  • The xterm.js rendering contract — how task terminal output reaches the browser through the existing WebSocket PTY infrastructure.
  • The DSL integration — declaring runnable tasks in project.yaml.
  • The API surface — endpoints for launching, listing, attaching to, and killing tasks.
  • The persistence model — what survives a browser disconnect, a daemon restart, and what doesn't.

It is not normative for:

  • The PTY broker's internal wiring, the tmux runtime driver, the side-effect interpreter that executes task state-machine effects, or the xterm.js UI component (that's project-terminals.md — the task runner emits side effects, project-terminals executes them).
  • The xterm.js configuration or theme (that's ui/README.md — tasks render in the same terminal component as subagent PTYs).
  • The sandbox mechanism (tasks are not caged — they run as the daemon user with the project directory as cwd, same as cage: disabled subagents).
  • The subagent dispatch model (that's daemon.md and session-manager.md).

Constraints (from ADRs)

| Constraint | Source |
|---|---|
| Web UI is the product; task output renders in xterm.js in the browser | ADR-0002 |
| Runtime is Bun; process spawning uses Bun.spawn or tmux CLI | ADR-0004 |
| Task metadata persisted to SQLite/Postgres alongside session data | ADR-0005 |
| Works identically in per-user and system-wide deployment modes | ADR-0010 |
| Task definitions in the DSL use project-relative paths | ADR-0011 |

Core concepts

Task

A task is a single operator-initiated command execution within a project. It is the unit of "I want to run this thing and watch it."

Key properties:

  • Project-scoped. Every task belongs to exactly one project. The task's working directory is the project root.
  • Operator-initiated. Tasks are never auto-started by agents or the daemon. The operator explicitly launches them via the UI or API.
  • Named or ad-hoc. A task is either a named task (declared in the project DSL, reusable) or an ad-hoc task (a one-off command typed into the UI).
  • Independent of sessions. Tasks are project-level, not session-level. An operator can run bun test while no agent session is active. Tasks and sessions coexist but don't depend on each other.
  • Observable. Task output streams to the operator's browser via xterm.js over the existing WebSocket PTY channel. The operator can interact with the task terminal (stdin flows back).
  • Optionally persistent via tmux. When tmux is available and opted into, tasks run inside tmux sessions. This gives them survival across browser disconnects, daemon restarts, and operator re-logins — the tmux session on the host outlives everything except the host itself.

Named task

A named task is declared in the project DSL under a tasks: block. It has a slug, a command, and optional metadata (description, environment, working directory override). Named tasks appear as buttons in the project UI — one-click launch.

Ad-hoc task

An ad-hoc task is a command the operator types into the task runner's input field. It runs in the project root with the daemon user's environment. Ad-hoc tasks are not persisted in the DSL; they appear in the task history but are not reusable without re-typing.

Task group

A task group is an optional organizational unit in the DSL. Groups are cosmetic — they affect how tasks are displayed in the UI (grouped sections), not how they execute.


Backend strategy

The task runner supports two backends. The operator does not choose per-task; the backend is resolved once at daemon startup based on availability and configuration.

Backend 1: Raw PTY (default)

The daemon spawns the task command via Bun.spawn (pipe mode) or Bun's native PTY API when available. The PTY broker in project-terminals.md wraps the spawned process and routes I/O to WebSocket subscribers.

Properties:

  • Works everywhere. No external dependency beyond the OS's PTY support.
  • Task survives browser disconnect (the PTY stays alive; the daemon buffers output in the scrollback ring).
  • Task does not survive daemon restart. If the daemon stops, the PTY's child process is orphaned (same semantics as subagent orphans per daemon.md).
  • No multi-window within a single task. One command, one PTY.

Backend 2: tmux

When tmux is available and enabled, the daemon delegates task execution to tmux. Each task becomes a tmux window (or pane) inside a per-project tmux session. The daemon attaches to the tmux session's PTY for streaming to the browser.

Properties:

  • Task survives daemon restart. The tmux session is a host-level process independent of the daemon. On daemon restart, the task runner re-discovers existing tmux sessions and re-attaches.
  • Task survives browser disconnect (inherently — tmux doesn't care about the browser).
  • Multi-window support. Multiple tasks in the same project share a tmux session with separate windows. The operator can switch between them in the UI (each window maps to a tab in the terminal panel).
  • tmux's own scrollback buffer augments the daemon's PTY ring buffer.
  • tmux key bindings and status bar are not exposed to xterm.js. The daemon attaches to tmux in a way that makes it transparent — the operator sees the task output, not tmux chrome. (See tmux attachment mode.)

Backend resolution

At daemon startup, during self_check (per daemon.md Phase 2):

  1. Check if tmux is on PATH and is a recent-enough version (≥ 3.2, for -f /dev/null support and modern control-mode features).
  2. Check if the operator has enabled tmux in config ([task_runner].backend = "tmux" in config.toml).
  3. Resolve:
| tmux on PATH | Config says | Resolved backend | Startup log |
|---|---|---|---|
| Yes | "tmux" | tmux | task_runner: backend=tmux (tmux 3.4 found) |
| Yes | "auto" (default) | tmux | task_runner: backend=tmux (auto-detected) |
| Yes | "pty" | raw PTY | task_runner: backend=pty (tmux available but not selected) |
| No | "tmux" | error: daemon refuses to start | Exit code 16, message names tmux package |
| No | "auto" (default) | raw PTY | task_runner: backend=pty (tmux not found) |
| No | "pty" | raw PTY | task_runner: backend=pty |

The resolved backend applies to all tasks for the daemon's lifetime. There is no per-task backend override (mixing backends within a project would create confusing persistence semantics).
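
The resolution rules above can be sketched as a pure function. This is a minimal illustration, not the daemon's actual code: the function name, the Resolution type, and the error message text are hypothetical, and it assumes that a tmux older than 3.2 is reported the same way as a missing tmux (tmuxVersion is null).

```typescript
type BackendConfig = "tmux" | "auto" | "pty";

type Resolution =
  | { ok: true; backend: "tmux" | "pty"; log: string }
  | { ok: false; exitCode: number; message: string };

// tmuxVersion is null when tmux is not on PATH (assumption: also when it is too old).
function resolveBackend(
  tmuxVersion: string | null,
  config: BackendConfig = "auto",
): Resolution {
  const found = tmuxVersion !== null;

  // Explicit "pty" always wins, whether or not tmux exists.
  if (config === "pty") {
    return {
      ok: true,
      backend: "pty",
      log: found
        ? "task_runner: backend=pty (tmux available but not selected)"
        : "task_runner: backend=pty",
    };
  }

  // "tmux" and "auto" both resolve to tmux when it is available.
  if (found) {
    return {
      ok: true,
      backend: "tmux",
      log:
        config === "tmux"
          ? `task_runner: backend=tmux (tmux ${tmuxVersion} found)`
          : "task_runner: backend=tmux (auto-detected)",
    };
  }

  // tmux explicitly requested but missing: refuse to start (exit code 16).
  if (config === "tmux") {
    return {
      ok: false,
      exitCode: 16,
      // Hypothetical wording; the spec only says the message names the tmux package.
      message: "task_runner: backend=tmux requested but tmux was not found; install the tmux package",
    };
  }

  return { ok: true, backend: "pty", log: "task_runner: backend=pty (tmux not found)" };
}
```

Keeping the resolution free of I/O makes each row of the table checkable in isolation; the PATH lookup and version probe would happen before this function is called.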


tmux attachment mode

When the tmux backend is active, the daemon does not just run tmux attach and pipe the output. That would expose tmux's status bar, key bindings, and prefix handling to xterm.js — a UX collision (the operator would accidentally trigger tmux shortcuts while meaning to interact with the task).

Instead, the daemon uses tmux control mode (tmux -C):

  1. The daemon connects to the tmux server via tmux -C attach-session -t <session>. Control mode emits structured, line-oriented output (notification lines prefixed with %, such as %output) rather than raw terminal bytes.
  2. The daemon parses the control-mode stream to:
    • Capture per-pane output and route it to the correct xterm.js instance via the WebSocket PTY channel.
    • Detect window/pane creation, destruction, and focus changes.
    • Send keystrokes to specific panes without tmux prefix interference.
  3. The operator's xterm.js sees the task's raw output — no tmux chrome, no status bar, no prefix.

This is the same approach used by iTerm2's tmux integration (tmux -CC). The daemon acts as the tmux client; the browser never talks to tmux directly.
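
As an illustration of the parsing step, tmux control mode reports pane output as lines of the form %output %<pane-id> <data>, where bytes below ASCII 32 and the backslash are written as three-digit octal escapes. The sketch below decodes such a line back into raw bytes. The type and function names are hypothetical, and all other notification lines are passed through unparsed; a real client would also handle %begin/%end blocks and window events.

```typescript
// Result of parsing one control-mode line.
type ControlEvent =
  | { kind: "output"; paneId: string; data: Uint8Array }
  | { kind: "other"; line: string };

// Decode tmux's octal escapes (\ooo) back into raw bytes.
function decodeOctalEscapes(value: string): Uint8Array {
  const encoder = new TextEncoder();
  const bytes: number[] = [];
  let i = 0;
  while (i < value.length) {
    const m = /^\\([0-7]{3})/.exec(value.slice(i, i + 4));
    if (m) {
      // Escaped byte: three octal digits after a backslash.
      bytes.push(parseInt(m[1]!, 8) & 0xff);
      i += 4;
    } else {
      // Unescaped text: re-encode one code point as UTF-8.
      const ch = String.fromCodePoint(value.codePointAt(i)!);
      bytes.push(...encoder.encode(ch));
      i += ch.length;
    }
  }
  return Uint8Array.from(bytes);
}

// Parse a single line (without its trailing newline) from the control-mode stream.
function parseControlLine(line: string): ControlEvent {
  const m = /^%output (%\d+) (.*)$/.exec(line);
  if (!m) return { kind: "other", line };
  return { kind: "output", paneId: m[1]!, data: decodeOctalEscapes(m[2]!) };
}
```

The decoded bytes for each pane ID are what would be handed to the PTY broker, so the browser only ever sees the task's own terminal output.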

Fallback: pipe mode

If control mode is unavailable (tmux version too old, unusual configuration), the daemon falls back to pipe mode: tmux pipe-pane -t <pane> 'cat > <fifo>' for output capture, and tmux send-keys -t <pane> for input. This is less efficient (polling, no structured events) but functional. The daemon logs task_runner.tmux_pipe_fallback at startup.

tmux session naming

  • Per-project tmux session: kaged-<project_slug> (e.g., kaged-music-site).
  • Per-task window: task-<task_slug_or_id> (e.g., task-test, task-01HXAB).
  • The daemon owns these sessions. It creates them on first task launch and destroys them when the project is unloaded (or on operator request). The operator can also interact with them directly via tmux attach -t kaged-music-site from a host terminal — this is supported, not prohibited.

tmux remain-on-exit

When history: true for a named task (the default) and the tmux backend is active:

  1. On window creation: the daemon sets tmux set-window-option -t <window> remain-on-exit on so the pane persists after the command exits.
  2. On process exit: the daemon detects the exit via control-mode pane-exited event. It transitions the instance to done or failed as usual, but does NOT destroy the tmux window. The pane remains visible with its final output (tmux shows [exited] in the pane status).
  3. History cull: immediately after transitioning to a terminal state, the daemon counts all non-running instances for this task. If the count exceeds history_count, the oldest instances are culled — the tmux window is killed (tmux kill-window) and the DB record is deleted.
  4. Operator deletion: the operator can manually close historical instances from the task detail page's tab bar. This triggers the same cleanup: tmux window kill + DB delete.

When history: false, the daemon does NOT set remain-on-exit, and tmux windows are destroyed on process exit as usual.

Ad-hoc tasks (no task_name) and raw PTY tasks never use remain-on-exit.
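
The history cull in step 3 can be sketched as a selection over one named task's instances. This is an illustrative sketch: InstanceRow and selectCullTargets are hypothetical names, and the caller is assumed to kill the tmux window and delete the DB record for each returned instance.

```typescript
type TaskState = "starting" | "running" | "done" | "failed" | "stopped";

interface InstanceRow {
  id: string;
  state: TaskState;
  launchedAt: number; // epoch ms
}

// Returns the instances to cull for one named task, oldest first.
// Running and starting instances are never candidates.
function selectCullTargets(instances: InstanceRow[], historyCount: number): InstanceRow[] {
  const historical = instances
    .filter((i) => i.state === "done" || i.state === "failed" || i.state === "stopped")
    .sort((a, b) => a.launchedAt - b.launchedAt); // oldest first
  const excess = historical.length - historyCount;
  return excess > 0 ? historical.slice(0, excess) : [];
}
```

Because the cull runs right after each transition to a terminal state, the result normally contains at most one instance; returning a list also covers the case where history_count was lowered between runs.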


DSL integration

tasks block (optional, additive)

The project DSL gains an optional tasks: top-level key — a named-object map keyed by task name:

version: 1
project: music-site

# ... primary, subagents, etc ...

tasks:
  test:
    command: bun test
    description: Run the test suite
    group: ci

  dev:
    command: bun run dev
    description: Start the dev server
    group: dev
    long_running: true

  lint:
    command: bun run lint
    description: Run the linter
    group: ci

  build:
    command: bun run build
    description: Production build
    group: ci

  deploy-staging:
    command: ./scripts/deploy.sh staging
    description: Deploy to staging
    group: deploy
    confirm: true

  db-migrate:
    command: bun run db:migrate
    description: Run database migrations
    cwd: packages/api

Fields

tasks (named-object map, optional)

  • Optional. Absence means no named tasks; the operator can still run ad-hoc tasks.
  • Max entries: 64 tasks per project.
  • Keys are task names (slugs). Keys are unique by definition (YAML map semantics). Setting a key to null in project.local.yaml removes the task (ADR-0015 nullification).

Task key (string, required)

The task's slug, used as the map key. Used in API paths, UI buttons, and audit logs.

  • Pattern: ^[a-z][a-z0-9_-]{0,30}[a-z0-9]$ — lowercase letters, digits, hyphens, underscores; 2–32 chars.
  • Reserved names: adhoc, all, new. Using these is a parse error.
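
A minimal sketch of the task-key check, combining the pattern and the reserved-name list from the bullets above. The function name and the error strings are hypothetical; only the regular expression and the reserved names come from this spec.

```typescript
const TASK_NAME_PATTERN = /^[a-z][a-z0-9_-]{0,30}[a-z0-9]$/;
const RESERVED_TASK_NAMES = new Set(["adhoc", "all", "new"]);

// Returns null when the name is valid, otherwise a human-readable parse error.
function validateTaskName(name: string): string | null {
  if (RESERVED_TASK_NAMES.has(name)) {
    return `"${name}" is a reserved task name`;
  }
  if (!TASK_NAME_PATTERN.test(name)) {
    return `"${name}" does not match ${TASK_NAME_PATTERN.source}`;
  }
  return null;
}
```

For example, test and deploy-staging pass, while a (one character), Test (uppercase), test- (trailing hyphen), and new (reserved) are rejected.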

tasks.<name>.command (string, required)

The shell command to execute. Passed to the system shell (/bin/sh -c "<command>") or, when using the tmux backend, to tmux send-keys.

  • Required.
  • No environment variable interpolation by the DSL parser. The command string is passed verbatim to the shell; the shell does its own expansion. This is intentional — the DSL is declarative, the shell is imperative.
  • No path validation at parse time. The command may reference binaries or scripts that aren't yet installed. Failure is at run time, not parse time.

tasks.<name>.description (string, optional)

Human-readable description. Shown in the UI next to the task button.

  • Max length: 280 characters.

tasks.<name>.group (string, optional)

Organizational group. Affects UI display (tasks in the same group are visually clustered). Does not affect execution.

  • Pattern: ^[a-z][a-z0-9_-]{0,30}[a-z0-9]$
  • Ungrouped tasks appear in a default "Other" group in the UI.

tasks.<name>.cwd (string, optional)

Working directory override, relative to the project root.

  • Default: project root.
  • Must be project-relative. Same validation rules as cage.fs[].path — no absolute paths, no .. escape.
  • Existence checked at task-launch time, not parse time.
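
The project-relative rule can be illustrated with a purely lexical check. This sketch assumes the cage.fs[].path rules amount to "not absolute, and never climbs above the project root"; the function name is hypothetical, and the check does not resolve symlinks or test for existence (existence is checked at launch time).

```typescript
// Returns null when the path stays inside the project root, otherwise an error string.
function validateProjectRelativePath(input: string): string | null {
  if (input.length === 0) return "path must not be empty";
  if (input.startsWith("/")) return "absolute paths are not allowed";

  // Walk the segments and track how deep we are below the project root.
  let depth = 0;
  for (const segment of input.split("/")) {
    if (segment === "" || segment === ".") continue; // no-op segments
    if (segment === "..") {
      depth -= 1;
      if (depth < 0) return "path escapes the project root";
    } else {
      depth += 1;
    }
  }
  return null;
}
```

Under these assumptions packages/api and ./packages/api are accepted, while /etc, ../sibling, and a/../../b are rejected.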

tasks.<name>.long_running (boolean, optional)

Hint that this task is a long-lived process (dev server, watcher, etc.) rather than a run-to-completion command.

  • Default: false.
  • Effect on UI: long-running tasks show a stop button instead of "waiting for exit." The task runner does not auto-kill long-running tasks on any timeout.
  • Effect on backend: no difference in execution. The hint is purely for the UI.

tasks.<name>.confirm (boolean, optional)

Whether to prompt the operator for confirmation before launching.

  • Default: false.
  • Effect on UI: tasks with confirm: true show a confirmation dialog before running. The dialog includes the command text and the task name.
  • Use case: destructive or expensive operations (deploy, database wipe, etc.).

tasks.<name>.env (object, optional)

Extra environment variables to set for this task. Merged on top of the daemon's environment.

  • Default: {} (empty — inherits the daemon's environment).
  • Values are strings. No interpolation.
  • Does not replace the base environment — additive merge. To unset a variable, set it to an empty string.

tasks:
  test-ci:
    command: bun test
    env:
      CI: "true"
      NODE_ENV: test

tasks.<name>.history (boolean, optional)

Whether the task runner retains completed instances in the tmux session for later review. When true and using the tmux backend, completed task panes are kept alive via tmux set-window-option remain-on-exit on so the operator can review output, scroll back, and inspect exit state.

  • Default: true.
  • Effect on tmux backend: When the task process exits, the tmux window is kept alive (not destroyed). The daemon tracks the instance as a historical entry. When the number of historical instances for this task exceeds history_count, the oldest is culled: tmux window killed and DB record deleted.
  • Effect on raw PTY backend: No effect. Raw PTY tasks cannot retain history — the PTY dies with the process. The history field is accepted but ignored.
  • Ad-hoc tasks: Ad-hoc tasks (no task_name) never get history regardless of this field.

tasks.<name>.history_count (number, optional)

Maximum number of completed instances to retain per task when history is true. The cull happens on each task completion: if the count of historical (non-running) instances for this task exceeds history_count, the oldest instances (by launched_at) are removed — tmux window killed, DB record deleted.

  • Default: 3.
  • Minimum: 1.
  • Maximum: 20.
  • Cull semantics: The cull targets historical instances only (state done, failed, or stopped). Running instances are never culled. The cull runs immediately after a task transitions to a terminal state.
  • Per-task, not per-project: Each named task maintains its own independent history count.

tasks:
  test:
    command: bun test
    history: true
    history_count: 5

  build:
    command: bun run build
    history: false

JSON Schema addition

The tasks block is additive to the existing project DSL schema. The tasks key is optional; projects without it remain valid. The schema addition:

{
  "tasks": {
    "type": "object",
    "maxProperties": 64,
    "additionalProperties": {
      "oneOf": [
        { "$ref": "#/$defs/Task" },
        { "type": "null" }
      ]
    },
    "propertyNames": {
      "pattern": "^[a-z][a-z0-9_-]{0,30}[a-z0-9]$",
      "not": { "enum": ["adhoc", "all", "new"] }
    },
    "description": "Named-object map of tasks keyed by task name. Null values remove the task (ADR-0015 nullification)."
  }
}

{
  "Task": {
    "type": "object",
    "additionalProperties": false,
    "required": ["command"],
    "properties": {
      "command": { "type": "string", "minLength": 1 },
      "description": { "type": "string", "maxLength": 280 },
      "group": {
        "type": "string",
        "pattern": "^[a-z][a-z0-9_-]{0,30}[a-z0-9]$"
      },
      "cwd": { "$ref": "#/$defs/ProjectRelativePath" },
      "long_running": { "type": "boolean", "default": false },
      "confirm": { "type": "boolean", "default": false },
      "env": {
        "type": "object",
        "additionalProperties": { "type": "string" }
      },
      "history": { "type": "boolean", "default": true },
      "history_count": { "type": "integer", "minimum": 1, "maximum": 20, "default": 3 }
    }
  }
}

Task lifecycle

States

    launch (operator action)
         │
    ┌────▼─────┐
    │ starting │──── process spawned ────▶┌──────────┐
    └──────────┘                          │ running  │
                                          └──┬──┬──┬─┘
                                             │  │  │
                          exits with code 0  │  │  │ operator stops
                                             │  │  │
                                       ┌─────▼┐ │ ┌▼──────────┐
                                       │ done │ │ │ stopped   │
                                       └──────┘ │ └───────────┘
                                                 │
                                          error  │
                                                 │
                                          ┌──────▼─┐
                                          │ failed │
                                          └────────┘

| State | Description |
|---|---|
| starting | Command issued; process not yet spawned (tmux session being created, PTY being allocated). |
| running | Process is alive. Output streaming to any attached operator. |
| done | Process exited with code 0. Terminal output preserved in scrollback / transcript. |
| failed | Process exited with non-zero code, or failed to spawn. Exit code and last output preserved. |
| stopped | Operator explicitly stopped the task (SIGTERM → SIGKILL). |
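
The lifecycle above can be sketched as a small transition function. The event names (spawned, spawn_failed, exited, operator_stop) are hypothetical labels for the arrows in the diagram, not identifiers defined by this spec; terminal states accept no further transitions.

```typescript
type TaskState = "starting" | "running" | "done" | "failed" | "stopped";

type TaskEvent =
  | { type: "spawned" }
  | { type: "spawn_failed" }
  | { type: "exited"; code: number }
  | { type: "operator_stop" };

// Returns the next state, or throws when the transition is not in the diagram.
function nextState(state: TaskState, event: TaskEvent): TaskState {
  switch (state) {
    case "starting":
      if (event.type === "spawned") return "running";
      if (event.type === "spawn_failed") return "failed";
      break;
    case "running":
      if (event.type === "exited") return event.code === 0 ? "done" : "failed";
      if (event.type === "operator_stop") return "stopped";
      break;
  }
  // done, failed, and stopped are terminal; anything else is a programming error.
  throw new Error(`invalid transition: ${state} + ${event.type}`);
}
```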

Lifecycle events

| Event | When |
|---|---|
| task.launched | Task started. |
| task.attached | Operator connected to the task's terminal. |
| task.detached | Operator disconnected from the task's terminal. |
| task.exited | Process exited (any reason). Includes exit code. |
| task.stopped | Operator explicitly stopped the task. |
| task.reattached | Operator re-connected after a disconnect (scrollback replayed). |

xterm.js integration

Task terminals render in the same xterm.js component as subagent PTYs (per ui/README.md Terminal rendering). The terminal configuration (font, theme, scrollback, addons) is identical. The operator cannot distinguish a task terminal from a subagent terminal by appearance — only by the label and context.

How task output reaches the browser

Raw PTY backend:

  1. Daemon spawns the task command via Bun.spawn (pipe mode, or Bun's native PTY API when available).
  2. The spawned process is wrapped in a PtyHandle and registered with the PTY broker (per project-terminals.md §PTY broker).
  3. Output flows: process stdout/stderr → broker ring buffer → WebSocket pty channel → browser → xterm.js.
  4. Input flows: xterm.js → WebSocket → broker → process stdin.

tmux backend:

  1. Daemon creates (or reuses) the tmux session kaged-<project>.
  2. Creates a new tmux window task-<name> and sends the command to it.
  3. Daemon connects to the tmux session via control mode (tmux -C).
  4. Control-mode output is parsed; per-pane bytes are extracted and routed to the PTY broker as if they were a regular PTY.
  5. The broker streams them over the WebSocket pty channel to xterm.js.
  6. Input from xterm.js is translated to tmux send-keys commands directed at the specific pane.

In both cases, the xterm.js instance is unaware of the backend. It receives raw terminal bytes and renders them. The FitAddon handles resize; resize events propagate back through the broker to either the raw PTY or tmux resize-pane.

Terminal panel integration

Tasks appear in the session view's right panel alongside subagent terminals:

  • Terminal tab label: task:<name> (e.g., task:test, task:dev). Distinguished from subagent tabs by the task: prefix.
  • Tab badge: exit code badge on completion (green for 0, magenta for non-zero, amber for stopped).
  • Multiple tasks: each running task gets its own tab. No limit beyond the project's concurrency limit (see Operating limits).
  • No task terminal when no tasks are running: the task section is absent from the tab list. It appears when the first task is launched and remains until the operator dismisses closed-task tabs.

Project view integration

Tasks also surface in the project overview (/projects/:id), not only within sessions.

Task panel

The project overview gains a Tasks section below the session list:

┌─────────────────────────────────────────────────────┐
│ Tasks                                    [▶ Run...] │
├─────────────────────────────────────────────────────┤
│                                                     │
│ ┌─ ci ────────────────────────────────────────────┐ │
│ │ [▶ test]  Run the test suite              idle  │ │
│ │ [▶ lint]  Run the linter                  idle  │ │
│ │ [▶ build] Production build                idle  │ │
│ └─────────────────────────────────────────────────┘ │
│                                                     │
│ ┌─ dev ───────────────────────────────────────────┐ │
│ │ [▶ dev]   Start the dev server          running │ │
│ └─────────────────────────────────────────────────┘ │
│                                                     │
│ ┌─ deploy ────────────────────────────────────────┐ │
│ │ [▶ deploy-staging] Deploy to staging      idle  │ │
│ └─────────────────────────────────────────────────┘ │
│                                                     │
│ Recent ad-hoc: bun run check (done ✓ 2m ago)       │
│                                                     │
│ ┌─────────────────────────────────────────┐         │
│ │ $ _                         [Run ad-hoc]│         │
│ └─────────────────────────────────────────┘         │
└─────────────────────────────────────────────────────┘
  • Named tasks are shown as buttons, grouped by group.
  • Status badges per task: each named task's sidebar icon reflects the state of its most recent instance:
    • Running / starting: pulsing blue <Footprints /> icon — clicking navigates to the running instance.
    • Failed (most recent instance): magenta <TriangleAlert /> icon — clicking starts a new instance.
    • Done (most recent instance, exit code 0): green <SquareCheckBig /> icon — clicking starts a new instance.
    • Never run: <Terminal /> icon (current default) — clicking starts a new instance.
  • Ad-hoc input at the bottom — a single-line text field with a "Run" button.
  • "Run..." dropdown (top right) for selecting a named task when the list is long.
  • Tapping a running task opens/navigates to the task's terminal view.

Dedicated task route

Tasks also have their own route for full-screen terminal viewing:

/projects/:id/tasks                    → Task list + ad-hoc runner
/projects/:id/tasks/:task_id           → Full-screen task terminal

The /tasks/:task_id view is a full-viewport xterm.js instance with a minimal header (task name, status, stop/restart button). Useful on mobile where the split-panel layout is too cramped for a terminal.

Task instance tabs

When the operator navigates to a named task's terminal view (via the sidebar or the task list), the detail page shows a tab bar at the top. The tab bar appears only for instances that have a task_name, and it displays all instances of that task:

  • Tab order: running/starting instances first (sorted by start time ascending), then completed instances (sorted by start time descending, newest first).
  • Active tab: the instance the operator is currently viewing. Highlighted with the task's status accent color.
  • Running tab: shows the live terminal. Cannot be closed while running.
  • Historical tab: shows the preserved terminal output from a completed instance. Has a close button (×) to delete the instance (tmux window kill + DB record delete). Alternatively, the operator can click the tab to view it and then use the existing delete button in the toolbar.
  • No tabs for ad-hoc tasks: ad-hoc tasks (task_name: null) display the single instance without a tab bar.

The tab bar enables the operator to review past runs, compare outputs, and clean up history without leaving the terminal view.
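
The tab ordering rule can be sketched as a two-bucket sort. TabInstance and orderTabs are hypothetical names, and the sketch assumes "start time" means the instance's launched_at value.

```typescript
type TaskState = "starting" | "running" | "done" | "failed" | "stopped";

interface TabInstance {
  id: string;
  state: TaskState;
  launchedAt: number; // epoch ms
}

// Live instances first (oldest to newest), then completed ones (newest to oldest).
function orderTabs(instances: TabInstance[]): TabInstance[] {
  const isLive = (i: TabInstance) => i.state === "running" || i.state === "starting";
  const live = instances
    .filter(isLive)
    .sort((a, b) => a.launchedAt - b.launchedAt);
  const completed = instances
    .filter((i) => !isLive(i))
    .sort((a, b) => b.launchedAt - a.launchedAt);
  return [...live, ...completed];
}
```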


API surface

Endpoints

GET    /api/v1/projects/:id/tasks              → list named tasks (from DSL) + running instances
POST   /api/v1/projects/:id/tasks/run          → launch a named or ad-hoc task
GET    /api/v1/projects/:id/tasks/instances     → list task instances (running + recent)
POST   /api/v1/projects/:id/tasks/cleanup       → delete all non-running task instances for project
GET    /api/v1/tasks/:tid                       → task instance detail
POST   /api/v1/tasks/:tid/stop                  → stop a running task
POST   /api/v1/tasks/:tid/restart               → stop + re-launch
DELETE /api/v1/tasks/:tid                       → remove from history (does not kill; non-running instances only)

Task terminals attach via the existing WebSocket PTY channel on the session socket — or, if no session is active, via a dedicated task socket:

GET    /api/v1/projects/:id/tasks/socket        → WebSocket upgrade for task PTY streaming

POST /api/v1/projects/:id/tasks/run

Launch a task.

Request (named task):

{
  "task": "test",
  "cols": 132,
  "rows": 44
}

Request (ad-hoc):

{
  "command": "bun run check",
  "cwd": "./packages/api",
  "cols": 132,
  "rows": 44
}

cols and rows are optional positive integers. When present, they seed the initial terminal size for the spawned PTY or tmux pane before the browser subscribes. Values above 1000 clamp to 1000. Missing or invalid values fall back to 80x24 until the client's first pty.resize frame arrives.
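
A minimal sketch of the sizing rule, assuming each dimension is validated independently (the text above does not say whether one invalid value resets both). The function and constant names are hypothetical.

```typescript
const DEFAULT_COLS = 80;
const DEFAULT_ROWS = 24;
const MAX_DIMENSION = 1000;

// Accept only positive integers; clamp large values; otherwise use the fallback.
function normalizeDimension(value: unknown, fallback: number): number {
  if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
    return fallback;
  }
  return Math.min(value, MAX_DIMENSION);
}

// Computes the initial PTY or tmux pane size from a launch request body.
function initialTerminalSize(req: { cols?: unknown; rows?: unknown }): { cols: number; rows: number } {
  return {
    cols: normalizeDimension(req.cols, DEFAULT_COLS),
    rows: normalizeDimension(req.rows, DEFAULT_ROWS),
  };
}
```

The result only seeds the size at spawn time; the client's first pty.resize frame replaces it.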

Response (202):

{
  "id": "01HXAB...",
  "task_name": "test",
  "command": "bun test",
  "state": "starting",
  "backend": "tmux",
  "launched_at": 1716300000000
}

If confirm: true is set on the named task, the daemon returns 200 with a confirmation prompt instead:

{
  "confirm_required": true,
  "confirm_id": "01HXAB...",
  "task_name": "deploy-staging",
  "command": "./scripts/deploy.sh staging",
  "message": "This task has confirm: true. Proceed?"
}

The client confirms with POST /api/v1/projects/:id/tasks/run/confirm:

{ "confirm_id": "01HXAB...", "proceed": true }

GET /api/v1/projects/:id/tasks/instances

List task instances (running and recent). Paginated. Supports an optional task_name query parameter to filter instances for a specific named task.

{
  "items": [
    {
      "id": "01HXAB...",
      "task_name": "test",
      "command": "bun test",
      "state": "running",
      "backend": "tmux",
      "tmux_window": "task-test",
      "launched_at": 1716300000000,
      "pid": 12345
    },
    {
      "id": "01HXAA...",
      "task_name": null,
      "command": "bun run check",
      "state": "done",
      "exit_code": 0,
      "launched_at": 1716299000000,
      "exited_at": 1716299060000,
      "duration_ms": 60000
    }
  ],
  "next_cursor": null,
  "has_more": false
}

POST /api/v1/projects/:id/tasks/cleanup

Delete all task instances for a project that are not currently running or starting. This includes done, failed, and stopped instances. For tmux-backed instances, the daemon kills the tmux window before deleting the record.

Response (200):

{
  "deleted": 3
}

Running and starting tasks are skipped and remain intact.

POST /api/v1/tasks/:tid/stop

Stop a running task. The daemon sends SIGTERM, waits 5s, then SIGKILL. For tmux-backed tasks, the daemon sends tmux send-keys -t <pane> C-c first (graceful), then tmux kill-pane if the process doesn't exit.

Response (200):

{
  "id": "01HXAB...",
  "state": "stopped",
  "exit_code": null,
  "stopped_at": 1716300100000
}

Task WebSocket

The task WebSocket at /api/v1/projects/:id/tasks/socket uses the same frame structure and channel semantics as the session WebSocket (per http-api.md WebSocket protocol). The key difference:

  • pty channel is addressed by task instance ID: pty:task:<task_id>.
  • events channel emits task lifecycle events (task.launched, task.exited, etc.).
  • No output channel. Tasks don't produce agent messages — only terminal output.

If the operator has both a session socket and a task socket open, task terminals can appear in either UI context. The daemon doesn't enforce which socket carries which PTY; the client subscribes by ID.


Persistence

What is persisted

| Data | Persisted when | Storage |
|---|---|---|
| Task instance record (id, project, command, state, exit code, timing) | On launch, on state change | task_instances table |
| Task terminal transcript | On task exit | task_transcripts table |
| Named task definitions | In the DSL file (not database) | .kaged/project.yaml |

Schema sketch

CREATE TABLE task_instances (
  id            TEXT PRIMARY KEY,    -- ULID
  project_id    TEXT NOT NULL,
  task_name     TEXT,                -- null for ad-hoc tasks
  command       TEXT NOT NULL,
  cwd           TEXT,                -- resolved absolute path
  state         TEXT NOT NULL,       -- starting, running, done, failed, stopped
  backend       TEXT NOT NULL,       -- pty, tmux
  tmux_session  TEXT,                -- tmux session name, null for pty backend
  tmux_window   TEXT,                -- tmux window name, null for pty backend
  pid           INTEGER,
  exit_code     INTEGER,
  launched_at   INTEGER NOT NULL,    -- epoch ms
  exited_at     INTEGER,
  duration_ms   INTEGER,
  launched_by   TEXT NOT NULL,       -- user_id
  FOREIGN KEY (project_id) REFERENCES projects(id)
);

CREATE TABLE task_transcripts (
  id            TEXT PRIMARY KEY,
  task_id       TEXT NOT NULL,
  transcript    BLOB NOT NULL,       -- raw bytes with timing markers
  line_count    INTEGER NOT NULL,
  created_at    INTEGER NOT NULL,
  FOREIGN KEY (task_id) REFERENCES task_instances(id)
);

CREATE INDEX idx_task_instances_project ON task_instances(project_id, launched_at DESC);

tmux persistence across daemon restarts

When the tmux backend is active:

  1. Daemon shutdown: the daemon closes its control-mode connection to tmux sessions but does NOT kill the tmux sessions. Running tasks continue in tmux.
  2. Daemon restart: during self_check, the task runner scans for existing kaged-* tmux sessions. For each:
    • Lists windows and checks against the task_instances table.
    • Running tasks whose instances are in running state: re-attach (update the PTY broker to stream their output again).
    • Tasks that exited while the daemon was down: read the tmux pane's scrollback buffer, persist the transcript, update the instance to done or failed based on exit code.
  3. Orphaned tmux sessions: a kaged-* session with no matching project in the registry is logged as task_runner.orphaned_session and left alone. The operator can clean it up manually (tmux kill-session -t kaged-old-project).

With the raw PTY backend, tasks do NOT survive daemon restart. On restart, instances in running state are marked failed with error: "daemon_restart" — same semantics as subagent orphans.
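
The tmux restart scan can be sketched as a pure planning step that compares the instances recorded as running with the windows tmux still reports. All names here are hypothetical. The sketch assumes the caller has already listed the windows and knows, per window, whether the pane's process has exited and with what status. The missing case (a recorded window that no longer exists) is not covered by this spec and is returned for the caller to decide.

```typescript
interface RunningInstance {
  id: string;
  tmuxSession: string;
  tmuxWindow: string;
}

interface LiveWindow {
  session: string;
  window: string;
  paneDead: boolean;          // true when the command has exited
  exitStatus: number | null;  // exit code when known
}

type RestartAction =
  | { kind: "reattach"; id: string }
  | { kind: "finalize"; id: string; state: "done" | "failed"; exitCode: number | null }
  | { kind: "missing"; id: string }; // assumption: handling is left to the caller

function planRestart(instances: RunningInstance[], windows: LiveWindow[]): RestartAction[] {
  const byKey = new Map(windows.map((w) => [`${w.session}:${w.window}`, w] as const));
  return instances.map((inst): RestartAction => {
    const w = byKey.get(`${inst.tmuxSession}:${inst.tmuxWindow}`);
    if (!w) return { kind: "missing", id: inst.id };
    // Still alive: stream its output through the PTY broker again.
    if (!w.paneDead) return { kind: "reattach", id: inst.id };
    // Exited while the daemon was down: persist transcript, then set the final state.
    return {
      kind: "finalize",
      id: inst.id,
      state: w.exitStatus === 0 ? "done" : "failed",
      exitCode: w.exitStatus,
    };
  });
}
```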


Operating limits

| Resource | Default | Enforced by |
|---|---|---|
| Max concurrent tasks per project | 8 | Task runner (counts only non-exited tasks; exited handles are ignored) |
| Max concurrent tasks per operator (across projects) | 32 | Task runner |
| Max ad-hoc command length | 4096 characters | API validation |
| Task transcript max size | 10 MB per task | Transcript writer (truncates oldest lines) |
| Task scrollback ring | 10,000 lines (same as PTY broker default) | PTY broker |

Exceeding task concurrency limits returns 429 rate_limited with details.reason: "task_limit".
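
As an illustration of the concurrency limits, the check below counts only handles that have not exited and returns the 429 shape described above. The `TaskHandle` and `LimitResult` types and the `checkTaskLimits` function are hypothetical names for this sketch; only the limits (8 and 32) and the error fields come from this spec.

```typescript
// Hypothetical handle shape; the broker's real registry type is not shown here.
interface TaskHandle {
  projectId: string;
  userId: string;
  exited: boolean; // exited handles may stay registered for history/review
}

const MAX_TASKS_PER_PROJECT = 8;
const MAX_TASKS_PER_OPERATOR = 32;

type LimitResult =
  | { ok: true }
  | { ok: false; status: 429; code: "rate_limited"; details: { reason: "task_limit" } };

function checkTaskLimits(handles: TaskHandle[], projectId: string, userId: string): LimitResult {
  // Exited handles never count toward either limit.
  const live = handles.filter((h) => !h.exited);
  const inProject = live.filter((h) => h.projectId === projectId).length;
  const byOperator = live.filter((h) => h.userId === userId).length;

  if (inProject >= MAX_TASKS_PER_PROJECT || byOperator >= MAX_TASKS_PER_OPERATOR) {
    return { ok: false, status: 429, code: "rate_limited", details: { reason: "task_limit" } };
  }
  return { ok: true };
}
```

The function is meant to run before a spawn, so a project with 8 live tasks rejects the 9th launch while any number of exited handles leaves room for new ones.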


Failure modes

| Failure | Detection | Behavior |
| --- | --- | --- |
| Command not found | Shell exits with 127 | Task marked failed, exit code 127. UI shows "command not found." |
| Permission denied | Shell exits with 126 | Task marked failed, exit code 126. |
| cwd doesn't exist | Pre-spawn check | Task never starts. API returns 400 with details.reason: "cwd_not_found". |
| tmux session creation fails | tmux CLI error | Task runner falls back to raw PTY for this task. Logs task_runner.tmux_fallback. |
| tmux server crashes | Control-mode connection drops | All tmux-backed tasks for this daemon are marked failed with error: "tmux_crash". Daemon logs critical. Tasks can be relaunched. |
| Daemon restart (raw PTY) | Startup scan | Running PTY tasks marked failed with error: "daemon_restart". |
| Daemon restart (tmux) | Startup scan | tmux tasks re-attached. See tmux persistence. |
| Operator launches same named task twice | Concurrency check | Second launch is allowed; multiple instances of the same named task can run concurrently. (The operator may want parallel test runs.) |
| Project unloaded while tasks running | Project unload flow | All running tasks for the project are stopped (SIGTERM → SIGKILL). tmux session is destroyed. |
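
The exit-code rows above could be implemented with a small classifier like the following sketch. The `ExitSummary` type and `classifyExit` function are hypothetical; the rule that exit code 0 maps to done is an assumption consistent with "done or failed based on exit code", and the "permission denied" hint text is illustrative (this spec only states the UI text for 127).

```typescript
interface ExitSummary {
  state: "done" | "failed";
  exitCode: number;
  hint: string | null; // short UI message for well-known shell exit codes
}

function classifyExit(exitCode: number): ExitSummary {
  if (exitCode === 0) {
    // Assumption: a zero exit status is the only "done" outcome.
    return { state: "done", exitCode, hint: null };
  }
  let hint: string | null = null;
  if (exitCode === 127) hint = "command not found";  // shell could not find the command
  if (exitCode === 126) hint = "permission denied";  // found but not executable (assumed wording)
  return { state: "failed", exitCode, hint };
}
```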

Audit events

| Event | When | Carries |
| --- | --- | --- |
| task.launched | Task started | task_id, project_id, task_name, command (truncated to 200 chars), backend, user_id |
| task.exited | Task process exited | task_id, exit_code, duration_ms |
| task.stopped | Operator stopped a task | task_id, user_id |
| task.reattached | Operator re-connected to running task | task_id, user_id, scrollback_lines_sent |
| task_runner.backend_resolved | At startup | backend, tmux_version (if applicable) |
| task_runner.tmux_fallback | tmux failed for a specific task | task_id, error |
| task_runner.orphaned_session | Daemon found an unmatched tmux session | tmux_session_name |
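
As a sketch of the task.launched payload, the builder below applies the 200-character truncation to the command field. The interface and function names are hypothetical, the `"pty" | "tmux"` backend values are assumed spellings, and `task_name` is typed as nullable on the assumption that ad-hoc commands carry `task_name: null`.

```typescript
interface TaskLaunchedEvent {
  event: "task.launched";
  task_id: string;
  project_id: string;
  task_name: string | null; // null for ad-hoc commands (assumed)
  command: string;          // truncated copy, never the full command line
  backend: "pty" | "tmux";  // assumed identifiers for the two backends
  user_id: string;
}

const AUDIT_COMMAND_MAX_CHARS = 200;

function buildTaskLaunchedEvent(input: Omit<TaskLaunchedEvent, "event">): TaskLaunchedEvent {
  return {
    event: "task.launched",
    ...input,
    // slice() counts UTF-16 code units; that is treated as "characters" here.
    command: input.command.slice(0, AUDIT_COMMAND_MAX_CHARS),
  };
}
```

Truncating at event-construction time keeps long ad-hoc commands (up to 4096 characters) out of the audit log while the full command stays on the task instance.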

Testing notes

Per ADR-0003:

  • DSL schema tests: tasks block validates (and rejects invalid entries) against the JSON Schema extension. Unknown fields in task entries are errors.
  • Backend resolution tests: all six cells of the backend resolution table exercised.
  • Raw PTY lifecycle tests: launch a task → assert running → wait for exit → assert done with exit code. Launch a failing command → assert failed.
  • tmux lifecycle tests (integration): launch a task with tmux backend → verify tmux session exists → kill daemon → verify tmux session survives → restart daemon → verify task is re-attached and state is consistent.
  • Stop tests: launch a long-running task → stop → assert SIGTERM sent → assert stopped.
  • Transcript tests: launch a task that produces output → wait for exit → assert transcript in database → fetch via API → verify content.
  • xterm.js rendering tests (Playwright): launch a task → open the task terminal in the UI → assert terminal renders output → type input → assert input reaches the process.
  • Concurrency tests: launch 9 tasks in a project (limit is 8) → assert 9th returns 429. Exited handles do not count against the limit.
  • Confirm tests: launch a confirm: true task → assert confirmation prompt returned → confirm → assert task starts.
  • Ad-hoc tests: submit an ad-hoc command → assert it runs in the project root → assert it appears in task instances with task_name: null.
  • History tests: launch a named task with history: true → wait for exit → assert tmux pane alive → assert instance visible in history → launch another → assert oldest culled when count exceeds history_count.
  • History-disabled tests: launch a named task with history: false → wait for exit → assert tmux window destroyed → assert no history retained.
  • Sidebar status tests: launch a task → assert sidebar icon shows running state (pulsing blue) → wait for exit → assert icon updates to done (green) or failed (magenta).
  • Tab bar tests: launch multiple instances of the same named task → assert tab bar shows all instances → close a historical tab → assert tmux window killed and DB record deleted.
  • tmux control-mode tests (unit): parse a sample control-mode output stream → assert per-pane bytes are correctly routed.
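
For the control-mode unit test, a parser along the following lines could be the unit under test. This is a sketch under stated assumptions, not the tmux runtime driver defined in project-terminals.md: it handles only `%output <pane-id> <value>` notifications, assumes the stream has already been decoded to a string and split on newlines, and decodes the `\ooo` octal escapes that tmux uses for control characters and backslashes. The function names are hypothetical.

```typescript
// Decode one %output value: "\ooo" octal escapes become single bytes,
// everything else is re-encoded as UTF-8.
function decodeControlModeValue(value: string): number[] {
  const encoder = new TextEncoder();
  const out: number[] = [];
  let i = 0;
  while (i < value.length) {
    const digits = value.slice(i + 1, i + 4);
    if (value.charAt(i) === "\\" && /^[0-7]{3}$/.test(digits)) {
      out.push(parseInt(digits, 8) & 0xff);
      i += 4;
    } else {
      // Advance by whole code points so surrogate pairs stay intact.
      const ch = String.fromCodePoint(value.codePointAt(i) ?? 0);
      encoder.encode(ch).forEach((b) => out.push(b));
      i += ch.length;
    }
  }
  return out;
}

// Route every %output line to its pane; other notifications are ignored.
function routeControlModeOutput(stream: string): Map<string, Uint8Array> {
  const perPane = new Map<string, number[]>();
  for (const line of stream.split("\n")) {
    if (!line.startsWith("%output ")) continue;
    const rest = line.slice("%output ".length);
    const sep = rest.indexOf(" ");
    if (sep === -1) continue; // malformed or empty output line
    const paneId = rest.slice(0, sep);
    const bucket = perPane.get(paneId) ?? [];
    decodeControlModeValue(rest.slice(sep + 1)).forEach((b) => bucket.push(b));
    perPane.set(paneId, bucket);
  }
  const result = new Map<string, Uint8Array>();
  perPane.forEach((bytes, paneId) => result.set(paneId, Uint8Array.from(bytes)));
  return result;
}
```

A test can then feed a recorded control-mode stream with interleaved panes and assert the bytes collected per pane id.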

Open questions

  1. Task environment inheritance. Tasks inherit the daemon's environment. Should there be a project-level env: block in the DSL (applied to all tasks)? v0: no, per-task env: is sufficient. If operators consistently request project-level env, add it as a minor.
  2. Task dependencies. Should tasks support depends_on: [build] so that deploy auto-runs build first? v0: no. The complexity of dependency graphs and failure propagation is significant. Operators chain commands in the command field (bun run build && ./deploy.sh) or run them manually in order.
  3. Task history retention. How long are completed task instances kept? Original v0 answer (superseded): 100 most recent per project, oldest auto-pruned, no time-based retention. Resolved 2026-06-04: per-task history via the history (default true) and history_count (default 3) DSL fields. tmux remain-on-exit keeps panes alive for review. Culling is per-task, count-based, and runs on completion. See task history fields.
  4. Task output search. Should task transcripts be full-text searchable? v0: no. Transcripts are blobs. v0.x: consider indexing for grep-like search across task history.
  5. Agent-initiated tasks. Could an agent dispatch a named task instead of (or in addition to) a subagent? v0: no. Tasks are operator-initiated only. This boundary keeps the trust model clear — agents work through the subagent/cage system; tasks are the operator's hands.
  6. tmux configuration file. Should kaged ship a minimal tmux.conf for task sessions (to ensure consistent behavior)? v0: yes, tasks use tmux -f /dev/null to ignore the operator's tmux config. The daemon controls the tmux environment completely. If operators want their tmux config, they use tmux directly on the host.
  7. Task notifications. Should completed/failed tasks push notifications to the operator's device? v0: no sound or push. The task.exited event over the WebSocket updates the UI badge. v0.x: opt-in browser notifications.
  8. Relationship between tasks and sessions. v0 makes them independent (project-level, not session-level). Should a session be able to "own" a set of tasks, so ending the session also stops those tasks? Deferred — the independence model is simpler and matches how operators think about running commands vs. talking to agents.
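
The per-task, count-based culling from the resolution of question 3 can be illustrated with a small selection function. This is a sketch only: `CompletedInstance` and `selectForCulling` are hypothetical names, and ordering by exit time is an assumption about what "oldest" means.

```typescript
interface CompletedInstance {
  id: string;
  taskName: string;
  exitedAt: number; // epoch ms
}

// Returns the ids of completed instances to cull (tmux window kill + DB delete)
// so that at most `historyCount` completed instances remain for `taskName`.
function selectForCulling(
  completed: CompletedInstance[],
  taskName: string,
  historyCount: number,
): string[] {
  const newestFirst = completed
    .filter((i) => i.taskName === taskName) // filter() copies, so sort() does not mutate the input
    .sort((a, b) => b.exitedAt - a.exitedAt);
  return newestFirst.slice(historyCount).map((i) => i.id);
}
```

With the default history_count of 3, a task with five completed instances would have its two oldest instances selected, and instances of other tasks are never touched.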

Amendments

  • 2026-06-04: Added history (boolean, default true) and history_count (integer 1–20, default 3) fields to the task DSL schema. When history: true on the tmux backend, task panes use remain-on-exit on to preserve terminal output after process exit. Completed instances are culled to history_count per task on each completion (tmux window kill + DB delete). Updated sidebar status icons: running instances show pulsing blue <Footprints />, failed shows magenta <TriangleAlert />, done shows green <SquareCheckBig />, never-run shows <Terminal />. Clicking a running task navigates to it; clicking a completed/never-run task starts a new instance. Added task instance tab bar to the detail view for named tasks: running instances first, then historical, with close/delete capability. Updated GET /projects/:id/tasks/instances to accept optional task_name filter. Resolved open question #3 (history retention).
  • 2026-06-04: Added POST /api/v1/projects/:id/tasks/cleanup endpoint to bulk-delete all non-running task instances for a project. Added "Cleanup history" button to the Tasks overview page. Fixed PTY broker limit counting to exclude exited handles — previously, exited tmux panes that remained in the broker registry (for history/review) were counted against the 8-task limit, blocking new spawns. Now only handles with exited === false count toward the per-project limit.
  • 2026-06-03: Added optional cols / rows to POST /api/v1/projects/:id/tasks/run so the daemon can seed terminal size before WebSocket subscribe. Validation is now explicit: positive integers only, clamp oversized values to 1000, fall back to 80x24 when absent or invalid. Synced with project-terminals.md so tmux sessions/windows consume those launch-time dimensions immediately.
  • 2026-05-24: Added sibling spec cross-reference to project-terminals.md. Fixed raw PTY backend description to use Bun.spawn (was incorrectly referencing node-pty, which contradicts ADR-0004 and AGENTS.bun.md). Updated "not normative for" section to clarify that the PTY broker, tmux runtime driver, side-effect interpreter, and xterm.js component are defined in project-terminals.md.
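
The cols / rows validation from the 2026-06-03 amendment could look like the sketch below. The function names are hypothetical, and validating each dimension independently (so a bad rows value does not discard a good cols value) is an assumption; the amendment only states the rules of positive integers, a clamp at 1000, and an 80x24 fallback.

```typescript
const DEFAULT_COLS = 80;
const DEFAULT_ROWS = 24;
const MAX_DIMENSION = 1000;

// Accept positive integers only, clamp oversized values, otherwise fall back.
function normalizeDimension(value: unknown, fallback: number): number {
  if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
    return fallback;
  }
  return Math.min(value, MAX_DIMENSION);
}

function resolveTerminalSize(body: { cols?: unknown; rows?: unknown }): { cols: number; rows: number } {
  return {
    cols: normalizeDimension(body.cols, DEFAULT_COLS),
    rows: normalizeDimension(body.rows, DEFAULT_ROWS),
  };
}
```

Resolving the size before spawn means the tmux window or PTY starts at the browser's dimensions instead of resizing after the first WebSocket subscribe.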

References