ADR-0005: Storage default is SQLite; PostgreSQL is optional
- Status: Accepted
- Date: 2026-05-21
- Last amended: 2026-05-21 (deployment-mode paths)
- Deciders: @karasu
- Supersedes: —
- Superseded by: —
Context
kaged needs persistent storage. The data the daemon holds:
- Projects — DSL config, prompt versions, plugin manifests
- Sessions — persistent across operator disconnects; long-lived
- Agent runs and subagent invocations — every dispatch with its inputs, outputs, exit status
- Debug checkpoints — paused states the operator can resume or roll back
- Audit log — every prompt edit, every policy decision, every sandbox event
- PTY transcripts — terminal scrollback for replay when an operator reattaches
The constraint set:
- Runs on a small Linux host. Flash-backed filesystem, limited RAM, no dedicated DBA.
- Single operator, single daemon. No multi-writer contention from N application servers; the daemon is the writer.
- Self-hosted, no external services. Per the manifesto, kaged does not call home and does not require auxiliary infrastructure.
- Backups must be trivial. Operators back up by copying a file or running a script; not by configuring
pg_dumpcron jobs. - Schema must be portable. The same schema should work on the small host and on a server-class deployment, because operators sometimes graduate from one to the other.
- No native build pain. Per ADR-0004, the runtime is Bun. SQLite via
bun:sqliteis built-in; PostgreSQL clients are not. - Some operators have Postgres anyway. Homelabbers running multiple services often have a shared Postgres. Forcing them to a separate SQLite file is a small annoyance but not unreasonable.
The shortlist:
- SQLite — file-backed, zero ops, fits a small host.
- PostgreSQL — server-class, requires an external service.
- Both, switchable — SQLite by default, Postgres opt-in via config.
- Embedded KV (lmdb, leveldb, mdbx) — fast, but loses SQL ergonomics.
- In-memory + periodic snapshot — fast, simple, loses durability guarantees on crash.
The question is load-bearing because it determines: schema portability, the migration tooling, the transaction model the daemon code is written against, the backup story documented for operators, and how plugin authors persist their own state.
Decision
SQLite is the default storage engine for kaged. PostgreSQL is supported as an optional backend selected by configuration. The schema is portable between the two: every migration is written to apply on both engines.
Concretely:
- Default install: SQLite, file-backed at
${KAGED_HOME}/data/kaged.db. No external service required. - Opt-in Postgres: Set
KAGED_DATABASE_URL=postgres://...(or equivalent config), kaged uses Postgres instead. No SQLite file is created. - Schema is portable. Migrations use a portable SQL subset (no SQLite-only types like
DATETIMEtext-encoding shortcuts; no Postgres-onlyJSONBfeatures without a documented SQLite-on-JSON1fallback). Where engines diverge, the migration tooling emits engine-specific DDL from a single source. - Driver layer is
bun:sqliteandBun.sql. Both are built into Bun (ADR-0004). Nobetter-sqlite3, nopg, nopostgres.js. - No mongo, no kv-only, no in-memory. SQL semantics are load-bearing for the audit log and the sessions/runs join surface.
- Backups are documented as
sqlite3 kaged.db .dump(or file copy with WAL flushed) for SQLite, andpg_dumpfor Postgres. Daemon ships akaged backupcommand that wraps both.
Consequences
What this commits us to
- Two test matrices for the storage layer. Every migration must be tested on both engines. Every query that touches engine-specific syntax must have both code paths covered.
- A portability discipline in the schema. Type choices, default values, JSON handling, full-text-search — all must be either portable or paired with adapter code. Examples:
- Use
TEXTfor timestamps stored as ISO-8601, notDATETIME(SQLite) orTIMESTAMPTZ(Postgres). Or: store as integer milliseconds. - Use
JSONcolumns; access viajson_extract(SQLite) and->>(Postgres) through a small abstraction. - Avoid SQLite's lax type affinity tricks; treat the schema as strict.
- Use
- Migration tooling. We need a migration runner that targets both engines. Either we write one (small, kaged-shaped) or we adopt one (drizzle-kit, kysely-migrations) and check that it actually handles both correctly.
- Documentation of when to use which. SQLite: single-host, single-operator, default. Postgres: heavier deployments, shared homelab DB, or larger projects where SQLite's write contention becomes visible.
What this forecloses
- No Postgres-only features in the daemon's hot paths. No
LISTEN/NOTIFYfor change notification (we use in-process pub/sub instead). Nopg_partman. NoJSONB-specific indexing tricks that SQLite can't approximate. - No SQLite-only features either. No
WITHOUT ROWIDtables in shared schema. NoFTS5virtual tables unless we accept the Postgres parallel (tsvector) and abstract them. - No vendor-locked migration tool. If we adopt one and it later turns out to be Postgres-biased, we eat the cost of switching.
- No "we'll add Postgres later" hedge. Adding Postgres later means retrofitting portability into a schema that was written with SQLite assumptions, which is the standard cause of database-portability disasters. We do both from day one or we commit to SQLite-only.
What becomes easier
- First-run UX. A new operator installs kaged and it works. No "now install Postgres" step.
- Backups. File copy or
sqlite3 .dumpare operator-familiar. The host runs nightly cron with rsync to a NAS and that's the backup story. Done. - Local development. A clone of the repo with no database server needed. Tests run against a temp SQLite file.
- Reproducing bugs. "Send me your
kaged.db" is a tractable bug report. "Send me your Postgres cluster state" is not. - Scaling up. Operators who outgrow SQLite (rare, for kaged's profile) can move to Postgres without rewriting application code.
What becomes harder
- Concurrent writes under load. SQLite serializes writes. Long-running write transactions block other writers. The daemon must keep transactions short and use WAL mode. We will hit this if we are sloppy.
- Full-text search. SQLite's FTS5 and Postgres'
tsvectorare both fine but different. Cross-engine FTS adds code. - Schema diff tools. Most schema-diff tooling assumes one engine. We may need to write our own or carefully use two.
- Performance tuning. Two engines means two sets of EXPLAIN PLANs, two tuning playbooks, two failure modes. The operator playbook gets longer.
Implementation notes (not normative)
- Connection pooling: SQLite needs
bun:sqlitewith WAL mode (PRAGMA journal_mode=WAL;) and busy-timeout. Postgres usesBun.sql's built-in pool. - Transaction model: Daemon code treats the storage layer as transactional and synchronous. SQLite transactions are obvious; Postgres equivalent is just a longer round-trip.
- JSON columns: When storing structured data (prompt history, plugin manifests), use a
JSONcolumn with strict in-app validation. Don't rely on engine-side JSON constraints; SQLite's are weaker than Postgres'. - Timestamps: Store as integer Unix milliseconds (epoch). Format in the application layer. Portable, unambiguous, monotonic-friendly.
- IDs: ULIDs or UUIDv7 (sortable, generated client-side). Don't rely on engine-specific autoincrement.
Alternatives considered
Alternative A — SQLite only
Why tempting: Simplest. One engine, one test matrix, one code path. Fits the small-host profile perfectly.
Why rejected: Forces operators with existing Postgres infra to run a second database for kaged. Closes off the "homelab graduates kaged to a server" path. The portability tax is real but bounded; the lock-in cost is unbounded.
Alternative B — Postgres only
Why tempting: One engine, predictable scaling, mature tooling.
Why rejected: Kills the "install and run on a single small host" UX. Every operator now has to operate Postgres. For a category whose audience is "people who run their own infra and are tired of corporate complexity," requiring a separate database server is the wrong cost to impose by default.
Alternative C — Both, but with separate codebases / adapters
Why tempting: Frees the schema to use engine-specific power features.
Why rejected: The maintenance cost of two parallel storage stacks is enormous, and the operator-facing benefit is small. We are not Twitter; we don't need engine-specific tuning. Portability is the simpler discipline.
Alternative D — Embedded KV store (LMDB, RocksDB, etc.)
Why tempting: Fast, file-backed, single-process-friendly.
Why rejected: Loses SQL. The audit log, the sessions-with-runs join surface, and the operator's "I want to query my history" use case all want SQL. Embedded KV makes us re-implement secondary indexes and joins by hand. Bad trade.
Alternative E — In-memory + periodic snapshot
Why tempting: Fastest possible reads/writes. Trivial code.
Why rejected: Loses durability on crash. The audit log promise is "every prompt edit is captured." That is incompatible with "between snapshots, data may be lost." Non-starter.
Amendments
2026-05-21 — Storage paths follow deployment mode
ADR-0010 introduced per-user and system-wide deployment modes as equal first-class citizens. The default storage location depends on mode:
| Mode | Default SQLite path | ${KAGED_HOME} |
|---|---|---|
| Per-user | ~/.local/share/kaged/kaged.db (XDG: $XDG_DATA_HOME/kaged/) |
~/.local/share/kaged |
| System-wide | /var/lib/kaged/kaged.db |
/var/lib/kaged |
Both modes use the same SQLite engine, the same schema, the same migrations. The path differs; the storage substrate does not. The portability discipline this ADR commits to (portable migrations, no engine-specific features in hot paths) remains intact across modes.
The storage.url config field overrides the default in either mode. Operators can point at a custom path (sqlite:///path/to/elsewhere.db) or a Postgres URL (postgres://...) regardless of which mode the daemon is in. Mode determines the default; operator config wins when set.
References
docs/02-architecture.md— storage component description- ADR-0004 — provides
bun:sqliteandBun.sqlnatively - ADR-0008 — plugins use the same storage abstraction for their own state
- ADR-0010 — deployment modes that determine the default path
bun:sqlitedocs: https://bun.com/docs/api/sqliteBun.sql(Postgres) docs: https://bun.com/docs/api/sql- XDG Base Directory Specification: https://specifications.freedesktop.org/basedir-spec/
- Original discussion: design conversation with colleagues, 2026-05-21
- Amendment: colleagues, 2026-05-21