Specification
This is the canonical long-form specification. For day-to-day reference, the short-form Reference pages — CLI, MCP tools, Configuration, Schema, Page types, Errors — are usually what you want.
Spec metadata
- Created: 2026-04-06
- Updated: 2026-04-13
- Status: spec-complete-v4
- Tags: open-source, knowledge-base, sqlite, rag, rust, mcp, thin-harness, fat-skills, candle
- Sources: Garry Tan’s earlier local-memory spec + internal architecture notes and reviews
Quaid - Personal Knowledge Memory
Open-source personal knowledge memory. SQLite + FTS5 + vector embeddings in one file. Thin CLI harness, fat skill files. MCP-ready from day one. Runs anywhere. No API keys, no internet, no Docker. Truly static single binary.
Inspired by Garry Tan’s earlier local-memory work, with this spec adapting similar goals to a local-first Rust + SQLite architecture intended for portable, offline use.
- Status: Spec complete v4 — ready to build core (see Phased Delivery)
- Repo: GitHub.com/quaid-app/quaid
- License (planned): MIT
- Origin: Inspired by Garry Tan’s earlier local-memory spec (2026-04-05), then extended with architecture improvements (2026-04-06) and memory research integration (2026-04-08)
- v1 differentiator over Garry’s spec: Local embeddings, Rust binary instead of TypeScript/Bun, true zero-dependency single binary
- v2 additions (Apr 2026 research): Set-union hybrid search, palace-style hierarchical filtering, progressive retrieval, selective ingestion, temporal knowledge graph, contradiction detection, four-tier memory consolidation. Techniques sourced from MemPalace (96.6% R@5 LongMemEval), OMNIMEM (+411% F1), agentmemory (92% token reduction).
- v3 additions (Architecture Review): Exact-Match Short-Circuit (SMS) search, Temporal Sub-chunking for timelines, Assertions table for heuristic contradiction detection, strict Optimistic Concurrency on MCP writes, true zero-dependency static linking via `candle` (replacing `fastembed`/ONNX).
- v4 additions (Community Research + Garry v0.8.0): Knowledge gap detection (`knowledge_gaps` table + `memory_gap` MCP tool), graph neighborhood traversal (`memory_graph`), `original` page type for user’s own thinking, standardised source attribution format with authority hierarchy, filing disambiguation rules, richer person templates, new skills (upgrade, alerts, research), and PGLite convergence validation. Informed by community prototypes, public discussion, and Garry Tan’s v0.8.0 skillpack analysis.
Table of Contents
- The Problem (includes Non-Goals)
- The Solution
- Architecture Overview
- Technology Stack (v3: candle replaces fastembed)
- Database Schema (v4: + knowledge_gaps table, original page type)
- CLI Reference (v4: + graph, gaps commands)
- MCP Server (v4: + memory_graph, memory_gap, memory_gaps)
- Hybrid Search (v3: SMS short-circuit + set-union + palace filtering)
- Progressive Retrieval (v2: token-budget-gated expansion)
- Ingest Pipeline (v3: assertions + sub-chunking + idempotency)
- Migration Plan (v4: + original type mapping)
- Export and Round-trip
- Skills (Fat Markdown) (v4: + alerts, research, upgrade skills; source attribution; filing disambiguation)
- Repository Structure (v4: + graph.rs, gaps.rs, 3 new skills)
- Build and Release
- Phased Delivery (v4: knowledge_gaps in P1, graph in P2, gaps/alerts/research/upgrade in P3)
- Implementation Roadmap (v4: + graph, gaps, new skills)
- Benchmarks and Release Gates (corpus-reality + LongMemEval, LoCoMo, BEIR, Ragas)
- Design Decisions (v4: + SQLite vs PGLite, links-as-graph-layer)
- Schema Versioning and DB Migration
- Security and Data Sensitivity
- Error Handling and Graceful Degradation
- Comparison Table (v4: updated for Garry v0.8.0 + PGLite, + graph/gap rows)
- Open Questions (v2: new questions added)
- Spec History
The Problem
Git doesn’t scale past ~5,000 markdown files. At 7,471 files and 2.3GB, a wiki-memory directory becomes slow to clone, painful to search, and unusable for structured queries. Full-text search requires grep. Semantic search requires an external vector database. Cross-references are just markdown links with no queryable graph.
The compiled-truth + timeline architecture (Karpathy-style: always-current intelligence above the line, append-only evidence below the line) is the right knowledge model - it just needs a real database underneath.
Additionally: every existing knowledge-base tool (Obsidian, Notion, RAG frameworks) either requires a GUI, locks data in a SaaS platform, or needs an internet connection and API keys to do anything useful. An agent-first world needs a knowledge layer that:
- Lives in a single logical database (one `.db` file; WAL sidecars during operation, compactable to a single file for transport)
- Does full-text + semantic search natively
- Exposes an MCP server for any AI client
- Works on a plane, in an air-gapped environment, with no ongoing API costs
- Is fast, small, and has zero runtime dependencies
Non-Goals (v1)
- Not a collaborative platform: single-user, single-writer. No auth, no RBAC, no multi-tenant.
- Not a sync product: no real-time replication, no CRDTs, no cloud sync. `rsync`/`scp` is the transport.
- Not a full graph database: typed links with temporal validity, not arbitrary traversals or Cypher queries.
- Not a general note-taking app: structured knowledge pages, not freeform notes. Use Obsidian for that.
- Not a document warehouse: pages are compiled intelligence, not raw file storage. Raw data goes in the `raw_data` table.
- Not a semantic contradiction oracle: heuristic detection via assertions, not LLM-powered reasoning. The binary is dumb.
- Not multimodal: text only. Images, audio, video are not indexed or embedded.
The Solution
A single Rust CLI distributed in two BGE-small channels, airgapped (~180MB, embedded weights) or online (~90MB, downloads on first use), wrapping:
- SQLite with WAL mode - single logical database (`memory.db` + WAL sidecars while live; `quaid compact` checkpoints to a true single file for transport/backup)
- FTS5 - full-text search, built into SQLite
- sqlite-vec - vector similarity search as a SQLite extension, statically linked
- candle + BGE-small-en-v1.5 - pure-Rust ML framework running a local embedding model, no ONNX runtime dependencies
- MCP stdio server - any MCP-compliant client can search, read, write, and ingest
- Fat skills - intelligence lives in markdown SKILL.md files, not in code
One `cargo build --release --target x86_64-unknown-linux-musl`. One truly static binary. Drop it anywhere and run it.
Architecture Overview
```
╔══════════════════════════════════════════════════════════════╗
║                          CONSUMERS                           ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  Claude Code         OpenClaw / Doug     Any MCP Client      ║
║   (via MCP)           (via MCP/CLI)        (via MCP)         ║
║       │                     │                  │             ║
║       └──────────┬──────────┘                  │             ║
║                  │                             │             ║
║  ┌───────────────▼──────────┐    ┌─────────────▼──────────┐  ║
║  │        MCP Server        │    │          CLI           │  ║
║  │    (stdio transport)     │    │       bin/quaid        │  ║
║  │       quaid serve        │    │   single Rust binary   │  ║
║  └───────────────┬──────────┘    └─────────────┬──────────┘  ║
║                  │                             │             ║
║                  └──────────────┬──────────────┘             ║
║                                 │                            ║
║            ┌────────────────────▼───────────────┐            ║
║            │             quaid-core             │            ║
║            │               (Rust)               │            ║
║            │                                    │            ║
║            │ ┌────────────────────────────────┐ │            ║
║            │ │ db.rs          (rusqlite)      │ │            ║
║            │ │ fts.rs         (FTS5 queries)  │ │            ║
║            │ │ inference.rs   (candle ML)     │ │            ║
║            │ │ search.rs      (SMS + union)   │ │            ║
║            │ │ progressive.rs (retrieval)     │ │            ║
║            │ │ palace.rs      (wing/room)     │ │            ║
║            │ │ novelty.rs     (dedup)         │ │            ║
║            │ │ assertions.rs  (heuristics)    │ │            ║
║            │ │ chunking.rs    (temporal)      │ │            ║
║            │ │ markdown.rs    (parse/render)  │ │            ║
║            │ │ links.rs       (temporal KG)   │ │            ║
║            │ │ migrate.rs     (import/export) │ │            ║
║            │ └────────────────────────────────┘ │            ║
║            └────────────────────┬───────────────┘            ║
║                                 │                            ║
║            ┌────────────────────▼───────────────┐            ║
║            │             SQLite DB              │            ║
║            │             memory.db              │            ║
║            │                                    │            ║
║            │ ┌────────────────────────────────┐ │            ║
║            │ │ pages                          │ │            ║
║            │ │ page_fts (FTS5 vtable)         │ │            ║
║            │ │ page_embeddings (vec0 vtable)  │ │            ║
║            │ │ links                          │ │            ║
║            │ │ assertions                     │ │            ║
║            │ │ tags                           │ │            ║
║            │ │ raw_data                       │ │            ║
║            │ │ timeline_entries               │ │            ║
║            │ │ raw_imports                    │ │            ║
║            │ │ config                         │ │            ║
║            │ └────────────────────────────────┘ │            ║
║            └────────────────────────────────────┘            ║
║                                                              ║
╠══════════════════════════════════════════════════════════════╣
║                    SKILLS (Fat Markdown)                     ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  skills/ingest/SKILL.md   - meeting/doc/article ingestion    ║
║  skills/query/SKILL.md    - search + synthesis               ║
║  skills/maintain/SKILL.md - lint, contradictions, orphans    ║
║  skills/enrich/SKILL.md   - external API enrichment          ║
║  skills/briefing/SKILL.md - daily briefing compilation       ║
║  skills/alerts/SKILL.md   - interrupt-driven notifications   ║
║  skills/research/SKILL.md - knowledge gap resolution         ║
║  skills/upgrade/SKILL.md  - agent-guided binary upgrades     ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝
```
Core Philosophy
Thin harness, fat skills. The binary is plumbing. The intelligence lives in `SKILL.md` files. Claude Code, OpenClaw, or any agent reads `SKILL.md` at session start and knows every workflow, heuristic, and edge case without that logic being compiled into the binary. Default skills are embedded in the binary via `include_str!()` and extracted to `~/.quaid/skills/` on first run. External skill files in the working directory override embedded defaults. `quaid skills doctor` shows active resolution order and content hashes.
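The override rule can be illustrated with a minimal sketch. It assumes the working-directory override lives at `skills/<name>/SKILL.md` and that extracted defaults use the same per-skill layout under `~/.quaid/skills/`; `resolve_skill`, `SkillSource`, and the injected `exists` callback are hypothetical names for illustration, not part of the spec.

```rust
use std::path::{Path, PathBuf};

/// Where a skill file was resolved from (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum SkillSource {
    /// Override found in the working directory.
    WorkingDir(PathBuf),
    /// Embedded default that was extracted under the home directory.
    ExtractedDefault(PathBuf),
}

/// Resolve a skill by name, preferring a working-directory override.
/// `exists` is injected so the lookup can be exercised without touching disk.
pub fn resolve_skill(
    name: &str,
    working_dir: &Path,
    home_dir: &Path,
    exists: impl Fn(&Path) -> bool,
) -> SkillSource {
    // Assumed override layout: <working_dir>/skills/<name>/SKILL.md
    let local = working_dir.join("skills").join(name).join("SKILL.md");
    if exists(&local) {
        return SkillSource::WorkingDir(local);
    }
    // Fallback: the copy extracted to ~/.quaid/skills/ on first run
    // (the per-skill subdirectory is an assumption).
    let extracted = home_dir
        .join(".quaid")
        .join("skills")
        .join(name)
        .join("SKILL.md");
    SkillSource::ExtractedDefault(extracted)
}
```

In a real binary the `exists` argument would simply be `|p| p.exists()`; passing it in keeps the precedence logic separate from filesystem access.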
Above the line / Below the line. Every knowledge page has two zones:
- compiled_truth - Always current. Rewritten when new info arrives. The intelligence assessment. The “what we know now.”
- timeline - Append-only. Never rewritten. The evidence base. The “what happened and when.”
The horizontal rule (---) is the boundary. Reconstructed on export.
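As a rough sketch of the two-zone model, the functions below split a page body at the first line that is exactly `---` and rebuild it on export. They assume YAML frontmatter has already been removed by the caller; `PageBody`, `split_page`, and `join_page` are hypothetical names, and the spacing emitted around the rule is an illustrative choice rather than the spec's export format.

```rust
/// The two zones of a knowledge page (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub struct PageBody {
    pub compiled_truth: String,
    pub timeline: String,
}

/// Split a page body at the first line that is exactly `---`.
/// Assumes frontmatter was stripped beforehand, so the first rule is the boundary.
pub fn split_page(body: &str) -> PageBody {
    let mut truth: Vec<&str> = Vec::new();
    let mut timeline: Vec<&str> = Vec::new();
    let mut below = false;
    for line in body.lines() {
        if !below && line.trim_end() == "---" {
            below = true; // the boundary line itself is not stored in either zone
            continue;
        }
        if below {
            timeline.push(line);
        } else {
            truth.push(line);
        }
    }
    PageBody {
        compiled_truth: truth.join("\n").trim().to_string(),
        timeline: timeline.join("\n").trim().to_string(),
    }
}

/// Rebuild the markdown body, re-inserting the horizontal rule between zones.
pub fn join_page(page: &PageBody) -> String {
    format!("{}\n\n---\n\n{}\n", page.compiled_truth, page.timeline)
}
```

Later `---` lines stay inside the timeline zone because only the first rule is treated as the boundary. This round trip normalises whitespace, so it is semantic rather than byte-exact.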
Single logical database, total ownership. memory.db is the database. During operation, SQLite WAL mode creates -wal and -shm sidecars for write performance. Run quaid compact to checkpoint back to a true single file for transport. The practical artifact is: binary + DB + skill pack (embedded defaults, optional overrides). No connection strings. No Docker. No managed database. No API keys required at runtime.
Technology Stack
| Component | Choice | Rationale |
|---|---|---|
| Language | Rust | Single binary via cargo build --release. No runtime. No GC pauses. Memory safe. Cross-compiles cleanly. |
| Database | rusqlite with bundled feature | SQLite compiled into the binary. Zero system dependency. bundled cargo feature = self-contained. |
| Full-text search | FTS5 | Built into SQLite. Porter stemmer + unicode61 tokenizer. Handles 100K+ documents trivially. |
| Vector search | sqlite-vec (statically linked) | Alex Garcia’s sqlite-vec extension. Stores float32 arrays as BLOBs. Native cosine similarity. Same DB, same query. v0.1+ stable. Statically linked via rusqlite. |
| Embeddings | candle + BGE-small-en-v1.5 | HuggingFace’s pure-Rust ML framework. Unlike fastembed (ONNX runtime), candle allows true musl static compilation. Weights embedded via include_bytes!() for zero-network binary. |
| CLI | clap | Industry standard Rust CLI framework. Auto-generates help text. |
| MCP server | rmcp | Rust MCP crate. Stdio transport. |
| Markdown | pulldown-cmark + gray-matter port | Fast CommonMark parser. Frontmatter parsing via custom YAML header extraction. |
| JSON/YAML | serde_json / serde_yaml | Standard serialization. |
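To make the "float32 arrays as BLOBs" and cosine-similarity rows concrete, here is a small std-only sketch. It assumes a packed little-endian float32 layout for the BLOB; the helper names `to_blob`, `from_blob`, and `cosine_similarity` are hypothetical, and in the real system the similarity computation happens inside sqlite-vec rather than in application code.

```rust
/// Pack an embedding into a little-endian float32 byte buffer (assumed layout).
pub fn to_blob(embedding: &[f32]) -> Vec<u8> {
    embedding.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Unpack a little-endian float32 buffer; returns None if the length is not a multiple of 4.
pub fn from_blob(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
            .collect(),
    )
}

/// Cosine similarity; returns 0.0 for mismatched lengths, empty input, or a zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

A 384-dimensional BGE-small vector packs to 384 × 4 = 1,536 bytes under this layout.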
Why Rust over TypeScript/Bun (the earlier reference stack)
| | Garry’s earlier spec (TypeScript/Bun) | This spec (Rust) |
|---|---|---|
| Binary size | ~10MB (Bun compiled) | ~90MB online channel, ~180MB airgapped channel (embedded model weights) |
| Embeddings | text-embedding-3-small (OpenAI API, costs money, needs internet) | BGE-small-en-v1.5 via candle (local, free, fast, pure Rust) |
| Internet required | Yes (for embeddings) | No |
| API keys required | Yes (OPENAI_API_KEY) | No |
| Cross-compile | Bun’s compile works but CGO complications | cargo cross or GitHub Actions matrix = trivial |
| Memory | Node/Bun overhead | Minimal, no GC |
| sqlite-vec linking | Native addon complications with Bun | rusqlite bundled feature handles it cleanly |
| Air-gapped use | No | Yes |
The key differentiator: runs on a plane, in a datacenter without egress, on a client machine with no API keys configured. “Runs on client” is the real value.
Database Schema
```sql
-- memory.db schema
-- Quaid v4

PRAGMA journal_mode = WAL;
PRAGMA foreign_keys = ON;

-- Load sqlite-vec extension (statically linked in Rust binary)
-- SELECT load_extension('./vec0'); -- handled at db init in Rust

-- ============================================================
-- pages: the core content table
-- ============================================================
CREATE TABLE IF NOT EXISTS pages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    slug TEXT NOT NULL UNIQUE,               -- e.g. "people/pedro-franceschi"
    type TEXT NOT NULL,                      -- person, company, deal, yc, civic, project, concept, original, source, media, decision, commitment, action_item
    title TEXT NOT NULL,
    summary TEXT NOT NULL DEFAULT '',        -- executive summary (blockquote at top of compiled_truth)
    compiled_truth TEXT NOT NULL DEFAULT '', -- markdown, above the line
    timeline TEXT NOT NULL DEFAULT '',       -- markdown, below the line
    frontmatter TEXT NOT NULL DEFAULT '{}',  -- JSON blob (original YAML converted)
    wing TEXT NOT NULL DEFAULT '',           -- palace hierarchy: entity grouping (auto-derived from slug, override via frontmatter)
    room TEXT NOT NULL DEFAULT '',           -- palace hierarchy: topic within wing (derived from section headers or frontmatter)
    version INTEGER NOT NULL DEFAULT 1,      -- optimistic concurrency: incremented on every write
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),       -- bumped on ANY page-scoped mutation
    truth_updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')), -- bumped ONLY when compiled_truth changes
    timeline_updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))
);

CREATE INDEX IF NOT EXISTS idx_pages_type ON pages(type);
CREATE INDEX IF NOT EXISTS idx_pages_slug ON pages(slug);
CREATE INDEX IF NOT EXISTS idx_pages_updated ON pages(updated_at);
CREATE INDEX IF NOT EXISTS idx_pages_wing ON pages(wing);
CREATE INDEX IF NOT EXISTS idx_pages_wing_room ON pages(wing, room);

-- ============================================================
-- page_fts: full-text search over compiled_truth + timeline
-- ============================================================
CREATE VIRTUAL TABLE IF NOT EXISTS page_fts USING fts5(
    title,
    slug,
    compiled_truth,
    timeline,
    content='pages',
    content_rowid='id',
    tokenize='porter unicode61'
);

-- Triggers to keep FTS in sync
CREATE TRIGGER IF NOT EXISTS pages_ai AFTER INSERT ON pages BEGIN
    INSERT INTO page_fts(rowid, title, slug, compiled_truth, timeline)
    VALUES (new.id, new.title, new.slug, new.compiled_truth, new.timeline);
END;

CREATE TRIGGER IF NOT EXISTS pages_ad AFTER DELETE ON pages BEGIN
    INSERT INTO page_fts(page_fts, rowid, title, slug, compiled_truth, timeline)
    VALUES ('delete', old.id, old.title, old.slug, old.compiled_truth, old.timeline);
END;

CREATE TRIGGER IF NOT EXISTS pages_au AFTER UPDATE ON pages BEGIN
    INSERT INTO page_fts(page_fts, rowid, title, slug, compiled_truth, timeline)
    VALUES ('delete', old.id, old.title, old.slug, old.compiled_truth, old.timeline);
    INSERT INTO page_fts(rowid, title, slug, compiled_truth, timeline)
    VALUES (new.id, new.title, new.slug, new.compiled_truth, new.timeline);
END;

-- ============================================================
-- page_embeddings: vector embeddings via sqlite-vec
-- vec0 virtual table for native cosine similarity
-- ============================================================
-- Embedding model registry (single source of truth for active model)
-- Each model gets its own vec0 table (dimension is baked into the virtual table).
-- The embedding_models table is the ONLY authoritative selector for the active model.
-- The config keys 'embedding_model' and 'embedding_dimensions' are derived from this
-- table at init and are read-only aliases; never write them directly.
CREATE TABLE IF NOT EXISTS embedding_models (
    name TEXT PRIMARY KEY,              -- e.g. 'bge-small-en-v1.5'
    dimensions INTEGER NOT NULL,        -- e.g. 384
    vec_table TEXT NOT NULL UNIQUE,     -- e.g. 'page_embeddings_vec_384'
    active INTEGER NOT NULL DEFAULT 0,  -- 1 = currently used for writes
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))
);

-- Enforce exactly one active model at all times.
-- Partial unique index: only rows where active=1 participate in the constraint.
-- Zero active rows → init/migration must seed one. Multiple active rows → rejected.
CREATE UNIQUE INDEX IF NOT EXISTS idx_embedding_models_one_active
    ON embedding_models(active) WHERE active = 1;

-- Seed the default model at init (MUST happen before any embedding operation):
-- INSERT INTO embedding_models (name, dimensions, vec_table, active)
-- VALUES ('bge-small-en-v1.5', 384, 'page_embeddings_vec_384', 1);
-- On model upgrade: INSERT new model with active=0, re-embed all pages under it,
-- then in a single transaction: UPDATE old model SET active=0, UPDATE new model SET active=1.
-- The unique index guarantees at most one active=1 row at any point.
-- Old table is kept until explicitly dropped, so rollback is safe.

-- Default model: vec table created dynamically at init based on registered dimensions.
-- Example for BGE-small (384-dim):
-- CREATE VIRTUAL TABLE IF NOT EXISTS page_embeddings_vec_384 USING vec0(
--     embedding float[384]
-- );
-- On model upgrade (e.g. to 768-dim), a new vec table is created and a re-embed
-- migration populates it before flipping the active flag. Old table is kept until
-- explicitly dropped, so rollback is safe.

-- Metadata for each embedding chunk, model-scoped to avoid rowid collisions.
-- Each model's vec table has its own rowid space. page_embeddings stores the
-- vec_rowid for the specific model's vec table, NOT a shared autoincrement.
CREATE TABLE IF NOT EXISTS page_embeddings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,    -- internal metadata ID (NOT vec rowid)
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    model TEXT NOT NULL DEFAULT 'bge-small-en-v1.5' REFERENCES embedding_models(name),
    vec_rowid INTEGER NOT NULL,              -- rowid in the model's vec table (model-scoped)
    chunk_type TEXT NOT NULL,                -- 'truth_section' | 'timeline_entry'
    chunk_index INTEGER NOT NULL,            -- 0-based index within page
    chunk_text TEXT NOT NULL,                -- the text that was embedded
    content_hash TEXT NOT NULL,              -- SHA-256 of chunk_text (skip re-embed if unchanged)
    token_count INTEGER NOT NULL,            -- approximate token count (whitespace-split)
    heading_path TEXT NOT NULL DEFAULT '',   -- e.g. "## State" or "## Timeline > 2024-03-01"
    last_embedded_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    UNIQUE(model, vec_rowid)                 -- one metadata row per vector per model
);
-- Join: page_embeddings.vec_rowid = <model_vec_table>.rowid
-- WHERE page_embeddings.model = <active_model>
-- On model migration: new rows created for new model, old model rows kept for rollback.
-- Cutover: flip embedding_models.active, queries switch to new model's rows.
-- Rollback: flip back, old rows + old vec table still intact.

CREATE INDEX IF NOT EXISTS idx_embeddings_page ON page_embeddings(page_id);
CREATE INDEX IF NOT EXISTS idx_embeddings_model ON page_embeddings(model);
CREATE INDEX IF NOT EXISTS idx_embeddings_lookup ON page_embeddings(model, page_id, chunk_index);

-- ============================================================
-- links: cross-references between pages
-- ============================================================
-- Surrogate ID is the stable target for link-close operations.
-- No UNIQUE on (from, to, relationship, valid_from): multiple intervals with
-- unknown start dates are allowed. Dedup and non-overlap enforced in app logic.
-- memory_link_close targets by link ID, not by date columns.
CREATE TABLE IF NOT EXISTS links (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    to_page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    relationship TEXT NOT NULL DEFAULT 'related',
    context TEXT NOT NULL DEFAULT '',
    valid_from TEXT DEFAULT NULL,
    valid_until TEXT DEFAULT NULL,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    -- Temporal ordering: valid_until must be after valid_from when both are known
    CHECK (valid_from IS NULL OR valid_until IS NULL OR valid_until >= valid_from)
);

CREATE INDEX IF NOT EXISTS idx_links_from ON links(from_page_id);
CREATE INDEX IF NOT EXISTS idx_links_to ON links(to_page_id);
CREATE INDEX IF NOT EXISTS idx_links_current ON links(valid_until); -- fast filter for current-only queries

-- ============================================================
-- assertions: heuristic contradiction detection
-- Populated by agents during Tier 2 ingest. Enables pure-SQL
-- consistency checks without burning LLM tokens.
-- ============================================================
-- Surrogate ID is the stable target for supersession.
-- No UNIQUE on (page_id, subject, predicate, object, valid_from): multiple intervals
-- with unknown start dates are allowed. Tier 2 rewrites supersede old assertions by
-- setting valid_until on the prior row AND pointing supersedes_id to it.
-- Contradiction detection: SELECT WHERE valid_until IS NULL (current beliefs only).
-- Dedup enforced in application logic during ingest.
CREATE TABLE IF NOT EXISTS assertions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    subject TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object TEXT NOT NULL,
    valid_from TEXT DEFAULT NULL,
    valid_until TEXT DEFAULT NULL,
    supersedes_id INTEGER DEFAULT NULL REFERENCES assertions(id),
    confidence REAL DEFAULT 1.0,
    asserted_by TEXT NOT NULL DEFAULT 'agent',
    source_ref TEXT NOT NULL DEFAULT '',
    evidence_text TEXT NOT NULL DEFAULT '',
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    CHECK (valid_from IS NULL OR valid_until IS NULL OR valid_until >= valid_from),
    CHECK (asserted_by IN ('agent', 'manual', 'import', 'enrichment'))
);

CREATE INDEX IF NOT EXISTS idx_assertions_subj ON assertions(subject);
CREATE INDEX IF NOT EXISTS idx_assertions_pred ON assertions(predicate);

-- ============================================================
-- tags
-- ============================================================
CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    tag TEXT NOT NULL,
    UNIQUE(page_id, tag)
);

CREATE INDEX IF NOT EXISTS idx_tags_tag ON tags(tag);
CREATE INDEX IF NOT EXISTS idx_tags_page_id ON tags(page_id);

-- ============================================================
-- raw_data: sidecar data (replaces .raw/ JSON files)
-- ============================================================
CREATE TABLE IF NOT EXISTS raw_data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    source TEXT NOT NULL,  -- "crustdata", "happenstance", "exa", "partiful"
    data TEXT NOT NULL,    -- full JSON response
    fetched_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    UNIQUE(page_id, source) -- one row per source per page, overwrite on re-enrich
);

CREATE INDEX IF NOT EXISTS idx_raw_data_page ON raw_data(page_id);

-- ============================================================
-- timeline_entries: structured timeline (supplements markdown)
-- ============================================================
CREATE TABLE IF NOT EXISTS timeline_entries (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    date TEXT NOT NULL,                     -- ISO 8601: YYYY-MM-DD
    source TEXT NOT NULL DEFAULT '',        -- "meeting", "email", "manual", etc.
    summary TEXT NOT NULL,                  -- one-line summary
    summary_hash TEXT NOT NULL DEFAULT '',  -- SHA-256 of summary for dedupe
    detail TEXT NOT NULL DEFAULT '',        -- full markdown detail
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    UNIQUE(page_id, date, summary_hash)     -- prevent replay duplicates
);

CREATE INDEX IF NOT EXISTS idx_timeline_page ON timeline_entries(page_id);
CREATE INDEX IF NOT EXISTS idx_timeline_date ON timeline_entries(date);

-- ============================================================
-- raw_imports: original file bytes for byte-exact round-trip
-- ============================================================
CREATE TABLE IF NOT EXISTS raw_imports (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    import_id TEXT NOT NULL,                -- identifies the import batch
    is_active INTEGER NOT NULL DEFAULT 1,   -- 1 = current snapshot for this page, 0 = historical
    content_hash TEXT NOT NULL DEFAULT '',  -- SHA-256 of raw_bytes for indexed duplicate detection
    raw_bytes BLOB NOT NULL,                -- original file content, byte-for-byte
    file_path TEXT NOT NULL,                -- relative path within the import source
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    UNIQUE(page_id, import_id)
);
-- On re-import: set is_active=0 on prior rows for the same page_id, then insert new row.
-- Raw export uses the active snapshot by default, or a specific import_id if provided.

CREATE INDEX IF NOT EXISTS idx_raw_imports_page ON raw_imports(page_id);
CREATE INDEX IF NOT EXISTS idx_raw_imports_active ON raw_imports(page_id, is_active) WHERE is_active = 1;
CREATE INDEX IF NOT EXISTS idx_raw_imports_content_hash ON raw_imports(content_hash) WHERE content_hash != '';

-- Import manifest: tracks each import batch for rollback/audit
CREATE TABLE IF NOT EXISTS import_manifest (
    import_id TEXT PRIMARY KEY,  -- UUID or timestamp-based batch ID
    source_dir TEXT NOT NULL,    -- original import path
    page_count INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))
);

-- ============================================================
-- raw_imports: byte-exact source storage for restore + duplicate checks
-- ============================================================
-- Single-file ingest first probes `raw_imports.content_hash` (SHA-256) via the
-- partial index above, then confirms equality with `raw_imports.raw_bytes`
-- before treating a file as already ingested unless `--force` is used.
-- Content-changing writes rotate the active `raw_imports` row in the same
-- transaction, preserving exactly one active source snapshot per page for
-- restore/export while bounded retention keeps older inactive rows available
-- for forensics.

-- ============================================================
-- config: memory-level settings
-- ============================================================
CREATE TABLE IF NOT EXISTS config (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL
);

INSERT OR IGNORE INTO config (key, value) VALUES
    ('version', '8'),
    -- embedding_model and embedding_dimensions are derived from embedding_models
    -- at startup and kept in sync automatically. Do not write them directly.
    ('embedding_model', 'bge-small-en-v1.5'),  -- read-only alias, derived from embedding_models WHERE active=1
    ('embedding_dimensions', '384'),           -- read-only alias, derived from embedding_models WHERE active=1
    ('chunk_strategy', 'section'),             -- "page", "section", or "paragraph"
    ('search_merge_strategy', 'set-union'),    -- "set-union" (default) or "rrf" (fallback)
    ('default_token_budget', '4000');          -- default for progressive retrieval queries

-- ============================================================
-- contradictions: detected inconsistencies across pages
-- ============================================================
CREATE TABLE IF NOT EXISTS contradictions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    other_page_id INTEGER REFERENCES pages(id) ON DELETE SET NULL,
    type TEXT NOT NULL,            -- "temporal", "cross_page", "stale"
    description TEXT NOT NULL,
    detected_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    resolved_at TEXT DEFAULT NULL  -- NULL = unresolved
);

CREATE INDEX IF NOT EXISTS idx_contradictions_page ON contradictions(page_id);
CREATE INDEX IF NOT EXISTS idx_contradictions_unresolved ON contradictions(resolved_at) WHERE resolved_at IS NULL;

-- ============================================================
-- knowledge_gaps: queries the memory couldn't answer well
-- Privacy-safe by default: raw query text is NOT retained
-- unless explicitly approved. Only query_hash is stored on
-- detection; query_text is populated post-approval.
-- ============================================================
CREATE TABLE IF NOT EXISTS knowledge_gaps (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    query_hash TEXT NOT NULL,        -- SHA-256 of original query, always stored
    query_text TEXT DEFAULT NULL,    -- raw text retained only after approval
    context TEXT NOT NULL DEFAULT '',
    confidence_score REAL DEFAULT NULL,
    sensitivity TEXT NOT NULL DEFAULT 'internal',
    approved_by TEXT DEFAULT NULL,
    approved_at TEXT DEFAULT NULL,
    redacted_query TEXT DEFAULT NULL,
    resolved_at TEXT DEFAULT NULL,
    resolved_by_slug TEXT DEFAULT NULL,
    detected_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    CHECK (sensitivity IN ('internal', 'external', 'redacted')),
    CHECK (query_text IS NULL OR (approved_by IS NOT NULL AND approved_at IS NOT NULL))
);

CREATE INDEX IF NOT EXISTS idx_gaps_unresolved ON knowledge_gaps(resolved_at) WHERE resolved_at IS NULL;
```
Schema Notes
Section titled “Schema Notes”- All text fields: UTF-8
- All timestamps: ISO 8601 (
YYYY-MM-DDTHH:MM:SSZfor timestamps,YYYY-MM-DDfor dates) - Embeddings: BGE-small-en-v1.5 produces 384-dimensional float32 vectors. sqlite-vec stores them natively in
vec0 virtual table. 384 floats × 4 bytes = 1,536 bytes per chunk (vs 6,144 bytes for OpenAI’s 1536-dim model, 4x smaller)
- JSON fields (frontmatter, data, pages_updated): stored as TEXT, parsed in application layer
- Slugs include directory prefix: people/pedro-franceschi, companies/river-ai, deals/river-ai-series-a
- Embedding model registry (single source of truth): Each model gets its own vec0 table (dimension is baked in). The embedding_models table is the ONLY authoritative selector for the active model: a partial unique index (WHERE active = 1) guarantees that no more than one model is active at any time. The config keys embedding_model and embedding_dimensions are derived from the registry at startup and are read-only aliases. Switching models: register the new model with active=0, run quaid embed --all to populate its vec table, then in a single transaction flip active=0 on the old model and active=1 on the new one. The old table is kept for rollback. quaid validate --embeddings checks: (a) exactly one active model exists, (b) all chunks reference the active model, (c) all vec_rowids resolve correctly.
- Assertions: A temporal fact table with provenance (subject, predicate, object, valid_from, valid_until, supersedes_id, asserted_by, source_ref, evidence_text). Each assertion has a surrogate id. Tier 2 rewrites supersede old assertions by setting valid_until on the prior row AND pointing supersedes_id to it in the new row. Contradiction detection queries only current beliefs (valid_until IS NULL). valid_from is nullable (NULL = unknown). Dedup is enforced in application logic.
- Chunk types: page_embeddings.chunk_type tracks whether a chunk is a truth_section or a timeline_entry. Timeline entries are embedded individually for hyper-specific temporal retrieval.
- Palace hierarchy (wing, room): Auto-derived from slug by default. people/pedro-franceschi → wing: pedro-franceschi, room: derived from section headers (State, Assessment, etc.). Override via frontmatter wing: and room: fields. Wings map to MemPalace’s concept of entity groupings; rooms map to topic sub-areas within an entity.
- Summary: Extracted from the first blockquote (> ...) in compiled_truth during put/ingest. Used by progressive retrieval to serve lightweight results without loading full pages.
- Temporal links (valid_from, valid_until, relationship): Links carry typed relationships and temporal validity windows. valid_until IS NULL = currently active. valid_from is nullable (NULL = unknown start date). Each link has a surrogate id used by memory_link_close for unambiguous targeting. Multiple intervals with unknown start dates are allowed; dedup and non-overlap are enforced in application logic.
- Freshness timestamps: updated_at is bumped on any page-scoped mutation. truth_updated_at is bumped only when compiled_truth changes (Tiers 2-4). timeline_updated_at is bumped only when timeline content or timeline_entries change (Tier 1). Staleness = timeline_updated_at > truth_updated_at by 30+ days.
- Contradictions: Detected by quaid check (CLI) and memory_check (MCP). Stored with resolved_at for tracking. Unresolved contradictions surface in briefings and maintenance reports.
- Knowledge gaps: Logged by memory_gap (MCP) when memory_query returns no results or only low-confidence matches (below a configurable threshold). Privacy-safe by default: only a SHA-256 query_hash is stored at detection time; raw query_text is NULL until explicitly approved. confidence_score stores the highest search score from the triggering query. Sensitivity is always internal at creation; memory_gap does not accept a sensitivity parameter. Escalation to redacted or external requires a separate memory_gap_approve call that records approved_by, approved_at, and optionally populates query_text and redacted_query (the anonymised version for redacted mode). A CHECK constraint enforces that query_text can only be non-NULL when an approval audit trail exists. The research skill (skills/research/SKILL.md) refuses external calls for any gap without an approval record. When a gap is resolved via ingest, resolved_at and resolved_by_slug are set. Unresolved gaps surface in briefings alongside contradictions.
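The supersession rule for assertions (close the prior row, point the new row back at it, query only open rows) can be sketched in memory as follows. This is a minimal illustration, not the schema: the `Assertion` struct, `supersede`, and `current_beliefs` are hypothetical names, only a subset of the columns is modelled, and dates are plain ISO 8601 strings.

```rust
/// Hypothetical in-memory mirror of a few assertion columns from the spec.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    id: u64,
    subject: String,
    predicate: String,
    object: String,
    valid_from: Option<String>,  // None = unknown start (NULL)
    valid_until: Option<String>, // None = current belief (NULL)
    supersedes_id: Option<u64>,
}

/// Close the current assertion for (subject, predicate), if any, and append a
/// new one whose supersedes_id points at the closed row. Returns the new id.
fn supersede(
    rows: &mut Vec<Assertion>,
    subject: &str,
    predicate: &str,
    new_object: &str,
    as_of: &str,
) -> u64 {
    // Find the open row (valid_until IS NULL) and set its valid_until.
    let prior_id = rows
        .iter_mut()
        .find(|a| a.subject == subject && a.predicate == predicate && a.valid_until.is_none())
        .map(|a| {
            a.valid_until = Some(as_of.to_string());
            a.id
        });

    let new_id = rows.iter().map(|a| a.id).max().unwrap_or(0) + 1;
    rows.push(Assertion {
        id: new_id,
        subject: subject.to_string(),
        predicate: predicate.to_string(),
        object: new_object.to_string(),
        valid_from: Some(as_of.to_string()),
        valid_until: None,
        supersedes_id: prior_id,
    });
    new_id
}

/// Contradiction detection only considers rows that are still open.
fn current_beliefs(rows: &[Assertion]) -> Vec<&Assertion> {
    rows.iter().filter(|a| a.valid_until.is_none()).collect()
}
```

Calling `supersede` twice for the same subject and predicate leaves exactly one open row, with the older row closed at the second call's date.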
CLI Reference
The CLI is a thin dispatcher. Each command maps to a handler in src/commands/.
    USAGE:
      quaid [OPTIONS] <COMMAND>

    OPTIONS:
      --db <PATH>     Path to memory.db [env: QUAID_DB] [default: ./memory.db]
      --json          Output JSON instead of human-readable text
      --version       Print version
      --tools-json    Print MCP tool discovery JSON

    COMMANDS:
      init [PATH]                  Create a new memory.db
      get <SLUG>                   Read a page by slug
      put <SLUG> [FILE]            Write/update a page (stdin or file)
      search <QUERY>               FTS5 full-text search
      query <QUESTION>             Hybrid semantic search (FTS5 + vector, set-union merge)
        --depth <LEVEL>            summary|section|full|auto [default: auto]
        --token-budget <N>         Max tokens to return [default: from config]
        --wing <WING>              Filter to specific palace wing
      ingest <FILE>                Ingest a source document
        --type <TYPE>              meeting|article|doc|conversation
      link <FROM> <TO>             Create cross-reference
        --context <TEXT>           Sentence containing the link
        --relationship <TYPE>      Relationship type (works_at, founded, invested_in, board_member, related)
        --valid-from <DATE>        When relationship started (ISO 8601)
        --valid-until <DATE>       When relationship ended (NULL = still current)
      link-close <LINK_ID>         Close a temporal link interval by ID
        --valid-until <DATE>       When relationship ended (required)
      links <SLUG>                 List all outbound links with IDs (for discovering link IDs)
        --temporal <current|historical|all>  Filter by temporal state [default: all]
        --json                     Include link IDs, valid_from, valid_until in output
      unlink <FROM> <TO>           Remove cross-reference entirely
        --relationship <TYPE>      Specific relationship to remove (default: all)
      backlinks <SLUG>             Show pages that link TO this slug (includes link IDs)
        --temporal <current|historical|all>  Filter by temporal state [default: current]
      graph <SLUG>                 N-hop neighborhood graph (pages + links as JSON)
        --depth <N>                Hops from center node [default: 2]
        --temporal <current|historical|all>  Filter links by temporal state [default: current]
        --limit <N>                Max nodes to return [default: 50]
      check [SLUG]                 Run contradiction detection
        --all                      Check entire memory
        --type <temporal|cross_page|stale>   Filter by check type
      gaps                         List unresolved knowledge gaps
        --limit <N>                Max results [default: 20]
        --resolved                 Include resolved gaps
      tags <SLUG>                  List tags for a page
      tag <SLUG> <TAG>             Add a tag
      untag <SLUG> <TAG>           Remove a tag
      timeline <SLUG>              Show timeline entries
      timeline-add <SLUG>          Add a structured timeline entry
        --date <YYYY-MM-DD>        Date of entry (required)
        --summary <TEXT>           One-line summary (required)
        --source <SOURCE>          Source identifier (e.g. "meeting/123")
        --detail <TEXT>            Full markdown detail
      list                         List pages with filters
        --type <TYPE>              Filter by page type
        --tag <TAG>                Filter by tag
        --limit <N>                Max results [default: 50]
        --sort <updated|created|title>       Sort order [default: updated]
      stats                        Memory statistics
      export [--dir <PATH>]        Export to markdown directory [default: ./export/]
        --raw --import-id <ID>     Byte-exact export from specific import batch
      import <DIR>                 Import from markdown directory
      compact                      Checkpoint WAL → single file (for transport/backup)
      embed [<SLUG>]               Generate/regenerate embeddings
        --all                      Embed all pages
        --stale                    Only pages updated since last embedding
      config get <KEY>             Read a config value
      config set <KEY> <VALUE>     Write a config value
      config list                  List all config
      serve                        Start MCP server (stdio transport)
      call <TOOL> <JSON>           Raw tool call (GL pattern)
      pipe                         JSONL pipe mode (one JSON object per line)
      validate                     Run integrity checks on memory.db
        --links                    Check link interval non-overlap, temporal ordering
        --assertions               Check assertion dedup, supersession chains
        --embeddings               Check all chunks have valid vec_rowids in active model
        --all                      Run all integrity checks
      skills                       List active skills (embedded + external)
      skills doctor                Verify skill resolution, show versions/hashes
      version                      Version info

DB path resolution
1. --db /path/to/memory.db flag (highest priority)
2. QUAID_DB environment variable
3. ./memory.db in current directory (default)
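The precedence above (flag, then environment variable, then the default file) fits in one small function. A minimal sketch: `resolve_db_path` is a hypothetical helper name, and the flag and environment values are passed in as arguments so the function stays pure and easy to test.

```rust
use std::path::PathBuf;

/// Resolve the database path in priority order:
/// 1. the --db flag, 2. the QUAID_DB environment variable, 3. ./memory.db.
fn resolve_db_path(flag: Option<&str>, env_value: Option<&str>) -> PathBuf {
    flag.or(env_value)
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("./memory.db"))
}
```

A caller would typically pass the parsed flag and `std::env::var("QUAID_DB").ok().as_deref()` as the two arguments.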
Output formats
- Default: Human-readable markdown/text (Claude-friendly)
- --json: JSON for programmatic use
- quaid pipe: JSONL streaming mode
- quaid --tools-json: MCP tool discovery format (compatible with Claude Code tool use)
Usage examples
    # Create a new memory
    $ quaid init ~/my-memory.db

    # Attach and sync an existing markdown directory
    $ quaid collection add memory /data/memory/ --db ~/memory.db
    Initial sync walking 7,471 files...
      people: 1,222 pages
      companies: 847 pages
      deals: 234 pages
      ...
      links: 14,329 cross-references extracted
      raw_data: 892 sidecar files loaded
      timeline_entries: 23,441 entries parsed
    Done. memory.db: 487MB
    Generating embeddings (7,471 pages, section strategy)...
      Embedded 22,847 chunks in 3m 14s
    memory.db (with embeddings): 521MB
    Validation: 7,471 files → 7,471 pages ✓

    # Full-text search
    $ quaid search "River AI"
    people/ali-partovi.md (score: 12.3)
      ...River AI board member since 2024...
    companies/river-ai.md (score: 45.7)
      ...River AI is building...

    # Semantic query
    $ quaid query "who knows Jensen Huang?"
    Searching 7,471 pages (FTS5 + vec0, set-union merge)...
    people/ali-partovi.md — mentioned NVIDIA partnership (score: 0.89)
    people/ilya-sutskever.md — co-presented at NeurIPS (score: 0.84)
    people/marc-andreessen.md — board connection via Meta (score: 0.81)

    # Read a page
    $ quaid get people/pedro-franceschi
    ---
    title: Pedro Franceschi
    type: person
    ...
    ---
    # Pedro Franceschi
    > Co-founder and CEO of Brex. YC alum (W17)...

    # Write/update a page
    $ cat updated-pedro.md | quaid put people/pedro-franceschi

    # Stats
    $ quaid stats
    Pages: 7,471
      people: 1,222
      companies: 847
      deals: 234
      ...
    Links: 14,329
    Tags: 8,892
    Raw data: 892
    Timeline entries: 23,441
    Embeddings: 22,847 chunks (bge-small-en-v1.5)
    DB size: 521MB

    # Start MCP server
    $ quaid serve
    Quaid MCP server running (stdio)
    Model: bge-small-en-v1.5 (384-dim, local)
    DB: /Users/garry/memory.db (521MB, 7471 pages)
    Tools: search, get, put, ingest, link, query, timeline, tags, list, stats

MCP Server
Transport
Stdio (standard MCP). The client spawns quaid serve as a subprocess and communicates via stdin/stdout JSON-RPC 2.0.
Claude Code config (~/.claude/mcp.json)
    {
      "mcpServers": {
        "quaid": {
          "command": "quaid",
          "args": ["serve", "--db", "/path/to/memory.db"]
        }
      }
    }

| Tool | Description | Parameters |
|---|---|---|
| `memory_search` | FTS5 full-text search | `{ query: string, type?: string, wing?: string, limit?: number }` |
| `memory_query` | Hybrid search (SMS + set-union + progressive retrieval) | `{ question: string, depth?: "summary"\|"section"\|"full"\|"auto", token_budget?: number, wing?: string, limit?: number }` |
| `memory_get` | Read a page by slug | `{ slug: string }` |
| `memory_put` | Write/update a page (auto-extracts summary + palace metadata) | `{ slug: string, content: string, expected_version: number, assertions?: Array<{subject, predicate, object, valid_from?, asserted_by?, source_ref?, evidence_text?}> }` or `{ slug: string, compiled_truth?: string, timeline_append?: string, frontmatter?: object, expected_version: number }` |
| `memory_ingest` | Ingest a source document (single transactional mutation) | `{ content: string, source_type: string, source_ref: string, force?: boolean, pages: Array<{slug, content, expected_version, assertions?, links?, timeline_entries?, tags?}> }` |
| `memory_link` | Create a new link interval (returns link ID) | `{ from: string, to: string, relationship?: string, context?: string, valid_from?: string, page_version: number }` |
| `memory_links` | List outbound links with IDs | `{ slug: string, temporal?: "current"\|"historical"\|"all" }` |
| `memory_link_close` | Close an existing link interval by ID | `{ link_id: number, valid_until: string, page_version: number }` |
| `memory_unlink` | Remove cross-reference entirely | `{ from: string, to: string, relationship?: string, page_version: number }` |
| `memory_timeline` | Get timeline entries | `{ slug: string, limit?: number }` |
| `memory_timeline_add` | Add timeline entry | `{ slug: string, date: string, summary: string, source?: string, detail?: string, page_version: number }` |
| `memory_tags` | List tags for a page | `{ slug: string }` |
| `memory_tag` | Add/remove tag | `{ slug: string, tag: string, remove?: boolean, page_version: number }` |
| `memory_list` | List pages with filters | `{ type?: string, tag?: string, wing?: string, limit?: number, sort?: string }` |
| `memory_backlinks` | Pages linking to a slug (temporal filtering) | `{ slug: string, temporal?: "current"\|"historical"\|"all" }` |
| `memory_graph` | N-hop neighborhood graph (pages + links as JSON) | `{ slug: string, depth?: number, temporal?: "current"\|"historical"\|"all", limit?: number }` |
| `memory_check` | Run contradiction detection | `{ slug?: string, type?: "temporal"\|"cross_page"\|"stale", resolve?: string }` |
| `memory_gap` | Log a knowledge gap (always created as internal, no caller override) | `{ query_text: string, context?: string, confidence_score?: number }` |
| `memory_gap_approve` | Escalate gap sensitivity (audited: records approver + timestamp) | `{ gap_id: number, sensitivity: "redacted"\|"external", approver: string, redacted_query?: string }` |
| `memory_gaps` | List unresolved knowledge gaps | `{ limit?: number, include_resolved?: boolean }` |
| `memory_stats` | Memory statistics (includes contradiction + gap counts) | `{}` |
| `memory_raw` | Read/write raw enrichment data | `{ slug: string, source?: string, data?: object, page_version: number }` |
Concurrency Model:
- memory_put requires expected_version. Mismatch → MCP Tool Error (conflict). Agent must memory_get, merge, retry.
- memory_ingest is a single server-side transactional mutation. The agent passes all page updates, links, timeline entries, assertions, and tags in one call. The server wraps everything in a SQLite transaction, checks expected_version for each page, and either commits all or rolls back all. This eliminates the window where side tables (links, timeline, tags) can desync from page content.
- All page-scoped mutators (memory_link, memory_unlink, memory_timeline_add, memory_tag, memory_raw) require page_version. The server verifies it matches before mutating and bumps pages.version on success. Mismatch → same conflict error as memory_put. No write path bypasses the version check.
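The version check behind these rules can be sketched with an in-memory store. This is an illustration under stated assumptions, not the server implementation: `Store`, `Page`, `WriteError`, and the behaviour for a missing page (`NotFound`) are hypothetical, and a real server would perform the check inside a SQLite transaction.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum WriteError {
    /// The caller's expected_version no longer matches the stored version.
    Conflict { expected: u64, actual: u64 },
    /// Assumption for this sketch: writes to unknown slugs are rejected.
    NotFound,
}

struct Page {
    content: String,
    version: u64,
}

struct Store {
    pages: HashMap<String, Page>,
}

impl Store {
    /// Write only if the caller saw the latest version; bump the version on success.
    fn put(&mut self, slug: &str, content: &str, expected_version: u64) -> Result<u64, WriteError> {
        let page = self.pages.get_mut(slug).ok_or(WriteError::NotFound)?;
        if page.version != expected_version {
            return Err(WriteError::Conflict {
                expected: expected_version,
                actual: page.version,
            });
        }
        page.content = content.to_string();
        page.version += 1;
        Ok(page.version)
    }
}
```

On `Conflict`, the agent re-reads the page, merges its change against the newer content, and retries with the fresh version number, which is the get, merge, retry loop described above.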
Resources
| Resource | URI | Description |
|---|---|---|
| Page | memory://pages/{slug} | Full page content as markdown |
| Index | memory://index | All page slugs grouped by type |
Prompts
| Prompt | Description |
|---|---|
| `memory_briefing` | Compile a briefing from current memory state |
| `memory_ingest_meeting` | Guide for ingesting a meeting transcript |
Hybrid Search (Exact-Match Short-Circuit + Set-Union)
The core search experience uses a staged pipeline (Steps 0-5 in the algorithm below) designed to prevent semantic noise from burying exact keyword matches. FTS5 and vector search fan out in parallel and are merged using set-union (default) or RRF (fallback). Palace-style pre-filtering narrows the search space. SMS (exact-match short-circuit) ensures title/slug matches always rank first.
Why set-union over RRF
UNC’s AutoResearchClaw pipeline (Apr 2026) tested RRF-style score re-ranking and found that it degrades performance: score-based re-ranking disrupts the semantic ordering that dense retrieval already established. Their finding: set-union merging (keep the vector ranking intact, append BM25-only results) delivered +44% F1 in a single iteration on the LoCoMo benchmark. An ablation confirmed that removing the BM25 hybrid component costs 14% F1. The sparse results add value, but only when they don’t interfere with dense ranking.
Config: search_merge_strategy in the config table. Default set-union, fallback rrf for A/B testing on different corpora.
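The set-union rule is small enough to show directly. A minimal sketch, assuming each result list has already been reduced to page IDs in rank order; `set_union_merge` is a hypothetical name, and the tie-breaking boosts mentioned in the algorithm are left out.

```rust
use std::collections::HashSet;

/// Set-union merge: exact matches first, then vector results in their
/// original order, then FTS-only results. Nothing is re-scored or re-sorted.
fn set_union_merge(exact: &[u64], vec_ranked: &[u64], fts_ranked: &[u64]) -> Vec<u64> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for &page_id in exact.iter().chain(vec_ranked).chain(fts_ranked) {
        // insert() returns false if the page was already emitted by an earlier list.
        if seen.insert(page_id) {
            merged.push(page_id);
        }
    }
    merged
}
```

Because the lists are only concatenated and deduplicated, the dense ranking is never disturbed by BM25 scores, which is the property the ablation above credits for the gain.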
Algorithm
    query = "who knows Jensen Huang?"

    Step 0: Palace pre-filter (intent classification)
    ──────────────────────────────────────────────────
    Classify query intent → target wing(s) + room(s).
    Rule-based: extract entity names from query, match against page slugs/titles.
    If match found: constrain Steps 2-3 to matching wing(s).
    If no match: search all pages (no filter).

    Example: "who knows Jensen Huang?" → wing filter: '%jensen-huang%'
    Example: "what's our thesis on River AI?" → wing filter: '%river-ai%'
    Example: "all YC founders in batch W25" → no wing filter (cross-cutting query)

    When an LLM agent is driving the query (via MCP), the agent can pass explicit
    wing/room filters based on skills/query/SKILL.md guidance.

    Step 1: SMS (Exact-Match Short-Circuit): ABSOLUTE RANKING
    ──────────────────────────────────────────────────
    -- If the query exactly matches a page title or slug, it jumps to the top.
    -- This prevents semantic fuzziness from burying the obvious result.
    SELECT id, title, slug FROM page_fts
    WHERE title MATCH ? OR slug MATCH ?
    LIMIT 5;
    -- Result: exact_results[] (these appear first in the final output, always)

    Step 2: Vector similarity search (candle + sqlite-vec): PRIMARY RANKING
    ──────────────────────────────────────────────────
    -- Embed the query with pure-Rust candle (local, in-process)
    query_embedding = candle_embed("who knows Jensen Huang?")

    -- Resolve active model's vec table from the embedding_models registry
    -- (e.g. active model = 'bge-small-en-v1.5' → vec_table = 'page_embeddings_vec_384')
    active_model = SELECT name, vec_table FROM embedding_models WHERE active = 1;

    -- Cosine distance via sqlite-vec, with optional palace filter
    -- NOTE: vec table name is resolved at runtime from active_model.vec_table
    -- NOTE: vec_score is a distance (lower = more similar), so ascending order ranks best first
    -- Join key: pev.rowid = pe.vec_rowid (NOT pe.id; id is internal metadata key)
    SELECT pe.page_id, pe.chunk_text,
           vec_distance_cosine(pev.embedding, ?) AS vec_score
    FROM {active_model.vec_table} pev
    JOIN page_embeddings pe ON pev.rowid = pe.vec_rowid AND pe.model = {active_model.name}
    JOIN pages p ON pe.page_id = p.id
    WHERE (? IS NULL OR p.wing LIKE ?)  -- palace filter (NULL = no filter)
    ORDER BY vec_score
    LIMIT 50;

    -- Deduplicate: if same page_id appears multiple times, keep the best-matching
    -- chunk (smallest distance)
    -- Result: vec_results[] with original ranking preserved

    Step 3: FTS5 keyword search: SUPPLEMENTARY
    ──────────────────────────────────────────────────
    SELECT pages.id, pages.slug, pages.title,
           bm25(page_fts) AS fts_score
    FROM page_fts
    JOIN pages ON pages.id = page_fts.rowid
    WHERE page_fts MATCH ?
      AND (? IS NULL OR pages.wing LIKE ?)  -- same palace filter
    ORDER BY fts_score
    LIMIT 50;

    -- Result: fts_results[]

    Step 4: Set-union merge (default)
    ──────────────────────────────────────────────────
    -- Start with SMS exact matches (always first)
    merged = exact_results.clone()

    -- Then vector results in their original order (primary ranking)
    for result in vec_results:
        if result.page_id NOT IN merged:
            merged.append(result)

    -- Then FTS5-only results (those NOT in vector set) at the end
    for result in fts_results:
        if result.page_id NOT IN merged:
            merged.append(result)

    -- Apply lightweight boosts only to break ties within the appended FTS-only set:
    --   +0.01 if page type matches question intent
    --   +0.01 if updated_at within last 30 days

    Step 4 (alt): RRF merge (fallback, config: search_merge_strategy = "rrf")
    ──────────────────────────────────────────────────
    SMS exact matches still go first. Then for remaining pages, compute:
        rrf_score = 1/(k + rank_vec) + 1/(k + rank_fts)
        where k = 60 (standard RRF constant)
    If a page appears in only one set: rrf_score = 1/(k + rank)
    Sort by rrf_score DESC.

    Step 5: Final ranking + fetch
    ──────────────────────────────────────────────────
    Top results returned with:
      - slug
      - title
      - summary (executive blockquote, for progressive retrieval Stage 1)
      - relevant excerpt (chunk_text from best-matching vector chunk, or FTS snippet)
      - score
      - type
      - wing, room

    For deeper results, see Progressive Retrieval below.

Palace filtering impact
MemPalace’s published ablation: base retrieval at 60.9% R@5 rises to 94.8% with wing+room filtering (a gain of about 34 percentage points). Quaid’s palace filter is derived from slug structure rather than a separate palace DB, but the principle is identical: constrain the search space before running expensive similarity queries.
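The rule-based pre-filter can be sketched as a slug lookup. The spec states only that the wing is auto-derived from the slug and gives one example, so two assumptions are made here: that the wing is the final path segment of the slug, and that a query names an entity when it contains the wing with hyphens read as spaces. `wing_from_slug` and `wing_filter` are hypothetical names.

```rust
/// Assumption: the default wing is the last path segment of the slug,
/// e.g. "people/pedro-franceschi" -> "pedro-franceschi".
fn wing_from_slug(slug: &str) -> &str {
    slug.rsplit('/').next().unwrap_or(slug)
}

/// Return a SQL LIKE pattern for the first known wing that the query mentions,
/// or None for cross-cutting queries (meaning: search all pages).
fn wing_filter(query: &str, slugs: &[&str]) -> Option<String> {
    let normalized = query.to_lowercase();
    slugs
        .iter()
        .map(|s| wing_from_slug(s))
        // "jensen-huang" matches a query containing "jensen huang".
        .find(|wing| normalized.contains(&wing.replace('-', " ")))
        .map(|wing| format!("%{}%", wing))
}
```

The returned pattern corresponds to the value bound to the `p.wing LIKE ?` parameter in the vector and FTS5 queries; `None` maps to the NULL (no filter) case.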
Performance targets
- SMS exact match: < 5ms (FTS5 title/slug lookup)
- Palace filter classification: < 5ms (regex match against slug index)
- FTS5 search: < 50ms for 100K pages
- Vector search: < 200ms for 50K chunks (sqlite-vec with float32[384])
- Full hybrid query (SMS + palace + vector + FTS5 + merge): < 250ms
- Embedding generation (query): < 20ms (BGE-small-en-v1.5 via candle on CPU, already loaded)
Progressive Retrieval
Token-budget-gated expansion. Instead of returning full pages and hoping the agent handles context management, quaid controls how much content it serves based on a configurable token budget.
Why this matters
OMNIMEM’s ablation (AutoResearchClaw, Apr 2026): removing progressive retrieval costs 17% F1, the largest single-component contribution. agentmemory reports 92% fewer tokens than dumping everything into context. The pattern: serve summaries first, expand on demand, stop when the budget is consumed.
Depth levels
| Level | What’s returned | Tokens per result | Use case |
|---|---|---|---|
| `summary` | Title + executive summary blockquote | ~50-100 | Quick scan, “do we know this person?” |
| `section` | Best-matching chunk from vector search | ~200-500 | Targeted retrieval, “what’s the latest on X?” |
| `full` | Complete compiled_truth | ~500-2000 | Deep read, preparing for a meeting |
| `auto` | Start at summary, expand top results until token budget consumed | varies | Default: agent doesn’t need to choose |
Token budget
Default: 4000 tokens (configurable in config table as default_token_budget). Override per-query via the token_budget parameter.
Algorithm (auto mode)
    1. Run hybrid search → ranked results[]
    2. For each result (in rank order):
       a. Include summary (title + blockquote). Add to running token count.
       b. If token_budget - running_count > 500:
            Include best-matching section/chunk. Add to running count.
       c. If token_budget - running_count > 1500 AND result is top-3:
            Include full compiled_truth. Add to running count.
       d. If running_count >= token_budget: stop expanding, return.
    3. Return results with their expansion level noted.

MCP integration
The memory_query tool gains depth and token_budget parameters:

    {
      "question": "who knows Jensen Huang?",
      "token_budget": 4000,
      "depth": "auto",
      "wing": null
    }

Response includes expansion metadata:

    {
      "results": [
        {
          "slug": "people/ali-partovi",
          "title": "Ali Partovi",
          "depth": "full",
          "summary": "Co-founder of Neo. NVIDIA board connection via...",
          "content": "...full compiled_truth...",
          "score": 0.89,
          "tokens_used": 1200
        },
        {
          "slug": "people/ilya-sutskever",
          "title": "Ilya Sutskever",
          "depth": "section",
          "summary": "Co-founder of SSI. Previously Chief Scientist at OpenAI...",
          "content": "...best matching chunk...",
          "score": 0.84,
          "tokens_used": 450
        }
      ],
      "total_tokens": 3800,
      "budget_remaining": 200
    }
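The auto-mode expansion steps can be sketched as a loop over ranked results. This is a hedged illustration: `Candidate`, `Depth`, and `expand_auto` are hypothetical names, token counts are assumed to be precomputed per result, and each expansion level is treated as adding to the tokens already counted for that result (the spec's "add to running count").

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Depth {
    Summary,
    Section,
    Full,
}

/// Precomputed token costs for one ranked search result (assumed inputs).
struct Candidate {
    summary_tokens: usize,
    section_tokens: usize,
    full_tokens: usize,
}

/// Returns (depth, tokens_used) per served result plus the running total.
fn expand_auto(results: &[Candidate], token_budget: usize) -> (Vec<(Depth, usize)>, usize) {
    let mut used = 0usize;
    let mut expanded = Vec::new();
    for (rank, c) in results.iter().enumerate() {
        // (a) The summary is always included.
        let mut depth = Depth::Summary;
        let mut tokens = c.summary_tokens;
        used += c.summary_tokens;
        // (b) Add the best-matching section while more than 500 tokens remain.
        if token_budget.saturating_sub(used) > 500 {
            depth = Depth::Section;
            tokens += c.section_tokens;
            used += c.section_tokens;
        }
        // (c) Add the full compiled_truth for top-3 results with > 1500 tokens left.
        if rank < 3 && token_budget.saturating_sub(used) > 1500 {
            depth = Depth::Full;
            tokens += c.full_tokens;
            used += c.full_tokens;
        }
        expanded.push((depth, tokens));
        // (d) Stop once the budget is consumed.
        if used >= token_budget {
            break;
        }
    }
    (expanded, used)
}
```

Followed literally, the steps include the summary of the last served result even when it pushes the running count slightly past the budget; an implementation that needs a hard cap would check the budget before step (a).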
---
## Ingest Pipeline

Source document (meeting notes, article, transcript)
│
▼
quaid ingest raw_imports.raw_bytes
│ - BEGIN IMMEDIATE TRANSACTION
│ - SELECT from raw_imports WHERE raw_bytes = ? → if exists, refresh source path + COMMIT
│ - All writes below happen inside this transaction
│ - rotate_active_raw_import() writes the active source row before COMMIT
│ - COMMIT writes everything atomically (page mutation + active source row)
│ - Crash before COMMIT → full rollback, retry starts clean
│
├─→ Parse source (Claude Code reads ingest/SKILL.md, follows workflow)
│ - Identify: participants, companies, topics, decisions, action items
│
├─→ For each entity mentioned (four-tier consolidation):
│ ├─ quaid get raw_imports row (exact bytes + source path)
│
└─→ Commit transaction
The `quaid ingest` command receives the raw source file. The actual intelligence — how to parse a meeting transcript, which entities get pages, how to rewrite compiled_truth, when to append vs rewrite — lives in `skills/ingest/SKILL.md`. The binary handles novelty checking (Jaccard + cosine, scoped to Tiers 2-4 only — Tier 1 evidence is never suppressed) and palace metadata derivation. Everything else is skill-driven.
### Novelty check implementation
```rust
fn check_novelty(new_content: &str, existing_page: &Page, db: &Connection) -> NoveltyResult {
    // 1. Jaccard similarity between new compiled_truth and existing
    let jaccard = jaccard_similarity(new_content, &existing_page.compiled_truth);
    if jaccard > 0.85 {
        return NoveltyResult::DuplicateDerived; // skip Tiers 2-4 (derived rewrites)
    }
    // Tier 1 (timeline/links) ALWAYS proceeds

    // 2. Embedding cosine similarity between new content and existing chunks
    let query_embedding = embed(new_content);
    let max_sim = max_chunk_similarity(&query_embedding, existing_page.id, db);
    if max_sim > 0.95 {
        return NoveltyResult::NearDuplicate; // warn on Tiers 2-4, proceed with --force
    }
    // Tier 1 ALWAYS proceeds

    NoveltyResult::Novel // all tiers proceed
}
```

Critical invariant: Novelty checks only gate derived rewrites (Tiers 2-4: State, Assessment, summary). Tier 1 (raw timeline evidence, links, assertions) is always written regardless of similarity score. This prevents silent evidence loss from recurring meetings or incremental notes where compiled_truth looks similar but new timeline events contain material information.
Jaccard similarity uses token-level overlap (whitespace-split, lowercased). Fast and effective for detecting copy-paste or minimal-edit duplicates. The embedding check catches semantic duplicates where wording differs but meaning is identical.
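The token-level Jaccard measure described above can be written with the standard library alone. A minimal sketch: the function name matches the call in `check_novelty`, but the body is illustrative, and returning 1.0 when both inputs are empty is an assumption the spec does not state.

```rust
use std::collections::HashSet;

/// Jaccard similarity over lowercased, whitespace-split tokens:
/// |A ∩ B| / |A ∪ B|, in the range 0.0..=1.0.
fn jaccard_similarity(a: &str, b: &str) -> f64 {
    let lower_a = a.to_lowercase();
    let lower_b = b.to_lowercase();
    let set_a: HashSet<&str> = lower_a.split_whitespace().collect();
    let set_b: HashSet<&str> = lower_b.split_whitespace().collect();
    if set_a.is_empty() && set_b.is_empty() {
        return 1.0; // assumption: two empty texts count as identical
    }
    let intersection = set_a.intersection(&set_b).count() as f64;
    let union = set_a.union(&set_b).count() as f64;
    intersection / union
}
```

Because tokens are compared as a set, word order and repetition do not affect the score, which suits copy-paste detection but is also why the embedding check is needed for paraphrases.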
Migration Plan
Importing an existing markdown memory (7,471 files at /data/memory/):
Type mapping
    Directory prefix → page type:
      people/       → person
      companies/    → company
      deals/        → deal
      yc/           → yc
      civic/        → civic
      projects/     → project
      concepts/     → concept
      originals/    → original
      sources/      → source
      media/        → media
      meetings/     → source
      programs/     → source
      decisions/    → decision
      commitments/  → commitment
      actions/      → action_item

Parse algorithm
    fn parse_markdown_file(
        content: &str,
        file_path: &Path,
        base_dir: &Path, // import root, e.g. /data/memory/
    ) -> Result<ParsedPage, Box<dyn std::error::Error>> {
        // 1. Extract YAML frontmatter (between first --- and second ---)
        let (frontmatter_str, body) = split_frontmatter(content);
        let frontmatter: serde_yaml::Value = serde_yaml::from_str(&frontmatter_str)?;

        // 2. Split body at first horizontal rule (--- on its own line, after frontmatter)
        //    This separates compiled_truth from timeline
        let (compiled_truth, timeline) = match body.find("\n---\n") {
            Some(idx) => (body[..idx].trim(), body[idx + 5..].trim()),
            None => (body.trim(), ""),
        };

        // 3. Extract slug from file path
        //    /data/memory/people/pedro-franceschi.md → "people/pedro-franceschi"
        let slug = file_path
            .strip_prefix(base_dir)?
            .with_extension("")
            .to_string_lossy()
            .to_string();

        // The function now returns Result so the `?` operators above compile.
        Ok(ParsedPage { slug, frontmatter, compiled_truth, timeline })
    }

Link extraction
    // Wiki-style links: [Display Text](../people/name.md)
    // Convert to slugs: "people/name"
    // Capture groups: 1 = display text, 2 = optional "../" prefix, 3 = slug path
    let link_re = Regex::new(r"\[([^\]]+)\]\((\.\./)?([\w/-]+)\.md\)")?;

    // For each match: record from_slug, to_slug, surrounding sentence as context
    // Resolve relative paths against the source file's directory

Timeline parsing
    // Timeline line format: - **YYYY-MM-DD** | Source — Summary. Detail.
    let timeline_re = Regex::new(
        r"^- \*\*(\d{4}-\d{2}-\d{2})\*\*\s*\|\s*([^—]+)—\s*(.+)$"
    )?;
    // Each match → { date, source, summary }
    // Multi-line continuation (indented) → detail field

Sidecar files

    // For people/pedro-franceschi.md, check people/.raw/pedro-franceschi.json
    // Format: { "sources": { "crustdata": {...}, "happenstance": {...} } }
    // → raw_data rows: (page_id, "crustdata", JSON), (page_id, "happenstance", JSON)

Transaction safety

    conn.execute("BEGIN TRANSACTION", [])?;
    // Insert all pages
    // Insert all tags (after all pages exist for FK resolution)
    // Insert all links (resolve slugs → page IDs)
    // Insert timeline entries
    // Insert raw data
    conn.execute("COMMIT", [])?;

Embedding generation (post-import)
    $ quaid embed --all

Chunks compiled_truth at ## header boundaries (section strategy, chunk_type: 'truth_section').
Chunks timeline at individual entries: each - **YYYY-MM-DD** line becomes its own chunk (chunk_type: 'timeline_entry'). This prevents long timelines from becoming a single oversized chunk and enables hyper-specific temporal retrieval (“What did X do in March 2024?”).
- Pages without headers: chunk at ~500-token boundaries
- Target: ~200-800 tokens per truth section chunk, ~50-200 per timeline entry
- BGE-small-en-v1.5 on CPU: ~7,500 pages × ~3 chunks avg = ~22,500 embeddings
- Estimated time: 3-5 minutes on Apple Silicon, 8-12 minutes on Intel
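The two chunking strategies can be sketched with plain string handling. This is an illustration under assumptions: `chunk_truth_sections` and `chunk_timeline_entries` are hypothetical names, any text before the first `## ` header is kept as its own chunk, indented continuation lines are attached to the preceding timeline entry, and the ~500-token fallback for header-less pages is not shown.

```rust
/// Split compiled_truth into chunks at lines that start a "## " section.
fn chunk_truth_sections(compiled_truth: &str) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    let mut current = String::new();
    for line in compiled_truth.lines() {
        // A new header closes the chunk collected so far (if it has content).
        if line.starts_with("## ") && !current.trim().is_empty() {
            chunks.push(current.trim().to_string());
            current.clear();
        }
        current.push_str(line);
        current.push('\n');
    }
    if !current.trim().is_empty() {
        chunks.push(current.trim().to_string());
    }
    chunks
}

/// Each "- **YYYY-MM-DD** ..." line starts its own chunk; other non-empty
/// lines are treated as continuation detail of the previous entry.
fn chunk_timeline_entries(timeline: &str) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    for line in timeline.lines() {
        if line.starts_with("- **") {
            chunks.push(line.trim_end().to_string());
        } else if let Some(last) = chunks.last_mut() {
            if !line.trim().is_empty() {
                last.push('\n');
                last.push_str(line.trim_end());
            }
        }
    }
    chunks
}
```

Each returned string would then be embedded separately and stored with chunk_type 'truth_section' or 'timeline_entry' respectively.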
Validation
    # Count pages in DB vs files on disk: must match
    # Count links vs parsed wiki links: must match
    # Spot-check 10 random pages: export → diff against original
    # Report any discrepancies
    $ quaid collection add memory /data/memory/ --db ~/memory.db  # attach + initial sync in a scratch DB

Special files
- index.md → stored in config table as original_index
- log.md → imported as a normal page if present
- schema.md → stored in config table as original_schema
- README.md → ignored during import
Export and Round-trip
The export command reconstructs the directory structure from DB state. The round-trip contract is semantic equivalence, not byte-for-byte identity: frontmatter key ordering, trailing whitespace, and YAML formatting may differ from the original source.
To support rollback and diffing, import stores the raw source bytes of every file in
raw_imports (keyed by page_id + import_id). These are immutable snapshots — they
are never updated when pages are mutated after import. This is intentional: raw exports
represent the state at import time, not current state.
Important: --raw export requires --import-id and only emits bytes from that
specific import batch. There is no “current state” raw export — use normalized export
for current state. This prevents the dangerous split-memory where raw export silently
discards post-import edits.
    fn export_page(page: &Page, mode: ExportMode) -> Result<String, Box<dyn std::error::Error>> {
        match mode {
            // Raw mode: byte-for-byte faithful to a specific import batch (required)
            ExportMode::Raw { import_id } => Ok(get_raw_import(&page.id, &import_id)
                .expect("no raw snapshot for this page in the specified import batch")),
            // Normalized mode: reconstructed from current DB columns (default)
            ExportMode::Normalized => export_normalized(page),
        }
    }

    // Returns Result so the `?` operators on the serde calls compile.
    fn export_normalized(page: &Page) -> Result<String, Box<dyn std::error::Error>> {
        // 1. Reconstruct YAML frontmatter from frontmatter JSON
        let frontmatter = serde_yaml::to_string(
            &serde_json::from_str::<serde_yaml::Value>(&page.frontmatter)?
        )?;

        // 2. Reconstruct body
        let mut body = page.compiled_truth.clone();
        if !page.timeline.is_empty() {
            body.push_str("\n\n---\n\n");
            body.push_str(&page.timeline);
        }

        // 3. Combine
        Ok(format!("---\n{}---\n\n{}\n", frontmatter, body))
    }

    // Write to: <export-dir>/<slug>.md
    // Reconstruct .raw/ sidecars from raw_data table
    // Regenerate index.md from page list

Round-trip validation
    # Semantic validation (default): normalized fields must match
    $ quaid export --dir /tmp/memory-export/
    $ quaid validate --original /data/memory/ --exported /tmp/memory-export/
    # Checks: same pages, same frontmatter keys/values, same compiled_truth, same timeline entries

    # Byte-exact validation: requires --import-id (raw exports are immutable snapshots)
    $ quaid export --raw --import-id <ID> --dir /tmp/memory-export-raw/
    $ diff -r /data/memory/ /tmp/memory-export-raw/  # should be empty

    # --raw without --import-id is an error (prevents accidental stale-byte export)
    $ quaid export --raw --dir /tmp/memory-export-raw/
    # Error: --raw requires --import-id. Use normalized export for current state.

Semantic validation is the primary correctness test. Byte-exact round-trip is available
via --raw for rollback scenarios but is not the default contract.
Skills (Fat Markdown)
Skills live in skills/ at the repo root. Each is a standalone SKILL.md that Claude Code, OpenClaw, or any agent reads and follows. No logic is compiled into the binary.
skills/ingest/SKILL.md
---
name: quaid-ingest
description: |
  Ingest meetings, articles, docs, and conversations into the memory.
  Follows the compiled truth + timeline architecture: update existing
  pages with new info, create pages for new entities, maintain cross-references.
---
# Ingest Skill
## Workflow
1. **Read the source.** Meeting transcript, article, document, conversation log. Identify: participants, companies, topics, decisions, commitments, action items.
2. **For each entity mentioned (four-tier consolidation):** - `quaid get <slug>` — does a page exist? Note the `version` number. - **If yes:** Apply tier-by-tier update: - **Tier 1 (Raw Evidence):** Append to timeline. Always. The exact words, dates, sources. Never summarised. → `quaid timeline-add` - **Tier 2 (Extracted Facts):** Update State section with structured assertions derived from evidence. Rewrite when facts change. Extract strict factual assertions (roles, statuses, locations) for the `assertions` table. When a fact changes, set `valid_until` on the old assertion before inserting the new one. → e.g., `{subject: "Pedro Franceschi", predicate: "is_ceo_of", object: "Brex", valid_from: "2018-01-01"}` - **Tier 3 (Synthesised Concepts):** Re-evaluate Assessment section if underlying facts shifted. Cross-reference patterns across linked pages. → e.g., "Brex's leadership team has been stable since 2024, with Pedro driving the enterprise pivot" - **Tier 4 (Narrative Intelligence):** Rewrite executive summary blockquote ONLY if the overall picture changed. This is the one-sentence answer to "what do we know and why does it matter?" → e.g., `> Pedro Franceschi runs Brex. Strong enterprise traction.` - `quaid put <slug>` with updated content, `expected_version` from step 2, and `assertions` array. If ConflictError, re-read and merge. The system auto-extracts the summary from the blockquote and derives palace metadata. - **If no:** Create page using the appropriate template (see templates below). `quaid put <slug>` with new content and `assertions`.
3. **Extract work-context entities.**
   - For each **decision** made: create/update a `decisions/<slug>` page. Link to stakeholders and projects. Record assertion: `{subject: "<decision-slug>", predicate: "decided_by", object: "<person>", valid_from: "<date>"}`
   - For each **commitment** with an owner and deadline: create/update a `commitments/<slug>` page. Record assertion: `{subject: "<person>", predicate: "committed_to", object: "<deliverable>", valid_from: "<date>"}`. If a prior commitment shifted (new deadline, changed scope), set `valid_until` on the old assertion and create a new one. Update the commitment page's State section.
   - For each **action item** assigned: create/update an `actions/<slug>` page.
   - Link all work-context entities to the people, companies, and projects involved.
4. **Extract and create links (with temporal metadata).**
   - For every entity-to-entity reference: `quaid link <from> <to> --relationship "works_at" --valid-from "2024-01-15" --context "..."`.
   - Links are bidirectional in meaning but stored directionally. Create both if both pages exist.
   - When evidence shows a relationship ended, close the specific interval by its link ID: `quaid link-close <link_id> --valid-until "2026-03-01"`. The link ID is returned by `memory_link` on creation and by `memory_backlinks` on query. The binary enforces non-overlapping intervals and rejects a close that would create an overlap.
5. **Parse timeline entries.**
   - For each datable event in the source: `quaid timeline-add <slug> --date YYYY-MM-DD --summary "..." --source "meeting/123"`
6. **Log the ingest.**
   - The system rotates `raw_imports` for the active source. Verify with `quaid stats` and `quaid export`.
7. **Refresh embeddings.**
   - After all puts: `quaid embed --stale`
   - This ensures search reflects the new content immediately.
8. **Handle raw data.**
   - If the source includes structured data (API responses, JSON): `quaid call memory_raw '{"slug":"...","source":"meeting","data":{...}}'`
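The `expected_version` handshake in step 2 is a form of optimistic concurrency. The sketch below illustrates the re-read-and-merge loop against an in-memory stand-in. `PageStore`, `put_with_retry`, and the retry limit are hypothetical names and values chosen for illustration; they are not Quaid's API.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when expected_version no longer matches the stored version."""


@dataclass
class PageStore:
    # Hypothetical in-memory stand-in for the memory database.
    pages: dict = field(default_factory=dict)  # slug -> (version, content)

    def get(self, slug):
        # A missing page is reported as version 0 with empty content.
        return self.pages.get(slug, (0, ""))

    def put(self, slug, content, expected_version):
        current_version, _ = self.get(slug)
        if current_version != expected_version:
            raise ConflictError(
                f"{slug}: expected v{expected_version}, found v{current_version}"
            )
        self.pages[slug] = (current_version + 1, content)
        return current_version + 1


def put_with_retry(store, slug, merge, max_attempts=3):
    """Re-read and merge on conflict instead of overwriting blindly."""
    for _ in range(max_attempts):
        version, content = store.get(slug)
        try:
            return store.put(slug, merge(content), expected_version=version)
        except ConflictError:
            continue  # another writer won the race; re-read and merge again
    raise ConflictError(f"{slug}: gave up after {max_attempts} attempts")
```

The point of the pattern is that a stale writer fails loudly rather than silently discarding someone else's update; how the merge itself is done stays with the skill, not the store.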
## Entry criteria
Not everything gets a page. The bar:

- Anyone you met 1:1 or in a small group: YES
- YC staff, partners, active batch founders: YES
- Companies discussed in deal context: YES
- **Decisions** that affect multiple people or projects: YES
- **Commitments** with a specific owner and deadline: YES
- **Action items** assigned to a person with a due date: YES
- Casual mentions with no substance: NO
- Vague intentions without owner or deadline: NO (wait until they crystallize)
- Create the page only if its existence serves future retrieval.
## Quality rules
- Executive summary (blockquote at top) must reflect latest state
- State section gets REWRITTEN, not appended to
- Timeline is append-only, reverse-chronological (newest first)
- Open Threads: add new items, remove resolved ones (move to timeline)
- Every wiki link uses relative path format: `[Name](../people/name.md)`
## Source attribution format
Every fact in compiled_truth needs inline source citation with full provenance. A tweet reference without a URL is a broken citation — this is the #1 failure mode at scale.
Standard format:

- `[Source: User, direct message, 2026-04-07]`
- `[Source: Meeting "Team Sync" #12345, 2026-04-03]`
- `[Source: X/@handle, topic, 2026-04-05]`
- `[Source: email from Name re Subject, 2026-04-05]`
- `[Source: Crustdata LinkedIn enrichment, 2026-04-07]`
- `[Source: Wall Street Journal, 2026-04-05]`
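A citation lint can catch the broken-citation failure mode before a page is written. This is a minimal sketch, assuming citations are bracketed and end in an ISO date, and assuming social posts can be recognised by an `X/@handle` origin. The `check_citation` helper and its `url` parameter are hypothetical.

```python
import re
from datetime import date

# Bracketed citation whose last comma-separated field is an ISO date.
CITATION = re.compile(r"^\[Source: (?P<body>.+), (?P<date>\d{4}-\d{2}-\d{2})\]$")


def check_citation(text, url=None):
    """Return a list of problems; an empty list means the citation passes."""
    match = CITATION.match(text.strip())
    if not match:
        return ["not in '[Source: <origin>, <context>, YYYY-MM-DD]' form"]
    problems = []
    try:
        date.fromisoformat(match.group("date"))
    except ValueError:
        problems.append("date is not a valid calendar date")
    # Assumption: a social media post is identified by an 'X/@handle' origin.
    if match.group("body").startswith("X/@") and not url:
        problems.append("social media citation without a URL")
    return problems
```

For example, `check_citation("[Source: X/@handle, topic, 2026-04-05]")` reports the missing URL, while the same citation passed with a `url` argument comes back clean.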
### Source authority hierarchy
When sources conflict, note the contradiction in compiled_truth with BOTH citations. Never silently pick one.
1. **User's direct statements** (highest authority)
2. **Primary sources** (meetings, emails, direct conversations)
3. **Enrichment APIs** (Crustdata, Happenstance, Captain)
4. **Web search results**
5. **Social media posts** (lowest — require URL, context, date)
## Filing disambiguation
When filing entities, apply these rules to prevent duplicate pages in wrong categories:
| Question | Answer |
|----------|--------|
| Could you teach it as a framework? | → `concepts/` |
| Is it the user's own idea, synthesis, or observation? | → `originals/` |
| Could you build it, but nobody is working on it yet? | → Use `ideas/` directory (stored as `project` type with `status: idea` frontmatter) |
| Is someone actively building it? | → `projects/` |
| About them as a human? | → `people/` |
| About the organisation? | → `companies/` (both pages link to each other) |
| Nothing fits? | → Flag for human review. The schema may need to evolve. |
**Key rule for originals:** Capture the user's EXACT phrasing. The language IS the insight. Use their words for the slug. `meatsuit-maintenance-tax.md` not `biological-needs-maintenance-overhead.md`. The vividness IS the concept.
## Page templates
### Person
```markdown
---
title: First Last
type: person
tags: []
linkedin: ""
twitter: ""
score: 0
---
# First Last

> [Executive summary in one sentence.]

## State

**As of YYYY-MM-DD:** [What we know now. Current role, context, relationship status.]

## Assessment

[Intelligence assessment. What makes this person interesting or relevant.]

## Open Threads

- [ ] [Action item or open question]

---

## Timeline

- **YYYY-MM-DD** | meeting — [What happened]
```

#### Optional enrichment sections (Tier 1 contacts — 5+ interactions)

For high-engagement contacts, expand the person template with these sections between Assessment and Open Threads when evidence supports them. Don’t add speculatively — each section requires direct observation or sourced claims.
```markdown
## What They Believe
- [Belief] — observed: [source, date]
- [Belief] — self-described: [interview/bio, date]
- [Belief] — inferred: [pattern across N interactions, confidence: high/medium/low]

## What They're Building
Current projects, recent ships, product direction.

## Hobby Horses
Topics they return to obsessively across conversations.

## Trajectory
Ascending, plateauing, pivoting, declining? Evidence-based, not speculative.

## Network
- **Close to:** People frequently seen with
- **Crew:** Which cluster or community
```
### Company

```markdown
---
title: Company Name
type: company
tags: []
domain: ""
linkedin: ""
stage: ""
---
# Company Name

> [What they do in one sentence.]

## State

**As of YYYY-MM-DD:** [Current status, funding, relevant context.]

## Assessment

[Thesis, opportunity, concerns.]

---

## Timeline

- **YYYY-MM-DD** | source — [Event]
```
### Decision

```markdown
---
title: "Decision: [Short description]"
type: decision
tags: []
date: YYYY-MM-DD
status: active        # active | superseded | reversed
stakeholders: []      # slugs of people involved
---
# Decision: [Short description]

> [One-sentence summary of what was decided and why.]

## Context

**Decided YYYY-MM-DD** by [who]. Discussed in [meeting/thread/email].

[What problem this solves. What alternatives were considered.]

## Implications

- [What changes as a result]
- [What depends on this holding]

## Open Threads

- [ ] [Follow-up or risk to monitor]

---

## Timeline

- **YYYY-MM-DD** | meeting — Original decision made
```
### Commitment

```markdown
---
title: "Commitment: [Who] → [What] by [When]"
type: commitment
tags: []
owner: ""             # slug of person who committed
due: YYYY-MM-DD
status: open          # open | completed | shifted | dropped
---
# Commitment: [Who] → [What] by [When]

> [Owner] committed to [deliverable] by [date]. Status: [open/completed/shifted/dropped].

## State

**As of YYYY-MM-DD:** [Current status. On track? Shifted? Why?]

## Context

Made during [meeting/conversation]. Linked to [decision/project].

---

## Timeline

- **YYYY-MM-DD** | meeting — Commitment made
```
### Action Item

```markdown
---
title: "Action: [Short description]"
type: action_item
tags: []
owner: ""             # slug of person responsible
due: YYYY-MM-DD
status: open          # open | done | blocked | dropped
priority: normal      # low | normal | high | urgent
---
# Action: [Short description]

> [Owner] to [do what] by [when]. Priority: [level].

## Context

From [meeting/conversation/decision]. Blocked by: [nothing / dependency].

---

## Timeline

- **YYYY-MM-DD** | source — Created
```
### Original (user’s own thinking)

```markdown
---
title: "[Exact user phrasing]"
type: original
tags: []
origin: ""            # meeting, conversation, shower thought, riff on [concept]
confidence: high      # high | medium | low | speculative
---
# [Exact user phrasing]

> [One-sentence summary of the original idea in the user's voice.]

## The Idea

[Full articulation. Use the user's exact words wherever possible.
The language IS the insight — preserve phrasing, metaphors, and framing.]

## Why It Matters

[What this connects to. What it explains. What it predicts.]

## Connections

- Shaped by: [people who influenced the thinking]
- Played out at: [companies/projects where it applied]
- Discussed in: [meetings/conversations]
- Builds on: [other originals, concepts]

---

## Timeline

- **YYYY-MM-DD** | source — First articulated
```

---
### skills/query/SKILL.md
```markdown
---
name: quaid-query
description: |
  Answer questions from the memory using FTS5 + semantic search + structured queries.
  Synthesize across multiple pages. Cite sources.
---
```
# Query Skill
## Strategy: Four-layer search
1. **Palace-filtered hybrid search** — `quaid query "<question>" --wing "<entity>"` — the primary search path. Set-union merging with palace pre-filtering. Best for: most questions. The wing filter narrows the search space before vector + FTS5 run, dramatically improving precision.
2. **FTS5 keyword search** — `quaid search "<query>"` — fast, exact matches. Best for: names, company names, specific terms, known slugs.
3. **Semantic vector search** — `quaid query "<question>"` (no wing filter) — meaning-based. Best for: cross-cutting queries where the entity isn't known upfront.
4. **Structured queries** — `quaid list --type person --tag yc-alum` + `quaid backlinks <slug> --temporal current` — relational navigation. Best for: "all YC founders in batch W25", "who currently works at X?"
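The set-union merging mentioned in layer 1 can be pictured as follows. This is only a sketch: it assumes each backend returns `(slug, score)` pairs with scores already normalised to the same range, and the ranking rule (pages found by both backends first, then by best score) is an assumption for illustration, not something this section specifies. `set_union_merge` is a hypothetical name.

```python
def set_union_merge(fts_hits, vec_hits, limit=10):
    """Union keyword and vector hits, deduplicated by slug."""
    merged = {}
    for source, hits in (("fts", fts_hits), ("vec", vec_hits)):
        for slug, score in hits:
            entry = merged.setdefault(slug, {"score": 0.0, "sources": set()})
            entry["score"] = max(entry["score"], score)  # keep the best score seen
            entry["sources"].add(source)
    # Assumed ranking: agreement between backends first, then score, then slug.
    ranked = sorted(
        merged.items(),
        key=lambda kv: (-len(kv[1]["sources"]), -kv[1]["score"], kv[0]),
    )
    return [(slug, e["score"], sorted(e["sources"])) for slug, e in ranked[:limit]]
```

Unlike an intersection, a union never drops a page that only one backend found, which is why exact-name lookups and meaning-based lookups can share one result list.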
## Workflow
1. Decompose the question into search strategies.
2. Identify target wing(s) from entity names in the question.
3. Run hybrid query with palace filter and progressive retrieval: `quaid query "<question>" --wing "<entity>" --depth auto --token-budget 4000`
4. Review summaries first (Tier 4 narrative). Expand to sections/full only if needed.
5. For temporal questions, check link validity: `quaid backlinks <slug> --temporal current` vs `--temporal historical`
6. Before surfacing, verify with contradiction check: `quaid check <slug>` — flag any unresolved contradictions in the answer.
7. Synthesize answer with citations: `[Pedro Franceschi](people/pedro-franceschi)`
8. If the answer is valuable enough to persist, consider creating a new source page.
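Steps 3 and 4 describe summaries-first retrieval under a token budget. One way to picture it is the greedy sketch below. It assumes each result already carries a token cost per depth, and the depth names, the greedy order, and `progressive_retrieve` itself are assumptions for illustration rather than the binary's actual behaviour.

```python
DEPTHS = ("summary", "section", "full")


def progressive_retrieve(results, token_budget):
    """results: best-first list of {'slug': str, 'tokens': {depth: cost}}."""
    chosen = {}
    spent = 0
    # Pass 1: give every result a summary while the budget allows.
    for result in results:
        cost = result["tokens"]["summary"]
        if spent + cost > token_budget:
            break
        chosen[result["slug"]] = "summary"
        spent += cost
    # Pass 2: upgrade results in rank order when the extra cost still fits.
    for depth in DEPTHS[1:]:
        for result in results:
            slug = result["slug"]
            if slug not in chosen:
                continue
            extra = result["tokens"][depth] - result["tokens"][chosen[slug]]
            if spent + extra <= token_budget:
                chosen[slug] = depth
                spent += extra
    return chosen, spent
```

Breadth comes before depth here: every top result gets at least a summary before any single page is expanded to a section or the full text.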
## When you don't know
Say so. "The memory doesn't have info on X" is better than hallucinating.
Suggest enrichment: "Want me to research X and add them?"

### skills/maintain/SKILL.md

```markdown
---
name: quaid-maintain
description: |
  Periodic memory maintenance. Find contradictions, stale info, orphan pages,
  missing cross-references. Keep the knowledge graph healthy.
---
```
# Maintain Skill
## Lint checks (run every few days)
1. **Contradiction detection** — Run `quaid check --all` to detect:
   - **Link vs Assertion:** Current assertions (`valid_until IS NULL`) like `(Pedro, is_ceo, Brex)` where the link to `companies/brex` has a `valid_until` in the past. Pure SQL join.
   - **Temporal contradictions:** Page says "left X in 2025" but link to X has no `valid_until`. Compiled_truth dates vs link validity windows.
   - **Cross-page contradictions:** Multiple current assertions (`valid_until IS NULL`) with the same subject+predicate but different objects across pages. Superseded assertions (with `valid_until` set) are excluded — they're history, not contradictions.
   - **Staleness:** Pages where `timeline_updated_at` > `truth_updated_at` by 30+ days. These need Tier 2-4 rewrites.

   All findings stored in the `contradictions` table. Surface unresolved items in briefings.
2. **Stale info** — Pages where `timeline_updated_at` > `truth_updated_at` by 30+ days. Compiled truth is stale relative to new evidence. These need Tier 2 consolidation.
3. **Orphan pages** — `quaid backlinks <slug>` = 0 inbound links. Either add links from related pages or flag for deletion.
4. **Missing cross-references** — Scan compiled_truth for mentions of known page titles that aren't formally linked. Add via `quaid link` with relationship type.
5. **Dead links** — For each link, verify both pages still exist.
6. **Temporal link audit** — Links with `valid_until IS NULL` where evidence suggests the relationship ended. Flag for invalidation.
7. **Open thread audit** — Items older than 30 days. Resolved items still listed as open.
8. **Tag consistency** — Normalize: lowercase, hyphens. Merge near-duplicates.
9. **Palace metadata audit** — Pages with empty `wing` or `room` fields. Derive from slug structure and section headers.
10. **Embedding freshness** — Pages updated since last embedding: `quaid embed --stale`
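The cross-page contradiction check in item 1 reduces to a self-join over current assertions. The sketch below runs it against an in-memory SQLite database. The `assertions` column layout is an assumption inferred from the fields named in this spec, and the rows are made-up placeholders.

```python
import sqlite3


def find_cross_page_contradictions(conn):
    """Pairs of current assertions that share subject+predicate but differ in object."""
    return conn.execute(
        """
        SELECT a.subject, a.predicate, a.object, b.object
        FROM assertions AS a
        JOIN assertions AS b
          ON a.subject = b.subject
         AND a.predicate = b.predicate
         AND a.object < b.object        -- report each conflicting pair once
        WHERE a.valid_until IS NULL     -- superseded assertions are history,
          AND b.valid_until IS NULL     -- not contradictions
        ORDER BY a.subject, a.predicate
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the columns this check needs.
conn.execute(
    "CREATE TABLE assertions (page_slug TEXT, subject TEXT, predicate TEXT,"
    " object TEXT, valid_from TEXT, valid_until TEXT)"
)
conn.executemany(
    "INSERT INTO assertions VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("people/alice", "Alice", "is_ceo_of", "ExampleCo", "2024-01-01", None),
        ("companies/otherco", "Alice", "is_ceo_of", "OtherCo", "2025-06-01", None),
        ("people/alice", "Alice", "is_ceo_of", "OldCo", "2020-01-01", "2023-12-31"),
    ],
)
conflicts = find_cross_page_contradictions(conn)
```

Here `conflicts` holds the single ExampleCo/OtherCo pair; the OldCo row is ignored because its `valid_until` is set.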
## Output
Write maintenance report:
`quaid put sources/maintenance-YYYY-MM-DD` with findings and actions taken.
Include: contradictions found/resolved, stale pages rewritten, orphans linked/flagged, temporal links invalidated.

### skills/enrich/SKILL.md

```markdown
---
name: quaid-enrich
description: |
  Enrich person and company pages from external sources.
  Crustdata, Happenstance, Exa, Captain (Pitchbook). Validation rules enforced.
---
```
# Enrich Skill
## Sources
| Source | Best for | Auth |
|--------|----------|------|
| Crustdata | LinkedIn profile data (90+ fields) | API key |
| Happenstance | Career history, network search | Credits |
| Exa | Web search, articles, mentions | API key |
| Captain/Pitchbook | Company financials, deals, investors | API key |
## Person enrichment workflow
1. Find LinkedIn URL (frontmatter, contacts, or Happenstance search)
2. Hit Crustdata: `GET /screener/person/enrich?linkedin_profile_url=...`
   - Auth: `Token` (NOT Bearer!)
3. Validate before writing:
   - Connection count < 20 → likely wrong person. Save raw_data with flag, don't update page.
   - Name mismatch (different last name) → skip.
4. Store raw: `quaid call memory_raw '{"slug":"people/name","source":"crustdata","data":{...}}'`
5. Distill to page: update compiled_truth with location, title, company, career arc, top skills. DO NOT dump full 90-field data into the page.
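The validation gate in step 3 can be expressed as a small pure function. This is a sketch under stated assumptions: the payload keys `connections` and `name` are placeholders rather than Crustdata's actual field names, and `validate_enrichment` is a hypothetical helper.

```python
def last_name(full_name):
    """Last whitespace-separated token, case-folded; empty string if missing."""
    parts = full_name.split()
    return parts[-1].casefold() if parts else ""


def validate_enrichment(page_name, profile):
    """Return 'flag', 'skip', or 'write' for one enrichment payload."""
    # Assumed payload keys; map them to the real response fields before use.
    if profile.get("connections", 0) < 20:
        return "flag"   # likely wrong person: store raw data with a flag, leave page alone
    if last_name(profile.get("name", "")) != last_name(page_name):
        return "skip"   # different last name: do not write anything
    return "write"
```

Keeping the decision separate from the write makes the `--dry-run` mode in the batch rules straightforward: run the same function and print the verdicts.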
## Batch rules
- Checkpoint every 20 items
- Exponential backoff on 429s: 10s → 20s → 40s → ... → 5min cap
- Dry-run: `--dry-run` shows what would be enriched
- Never re-enrich already-enriched pages (check raw_data table first)
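The backoff rule above is a doubling schedule starting at 10 seconds and capped at 300 seconds. A minimal sketch follows; `backoff_delays` and `call_with_backoff` are hypothetical helpers, and `request` stands for any callable returning a `(status, body)` pair.

```python
import time


def backoff_delays(retries, base=10, cap=300):
    """Delays in seconds: base, 2*base, 4*base, ... capped at cap."""
    return [min(base * 2 ** attempt, cap) for attempt in range(retries)]


def call_with_backoff(request, max_retries=8, sleep=time.sleep):
    """Retry request() while it answers HTTP 429, sleeping between attempts."""
    for delay in backoff_delays(max_retries):
        status, body = request()
        if status != 429:
            return status, body
        sleep(delay)
    return request()  # final attempt after the last sleep
```

`backoff_delays(7)` yields 10, 20, 40, 80, 160, 300, 300: the sixth delay would be 320 seconds and is clamped to the five-minute cap. The `sleep` parameter is injectable so the schedule can be tested without waiting.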
### skills/briefing/SKILL.md

```markdown
---
name: quaid-briefing
description: |
  Compile a daily briefing from memory state plus real-time sources.
  What changed, what's coming, who's waiting, what needs attention.
---
```
# Briefing Skill
## Briefing structure
1. **Calendar** — Today's meetings. For each: pull memory pages for participants using progressive retrieval (`--depth summary`).
2. **Active deals** — `quaid list --type deal --tag active`
3. **Commitments due** — `quaid list --type commitment --tag open` filtered to due within 7 days. Flag overdue.
4. **Action items** — `quaid list --type action_item --tag open` sorted by priority + due date.
5. **What shifted overnight** — Query assertions where `valid_until` was set in the last 24h (superseded facts, shifted commitments, reversed decisions). This is the "overnight shift report."
6. **Open threads** — Scan pages for time-sensitive Open Threads items.
7. **Unresolved contradictions** — `quaid check --all` → surface any unresolved items from the `contradictions` table.
8. **Recent memory changes** — `quaid list --sort updated` filtered to last 24h.
9. **People in play** — Person pages updated in last 7 days with score ≥ 3.
10. **Stale alerts** — Pages flagged by maintain skill (including temporal link issues).
## Output
Write briefing to `sources/briefing-YYYY-MM-DD`.
Return formatted markdown suitable for Telegram delivery.
Alert-worthy items are handled by the alerts skill, not the briefing.

### skills/alerts/SKILL.md

```markdown
---
name: quaid-alerts
description: |
  Interrupt-driven notification thresholds. Defines what warrants an immediate push
  notification vs. waiting for the next scheduled briefing.
  Pairs with the briefing skill — briefing is pull, alerts are push.
---
```
# Alerts Skill
## Alert tiers
### Immediate alert (push via Telegram within minutes)

- High-priority entity first appears (e.g. RIFT first post, major competitor launch)
- Extreme price moves (BTC +/-15% in an hour, portfolio company token event)
- Legislative/regulatory events (CLARITY Act floor vote, SEC filing deadline)
- Commitment overdue by 24h+ with no update
- Contradiction detected on a page updated in last 24h (active deal or person)

### Next briefing (include in next scheduled digest)

- New followers or engagement on our posts
- Non-urgent replies to monitored threads
- Routine enrichment completions
- Knowledge gaps detected by `memory_gap`
- Stale pages flagged by maintain skill

### Silent log (no notification, recorded in timeline)

- Routine social media engagement
- Minor price moves within normal range
- Link saves and bookmarks
- Background enrichment data updates
## How it works
1. Events arrive via agent monitoring (cron, webhooks, MCP tools).
2. Agent classifies event against the tier definitions above.
3. **Immediate:** Format for Telegram delivery (short, no markdown tables, action-oriented). Push immediately.
4. **Next briefing:** Write to `sources/alerts-queue-YYYY-MM-DD` for the briefing skill to pick up.
5. **Silent:** Write to relevant page timeline via `quaid timeline-add`. No notification.
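The routing in steps 3 to 5 amounts to a lookup from event kind to tier, then from tier to destination. The sketch below is illustrative only: the event kind labels are invented, and defaulting unknown kinds to the silent tier is an assumption, since the skill leaves borderline classification to the agent.

```python
from datetime import date

# Hypothetical event kinds, each mapped to one of the three tiers above.
TIER_BY_KIND = {
    "commitment_overdue_24h": "immediate",
    "contradiction_on_recent_page": "immediate",
    "knowledge_gap": "next_briefing",
    "stale_page": "next_briefing",
    "routine_engagement": "silent",
}


def route(kind, today, slug=None):
    """Return (tier, destination) for one classified event."""
    tier = TIER_BY_KIND.get(kind, "silent")  # assumption: unknown kinds stay silent
    if tier == "immediate":
        return tier, "telegram"
    if tier == "next_briefing":
        return tier, f"sources/alerts-queue-{today.isoformat()}"
    return tier, f"timeline:{slug}"
```

For instance, `route("knowledge_gap", date(2026, 4, 13))` returns the queue page `sources/alerts-queue-2026-04-13` that the briefing skill later picks up.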
## Customisation
Alert thresholds are in this skill file, not in the binary. Adjust tiers by editing this file.
The agent should surface its classification reasoning if borderline ("I classified this as next-briefing because...").

### skills/research/SKILL.md

```markdown
---
name: quaid-research
description: |
  Resolve knowledge gaps logged by memory_gap. Run on schedule or on demand.
  Respects sensitivity classification: internal gaps resolved from existing memory only,
  redacted gaps anonymised before external queries, external gaps may use web search
  and enrichment APIs. Default sensitivity is internal.
---
```
# Research Skill
## Workflow
1. **Read gap log.** `quaid gaps` — list unresolved knowledge gaps.
2. **Prioritise.** Rank by:
   - Age (older gaps first — they've been unresolved longest)
   - Context (gaps from high-priority queries rank higher)
   - Frequency (same query_text appearing multiple times = high demand)
3. **Check sensitivity classification and approval before researching.**
   | Sensitivity | Approval required? | Allowed research methods |
   |-------------|--------------------|--------------------------|
   | `internal` (default) | No | Search existing memory pages only (`memory_query`, `memory_search`). No network calls. If the memory can't answer it, leave the gap unresolved and note "requires external research — escalate sensitivity via `memory_gap_approve` to proceed." |
   | `redacted` | Yes (`memory_gap_approve` with `approved_by`, `redacted_query`) | External search permitted using ONLY the `redacted_query` stored in the approval record (entity names, deal terms, dollar amounts stripped). **Never send the original `query_text` externally.** If no `redacted_query` exists in the approval record, refuse and ask for one. |
   | `external` | Yes (`memory_gap_approve` with `approved_by`) | External search permitted with the original query text. Use only for non-sensitive topics (public companies, open-source projects, general concepts). |
**Hard rule:** The research skill MUST verify that `approved_by IS NOT NULL AND approved_at IS NOT NULL` before any external call. If the approval record is missing (which shouldn't happen due to the CHECK constraint, but defense in depth), treat as `internal`.
4. **Research each gap (per sensitivity rules above).**
   - `internal`: re-query memory with alternate phrasing, check backlinks, scan related pages
   - `redacted`/`external`: Web search (Exa, Brave) for the (redacted or original) query text
   - If entity-related: check enrichment APIs (Crustdata, Happenstance) — `external` only
   - If topic-related: search for recent articles, papers, threads
   - Compile findings into a draft page or update to existing page
5. **Ingest findings.** Follow `skills/ingest/SKILL.md` — standard four-tier consolidation.
   - New entity discovered → create page via `quaid put`
   - Existing entity enriched → update via `memory_ingest` with `expected_version`
   - Topic research → create `concepts/` or `sources/` page
6. **Resolve gap.** After successful ingest:
   - The system marks the gap resolved with the slug of the page that filled it
   - If research yields nothing useful, add a note to the gap context and leave unresolved
   - Re-check after 7 days (topics evolve, new sources appear)
7. **Report.** Write research summary to `sources/research-YYYY-MM-DD` for audit trail.
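The sensitivity table and the hard rule combine into a single gate that decides which text, if any, may leave the machine. A minimal sketch, assuming a gap record is available as a dict with the field names used in this skill; `external_query_for` is a hypothetical helper, not a Quaid function.

```python
def external_query_for(gap):
    """Return the text allowed in an external search, or None for memory-only research."""
    approved = gap.get("approved_by") is not None and gap.get("approved_at") is not None
    sensitivity = gap.get("sensitivity", "internal")  # default sensitivity is internal
    if not approved or sensitivity == "internal":
        return None  # defense in depth: a missing approval is treated as internal
    if sensitivity == "redacted":
        redacted = gap.get("redacted_query")
        if not redacted:
            raise PermissionError("redacted gap has no redacted_query; ask for one")
        return redacted  # never the original query_text
    if sensitivity == "external":
        return gap["query_text"]
    return None  # unknown sensitivity values are treated as internal
```

Returning the permitted text, rather than a yes/no flag, makes it harder for a caller to send the original `query_text` for a redacted gap by accident.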
## When to run
- **Scheduled:** Daily, after the morning briefing compile. Process top 5 unresolved gaps.
- **On demand:** Agent notices a gap during conversation and wants to fill it immediately.
- **Batch:** Weekly deep run — process all gaps older than 3 days.
## Quality rules
- Never fabricate. If research yields no results, say so.
- Cite every source using the standard attribution format (see ingest skill).
- Prefer primary sources over social media summaries.
- If a gap is genuinely unanswerable (e.g. "what is X planning internally?"), mark it as `context: "unanswerable — requires insider access"` and leave unresolved.

### skills/upgrade/SKILL.md

```markdown
---
name: quaid-upgrade
description: |
  Agent-guided upgrade path for the quaid binary and skills.
  Inspired by Garry Tan's v0.8.0 "just ask your agent to upgrade" pattern.
  The skill file IS the upgrade guide — the binary handles mechanics.
---
```
# Upgrade Skill
## Pre-upgrade checklist
1. **Check current version:** `quaid version`
2. **Record the resolved DB path:** `echo "${QUAID_DB:-./memory.db}"` — needed for rollback.
3. **Stop MCP server if running:** `pgrep -f "quaid serve" && echo "STOP: kill quaid serve before upgrading"`
4. **Backup:** `quaid compact` then manual backup if desired (the binary creates its own WAL-safe backup during migration)
5. **Validate current state:** `quaid validate --all` — fix any issues before upgrading
6. **Check for new version:** query GitHub releases API for latest version tag
7. **Record current binary path:** `which quaid` — needed for rollback
## Upgrade steps
1. **Download new binary to a staging path (NOT directly to the install location):**
   ```bash
   TARGET_VERSION="v0.2.0"  # always pin an explicit version, never use 'latest' unverified
   PLATFORM="$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m)"
   CHANNEL="airgapped"      # or "online"; release assets are named quaid-<platform>-<channel>
   ARTIFACT="quaid-${PLATFORM}-${CHANNEL}"
   STAGING_DIR="/tmp/quaid-${TARGET_VERSION}"
   mkdir -p "${STAGING_DIR}"

   # Download binary and checksum file using the release artifact filename
   curl -fsSL "https://github.com/quaid-app/quaid/releases/download/${TARGET_VERSION}/${ARTIFACT}" \
     -o "${STAGING_DIR}/${ARTIFACT}"
   curl -fsSL "https://github.com/quaid-app/quaid/releases/download/${TARGET_VERSION}/${ARTIFACT}.sha256" \
     -o "${STAGING_DIR}/${ARTIFACT}.sha256"
   ```
2. **Verify integrity before installing:**
   ```bash
   # .sha256 files use standard 'hash filename' format — use --check directly
   (cd "${STAGING_DIR}" && shasum -a 256 --check "${ARTIFACT}.sha256")
   # If check fails: STOP. Do not install. Report the mismatch.
   ```
3. **Preserve the current binary for rollback, then install:**
   ```bash
   INSTALL_PATH="$(which quaid)"
   cp "${INSTALL_PATH}" "${INSTALL_PATH}.rollback"
   cp "${STAGING_DIR}/${ARTIFACT}" "${INSTALL_PATH}" && chmod +x "${INSTALL_PATH}"
   ```
4. **Run migrations:** `quaid version` — the binary auto-migrates on startup if needed. It creates a WAL-safe backup via `VACUUM INTO` to `memory.db.backup-v{N}` before any schema migration. Migration does not proceed until backup succeeds.
5. **Validate post-migration:**
   - `quaid validate --all` — all integrity checks pass
   - `quaid stats` — page counts match pre-upgrade
   - `quaid embed --stale` — re-embed any pages affected by model changes
6. **Update skills:** Pull latest `skills/` from the repo. External skill files in the working directory override embedded defaults. `quaid skills doctor` — verify resolution order and content hashes.
7. **Verify round-trip:** re-attach the exported markdown in a scratch DB if upgrading from a version with schema changes. Confirms no data loss.
8. **Clean up:** Remove the staging directory once everything passed. Keep the rollback binary for 7 days, then remove it as well.
   ```bash
   rm -rf "${STAGING_DIR}"
   # Keep rollback binary for 7 days, then clean: rm "${INSTALL_PATH}.rollback"
   ```
9. **Report:** Tell the user what changed — version number, any schema migrations run, any skills updated, any action required.
## Rollback

If anything goes wrong:

1. **Stop all clients.** Ensure no `quaid serve` process is running. Check: `pgrep -f "quaid serve"` — kill any running MCP servers before restoring.
2. **Restore prior binary:** `cp "$(which quaid).rollback" "$(which quaid)"`
3. **Restore pre-migration DB backup (WAL-safe):** The backup file name is derived from the resolved DB path, so respect `--db` / `QUAID_DB` if set.
   ```bash
   # Resolve the actual DB path (same logic as the binary uses)
   DB_PATH="${QUAID_DB:-./memory.db}"
   # Delete WAL sidecars — they contain post-migration state that would
   # replay into the restored backup and corrupt the rollback.
   rm -f "${DB_PATH}-wal" "${DB_PATH}-shm"
   # Restore the backup over the actual DB path
   # ({N} is a placeholder: substitute the suffix of the backup file created during migration)
   cp "${DB_PATH}.backup-v{N}" "${DB_PATH}"
   ```
   **Critical:** You MUST delete `-wal` and `-shm` before restoring. Without this, SQLite will replay the WAL on next open, re-applying the migration you’re trying to undo.
4. **Verify:** `quaid version` — should show the pre-upgrade version. `quaid validate --all` — confirm DB integrity.
5. **Report** what failed for debugging.
## CI release requirements

Every GitHub release MUST publish:

- **Platform binaries:** `quaid-<platform>-<channel>` where `<platform>` ∈ `{darwin-arm64, darwin-x86_64, linux-x86_64, linux-aarch64}` and `<channel>` ∈ `{airgapped, online}`.
- **SHA-256 checksums:** `quaid-<platform>-<channel>.sha256`.
- `.github/release-assets.txt` is the canonical manifest consumed by release validation and release-check tests.
- Each `.sha256` file contains standard `shasum -a 256` output: `<hex-digest>  <filename>`, one line.
- The release workflow generates checksums in CI, not locally, to prevent tampering.
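For reference, the one-line checksum format above can be verified without `shasum`. The sketch below is one possible check, not part of the release tooling: `verify_sha256` is a hypothetical helper, and the artifact written in the demo is a placeholder file, not a real binary.

```python
import hashlib
import tempfile
from pathlib import Path


def verify_sha256(checksum_file):
    """Check a one-line '<hex-digest>  <filename>' file against the file beside it."""
    checksum_path = Path(checksum_file)
    expected, filename = checksum_path.read_text().split(maxsplit=1)
    # shasum prefixes the filename with '*' when it hashed in binary mode.
    target = checksum_path.parent / filename.strip().lstrip("*")
    actual = hashlib.sha256(target.read_bytes()).hexdigest()
    return actual == expected.lower()


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "quaid-linux-x86_64-airgapped"
    artifact.write_bytes(b"placeholder binary contents")
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    checksum = Path(tmp) / (artifact.name + ".sha256")
    checksum.write_text(f"{digest}  {artifact.name}\n")
    intact = verify_sha256(checksum)      # True: digest matches the file
    artifact.write_bytes(b"tampered")
    tampered = verify_sha256(checksum)    # False: contents changed after hashing
```

Because the filename inside the checksum file is resolved relative to that file's own directory, the check behaves like running `shasum --check` from inside the staging directory.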
---
## Repository Structure

```
quaid/
├── README.md                     # Project overview + quick start
├── CLAUDE.md                     # Claude Code session instructions
├── AGENTS.md                     # Generic agent session instructions
├── LICENSE                       # MIT
├── Cargo.toml
├── Cargo.lock
│
├── bin/                          # Compiled binaries (gitignored, built in CI)
│   ├── quaid-darwin-arm64-airgapped
│   ├── quaid-darwin-arm64-online
│   ├── quaid-darwin-x86_64-airgapped
│   ├── quaid-darwin-x86_64-online
│   ├── quaid-linux-aarch64-airgapped
│   ├── quaid-linux-aarch64-online
│   ├── quaid-linux-x86_64-airgapped
│   └── quaid-linux-x86_64-online
│
├── src/
│   ├── main.rs                   # Entry point: arg parsing + command dispatch (clap)
│   ├── commands/
│   │   ├── mod.rs
│   │   ├── get.rs
│   │   ├── put.rs
│   │   ├── search.rs
│   │   ├── query.rs
│   │   ├── ingest.rs
│   │   ├── link.rs
│   │   ├── tags.rs
│   │   ├── timeline.rs
│   │   ├── list.rs
│   │   ├── stats.rs
│   │   ├── export.rs
│   │   ├── import.rs
│   │   ├── embed.rs
│   │   ├── graph.rs              # N-hop neighborhood traversal
│   │   ├── gaps.rs               # Knowledge gap list/management
│   │   ├── serve.rs
│   │   ├── call.rs
│   │   ├── init.rs
│   │   ├── config.rs
│   │   └── version.rs
│   ├── core/
│   │   ├── mod.rs
│   │   ├── db.rs                 # Database: open(), schema init, WAL, sqlite-vec load
│   │   ├── fts.rs                # FTS5: search_fts(query, wing_filter) → ranked results
│   │   ├── inference.rs          # candle init, embed(text), search_vec(query, k, wing_filter)
│   │   ├── search.rs             # hybrid_search(query): SMS + palace filter + FTS5 + vec + set-union merge
│   │   ├── progressive.rs        # progressive_retrieve(results, token_budget, depth) → expanded results
│   │   ├── novelty.rs            # check_novelty(content, page): Jaccard + cosine dedup
│   │   ├── assertions.rs         # heuristic contradiction detection via assertions table
│   │   ├── graph.rs              # neighborhood_graph(slug, depth): N-hop BFS over links table
│   │   ├── gaps.rs               # knowledge gap detection and resolution tracking
│   │   ├── chunking.rs           # temporal sub-chunking: truth sections + timeline entries
│   │   ├── palace.rs             # derive_wing(slug), derive_room(content), classify_intent(query)
│   │   ├── markdown.rs           # parse_frontmatter(), split_content(), extract_summary(), render_page()
│   │   ├── links.rs              # extract_links(), resolve_slug(), temporal validity
│   │   ├── migrate.rs            # export_dir() + round-trip export helpers
│   │   └── types.rs              # Page, Link, Tag, TimelineEntry, SearchResult, Contradiction, KnowledgeGap, etc.
│   ├── mcp/
│   │   ├── mod.rs
│   │   └── server.rs             # MCP stdio server: tool definitions + handlers
│   └── schema.sql                # DDL (embedded in db.rs via include_str!, also standalone)
│
├── skills/
│   ├── ingest/SKILL.md
│   ├── query/SKILL.md
│   ├── maintain/SKILL.md
│   ├── enrich/SKILL.md
│   ├── briefing/SKILL.md
│   ├── alerts/SKILL.md
│   ├── research/SKILL.md
│   └── upgrade/SKILL.md
│
├── benchmarks/
│   ├── longmemeval.rs            # LongMemEval multi-session memory (R@5 ≥ 85%)
│   ├── locomo.rs                 # LoCoMo conversational memory (F1 regression)
│   ├── beir_subset.rs            # BEIR retrieval regression (nDCG@10)
│   ├── ragas_eval.rs             # Ragas answer quality (context_precision, recall)
│   └── datasets/                 # gitignored, downloaded by prep script
│
├── tests/
│   ├── roundtrip_semantic.rs     # import → normalized export → semantic validate (MUST pass)
│   ├── roundtrip_raw.rs          # import → raw export by import_id → byte-exact diff (MUST pass)
│   ├── fts.rs                    # FTS5 search correctness
│   ├── inference.rs              # candle embedding + vector search quality
│   ├── links.rs                  # link extraction + temporal interval enforcement
│   ├── mcp.rs                    # MCP tool call correctness
│   └── fixtures/
│       ├── person.md
│       ├── company.md
│       └── .raw/person.json
│
└── .github/
    └── workflows/
        ├── ci.yml                # cargo test + cargo build --release
        └── release.yml           # cross-compile matrix → GitHub release assets
```
### CLAUDE.md (embedded)
````markdown
# Quaid

Personal knowledge memory. SQLite + FTS5 + local vector embeddings. One binary.

## Architecture

Thin CLI (src/main.rs) dispatches to commands (src/commands/).
Core library (src/core/) handles DB, search, embeddings, markdown parsing.
Skills (skills/) are fat markdown files - all intelligence lives there.

## Key files

- `src/core/db.rs` — rusqlite connection, schema init, WAL, sqlite-vec load
- `src/core/fts.rs` — FTS5 search: `search_fts(query, wing_filter, db)` → ranked results
- `src/core/inference.rs` — candle model init, `embed(text)`, `search_vec(query, k, wing_filter, db)`
- `src/core/search.rs` — `hybrid_search(query, db)`: SMS + palace filter + FTS5 + vec + set-union merge
- `src/core/progressive.rs` — `progressive_retrieve(results, budget, depth)`: token-budget expansion
- `src/core/novelty.rs` — `check_novelty(content, page, db)`: Jaccard + cosine dedup
- `src/core/assertions.rs` — `check_assertions(slug, db)`: heuristic contradiction detection via SQL
- `src/core/graph.rs` — `neighborhood_graph(slug, depth, db)`: N-hop BFS over links table
- `src/core/gaps.rs` — `log_gap(query, context, score, db)`, `list_gaps(db)`: knowledge gap tracking
- `src/core/chunking.rs` — temporal sub-chunking: truth sections + individual timeline entries
- `src/core/palace.rs` — `derive_wing(slug)`, `derive_room(content)`, `classify_intent(query)`
- `src/core/markdown.rs` — parse frontmatter, split compiled_truth/timeline, extract_summary, render
- `src/mcp/server.rs` — MCP stdio server exposing all tools (including memory_graph, memory_gap, memory_check)

## Build

```bash
cargo build --release
# Output: target/release/quaid (airgapped channel, default)

# Cross-compile
cargo install cross
cross build --release --target aarch64-apple-darwin
cross build --release --target x86_64-unknown-linux-musl

cargo test
# Key tests: tests/roundtrip_semantic.rs (normalized export validate) + tests/roundtrip_raw.rs (byte-exact diff by import_id).
```

## Embedding model

BGE-small-en-v1.5 via candle (pure Rust). 384 dimensions. v0.9.2 ships two compile-time channels:

- `embedded-model` — airgapped binary with `include_bytes!()` model assets
- `online-model` — slim binary that downloads BGE-small on first semantic use

## Skills

Read skills/ before doing memory operations. They contain all workflow logic.
````
---
## Build and Release
### Cargo.toml (core dependencies)
```toml
[package]
name = "quaid"
version = "0.1.0"
edition = "2021"

[[bin]]
name = "quaid"
path = "src/main.rs"

[dependencies]
# CLI
clap = { version = "4", features = ["derive"] }

# Database - bundled = SQLite compiled into binary, no system dep
rusqlite = { version = "0.31", features = ["bundled"] }

# sqlite-vec - vector search as SQLite extension, statically linked
sqlite-vec = "0.1"  # or inline via rusqlite loadable extension

# Embeddings - pure Rust ML, no ONNX runtime, true static linking
candle-core = "0.8"
candle-nn = "0.8"
candle-transformers = "0.8"
safetensors = "0.4"
tokenizers = "0.20"

# MCP server
rmcp = "0.1"

# Serialization
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde_yaml = "0.9"

# Markdown
pulldown-cmark = "0.11"

# Regex
regex = "1"

# Async runtime (for MCP server)
tokio = { version = "1", features = ["full"] }

# Error handling
anyhow = "1"
thiserror = "1"

# HTTP client, pulled in only by the online-model feature.
# `dep:reqwest` below needs this optional entry to resolve; the version shown is an assumption.
reqwest = { version = "0.12", optional = true }

[features]
default = ["bundled", "embedded-model"]
bundled = ["rusqlite/bundled"]
embedded-model = []              # airgapped channel (default): include_bytes!() model weights into binary
online-model = ["dep:reqwest"]   # online channel; download weights on first run
```

### Build commands
```bash
# Development
cargo build

# Release — airgapped channel (default; embeds BGE-small weights into the binary)
cargo build --release

# Release — online channel (downloads/caches BGE-small on first semantic use)
cargo build --release --no-default-features --features bundled,online-model

# Run tests
cargo test

# Cross-compile — candle enables true musl static linking
cross build --release --target aarch64-apple-darwin        # macOS ARM
cross build --release --target x86_64-apple-darwin         # macOS Intel
cross build --release --target x86_64-unknown-linux-musl   # Linux x86_64 (static)
cross build --release --target aarch64-unknown-linux-musl  # Linux ARM64 (static)

# Install locally
cargo install --path .
```

### CI/CD (GitHub Actions)
```yaml
strategy:
  matrix:
    include:
      - target: aarch64-apple-darwin
        os: macos-latest
      - target: x86_64-apple-darwin
        os: macos-latest
      - target: x86_64-unknown-linux-musl   # static binary — no glibc dependency
        os: ubuntu-latest
      - target: aarch64-unknown-linux-musl  # static binary — no glibc dependency
        os: ubuntu-latest

# Post-build verification (release gate):
# - run: file target/${{ matrix.target }}/release/quaid
# - run: ldd target/${{ matrix.target }}/release/quaid 2>&1 | grep -q "not a dynamic" || exit 1  # Linux
# - run: otool -L target/${{ matrix.target }}/release/quaid | grep -qv "\.dylib" || true         # macOS (system libs OK)

# Post-build: generate SHA-256 checksums for integrity verification
# - run: shasum -a 256 ${{ matrix.artifact }} > ${{ matrix.artifact }}.sha256
# Publish both binary and .sha256 file as release assets (format: 'hash filename', one line)
```

Release artifacts are published to GitHub Releases on tag push. Each release includes platform binaries and SHA-256 checksum files (generated in CI, not locally). Install via the upgrade skill (see skills/upgrade/SKILL.md), which handles version pinning and checksum verification. Quick install for first-time users:
```bash
VERSION="v0.1.0"
PLATFORM="darwin-arm64"  # darwin-arm64 | darwin-x86_64 | linux-x86_64 | linux-aarch64
curl -fsSL "https://github.com/quaid-app/quaid/releases/download/${VERSION}/quaid-${PLATFORM}" -o "quaid-${PLATFORM}"
curl -fsSL "https://github.com/quaid-app/quaid/releases/download/${VERSION}/quaid-${PLATFORM}.sha256" -o "quaid-${PLATFORM}.sha256"
shasum -a 256 --check "quaid-${PLATFORM}.sha256"

# Option A: install for the current user
mkdir -p "${HOME}/.local/bin"
mv "quaid-${PLATFORM}" "${HOME}/.local/bin/quaid"
chmod +x "${HOME}/.local/bin/quaid"

# Option B: install system-wide (requires root)
sudo install -m 755 "quaid-${PLATFORM}" /usr/local/bin/quaid
```

## Phased Delivery

The spec describes the full vision. Build in phases — earn the right to add complexity.
### Phase 1: Core (ship this first)

The smallest thing that proves the value proposition:

- `quaid init`, `get`, `put`, `list`, `stats`
- `pages` table with `version`, split timestamps
- `knowledge_gaps` table (schema only — tools in Phase 3)
- `original` page type (in type mapping, template in ingest skill)
- FTS5 search (`quaid search`)
- Candle embeddings + sqlite-vec (`quaid embed`, `quaid query`)
- SMS exact-match short-circuit
- Basic set-union hybrid search (no palace filtering yet)
- `quaid collection add` / `quaid export` (normalized round-trip)
- `quaid compact` (WAL checkpoint)
- MCP server with `memory_get`, `memory_put`, `memory_query`, `memory_search`, `memory_list`
- Transactional ingest with idempotency
- Embedded default skills (including source attribution format and filing disambiguation in ingest skill)
- Round-trip test, corpus-reality tests, static binary verification
Ship gate: imports a real corpus, retrieves correctly, exports faithfully, binary is static.
### Phase 2: Intelligence Layer

- Temporal links (`memory_link`, `memory_link_close`, backlinks with `--temporal`)
- Graph neighborhood traversal (`memory_graph`, `quaid graph`)
- Assertions with provenance
- Contradiction detection (`quaid check`)
- Progressive retrieval with token budgets
- Novelty checking (Tiers 2-4 gating)
- Work-context entities (decision, commitment, action_item)
- Palace wing filtering (validate with benchmarks before committing to room-level)
- Full MCP write surface with version checks
- Optional person template enrichment sections (What They Believe, Hobby Horses, Trajectory, Network) for Tier 1 contacts
### Phase 3: Polish + Benchmarks

- Briefing skill with “what shifted” report
- Alerts skill (interrupt-driven notifications vs scheduled briefings)
- Research skill (knowledge gap resolution)
- Knowledge gap detection (`memory_gap`, `memory_gaps` MCP tools, `quaid gaps` CLI)
- Upgrade skill (agent-guided binary + skill updates)
- Enrichment skill
- LongMemEval, LoCoMo, BEIR, Ragas benchmarks
- `quaid skills doctor`
- `quaid validate --all` integrity checker
- `--json` output on all commands
- `pipe` mode
- CI/CD release pipeline with all gates
### Deliberate Deferrals

- First-class chunks table: The current `page_embeddings` table serves as both chunk metadata and embedding join table. This is intentionally not split into a separate `chunks` table for v1 — the enriched columns (content_hash, token_count, heading_path) are sufficient. If progressive retrieval, re-embedding, or chunk lifecycle management becomes painful at scale, promote chunks to their own table in a future version. This is a deliberate deferral, not an oversight.
- Room-level palace filtering: Deferred until benchmarks on real corpus prove it helps. Wing-only in v1.
- LLM-assisted contradiction detection: Binary stays dumb. Cross-page semantic reasoning happens via the maintain skill.
- WASM compilation: Rust has strong WASM support. PGLite proves browser portability is viable. If we ever need quaid in a browser or serverless context, WASM is the path. Not a current priority.
- Overnight consolidation cycle: Garry’s DREAMS.md pattern (overnight entity sweep, enrichment, citation fixing) is powerful but is agent configuration, not quaid binary. Could be a skill added post-v1.
## Implementation Roadmap

### Week 1 - Foundation

- `cargo init` + `Cargo.toml` with dependencies
- `src/core/types.rs` — Page (with summary, wing, room), Link (with relationship, valid_from, valid_until), Tag, TimelineEntry, SearchResult, Contradiction, KnowledgeGap structs
- `src/core/db.rs` — rusqlite connection, schema DDL via `include_str!`, WAL, sqlite-vec load
- `src/core/markdown.rs` — `parse_frontmatter()`, `split_content()`, `extract_summary()`, `render_page()`
- `src/core/palace.rs` — `derive_wing(slug)`, `derive_room(content)` — auto-derive palace metadata from slug structure and section headers
- Unit tests for markdown parsing (round-trip frontmatter, compiled_truth/timeline split, summary extraction)
- `src/main.rs` — clap CLI scaffold, command dispatch
- `src/commands/init.rs` — create new memory.db (v4 schema with palace + temporal + contradictions + knowledge_gaps)
- `src/commands/get.rs` — read page by slug
- `src/commands/put.rs` — write/update page (auto-extract summary, auto-derive wing/room)
- `src/commands/list.rs` — list pages with filters (including `--wing`)
- `src/commands/stats.rs` — memory statistics (including contradiction count)
- `src/commands/tags.rs` + tag/untag — tag operations
- `src/commands/link.rs` + unlink + backlinks — with `--relationship`, `--valid-from`, `--valid-until`, `--temporal` flags

Checkpoint: `quaid init`, `quaid put` (with auto-summary/palace), `quaid get`, `quaid list`, `quaid stats`, `quaid link --relationship works_at --valid-from 2024-01-15` all working.
### Week 2 - Search + Progressive Retrieval

- `src/core/fts.rs` — FTS5 search logic, BM25 scoring, palace wing/room filter
- `src/commands/search.rs` — full-text search command with `--wing` filter
- `src/core/inference.rs` — candle model init, `embed(text)`, batch embedding, palace-filtered vector search
- `src/core/chunking.rs` — temporal sub-chunking: truth sections at `##` boundaries, timeline entries individually
- `src/commands/embed.rs` — generate/refresh embeddings, `--all` and `--stale` flags
- `src/core/search.rs` — hybrid search: SMS + palace pre-filter + FTS5 + vec0 fan-out + set-union merge (default) with RRF fallback via config flag
- `src/core/progressive.rs` — `progressive_retrieve(results, token_budget, depth)`: summary → section → full expansion gated by token budget
- `src/commands/query.rs` — semantic query command with `--depth`, `--token-budget`, `--wing` flags
- Unit tests for set-union vs RRF merge correctness, progressive retrieval token counting

Checkpoint: `quaid search "River AI"` and `quaid query "who knows Jensen Huang?" --depth auto --token-budget 4000` both return ranked results with progressive expansion. Set-union merge is default, RRF switchable via `quaid config set search_merge_strategy rrf`.
### Week 3 - Ingest + MCP + Integrity

- `src/core/links.rs` — extract wiki-links from markdown, resolve slugs, temporal validity management
- `src/core/novelty.rs` — `check_novelty()`: Jaccard similarity + cosine similarity dedup
- `src/core/reconciler.rs` + `src/commands/collection.rs` — recursive walk, attach/sync, and transactional page upsert
- `src/commands/collection.rs` — collection add/sync surface (derive metadata during reconcile)
- `src/commands/export.rs` — reconstruct markdown directory
- `tests/roundtrip_semantic.rs` — import → normalized export → semantic validate (primary correctness test)
- `tests/roundtrip_raw.rs` — import → `export --raw --import-id` → byte-exact diff
- `src/commands/timeline.rs` — read timeline entries
- `src/commands/ingest.rs` — source document ingestion command (with novelty check, `--force` override)
- `src/core/assertions.rs` — heuristic contradiction detection via assertions table: link vs assertion, temporal staleness
- `src/core/graph.rs` — `neighborhood_graph(slug, depth, db)`: N-hop BFS over links table with temporal filtering
- `src/core/gaps.rs` — `log_gap()`, `list_gaps()`, `resolve_gap()`: knowledge gap tracking
- `src/commands/check.rs` — `quaid check [SLUG] --all --type temporal|cross_page|stale`
- `src/commands/graph.rs` — `quaid graph <SLUG> --depth N --temporal current|historical|all`
- `src/commands/gaps.rs` — `quaid gaps --limit N --resolved`
- `src/mcp/server.rs` — MCP stdio server with all tools (including `memory_graph`, `memory_gap`, `memory_gaps`, `memory_check`, progressive `memory_query`, temporal `memory_backlinks`)
- `src/commands/serve.rs` — start MCP server

Checkpoint: `quaid collection add memory /data/memory/` completes the initial sync with zero page loss. `quaid ingest` rejects duplicate exact-byte replays. `quaid check --all` detects temporal contradictions. `quaid graph people/pedro-franceschi --depth 2` returns the 2-hop neighborhood as JSON. `quaid serve` connects to Claude Code with all 20 MCP tools.
### Week 4 - Polish + Release

- `src/commands/call.rs` — raw tool call (GL pattern)
- `src/commands/config.rs` — config get/set/list (including `search_merge_strategy`, `default_token_budget`)
- `src/commands/version.rs`
- `--tools-json` output (MCP tool discovery)
- `pipe` mode (JSONL streaming)
- `--json` output flag on all commands
- Full test suite: fts.rs, inference.rs, links.rs, mcp.rs, novelty.rs, assertions.rs, progressive.rs, palace.rs, chunking.rs, graph.rs, gaps.rs
- `skills/` markdown files finalized (four-tier consolidation + source attribution + filing disambiguation in ingest, palace-aware query, contradiction-aware maintain/briefing, alerts, research, upgrade)
- `CLAUDE.md`, `AGENTS.md`, `README.md`
- CI/CD: `cargo test` + cross-compile matrix → GitHub Releases
- richer `quaid collection sync` validation / reporting output
- `quaid embed --stale` incremental re-embedding

Checkpoint: Full test suite passes. Cross-compiled binaries on GitHub Releases. Round-trip validated against production memory. Contradiction detection runs clean on imported data.
### Week 5 - Release Gates (corpus-reality first, leaderboards second)

- `benchmarks/` directory with dataset prep scripts, pinned dataset versions, and evaluation harness

#### Offline CI gates (mandatory, no API keys, fully local)

- BEIR (retrieval subset) — Retrieval regression gate. Runs entirely offline.
  - Dataset: `https://github.com/beir-cellar/beir` — pin to specific commit hash in `benchmarks/datasets.lock`
  - Subsets: NQ + FiQA (closest to personal KB workload)
  - Metrics: nDCG@10. Target: no regression > 2% between releases
  - Harness: custom Rust binary in `benchmarks/beir_eval.rs` — embeds corpus, runs queries, compares to baseline
- Corpus-reality tests (see below) — fully local, no LLM judge
#### API-dependent evaluation (optional, run manually or in separate CI job)

- LongMemEval — Multi-session memory benchmark.
  - Dataset: `https://github.com/xiaowu0162/LongMemEval` — pin commit in `benchmarks/datasets.lock`
  - Harness: official `evaluate_qa.py` (requires `OPENAI_API_KEY` for LLM judge)
  - Metrics: R@5 (Recall at 5). Target: ≥ 85%
  - Adapter: `benchmarks/longmemeval_adapter.py` converts quaid queries to LongMemEval format
- LoCoMo — Long conversational memory benchmark.
  - Dataset: `https://github.com/snap-research/locomo` — pin commit in `benchmarks/datasets.lock`
  - Harness: official evaluation scripts (API-dependent)
  - Metrics: F1 on single-iteration retrieval. Target: ≥ +30% over naive FTS5 baseline
- Ragas — Answer and context quality metrics for progressive retrieval.
  - Framework: `https://docs.ragas.io/` — pin version in `benchmarks/requirements.txt`
  - Metrics: context_precision, context_recall, faithfulness
  - Note: Ragas requires an LLM judge. Use local Ollama model or API key. Results are advisory, not a release gate.
- Corpus-reality tests — The benchmarks that actually matter for users.
  - Import a messy real markdown corpus (7K+ files) → verify zero page loss
  - Retrieve a known entity by name → correct page in top 1 (SMS test)
  - Retrieve a known fact from timeline → correct entry within top 5 (temporal sub-chunk test)
  - Ingest the same source twice → no duplicate timeline entries, no duplicate assertions
  - Ingest two conflicting sources → contradiction detected
  - Normalized export → reimport → normalized export → semantic diff = zero (idempotent round-trip)
  - Run 100 queries against imported corpus → measure p50/p95 latency (target: p95 < 250ms)
- Concurrency and crash-safety stress tests — CI gate for safety invariants.
  - Parallel writers: 4 threads calling `memory_put` on the same slug with stale `expected_version` → all but one must get ConflictError, zero data corruption
  - Overlapping ingest: 2 threads ingesting the same source simultaneously → one write, one exact-byte short-circuit, zero duplicate timeline/assertion rows
  - Kill-before-commit: start ingest, `kill -9` before COMMIT, retry → clean state, no partial mutations, `raw_imports` never ends with multiple active rows
  - WAL compact under load: run `quaid compact` while a reader holds an open query → compact succeeds, reader gets consistent snapshot
  - Invariants: monotonic `pages.version`, no lost timeline rows, no duplicate side-table rows across all stress scenarios
- Embedding model migration correctness — CI gate for vec search contract.
  - Embed corpus with model A, run 20 queries, record top-5 results per query
  - Register model B (same dimensions, different weights — or re-embed with same model under a new name)
  - Re-embed all pages under model B, flip active flag
  - Run same 20 queries → verify: (a) all results come from model B’s vec table, (b) no vec_rowid/id confusion, (c) no stale model A results leak through
  - Rollback: flip active flag back to model A → verify original top-5 results return identically
  - Gate: zero cross-model contamination across all queries
- Round-trip integrity — CI gate (two separate tests).
  - Semantic: Import corpus → normalized export → `quaid validate` against original. Checks same pages, same frontmatter keys/values, same compiled_truth, same timeline entries. MUST pass.
  - Byte-exact: Import corpus → `export --raw --import-id <ID>` → `diff -r` against original source. MUST pass. Only valid for the specific import batch, not after mutations.
- Static binary verification — CI release gate.
  - `ldd` / `file` / `otool` on every release artifact. Reject any binary with dynamic library dependencies.
  - Gate: `file quaid-linux-x86_64 | grep "statically linked"` must succeed.
Checkpoint: All offline CI gates pass: BEIR nDCG regression, corpus-reality tests, concurrency stress tests, round-trip integrity (both semantic and raw), static binary verification. API-dependent benchmarks (LongMemEval R@5 ≥ 85%, LoCoMo F1, Ragas) run manually before major releases. A failing offline gate blocks the release; API-dependent benchmarks are advisory.
Philosophy: Corpus-reality beats benchmark theater. The leaderboard benchmarks validate architectural decisions; the corpus tests validate that the tool actually works for its intended user.
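The parallel-writers stress test above relies on optimistic concurrency: a write carries the version it read, and only a write whose `expected_version` still matches may commit. The following is a minimal in-memory sketch of that invariant, not the quaid implementation. `Page`, `PutError`, `put_with_version`, and `race_writers` are hypothetical names, and a `Mutex` stands in for the database's write lock.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical error type standing in for the spec's ConflictError.
#[derive(Debug, PartialEq)]
pub enum PutError {
    Conflict { expected: u64, actual: u64 },
}

/// Hypothetical in-memory stand-in for one row of the `pages` table.
pub struct Page {
    pub version: u64,
    pub body: String,
}

/// Commit only if the caller's expected version matches the stored one,
/// then bump the version so every other stale writer is rejected.
pub fn put_with_version(
    page: &Mutex<Page>,
    expected_version: u64,
    body: &str,
) -> Result<u64, PutError> {
    let mut p = page.lock().unwrap();
    if p.version != expected_version {
        return Err(PutError::Conflict { expected: expected_version, actual: p.version });
    }
    p.version += 1;
    p.body = body.to_string();
    Ok(p.version)
}

/// Spawn `writers` threads that all believe the page is at version 1.
/// Returns (successful writes, conflicts).
pub fn race_writers(writers: usize) -> (usize, usize) {
    let page = Arc::new(Mutex::new(Page { version: 1, body: String::new() }));
    let handles: Vec<_> = (0..writers)
        .map(|i| {
            let page = Arc::clone(&page);
            thread::spawn(move || put_with_version(&page, 1, &format!("writer {i}")))
        })
        .collect();
    let results: Vec<_> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    let ok = results.iter().filter(|r| r.is_ok()).count();
    (ok, results.len() - ok)
}
```

With four writers, exactly one commit can succeed regardless of thread scheduling, because the first commit moves the version past what the others expect. In a SQLite-backed version the same check would presumably be a conditional `UPDATE` guarded by the version column.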
## Design Decisions

### Why SQLite over Postgres/Qdrant/Chroma

Postgres: Better for multi-user writes, replication, row-level security. None of those apply here. This is one person’s memory. One writer, many readers. SQLite’s sweet spot is exactly this workload.
Qdrant/Chroma/Pinecone: External services. Require network. Require containers or API keys. A personal memory shouldn’t need a sidecar container or a paid API to do semantic search. sqlite-vec gives native cosine similarity in the same file, same connection, same query.
The fundamental principle: memory.db is a 500MB file you can scp, rsync, back up to S3, or carry on a USB stick. No connection strings. No Docker. No managed database.
### Why candle over fastembed (ONNX)

The primary promise of this tool is a “single binary that runs anywhere.” fastembed relies on the ONNX runtime. Linking libonnxruntime statically — especially across musl for Linux or various macOS architectures — is fragile and often results in missing shared object errors at runtime.
candle is HuggingFace’s pure-Rust ML framework. By using candle to load safetensors for BGE-small-en-v1.5, we achieve true 100% static linking with zero C-dependencies (other than SQLite, which is bundled via rusqlite).
### Why local embeddings over OpenAI

Garry’s spec uses text-embedding-3-small (OpenAI API, 1536 dims, $0.02/1M tokens). Reasonable for a server-side tool. The problem: it requires internet access and an API key at embedding time.
BGE-small-en-v1.5 via candle:
- Pure Rust inference, no ONNX runtime
- 384 dimensions (4x smaller than OpenAI’s 1536-dim - smaller DB, faster search)
- Quality: excellent for personal knowledge base retrieval tasks (competitive with OpenAI small)
- No internet required (model weights embedded in binary by default)
- No API key
- No cost
- Runs in-process
Trade-off: Slightly lower quality on some retrieval benchmarks vs OpenAI text-embedding-3-large. Acceptable for a personal knowledge base where recall@10 matters more than recall@1.
Future: The model column in page_embeddings and embedding_models registry table allow swapping models. Upgrade path: register new model, run quaid embed --all, flip active flag.
### Why Rust over TypeScript/Bun

Garry’s spec is TypeScript/Bun. That’s a solid choice for his use case (server, API keys available). Rust is better for this spec because:

- `rusqlite --features bundled` = SQLite compiled into the binary. No system SQLite version issues.
- sqlite-vec statically linked = no native extension loading complications
- candle is pure Rust = in-process ML inference, no ONNX runtime, true musl static linking
- `cross` (invoked as `cross build`) = trivial cross-compilation to arm64 + x86_64 macOS and Linux
- No runtime (no Bun, no Node installed on target machine)
- Lower memory footprint
Bun’s compiled binary is ~10MB. This binary is ~90MB (including model weights). The 80MB difference is the cost of zero runtime dependencies. Worth it for a personal tool deployed on client machines.
### Why RRF over weighted sum

Garry’s spec uses FTS5 score × 0.4 + vector similarity × 0.6. This requires choosing weights and normalizing scores across two different scoring systems with different distributions.
RRF (Reciprocal Rank Fusion) is more robust: it only uses rank position, not raw scores. No normalization needed. No magic numbers. Works well empirically. Standard in hybrid search literature (SIGIR 2009).
Formula: RRF(d) = Σ 1/(k + rank(d, r)) where k=60, summed over result sets r.
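As an illustration of the formula, here is a small sketch of rank fusion over plain lists of document ids. `rrf_merge` is a hypothetical helper name, and the alphabetical tie-break is an assumption added only to make the output deterministic.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked result lists.
/// Each list holds document ids, best match first. `k` dampens the
/// influence of top ranks (the formula above uses k = 60).
pub fn rrf_merge(result_sets: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for set in result_sets {
        for (idx, doc) in set.iter().enumerate() {
            let rank = (idx + 1) as f64; // ranks are 1-based
            *scores.entry((*doc).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut merged: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; equal scores fall back to id order (assumption).
    merged.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap()
            .then_with(|| a.0.cmp(&b.0))
    });
    merged
}
```

A document ranked first in one list and second in the other scores 1/61 + 1/62, while a document that appears only at rank 3 of one list scores 1/63. Only positions enter the sum, so the FTS5 and vector scores never need to share a scale.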
### Chunk strategy: section-level

- Per-page: Too coarse. A 5,000-word person page has many distinct topics. Vector of whole page loses specificity.
- Per-paragraph: Too fine. Loses context. Short paragraphs embed poorly.
- Per-section (`##` headers): Right balance. Each `## State`, `## Assessment`, `## Timeline` section becomes a chunk. ~200-800 tokens. Good quality, good retrieval precision.
- Fallback: Pages without headers chunk at ~500-token boundaries.
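The section-level strategy with its fallback can be sketched as follows. This is an illustrative helper, not the spec's `chunking.rs`: `chunk_sections` is a hypothetical name, and whitespace-separated words are used as a crude stand-in for tokens.

```rust
/// Split a page into chunks at `## ` headings. Text before the first
/// heading becomes its own chunk. Pages with no `## ` heading fall back
/// to fixed windows of `fallback_tokens` words (assumption: words ~ tokens).
pub fn chunk_sections(markdown: &str, fallback_tokens: usize) -> Vec<String> {
    let has_headers = markdown.lines().any(|l| l.starts_with("## "));
    if !has_headers {
        let words: Vec<&str> = markdown.split_whitespace().collect();
        return words
            .chunks(fallback_tokens.max(1))
            .map(|w| w.join(" "))
            .collect();
    }

    let mut chunks = Vec::new();
    let mut current = String::new();
    for line in markdown.lines() {
        // A new `## ` heading closes the chunk collected so far.
        if line.starts_with("## ") && !current.trim().is_empty() {
            chunks.push(current.trim_end().to_string());
            current = String::new();
        }
        current.push_str(line);
        current.push('\n');
    }
    if !current.trim().is_empty() {
        chunks.push(current.trim_end().to_string());
    }
    chunks
}
```

Deeper headings such as `###` do not start a new chunk here, so subsections stay with their parent section.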
### Multiple memories

A different DB file = a different memory. No application-level complexity.

```bash
QUAID_DB=/path/to/work.db quaid stats
QUAID_DB=/path/to/personal.db quaid serve --port 3001
```

### Why set-union over RRF (v2 change)

The v1 spec used RRF (Reciprocal Rank Fusion). UNC’s AutoResearchClaw pipeline (Apr 2026) tested score-based re-ranking approaches and found they degrade performance by disrupting the semantic ordering dense retrieval already established. Set-union (vector results first in original order, FTS5-only results appended) delivered +44% F1 on LoCoMo in a single iteration. Ablation confirmed the gain. RRF is retained as a config fallback for A/B testing.
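The set-union merge described above is simple enough to show directly. This is a sketch over plain slug lists, with `set_union_merge` as a hypothetical helper name.

```rust
use std::collections::HashSet;

/// Set-union merge: keep vector hits in their original order, then append
/// FTS5 hits that the vector search did not already return.
pub fn set_union_merge(vec_hits: &[&str], fts_hits: &[&str]) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut merged = Vec::new();
    for slug in vec_hits.iter().chain(fts_hits.iter()) {
        // `insert` returns false for slugs already emitted, so duplicates are skipped.
        if seen.insert(*slug) {
            merged.push((*slug).to_string());
        }
    }
    merged
}
```

No scores are combined at all: the dense ordering is preserved as-is, and keyword-only matches can only add recall at the tail.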
### Why exact-match short-circuit (SMS)

Set-union merging starts with vector results, which means searching for “Pedro Franceschi” might not return that person’s actual page first if other pages have higher embedding similarity. In a personal memory, searching for a specific name MUST return that entity’s page at #1. SMS guarantees title and slug exact matches bypass semantic fuzziness entirely.
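A sketch of the short-circuit, under stated assumptions: `PageRef` and `apply_sms` are hypothetical names, matching is case-insensitive on the trimmed query (the spec does not say how matching is normalized), and `ranked` is whatever the hybrid search already produced.

```rust
/// Hypothetical minimal view of a page for exact-match lookup.
#[derive(Debug, Clone, PartialEq)]
pub struct PageRef {
    pub slug: String,
    pub title: String,
}

/// Pin pages whose title or slug equals the query to the top of the
/// result list, ahead of the semantically ranked slugs.
pub fn apply_sms(query: &str, pages: &[PageRef], ranked: Vec<String>) -> Vec<String> {
    let q = query.trim().to_lowercase();
    let exact: Vec<String> = pages
        .iter()
        .filter(|p| p.title.to_lowercase() == q || p.slug.to_lowercase() == q)
        .map(|p| p.slug.clone())
        .collect();

    // Exact matches first, then the remaining ranked results without repeats.
    let mut out = exact.clone();
    out.extend(ranked.into_iter().filter(|s| !exact.contains(s)));
    out
}
```

A query that matches no title or slug leaves the ranked list untouched, so the short-circuit only changes results when there is an unambiguous entity hit.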
### Why palace filtering (v2 addition)

MemPalace’s ablation: 60.9% → 94.8% R@5 with wing+room pre-filtering (+33.9 percentage points). Constraining the search space before running expensive vector queries is cheaper and more effective than post-hoc re-ranking. quaid’s palace metadata is auto-derived from slug structure (zero manual effort) with frontmatter override for custom taxonomies.

Caveat: The roughly 34-point improvement is from MemPalace’s synthetic benchmark, not validated on quaid’s corpus. Wing-level filtering ships in v1 as a low-cost bet (auto-derived, zero manual effort). Room-level filtering is deferred until benchmark results on real data confirm it helps. If benchmarks show palace filtering doesn’t materially improve retrieval on a personal knowledge corpus, demote it to optional.
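To make the wing pre-filter concrete, here is a sketch that derives a wing from the slug and narrows the candidate set before any search runs. The derivation rule is an assumption: the spec only says wings come from slug structure with a frontmatter override, so taking the first path segment, and the `"general"` default, are illustrative choices. `derive_wing_sketch` and `filter_by_wing` are hypothetical names.

```rust
/// Derive a wing for a page. A non-empty frontmatter value wins;
/// otherwise the first slug segment is used (assumed rule), and slugs
/// without a `/` land in a catch-all wing (assumed default).
pub fn derive_wing_sketch(slug: &str, frontmatter_wing: Option<&str>) -> String {
    if let Some(w) = frontmatter_wing {
        let w = w.trim();
        if !w.is_empty() {
            return w.to_string();
        }
    }
    match slug.split_once('/') {
        Some((first, _)) if !first.is_empty() => first.to_string(),
        _ => "general".to_string(),
    }
}

/// Keep only candidate slugs that fall inside `wing`, so the expensive
/// vector query runs over a smaller set.
pub fn filter_by_wing<'a>(slugs: &[&'a str], wing: &str) -> Vec<&'a str> {
    slugs
        .iter()
        .copied()
        .filter(|s| derive_wing_sketch(s, None) == wing)
        .collect()
}
```

Because the wing falls out of the slug, existing pages need no manual tagging to participate in the filter.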
### Why progressive retrieval (v2 addition)

OMNIMEM’s ablation: removing progressive retrieval = -17% F1 (largest single component). Serving full pages is wasteful when the agent only needs a summary to decide relevance. Token-budget-gated expansion lets the system serve the right amount of content for the context window available.
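One way to read "token-budget-gated expansion" is a three-pass allocation: give every hit its summary first, then upgrade hits to section and full content in rank order while budget remains. The sketch below implements that reading. The pass order is an assumption, and `Hit`, `Level`, and `progressive_retrieve_sketch` are hypothetical names with precomputed token counts standing in for real content.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Summary,
    Section,
    Full,
}

/// Hypothetical search hit with the token cost of each expansion level.
pub struct Hit {
    pub slug: String,
    pub summary_tokens: usize,
    pub section_tokens: usize,
    pub full_tokens: usize,
}

/// Allocate a token budget across ranked hits: summaries first, then
/// upgrades to Section and Full in rank order while they still fit.
pub fn progressive_retrieve_sketch(hits: &[Hit], budget: usize) -> Vec<(String, Level)> {
    let mut remaining = budget;
    let mut served: Vec<(usize, Level)> = Vec::new();

    // Pass 1: cheapest level for as many hits as the budget allows.
    for (i, h) in hits.iter().enumerate() {
        if h.summary_tokens <= remaining {
            remaining -= h.summary_tokens;
            served.push((i, Level::Summary));
        }
    }

    // Passes 2 and 3: pay only the extra cost of each upgrade.
    for target in [Level::Section, Level::Full] {
        for (idx, level) in served.iter_mut() {
            let hit = &hits[*idx];
            let (from, from_cost, to_cost) = match target {
                Level::Section => (Level::Summary, hit.summary_tokens, hit.section_tokens),
                _ => (Level::Section, hit.section_tokens, hit.full_tokens),
            };
            let extra = to_cost.saturating_sub(from_cost);
            if *level == from && extra <= remaining {
                remaining -= extra;
                *level = target;
            }
        }
    }

    served
        .into_iter()
        .map(|(i, level)| (hits[i].slug.clone(), level))
        .collect()
}
```

With two hits costing 10/100/1000 tokens each and a budget of 150, both get summaries (20 tokens), the top hit is upgraded to its section (90 more), and nothing else fits.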
### Why temporal links (v2 addition)

Knowledge graphs without temporal validity can’t distinguish “who works at Brex?” from “who ever worked at Brex?”. MemPalace’s temporal triples (valid_from/valid_until) enable this distinction. Contradiction detection builds directly on temporal metadata — you can’t catch “page says he left, but the link is still active” without it.
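The difference between the two questions comes down to whether the validity window is checked. The sketch below uses an in-memory `Link` struct as a stand-in for rows of the links table. The field names follow the spec, while `is_active`, `who_works_at`, and the choice to treat `valid_until` as exclusive are assumptions. Dates are ISO `YYYY-MM-DD` strings, which sort chronologically as text.

```rust
/// Stand-in for one row of the links table.
#[derive(Debug, Clone)]
pub struct Link {
    pub from: String,
    pub to: String,
    pub relationship: String,
    pub valid_from: Option<String>,
    pub valid_until: Option<String>,
}

/// True if the link is valid on `date`. Missing bounds mean "open-ended".
/// `valid_until` is treated as exclusive (assumption).
pub fn is_active(link: &Link, date: &str) -> bool {
    let started = link.valid_from.as_deref().map_or(true, |f| f <= date);
    let not_ended = link.valid_until.as_deref().map_or(true, |u| date < u);
    started && not_ended
}

/// `on = Some(date)` answers "who works there on that date";
/// `on = None` answers "who ever worked there".
pub fn who_works_at<'a>(links: &'a [Link], company: &str, on: Option<&str>) -> Vec<&'a str> {
    links
        .iter()
        .filter(|l| l.relationship == "works_at" && l.to == company)
        .filter(|l| on.map_or(true, |d| is_active(l, d)))
        .map(|l| l.from.as_str())
        .collect()
}
```

The same `is_active` check is what a contradiction detector would need in order to flag a page that says someone left while their link has no `valid_until`.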
### Why selective ingestion (v2 addition)

OMNIMEM’s three principles: selective ingestion, multimodal atomic units, progressive retrieval. Jaccard overlap + cosine similarity dedup at ingest time prevents the corpus from accumulating noise. Cleaner corpus = better retrieval precision without any search algorithm changes.
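The two similarity measures behind the dedup check can be sketched as below. How quaid combines them, and which thresholds it uses, is not stated here, so `is_novel` (content counts as novel only if it stays under both caps) and its threshold parameters are assumptions. `jaccard` works on lowercased whitespace tokens for simplicity.

```rust
use std::collections::HashSet;

/// Jaccard overlap of the word sets of two texts: |A ∩ B| / |A ∪ B|.
pub fn jaccard(a: &str, b: &str) -> f64 {
    let sa: HashSet<String> = a.split_whitespace().map(|w| w.to_lowercase()).collect();
    let sb: HashSet<String> = b.split_whitespace().map(|w| w.to_lowercase()).collect();
    if sa.is_empty() && sb.is_empty() {
        return 1.0; // two empty texts are identical
    }
    let inter = sa.intersection(&sb).count() as f64;
    let union = sa.union(&sb).count() as f64;
    inter / union
}

/// Cosine similarity of two embedding vectors; 0.0 if either is all zeros.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Assumed gating rule: new content is novel only if it is below both caps.
pub fn is_novel(jaccard_sim: f64, cosine_sim: f32, jaccard_max: f64, cosine_max: f32) -> bool {
    jaccard_sim < jaccard_max && cosine_sim < cosine_max
}
```

Jaccard catches near-verbatim repeats cheaply, while cosine over embeddings catches paraphrases that share few exact words.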
### Why SQLite over PGLite (v4 validation)

Garry Tan’s earlier v0.8.0 local-memory prototype (Apr 2026) moved from Supabase to PGLite — an in-process Postgres that runs in a browser or Node.js via WASM. Same principle as our SQLite choice: zero external dependencies, fully local. Three independent teams in the same week (us with SQLite, Garry with PGLite, @ansubkhan with Fastify/SQLite) converged on local embedded databases for agent memory. The architecture is validated.

| | SQLite (quaid) | PGLite (Garry’s earlier prototype) |
|---|---|---|
| Transport | `cp memory.db` / `scp` / USB stick | Requires WASM runtime to read |
| Runtime | None (statically linked into Rust binary) | Node.js/Bun + WASM |
| True single file | Yes (after `quaid compact`) | No (PGLite has its own data directory) |
| Vector search | sqlite-vec (statically linked) | pgvector via WASM |
| Cross-compile | `cross build` to any musl target | Requires WASM-compatible platform |
| Browser portability | No (desktop binary) | Yes (WASM) |
Both are good choices for their respective stacks. We chose SQLite because cp memory.db is the entire backup and migration story. PGLite’s browser portability is a future option for us via Rust→WASM compilation, but not a current priority.
### The links table as a graph layer (v4 positioning)

quaid’s links table with typed relationships and temporal validity windows is a knowledge graph without Neo4j. Combined with `memory_graph` (N-hop neighborhood traversal), `memory_backlinks` (temporal filtering), and palace-style hierarchy (wing/room), quaid provides GraphRAG capabilities in a single SQLite file — no separate graph database required.
This is worth calling out because the “separate graph + vector store” problem is a known pain point in the GraphRAG community. Every implementation (nano-graphrag, LangChain GraphRAG, etc.) makes you choose between a graph database and a vector store. quaid doesn’t — wikilink traversal in the links table, vector similarity in sqlite-vec, FTS5 keyword search, all in one file, one connection, one query.
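N-hop neighborhood traversal over a links table is a breadth-first search with a depth cap. The sketch below runs it over an in-memory edge list instead of SQL. `neighborhood` is a hypothetical helper, and treating links as undirected is an assumption made for the illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal up to `depth` hops from `start`.
/// Returns each reachable slug with its hop distance, nearest first.
pub fn neighborhood(edges: &[(&str, &str)], start: &str, depth: usize) -> Vec<(String, usize)> {
    // Build an adjacency map; each link is followed in both directions (assumption).
    let mut adj: HashMap<&str, Vec<&str>> = HashMap::new();
    for (a, b) in edges {
        adj.entry(*a).or_default().push(*b);
        adj.entry(*b).or_default().push(*a);
    }

    let mut seen: HashSet<&str> = HashSet::from([start]);
    let mut queue: VecDeque<(&str, usize)> = VecDeque::from([(start, 0)]);
    let mut out = Vec::new();

    while let Some((node, d)) = queue.pop_front() {
        if d > 0 {
            out.push((node.to_string(), d));
        }
        if d == depth {
            continue; // do not expand past the hop limit
        }
        for next in adj.get(node).into_iter().flatten() {
            if seen.insert(*next) {
                queue.push_back((*next, d + 1));
            }
        }
    }
    out
}
```

A temporal filter would slot in where edges are added to the adjacency map, dropping links whose validity window excludes the date of interest.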
### No file watcher (v1)

The memory is written by AI agents using the CLI or MCP. Vault-backed workflows now attach directories with `quaid collection add` and stay fresh through `quaid serve` on Unix platforms, while `quaid ingest` and `quaid put` remain the explicit single-file write surfaces.
## Schema Versioning and DB Migration

The `config` table stores `version` (currently `'4'`). On startup, the binary compares the DB schema version to its own expected version:
| Scenario | Behavior |
|---|---|
| DB version == binary version | Normal operation |
| DB version < binary version | Acquire exclusive write lock, WAL-safe backup, then run entire migration chain + version bump in a single transaction. Rollback on any error. The binary ships with a migration chain (v1→v2, v2→v3, etc.). No concurrent writers during migration. |
| DB version > binary version | Refuse to open. Print error: “memory.db is version N, but this binary supports up to version M. Upgrade quaid.” |
Migrations are tested by importing a fixture memory at each prior schema version and verifying post-migration integrity via quaid validate --all.
```rust
use std::path::Path;

use anyhow::{bail, Result};
use rusqlite::Connection;

fn migrate(conn: &Connection, db_path: &Path) -> Result<()> {
    let db_version: u32 = get_config(conn, "version")?.parse()?;
    let target_version: u32 = SCHEMA_VERSION; // compiled into binary

    if db_version > target_version {
        bail!("memory.db v{} is newer than this binary (supports up to v{})",
              db_version, target_version);
    }

    if db_version == target_version {
        return Ok(()); // no migration needed
    }

    // Step 1: Acquire exclusive write lock BEFORE backup.
    // BEGIN IMMEDIATE prevents concurrent writers from committing between
    // backup and migration, ensuring the backup is a consistent snapshot.
    conn.execute("BEGIN IMMEDIATE", [])?;

    // Step 2: WAL-safe backup inside the exclusive lock.
    // VACUUM INTO runs as a read operation on the current snapshot,
    // producing a standalone copy with all WAL content checkpointed.
    // Caveat: SQLite rejects VACUUM (including VACUUM INTO) while a transaction
    // is open on the same connection, so verify this ordering against the bundled
    // SQLite; the backup may need to run before BEGIN IMMEDIATE or go through
    // the online backup API instead.
    let backup_path = db_path.with_extension(format!("db.backup-v{}", db_version));
    if let Err(e) = conn.execute(
        &format!("VACUUM INTO '{}'", backup_path.display()), []
    ) {
        conn.execute("ROLLBACK", [])?;
        bail!("WAL-safe backup failed, migration aborted: {}", e);
    }

    // Step 3: Run entire migration chain inside the same transaction.
    // If ANY step fails, the entire chain rolls back — no partial migration.
    for step in (db_version + 1)..=target_version {
        // Slices are indexed by usize, so the u32 step is cast explicitly.
        if let Err(e) = conn.execute_batch(MIGRATIONS[step as usize]) {
            conn.execute("ROLLBACK", [])?;
            bail!("Migration v{}→v{} failed at step {}, rolled back: {}",
                  db_version, target_version, step, e);
        }
    }

    // Step 4: Bump version inside the same transaction.
    set_config(conn, "version", &target_version.to_string())?;

    // Step 5: Commit atomically. All migrations + version bump succeed or none do.
    conn.execute("COMMIT", [])?;
    Ok(())
}
```

Migration protocol:
- Exclusive lock first: `BEGIN IMMEDIATE` blocks all concurrent writers before the backup snapshot. No writes can interleave between backup and migration.
- WAL-safe backup under lock: `VACUUM INTO` produces a standalone, fully-checkpointed copy at `{db_path}.backup-v{N}`. Because the exclusive lock is held, the backup is guaranteed to reflect the exact state that migration will operate on.
- Atomic migration chain: All migration steps + version bump run inside the same transaction. If any step fails, `ROLLBACK` restores the DB to its pre-migration state — no partial migration possible.
- Rollback safety: The backup file is a complete, self-contained SQLite database (no WAL sidecars). Restoring it requires replacing the DB file AND deleting any `-wal` / `-shm` sidecars (see rollback procedure below).
Important: Migration requires no concurrent clients. The MCP server should not be running during migration. The CLI handles this naturally (single process), but the upgrade skill should verify no quaid serve process is active before proceeding.
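As a minimal sketch of the restore step named under rollback safety, the following uses only the Rust standard library. The helper names `sidecar` and `restore_backup` and the `.restore-tmp` suffix are hypothetical, and the sketch assumes no process has the database open while it runs.

```rust
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Builds a sidecar path such as `memory.db-wal` by appending `suffix`
/// to the full file name (the extension is kept, not replaced).
fn sidecar(db_path: &Path, suffix: &str) -> PathBuf {
    let mut name: OsString = db_path.as_os_str().to_owned();
    name.push(suffix);
    PathBuf::from(name)
}

/// Hypothetical helper: puts `backup_path` in place of `db_path` and
/// removes stale `-wal` / `-shm` sidecars. The backup file itself is kept.
fn restore_backup(db_path: &Path, backup_path: &Path) -> io::Result<()> {
    // Remove sidecars first so an old WAL can never be replayed
    // against the restored database file.
    for suffix in ["-wal", "-shm"] {
        match fs::remove_file(sidecar(db_path, suffix)) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::NotFound => {}
            Err(e) => return Err(e),
        }
    }
    // Copy next to the target, then rename over it, so the database
    // path never points at a half-written file.
    let tmp = sidecar(db_path, ".restore-tmp");
    fs::copy(backup_path, &tmp)?;
    fs::rename(&tmp, db_path)?;
    Ok(())
}
```

Calling `restore_backup(Path::new("memory.db"), Path::new("memory.db.backup-v3"))` would leave `memory.db` with the backup's contents and no sidecars beside it; the version number in that file name is only an example.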
Security and Data Sensitivity
memory.db contains sensitive personal intelligence: deal assessments, people evaluations, relationship context, business strategy. The security model:
At rest:
- memory.db is a regular file. Protect it with filesystem permissions (`chmod 600`).
- For encryption at rest, use OS-level full-disk encryption (FileVault, LUKS) or SQLite’s SEE extension (commercial). quaid does not implement its own encryption; that’s a footgun.
- Run `quaid compact` before transport to ensure no WAL sidecar contains unencrypted data.
In transit:
- MCP server runs on stdio (local pipes only). No network listener. No remote access by default.
- `scp`/`rsync` for transfer. Use encrypted channels.
Operational:
- No telemetry. No analytics. No phone-home. The quaid binary itself makes zero network calls at runtime.
- Skills are local markdown files. The binary does not exfiltrate data.
- Network boundary for agent-driven skills: The enrichment skill (`skills/enrich/SKILL.md`) and research skill (`skills/research/SKILL.md`) instruct the agent to call external APIs (Crustdata, Exa, Brave, Happenstance). These network calls are made by the agent, not by the quaid binary, but the effect is the same: memory content (queries, entity names) can reach third-party services. The `knowledge_gaps.sensitivity` field controls this: gaps default to `internal` (no external research), and must be explicitly upgraded to `redacted` or `external` before the research skill will issue network calls. Raw query text (`query_text`) is never retained at detection time; only a `query_hash` is stored, and `query_text` is populated only after explicit approval. Agents must respect this classification.
- `quaid export` writes plaintext markdown. Treat export directories with the same sensitivity as the DB.
- `.env` files or API keys for enrichment skills (Crustdata, Exa) are the user’s responsibility. quaid never stores them in memory.db.
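The sensitivity gate described above can be sketched as a small state model. This is an illustration under stated assumptions, not the spec's schema: the type names `Sensitivity` and `KnowledgeGap` and the methods `parse`, `detected`, and `approve` are hypothetical; only the three sensitivity values and the `query_hash` / `query_text` fields come from the text.

```rust
/// The three `knowledge_gaps.sensitivity` values named in the spec.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sensitivity {
    Internal,
    Redacted,
    External,
}

impl Sensitivity {
    /// Parses a stored value; unknown strings yield `None` so the caller
    /// can fall back to the safe default (`Internal`).
    fn parse(value: &str) -> Option<Self> {
        match value {
            "internal" => Some(Self::Internal),
            "redacted" => Some(Self::Redacted),
            "external" => Some(Self::External),
            _ => None,
        }
    }

    /// Only explicitly upgraded gaps may trigger network calls.
    fn allows_external_research(self) -> bool {
        matches!(self, Self::Redacted | Self::External)
    }
}

/// Hypothetical in-memory view of one knowledge gap row.
struct KnowledgeGap {
    query_hash: String,
    query_text: Option<String>,
    sensitivity: Sensitivity,
}

impl KnowledgeGap {
    /// At detection time only the hash is kept and the gap stays internal.
    fn detected(query_hash: String) -> Self {
        Self { query_hash, query_text: None, sensitivity: Sensitivity::Internal }
    }

    /// Explicit approval upgrades the gap and only then stores the raw text.
    /// Approving to `Internal` changes nothing.
    fn approve(&mut self, level: Sensitivity, query_text: String) {
        if level.allows_external_research() {
            self.sensitivity = level;
            self.query_text = Some(query_text);
        }
    }
}
```

Defaulting to `Internal` on detection, and on any unparseable stored value, keeps the gate fail-safe: a research step that checks `allows_external_research()` first cannot leak a query that was never approved.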
Non-goals: quaid does not implement user auth, access control, audit logging, or data classification. It is a single-user tool on a single machine. If the machine is compromised, the memory is compromised.
Error Handling and Graceful Degradation
| Failure | Behavior |
|---|---|
| Candle model fails to load | Fatal on startup. Binary refuses to serve if embeddings can’t work. Clear error message with path to model weights. |
| sqlite-vec not available | Fall back to pure-Rust cosine similarity (O(n) scan). Log warning. Performance degrades but search still works. |
| Skill file missing | Use embedded default. If embedded default also missing (shouldn’t happen), warn and continue — the binary still works for read/write/search, just without agent workflows. |
| DB file corrupt | quaid validate --all reports errors. quaid refuses destructive operations on a corrupt DB. Recommend restore from backup. |
| WAL sidecar missing | SQLite handles this — rolls back uncommitted transactions, creates fresh WAL. No data loss for committed writes. |
| Disk full during write | SQLite transaction rolls back cleanly. No partial state. Error surfaced to caller. |
| Concurrent writer conflict | ConflictError returned to caller with current version. Caller retries. No data corruption. |
Principle: Fail loud, fail safe. Never silently corrupt. Never silently drop data. If in doubt, refuse the operation and tell the user why.
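To make the "Concurrent writer conflict" row concrete, here is a minimal in-memory sketch of an optimistic-concurrency write. It assumes a per-page integer version and treats version 0 as "page does not exist yet"; the `Store`, `Page`, and `put` names are hypothetical, and the real binary would perform the same check inside a SQLite transaction.

```rust
use std::collections::HashMap;

/// Returned when the caller's expected version is stale.
#[derive(Debug, PartialEq)]
struct ConflictError {
    current_version: u64,
}

struct Page {
    version: u64,
    body: String,
}

/// Hypothetical stand-in for the pages table.
#[derive(Default)]
struct Store {
    pages: HashMap<String, Page>,
}

impl Store {
    /// Writes `body` only if the stored version equals `expected_version`.
    /// On success the version is bumped and returned; on mismatch nothing
    /// is written and the current version is reported to the caller.
    fn put(&mut self, slug: &str, body: &str, expected_version: u64) -> Result<u64, ConflictError> {
        let current = self.pages.get(slug).map_or(0, |p| p.version);
        if current != expected_version {
            return Err(ConflictError { current_version: current });
        }
        let next = current + 1;
        self.pages.insert(
            slug.to_string(),
            Page { version: next, body: body.to_string() },
        );
        Ok(next)
    }
}
```

A caller that receives `ConflictError` should re-read the page at `current_version`, reconcile its change against the newer content, and only then retry; resubmitting the same body with the new version number would silently overwrite the other writer.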
Comparison Table
| | Garry’s earlier spec (v0.8) | This spec (v4) | MemPalace | agentmemory | Obsidian | Notion |
|---|---|---|---|---|---|---|
| Language | TypeScript/Bun | Rust | Python | Node.js | Electron | Web/Cloud |
| Binary | ~10MB + API dep | ~90MB self-contained | pip install | npm install | Heavy app | SaaS |
| Embeddings | PGLite (local, v0.8) | BGE-small local (free) | ChromaDB (local) | BM25 + vector | Plugin | Built-in AI |
| Storage | PGLite (local Postgres) | Single SQLite | ChromaDB + SQLite | SQLite | Markdown files | Cloud DB |
| Search | FTS5 + vector + weighted | Set-union + palace filter + progressive | Palace-filtered semantic | Triple-stream (BM25+vec+KG) | Plugin | Cloud |
| Graph traversal | No | Yes (N-hop memory_graph) | No | Yes (KG edges) | No | No |
| Retrieval | Full pages | Progressive (budget-gated) | Closet → drawer drill-down | Context injection + budget | Manual | Full pages |
| Dedup/novelty | None | Jaccard + cosine | None published | TTL + importance eviction | None | None |
| Temporal graph | No | Yes (valid_from/until) | Yes (KG triples) | Yes (versioned) | No | No |
| Contradiction detection | No | Yes (CLI + MCP) | Yes (fact_checker.py) | Yes (cascading staleness) | No | No |
| Knowledge gap detection | No | Yes (memory_gap + research skill) | No | No | No | No |
| Memory tiers | 2 (truth + timeline) | 4 (evidence → fact → concept → narrative) | 4 (L0-L3) | 4 (observation → narrative) | 1 (flat notes) | 1 (pages) |
| Knowledge model | Compiled truth + timeline | Compiled truth + timeline (tiered) | Palace (wings/halls/rooms) | Knowledge graph | Flat notes | Pages + DBs |
| LongMemEval | Not published | Est. 85-92% R@5 | 96.6% R@5 (raw mode) | 64% Recall@10 | N/A | N/A |
| Air-gapped | No (PGLite OK, but Bun runtime) | Yes (true static binary) | Yes | No (needs MCP client) | Yes | No |
| API keys | None (v0.8) | None | None | None | None | None |
| Backup story | PGLite data dir | cp memory.db | ChromaDB dir | SQLite file | rsync | Cloud |
Open Questions
1. Embed model weights into binary?
Option A: include_bytes!() the ONNX model file into the binary at compile time
- Pros: Truly zero-dependency, no network ever, guaranteed version consistency
- Cons: ~90MB binary, model updates require recompile
Option B: fastembed downloads BGE-small to ~/.cache/fastembed/ on first run (33MB download, then cached)
- Pros: Smaller binary distribution (~55MB vs ~90MB), model updates independent of binary
- Cons: Requires internet on first run, breaks offline/air-gapped deployment
Decision: Ship Option A (embedded weights) as the default — the spec’s core promise is
zero-dependency, offline-first operation. Option B available via --features online-model
for users who prefer smaller binaries and accept the network dependency.
2. sqlite-vec linking mechanism — RESOLVED
Decision: Compile vec0.c directly into rusqlite via a custom build script. sqlite-vec is a single C file (~3K lines) with no external dependencies beyond SQLite itself. The approach:
- Add `vec0.c` and `vec0.h` to `src/vendor/sqlite-vec/`
- In `build.rs`, compile `vec0.c` with `cc::Build` and link it into the rusqlite bundled SQLite
- Register the extension at runtime via `sqlite3_auto_extension` (no `load_extension` call needed)
This produces a single statically linked binary on all targets including musl. The approach is proven by other SQLite extension projects (e.g., sqlean bundles extensions the same way).
Fallback: If vec0.c integration is blocked, pure-Rust cosine similarity (O(n) scan over all embeddings) is acceptable for < 100K chunks. Performance is ~50ms for 50K chunks on Apple Silicon — within the 200ms target.
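The pure-Rust fallback amounts to a brute-force scan: score every stored embedding against the query and keep the best `k`. The sketch below assumes embeddings are already loaded into memory as `f32` vectors keyed by a chunk id; the `cosine` and `top_k` names are hypothetical, and it makes no claim about the timing figures quoted above.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns 0.0 if either vector has zero magnitude.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// O(n) scan: scores every chunk, sorts by descending similarity,
/// and returns the `k` best `(chunk_id, score)` pairs.
fn top_k(query: &[f32], chunks: &[(u64, Vec<f32>)], k: usize) -> Vec<(u64, f32)> {
    let mut scored: Vec<(u64, f32)> = chunks
        .iter()
        .map(|(id, embedding)| (*id, cosine(query, embedding)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Sorting the full score list keeps the sketch short. For larger corpora a bounded heap of size `k` would avoid the O(n log n) sort, though the scan itself remains linear in the number of chunks.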
CI verification: Every release build runs ldd (Linux) / otool -L (macOS) / file to confirm no dynamic dependencies. This is a release gate — see Week 5 benchmarks.
3. Embedding dimension: 384 vs higher
BGE-small-en-v1.5 = 384 dims. BGE-base = 768 dims. BGE-large = 1024 dims.
- Larger = better quality, larger DB, slower search
- 384 is fast and good enough for personal knowledge base retrieval
- Config table and `model` column make it trivial to switch
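The "larger DB" cost can be estimated with simple arithmetic. The sketch below assumes each component is stored as an unquantized 4-byte `f32` and ignores index and row overhead; both are assumptions, and the function names are hypothetical.

```rust
/// Raw bytes needed to store `chunks` embeddings of `dims` dimensions,
/// assuming one f32 (4 bytes) per component and no storage overhead.
fn embedding_bytes(dims: usize, chunks: usize) -> usize {
    dims * std::mem::size_of::<f32>() * chunks
}

/// Converts bytes to decimal megabytes (1 MB = 1,000,000 bytes).
fn megabytes(bytes: usize) -> f64 {
    bytes as f64 / 1_000_000.0
}
```

Under those assumptions, 50K chunks take 76.8 MB of raw vectors at 384 dims, 153.6 MB at 768 dims, and 204.8 MB at 1024 dims, so storage and brute-force scan cost both grow linearly with dimension.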
4. MCP server: which rmcp version?
rmcp is newer and less battle-tested than @modelcontextprotocol/sdk (used in Garry’s spec). Alternatives:
- Implement MCP protocol directly (it’s JSON-RPC 2.0 over stdio, relatively simple)
- Use a thin wrapper layer
5. Ingest command UX
Does quaid ingest <file> attempt to parse entities itself (requiring an LLM call), or is it a pass-through that stores the file and expects the agent (via MCP memory_ingest tool) to do the parsing?
Recommendation: quaid ingest stores the raw source + auto-logs it + runs novelty check. The actual entity extraction and page creation happens via the agent following skills/ingest/SKILL.md. This keeps the binary dumb and the intelligence in markdown.
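The novelty check mentioned in the recommendation can be illustrated with token-set Jaccard similarity, one of the two measures listed in the comparison table. This is a sketch under assumptions: the tokenization rule (lowercased alphanumeric runs), the function names, and any threshold value passed to `is_novel` are hypothetical, and the cosine half of the check is omitted.

```rust
use std::collections::HashSet;

/// Splits text into a set of lowercased alphanumeric tokens.
fn tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Jaccard similarity |A ∩ B| / |A ∪ B| over token sets.
/// Two empty texts are treated as identical (1.0).
fn jaccard(a: &str, b: &str) -> f64 {
    let (ta, tb) = (tokens(a), tokens(b));
    if ta.is_empty() && tb.is_empty() {
        return 1.0;
    }
    let intersection = ta.intersection(&tb).count() as f64;
    let union = ta.union(&tb).count() as f64;
    intersection / union
}

/// A candidate is novel if it stays below `threshold` against every
/// already-stored text.
fn is_novel(candidate: &str, existing: &[&str], threshold: f64) -> bool {
    existing.iter().all(|e| jaccard(candidate, e) < threshold)
}
```

For example, `is_novel("Alice joined Acme.", &["alice joined acme"], 0.9)` is false because the token sets are identical once case and punctuation are stripped, so a re-ingested near-duplicate would be skipped.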
6. Palace room derivation granularity (v2)
Wing derivation from slug is straightforward. Room derivation from section headers is less obvious. Options:
- Section-based: Each `## Header` in compiled_truth becomes a room. Simple but creates many rooms per page.
- Fixed enum: Map to standardised halls (facts, events, discoveries, preferences, advice) like MemPalace. More structured but requires classification.
- Hybrid: Use fixed halls for embedding metadata (which hall does this chunk belong to?), section headers for page-level room. Best of both worlds but more complex.
Recommendation: Start with wing-only filtering (v2.0). Add room-level filtering as v2.1 once we have empirical data on palace filter effectiveness with real corpus.
7. Contradiction detection: heuristic vs LLM-assisted (v2) — RESOLVED
Decision: Heuristic-only in the binary, powered by the assertions table. Agents populate temporal assertions during Tier 2 ingest ({subject, predicate, object, valid_from, valid_until}). Fact changes supersede old assertions (valid_until set). The binary runs pure-SQL consistency checks on current beliefs only (valid_until IS NULL):
- Link vs Assertion: Current assertions where the corresponding link has `valid_until` in the past.
- Cross-page conflicts: Multiple current assertions with same subject+predicate but different objects.
- Temporal staleness: `timeline_updated_at` > `truth_updated_at` by more than 30 days.
LLM-assisted cross-page checks happen via the maintain skill. Binary stays dumb.
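The cross-page conflict check can be sketched in plain Rust over in-memory rows to show the grouping logic. The `Assertion` struct carries only the fields needed here and the function name is hypothetical; the SQL in the comment is an assumed equivalent based on the field names above, not a query taken from the spec.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Subset of the assertion fields; `valid_until == None` means "current".
struct Assertion {
    subject: String,
    predicate: String,
    object: String,
    valid_until: Option<String>,
}

/// Finds (subject, predicate) pairs that currently have more than one
/// distinct object. Assumed SQL equivalent:
///   SELECT subject, predicate FROM assertions
///   WHERE valid_until IS NULL
///   GROUP BY subject, predicate
///   HAVING COUNT(DISTINCT object) > 1;
fn cross_page_conflicts(assertions: &[Assertion]) -> Vec<(String, String, Vec<String>)> {
    let mut current: BTreeMap<(&str, &str), BTreeSet<&str>> = BTreeMap::new();
    // Superseded assertions (valid_until set) are ignored entirely.
    for a in assertions.iter().filter(|a| a.valid_until.is_none()) {
        current
            .entry((a.subject.as_str(), a.predicate.as_str()))
            .or_default()
            .insert(a.object.as_str());
    }
    current
        .into_iter()
        .filter(|(_, objects)| objects.len() > 1)
        .map(|((subject, predicate), objects)| {
            (
                subject.to_string(),
                predicate.to_string(),
                objects.into_iter().map(String::from).collect(),
            )
        })
        .collect()
}
```

Because superseded rows are filtered out first, a fact that changed over time (old assertion closed, new one current) is not reported; only two simultaneously current beliefs with different objects count as a conflict.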
Spec History
| Date | Event |
|---|---|
| 2026-04-05 | Garry Tan drafts an initial v1 memory prototype (TypeScript/Bun, OpenAI embeddings). Inspired by hitting git scaling limits at 7,471 files / 2.3GB. |
| 2026-04-05 | Garry posts spec to GitHub Gist. Architecture: SQLite + FTS5 + vector, thin CLI, fat skills, MCP-first. |
| 2026-04-06 | Initial architecture review of Garry’s spec. Key improvements identified: Rust over TypeScript, local BGE-small-en-v1.5 over OpenAI embeddings, sqlite-vec over pure-JS cosine similarity, RRF over weighted sum. |
| 2026-04-06 | Full standalone spec written (v1). Incorporates Garry’s schema verbatim (adapted for 384-dim vectors), all CLI commands, all skill files. Adds: Rust implementation details, cross-compile CI, embedding decision rationale. |
| 2026-04-08 | Memory research integration (v2). Incorporated techniques from MemPalace (500/500 LongMemEval), UNC AutoResearchClaw/OMNIMEM (+411% F1), agentmemory (92% token reduction), and Obsidian Mind. Seven changes: (1) palace-style hierarchical filtering (+34% retrieval), (2) set-union hybrid search replacing RRF (+44% F1), (3) progressive retrieval with token budgets, (4) selective ingestion via Jaccard/cosine novelty checks, (5) temporal knowledge graph with validity windows, (6) contradiction detection (CLI + MCP), (7) four-tier memory consolidation in ingest skill. Schema version bumped to v2. |
| 2026-04-08 | v3 Architecture Review. Five changes: (1) Exact-Match Short-Circuit (SMS) ensures title/slug matches always rank first, (2) Temporal sub-chunking embeds timeline entries individually instead of as one blob, (3) Assertions table enables pure-SQL heuristic contradiction detection, (4) Strict optimistic concurrency with expected_version on MCP writes, (5) Switched from fastembed/ONNX to pure-Rust candle for true musl static binary. Also: embedding model registry for safe model upgrades, temporal link uniqueness fix, ingest idempotency, semantic round-trip contract, raw_imports table. |
| 2026-04-09 | Work-context entity types. Inspired by Rowboat (rowboatlabs/rowboat) knowledge-graph-vs-wiki insight: added decision, commitment, action_item as first-class page types with templates. Updated ingest skill to extract work-context entities from meetings. Updated briefing skill with commitments due, action items, and “what shifted overnight” report (superseded assertions in last 24h). |
| 2026-04-09 | External review integration. Accepted 11 of 12 findings from adversarial review. Added: Non-Goals section, Phased Delivery (core → intelligence → polish), WAL/single-file honesty (quaid compact), embedded skills with external override + skills doctor, enriched chunk model (content_hash, token_count, heading_path, last_embedded_at), assertion provenance (asserted_by, source_ref, evidence_text), memory_link_close for targeted temporal closure, corpus-reality benchmarks alongside leaderboard benchmarks, palace filtering caveat. Rejected: valid_from = 'unknown' sentinel replacement (alternatives add complexity for marginal benefit). |
| 2026-04-13 | v4 Community Research + Garry v0.8.0 integration. Reviewed new research notes from Apr 9-12. Eight spec changes: (1) knowledge_gaps table + memory_gap/memory_gaps MCP tools for self-improving knowledge base — agent detects what it doesn’t know and logs it for research skill resolution. (2) memory_graph MCP tool + quaid graph CLI for N-hop neighborhood traversal — returns pages + links as JSON for UI/graph visualization. (3) original as first-class page type with template — distinguishes the user’s own thinking from world concepts. (4) Standardised source attribution format with authority hierarchy — inline citation format, URL requirement for social refs, source conflict rules. (5) Filing disambiguation rules in ingest skill — concept vs original vs idea vs project decision tree. (6) Richer person template with optional enrichment sections (What They Believe, Hobby Horses, Trajectory, Network) for Tier 1 contacts. (7) Three new skills: alerts/SKILL.md (interrupt-driven vs scheduled notifications), research/SKILL.md (knowledge gap resolution), upgrade/SKILL.md (agent-guided binary + skill updates inspired by Garry’s v0.8.0 auto-upgrade UX). (8) “Why SQLite over PGLite” design decision + links-table-as-graph-layer positioning. Comparison table updated for Garry v0.8.0 + PGLite. Schema version bumped to v4. |
This spec is designed to stand alone. Everything needed to build Quaid is above — no prior context required. It is explicitly inspired by Garry Tan’s earlier local-memory work while pursuing a Rust + SQLite implementation with different deployment trade-offs. v4 integrates memory research from MemPalace, OMNIMEM, and agentmemory, plus community research and Garry Tan’s v0.8.0 skillpack analysis. Architecture additions: knowledge gap detection, graph traversal, source attribution standards, filing disambiguation, and three new skills (alerts, research, upgrade).