name: sql-db-architecture description: Understand the toy relational database architecture, crate organization, data flow, and implementation patterns. Use when exploring the database codebase, understanding query execution, debugging storage/indexing, or planning new features. Keywords: sql-database, database, crate, storage, executor, planner, parser, catalog, WAL, buffer pool, index, architecture
sql-database Architecture Expert
Provides deep understanding of the toy relational database implementation, crate boundaries, and system design.
When to Activate
Use this skill when:
- Exploring how SQL queries flow through the system
- Understanding storage layer (heap files, slotted pages, WAL)
- Debugging parser → planner → executor pipeline
- Working with indexes (B+Tree, Hash, Bitmap, Trie)
- Navigating crate dependencies and boundaries
- Planning new features that span multiple crates
System Overview
sql-database is a minimal RDBMS in Rust implementing:
- SQL subset: CREATE/DROP TABLE/INDEX, INSERT, SELECT, UPDATE, DELETE
- Storage: slotted pages + buffer pool + WAL (redo-only)
- Execution: Volcano model with SeqScan/IndexScan operators
- Indexes: B+Tree (sled), Hash, Bitmap (roaring), Trie
Instructions
1. Identify the Query
When user asks about functionality, map to crates:
"How does SELECT work?" → parser → planner → executor → storage/index → buffer
"Where are tables created?" → catalog (metadata) → storage (heap allocation) → WAL (logging)
"How are indexes used?" → planner (index selection) → executor (IndexScan) → index crate
2. Crate Responsibility Map
Use this mapping to locate relevant code:
| Crate | Responsibility | Key Types |
|---|---|---|
common |
Shared types, errors | Row, RecordBatch, DbError, ColumnId |
types |
SQL types & values | SqlType, Value |
expr |
Expression AST + eval | Expr, BinaryOp, UnaryOp |
parser |
SQL → AST | sqlparser-rs adapter |
planner |
AST → logical → physical | Plan nodes, optimization rules |
executor |
Volcano operators | Exec trait, SeqScan, Filter, Project |
storage |
Heap table, tuples | HeapTable, RecordId, slotted pages |
buffer |
Page cache (LRU) | Pager, PageId, TableId |
wal |
Write-ahead log | WalRecord, append/replay |
catalog |
Schema metadata | TableMeta, IndexMeta, TableSchema |
index |
All index types | Index trait, BTree/Hash/Bitmap/Trie |
repl |
CLI shell | rustyline, tabled rendering |
testsupport |
Test fixtures | run_sql_script, snapshots |
3. Data Flow Patterns
Query Execution (SELECT):
SQL string
→ parser (sqlparser-rs → AST)
→ planner (logical plan → physical plan with index selection)
→ executor (Volcano operators: Scan → Filter → Project)
→ storage/index (fetch rows via RecordId)
→ buffer (page cache lookup/fetch)
→ REPL (format RecordBatch with tabled)
Write Path (INSERT/UPDATE/DELETE):
SQL string
→ parser → planner → executor
→ WAL (append WalRecord, fsync)
→ storage (heap table modification)
→ index (maintain all indexes on affected columns)
→ buffer (mark pages dirty)
Recovery:
Startup
→ WAL replay (read WalRecord log)
→ storage (re-apply operations)
→ catalog (restore table/index metadata)
4. Code Navigation Strategy
Finding implementations:
- Trait definitions: Search crate root (e.g.,
executor/lib.rsforExec) - Concrete operators: Look in
<crate>/src/<operator>.rs - Tests: Adjacent
.rsfiles ortests/directory - Integration:
testsupport/for end-to-end scripts
Dependency order (from CLAUDE.md):
common ← types ← expr
↓
parser
↓
catalog
↓
buffer ← storage ← wal
↓
planner
↓
index
↓
executor
↓
repl
5. Key Design Principles
From the design doc:
- Tiny interfaces - Prefer small traits (
Exec,Index,HeapTable,Pager) - No implicit coercion - Same-type comparisons only (v1)
- WAL-first writes - (1) append WAL (2) fsync (3) apply to storage
- Rule-based planning - Predicate pushdown + index selection
- Single writer - No transactions/MVCC in v1
- Snapshot tests - Use
instafor query output verification
6. Response Format
When explaining architecture:
Component Overview:
- Purpose in 1 sentence
- Key types/traits
- Dependencies (what it calls)
- Dependents (what calls it)
Code References:
- Use
crate/path/file.rs:lineformat - Link related components
- Show trait → impl relationships
Examples:
- Provide concrete SQL queries
- Show expected output format
- Reference existing tests when available
Project Context
- Language: Rust (workspace with 13 crates)
- Build:
cargo check/test/fmt/clippy - Coverage:
scripts/coverage.sh(llvm-cov) - Testing: Unit tests inline, integration in
tests/, snapshots withinsta - Dependencies: Pinned in workspace
Cargo.tomlwithworkspace = true
Common Questions
"How does index selection work?"
→ Check planner/ for rules matching WHERE predicates to available indexes
"Where is Row serialized?"
→ storage/ uses bincode for tuple layout in slotted pages
"How does the buffer pool work?"
→ buffer/ implements LRU cache over PageId, backed by file segments
"What SQL is supported?"
→ See SKILL.md header or parser/ for DDL/DML subset
"How to add a new operator?"
→ Implement Exec trait in executor/, wire into planner physical node generation
Tool Usage
- Grep: Find trait implementations, error types, specific SQL keywords
- Read: Examine crate root
lib.rs, trait definitions, test fixtures - Glob: Locate all operators (
executor/**/*.rs), test files (**/tests/*.rs)
For detailed API signatures and implementation notes, see REFERENCE.md.