| name | notebooks-back-end |
| description | Use when working with DuckDB databases, Makefiles, or building/deploying notebooks. Triggers on DuckDB queries, database creation, Makefile editing, make targets (build, data, etl), GitHub Actions workflows, CI/CD, and creating new notebook repositories. |
Notebook build and deployment
Makefile targets
Script philosophy: Prefer shell scripts (scripts/*.sh) with Unix tools (curl, jq, sed) for data fetching, and DuckDB SQL (via duckdb data.duckdb < query.sql) for data processing. Only use Python in unusual cases where shell scripts genuinely can't do the job.
Every notebook should define two data targets:
| Target | Purpose | Where |
|---|---|---|
make etl |
Expensive computation (large downloads, model training, heavy processing) | Local only |
make data |
Lightweight refresh (fetch artifacts, run analysis, export for notebook) | GitHub Actions |
Simple notebook:
.PHONY: build preview etl data clean
build:
yarn build
preview:
yarn preview
etl: data
data:
./scripts/fetch.sh
duckdb data/data.duckdb < scripts/transform.sql
clean:
rm -rf docs/.observable/dist data/data.duckdb
Complex notebook (with heavy ETL uploaded to GitHub Releases):
.PHONY: build preview etl data clean
build:
yarn build
preview:
yarn preview
etl: data/infrastructure.duckdb
data/infrastructure.duckdb:
./scripts/build_infra.sh
data:
gh release download latest -p infrastructure.duckdb.gz -D data --clobber
gunzip -f data/infrastructure.duckdb.gz
duckdb data/data.duckdb < scripts/export.sql
clean:
rm -rf docs/.observable/dist data/data.duckdb
Usage:
make preview- local dev server with hot reload (http://localhost:3000)make build- compile todocs/.observable/dist/make etl- run expensive local computation (manual, infrequent)make data- lightweight data refresh (runs in GitHub Actions)make clean- remove build artifacts
Build process
Compiles docs/index.html into standalone page:
- Parse
<notebook>element - Compile JS cells to modules
- Bundle dependencies
- Apply
template.html - Output to
docs/.observable/dist/
Important: SQL cells query at build time. Database needed for build, not deployment (results embedded in HTML).
GitHub Actions deployment
Each notebook repo has a minimal deploy.yml that calls a shared reusable workflow:
name: Deploy notebook
on:
schedule:
- cron: '0 6 1 * *' # Monthly - adjust per repo
workflow_dispatch:
push:
branches: [main]
jobs:
deploy:
uses: data-desk-eco/.github/.github/workflows/notebook-deploy.yml@main
permissions:
contents: write
pages: write
id-token: write
secrets: inherit
The reusable workflow handles:
- Checkout and setup (Node, Yarn, DuckDB)
- Download shared
template.htmland.claude/(includes skills and shared CLAUDE.md) - Run
make data - Commit any changes
- Run
make build - Deploy to GitHub Pages
Pages setup: Settings → Pages → Source: GitHub Actions
Skip data step: For notebooks without a data target:
jobs:
deploy:
uses: data-desk-eco/.github/.github/workflows/notebook-deploy.yml@main
with:
skip_data: true
# ...
Creating a new notebook
- Use
data-desk-eco.github.ioas GitHub template - Enable Pages (Settings → Pages → Source: GitHub Actions)
- Clone:
git clone [url] && cd [repo] && yarn - Preview:
make preview - Edit
docs/index.html - Push - deploys to
https://research.datadesk.eco/[repo-name]/
Auto-updating files
These files download from the .github repo on each deploy:
template.html- HTML wrapper.claude/- Claude Code skills and shared instructions (.claude/CLAUDE.md)
Don't edit these locally - changes will be overwritten.
Project-specific instructions: Create a root CLAUDE.md in your notebook repo for project-specific context. This file won't be overwritten and should be committed.