name	notebooks-back-end
description	Use when working with DuckDB databases, Makefiles, or building/deploying notebooks. Triggers on DuckDB queries, database creation, Makefile editing, make targets (build, data, etl), GitHub Actions workflows, CI/CD, and creating new notebook repositories.

Notebook build and deployment

Makefile targets

Script philosophy: Prefer shell scripts (scripts/*.sh) with Unix tools (curl, jq, sed) for data fetching, and DuckDB SQL (via duckdb data.duckdb < query.sql) for data processing. Only use Python in unusual cases where shell scripts genuinely can't do the job.
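
As a rough illustration of this split, a fetch script might look like the sketch below. The function name fetch_csv, the example URL, and the data/raw/ path are hypothetical placeholders, not part of any existing notebook; only curl, tr, and sed are used, and processing is left to DuckDB.

```shell
# Hypothetical helper for scripts/fetch.sh (names and paths are illustrative).
# Downloads a CSV, strips carriage returns and drops blank lines.
fetch_csv() {
  url=$1
  out=$2
  mkdir -p "$(dirname "$out")" || return 1
  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects.
  # Download to a temp file first so a failed request never truncates $out.
  curl -fsSL "$url" -o "$out.tmp" || return 1
  tr -d '\r' < "$out.tmp" | sed '/^$/d' > "$out"
  rm -f "$out.tmp"
}

# Usage (not executed here; the URL is a placeholder):
#   fetch_csv "https://example.com/records.csv" data/raw/records.csv
#   duckdb data/data.duckdb < scripts/transform.sql
```

Downloading to a temporary file keeps the curl exit status visible without relying on shell-specific pipeline options.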

Every notebook should define two data targets:

Target	Purpose	Where
`make etl`	Expensive computation (large downloads, model training, heavy processing)	Local only
`make data`	Lightweight refresh (fetch artifacts, run analysis, export for notebook)	GitHub Actions

Simple notebook:

.PHONY: build preview etl data clean

build:
	yarn build

preview:
	yarn preview

etl: data

data:
	./scripts/fetch.sh
	duckdb data/data.duckdb < scripts/transform.sql

clean:
	rm -rf docs/.observable/dist data/data.duckdb

Complex notebook (with heavy ETL uploaded to GitHub Releases):

.PHONY: build preview etl data clean

build:
	yarn build

preview:
	yarn preview

etl: data/infrastructure.duckdb

data/infrastructure.duckdb:
	./scripts/build_infra.sh

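# "latest" below is read by gh as a release tag name; gh picks the newest release only when the tag is omitted
data: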
	gh release download latest -p infrastructure.duckdb.gz -D data --clobber
	gunzip -f data/infrastructure.duckdb.gz
	duckdb data/data.duckdb < scripts/export.sql

clean:
	rm -rf docs/.observable/dist data/data.duckdb
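
The Makefile above only shows the download side of the release artifact. A minimal sketch of the matching upload step, assuming a release tagged latest already exists and that the function name publish_infra is a hypothetical helper (for example the last step of scripts/build_infra.sh), could be:

```shell
# Hypothetical publish step to run after `make etl` (not part of the Makefile above).
publish_infra() {
  db=${1:-data/infrastructure.duckdb}
  if [ ! -f "$db" ]; then
    echo "error: $db not found; run 'make etl' first" >&2
    return 1
  fi
  # Write a compressed copy next to the database, keeping the original.
  gzip -c "$db" > "$db.gz" || return 1
  # Attach the archive to the release tagged "latest" (assumed to exist),
  # replacing any asset with the same name.
  gh release upload latest "$db.gz" --clobber
}

# Usage (not executed here):
#   make etl && publish_infra
```

The asset name produced here, infrastructure.duckdb.gz, matches the pattern that make data passes to gh release download.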

Usage:

make preview - local dev server with hot reload (http://localhost:3000)
make build - compile to docs/.observable/dist/
make etl - run expensive local computation (manual, infrequent)
make data - lightweight data refresh (runs in GitHub Actions)
make clean - remove build artifacts

Build process

make build compiles docs/index.html into a standalone page in five steps:

Parse <notebook> element
Compile JS cells to modules
Bundle dependencies
Apply template.html
Output to docs/.observable/dist/

Important: SQL cells run their queries at build time, so the database must be present when make build runs. It is not needed at deployment, because the query results are embedded in the HTML.
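
Because a missing database only surfaces once the build reaches a SQL cell, a small guard before the build can give a clearer error. The sketch below is an assumption rather than part of the existing tooling: the function name require_db is hypothetical, and the default path simply mirrors the data/data.duckdb used in the Makefile examples.

```shell
# Hypothetical pre-build guard (not part of the shared workflow).
# Fails if the database file is missing or empty.
require_db() {
  db=${1:-data/data.duckdb}
  if [ ! -s "$db" ]; then
    echo "error: $db is missing or empty; run 'make data' before 'make build'" >&2
    return 1
  fi
}

# Usage (not executed here):
#   require_db && make build
```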

GitHub Actions deployment

Each notebook repo has a minimal deploy.yml that calls a shared reusable workflow:

name: Deploy notebook

on:
  schedule:
    - cron: '0 6 1 * *'  # Monthly - adjust per repo
  workflow_dispatch:
  push:
    branches: [main]

jobs:
  deploy:
    uses: data-desk-eco/.github/.github/workflows/notebook-deploy.yml@main
    permissions:
      contents: write
      pages: write
      id-token: write
    secrets: inherit

The reusable workflow handles:

Checkout and setup (Node, Yarn, DuckDB)
Download shared template.html and .claude/ (includes skills and shared CLAUDE.md)
Run make data
Commit any changes
Run make build
Deploy to GitHub Pages
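
The reusable workflow itself is not reproduced here. As a rough local approximation of its data, commit, and build steps, something like the following sketch could be used for a dry run; the function name commit_if_changed and the default commit message are assumptions, not taken from the workflow.

```shell
# Hypothetical helper approximating the "Commit any changes" step.
# Stages everything and commits only when something is actually staged.
commit_if_changed() {
  msg=${1:-"Update data"}
  git add -A || return 1
  # `git diff --cached --quiet` exits 0 when nothing is staged.
  if git diff --cached --quiet; then
    echo "no changes to commit"
  else
    git commit -m "$msg"
  fi
}

# Usage (not executed here), mirroring the order of the steps listed above:
#   make data
#   commit_if_changed "Update data"
#   make build
```

Checking the staged diff first avoids a failing git commit when make data produces no changes.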

Pages setup: Settings → Pages → Source: GitHub Actions

Skip data step: for notebooks without a data target, pass skip_data: true to the reusable workflow:

jobs:
  deploy:
    uses: data-desk-eco/.github/.github/workflows/notebook-deploy.yml@main
    with:
      skip_data: true
    # ...
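
To decide whether a repo needs skip_data, one option is to check the Makefile for a data target. The sketch below is a simple heuristic under stated assumptions: the function name has_data_target is hypothetical, and it only recognises a rule that starts a line with data: on its own (not multi-target rules or included makefiles).

```shell
# Hypothetical check: does the Makefile define a `data` target?
# Matches "data:" at the start of a line, but not "data/infrastructure.duckdb:"
# and not a variable assignment such as "data := ...".
has_data_target() {
  makefile=${1:-Makefile}
  grep -Eq '^data[[:space:]]*:([^=]|$)' "$makefile"
}

# Usage (not executed here):
#   if has_data_target; then echo "skip_data: false"; else echo "skip_data: true"; fi
```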

Creating a new notebook

Use data-desk-eco.github.io as a GitHub template
Enable Pages (Settings → Pages → Source: GitHub Actions)
Clone: git clone [url] && cd [repo] && yarn
Preview: make preview
Edit docs/index.html
Push - deploys to https://research.datadesk.eco/[repo-name]/
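
The clone and preview steps can be scripted. The helpers below are a small sketch, assuming the local directory and the published path both match the repository name in the clone URL; repo_name and deploy_url are hypothetical names, and the clone URL in the usage comment is a placeholder.

```shell
# Hypothetical helpers for setting up a new notebook checkout.

# Directory name that `git clone` creates for a given clone URL.
repo_name() {
  basename "$1" .git
}

# Address the notebook is published at after a push.
deploy_url() {
  printf 'https://research.datadesk.eco/%s/\n' "$(repo_name "$1")"
}

# Usage (not executed here; the clone URL is a placeholder):
#   url=git@github.com:data-desk-eco/my-notebook.git
#   git clone "$url" && cd "$(repo_name "$url")" && yarn
#   make preview
#   deploy_url "$url"
```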

Auto-updating files

These files are downloaded from the .github repo on each deploy:

template.html - HTML wrapper
.claude/ - Claude Code skills and shared instructions (.claude/CLAUDE.md)

Don't edit these locally - changes will be overwritten.

Project-specific instructions: Create a root CLAUDE.md in your notebook repo for project-specific context. This file won't be overwritten and should be committed.
