
SKILL.md

name: DuckDB Backup
description: Comprehensive DuckDB database backup toolkit supporting both file-based (cp) and native (ATTACH+COPY) backup approaches. Use when you need to back up DuckDB databases locally or to remote storage, create daily scheduled backups, verify backup integrity, or manage backup retention policies.

DuckDB Backup Skill

Quick Start

Two backup methods are available; choose based on your needs:

File-Based Backup (cp method) - Fast for local backups

python3 scripts/backup_duckdb.py \
  --db data/awsntp.duckdb \
  --backup data/backup-20251223.duckdb \
  --method cp

Native DuckDB Backup (attach method) - For remote/cloud backups

python3 scripts/backup_duckdb.py \
  --db data/awsntp.duckdb \
  --backup data/backup-20251223.duckdb \
  --method attach
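
Under the hood, the attach method corresponds to DuckDB's native ATTACH plus COPY FROM DATABASE statements. The following is a minimal sketch of that idea using the Python duckdb package (DuckDB 0.10+); the bundled backup_duckdb.py may implement it differently:

import duckdb

# Open the source database; its default catalog name is the file stem ("awsntp")
con = duckdb.connect("data/awsntp.duckdb")

# Attach the backup target and copy all schemas, tables, and views into it
con.execute("ATTACH 'data/backup-20251223.duckdb' AS backup")
con.execute("COPY FROM DATABASE awsntp TO backup")
con.execute("DETACH backup")
con.close()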

With Automatic Timestamp Naming

python3 scripts/backup_duckdb.py \
  --db data/awsntp.duckdb \
  --backup data/backups/ \
  --method cp \
  --timestamp
# Creates: data/backups/backup-20251223-153045.duckdb
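
The timestamp suffix follows the YYYYMMDD-HHMMSS pattern shown in the comment above. A small illustrative sketch of how such a name can be built (the script's own naming logic may differ):

from datetime import datetime
from pathlib import Path

backup_dir = Path("data/backups")
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")   # e.g. 20251223-153045
backup_path = backup_dir / f"backup-{stamp}.duckdb"
print(backup_path)                                  # data/backups/backup-20251223-153045.duckdb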

Features

  • Two backup methods: cp (fast, local) and attach (flexible, cloud-ready)
  • Timestamped backups: Automatic unique naming with YYYYMMDD-HHMMSS format
  • Error handling: Comprehensive logging and error reporting
  • Scheduled backups: cron/Task Scheduler integration for daily backups
  • Verification: Tools to verify backup integrity and table counts
  • Retention policies: Scripts to manage backup history and cleanup

When to Use

File-Based (cp method):

  • Daily local backups (fastest approach)
  • Pre-materialization safety snapshots
  • Development/testing environments
  • When speed is critical
  • Local machine backups

Native DuckDB (attach method):

  • Cloud/remote storage backups
  • Cross-environment transfers
  • Database in active use scenarios
  • When you need portable backups
  • Complex backup scenarios

Command Reference

python3 scripts/backup_duckdb.py --db <source> --backup <target> [options]

Arguments:

  • --db (required): Path to source DuckDB database
  • --backup (required): Backup target; a file path, or a directory when combined with --timestamp
  • --method (optional): Backup method - cp (default) or attach
  • --timestamp (optional): Add YYYYMMDD-HHMMSS timestamp to filename

Exit codes:

  • 0: Backup successful
  • 1: Backup failed (check logs)
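
For orientation, the interface above maps onto a plain argparse definition. The sketch below is an assumption about how the script could be wired, not its actual source; run_backup is a hypothetical placeholder:

import argparse
import sys

def run_backup(args) -> bool:
    # Hypothetical placeholder: the real script performs the cp or attach copy here
    return True

parser = argparse.ArgumentParser(description="Back up a DuckDB database")
parser.add_argument("--db", required=True, help="Path to the source DuckDB database")
parser.add_argument("--backup", required=True, help="Backup file path, or a directory with --timestamp")
parser.add_argument("--method", choices=["cp", "attach"], default="cp", help="Backup method")
parser.add_argument("--timestamp", action="store_true", help="Append a YYYYMMDD-HHMMSS timestamp")
args = parser.parse_args()

# Exit 0 on success, 1 on failure, matching the documented exit codes
sys.exit(0 if run_backup(args) else 1)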

Examples

Example 1: One-time Local Backup

python3 scripts/backup_duckdb.py \
  --db ~/LocalRepos/awsntpdagster/data/awsntp.duckdb \
  --backup ~/LocalRepos/awsntpdagster/data/backup-20251223.duckdb \
  --method cp

Example 2: Daily Timestamped Backups to Folder

python3 scripts/backup_duckdb.py \
  --db ~/LocalRepos/awsntpdagster/data/awsntp.duckdb \
  --backup ~/LocalRepos/awsntpdagster/data/backups \
  --method cp \
  --timestamp

Example 3: Scheduled Daily Backup (cron)

Add to crontab:

0 2 * * * python3 /path/to/backup_duckdb.py --db /path/to/awsntp.duckdb --backup /path/to/backups/ --method cp --timestamp >> /var/log/duckdb_backup.log 2>&1

Example 4: Backup with Retention Cleanup

#!/bin/bash
# Backup and keep only 7 most recent
BACKUP_DIR="/path/to/backups"

python3 scripts/backup_duckdb.py \
  --db data/awsntp.duckdb \
  --backup "$BACKUP_DIR" \
  --method cp \
  --timestamp

# Keep only 7 most recent backups
ls -t "$BACKUP_DIR"/backup-*.duckdb | tail -n +8 | xargs rm -f

Backup Strategy Selection

For a 975 MB DuckDB database (awsntp.duckdb):

Scenario                                    Method   Speed     Best For
Daily backup before asset materialization   cp       ~1-2s     Production safety
Weekly archive to external drive            cp       ~1-2s     Local storage
Cloud backup to S3                          attach   ~30-60s   Remote storage
Backup during active queries                attach   N/A       Concurrent access

Advanced Usage

Verify Backup Integrity

# Check backup exists and is readable
duckdb data/backup-20251223-153045.duckdb \
  "SELECT COUNT(*) as table_count FROM information_schema.tables;"

Compare Original and Backup Sizes

ls -lh data/awsntp.duckdb data/backup-*.duckdb | awk '{print $5, $9}'

List All Backups Chronologically

ls -lt data/backups/backup-*.duckdb | head -10

Troubleshooting

Issue: "duckdb: command not found"

  • Install DuckDB CLI or update PATH
  • Use full path to duckdb binary if attach method needed

Issue: "database is locked"

  • Ensure no active connections to source database
  • attach method can work around this (it doesn't lock the database); see the quick check below
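
A quick way to confirm whether another process holds the lock, using the Python duckdb package (a read-only connection attempt typically fails with an IOException if a writer has the file open):

import duckdb

try:
    duckdb.connect("data/awsntp.duckdb", read_only=True).close()
    print("No conflicting write lock detected")
except duckdb.IOException as exc:
    print(f"Database appears to be locked: {exc}")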

Issue: "out of disk space"

  • Check available disk space: df -h
  • Use attach method for cloud storage
  • Implement retention cleanup scripts

Issue: Slow backup

  • For large databases, cp method is fastest
  • Schedule backups during off-peak hours
  • Consider incremental backup strategies

Integration with Your Pipeline

Pre-materialization Backup

# Backup before running expensive assets
python3 scripts/backup_duckdb.py \
  --db data/awsntp.duckdb \
  --backup data/pre-materialization-backup.duckdb \
  --method cp

# Then run your asset materialization
dagster asset materialize -m awsntpdagster.definitions --select awsntp_features_merged_v6
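
The same guard can be scripted in Python using the documented exit codes, for example from a wrapper that runs before materialization. A sketch, assuming the paths and asset name used above:

import subprocess
import sys

# Run the backup and rely on the script's exit code (0 = success, 1 = failure)
result = subprocess.run([
    "python3", "scripts/backup_duckdb.py",
    "--db", "data/awsntp.duckdb",
    "--backup", "data/pre-materialization-backup.duckdb",
    "--method", "cp",
])
if result.returncode != 0:
    sys.exit("Backup failed; refusing to materialize assets")

# Backup succeeded; kick off the expensive materialization
subprocess.run(["dagster", "asset", "materialize",
                "-m", "awsntpdagster.definitions",
                "--select", "awsntp_features_merged_v6"], check=True)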

Scheduled Daily Backup

For your awsntpdagster project:

macOS/Linux crontab:

0 2 * * * cd /Users/zhaoliang/LocalRepos/awsntpdagster && python3 src/awsntpdagster/scripts/backup_duckdb.py --db data/awsntp.duckdb --backup data/backups/ --method cp --timestamp

Reference

For a detailed backup strategy guide, scheduling patterns, and retention policies, see:

Performance Notes

  • cp method: ~1-2 seconds for a 975 MB database (your awsntp.duckdb)
  • attach method: ~30-60 seconds for a 975 MB database
  • Both methods: Minimal CPU usage
  • Disk space required: 1x original database size