SKILL.md

---
name: data-warehousing
description: Snowflake, BigQuery, Redshift, dimensional modeling, and modern data warehouse architecture
sasmp_version: 1.3.0
bonded_agent: 01-data-engineer
bond_type: PRIMARY_BOND
skill_version: 2.0.0
last_updated: 2025-01
complexity: intermediate
estimated_mastery_hours: 130
prerequisites: sql-databases
unlocks: big-data, etl-tools
---

Data Warehousing

Production-grade data warehouse design with Snowflake, BigQuery, and dimensional modeling patterns.

Quick Start

-- Snowflake Modern Data Warehouse Setup
CREATE WAREHOUSE analytics_wh
    WITH WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 300
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 4;

-- Create dimensional model
CREATE TABLE marts.fact_orders (
    order_key BIGINT AUTOINCREMENT PRIMARY KEY,
    date_key INT NOT NULL REFERENCES dim_date(date_key),
    customer_key INT NOT NULL,
    product_key INT NOT NULL,
    quantity INT NOT NULL,
    unit_price DECIMAL(10,2) NOT NULL,
    total_amount DECIMAL(12,2) NOT NULL,
    _loaded_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
) CLUSTER BY (date_key);

-- Dimension with SCD Type 2
CREATE TABLE marts.dim_customer (
    customer_key INT AUTOINCREMENT PRIMARY KEY,
    customer_id VARCHAR(50) NOT NULL,
    customer_name VARCHAR(255),
    segment VARCHAR(50),
    valid_from DATE NOT NULL,
    valid_to DATE DEFAULT '9999-12-31',
    is_current BOOLEAN DEFAULT TRUE
);

Core Concepts

1. Dimensional Modeling (Kimball)

-- Star Schema Design
-- Fact table: measurable business events
-- Dimension tables: context for analysis

-- Date dimension (conformed)
CREATE TABLE dim_date (
    date_key INT PRIMARY KEY,
    full_date DATE NOT NULL,
    day_of_week INT,
    day_name VARCHAR(10),
    month_num INT,
    month_name VARCHAR(10),
    quarter INT,
    year INT,
    is_weekend BOOLEAN,
    fiscal_year INT,
    fiscal_quarter INT
);
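A conformed date dimension is usually generated rather than loaded from a source system. A minimal Snowflake sketch using the `GENERATOR` table function (the fiscal columns assume a calendar-aligned fiscal year; adjust for your fiscal calendar):

-- Populate dim_date for 2020-2029 (~3,653 rows)
INSERT INTO dim_date
SELECT
    TO_NUMBER(TO_CHAR(d, 'YYYYMMDD')) AS date_key,
    d                                 AS full_date,
    DAYOFWEEK(d)                      AS day_of_week,
    DAYNAME(d)                        AS day_name,
    MONTH(d)                          AS month_num,
    MONTHNAME(d)                      AS month_name,
    QUARTER(d)                        AS quarter,
    YEAR(d)                           AS year,
    DAYOFWEEK(d) IN (0, 6)            AS is_weekend,  -- assumes default WEEK_START (0 = Sunday)
    YEAR(d)                           AS fiscal_year,     -- assumption: fiscal year = calendar year
    QUARTER(d)                        AS fiscal_quarter
FROM (
    SELECT DATEADD(DAY, SEQ4(), '2020-01-01'::DATE) AS d
    FROM TABLE(GENERATOR(ROWCOUNT => 3653))
);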

-- SCD Type 2, step 1: expire changed rows and insert brand-new customers
MERGE INTO dim_customer AS target
USING staging_customer AS source
ON target.customer_id = source.customer_id AND target.is_current = TRUE
WHEN MATCHED AND (
    target.customer_name != source.customer_name OR
    target.segment != source.segment
) THEN UPDATE SET valid_to = CURRENT_DATE - 1, is_current = FALSE
WHEN NOT MATCHED THEN INSERT (
    customer_id, customer_name, segment, valid_from
) VALUES (
    source.customer_id, source.customer_name, source.segment, CURRENT_DATE
);

-- Step 2: re-insert the new version of each row expired above.
-- The MERGE alone is incomplete: a changed customer gets its old row
-- closed but never gets a fresh current row without this step.
INSERT INTO dim_customer (customer_id, customer_name, segment, valid_from)
SELECT s.customer_id, s.customer_name, s.segment, CURRENT_DATE
FROM staging_customer s
WHERE NOT EXISTS (
    SELECT 1 FROM dim_customer d
    WHERE d.customer_id = s.customer_id AND d.is_current = TRUE
);

2. Snowflake Optimization

-- Clustering for performance
ALTER TABLE fact_orders CLUSTER BY (date_key, customer_key);
SELECT SYSTEM$CLUSTERING_INFORMATION('fact_orders');

-- Materialized views for aggregations
CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT date_key, SUM(total_amount) AS daily_revenue, COUNT(*) AS order_count
FROM fact_orders GROUP BY date_key;

-- Search optimization for selective point lookups
-- (fact_orders has no order_id column; use a column that exists in the DDL above)
ALTER TABLE fact_orders ADD SEARCH OPTIMIZATION ON EQUALITY(customer_key);

-- Time travel for debugging
SELECT * FROM fact_orders AT(TIMESTAMP => '2024-01-15 10:00:00'::TIMESTAMP);

-- Zero-copy cloning
CREATE TABLE fact_orders_dev CLONE fact_orders;
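Cost control pairs with these features: auto-suspend only helps if you watch what each warehouse actually burns. A sketch against the `SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY` view (data lags up to a few hours and requires ACCOUNTADMIN or granted access):

-- Credits consumed per warehouse over the last 7 days
SELECT warehouse_name,
       SUM(credits_used) AS credits_7d
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(DAY, -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY credits_7d DESC;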

3. BigQuery Patterns

-- Partitioned and clustered table
CREATE TABLE `project.dataset.fact_events`
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
OPTIONS (partition_expiration_days = 365, require_partition_filter = TRUE)
AS SELECT * FROM source_events;

-- Efficient query with partition pruning
SELECT event_type, COUNT(*) AS event_count
FROM `project.dataset.fact_events`
WHERE DATE(event_timestamp) BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY event_type;

-- BigQuery ML inline
CREATE OR REPLACE MODEL `project.dataset.churn_model`
OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['churned'])
AS SELECT tenure_months, monthly_spend, churned FROM customer_features;
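Once trained, the model is queried with `ML.PREDICT`, which appends a `predicted_<label>` column (plus per-class probabilities) to the input rows. A sketch scoring the same feature table; the `customer_id` column is assumed for illustration:

-- Score customers with the trained churn model
SELECT customer_id, predicted_churned
FROM ML.PREDICT(
    MODEL `project.dataset.churn_model`,
    (SELECT customer_id, tenure_months, monthly_spend FROM customer_features)
);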

Tools & Technologies

| Tool | Purpose | Version (2025) |
|------|---------|----------------|
| Snowflake | Cloud data warehouse | Latest |
| BigQuery | Serverless analytics | Latest |
| Redshift | AWS data warehouse | Serverless |
| Databricks SQL | Lakehouse analytics | Latest |
| dbt | Transformation | 1.7+ |
| Monte Carlo | Data observability | Latest |

Troubleshooting Guide

| Issue | Symptoms | Root Cause | Fix |
|-------|----------|------------|-----|
| Slow query | Query timeout | No clustering | Add a clustering key |
| High cost | Budget exceeded | Oversized warehouse | Auto-suspend and right-size |
| Data skew | Uneven processing | Poor partition key | Choose a better partition key |
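In Snowflake, the "slow query" row above can be diagnosed before and after adding a clustering key; `SYSTEM$CLUSTERING_DEPTH` reports how many overlapping micro-partitions a scan must touch:

-- Average clustering depth of fact_orders on date_key
-- (lower is better; a steadily rising depth means reclustering is lagging)
SELECT SYSTEM$CLUSTERING_DEPTH('fact_orders', '(date_key)');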

Best Practices

-- ✅ DO: Use surrogate keys
customer_key INT AUTOINCREMENT PRIMARY KEY

-- ✅ DO: Add audit columns
_loaded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP()

-- ✅ DO: Cluster on filter columns
CLUSTER BY (date_key)

-- ❌ DON'T: Use natural keys as PK
-- ❌ DON'T: SELECT * in production


Skill Certification Checklist:

  • Can design star/snowflake schemas
  • Can implement SCD Type 2 dimensions
  • Can optimize with clustering/partitioning
  • Can monitor and optimize costs