SKILL.md

---
name: ai-ml-data-science
description: "End-to-end data science patterns (modern best practices): problem framing → data → EDA → feature engineering (with feature stores) → modelling → evaluation → reporting, plus SQL transformation (SQLMesh). Emphasizes MLOps integration, drift monitoring, and production-ready workflows."
---

Data Science Engineering Suite – Quick Reference

This skill turns raw data and questions into validated, documented models ready for production:

  • EDA workflows: Structured exploration with drift detection
  • Feature engineering: Feature store patterns (commonly the largest share of DS time)
  • Model selection: LightGBM-first approach (a strong default for tabular data)
  • Evaluation & reporting: Slice analysis, model cards, production metrics
  • SQL transformation: SQLMesh for staging/intermediate/marts layers
  • MLOps: CI/CD, CT (continuous training), CM (continuous monitoring)
  • Production patterns: Data contracts, lineage, feedback loops, streaming features

Modern emphasis (December 2025): Feature stores, automated retraining, drift monitoring, and train-serve parity. Tools: LightGBM 4.6, scikit-learn 1.7, PyTorch 2.9.1, Polars 1.x.


Quick Reference

| Task | Tool/Framework | Command | When to Use |
|---|---|---|---|
| EDA & Profiling | Pandas, Great Expectations | `df.describe()`, `ge.validate()` | Initial data exploration and quality checks |
| Feature Engineering | Pandas, Polars, Feature Stores | `df.transform()`, Feast materialization | Creating lag, rolling, categorical features |
| Model Training | LightGBM 4.6, XGBoost, scikit-learn 1.7 | `lgb.train()`, `xgb.train()`, `model.fit()` | Tabular ML with structured data |
| Hyperparameter Tuning | Optuna, Ray Tune | `optuna.create_study()`, `tune.run()` | Optimizing model parameters |
| SQL Transformation | SQLMesh | `sqlmesh plan`, `sqlmesh run` | Building staging/intermediate/marts layers |
| Experiment Tracking | MLflow, W&B | `mlflow.log_metric()`, `wandb.log()` | Versioning experiments and models |
| Model Evaluation | scikit-learn, custom metrics | `metrics.roc_auc_score()`, slice analysis | Validating model performance |

When to Use This Skill

Claude should invoke this skill when the user asks for hands-on DS/ML workflow help, e.g.:

  • "Help me explore this dataset / find issues."
  • "Design features for this problem."
  • "Choose metrics and evaluate this model."
  • "Write a model evaluation report or model card."
  • "Structure an end-to-end DS project."
  • "Set up SQL transformations with SQLMesh."
  • "Build incremental feature pipelines in SQL."
  • "Create staging/intermediate/marts layers."
  • "Set up MLOps: CI/CD, continuous training, monitoring."

Data Lake & Lakehouse

For comprehensive data lake/lakehouse patterns (beyond SQLMesh transformation), see data-lake-platform:

  • Table formats: Apache Iceberg, Delta Lake, Apache Hudi
  • Query engines: ClickHouse, DuckDB, Apache Doris, StarRocks
  • Alternative transformation: dbt (alternative to SQLMesh)
  • Ingestion: dlt, Airbyte (connectors)
  • Streaming: Apache Kafka patterns
  • Orchestration: Dagster, Airflow

This skill focuses on ML feature engineering and modeling. Use data-lake-platform for general-purpose data infrastructure.


Related Skills

For adjacent topics, reference:


Decision Tree: Choosing Data Science Approach

User needs ML for: [Problem Type]
    ├─ Tabular Data?
    │   ├─ Small-Medium (<1M rows)? → **LightGBM** (fast, efficient)
    │   ├─ Large & Complex (>1M rows)? → **LightGBM** first, then NN if needed
    │   └─ High-dim sparse (text, counts)? → Linear models, then shallow NN
    │
    ├─ Time Series?
    │   ├─ Seasonality? → LightGBM, then see ai-ml-timeseries skill
    │   └─ Long-term dependencies? → Transformers (ai-ml-timeseries)
    │
    ├─ Text or Mixed Modalities?
    │   └─ LLMs/Transformers → See ai-llm
    │
    └─ SQL Transformations?
        └─ SQLMesh (staging/intermediate/marts layers)

Key Finding: Tree-based methods such as LightGBM typically deliver top performance on tabular data, with a significant computational-efficiency advantage over deep learning alternatives.


Core Patterns (Overview)

Pattern 1: End-to-End DS Project Lifecycle

Use when: Starting or restructuring any DS/ML project.

Stages:

  1. Problem framing - Business objective, success metrics, baseline
  2. Data & feasibility - Sources, coverage, granularity, label quality
  3. EDA & data quality - Schema, missingness, outliers, leakage checks
  4. Feature engineering - Per data type with feature store integration
  5. Modelling - Baselines first, then LightGBM, then complexity as needed
  6. Evaluation - Offline metrics, slice analysis, error analysis
  7. Reporting - Model evaluation report + model card
  8. MLOps - CI/CD, CT (continuous training), CM (continuous monitoring)
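
The EDA & data quality stage above can be sketched as a handful of programmatic checks. A minimal illustration using pandas on a toy frame (the schema and IQR threshold are hypothetical, not part of this skill's templates):

```python
import pandas as pd

# Toy frame standing in for a raw extract (hypothetical schema).
df = pd.DataFrame({
    "user_id": [1, 2, 2, 3, 4],
    "amount": [10.0, None, None, 250.0, 12.5],
    "country": ["DE", "DE", "DE", "US", None],
    "label": [0, 0, 0, 1, 0],
})

# Schema check: dtypes per column.
schema = df.dtypes.astype(str).to_dict()

# Missingness: fraction of nulls per column.
missing = df.isna().mean().to_dict()

# Exact row duplicates -- often a symptom of a bad join upstream.
n_dupes = int(df.duplicated().sum())

# Simple IQR outlier rule on a numeric column.
q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["amount"] < q1 - 1.5 * iqr) | (df["amount"] > q3 + 1.5 * iqr)]

print(schema)
print(missing, n_dupes, len(outliers))
```

In a real project these checks graduate into Great Expectations suites so they run on every load, not just during exploration.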

Detailed guide: EDA Best Practices


Pattern 2: Feature Engineering

Use when: Designing features before modelling or during model improvement.

By data type:

  • Numeric: Standardize, handle outliers, transform skew, scale
  • Categorical: One-hot/ordinal (low cardinality), target/frequency/hashing (high cardinality)
    • Feature Store Integration: Store encoders, mappings, statistics centrally
  • Text: Cleaning, TF-IDF, embeddings, simple stats
  • Time: Calendar features, recency, rolling/lag features

Key Modern Practice: Use feature stores (Feast, Tecton, Databricks) for versioning, sharing, and train-serve parity.
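
The time-feature bullets above can be sketched in pandas (the event-log schema is hypothetical); note the `shift(1)` before any rolling aggregate so each feature uses only past values, which is what preserves train-serve parity:

```python
import pandas as pd

# Toy event log (hypothetical schema): one row per user per day.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "date": pd.to_datetime(["2025-01-01", "2025-01-02", "2025-01-03",
                            "2025-01-01", "2025-01-02"]),
    "amount": [10.0, 20.0, 30.0, 5.0, 7.0],
}).sort_values(["user_id", "date"])

g = events.groupby("user_id")["amount"]

# Lag feature: previous day's amount (shift avoids target leakage).
events["amount_lag_1"] = g.shift(1)

# Rolling feature: 2-day mean over *past* values only.
events["amount_roll_mean_2"] = g.transform(lambda s: s.shift(1).rolling(2).mean())

# Calendar feature.
events["dow"] = events["date"].dt.dayofweek
```

The same transformation logic would then be registered in the feature store so online serving computes identical values.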

Detailed guide: Feature Engineering Patterns


Pattern 3: Data Contracts & Lineage

Use when: Building production ML systems with data quality requirements.

Components:

  • Contracts: Schema + ranges/nullability + freshness SLAs
  • Lineage: Track source → feature store → train → serve
  • Feature store hygiene: Materialization cadence, backfill/replay, encoder versioning
  • Schema evolution: Backward/forward-compatible migrations with shadow runs
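
A data contract can start as a dict of per-column rules checked on every load. A pandas sketch (in production this role is usually played by Great Expectations or Pandera; the table, columns, and ranges here are hypothetical):

```python
import pandas as pd

# Minimal contract: expected dtype, nullability, and value range per
# column (hypothetical payments table).
CONTRACT = {
    "user_id":  {"dtype": "int64",   "nullable": False, "range": None},
    "amount":   {"dtype": "float64", "nullable": False, "range": (0.0, 1e6)},
    "currency": {"dtype": "object",  "nullable": False, "range": None},
}

def validate(df, contract):
    """Return a list of contract violations (empty list = pass)."""
    errors = []
    for col, rules in contract.items():
        if col not in df.columns:
            errors.append(f"missing column: {col}")
            continue
        if str(df[col].dtype) != rules["dtype"]:
            errors.append(f"{col}: dtype {df[col].dtype} != {rules['dtype']}")
        if not rules["nullable"] and df[col].isna().any():
            errors.append(f"{col}: unexpected nulls")
        if rules["range"] is not None:
            lo, hi = rules["range"]
            if not df[col].between(lo, hi).all():
                errors.append(f"{col}: values outside [{lo}, {hi}]")
    return errors

good = pd.DataFrame({"user_id": [1, 2], "amount": [9.99, 50.0], "currency": ["EUR", "USD"]})
bad = good.assign(amount=[-5.0, None])

print(validate(good, CONTRACT))
```

Freshness SLAs extend the same idea: add a max-age rule per table and alert when the latest partition breaches it.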

Detailed guide: Data Contracts & Lineage


Pattern 4: Model Selection & Training

Use when: Picking model families and starting experiments.

Decision guide (modern benchmarks):

  • Tabular: Start with LightGBM (best performance + efficiency)
  • Baselines: Always implement simple baselines first (majority class, mean, naive forecast)
  • Train/val/test splits: Time-based (forecasting), group-based (user/item leakage), or random (IID)
  • Hyperparameter tuning: Start manual, then Bayesian optimization (Optuna, Ray Tune)
  • Overfitting control: Regularization, early stopping, cross-validation

Detailed guide: Modelling Patterns


Pattern 5: Evaluation & Reporting

Use when: Finalizing a model candidate or handing over to production.

Key components:

  • Metric selection: Primary (ROC-AUC, PR-AUC, RMSE) + guardrails (calibration, fairness)
  • Threshold selection: ROC/PR curves, cost-sensitive, F1 maximization
  • Slice analysis: Performance by geography, user segments, product categories
  • Error analysis: Collect high-error examples, cluster by error type, identify systematic failures
  • Evaluation report: 8-section report (objective, data, features, models, metrics, slices, risks, recommendation)
  • Model card: Documentation for stakeholders (intended use, data, performance, ethics, operations)
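
Slice analysis reduces to computing the primary metric per segment and flagging slices well below the overall number. A sketch with simulated scores (the segment names and the 0.05 gap threshold are illustrative assumptions):

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
# Hypothetical scored dataset: label, slicing column, and a model score
# simulated to be strong in EU/US but weak in APAC.
df = pd.DataFrame({
    "y_true": rng.integers(0, 2, n),
    "segment": rng.choice(["EU", "US", "APAC"], n),
})
signal = np.where(df["segment"] == "APAC", 0.2, 2.0)
df["y_score"] = df["y_true"] * signal + rng.normal(0, 1, n)

# Primary metric per slice.
slice_auc = df.groupby("segment")[["y_true", "y_score"]].apply(
    lambda s: roc_auc_score(s["y_true"], s["y_score"])
)
overall = roc_auc_score(df["y_true"], df["y_score"])

# Flag slices far below the overall metric (gap threshold is arbitrary).
weak = slice_auc[slice_auc < overall - 0.05]
print(slice_auc.round(3), f"overall={overall:.3f}", sep="\n")
```

Weak slices found this way feed directly into the error-analysis and risks sections of the evaluation report.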

Detailed guide: Evaluation Patterns


Pattern 6: Reproducibility & MLOps

Use when: Ensuring experiments are reproducible and production-ready.

Modern MLOps (CI/CD/CT/CM):

  • CI (Continuous Integration): Automated testing, data validation, code quality
  • CD (Continuous Delivery): Environment-specific promotion (dev → staging → prod), canary deployment
  • CT (Continuous Training): Drift-triggered and scheduled retraining
  • CM (Continuous Monitoring): Real-time data drift, performance, system health

Versioning:

  • Code (git commit), data (DVC, LakeFS), features (feature store), models (MLflow Registry)
  • Seeds (reproducibility), hyperparameters (experiment tracker)
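
The versioning bullets can be captured as a small "run manifest" written alongside every experiment. A stdlib-only sketch (a real setup would also record feature-store and model-registry versions via MLflow, and seed numpy/torch/lightgbm, not just `random`):

```python
import hashlib
import json
import random
import subprocess

def run_manifest(params, data_bytes, seed=42):
    """Capture what is needed to reproduce a training run."""
    random.seed(seed)  # seed all RNGs used by the run in real code
    try:
        commit = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            text=True, stderr=subprocess.DEVNULL,
        ).strip()
    except Exception:
        commit = "unknown"  # not inside a git repository
    return {
        "git_commit": commit,              # code version
        "seed": seed,                      # reproducibility
        "params": params,                  # hyperparameters
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),  # data version
    }

manifest = run_manifest({"learning_rate": 0.1, "num_leaves": 31},
                        b"fake-training-data")
print(json.dumps(manifest, indent=2))
```

Logging this manifest with `mlflow.log_dict()` ties the run to the registry entry produced by CD.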

Detailed guide: Reproducibility Checklist


Pattern 7: Feature Freshness & Streaming

Use when: Managing real-time features and streaming pipelines.

Components:

  • Freshness contracts: Define freshness SLAs per feature, monitor lag, alert on breaches
  • Batch + stream parity: Same feature logic across batch/stream, idempotent upserts
  • Schema evolution: Version schemas, add forward/backward-compatible parsers, backfill with rollback
  • Data quality gates: PII/format checks, range checks, distribution drift (KL, KS, PSI)
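
PSI, one of the drift statistics named above, compares binned frequencies between a reference window and the current window. A numpy sketch (the 0.1/0.25 thresholds are a common rule of thumb, not a standard; teams tune them per feature):

```python
import numpy as np

def psi(reference, current, bins=10):
    """Population Stability Index between a reference and current sample.

    Rule of thumb (assumption, varies by team): <0.1 stable,
    0.1-0.25 moderate shift, >0.25 significant drift.
    """
    # Bin edges from the reference distribution (quantile bins).
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    ref_pct = np.histogram(reference, edges)[0] / len(reference)
    cur_pct = np.histogram(current, edges)[0] / len(current)
    # Small epsilon avoids log(0) on empty bins.
    eps = 1e-6
    ref_pct = np.clip(ref_pct, eps, None)
    cur_pct = np.clip(cur_pct, eps, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

rng = np.random.default_rng(0)
ref = rng.normal(0, 1, 10_000)
same = rng.normal(0, 1, 10_000)       # no drift
shifted = rng.normal(0.5, 1, 10_000)  # mean shift

print(round(psi(ref, same), 4), round(psi(ref, shifted), 4))
```

KL and KS gates follow the same pattern: compute the statistic per feature on each materialization and alert when it crosses the agreed threshold.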

Detailed guide: Feature Freshness & Streaming


Pattern 8: Production Feedback Loops

Use when: Capturing production signals and implementing continuous improvement.

Components:

  • Signal capture: Log predictions + user edits/acceptance/abandonment (scrub PII)
  • Labeling: Route failures/edge cases to human review, create balanced sets
  • Dataset refresh: Periodic refresh (weekly/monthly) with lineage, protect eval set
  • Online eval: Shadow/canary new models, track solve rate, calibration, cost, latency
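
Signal capture with PII scrubbing can start as a small logging helper. A stdlib sketch (the event schema, model name, and regex are illustrative; real scrubbers cover far more than email addresses):

```python
import json
import re
from datetime import datetime, timezone

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scrub_pii(text):
    """Redact obvious PII before logging (emails only in this sketch)."""
    return EMAIL_RE.sub("<EMAIL>", text)

def log_prediction(model_version, features, score, user_feedback=None):
    """Serialize one prediction event for the feedback store."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,
        "score": score,
        "feedback": scrub_pii(user_feedback) if user_feedback else None,
    }
    return json.dumps(event)

line = log_prediction("churn-v3", {"tenure_days": 120}, 0.83,
                      user_feedback="Wrong result, contact me at jane@example.com")
print(line)
```

Events logged this way become the raw material for the labeling queue and the periodic dataset refresh.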

Detailed guide: Production Feedback Loops


Resources (Detailed Guides)

For comprehensive operational patterns and checklists, see:


Templates

Use these as copy-paste starting points:

Project & Workflow Templates

  • Standard DS project template: templates/project/template-standard.md
  • Quick DS experiment template: templates/project/template-quick.md

Feature Engineering & EDA

  • Feature engineering template: templates/features/template-feature-engineering.md
  • EDA checklist & notebook template: templates/eda/template-eda.md

Evaluation & Reporting

  • Model evaluation report: templates/evaluation/template-evaluation-report.md
  • Model card: templates/evaluation/template-model-card.md

SQL Transformation (SQLMesh)

For SQL-based data transformation and feature engineering:

  • SQLMesh project setup: templates/transformation/template-sqlmesh-project.md
  • SQLMesh model types: templates/transformation/template-sqlmesh-model.md (FULL, INCREMENTAL, VIEW)
  • Incremental models: templates/transformation/template-sqlmesh-incremental.md
  • DAG and dependencies: templates/transformation/template-sqlmesh-dag.md
  • Testing and data quality: templates/transformation/template-sqlmesh-testing.md

Use SQLMesh when:

  • Building SQL-based feature pipelines
  • Managing incremental data transformations
  • Creating staging/intermediate/marts layers
  • Testing SQL logic with unit tests and audits

For data ingestion (loading raw data), use:

  • ai-mlops skill (dlt templates for REST APIs, databases, warehouses)



External Resources

See data/sources.json for 82 curated resources:

  • Core ML/DL: scikit-learn, XGBoost, LightGBM, PyTorch, TensorFlow, JAX
  • Data processing: pandas, NumPy, Polars, DuckDB, Spark, Dask
  • SQL transformation: SQLMesh, dbt (staging/marts/incremental patterns)
  • Feature stores: Feast, Tecton, Databricks Feature Store (centralized feature management)
  • Data validation: Pydantic, Great Expectations, Pandera, Evidently (quality + drift)
  • Visualization: Matplotlib, Seaborn, Plotly, Streamlit, Dash
  • MLOps: MLflow, W&B, DVC, Neptune (experiment tracking + model registry)
  • Hyperparameter tuning: Optuna, Ray Tune, Hyperopt
  • Model serving: BentoML, FastAPI, TorchServe, Seldon, Ray Serve
  • Orchestration: Kubeflow, Metaflow, Prefect, Airflow, ZenML
  • Cloud platforms: AWS SageMaker, Google Vertex AI, Azure ML, Databricks, Snowflake

Use this skill to execute data science projects end-to-end: concrete checklists, patterns, and templates, not theory.