Claude Code Plugins

Community-maintained marketplace

Feedback

ff-statistical-methods

@zazu-22/ff_data_analytics
0
0

Expert guidance on statistical analysis methodologies and Monte Carlo simulation for fantasy football. Use this skill when selecting regression approaches, designing simulations, performing variance analysis, or conducting hypothesis tests. Covers regression types (OLS, Ridge, Lasso, GAMs), Monte Carlo frameworks, regression-to-mean analysis, and statistical best practices for player performance modeling.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name ff-statistical-methods
description Expert guidance on statistical analysis methodologies and Monte Carlo simulation for fantasy football. Use this skill when selecting regression approaches, designing simulations, performing variance analysis, or conducting hypothesis tests. Covers regression types (OLS, Ridge, Lasso, GAMs), Monte Carlo frameworks, regression-to-mean analysis, and statistical best practices for player performance modeling.

Statistical Analysis & Simulation for Fantasy Football

Overview

Provide expert guidance on statistical methodologies and simulation techniques for fantasy football analytics. Apply appropriate regression methods, design Monte Carlo simulations, perform variance analysis, and conduct hypothesis tests using research-backed approaches.

When to Use This Skill

Trigger this skill for queries involving:

  • Regression selection: "Should I use OLS or Lasso?" "When to use GAMs?" "Ridge vs Elastic Net?"
  • Monte Carlo simulation: "How do I simulate rest-of-season?" "Calculate championship probability?" "Quantify trade impact?"
  • Variance analysis: "Identify regression-to-mean candidates?" "Calculate confidence intervals?" "Prediction intervals?"
  • Statistical testing: "Hypothesis test for performance trends?" "Is this improvement significant?"
  • Aging curves: "Model non-linear age effects?" "GAMs for position-specific curves?"
  • Uncertainty quantification: "Error bars on projections?" "Probability of outcomes?"

Note: For ML model selection and feature engineering, use ff-ml-modeling. For dynasty strategy domain knowledge, use ff-dynasty-strategy.

Core Capabilities

1. Regression Methods

Decision Framework:

Linear Regression (OLS): Baseline, interpretability, small samples

Ridge (L2): Multicollinearity, keep all features, shrink coefficients

Lasso (L1): High-dimensional data, automatic feature selection, sparse models

Elastic Net: Best default for fantasy (combines Ridge + Lasso)

GAMs: Non-linear relationships (aging curves), interpretable smooth functions

Reference: references/regression_methods.md for detailed comparisons and Python code.

2. Monte Carlo Simulation

Applications:

  • Rest-of-season projections with uncertainty
  • Championship probability estimation
  • Trade scenario impact analysis
  • Lineup optimization under uncertainty

Core Approach:

# Simulate player week: Normal(projection, std_dev)
simulated_points = np.random.normal(projection, std_dev, n_sims=10000)
simulated_points = np.maximum(simulated_points, 0)  # Floor at zero

Key Considerations:

  • Iterations: 10,000 default (SE ≈ 0.5%), 100,000 for critical decisions
  • Error correlation: QB and WRs are correlated, model synergies
  • Path dependence: Update team ratings within simulations (FiveThirtyEight approach)
  • Flaw of averages: Analyze full distribution, not just mean

Reference: references/simulation_design.md for frameworks, templates, and best practices.

Asset: assets/monte_carlo_template.py - Python templates for common simulations.

3. Regression to the Mean

Concept: Extreme values tend toward average in subsequent measurements

Fantasy Application:

  • +TDOE (Touchdowns Over Expected): Declines 86% next year
  • -TDOE: Improves 93% next year
  • High TD rates regress downward, low TD rates improve

Position-Specific Sample Sizes (50% regression):

  • QB: 21 games
  • RB: 29-30 games
  • WR: 13-14 games
  • TE: ~20 games

Implementation:

regression_factor = sample_size / (sample_size + n_50[position])
regressed_estimate = (regression_factor * current_stat) + ((1 - regression_factor) * position_mean)

Reference: references/regression_methods.md section on regression to the mean.

4. Confidence vs Prediction Intervals

Confidence Interval: Uncertainty in estimated mean (narrow)

Prediction Interval: Uncertainty for new observation (wider - use this for player projections!)

Why it matters: Individual player performance has more variability than average performance

# Prediction interval accounts for both parameter uncertainty AND individual variance
margin = t_score * residual_standard_error
lower, upper = prediction - margin, prediction + margin

5. Generalized Additive Models (GAMs)

When to use: Non-linear relationships like aging curves

How it works: Fit smooth spline for each feature: y = β₀ + f₁(age) + f₂(experience) + ...

Fantasy use cases:

  • Aging curves (inverted-U shapes for position-specific performance)
  • Experience effects on production
  • Visualize smooth trends

Research finding: GAMs reveal QB peaks at 28-33, RB declines post-27

Python:

from pygam import LinearGAM, s, f

# s() = smooth (non-linear), f() = factor (categorical)
gam = LinearGAM(s(0) + s(1) + f(2))  # age, experience, position
gam.fit(X_train, y_train)

# Visualize smooth curves
gam.partial_dependence(term=0)  # Age curve

Reference: references/regression_methods.md section on GAMs with Python and R code.

Workflows

Choosing a Regression Method

Step 1: Define Goal

  • Interpretation needed? → OLS or GAMs
  • Prediction focus? → Consider regularization

Step 2: Check Data Characteristics

  • Small sample (<100)? → OLS or Ridge
  • High-dimensional (many features)? → Lasso or Elastic Net
  • Multicollinearity (VIF > 5)? → Ridge or Elastic Net
  • Non-linear patterns? → GAMs

Step 3: Baseline

  • Always start with OLS to establish floor

Step 4: Regularization

  • If overfitting, try Ridge/Lasso/Elastic Net
  • Use cross-validation to select regularization strength

Step 5: Non-linearity

  • If residuals show patterns, consider GAMs
  • Particularly for aging curves

Designing a Monte Carlo Simulation

Step 1: Define Scenario

  • What are you simulating? (Rest-of-season, trade impact, championship probability)
  • What's the time horizon? (Weeks remaining)

Step 2: Gather Inputs

  • Player projections (expected values)
  • Standard deviations (from historical performance or model residuals)
  • Correlations (QB-WR pairs, teammates)

Step 3: Build Simulation

  • Use assets/monte_carlo_template.py as starting point
  • Implement correlated errors for teammates
  • Consider path dependence if multi-week

Step 4: Run Simulations

  • 10,000 iterations default
  • 100,000 for final decisions

Step 5: Analyze Distribution

  • Don't just report mean!
  • Show percentiles (10th, 50th, 90th)
  • Probability of exceeding thresholds
  • Visualize histograms and CDFs

Analyzing Regression to the Mean

Step 1: Identify Extreme Performers

  • Find players with unusually high/low TD rates
  • Calculate TDOE (Touchdowns Over Expected)

Step 2: Check Sample Size

  • How many games/opportunities?
  • Compare to position-specific threshold (QB: 21, RB: 30, WR: 14)

Step 3: Apply Regression Formula

  • regression_factor = n / (n + n_50)
  • regressed = (factor * current) + ((1 - factor) * mean)

Step 4: Identify Buy-Low / Sell-High

  • Positive TDOE → Likely to regress down (sell high)
  • Negative TDOE → Likely to improve (buy low)
  • Volume matters more than TDs!

Identifying Data Requirements

For Regression Analysis:

  • Dependent variable (fantasy points, production metrics)
  • Independent variables (age, usage, efficiency stats)
  • Historical data (3+ years for robust estimates)
  • Position labels (for position-specific models)

For Monte Carlo Simulation:

  • Player projections (weekly expected values)
  • Historical variance (to estimate std dev)
  • Roster compositions (for team simulations)
  • Correlation structures (teammate relationships)

For Variance Analysis:

  • Historical performance distributions
  • Sample sizes (games played, opportunities)
  • Position-specific baselines (for regression to mean)

Integrating with Other Skills

Complement with ff-ml-modeling when:

  • Choosing between regression types and tree-based models
  • Feature engineering informs what to include in regression
  • Validation strategies apply to statistical models too

Complement with ff-dynasty-strategy when:

  • Identifying TD regression candidates (TDOE analysis)
  • Applying aging curves to trade decisions
  • Understanding domain context for statistical findings

Best Practices

Start Simple - OLS baseline before complex methods

Regularize for High Dimensions - Use Lasso/Elastic Net when features > samples

Use GAMs for Clear Non-Linearity - Aging curves, experience effects

Model Correlations in Simulations - QB and WRs are correlated (ρ ≈ 0.6)

Sufficient Iterations - 10,000 minimum for stable estimates

Analyze Full Distribution - Percentiles and probabilities, not just means

Validate Assumptions - Plot residuals, check for patterns

Regression to Mean is Powerful - TDs regress, volume is king

Common Pitfalls

Ignoring non-linearity - Age curves aren't linear, use GAMs

Too few simulations - <1,000 gives unstable estimates

Independence assumptions - Teammates are correlated

Flaw of averages - Non-linear outcomes make "average" misleading

Over-interpreting small samples - NFL has only 17 games/season

Forgetting regression to mean - Extreme TDs will regress

Python Libraries

# Regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet, LinearRegression
import statsmodels.api as sm  # For statistical inference
from pygam import LinearGAM, s, f  # GAMs

# Simulation
import numpy as np
import scipy.stats

# Analysis
from sklearn.metrics import mean_absolute_error, mean_squared_error
import matplotlib.pyplot as plt

References

  • references/regression_methods.md - OLS, Ridge, Lasso, Elastic Net, GAMs, regression to mean, confidence/prediction intervals
  • references/simulation_design.md - Monte Carlo frameworks, championship probability, trade impact, path dependence, error correlation

Assets

  • assets/monte_carlo_template.py - Python templates for rest-of-season simulation, championship probability, and trade impact analysis