name	selection-diversity-validation
description	FAIL LOUDLY pattern for selection constraints. Trigger when: (1) correlated stocks selected together, (2) min_crypto_positions ignored, (3) constraints violated silently.
author	Claude Code
date	Thu Jan 01 2026 00:00:00 GMT+0000 (Coordinated Universal Time)

Selection Diversity Validation - Research Notes

Experiment Overview

Item	Details
Date	2026-01-01
Goal	Ensure selection constraints (max_correlation, min_crypto_positions) are ENFORCED, not silently ignored
Environment	alpaca_trading/selection/universe.py, tests/test_selection_diversity.py
Status	Success

Context

User reported MSFT and GOOGL both appearing in portfolio despite max_correlation=0.60 setting. Additionally, crypto was not being selected despite min_crypto_positions=1.

Root Cause: Selection had two silent failure modes:

Notebook re-sorted by trainability AFTER diversity optimization, defeating correlation filtering
min_crypto_positions searched only top-ranked symbols instead of all symbols passing hard filters

The fundamental problem: constraints were checked but not enforced - failures were silent.

The Pattern: FAIL LOUDLY

Key Principle

If a constraint is supposed to be enforced, it MUST raise an error when violated. Silent failures lead to production bugs that waste time and money.

Implementation

def validate_selection_constraints(
    symbols: List[str],
    config: SelectionConfig,
    result: 'UniverseSelectionResult',
) -> None:
    """
    Validate that selection result meets ALL constraints.

    This function FAILS LOUDLY if any constraint is violated.
    It should be called at the END of selection to catch bugs.

    Raises:
        ValueError: If any constraint is violated (NOT silent!)
    """
    errors = []

    # 1. Check crypto count
    crypto_count = sum(1 for s in symbols if s.endswith('USD') or '/' in s)
    if crypto_count < config.min_crypto_positions:
        errors.append(
            f"CRYPTO VIOLATION: Required min_crypto_positions={config.min_crypto_positions}, "
            f"but portfolio has {crypto_count} crypto symbols"
        )

    # 2. Check correlation (if matrix available)
    if result.correlation_matrix is not None:
        for i, sym1 in enumerate(symbols):
            for sym2 in symbols[i+1:]:
                try:
                    corr = abs(result.correlation_matrix.get_correlation(sym1, sym2))
                    if corr >= config.max_correlation:
                        errors.append(
                            f"CORRELATION VIOLATION: {sym1}/{sym2} correlation={corr:.2f} "
                            f">= max_correlation={config.max_correlation}"
                        )
                except (KeyError, IndexError):
                    pass  # Symbol not in matrix

    # 3. FAIL LOUDLY if any errors
    if errors:
        error_msg = (
            f"\n{'='*70}\n"
            f"SELECTION CONSTRAINT VIOLATION - THIS IS A BUG!\n"
            f"{'='*70}\n"
            f"Portfolio: {symbols}\n"
            f"\nViolations:\n" +
            "\n".join(f"  - {e}" for e in errors) +
            f"\n\nThe selection system is NOT enforcing constraints correctly.\n"
            f"{'='*70}"
        )
        logger.error(error_msg)
        raise ValueError(error_msg)

    # Log success
    logger.info(
        f"Selection validation PASSED: {len(symbols)} symbols, "
        f"{crypto_count} crypto, max_correlation={config.max_correlation}"
    )

Integration Point

Call validation at the END of select_compatible_universe():

def select_compatible_universe(...) -> Tuple[List[str], UniverseSelectionResult]:
    # ... selection logic ...

    top_symbols = result.get_top_symbols(n=target_size)

    # VALIDATE CONSTRAINTS - FAIL LOUDLY IF VIOLATED
    if top_symbols:
        validate_selection_constraints(top_symbols, config, result)

    return top_symbols, result

Failed Attempts (Critical)

Attempt	Why it Failed	Lesson Learned
Log warning on constraint violation	Warnings are ignored in production	Use `raise ValueError` not `logger.warning`
Notebook re-sorts after selection	Trainability sort defeats diversity optimization	Use `selected_symbols` directly, don't re-sort
Search ranked_symbols for crypto	Crypto may not rank high by trainability	Search ALL symbols that passed hard filters
MAX_EQUITIES = 200 limit	Arbitrary limit missed good candidates	Default to None (scan all ~11k)
Silent constraint checks	Failures go unnoticed until production	ALWAYS fail loudly on violations

Test Suite for Constraints

# tests/test_selection_diversity.py

class TestMaxCorrelationEnforcement:
    """Tests that max_correlation constraint is ACTUALLY enforced."""

    def test_highly_correlated_pairs_excluded(self):
        """CRITICAL: If MSFT and GOOGL are both selected, this is a BUG."""
        # Create highly correlated returns
        np.random.seed(42)
        base = np.random.randn(500)
        returns = {
            'MSFT': pd.Series(base + np.random.randn(500) * 0.1),
            'GOOGL': pd.Series(base + np.random.randn(500) * 0.1),  # Correlated
            'JPM': pd.Series(np.random.randn(500)),  # Independent
        }

        # ... run selection with max_correlation=0.60 ...

        # CRITICAL ASSERTION
        both_selected = 'MSFT' in selected and 'GOOGL' in selected
        assert not both_selected, (
            f"DIVERSITY BUG: Both MSFT and GOOGL selected despite high correlation\n"
            f"Selected: {selected}\n"
            f"This means max_correlation is NOT being enforced!"
        )

class TestMinCryptoPositionsEnforcement:
    """Tests that min_crypto_positions is ACTUALLY enforced."""

    def test_crypto_guaranteed(self):
        """If min_crypto_positions=1 and no crypto, this is a BUG."""
        config = SelectionConfig(min_crypto_positions=1)
        # ... run selection ...
        crypto_count = sum(1 for s in selected if s.endswith('USD'))
        assert crypto_count >= 1, "min_crypto_positions=1 but no crypto selected!"

Removing Arbitrary Limits

Problem

MAX_EQUITIES = 200  # Why 200? Arbitrary!
MAX_CRYPTO = 50     # Why 50? Arbitrary!

Solution

# pipeline.py
def list_equities_by_market_cap(max_symbols: Optional[int] = None) -> List[str]:
    """Return equity universe. None = no limit (all ~11k)."""
    if max_symbols is not None:
        return api_symbols[:max_symbols]
    return api_symbols  # All symbols

# notebook
MAX_EQUITIES = None  # Scan all ~11k equities
MAX_CRYPTO = None    # Scan all crypto

Why remove limits?

Alpaca has ~11k equities - why only scan 200?
Statistical selection will filter down to best candidates anyway
Artificial limits may exclude good opportunities

Key Insights

1. Silent Failures Are Bugs

If a config parameter exists (like max_correlation), the system MUST enforce it. Silent violation of constraints is a production bug waiting to happen.

2. Validate at the END, Not During

Don't scatter constraint checks throughout the code. Add a single validation function that runs AFTER selection completes and raises on ANY violation.

3. Test Constraint Enforcement, Not Just Logic

Don't just test that the function runs. Test that violations actually get caught:

# BAD: Only tests that it runs
assert len(selected) > 0

# GOOD: Tests that constraint is enforced
assert not (both_msft_and_googl_selected), "Correlation constraint violated!"

4. Don't Override Selection Results

If the selection system returns a diversity-optimized portfolio, USE IT. Don't re-sort or filter afterwards - that defeats the optimization.

Files Modified

alpaca_trading/selection/universe.py:
  - Added validate_selection_constraints() function
  - Called at end of select_compatible_universe()

alpaca_trading/data/pipeline.py:
  - Changed max_symbols default from 50 to None
  - list_equities_by_market_cap() scans all by default
  - list_crypto_symbols() scans all by default

notebooks/training.ipynb:
  - MAX_EQUITIES = None (was 200)
  - MAX_CRYPTO = None (was 50)
  - Use selected_symbols directly (no re-sorting)

tests/test_selection_diversity.py: (NEW)
  - 7 tests for constraint enforcement

References

alpaca_trading/selection/universe.py: validate_selection_constraints()
tests/test_selection_diversity.py: Constraint enforcement tests
.skills/plugins/trading/symbol-selection-statistical/: Statistical selection guide
.skills/plugins/trading/drawdown-guardrails-pattern/: Similar "fail loudly" pattern

selection-diversity-validation

Install Skill

SKILL.md