name	distribution-search
description	Guidance for finding probability distributions that satisfy specific statistical constraints such as KL divergence targets. This skill should be used when tasks involve constructing probability distributions with exact numerical properties, optimization over high-dimensional probability spaces, or satisfying multiple simultaneous statistical constraints within tight tolerances.

Distribution Search

Overview

This skill provides structured guidance for finding probability distributions that satisfy specific statistical constraints. These problems typically involve constructing a discrete probability distribution over a large vocabulary that achieves target values for metrics like KL divergence, entropy, or other information-theoretic quantities.

Mathematical Analysis First

Before writing any code, perform thorough mathematical analysis to constrain the solution space:

Derive analytical relationships between the target metrics and distribution properties
- For KL divergence from uniform: KL(P||U) = -H(P) + log(V) where H(P) is entropy and V is vocabulary size
- For KL divergence to uniform: KL(U||P) = log(V) - (1/V) * Σ log(P(i))
Calculate fixed quantities from the problem specification
- Vocabulary size V determines log(V)
- Target values combined with log(V) constrain feasible entropy ranges
Identify implied constraints from simultaneous requirements
- Multiple target metrics often severely restrict the feasible solution space
- Determine if the problem is over-constrained before attempting optimization
Estimate the solution structure analytically
- Determine the approximate entropy the distribution must have
- Estimate how concentrated or spread the probability mass should be

Parameterization Strategy

High-dimensional optimization over all probability values is impractical. Use reduced parameterizations:

Recommended Parameterizations

Power law with uniform tail: k high-probability elements following p_i ∝ i^(-α), remaining elements share probability equally
- Parameters: k (number of special elements), α (power law exponent), p_rest (probability for tail)
Two-group distribution: k elements with probability p_high, remaining elements with probability p_low
- Parameters: k, p_high (p_low is determined by normalization)
Multi-tier distribution: Several groups of elements with distinct probability levels
- Parameters: group sizes and probability levels

Parameter Selection Guidance

Start with few parameters (2-3) and add flexibility only if needed
Derive parameter bounds from mathematical constraints rather than arbitrary choices
Calculate reasonable initial values from the analytical solution estimates

Implementation Approach

Modular Code Structure

Organize code into separate, tested functions:

- kl_forward(P, V): Compute KL(P||Uniform)
- kl_backward(P, V): Compute KL(Uniform||P)
- create_distribution(params, V): Generate distribution from parameterization
- objective(params): Optimization objective combining constraints
- verify_solution(P, V, targets): Independent verification

Computational Efficiency

Avoid full array operations when possible - use analytical formulas for symmetric distributions
For two-group distributions: compute entropy/KL directly from group sizes and probabilities
Estimate computational cost before running - set appropriate timeouts

Optimization Strategy

Coarse grid search over reduced parameter space to find promising regions
Local optimization from multiple starting points in promising regions
Refinement with tighter tolerances near solutions

Verification

Independent Verification

Always verify solutions with code independent from the optimization:

Check probability constraints: All values non-negative, sum to 1.0
Compute metrics directly: Use explicit formulas, not the optimization objective
Test against known cases: Verify computation on uniform distribution or other known solutions

Common Verification Bugs

Incorrect handling of log(0) - use appropriate thresholds (e.g., max(p, 1e-30))
Array indexing errors when elements are counted vs indexed
Forgetting to normalize after parameter adjustments
Multiplication errors when computing sums over groups

Common Pitfalls

Computational

Starting with full-dimensional optimization: Immediately recognize the need for reduced parameterization
Insufficient timeout estimation: Estimate iterations needed before running
Rewriting entire scripts: Use modular code to enable targeted fixes

Mathematical

Incomplete constraint analysis: Fully leverage mathematical relationships before coding
Arbitrary parameter bounds: Derive bounds from problem constraints
Poor initial values: Use analytical estimates for starting points

Verification

Trusting optimization output directly: Always verify with independent computation
Not testing verification code: Verify the verifier using known solutions first
Numerical precision issues: Handle small probabilities carefully

Problem-Solving Workflow

Analyze: Derive analytical relationships and estimate solution structure
Parameterize: Choose reduced parameterization matching expected structure
Implement: Write modular, tested code for each component
Search: Grid search followed by local optimization
Verify: Independent verification of candidate solutions
Refine: Adjust parameterization if tolerances not achieved

When Solutions Are Not Found

If optimization fails to find valid solutions:

Check if the problem is mathematically feasible given constraints
Verify the parameterization can represent valid solutions
Expand the parameterization to add flexibility
Check for bugs in objective function or constraint handling
Try different optimization algorithms or starting points

distribution-search

Install Skill

SKILL.md

Distribution Search

Overview

Mathematical Analysis First

Parameterization Strategy

Recommended Parameterizations

Parameter Selection Guidance

Implementation Approach

Modular Code Structure

Computational Efficiency

Optimization Strategy

Verification

Independent Verification

Common Verification Bugs

Common Pitfalls

Computational

Mathematical

Verification

Problem-Solving Workflow

When Solutions Are Not Found