| name | superforecaster |
| description | Make well-calibrated probability estimates using superforecasting methodology. Use when user asks about probability, likelihood, chance, odds, "will X happen", "when will X happen", "how much will X cost", "what could go wrong", failure modes, risk assessment, forecasting, or any question involving uncertainty and estimation. |
Superforecaster Skill
Make well-calibrated probability estimates. Adapts methodology based on question attributes and desired output.
Forecasting Cookbook
Questions have overlapping attributes. Mix and match these heuristics based on what applies.
By information availability
Public/searchable topic:
- Check prediction markets first (Polymarket, Metaculus, Manifold)
- Search for expert forecasts, news analysis
- Look for historical base rates in public data
Personal/private situation:
- Fall back on reference class base rates
- Ask user for specifics that affect the estimate
- Example: "When will I get a job?" → ask: field, experience, how long since last job, actively applying?, then search: current hiring rates in that field, unemployment trends, typical job search duration by seniority
Sparse data:
- Use hierarchical priors (general → specific)
- Fermi decomposition into estimable parts
- Widen confidence intervals
By output type needed
Binary (yes/no):
- State probability + 90% CI on the probability itself
Quantity (how much/many):
- Point estimate + 90% CI
- Fermi breakdown if complex
- Consider: inflation, trends, comparable anchors
Timeline (when/how long):
- Median + 90% CI (not just mean—distributions are often right-skewed)
- Apply planning fallacy correction (+50-100% for personal projects)
- Note distribution shape (symmetric vs heavy right tail)
Distribution/trajectories:
- Model the process, simulate, show percentiles
- Plot survival curves, CDFs, or sample paths
- Report both conditional and unconditional stats
Failure modes / risks:
- Enumerate systematically, don't just list top 3
- Score: P(occur) × Impact
- Identify correlated failure modes
By time direction
Past events (verification):
- Direct search, fact-checking
- Triangulate across sources
- Output: boolean + confidence in the answer
Future events:
- Outside view first (base rates, markets)
- Inside view adjustments (specific factors)
- Pre-mortem: "if wrong, why?"
By complexity
Single-factor:
- Find the base rate, adjust, done
Multi-factor but independent:
- Estimate each, multiply/combine analytically or via simple simulation
Multi-factor with dependencies:
- Model the dependency structure (event tree, Bayesian network)
- Simulate through the network
- Example: "Will I be able to buy a house?" depends on savings, income trajectory, home prices, interest rates, down payment requirements—model jointly
Sequential/process:
- Step through time, update state
- Handle branching paths with probabilities
- Example trajectory output:
[Filed I-485] ─┬─ 65% → [Direct Approval] → 8-14 months
├─ 25% → [RFE] ─┬─ 80% → [Approved] → +2-4 months
│ └─ 20% → [Denied]
└─ 10% → [NOID] ─┬─ 40% → [Approved]
└─ 60% → [Denied]
By domain (common patterns)
Personal finance:
- Monte Carlo with historical returns
- Account for sequence risk, inflation, expense variance
- Ask about: allocation, withdrawal strategy, risk tolerance
Job/career:
- Base rates by field, seniority, market conditions
- Ask about: field, experience, active search status, location flexibility
Applications/approvals:
- Published approval rates by category
- Processing time distributions (often right-skewed)
- Ask about: application type, any red flags, supporting evidence strength
Health/medical:
- Published outcome statistics by condition/treatment
- Adjust for age, comorbidities, specific circumstances
- Note: sensitive—frame as "questions to ask your doctor"
Startups/ventures:
- Power law outcomes, high failure rates
- Stage-specific survival rates
- Factor in: team, market, traction, funding
Related Search Strategy
Map questions to searchable topics:
| Question | Related Searchable Topics |
|---|---|
| "How much will groceries cost?" | "USA inflation 2026", "food price index forecast", "grocery CPI" |
| "Will my visa get approved?" | "[visa type] approval rate [year]", "[country] visa processing statistics" |
| "When will project finish?" | "software project delay statistics", "[similar project type] timeline" |
| "Will startup X succeed?" | "startup success rate by stage", "YC company outcomes", "[sector] startup survival" |
Search expansion rules:
- Identify the category/domain
- Find statistical base rates for that category
- Look for recent trends/changes
- Check for expert forecasts on related topics
Prediction Market References
Major Markets
| Market | URL | Best For | Notes |
|---|---|---|---|
| Polymarket | polymarket.com | Crypto, politics, current events | Largest volume, real money |
| Metaculus | metaculus.com | Science, tech, long-term | High-quality forecasters, detailed rationales |
| Manifold | manifold.markets | Wide variety, niche topics | Easy to find obscure questions |
| PredictIt | predictit.org | US politics | Legal US market, capped stakes |
| Kalshi | kalshi.com | Economics, events | CFTC-regulated |
Search patterns:
site:polymarket.com [topic]
site:metaculus.com [topic]
site:manifold.markets [topic]
How to use market prices:
- Market at 70% = strong prior toward 70%
- Diverge only with specific inside information
- Note liquidity (low volume = less reliable)
- Check when market was created (stale = less reliable)
Creating Prediction Markets
When to suggest creating a Manifold question:
- Question is resolvable with clear criteria
- Public interest (others might want to forecast too)
- Time horizon is reasonable (weeks to years, not hours)
- No existing market covers the question well
- User would benefit from crowd wisdom / tracking
Good Manifold question types:
- Binary: "Will X happen by [date]?"
- Numeric: "How many Y by [date]?"
- Multiple choice: "Which of A/B/C will happen first?"
- Date: "When will X happen?"
Question design principles:
- Precise resolution criteria (who judges, what counts)
- Reasonable close date
- Unambiguous wording
- Include relevant context in description
Suggestion format:
This question might benefit from a prediction market. Consider creating on Manifold:
**Title**: "Will OpenAI have a tender offer or IPO by end of 2025?"
**Type**: Binary
**Close date**: 2025-12-31
**Resolution**: Resolves YES if OpenAI completes any liquidity event
(IPO, tender offer, or acquisition) allowing employee stock sales by Dec 31, 2025.
This would let you:
1. Get crowd wisdom from other forecasters
2. Track how probability changes over time
3. Potentially profit if you have better information
Want me to help draft the full question description?
Note: User has a Manifold Markets account for creating questions.
Statistical Sources
General statistics:
- Our World in Data (ourworldindata.org)
- Statista (statista.com)
- US government (census.gov, bls.gov, data.gov)
- Eurostat (ec.europa.eu/eurostat)
Domain-specific:
- Software: Standish Group, Stack Overflow surveys
- Startups: CB Insights, Crunchbase
- Science: Nature Index, PubMed meta-analyses
- Economics: FRED, IMF, World Bank
Base rate queries:
"[category] success rate"
"[type] approval rate [year]"
"[event type] historical frequency"
"[domain] statistics meta-analysis"
Output Formats
Binary probability (yes/no questions)
**Probability**: X%
**90% CI**: [low]% - [high]%
**Verbal**: [Almost certainly not | Unlikely | Toss-up | Likely | Almost certain]
Quantity estimate
**Estimate**: [value] [units]
**90% CI**: [low] - [high]
**Fermi breakdown**: (if applicable)
- Component 1: X
- Component 2: Y
- Combined: X × Y = Z
Timeline estimate
**Median**: [time]
**90% CI**: [earliest] - [latest]
**Distribution shape**: [symmetric | right-skewed (delays likely) | ...]
Failure modes
| Rank | Failure Mode | P(occur) | Impact | Risk Score |
|------|--------------|----------|--------|------------|
| 1 | [mode] | X% | High | [P×I] |
Hierarchical Priors
For questions where multiple reference classes apply, layer priors from general to specific:
Example: OpenAI stock valuation
Level 1 (broadest): "It's a company"
- Base: Companies have varied outcomes, mean reversion applies
Level 2: "It's a startup"
- Prior: High variance, ~90% fail, but survivors grow fast
- Power law distribution of outcomes
Level 3: "It's a late-stage AI company"
- Prior: Currently high growth sector, but potential bubble risk
- Relevant comps: Anthropic, Databricks, other unicorns
Level 4: "It's OpenAI specifically"
- Unique factors: Leadership changes, Microsoft relationship, regulatory scrutiny
- Historical: Past valuations, revenue growth rate
Level 5: "My specific situation"
- Vesting schedule, tax implications, liquidity constraints
How to combine:
- Start with broadest applicable reference class
- Update Bayesian-style with each narrower class
- Weight more specific priors higher when data is available
- Be explicit about which level dominates your estimate
Tail risk decomposition example:
P(OpenAI worth <50% of current) by 2026:
- P(general market crash) × P(affects OpenAI | crash) = 15% × 80% = 12%
- P(AI winter / hype collapse) = 10%
- P(major competitive loss) = 8%
- P(regulatory action) = 5%
- P(internal collapse / scandal) = 5%
- P(nationalization / forced restructure) = 2%
Combined tail risk (some overlap): ~25-30%
Core Methodology
Step 1: Classify the question
- What type? (A-I from taxonomy above)
- What output format is appropriate?
- What related topics should I search?
Step 2: Outside view first
- Find base rates for reference class
- Check prediction markets
- Look for expert forecasts
Step 3: Inside view adjustments
- List factors that make this case different
- Estimate direction (+/-) and magnitude for each
- Adjust base rate accordingly
Step 4: Synthesize
- Combine outside and inside views
- State confidence interval (not just point estimate)
- Identify key uncertainties
Step 5: Pre-mortem
- "If this forecast is wrong, why?"
- List top failure modes for the forecast itself
Calibration Reference
| Probability | Verbal | Interpretation |
|---|---|---|
| 1-5% | Almost certainly not | Would be very surprised |
| 10-20% | Unlikely | Possible but not expected |
| 30-40% | Probably not | Lean against |
| 45-55% | Toss-up | Genuinely uncertain |
| 60-70% | Probably | Lean toward |
| 80-90% | Likely | Expected outcome |
| 95-99% | Almost certain | Would be very surprised if not |
Common biases to counter:
- Planning fallacy: Add 50-100% to time estimates
- Overconfidence: Default to wider CIs
- Availability: Check base rates, don't rely on memorable examples
- Anchoring: Consider multiple starting points
Computational Tools
Write and run Python scripts when helpful. Use numpy, pandas, matplotlib.
When to use code
- Multiple interacting uncertainties (simulation beats mental math)
- User wants to see the distribution shape
- Complex Fermi decomposition with many factors
- Trajectory analysis with many branches
Example applications
- Timeline Monte Carlo: Model phases as lognormal (captures right-skew delays), combine, plot distribution
- FIRE/Portfolio survival: Bootstrap historical S&P returns, simulate 10k trajectories, report survival rate and percentiles
- Bayesian updates: Compute posteriors from priors and likelihoods
- Sensitivity analysis: Vary key parameters, show which dominate uncertainty
- Distribution visualization: Histograms, CDFs, violin plots for communicating uncertainty
Combining distributions
When outcome Z = f(X, Y) for uncertain X and Y:
- Sample N draws from X's distribution
- Sample N draws from Y's distribution (jointly if correlated, independently if not)
- Compute Z = f(X, Y) for each sample pair
- Report percentiles of Z
Example: "What will my OpenAI equity be worth after taxes?"
Inputs:
- shares: 10,000 (known)
- price_per_share: Lognormal(mean=$150, std=$80) — uncertainty from valuation
- P(liquidity_event_by_2026): 60%
- tax_rate: 40% federal+state (if exercised as income)
Model:
If no liquidity event: worth = $0 realized
If liquidity event:
gross = shares × price_per_share
net = gross × (1 - tax_rate)
Simulation:
for i in 1..10000:
if random() < 0.6: # liquidity event happens
price = sample_lognormal(150, 80)
net[i] = 10000 * price * 0.6
else:
net[i] = 0
Output:
P(>$0): 60%
Median (conditional on liquidity): $900k
90% CI (conditional): $400k - $1.8M
Expected value: $540k (includes 40% chance of $0)
Correlation handling: If X and Y are correlated (e.g., OpenAI valuation and probability of IPO both depend on AI market sentiment), model the common factor explicitly or use copulas.
Complex probability networks
When a question involves interconnected uncertain variables, conditional dependencies, or state that evolves through stages—don't try to solve analytically. Model the network and simulate.
Types of structures:
- Sequential processes: state evolves step-by-step (portfolio drawdown, disease progression, project phases)
- Event trees: branching paths with probabilities at each node (immigration outcomes, startup funding rounds)
- Bayesian networks: variables with conditional dependencies (diagnosis given symptoms, success given multiple factors)
- Queuing/waiting: arrivals and processing with random timing (application processing, service times)
General approach:
- Define the state variables and their dependencies
- Define transition/update rules (deterministic functions of random inputs)
- Sample all random inputs, propagate through the network
- Collect the output distribution
Example: Portfolio drawdown
Inputs:
- initial_balance: $800,000
- allocation: 70% S&P 500, 30% bonds
- monthly_expenses: Normal(mean=$5000, std=$800) # varies month to month
- stock_returns: sample from historical monthly S&P (mean ~0.8%, std ~4.5%)
- bond_returns: sample from historical monthly bonds (mean ~0.3%, std ~1.5%)
- correlation: stocks and bonds ~0.2 correlated (model jointly)
Process (per simulation):
balance = initial_balance
month = 0
while balance > 0 and month < 600: # cap at 50 years
# Draw correlated returns
stock_return, bond_return = sample_correlated(rho=0.2)
portfolio_return = 0.7 * stock_return + 0.3 * bond_return
# Draw expenses
expenses = sample_normal(5000, 800)
# Update
balance = balance * (1 + portfolio_return) - expenses
month += 1
record time_to_ruin = month (or "never" if survived 50 years)
Run 10,000 simulations, output:
P(never runs out in 50 years): 45%
Median time to ruin (if it happens): 22 years
10th percentile: 12 years (unlucky sequence)
90th percentile: 40+ years
Plot: survival curve (% still solvent vs time)
Key modeling choices:
- Sequence risk: Early bad returns hurt more (less capital to recover)
- Expense shocks: Can add P(major expense) for medical/home repair
- Inflation: Expenses drift upward ~3%/year
- Drawdown strategy: Fixed $ vs % of portfolio vs guardrails
Evidence Handling
Light questions (quick/standard):
- Cite sources inline with URLs
- No file storage
Heavy questions (deep analysis, user requests):
- Create
docs/forecasts/<topic-slug>/folder - Store evidence files, synthesis document
- Only when explicitly warranted or requested
Depth Adaptation
Quick (~30 sec)
- Triggered by: "roughly", "ballpark", conversational tone
- 1-2 mental models, inline response
Standard (2-5 min)
- Default for most questions
- WebSearch for base rates + markets
- Structured output with CI
Deep (10+ min)
- Triggered by: "thorough", "deep dive", high-stakes context
- Extensive research, multiple sources
- Optional evidence folder
- Optional subagent delegation for parallel research