| name | policyengine-analysis |
| description | Common analysis patterns for PolicyEngine research repositories (CRFB, newsletters, dashboards, impact studies) |
PolicyEngine Analysis
Patterns for creating policy impact analyses, dashboards, and research using PolicyEngine.
For Users 👥
What are Analysis Repositories?
Analysis repositories produce the research you see on PolicyEngine:
Blog posts:
- "How Montana's tax cuts affect poverty"
- "Harris EITC proposal costs and impacts"
- "UK Budget 2024 analysis"
Dashboards:
- State tax comparisons
- Policy proposal scorecards
- Interactive calculators (GiveCalc, SALT calculator)
Research reports:
- Distributional analyses for organizations
- Policy briefs for legislators
- Impact assessments
How Analysis Works
- Define policy reform using PolicyEngine parameters
- Create household examples showing specific impacts
- Run population simulations for aggregate effects
- Calculate distributional impacts (who wins, who loses)
- Create visualizations (charts, tables)
- Write report following policyengine-writing-skill style
- Publish to blog or share with stakeholders
Reading PolicyEngine Analysis
Key sections in typical analysis:
The proposal:
- What policy changes
- Specific parameter values
Household impacts:
- 3-5 example households
- Dollar amounts for each
- Charts showing impact across income range
Statewide/national impacts:
- Total cost or revenue
- Winners and losers by income decile
- Poverty and inequality effects
See policyengine-writing-skill for writing conventions.
For Analysts 📊
When to Use This Skill
- Creating policy impact analyses
- Building interactive dashboards with Streamlit/Plotly
- Writing analysis notebooks
- Calculating distributional impacts
- Comparing policy proposals
- Creating visualizations for research
- Publishing policy research
Example Analysis Repositories
crfb-tob-impacts- Policy impact analysesnewsletters- Data-driven newsletters2024-election-dashboard- Policy comparison dashboardsmarginal-child- Specialized policy analysesgivecalc- Charitable giving calculator
Repository Structure
Standard analysis repository structure:
analysis-repo/
├── analysis.ipynb # Main Jupyter notebook
├── app.py # Streamlit app (if applicable)
├── requirements.txt # Python dependencies
├── README.md # Documentation
├── data/ # Data files (if needed)
├── outputs/ # Generated charts, tables
└── .streamlit/ # Streamlit config
└── config.toml
Common Analysis Patterns
Pattern 1: Impact Analysis Across Income Distribution
import pandas as pd
import numpy as np
from policyengine_us import Simulation
# Define reform
reform = {
"gov.irs.credits.ctc.amount.base_amount": {
"2024-01-01.2100-12-31": 5000
}
}
# Analyze across income distribution
incomes = np.linspace(0, 200000, 101)
results = []
for income in incomes:
# Baseline
situation = create_situation(income=income)
sim_baseline = Simulation(situation=situation)
tax_baseline = sim_baseline.calculate("income_tax", 2024)[0]
# Reform
sim_reform = Simulation(situation=situation, reform=reform)
tax_reform = sim_reform.calculate("income_tax", 2024)[0]
results.append({
"income": income,
"tax_baseline": tax_baseline,
"tax_reform": tax_reform,
"tax_change": tax_reform - tax_baseline
})
df = pd.DataFrame(results)
Pattern 2: Household-Level Case Studies
# Define representative households
households = {
"Single, No Children": {
"income": 40000,
"num_children": 0,
"married": False
},
"Single Parent, 2 Children": {
"income": 50000,
"num_children": 2,
"married": False
},
"Married, 2 Children": {
"income": 100000,
"num_children": 2,
"married": True
}
}
# Calculate impacts for each
case_studies = {}
for name, params in households.items():
situation = create_family(**params)
sim_baseline = Simulation(situation=situation)
sim_reform = Simulation(situation=situation, reform=reform)
case_studies[name] = {
"baseline_tax": sim_baseline.calculate("income_tax", 2024)[0],
"reform_tax": sim_reform.calculate("income_tax", 2024)[0],
"ctc_baseline": sim_baseline.calculate("ctc", 2024)[0],
"ctc_reform": sim_reform.calculate("ctc", 2024)[0]
}
case_df = pd.DataFrame(case_studies).T
Pattern 3: State-by-State Comparison
states = ["CA", "NY", "TX", "FL", "PA", "OH", "IL", "MI"]
state_results = []
for state in states:
situation = create_situation(income=75000, state=state)
sim_baseline = Simulation(situation=situation)
sim_reform = Simulation(situation=situation, reform=reform)
state_results.append({
"state": state,
"baseline_net_income": sim_baseline.calculate("household_net_income", 2024)[0],
"reform_net_income": sim_reform.calculate("household_net_income", 2024)[0],
"change": (sim_reform.calculate("household_net_income", 2024)[0] -
sim_baseline.calculate("household_net_income", 2024)[0])
})
state_df = pd.DataFrame(state_results)
Pattern 4: Marginal Analysis (Winners/Losers)
import plotly.graph_objects as go
# Calculate across income range
situation_with_axes = {
# ... setup ...
"axes": [[{
"name": "employment_income",
"count": 1001,
"min": 0,
"max": 200000,
"period": 2024
}]]
}
sim_baseline = Simulation(situation=situation_with_axes)
sim_reform = Simulation(situation=situation_with_axes, reform=reform)
incomes = sim_baseline.calculate("employment_income", 2024)
baseline_net = sim_baseline.calculate("household_net_income", 2024)
reform_net = sim_reform.calculate("household_net_income", 2024)
gains = reform_net - baseline_net
# Identify winners and losers
winners = gains > 0
losers = gains < 0
neutral = gains == 0
print(f"Winners: {winners.sum() / len(gains) * 100:.1f}%")
print(f"Losers: {losers.sum() / len(gains) * 100:.1f}%")
print(f"Neutral: {neutral.sum() / len(gains) * 100:.1f}%")
Visualization Patterns
Standard Plotly Configuration
import plotly.graph_objects as go
# PolicyEngine brand colors
TEAL = "#39C6C0"
BLUE = "#2C6496"
DARK_GRAY = "#616161"
def create_pe_layout(title, xaxis_title, yaxis_title):
"""Create standard PolicyEngine chart layout."""
return go.Layout(
title=title,
xaxis_title=xaxis_title,
yaxis_title=yaxis_title,
font=dict(family="Roboto Serif", size=14),
plot_bgcolor="white",
hovermode="x unified",
xaxis=dict(
showgrid=True,
gridcolor="lightgray",
zeroline=True
),
yaxis=dict(
showgrid=True,
gridcolor="lightgray",
zeroline=True
)
)
# Use in charts
fig = go.Figure(layout=create_pe_layout(
"Tax Impact by Income",
"Income",
"Tax Change"
))
fig.add_trace(go.Scatter(x=incomes, y=tax_change, line=dict(color=TEAL)))
Common Chart Types
1. Line Chart (Impact by Income)
fig = go.Figure()
fig.add_trace(go.Scatter(
x=df.income,
y=df.tax_change,
mode='lines',
name='Tax Change',
line=dict(color=TEAL, width=3)
))
fig.update_layout(
title="Tax Impact by Income Level",
xaxis_title="Income",
yaxis_title="Tax Change ($)",
xaxis_tickformat="$,.0f",
yaxis_tickformat="$,.0f"
)
2. Bar Chart (State Comparison)
fig = go.Figure()
fig.add_trace(go.Bar(
x=state_df.state,
y=state_df.change,
marker_color=TEAL
))
fig.update_layout(
title="Net Income Change by State",
xaxis_title="State",
yaxis_title="Change ($)",
yaxis_tickformat="$,.0f"
)
3. Waterfall Chart (Budget Impact)
fig = go.Figure(go.Waterfall(
x=["Baseline", "Tax Credit", "Phase-out", "Reform"],
y=[baseline_revenue, credit_cost, phaseout_revenue, 0],
measure=["absolute", "relative", "relative", "total"],
connector={"line": {"color": "gray"}}
))
Streamlit Dashboard Patterns
Basic Streamlit Setup
import streamlit as st
from policyengine_us import Simulation
st.set_page_config(page_title="Policy Analysis", layout="wide")
st.title("Policy Impact Calculator")
# User inputs
col1, col2, col3 = st.columns(3)
with col1:
income = st.number_input("Income", value=60000, step=5000)
with col2:
state = st.selectbox("State", ["CA", "NY", "TX", "FL"])
with col3:
num_children = st.number_input("Children", value=0, min_value=0, max_value=10)
# Calculate
if st.button("Calculate"):
situation = create_family(
parent_income=income,
num_children=num_children,
state=state
)
sim_baseline = Simulation(situation=situation)
sim_reform = Simulation(situation=situation, reform=reform)
# Display results
col1, col2, col3 = st.columns(3)
with col1:
st.metric(
"Baseline Tax",
f"${sim_baseline.calculate('income_tax', 2024)[0]:,.0f}"
)
with col2:
st.metric(
"Reform Tax",
f"${sim_reform.calculate('income_tax', 2024)[0]:,.0f}"
)
with col3:
change = (sim_reform.calculate('income_tax', 2024)[0] -
sim_baseline.calculate('income_tax', 2024)[0])
st.metric("Change", f"${change:,.0f}", delta=f"${-change:,.0f}")
Interactive Chart with Streamlit
# Create chart based on user inputs
incomes = np.linspace(0, income_max, 1001)
results = []
for income in incomes:
situation = create_situation(income=income, state=selected_state)
sim = Simulation(situation=situation, reform=reform)
results.append(sim.calculate("household_net_income", 2024)[0])
fig = go.Figure()
fig.add_trace(go.Scatter(x=incomes, y=results, line=dict(color=TEAL)))
st.plotly_chart(fig, use_container_width=True)
Jupyter Notebook Best Practices
Notebook Structure
# Cell 1: Title and Description
"""
# Policy Analysis: [Policy Name]
**Date:** [Date]
**Author:** [Your Name]
## Summary
Brief description of the analysis and key findings.
"""
# Cell 2: Imports
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from policyengine_us import Simulation
# Cell 3: Configuration
YEAR = 2024
STATES = ["CA", "NY", "TX", "FL"]
# Cell 4+: Analysis sections with markdown headers
Export Results
# Save DataFrame
df.to_csv("outputs/impact_analysis.csv", index=False)
# Save Plotly chart
fig.write_html("outputs/chart.html")
fig.write_image("outputs/chart.png", width=1200, height=600)
# Save summary statistics
summary = {
"total_winners": winners.sum(),
"total_losers": losers.sum(),
"avg_gain": gains[winners].mean(),
"avg_loss": gains[losers].mean()
}
pd.DataFrame([summary]).to_csv("outputs/summary.csv", index=False)
Repository-Specific Examples
This skill includes example templates in the examples/ directory:
impact_analysis_template.ipynb- Standard impact analysisdashboard_template.py- Streamlit dashboardstate_comparison.py- State-by-state analysiscase_studies.py- Household case studiesreform_definitions.py- Common reform patterns
Common Pitfalls
Pitfall 1: Not Using Consistent Year
Problem: Mixing 2024 and 2025 calculations
Solution: Define year constant at top:
CURRENT_YEAR = 2024
# Use everywhere
simulation.calculate("income_tax", CURRENT_YEAR)
Pitfall 2: Inefficient Simulations
Problem: Creating new simulation for each income level
Solution: Use axes for efficiency:
# SLOW
for income in incomes:
situation = create_situation(income=income)
sim = Simulation(situation=situation)
results.append(sim.calculate("income_tax", 2024)[0])
# FAST
situation_with_axes = create_situation_with_axes(incomes)
sim = Simulation(situation=situation_with_axes)
results = sim.calculate("income_tax", 2024) # Array of all results
Pitfall 3: Forgetting to Compare Baseline and Reform
Problem: Only showing reform results
Solution: Always show both:
results = {
"baseline": sim_baseline.calculate("income_tax", 2024),
"reform": sim_reform.calculate("income_tax", 2024),
"change": reform - baseline
}
PolicyEngine API Usage
For larger-scale analyses, use the PolicyEngine API:
import requests
def calculate_via_api(situation, reform=None):
"""Calculate using PolicyEngine API."""
url = "https://api.policyengine.org/us/calculate"
payload = {
"household": situation,
"policy_id": reform_id if reform else baseline_policy_id
}
response = requests.post(url, json=payload)
return response.json()
Testing Analysis Code
import pytest
def test_reform_increases_ctc():
"""Test that reform increases CTC as expected."""
situation = create_family(income=50000, num_children=2)
sim_baseline = Simulation(situation=situation)
sim_reform = Simulation(situation=situation, reform=reform)
ctc_baseline = sim_baseline.calculate("ctc", 2024)[0]
ctc_reform = sim_reform.calculate("ctc", 2024)[0]
assert ctc_reform > ctc_baseline, "Reform should increase CTC"
assert ctc_reform == 5000 * 2, "CTC should be $5000 per child"
Documentation Standards
README Template
# [Analysis Name]
## Overview
Brief description of the analysis.
## Key Findings
- Finding 1
- Finding 2
- Finding 3
## Methodology
Explanation of approach and data sources.
## How to Run
\```bash
pip install -r requirements.txt
python app.py # or jupyter notebook analysis.ipynb
\```
## Outputs
- `outputs/chart1.png` - Description
- `outputs/results.csv` - Description
## Contact
PolicyEngine Team - hello@policyengine.org
Additional Resources
- PolicyEngine API Docs: https://policyengine.org/us/api
- Analysis Examples: https://github.com/PolicyEngine/analysis-notebooks
- Streamlit Docs: https://docs.streamlit.io
- Plotly Docs: https://plotly.com/python/