| name | stan-coding |
| description | Best practices for writing efficient, clean Stan programs |
Stan Coding Guidelines
Use this skill when writing or modifying Stan programs to ensure clean, efficient code.
Program Structure
Use canonical block order: functions, data, transformed data, parameters, transformed parameters, model, generated quantities.
Follow Stan style:
- Two-space indents, no tabs, ≤80 character lines
- Opening braces at end of line:
for (n in 1:N) { - Spaces around operators and after commas
- Variable names: lowercase with underscores (
sigma_y,mu_group) - Dimension constants: single uppercase letters (
N,K,J) - Declare locals close to use; scalars inside loops, reused containers outside
Types and Containers
Use appropriate types:
- Linear algebra:
matrix,vector,row_vectorwith matrix operations (x * beta) - Indexing/containers:
array[N] real y(not legacyreal y[N]) - Repeated row access:
array[M] row_vector[N] xovermatrix[M, N] - Heterogeneous returns:
tuple(...)for multiple values - Sum-to-zero:
sum_to_zero_vector,sum_to_zero_matrixinstead of manual constraints
Memory layout: matrices are column-major, arrays are row-major.
Distributions and Vectorization
Always use log form:
- Write
y ~ normal(mu, sigma)ortarget += normal_lpdf(y | mu, sigma) - Vectorize:
y ~ normal(mu, sigma)for arrays, not loops - Use GLM functions:
bernoulli_logit_glm,poisson_log_glm,normal_id_glm - Precompute shared expressions: compute
mu = X * betaonce, reuse - Finite mixtures: use
log_sum_expon log scale
Parameterization
Use constrained types over manual checks:
<lower=0>,<upper=...>,ordered,positive_ordered,simplex,unit_vector- Covariance (K≥3):
cholesky_factor_corr[K] L_Omegawithmulti_normal_cholesky - Sum-to-zero: use built-in types, not "last element = minus sum"
For custom transforms, use built-in *_constrain, *_unconstrain, *_jacobian functions.
Parallelization
For large-N models with independent terms, use reduce_sum:
- Write partial sum function that takes data slice and returns log-density contribution
- Keep partial sum vectorized internally
- No side effects (no printing, no mutation)
Functions
Modularize complex logic in functions block:
- Reused operations, complex math, custom likelihoods
- Signature: data arguments first, then parameters, then tuning constants
- Use
tuplereturns for multiple heterogeneous outputs
Preventing Crashes
Both compilation and sampling can crash or OOM.
Defensive Stan patterns:
- Always use tight bounds:
int<lower=1, upper=K> id[N] - Guard math: check parameters before
log,sqrt, division - Add explicit bounds for dispersion:
real<lower=0.01> phi(never exactly 0)
Execution:
- Wrap
CmdStanModel()andmodel.sample()in try-except - Probe with short runs before full sampling
On crash/OOM:
- Reduce
parallel_chains(4 → 2 → 1) - Reduce
max_treedepth(10 → 8) - Subsample data or simplify model
ArviZ Integration
Design Stan programs for downstream ArviZ workflow:
Generated quantities:
- Always include pointwise log-likelihood:
vector[N] log_lik- required for model comparison and downstream workflow - Always include posterior predictive draws:
vector[N] y_rep- required for all predictive checks - For multiple observed variables, use one vector per variable:
log_lik_y1,log_lik_y2 - This will incur modest overhead, but might be worth workflow simplicity
Transformed parameters:
- Put reusable intermediate quantities here (e.g.,
vector[N] mu = alpha + X * beta) - Avoids recomputation in Python and makes them available in posterior samples
Extending without refitting:
- To add new derived quantities, use
generate_quantitiesmode with original posterior draws - Write new Stan file with same data/parameters/transformed parameters but extended generated quantities
- Call
model.generate_quantities(data=data, mcmc_sample=fit)- orders of magnitude faster than refitting
Save and cache:
- Convert to InferenceData:
az.from_cmdstanpy(fit, log_likelihood="log_lik", posterior_predictive=["y_rep"]) - Save as NetCDF:
idata.to_netcdf("posterior.nc")- makes all downstream analysis instant - Use consistent coords/dims for all models in the workflow
References
If stuck on Stan patterns or ArviZ usage, search these resources:
- Stan case studies: https://mc-stan.org/learn-stan/case-studies.html
- ArviZ API documentation: https://python.arviz.org/en/latest/api/index.html
Use WebSearch or WebFetch to find specific examples.