| name | spatial-transcriptomics-tutorials-with-omicverse |
| title | Spatial transcriptomics tutorials with omicverse |
| description | Guide users through omicverse's spatial transcriptomics tutorials covering preprocessing, deconvolution, and downstream modelling workflows across Visium, Visium HD, Stereo-seq, and Slide-seq datasets. |
Spatial transcriptomics tutorials with omicverse
Overview
Use this skill to navigate the spatial analysis tutorials located under `Tutorials-space/`. The notebooks span preprocessing utilities (`t_crop_rotate.ipynb`, `t_cellpose.ipynb`), deconvolution frameworks (`t_decov.ipynb`, `t_starfysh.ipynb`), and downstream spatial modelling or integration tasks (`t_cluster_space.ipynb`, `t_staligner.ipynb`, `t_spaceflow.ipynb`, `t_commot_flowsig.ipynb`, `t_gaston.ipynb`, `t_slat.ipynb`, `t_stt.ipynb`). Follow the staged instructions below to match the "Preprocess", "Deconvolution", and "Downstream" groupings presented in the notebooks.
Instructions
Preprocess
- Load spatial slides and manipulate coordinates
  - Import `omicverse as ov` and `scanpy as sc`, and enable plotting defaults with `ov.plot_set()` or `ov.plot_set(font_path='Arial')`. (`t_crop_rotate.ipynb`)
  - Fetch public Visium data via `sc.datasets.visium_sge(...)`, inspect `adata.obsm['spatial']`, and respect `uns['spatial'][library_id]['scalefactors']` when rescaling coordinates for high-resolution overlays.
  - Apply region selection and alignment helpers: `ov.space.crop_space_visium(...)` for bounding-box crops, `ov.space.rotate_space_visium(...)` followed by `ov.space.map_spatial_auto(..., method='phase')`, and refine offsets with `ov.space.map_spatial_manual(...)` before plotting using `sc.pl.embedding(..., basis='spatial')`.
- Segment Visium HD tiles into cells
  - Organise Visium HD outputs (binned parquet counts, `.btf` histology) and load them through `ov.space.read_visium_10x(path, source_image_path=...)`. (`t_cellpose.ipynb`)
  - Filter sparse bins (`ov.pp.filter_genes(..., min_cells=3)` and `ov.pp.filter_cells(..., min_counts=1)`) prior to segmentation.
  - Run nucleus/cell segmentation variants: `ov.space.visium_10x_hd_cellpose_he(...)` for H&E, `ov.space.visium_10x_hd_cellpose_expand(...)` to grow labels across neighbouring bins, and `ov.space.visium_10x_hd_cellpose_gex(...)` for gene-expression-driven seeds. Harmonise labels with `ov.space.salvage_secondary_labels(...)` and aggregate to cell-level AnnData using `ov.space.bin2cell(..., labels_key='labels_joint')`.
- Initial QC for downstream tasks
  - For Visium/DLPFC re-analyses, compute QC metrics (`sc.pp.calculate_qc_metrics(adata, inplace=True)`) and persist intermediate AnnData snapshots (`adata.write('data/cluster_svg.h5ad', compression='gzip')`) for reuse across tutorials. (`t_cluster_space.ipynb`)
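
The coordinate handling and bin aggregation described in the Preprocess steps can be illustrated without omicverse. The sketch below is a plain-Python approximation of the underlying ideas, not the implementation behind `ov.space.crop_space_visium`, `ov.space.rotate_space_visium`, or `ov.space.bin2cell`. The function names, the `0.1` scale factor, and the convention that label `0` marks an unassigned bin are assumptions made for illustration.

```python
import math
from collections import defaultdict


def to_pixel_coords(coords, scalefactor):
    """Scale full-resolution spot coordinates into image pixel space."""
    return [(x * scalefactor, y * scalefactor) for x, y in coords]


def crop_bbox(coords, xmin, xmax, ymin, ymax):
    """Return the indices of spots inside an axis-aligned bounding box."""
    return [i for i, (x, y) in enumerate(coords)
            if xmin <= x <= xmax and ymin <= y <= ymax]


def rotate(coords, angle_deg, center):
    """Rotate points about `center`; counter-clockwise on mathematical axes
    (appears clockwise when the y axis points down, as in images)."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated_points = []
    for x, y in coords:
        dx, dy = x - cx, y - cy
        rotated_points.append((cx + dx * cos_t - dy * sin_t,
                               cy + dx * sin_t + dy * cos_t))
    return rotated_points


def aggregate_bins(bin_counts, bin_labels):
    """Sum per-bin gene counts over all bins that share a cell label."""
    cells = defaultdict(lambda: defaultdict(int))
    for counts, label in zip(bin_counts, bin_labels):
        if label == 0:  # assumption: 0 marks bins not assigned to any cell
            continue
        for gene, n in counts.items():
            cells[label][gene] += n
    return {label: dict(genes) for label, genes in cells.items()}


spots = [(1000.0, 2000.0), (1500.0, 2500.0), (4000.0, 4000.0)]
hires = to_pixel_coords(spots, scalefactor=0.1)  # hypothetical scale factor
kept = crop_bbox(spots, 0, 2000, 0, 3000)        # indices of retained spots
rotated = rotate([(1.0, 0.0)], 90, center=(0.0, 0.0))

bin_counts = [{"GeneA": 2, "GeneB": 1}, {"GeneA": 1}, {"GeneA": 5}, {"GeneB": 4}]
bin_labels = [1, 1, 2, 0]
cells = aggregate_bins(bin_counts, bin_labels)
print(kept, cells)
```

In a real Visium object the scale factor would come from `uns['spatial'][library_id]['scalefactors']` rather than a hard-coded value.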
Deconvolution
- Configure single-cell references and spatial targets
  - Load scRNA-seq references (`adata_sc = ov.read('data/sc.h5ad')`) with harmonised gene IDs and spatial slides (`adata_sp = sc.datasets.visium_sge(...)`). (`t_decov.ipynb`)
  - Instantiate the unified wrapper `ov.space.Deconvolution(...)`, passing shared keys like `celltype_key`, `adata_sc`, and `adata_sp`.
- Execute Tangram and cell2location pipelines
  - Call `decov_obj.preprocess_sc(...)` / `decov_obj.preprocess_sp(...)` to align matrices, then run `decov_obj.deconvolution(method='tangram', ...)` and persist outputs with `ov.utils.save(...)` plus `.write(...)` hooks for AnnData members.
  - For cell2location, reinitialise `ov.space.Deconvolution(..., method='cell2location')`, train (`decov_obj.deconvolution(max_epochs=...)`), monitor via `decov_obj.mod_sc.plot_history(...)`, and store models (`decov_obj.save_model(...)`).
  - Visualise inferred proportions using `ov.space.plot_cell2location(...)`, `sc.pl.spatial(..., color=list_of_celltypes)`, and ROI-focused pie charts after cropping (`ov.space.crop_space_visium(...)`).
- Run Starfysh archetypal deconvolution
  - Import Starfysh utilities (`from omicverse.external.starfysh import AA, utils, plot_utils, post_analysis`) and prepare expression counts plus optional signature sets. (`t_starfysh.ipynb`)
  - Identify anchor spots with `utils.prepare_data(...)`, optionally infer archetypes via `AA.ArchetypalAnalysis(...)`, and refine signatures using `utils.refine_anchors(...)`.
  - Train Starfysh models (`utils.run_starfysh(poe=False, ...)`, or `poe=True` with histology) across multiple restarts, then parse outputs through `post_analysis.load_model(...)`, `plot_utils.pl_spatial_inf_feature(...)`, and `cell2proportion(...)` for per-cell-type maps.
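
Deconvolution depends on the reference and the spatial slide sharing gene identifiers. The sketch below shows one way to check the overlap before any preprocessing call. It is an illustration in plain Python, not part of omicverse; `strip_version`, `shared_genes`, and the `min_overlap` threshold are hypothetical names and values chosen for the example.

```python
def strip_version(gene_id):
    """Drop a trailing version suffix from an Ensembl-style ID (ENSG....17)."""
    if gene_id.startswith("ENS") and "." in gene_id:
        return gene_id.split(".", 1)[0]
    return gene_id


def shared_genes(ref_genes, spatial_genes, min_overlap=100):
    """Return genes present in both lists, in reference order.

    Raises ValueError when fewer than `min_overlap` genes are shared, which
    usually signals mixed identifier types (Ensembl IDs vs gene symbols).
    """
    ref = [strip_version(g) for g in ref_genes]
    spatial = {strip_version(g) for g in spatial_genes}
    # dict.fromkeys removes duplicates while preserving order
    shared = [g for g in dict.fromkeys(ref) if g in spatial]
    if len(shared) < min_overlap:
        raise ValueError(
            f"only {len(shared)} shared genes; check identifier conventions"
        )
    return shared


reference = ["ENSG00000141510.17", "ENSG00000012048", "CD3E"]
spatial = ["ENSG00000012048.23", "ENSG00000141510", "MS4A1"]
common = shared_genes(reference, spatial, min_overlap=1)
print(common)
```

With AnnData objects, the two input lists would typically be `adata_sc.var_names` and `adata_sp.var_names`, and both objects would be subset to the returned genes.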
Downstream
- Spatial clustering and denoising
  - Generate embeddings using omicverse wrappers: `ov.utils.cluster(..., use_rep='graphst|original|X_pca', method='mclust')`, `ov.space.merge_cluster(...)`, and evaluate ARI (`adjusted_rand_score(...)`). (`t_cluster_space.ipynb`)
  - Explore algorithm-specific toggles: GraphST and BINARY require precalculated latent spaces, STAGATE is trained before clustering on its embedding (`ov.utils.cluster(..., use_rep='STAGATE', ...)`), and CAST targets multi-slice, single-cell-resolution data.
- Integrate multi-slice datasets
  - Concatenate Stereo-seq/Slide-seqV2 batches (`ad.concat(Batch_list, label='slice_name', keys=section_ids)`) and initialise `ov.space.pySTAligner(...)`. (`t_staligner.ipynb`)
  - Train with `STAligner_obj.train_STAligner_subgraph(...)`, call `STAligner_obj.train()`, and retrieve latent embeddings via `STAligner_obj.predicted()` before clustering (`sc.pp.neighbors(..., use_rep='STAligner')`, `ov.utils.cluster(...)`).
- Model spatial gradients and trajectories
  - For pseudo-spatial maps, build `sf_obj = ov.space.pySpaceFlow(adata)` and train using `sf_obj.train(spatial_regularization_strength=0.1, ...)`, then compute `sf_obj.cal_pSM(...)` to populate `adata.obs['pSM_spaceflow']`. (`t_spaceflow.ipynb`)
  - Analyse transition dynamics with `STT_obj = ov.space.STT(adata, spatial_loc='xy_loc', region='Region')`, followed by `STT_obj.train(...)`, `STT_obj.stage_estimate()`, and downstream visualisations (`STT_obj.plot_pathway(...)`, `STT_obj.infer_lineage(...)`). (`t_stt.ipynb`)
- Infer communication and flow networks
  - Pull ligand–receptor resources via `ov.external.commot.pp.ligand_receptor_database(species='human')`, filter with `filter_lr_database(...)`, and compute signaling using `ov.external.commot.tl.spatial_communication(...)`. (`t_commot_flowsig.ipynb`)
  - Construct FlowSig inputs (`adata.layers['normalized'] = adata.X.copy()`, `ov.external.flowsig.tl.construct_intercellular_flow_network(...)`), retain spatially informative modules (Moran’s I filtering), and validate edges through bootstrapping thresholds (`edge_threshold = 0.7`).
- Extract structural layers and align developmental slices
  - Train GASTON with `gas_obj = ov.space.GASTON(adata)`, rescale GLM-PC matrices via `gas_obj.load_rescale(A)`, and infer iso-depths using `gas_obj.cal_iso_depth(n_layers)`. Visualise with `gas_obj.plot_isodepth(...)`, `gas_obj.plot_clusters_restrict(...)`, and probe continuous/discontinuous gene lists (`gas_obj.cont_genes_layer`). (`t_gaston.ipynb`)
  - For SLAT, construct spatial graphs (`Cal_Spatial_Net(adata1, k_cutoff=20)`), run alignment (`run_SLAT(...)`, `spatial_match(...)`), and examine correspondences through Sankey diagrams (`Sankey_multi(...)`) and lineage-focused subsetting (`cal_matching_cell(...)`). (`t_slat.ipynb`)
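
The clustering step above evaluates results with the adjusted Rand index (ARI). The sketch below computes it from the standard pair-counting formula using only the standard library, so the metric can be inspected without scikit-learn. It is an illustrative reimplementation, and the function name `adjusted_rand_index` and the example layer labels are assumptions; in the tutorials `adjusted_rand_score(...)` is the call actually referenced.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: labelings agree trivially

    # Number of item pairs that co-occur in each contingency cell / cluster
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical manual annotation vs. predicted clusters for six spots
annotation = ["L1", "L1", "L2", "L2", "WM", "WM"]
clusters = [0, 0, 1, 1, 2, 1]
score = adjusted_rand_index(annotation, clusters)
print(round(score, 3))
```

Because ARI only compares co-membership of pairs, cluster names do not need to match the annotation names.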
Dependencies
- Core: `omicverse`, `scanpy`, `anndata`, `numpy`, `matplotlib`, `squidpy` (deconvolution + QC), `networkx` (FlowSig graphs).
- Segmentation: `cellpose`, `stardist`, `opencv-python`/`tifffile`, optional GPU-enabled PyTorch for acceleration. (`t_cellpose.ipynb`)
- Deconvolution: `tangram`, `cell2location`, `pytorch-lightning`, `pandas`, `h5py`, plus optional GPU/CUDA stacks; Starfysh additionally needs `torch`, `scikit-learn`, and curated signature CSVs. (`t_decov.ipynb`, `t_starfysh.ipynb`)
- Downstream modelling: `scikit-learn` (clustering, KMeans, ARI), `gseapy==1.0.4` for STT enrichment, `commot`, `flowsig`, `torch`-backed modules (STAligner, SpaceFlow, GASTON, SLAT), plus HTML exporters (Plotly) for Sankey plots.
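
Since most of these packages are optional, it can help to check which ones are importable before opening a notebook. The sketch below uses the standard library's `importlib.util.find_spec` for that. The grouping and the import names (for example `cv2` for `opencv-python` and `sklearn` for `scikit-learn`) are assumptions about a typical installation and may need adjusting; `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed top-level import names for a subset of the packages listed above
OPTIONAL_MODULES = {
    "segmentation": ["cellpose", "stardist", "cv2", "tifffile"],
    "deconvolution": ["tangram", "cell2location", "torch"],
    "downstream": ["sklearn", "gseapy", "commot", "flowsig"],
}


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


for stage, names in OPTIONAL_MODULES.items():
    missing = missing_modules(names)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"{stage}: {status}")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as `torch`.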
Critical functions and artefacts to surface quickly
- Spatial preprocessing: `ov.space.crop_space_visium`, `ov.space.rotate_space_visium`, `ov.space.map_spatial_auto`, `ov.space.map_spatial_manual`, `ov.space.bin2cell`.
- Deconvolution containers: `ov.space.Deconvolution.preprocess_sc`, `.preprocess_sp`, `.deconvolution`, `.adata_cell2location`, `.adata_impute`.
- Archetypal/Starfysh: `AA.ArchetypalAnalysis`, `utils.refine_anchors`, `utils.run_starfysh`, `plot_utils.pl_spatial_inf_feature`.
- Clustering/integration: `ov.utils.cluster`, `ov.space.merge_cluster`, `ov.space.pySTAligner`, `ov.space.pySpaceFlow`, `ov.space.STT`, `ov.space.GASTON`, `Cal_Spatial_Net`, `run_SLAT`, `Sankey_multi`.
- Communication: `ov.external.commot.pp.ligand_receptor_database`, `ov.external.commot.tl.spatial_communication`, `ov.external.flowsig.tl.construct_intercellular_flow_network`.
Troubleshooting
- Coordinate mismatches after rotation/cropping: ensure scalefactors are applied when plotting and cast `adata.obsm['spatial']` to `float64` before running `map_spatial_auto`. (`t_crop_rotate.ipynb`)
- Cellpose runtime errors: verify `.btf` image paths, memory-map large TIFFs via `backend='tifffile'`, and adjust `mpp` plus `buffer` for dense tissues; GPU runs require matching CUDA/PyTorch builds. (`t_cellpose.ipynb`)
- Gene ID overlap failures in Tangram/cell2location: harmonise identifiers (ENSEMBL vs gene symbols) and drop non-overlapping genes before `decov_obj.preprocess_*`. (`t_decov.ipynb`)
- mclust errors in spatial clustering: install `rpy2` and the R `mclust` package, or switch to the pure Python `method='mclust'` fallback when R bindings are unavailable. (`t_cluster_space.ipynb`)
- STAligner/SpaceFlow convergence: confirm `adata.obsm['spatial']` exists and scale coordinates; tune learning rates/regularisation strength when embeddings collapse to a point. (`t_staligner.ipynb`, `t_spaceflow.ipynb`)
- FlowSig network sparsity: build spatial graphs prior to Moran’s I filtering and raise `edge_threshold` or increase bootstraps to stabilise edges. (`t_commot_flowsig.ipynb`)
- STT pathway downloads: `gseapy` lookups need network access; cache gene sets locally and reuse via `ov.utils.geneset_prepare(...)` to avoid repeated requests. (`t_stt.ipynb`)
- GASTON output directories: provide writable `out_dir` paths and account for PyTorch nondeterminism when comparing replicate runs. (`t_gaston.ipynb`)
- SLAT alignment quality: regenerate spatial graphs with appropriate `k_cutoff` and inspect `low_quality_index` flags before trusting downstream lineage analyses. (`t_slat.ipynb`)
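
When diagnosing FlowSig sparsity, it can help to see what the two filters measure. The sketch below computes Moran's I on a small neighbour graph with binary weights and keeps edges whose bootstrap frequency reaches a threshold. It is a conceptual illustration only and does not reproduce FlowSig's internals; `morans_i`, `stable_edges`, the toy graph, and the use of `>=` at the threshold are assumptions.

```python
from collections import Counter


def morans_i(values, neighbours):
    """Moran's I with binary weights; `neighbours` maps index -> neighbour indices."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    weight_sum = sum(len(nbrs) for nbrs in neighbours.values())
    if denom == 0 or weight_sum == 0:
        return 0.0  # constant signal or empty graph: no autocorrelation to report
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbours.items() for j in nbrs)
    return (n / weight_sum) * (cross / denom)


def stable_edges(bootstrap_edge_sets, edge_threshold=0.7):
    """Keep edges found in at least `edge_threshold` of the bootstrap runs."""
    n_boot = len(bootstrap_edge_sets)
    counts = Counter(edge for edges in bootstrap_edge_sets for edge in set(edges))
    return sorted(e for e, c in counts.items() if c / n_boot >= edge_threshold)


# Four spots along a line, each linked to its immediate neighbours
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain)          # spatially structured
alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain)   # anti-correlated

# Hypothetical module-to-module edges recovered in ten bootstrap runs
runs = [[("GEM1", "GEM2"), ("GEM2", "GEM3")]] * 5 + [[("GEM1", "GEM2")]] * 3 + [[]] * 2
kept_edges = stable_edges(runs, edge_threshold=0.7)
print(smooth, alternating, kept_edges)
```

A module with Moran's I near zero carries little spatial structure, and an edge seen in only half of the bootstraps would be dropped at the `0.7` threshold used in the tutorial.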
Examples
- "Crop, rotate, and manually re-align Visium coordinates before running Visium HD cell segmentation, then aggregate bins into cell-level AnnData."
- "Execute Tangram and cell2location through `ov.space.Deconvolution`, save trained models, and plot lymph node cell-type proportions."
- "Train STAligner and SpaceFlow on DLPFC slices, infer communication networks with COMMOT+FlowSig, and visualise iso-depth layers via GASTON."
References
- Tutorials: `Tutorials-space/`
- Notebook index: `t_crop_rotate.ipynb`, `t_cellpose.ipynb`, `t_cluster_space.ipynb`, `t_decov.ipynb`, `t_starfysh.ipynb`, `t_staligner.ipynb`, `t_spaceflow.ipynb`, `t_commot_flowsig.ipynb`, `t_gaston.ipynb`, `t_slat.ipynb`, `t_stt.ipynb`
- Quick copy/paste commands: `reference.md`