| name | bulktrajblend-trajectory-interpolation |
| title | BulkTrajBlend trajectory interpolation |
| description | Extend scRNA-seq developmental trajectories with BulkTrajBlend by generating intermediate cells from bulk RNA-seq, training beta-VAE and GNN models, and interpolating missing states. |
BulkTrajBlend trajectory interpolation
Overview
Invoke this skill when users need to bridge gaps in single-cell developmental trajectories using matched bulk RNA-seq. It follows `t_bulktrajblend.ipynb`, which shows how BulkTrajBlend deconvolves dentate gyrus bulk samples, identifies overlapping communities with a GNN, and interpolates "interrupted" cell states.
Instructions
- Prepare libraries and inputs
  - Import `omicverse as ov`, `scanpy as sc`, `scvelo as scv`, and helper functions like `from omicverse.utils import mde`; run `ov.plot_set()`.
  - Load the reference scRNA-seq AnnData (`scv.datasets.dentategyrus()`) and raw bulk counts with `ov.utils.read(...)` followed by `ov.bulk.Matrix_ID_mapping(...)` for gene ID harmonisation.
- Configure BulkTrajBlend
  - Instantiate `ov.bulk2single.BulkTrajBlend(bulk_seq=bulk_df, single_seq=adata, bulk_group=['dg_d_1','dg_d_2','dg_d_3'], celltype_key='clusters')`.
  - Explain that `bulk_group` names correspond to raw bulk columns and the method expects unscaled counts.
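A minimal sketch of the input checks implied by this step, assuming `bulk_df` exposes a pandas-style `.columns` attribute. The helper names (`missing_bulk_groups`, `looks_unscaled`, `build_bulktrajblend`) are hypothetical and not part of omicverse; only the `BulkTrajBlend(...)` call is taken from the step itself. The integer test is a heuristic for raw counts, not a guarantee.

```python
def missing_bulk_groups(columns, bulk_group):
    """Return the bulk_group names that are absent from the bulk columns."""
    available = set(columns)
    return [name for name in bulk_group if name not in available]


def looks_unscaled(values):
    """Heuristic: raw counts are non-negative and integer-valued."""
    return all(v >= 0 and float(v).is_integer() for v in values)


def build_bulktrajblend(bulk_df, adata, bulk_group, celltype_key="clusters"):
    """Check the sample names, then construct BulkTrajBlend (hypothetical wrapper)."""
    missing = missing_bulk_groups(bulk_df.columns, bulk_group)
    if missing:
        raise ValueError(f"bulk_group names not found in bulk columns: {missing}")
    # Imported lazily so the pure-Python checks above work without omicverse.
    import omicverse as ov
    return ov.bulk2single.BulkTrajBlend(
        bulk_seq=bulk_df,
        single_seq=adata,
        bulk_group=bulk_group,
        celltype_key=celltype_key,
    )
```

With a real DataFrame, `looks_unscaled(bulk_df[bulk_group].to_numpy().ravel())` would flag matrices that were normalised or log-transformed before being passed in.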
- Set beta-VAE expectations
  - Call `bulktb.vae_configure(cell_target_num=100)` (or pass a dictionary) to define expected cell counts per cluster. Mention that omitting the argument triggers TAPE-based estimation.
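The dictionary form can be sketched as follows. This assumes the dictionary maps each cluster label to its target cell count, which is an assumption about the expected format rather than something the step states. `cell_target_dict` is a hypothetical helper, not an omicverse function.

```python
def cell_target_dict(cell_types, default=100, overrides=None):
    """Build {cluster label: target cell count} with optional per-cluster overrides."""
    targets = {name: default for name in sorted(set(cell_types))}
    for name, count in (overrides or {}).items():
        if name not in targets:
            raise KeyError(f"unknown cell type: {name!r}")
        targets[name] = int(count)
    return targets


# Possible usage (assumed dictionary format, see the note above):
# targets = cell_target_dict(adata.obs["clusters"], overrides={"OPC": 200})
# bulktb.vae_configure(cell_target_num=targets)
```

Raising on unknown labels catches typos in cluster names before a long training run starts.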
- Train or load the beta-VAE
  - Use `bulktb.vae_train(batch_size=512, learning_rate=1e-4, hidden_size=256, epoch_num=3500, vae_save_dir='...', vae_save_name='dg_btb_vae', generate_save_dir='...', generate_save_name='dg_btb')`.
  - Highlight resuming with `bulktb.vae_load('.../dg_btb_vae.pth')` and the need to regenerate cells with consistent random seeds for reproducibility.
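A hedged sketch of the train-or-resume pattern. It assumes the checkpoint is written to `<vae_save_dir>/<vae_save_name>.pth`, which is inferred from the `dg_btb_vae.pth` example, and that seeding Python, NumPy, and PyTorch is enough for repeatable generation, which is not verified here. `set_seeds` and `train_or_load_vae` are hypothetical helpers, and `save_dir` is a placeholder for a directory of your choosing.

```python
import os
import random


def set_seeds(seed=0):
    """Seed the RNGs that are available; NumPy and PyTorch are optional."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


def train_or_load_vae(bulktb, save_dir, save_name="dg_btb_vae",
                      generate_name="dg_btb", **train_kwargs):
    """Load an existing checkpoint if present, otherwise train from scratch."""
    # Assumed checkpoint location: <save_dir>/<save_name>.pth
    checkpoint = os.path.join(save_dir, save_name + ".pth")
    if os.path.exists(checkpoint):
        bulktb.vae_load(checkpoint)
        return "loaded"
    # Defaults mirror the hyperparameters listed in the step above.
    params = dict(batch_size=512, learning_rate=1e-4, hidden_size=256, epoch_num=3500)
    params.update(train_kwargs)
    bulktb.vae_train(
        vae_save_dir=save_dir,
        vae_save_name=save_name,
        generate_save_dir=save_dir,
        generate_save_name=generate_name,
        **params,
    )
    return "trained"
```

Calling `set_seeds(...)` with the same value before both training and generation is the simplest way to keep the two runs comparable.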
- Generate synthetic cells
  - Produce filtered AnnData via `bulktb.vae_generate(leiden_size=25)` and inspect compositions with `ov.bulk2single.bulk2single_plot_cellprop(...)`.
  - Save outputs to disk for reuse (`adata.write_h5ad`).
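The generate-and-save step might look like the sketch below. It assumes the generated AnnData carries the cell type labels in `obs[celltype_key]`. `generate_and_save` and `out_path` are hypothetical names. The per-cluster tally is a quick text check on whether rare populations survived the `leiden_size` filter.

```python
from collections import Counter


def generate_and_save(bulktb, out_path, leiden_size=25, celltype_key="clusters"):
    """Generate filtered synthetic cells, tally them per cluster, and write to disk."""
    generated = bulktb.vae_generate(leiden_size=leiden_size)
    # Assumption: the generated object keeps labels under obs[celltype_key].
    counts = dict(Counter(generated.obs[celltype_key]))
    generated.write_h5ad(out_path)
    return generated, counts
```

If a cluster of interest is missing from `counts`, rerunning with a smaller `leiden_size` is the first thing to try.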
- Configure and train the GNN
  - Call `bulktb.gnn_configure(max_epochs=2000, use_rep='X', neighbor_rep='X_pca', gpu=0, ...)` to set hyperparameters.
  - Train using `bulktb.gnn_train()`; reload checkpoints with `bulktb.gnn_load('save_model/gnn.pth')`.
  - Generate overlapping community assignments through `bulktb.gnn_generate()`.
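The three GNN calls can be chained as in this sketch. `run_gnn` is a hypothetical wrapper. It passes only the `gnn_configure` arguments shown above and leaves the elided ones at their defaults. It also assumes `gnn_generate()` returns the community assignments.

```python
import os


def run_gnn(bulktb, checkpoint=None, max_epochs=2000, gpu=0):
    """Configure the GNN, then load a checkpoint if one exists or train otherwise."""
    bulktb.gnn_configure(
        max_epochs=max_epochs,
        use_rep="X",
        neighbor_rep="X_pca",
        gpu=gpu,
    )
    if checkpoint is not None and os.path.exists(checkpoint):
        bulktb.gnn_load(checkpoint)
    else:
        bulktb.gnn_train()
    # Assumed to return the overlapping community assignments.
    return bulktb.gnn_generate()
```

For example, `run_gnn(bulktb, checkpoint='save_model/gnn.pth')` reuses saved weights when the file is present and falls back to training when it is not.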
- Visualise community structure
  - Create MDE embeddings: `bulktb.nocd_obj.adata.obsm['X_mde'] = mde(bulktb.nocd_obj.adata.obsm['X_pca'])`.
  - Plot clusters vs. discovered communities using `sc.pl.embedding(..., color=['clusters','nocd_n'], palette=ov.utils.pyomic_palette())` and filtered subsets excluding synthetic labels with hyphens.
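Excluding hyphenated labels before plotting can be sketched with a plain boolean mask. `non_hyphenated_mask` is a hypothetical helper. Applying it to the `nocd_n` column is an assumption based on the plotting call above.

```python
def non_hyphenated_mask(labels):
    """Return one boolean per label: True when the label contains no hyphen."""
    return ["-" not in str(label) for label in labels]


# Possible usage (column name assumed from the plotting call above):
# import numpy as np
# adata = bulktb.nocd_obj.adata
# subset = adata[np.array(non_hyphenated_mask(adata.obs["nocd_n"]))]
```

The mask keeps the original cell order, so it can be reused to subset any per-cell array alongside the AnnData object.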
- Interpolate missing states
  - Run `bulktb.interpolation('OPC')` (replace with target lineage) to synthesise continuity, then preprocess the interpolated AnnData (HVG selection, scaling, PCA).
  - Compute embeddings with `mde`, visualise with `ov.utils.embedding`, and compare to the original atlas.
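A sketch of interpolation followed by preprocessing, using standard scanpy calls for the HVG, scaling, and PCA steps. Whether the notebook uses these exact functions or parameters is an assumption. The default HVG flavour also assumes log-normalised input. `scanpy_preprocess` and `interpolate_state` are hypothetical helpers. The preprocessing step is pluggable so it can be swapped for a different recipe.

```python
def scanpy_preprocess(adata, n_comps=50):
    """HVG selection, scaling, and PCA with scanpy (assumed recipe)."""
    import scanpy as sc  # imported lazily; only needed when this recipe is used
    sc.pp.highly_variable_genes(adata)
    adata = adata[:, adata.var.highly_variable].copy()
    sc.pp.scale(adata, max_value=10)
    sc.pp.pca(adata, n_comps=n_comps)
    return adata


def interpolate_state(bulktb, celltype="OPC", preprocess=scanpy_preprocess):
    """Interpolate the target cell type, then apply the chosen preprocessing."""
    interpolated = bulktb.interpolation(celltype)
    return preprocess(interpolated)
```

The returned object should then hold `obsm['X_pca']`, which is the input the `mde` embedding step uses elsewhere in this workflow.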
- Analyse trajectories
  - Initialise `ov.single.pyVIA` on both original and interpolated data to derive pseudotime, followed by `get_pseudotime`, `sc.pp.neighbors`, `ov.utils.cal_paga`, and `ov.utils.plot_paga` for topology validation.
- Troubleshooting tips
  - If the VAE collapses (high reconstruction loss), lower `learning_rate` or reduce `hidden_size`.
  - Ensure the same generated dataset is used before calling `gnn_train`; regenerating cells changes the graph and can break checkpoint loading.
  - Sparse clusters may need adjusted `cell_target_num` thresholds or a smaller `leiden_size` filter to retain rare populations.
Examples
- "Train BulkTrajBlend on dentate gyrus bulk samples, then interpolate missing OPC states in the trajectory."
- "Load saved beta-VAE and GNN weights to regenerate overlapping communities and plot cluster vs. nocd labels."
- "Run VIA on interpolated cells and compare PAGA graphs with the original scRNA-seq trajectory."
References
- Tutorial notebook: `t_bulktrajblend.ipynb`
- Example datasets and checkpoints: `omicverse_guide/docs/Tutorials-bulk2single/data/`
- Quick copy/paste commands: `reference.md`