| name | monocle3-truly-complete |
| description | Monocle3 单细胞轨迹分析工具包 - 100%覆盖文档(18个文件:完整教程+API+轨迹推断) |
Monocle3-Truly-Complete Skill
Comprehensive assistance with Monocle 3 for single-cell trajectory analysis, including co-embedding, projection, and advanced visualization techniques.
When to Use This Skill
This skill should be triggered when:
Data Analysis & Processing
- Loading and preprocessing single-cell data - Working with CellDataSet objects, UMI filtering, size factor estimation
- Co-embedding multiple datasets - Combining reference and query datasets for comparative analysis
- Projecting query data onto reference - Using transform models to map new data into existing reference space
- Cell type label transfer - Transferring annotations from reference to query cells using nearest neighbor indexing
Installation & Setup
- Installing Monocle 3 - Setting up R environment, Bioconductor dependencies, GitHub installation
- Troubleshooting installation issues - Resolving gdal, Xcode, gfortran, or reticulate errors
- Testing installation - Verifying that Monocle 3 is properly installed and functional
Visualization & Analysis
- Creating trajectory plots - Generating 2D/3D UMAP visualizations with cell type annotations
- Comparing datasets - Visualizing combined reference and query datasets
- Interactive plotting - Working with plotly for 3D trajectory visualizations
Quick Reference
Essential Code Examples
Example 1: Basic Installation
# Install Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install(version = "3.21")
# Install Monocle 3
devtools::install_github('cole-trapnell-lab/monocle3')
Example 2: Loading Reference and Query Datasets
library(monocle3)
library(Matrix)
# Load reference dataset
cds_ref <- new_cell_data_set(matrix_ref,
cell_metadata = cell_ann_ref,
gene_metadata = gene_ann_ref)
# Load query dataset
cds_qry <- new_cell_data_set(matrix_qry,
cell_metadata = cell_ann_qry,
gene_metadata = gene_ann_qry)
Example 3: Gene Filtering and UMI Cutoffs
# Find shared genes
genes_shared <- intersect(row.names(cds_ref), row.names(cds_qry))
# Keep only shared genes
cds_ref <- cds_ref[genes_shared,]
cds_qry <- cds_qry[genes_shared,]
# Apply UMI cutoffs (example: 1000)
cds_ref <- cds_ref[, colData(cds_ref)[['Total_mRNAs']] >= 1000]
cds_qry <- cds_qry[, colData(cds_qry)[['n.umi']] >= 1000]
Example 4: Processing Reference Dataset
# Estimate size factors
cds_ref <- estimate_size_factors(cds_ref)
cds_qry <- estimate_size_factors(cds_qry)
# Process reference with PCA and UMAP
cds_ref <- preprocess_cds(cds_ref, num_dim=100)
cds_ref <- reduce_dimension(cds_ref, build_nn_index=TRUE)
# Save transform models for projection
save_transform_models(cds_ref, 'cds_ref_test_models')
Example 5: Project Query Data into Reference Space
# Load reference transform models
cds_qry <- load_transform_models(cds_qry, 'cds_ref_test_models')
# Apply transformations to query data
cds_qry <- preprocess_transform(cds_qry)
cds_qry <- reduce_dimension_transform(cds_qry)
Example 6: Cell Type Label Transfer
# Transfer cell type labels from reference to query
cds_qry <- transfer_cell_labels(cds_qry,
reduction_method='UMAP',
ref_coldata=colData(cds_ref),
ref_column_name='Main_cell_type',
query_column_name='cell_type_xfr',
transform_models_dir='cds_ref_test_models')
# Fix any missing labels
cds_qry <- fix_missing_cell_labels(cds_qry,
reduction_method='UMAP',
from_column_name='cell_type_xfr',
to_column_name='cell_type_fix')
Example 7: Combining and Visualizing Datasets
# Label datasets for visualization
colData(cds_ref)[['data_set']] <- 'reference'
colData(cds_qry)[['data_set']] <- 'query'
# Combine datasets
cds_combined <- combine_cds(list(cds_ref, cds_qry),
keep_all_genes=TRUE,
cell_names_unique=TRUE,
keep_reduced_dims=TRUE)
# Plot combined data
plot_cells(cds_combined, color_cells_by='data_set')
Example 8: Basic Visualization
# Plot individual datasets
plot_cells(cds_ref)
plot_cells(cds_qry)
# Color by specific metadata
plot_cells(cds_combined, color_cells_by='Main_cell_type')
Key Concepts
Core Monocle 3 Objects
- CellDataSet (cds) - Primary data structure containing expression matrix, cell metadata, and gene metadata
- Transform Models - Saved PCA/UMAP transformations from reference data for projecting query data
- Nearest Neighbor Index - Spatial index used for efficient cell type label transfer
Analysis Workflow
- Data Loading - Import expression matrices and metadata
- Preprocessing - Filter genes, apply UMI cutoffs, estimate size factors
- Reference Processing - Create PCA/UMAP embeddings with nearest neighbor indexing
- Projection - Transform query data into reference space using saved models
- Label Transfer - Transfer annotations from reference to query cells
- Visualization - Plot trajectories and compare datasets
Key Parameters
- build_nn_index=TRUE - Required for cell type label transfer
- num_dim - Number of PCA dimensions (typically 50-100)
- reduction_method - 'UMAP' or 'PCA' for visualization and label transfer
Reference Files
This skill includes comprehensive documentation in references/:
getting_started.md
- 17 pages of detailed installation and projection workflows
- Installation Guide - Complete setup with troubleshooting for gdal, Xcode, gfortran errors
- Projection Tutorial - Step-by-step co-embedding and label transfer workflow
- Code Examples - 26 practical examples covering data loading, processing, and visualization
visualization.md
- Interactive 3D plotting with plotly integration
- Advanced trajectory visualizations showing cell partitions
- Web-based exploration tools for large datasets
Use view to read specific reference files when detailed information is needed.
Working with This Skill
For Beginners
- Start with installation - Follow the getting_started.md installation guide carefully
- Use the projection workflow - The co-embedding tutorial provides a complete end-to-end example
- Master the basics first - Focus on data loading, gene filtering, and basic visualization before attempting advanced projection
For Intermediate Users
- Customize projection parameters - Adjust num_dim, UMI cutoffs, and visualization options
- Batch process multiple datasets - Use the transform model system for efficient analysis of many query datasets
- Troubleshoot common issues - Reference the installation troubleshooting section for gdal, Xcode, and gfortran problems
For Advanced Users
- Optimize performance - Use BPCells for large datasets and tune nearest neighbor indexing
- Custom visualization - Extend plotly visualizations for interactive exploration
- Pipeline integration - Incorporate Monocle 3 into larger single-cell analysis workflows
Navigation Tips
- Search by function name - Quick reference includes the most commonly used functions
- Check examples first - Each concept has multiple code examples with different approaches
- Reference the original URLs - Documentation includes links to official Monocle 3 documentation
Resources
references/
Organized documentation extracted from official sources:
- Step-by-step tutorials with complete code workflows
- Installation troubleshooting with specific error solutions
- Multiple code examples showing different approaches to the same task
- Links to original documentation for further reading
scripts/
Add helper scripts for common automation tasks such as:
- Batch projection of multiple query datasets
- Automated installation scripts
- Custom visualization functions
assets/
Add templates and examples such as:
- Example metadata files showing proper formatting
- Configuration templates for common analysis scenarios
- Boilerplate code for starting new projects
Notes
- This skill covers Monocle 3 version 1.4.25+ with Bioconductor 3.21 and R 4.4.1+
- Projection workflow is the key feature - enabling comparison of large datasets without memory issues
- Transform models can be reused across multiple query datasets for consistent analysis
- Code examples include both simple and advanced approaches for flexibility
Updating
To refresh this skill with updated documentation:
- Re-run the scraper with the same configuration
- The skill will be rebuilt with the latest information from the official Monocle 3 documentation