| name | seurat-local |
| description | Seurat documentation (local mirror) |
Seurat-Local Skill
Comprehensive assistance with Seurat v5 for single-cell RNA-seq analysis, generated from official documentation.
When to Use This Skill
This skill should be triggered when:
Core Seurat Analysis:
- Working with single-cell RNA-seq data analysis in R
- Setting up Seurat objects and performing quality control
- Normalizing data and finding variable features
- Running dimensional reduction (PCA, UMAP, t-SNE)
- Performing clustering and cell type identification
Data Integration:
- Integrating multiple single-cell datasets
- Using FindIntegrationAnchors and IntegrateData workflows
- Performing batch correction across experiments
- Working with both CCA and RPCA integration methods
- Transferring cell type annotations between datasets
Advanced Analysis:
- Differential expression testing and marker gene identification
- Spatial transcriptomics analysis with 10x Visium data
- Trajectory inference and lineage analysis
- Multi-modal data integration (e.g., scRNA-seq + scATAC-seq)
- Pseudobulk analysis and aggregated expression calculations
Seurat v5 Specific Features:
- Working with the new layered assay structure
- Using SCTransform v2 for normalization
- Performing integration in low-dimensional space
- Using the streamlined IntegrateLayers workflow
- Working with on-disk matrices for large datasets
Common Use Cases:
- "How do I integrate two scRNA-seq datasets?"
- "Help me find marker genes for my clusters"
- "How do I normalize my single-cell data?"
- "What's the difference between CCA and RPCA integration?"
- "How do I analyze spatial transcriptomics data?"
Quick Reference
Essential Seurat Workflow
Basic Setup and Normalization
library(Seurat)
library(SeuratData)
# Load example dataset
InstallData("pbmc3k")
pbmc <- LoadData("pbmc3k")
# Basic preprocessing
pbmc <- NormalizeData(pbmc, verbose = FALSE)
pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)
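The snippet above goes straight to normalization. In practice, low-quality cells are usually filtered first. The following is a minimal sketch on a small synthetic count matrix; the matrix, the gene names, and the thresholds are illustrative assumptions, not recommended values. `PercentageFeatureSet()` and `subset()` are the Seurat calls normally used for this step.

```r
library(Seurat)
library(Matrix)

set.seed(42)

# Synthetic counts: 50 genes x 30 cells (illustrative only, not real data)
genes <- c(paste0("MT-G", 1:5), paste0("Gene", 1:45))
cells <- paste0("Cell", 1:30)
counts <- matrix(as.numeric(rpois(50 * 30, lambda = 2)), nrow = 50,
                 dimnames = list(genes, cells))

# Inflate mitochondrial counts in the first five cells to mimic damaged cells
counts[1:5, 1:5] <- counts[1:5, 1:5] + 50

obj <- CreateSeuratObject(counts = Matrix(counts, sparse = TRUE))

# Percentage of counts from mitochondrial genes (human symbols start with "MT-")
obj[["percent.mt"]] <- PercentageFeatureSet(obj, pattern = "^MT-")

# Keep cells with enough detected genes and a low mitochondrial fraction
filtered <- subset(obj, subset = nFeature_RNA > 20 & percent.mt < 50)

ncol(obj)       # cells before filtering
ncol(filtered)  # cells after filtering
```

On real data, inspect the distributions first (for example with `VlnPlot(obj, features = c("nFeature_RNA", "nCount_RNA", "percent.mt"))`) and choose thresholds per dataset.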
Dimensional Reduction and Clustering
# Scale the data (RunPCA requires scaled data), then run PCA
pbmc <- ScaleData(pbmc, verbose = FALSE)
pbmc <- RunPCA(pbmc, verbose = FALSE)
pbmc <- FindNeighbors(pbmc, dims = 1:10)
pbmc <- FindClusters(pbmc, resolution = 0.5)
# Run UMAP for visualization
pbmc <- RunUMAP(pbmc, dims = 1:10)
DimPlot(pbmc, reduction = "umap")
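The `dims = 1:10` used above is a choice, not a fixed rule. One common heuristic is to look at how much variance each principal component carries. A minimal sketch, assuming the small `pbmc_small` object bundled with SeuratObject as a stand-in for real data; the 90% cutoff is an arbitrary illustration.

```r
library(Seurat)

# pbmc_small ships with SeuratObject and already contains a PCA
data("pbmc_small")

# Standard deviation of each computed principal component
pc_sd <- Stdev(pbmc_small, reduction = "pca")

# Share of variance per PC, relative to the PCs that were computed
pc_var <- pc_sd^2 / sum(pc_sd^2)
cum_var <- cumsum(pc_var)

# Smallest number of PCs reaching an (arbitrary) 90% of that variance
n_dims <- which(cum_var >= 0.9)[1]
n_dims

# Visual check of the same information
ElbowPlot(pbmc_small, ndims = length(pc_sd))
```

The resulting `n_dims` could then be passed as `dims = 1:n_dims` to `FindNeighbors()` and `RunUMAP()`. Treat it as a starting point and check that clusters are stable across nearby values.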
Differential Expression
# Find markers for cluster 1
cluster1.markers <- FindMarkers(pbmc, ident.1 = 1, min.pct = 0.25)
head(cluster1.markers)
# Find all markers
pbmc.markers <- FindAllMarkers(pbmc, only.pos = TRUE, min.pct = 0.25)
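`FindAllMarkers()` returns one data frame covering all clusters, so a typical next step is to keep the strongest markers per cluster. A minimal sketch in base R, again assuming `pbmc_small` as stand-in data; the choice of three markers per cluster is arbitrary.

```r
library(Seurat)

data("pbmc_small")

# Positive markers for every cluster (default Wilcoxon rank-sum test)
markers <- FindAllMarkers(pbmc_small, only.pos = TRUE, verbose = FALSE)

# Keep the three markers with the largest log2 fold change in each cluster
by_cluster <- split(markers, markers$cluster)
top_markers <- do.call(rbind, lapply(by_cluster, function(d) {
  head(d[order(d$avg_log2FC, decreasing = TRUE), ], 3)
}))

top_markers[, c("cluster", "gene", "avg_log2FC", "p_val_adj")]
```

The `gene` column holds the feature name; row names are made unique by Seurat and should not be used as gene identifiers.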
Data Integration Workflows
Standard Integration (CCA-based)
# Load the interferon-stimulated PBMC dataset and split it by condition
InstallData("ifnb")
ifnb <- LoadData("ifnb")
ifnb.list <- SplitObject(ifnb, split.by = "stim")
# Normalize and find variable features
ifnb.list <- lapply(ifnb.list, function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})
# Find integration anchors
immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, dims = 1:20)
# Integrate data
immune.combined <- IntegrateData(anchorset = immune.anchors, dims = 1:20)
Fast Integration (RPCA-based)
# Select shared features, then scale and run PCA on each dataset separately
# (assumes each object is already normalized with variable features identified)
features <- SelectIntegrationFeatures(object.list = ifnb.list)
ifnb.list <- lapply(ifnb.list, function(x) {
  x <- ScaleData(x, features = features, verbose = FALSE)
  x <- RunPCA(x, features = features, verbose = FALSE)
})
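# Find anchors using RPCA (faster, more conservative)
immune.anchors <- FindIntegrationAnchors(
  object.list = ifnb.list,
  anchor.features = features,
  reduction = "rpca"
)
# Integrate data using the RPCA anchors
immune.combined <- IntegrateData(anchorset = immune.anchors)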
SCTransform Integration
# Normalize with SCTransform (method = "glmGamPoi" requires the glmGamPoi package)
ifnb.list <- lapply(ifnb.list, SCTransform, method = "glmGamPoi")
features <- SelectIntegrationFeatures(object.list = ifnb.list, nfeatures = 3000)
# Prepare for integration
ifnb.list <- PrepSCTIntegration(object.list = ifnb.list, anchor.features = features)
ifnb.list <- lapply(ifnb.list, RunPCA, features = features)
# Integrate with SCTransform
immune.anchors <- FindIntegrationAnchors(
  object.list = ifnb.list,
  normalization.method = "SCT",
  anchor.features = features,
  reduction = "rpca"
)
immune.combined.sct <- IntegrateData(
  anchorset = immune.anchors,
  normalization.method = "SCT"
)
Spatial Transcriptomics
Load and Process 10x Visium Data
library(Seurat)
library(SeuratData)
library(ggplot2)
# Load spatial dataset
InstallData("stxBrain")
brain <- LoadData("stxBrain", type = "anterior1")
# Normalize with SCTransform (recommended for spatial data)
brain <- SCTransform(brain, assay = "Spatial", verbose = FALSE)
# Run dimensional reduction and clustering
brain <- RunPCA(brain, assay = "SCT", verbose = FALSE)
brain <- FindNeighbors(brain, reduction = "pca", dims = 1:30)
brain <- FindClusters(brain, verbose = FALSE)
brain <- RunUMAP(brain, reduction = "pca", dims = 1:30)
Spatial Visualization
# Visualize clusters on tissue
SpatialDimPlot(brain, label = TRUE, label.size = 3)
# Visualize gene expression
SpatialFeaturePlot(brain, features = c("Hpca", "Ttr"))
# Adjust visualization parameters
SpatialFeaturePlot(brain, features = "Ttr", alpha = c(0.1, 1))
Cell Type Label Transfer
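# Load single-cell reference (the path is an example; point it at your copy of the file)
allen_reference <- readRDS("allen_cortex.rds")
allen_reference <- SCTransform(allen_reference, ncells = 3000, verbose = FALSE)
# Find transfer anchors; `cortex` is assumed to be an SCT-normalized spatial
# query object (for example, a cortical subset of `brain`) created beforehand
anchors <- FindTransferAnchors(
  reference = allen_reference,
  query = cortex,
  normalization.method = "SCT"
)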
# Transfer cell type labels
predictions.assay <- TransferData(
  anchorset = anchors,
  refdata = allen_reference$subclass,
  prediction.assay = TRUE
)
cortex[["predictions"]] <- predictions.assay
Advanced Features
Module Scoring
# Define gene sets
cd_features <- list(c('CD79B', 'CD79A', 'CD19', 'CD180', 'CD200',
                      'CD3D', 'CD2', 'CD3E', 'CD7', 'CD8A'))
# Calculate module scores
pbmc <- AddModuleScore(
  object = pbmc,
  features = cd_features,
  name = 'CD_Features'
)
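`AddModuleScore()` writes its result to the cell metadata, one column per gene set, named by appending the set's index to `name` (here `CD_Features1`). A minimal sketch on `pbmc_small`, used as an assumed stand-in for `pbmc`; `ctrl = 5` is set only because this toy object has few genes to draw control features from.

```r
library(Seurat)

data("pbmc_small")

# Same gene set as above; genes absent from the object are dropped with a warning
cd_features <- list(c("CD79B", "CD79A", "CD19", "CD180", "CD200",
                      "CD3D", "CD2", "CD3E", "CD7", "CD8A"))

# Fewer control features per bin than the default, to suit the small object
pbmc_small <- AddModuleScore(pbmc_small, features = cd_features,
                             ctrl = 5, name = "CD_Features")

# The score is stored as a metadata column: name + index of the gene set
head(pbmc_small$CD_Features1)

# Compare the score across clusters
VlnPlot(pbmc_small, features = "CD_Features1")
```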
Project UMAP Coordinates
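# Project new data onto an existing UMAP. The reference UMAP must have been
# computed with RunUMAP(..., return.model = TRUE). For Seurat objects,
# reduction.model is the name of that reduction in the reference.
# The return value is the query object with a new "ref.umap" reduction.
query_umap <- ProjectUMAP(
  query = new_data,
  reference = reference_data,
  query.reduction = "pca",
  reference.reduction = "pca",
  reduction.model = "umap"
)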
Integrate Layers (Seurat v5)
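# Integrate multiple layers in a single object. `seurat_object` must have its
# assay split into per-batch layers and a "pca" reduction already computed;
# `variable_features` is a placeholder for a character vector of feature names.
integrated_object <- IntegrateLayers(
  object = seurat_object,
  method = RPCAIntegration,
  orig.reduction = "pca",
  assay = "RNA",
  features = variable_features
)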
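The call above is one step in a longer sequence: split the assay into layers, preprocess, integrate, re-join the layers, then cluster on the integrated reduction. A minimal sketch of that sequence follows. The helper name `integrate_by_batch` and its arguments are hypothetical (not part of Seurat), and the parameter values are illustrative defaults.

```r
library(Seurat)

# Hypothetical helper (not part of Seurat): run the layer-based integration
# steps on an object whose metadata column `batch_col` identifies the batches.
integrate_by_batch <- function(obj, batch_col, dims = 1:30) {
  # Split the RNA assay into one set of layers per batch
  obj[["RNA"]] <- split(obj[["RNA"]], f = obj[[batch_col, drop = TRUE]])

  # Standard preprocessing on the split object
  obj <- NormalizeData(obj, verbose = FALSE)
  obj <- FindVariableFeatures(obj, verbose = FALSE)
  obj <- ScaleData(obj, verbose = FALSE)
  obj <- RunPCA(obj, verbose = FALSE)

  # Integrate in PCA space; the result is stored as a new reduction
  obj <- IntegrateLayers(
    object = obj,
    method = RPCAIntegration,
    orig.reduction = "pca",
    new.reduction = "integrated.rpca",
    verbose = FALSE
  )

  # Re-join layers so downstream steps see a single matrix per layer type
  obj[["RNA"]] <- JoinLayers(obj[["RNA"]])

  # Cluster and embed using the integrated reduction rather than "pca"
  obj <- FindNeighbors(obj, reduction = "integrated.rpca", dims = dims)
  obj <- FindClusters(obj, verbose = FALSE)
  RunUMAP(obj, reduction = "integrated.rpca", dims = dims, verbose = FALSE)
}
```

With the `ifnb` object used earlier, a call might look like `ifnb <- integrate_by_batch(ifnb, "stim")`. Swapping `RPCAIntegration` for `CCAIntegration` changes the method without altering the rest of the sequence.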
Key Concepts
Core Seurat Objects
- Seurat Object: Main data structure containing assays, metadata, and reductions
- Assay: Contains different data layers (counts, data, scale.data)
- Layers: In Seurat v5, data is stored in layers instead of slots
- DimReduc: Dimensional reduction objects (PCA, UMAP, etc.)
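The layer concepts above can be seen directly on a toy object. This is a minimal sketch on synthetic counts (the matrix and the `batch` column are illustrative assumptions). It shows `NormalizeData()` adding a `data` layer, checks that layer against the log-normalization formula, and then splits and re-joins layers by batch.

```r
library(Seurat)
library(Matrix)

set.seed(1)

# Synthetic counts: 50 genes x 20 cells (illustrative only)
counts <- Matrix(
  matrix(as.numeric(rpois(50 * 20, lambda = 2)), nrow = 50,
         dimnames = list(paste0("Gene", 1:50), paste0("Cell", 1:20))),
  sparse = TRUE
)
obj <- CreateSeuratObject(counts = counts)
obj$batch <- rep(c("A", "B"), each = 10)

# A new object holds only raw counts; NormalizeData() adds a "data" layer
Layers(obj)
obj <- NormalizeData(obj, verbose = FALSE)
Layers(obj)

# Layers are read with LayerData(); the default LogNormalize method is
# log1p(count / total counts per cell * 10000)
norm <- as.matrix(LayerData(obj, assay = "RNA", layer = "data"))
raw <- as.matrix(counts)
expected <- log1p(t(t(raw) / colSums(raw)) * 1e4)
all.equal(norm, expected, check.attributes = FALSE)

# Splitting an assay creates one layer per batch; JoinLayers() reverses it
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$batch)
split_layers <- Layers(obj)
obj <- JoinLayers(obj)
joined_layers <- Layers(obj)
```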
Data Integration Methods
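- CCA (Canonical Correlation Analysis): Aligns datasets more aggressively; useful when batch or condition differences are large, but can over-correct when cell populations differ between datasets
- RPCA (Reciprocal PCA): Faster and more conservative; less likely to merge populations that are present in only one dataset, and better suited to large numbers of datasets
- SCTransform: Regularized negative binomial regression used for normalization; not an integration method itself, but it can be combined with CCA or RPCA anchors via normalization.method = "SCT"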
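In the anchor-based workflow, switching between CCA and RPCA comes down to the `reduction` argument of `FindIntegrationAnchors()`, plus a per-dataset PCA for RPCA. A minimal sketch follows; the helper name `find_anchors_by_method` is hypothetical (not part of Seurat), and it assumes every object in the list is already normalized with variable features identified.

```r
library(Seurat)

# Hypothetical helper (not part of Seurat): find integration anchors with
# either CCA or RPCA on a list of normalized Seurat objects.
find_anchors_by_method <- function(object.list, method = c("cca", "rpca"),
                                   nfeatures = 2000, k.anchor = 5) {
  method <- match.arg(method)

  # Features that are repeatedly variable across the datasets
  features <- SelectIntegrationFeatures(object.list = object.list,
                                        nfeatures = nfeatures)

  # RPCA projects each dataset into the others' PCA space,
  # so every object needs its own PCA on the shared features
  if (method == "rpca") {
    object.list <- lapply(object.list, function(x) {
      x <- ScaleData(x, features = features, verbose = FALSE)
      RunPCA(x, features = features, verbose = FALSE)
    })
  }

  FindIntegrationAnchors(
    object.list = object.list,
    anchor.features = features,
    reduction = method,
    k.anchor = k.anchor
  )
}
```

Because RPCA is the more conservative method, raising `k.anchor` above its default of 5 is one way to strengthen the alignment when RPCA leaves datasets under-integrated.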
Spatial Analysis
- Spots: Spatial measurement locations (55 μm in diameter for 10x Visium)
- Images: Tissue histology images stored in object
- Coordinate Systems: Mapping between spots and image coordinates
Reference Files
This skill includes comprehensive documentation in references/:
announcements.md
- Seurat v5 release notes and changes
- Backwards compatibility information
- New feature descriptions and migration guide
api.md
- Complete function reference with parameters
- Detailed examples for each function
- Performance notes and best practices
- Functions covered:
  - ProjectUMAP(): Query dataset projection
  - IntegrateLayers(): Multi-layer integration
  - FastRowScale(): Efficient matrix operations
  - AddModuleScore(): Gene set scoring
  - CellSelector(): Interactive cell selection
  - TransferData(): Cross-dataset data transfer
other.md
- Code of conduct and community guidelines
- Advanced tutorials and case studies
- Integration workflows for complex scenarios
- Comprehensive PBMC stimulation analysis example
tutorials.md
- Step-by-step guides for common workflows
- RPCA vs CCA integration comparison
- Spatial transcriptomics analysis tutorials
- SCTransform normalization workflows
- Performance optimization tips
Working with This Skill
For Beginners
- Start with the basic workflow: Load data → Quality control → Normalize → Find variable features → Scale → PCA → Cluster → UMAP
- Use the step-by-step guides in tutorials.md for fundamental concepts
- Follow the basic examples in the Quick Reference section
- Practice with the pbmc3k dataset, which is available through SeuratData via InstallData("pbmc3k")
For Intermediate Users
- Explore integration methods when working with multiple datasets
- Use spatial analysis features for spatial transcriptomics data
- Leverage the API reference for advanced function parameters
- Study the PBMC stimulation tutorial for comparative analysis
For Advanced Users
- Use the tutorials reference for complex workflows
- Optimize performance with RPCA integration and SCTransform
- Implement custom analysis pipelines using the function reference
- Contribute to the community following the code of conduct
Navigation Tips
- Search for specific functions in the api.md reference
- Find complete workflows in tutorials.md
- Check announcements.md for the latest Seurat v5 features
- Use other.md for specialized use cases and community guidelines
Resources
references/
Organized documentation extracted from official sources containing:
- Detailed function explanations with all parameters
- Real code examples with proper syntax highlighting
- Performance notes and best practices
- Links to original documentation for further reading
scripts/
Add helper scripts here for common automation tasks such as:
- Batch processing of multiple datasets
- Custom visualization functions
- Quality control automation
- Report generation
assets/
Store templates, boilerplate, and example projects:
- Example Seurat objects for testing
- Custom gene sets for module scoring
- Reference datasets for integration testing
- Visualization themes and templates
Notes
- Seurat v5 is the current release on CRAN and remains backwards compatible with objects and workflows from earlier versions
- New assay structure uses layers instead of slots for better data organization
- SCTransform v2 includes regularization improvements and glmGamPoi support
- Integration workflows are now streamlined and more memory-efficient
- Spatial analysis is fully integrated with enhanced visualization options
Updating
To refresh this skill with updated documentation:
- Re-run the documentation scraper with the same configuration
- The skill will be rebuilt with the latest Seurat documentation
- All examples and references will be updated automatically
- Quick reference patterns will be refreshed with new best practices