TF-differential-binding

@BIsnake2001/ChromSkills

The TF-differential-binding pipeline performs differential transcription factor (TF) binding analysis on ChIP-seq datasets (TF peaks) using the DiffBind package in R. It identifies genomic regions where TF binding intensity differs significantly between experimental conditions (e.g., treatment vs. control, mutant vs. wild-type). Use this pipeline when you need to compare the binding of the same TF across two or more biological conditions, cell types, or treatments, starting from ChIP-seq data or TF binding peaks. It is well suited to studying regulatory mechanisms that underlie transcriptional differences or epigenetic responses to perturbations.

SKILL.md

name TF-differential-binding
description The TF-differential-binding pipeline performs differential transcription factor (TF) binding analysis on ChIP-seq datasets (TF peaks) using the DiffBind package in R. It identifies genomic regions where TF binding intensity differs significantly between experimental conditions (e.g., treatment vs. control, mutant vs. wild-type). Use this pipeline when you need to compare the binding of the same TF across two or more biological conditions, cell types, or treatments, starting from ChIP-seq data or TF binding peaks. It is well suited to studying regulatory mechanisms that underlie transcriptional differences or epigenetic responses to perturbations.

DiffBind TF Differential Binding Analysis

Overview

This skill enables comprehensive differential TF binding analysis using DiffBind in R. DiffBind integrates read counting, normalization, and statistical modeling to identify differentially bound peaks between conditions.

To perform DiffBind differential binding analysis:

  • Initialize the project directory.
  • Refer to the Inputs & Outputs section to check the inputs and build the output directory structure. All output files should be written under ${proj_dir}, which is created in Step 0.
  • Always prompt the user if required files are missing.
  • Provide a sample sheet with ChIP-seq peak files and corresponding BAM files for each sample.
  • Construct a DBA object from the sample sheet.
  • Compute read counts over consensus peak regions.
  • Specify experimental conditions (e.g., treatment vs. control or cell_type_A vs. cell_type_B).
  • Run statistical tests to identify differentially bound regions.
  • Generate correlation heatmaps, PCA plots, and volcano plots; extract significant binding events.

When to use this skill

Use the TF-differential-binding pipeline when you need to compare the binding of the same TF across two or more biological conditions, cell types, or treatments, starting from ChIP-seq data or TF binding peaks. It is well suited to studying regulatory mechanisms that underlie transcriptional differences or epigenetic responses to perturbations.

Recommended applications include:

  • Comparing treated vs. control or wild-type vs. mutant conditions to identify TF binding changes in response to stimuli, drugs, or mutations.
  • Comparing TF binding profiles between two cell types or experimental conditions to identify differentially bound regions (DBRs).
  • Comparing how the same TF functions under two conditions.
  • Integrating with RNA-seq to correlate TF binding alterations with gene expression changes.
  • Investigating co-factor dependencies or chromatin remodeling events linked to TF occupancy.

Inputs & Outputs

Inputs (choose one)

  • If starting from BAM files and BED peak files → Generate consensus peaks and count matrix.
  • If starting from existing count matrix → Go directly to DiffBind analysis.
  • If there are multiple conditions or batches → Include batch/condition in the design.

Outputs

${sample}_TF_DB_analysis/
    DBs/
      DB_results.csv # DESeq2 results (log2FC, p-values)
      DB_up_${treated_condition}.bed
      DB_down_${treated_condition}.bed
    plots/ # visualization outputs
      PCA.pdf
      volcano.pdf
      heatmap.pdf
    logs/ # analysis logs
    temp/ # other temp files
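
The layout above can be created up front with base R. This is a minimal sketch, not part of DiffBind or of the project-init tool: `make_output_dirs` is a hypothetical helper, and the temporary path stands in for the `${proj_dir}` returned in Step 0.

```r
# Hypothetical helper: create the DBs/, plots/, logs/ and temp/ subdirectories
# under a project directory (the directory itself is created if needed).
make_output_dirs <- function(proj_dir) {
  subdirs <- file.path(proj_dir, c("DBs", "plots", "logs", "temp"))
  for (d in subdirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(subdirs)
}

# Example with a throwaway directory; replace with the path from Step 0.
proj_dir <- file.path(tempdir(), "c1_vs_c2_TF_DB")
out_dirs <- make_output_dirs(proj_dir)
print(basename(out_dirs))
```

Calling the helper again on an existing project directory is harmless, since existing subdirectories are left untouched.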

Decision Tree

Step 0: Initialize Project

  1. Make a directory for this project:

Call:

  • mcp__project-init-tools__project_init

with:

  • sample: sample name (e.g. c1_vs_c2)
  • task: TF_DB

The tool will:

  • Create ${sample}_TF_DB directory.
  • Return the full path of the ${sample}_TF_DB directory, which will be used as ${proj_dir}.

Step 1: Prepare Input Data

Create a CSV sample sheet (samplesheet.csv) with the following columns:

SampleID  Tissue  Factor  Condition  bamReads      Peaks                      PeakCaller
TF_A_1    A       TF      Control    Control1.bam  Control1_peaks.narrowPeak  narrow
TF_A_2    A       TF      Control    Control2.bam  Control2_peaks.narrowPeak  narrow
TF_B_1    A       TF      Treated    Treated1.bam  Treated1_peaks.narrowPeak  narrow
TF_B_2    A       TF      Treated    Treated2.bam  Treated2_peaks.narrowPeak  narrow
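
Before handing the sheet to DiffBind, it can help to check it in plain R, in line with the rule to prompt the user when required files are missing. The following is a sketch under assumptions: `check_samplesheet` is a hypothetical helper (not a DiffBind function), and the checks it performs (column names, unique IDs, file existence, two replicates per condition) are one reasonable choice rather than a fixed requirement.

```r
# Columns used by the sample sheet in this step.
required_cols <- c("SampleID", "Tissue", "Factor", "Condition",
                   "bamReads", "Peaks", "PeakCaller")

# Hypothetical helper: return a character vector of problems (empty if none).
check_samplesheet <- function(samples) {
  missing_cols <- setdiff(required_cols, colnames(samples))
  if (length(missing_cols) > 0) {
    return(paste("missing column:", missing_cols))
  }
  problems <- character(0)
  if (anyDuplicated(samples$SampleID) > 0) {
    problems <- c(problems, "duplicated SampleID values")
  }
  # Every BAM and peak file listed in the sheet should exist on disk.
  paths <- c(as.character(samples$bamReads), as.character(samples$Peaks))
  absent <- paths[!file.exists(paths)]
  if (length(absent) > 0) {
    problems <- c(problems, paste("file not found:", absent))
  }
  # At least two replicates per condition are recommended.
  n_per_cond <- table(samples$Condition)
  few <- names(n_per_cond)[n_per_cond < 2]
  if (length(few) > 0) {
    problems <- c(problems, paste("fewer than 2 replicates:", few))
  }
  problems
}

# The example sheet from this step, built in memory instead of read from CSV.
samples <- data.frame(
  SampleID   = c("TF_A_1", "TF_A_2", "TF_B_1", "TF_B_2"),
  Tissue     = "A",
  Factor     = "TF",
  Condition  = c("Control", "Control", "Treated", "Treated"),
  bamReads   = c("Control1.bam", "Control2.bam", "Treated1.bam", "Treated2.bam"),
  Peaks      = paste0(c("Control1", "Control2", "Treated1", "Treated2"),
                      "_peaks.narrowPeak"),
  PeakCaller = "narrow",
  stringsAsFactors = FALSE
)
problems <- check_samplesheet(samples)
print(problems)
```

Any message returned here is a cue to ask the user for the missing file or column before continuing.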

Step 2: Load Data and Build the DiffBind Object

library(DiffBind)
samples <- read.csv("samplesheet.csv")
dbObj <- dba(sampleSheet=samples)

Key parameters:

  • sampleSheet: a data frame (as above) or the path to a CSV file with BAM and peak information
  • Supports both narrowPeak and broadPeak formats

Step 3: Read Counting and Consensus Peak Generation

Count reads overlapping consensus peaks across samples:

# Build the consensus peakset and count reads in each consensus peak
dbObj <- dba.count(dbObj, summits=250)

Notes:

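  • summits: re-centers each consensus peak on its summit and extends it 250 bp in each direction, so all peaks have the same width.
  • The resulting binding matrix contains read counts for all samples over the consensus peaks. In DiffBind 3.x, normalization is configured with dba.normalize, or applied with default settings by dba.analyze.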
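
The effect of summits=250 on peak coordinates can be illustrated with plain arithmetic. This is an illustration only: DiffBind performs the re-centering internally in dba.count, and the summit coordinates below are made up.

```r
# Illustration only: DiffBind does this internally in dba.count().
summits <- 250
summit_pos <- c(10500, 48210, 99875)   # made-up summit coordinates

# Extend each summit by `summits` bp upstream and downstream.
recentered <- data.frame(
  start = summit_pos - summits,
  end   = summit_pos + summits
)
# With inclusive coordinates every interval spans 2 * summits + 1 bp.
recentered$width <- recentered$end - recentered$start + 1
print(recentered)
```

Fixed-width intervals make read counts comparable across peaks that were originally called with different widths.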

Step 4: Contrast Definition

Define conditions for comparison:

# Define experimental contrasts (e.g., Treated vs Control)
dbObj <- dba.contrast(dbObj, categories=DBA_CONDITION, minMembers=2)

Alternatives:

  • For multifactor experiments: use DBA_TISSUE, DBA_TREATMENT, or custom metadata.
  • Check contrasts:
    dba.show(dbObj, bContrasts=TRUE)
    

Step 5: Differential Binding Analysis

# Perform analysis
dbObj <- dba.analyze(dbObj, method=DBA_DESEQ2)

Parameters:

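  • method: DBA_DESEQ2 (default) or DBA_EDGER; passed to dba.analyze
  • th: FDR threshold (default 0.05); applied when results are retrieved with dba.report
  • fold: minimum absolute log2 fold change; applied in dba.report
  • bUsePval=TRUE: threshold on p-values instead of FDR; applied in dba.report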

Step 6: Visualization and Quality Control

Correlation Heatmap

dba.plotHeatmap(dbObj, correlations=TRUE, scale="row")

PCA Plot

dba.plotPCA(dbObj, attributes=DBA_CONDITION, label=DBA_ID)

Volcano Plot

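# Volcano plot: report every site (th=1 disables the FDR filter), then plot from a data frame
allResults <- dba.report(dbObj, method=DBA_DESEQ2, th=1)
resDf <- as.data.frame(allResults)
plot(resDf$Fold, -log10(resDf$FDR),
     col=ifelse(resDf$FDR < 0.05 & abs(resDf$Fold) > 1, "red", "grey"),
     pch=16, xlab="log2 fold change", ylab="-log10(FDR)", main="Volcano Plot")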

Output: heatmap.pdf, volcano.pdf, PCA.pdf
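
The plotting calls in this step draw on the active graphics device, so producing the PDF files requires opening a PDF device around each call. A minimal sketch, assuming a hypothetical helper `save_pdf` and a temporary directory standing in for `${proj_dir}/plots`; a placeholder plot is used so the example runs without DiffBind.

```r
# Hypothetical helper: open a PDF device, run the plotting code, and always
# close the device, even if the plotting call fails.
save_pdf <- function(path, plot_fn, width = 7, height = 7) {
  pdf(path, width = width, height = height)
  on.exit(dev.off(), add = TRUE)
  plot_fn()
  invisible(path)
}

plots_dir <- file.path(tempdir(), "plots")   # stand-in for ${proj_dir}/plots
dir.create(plots_dir, recursive = TRUE, showWarnings = FALSE)

# Placeholder plot; in the pipeline the function body would instead be, e.g.,
# dba.plotPCA(dbObj, attributes=DBA_CONDITION, label=DBA_ID)
pca_path <- save_pdf(file.path(plots_dir, "PCA.pdf"),
                     function() plot(1:10, main = "placeholder"))
print(file.exists(pca_path))
```

The same wrapper can be reused for heatmap.pdf and volcano.pdf by swapping the function body.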


Step 7: Result Extraction

Export significant differential peaks:

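library(rtracklayer)

# Save the full (unfiltered) report created in Step 6
write.csv(as.data.frame(allResults), "DB_results.csv", row.names = FALSE)

# Extract results with FDR < 0.05 and |log2FC| > 1
sigSites <- dba.report(dbObj, method=DBA_DESEQ2, th=0.05, fold=1)
print("Differential binding results summary:")
print(summary(sigSites))

# Split peaks by direction. Fold is reported relative to the first group of the
# contrast, so confirm with dba.show(dbObj, bContrasts=TRUE) that a positive
# Fold means higher binding in the treated condition.
treated_condition <- "Treated"  # replace with the actual condition name
diff_up <- sigSites[sigSites$Fold > 0]
diff_down <- sigSites[sigSites$Fold < 0]
export(diff_up, paste0("DB_up_", treated_condition, ".bed"))
export(diff_down, paste0("DB_down_", treated_condition, ".bed"))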

Output: DB_results.csv DB_up_${treated_condition}.bed DB_down_${treated_condition}.bed


Interpretation and Biological Insights

Significance Criteria

  • FDR < 0.05 → statistically significant at a 5% false discovery rate
  • |log2FC| > 1 → at least a two-fold change in binding, a common cutoff for a biologically meaningful difference
  • Consistent replicates → at least two replicates per condition recommended
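
The two cutoffs above can be applied together to label each site. The sketch below uses a hypothetical helper `classify_sites` and made-up values shaped like the Fold and FDR columns of a DiffBind report converted to a data frame; "up" only means a positive Fold, whose direction depends on the group order of the contrast.

```r
# Hypothetical helper: label each site using the FDR and fold-change cutoffs.
classify_sites <- function(res, fdr_cut = 0.05, lfc_cut = 1) {
  status <- rep("not_significant", nrow(res))
  sig <- res$FDR < fdr_cut & abs(res$Fold) > lfc_cut
  status[sig & res$Fold > 0] <- "up"
  status[sig & res$Fold < 0] <- "down"
  status
}

# Made-up values in the shape of as.data.frame(dba.report(...)).
toy <- data.frame(
  Fold = c(2.3, -1.8, 0.4, 1.5, -2.1),
  FDR  = c(0.001, 0.020, 0.010, 0.200, 0.049)
)
toy$status <- classify_sites(toy)
print(toy)
```

A site must pass both cutoffs to be labeled: the third row is significant by FDR but changes too little, and the fourth changes enough but fails the FDR cutoff.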

Typical Biological Interpretations

  • Increased binding in treated condition → potential activation or recruitment of TFs
  • Decreased binding → loss of TF affinity or chromatin closing
  • Combine with RNA-seq to correlate with target gene expression.

Troubleshooting

Problem                      Possible Cause                           Solution
No differential peaks found  Insufficient replicates or low coverage  Add replicates, increase sequencing depth, or relax the FDR threshold (a higher th)
Errors in sample sheet       Column names incorrect or missing        Use the standard DiffBind column format
Inconsistent genome build    Mixed genome assemblies                  Ensure all BAM and peak files use the same genome reference
Over-normalization           Strong batch effects                     Include a batch term in the design or run dba.contrast(..., block=...)