| name | retention-analysis |
| description | Analyze user retention and churn using survival analysis, cohort analysis, and machine learning. Calculate retention rates, build survival curves, predict churn risk, and generate retention optimization strategies. Use when working with user subscription data, membership information, or when user mentions retention, churn, survival analysis, or customer lifetime value. |
| allowed-tools | Read, Write, Edit, Bash, Grep, Glob |
Retention Analysis Skill
Analyze user retention patterns, predict customer churn, and optimize retention strategies using survival analysis, cohort analysis, and machine learning.
Quick Start
This skill helps you:
- Calculate retention rates and churn metrics
- Build survival curves using Kaplan-Meier analysis
- Perform cohort analysis to understand behavior patterns
- Predict churn risk with machine learning models
- Identify retention drivers using Cox regression
- Generate actionable insights for retention improvement
When to Use
- SaaS Product Analysis: User subscription renewal and cancellation patterns
- Membership Programs: Member engagement and loyalty analysis
- E-commerce: Customer repeat purchase behavior and subscription boxes
- Gaming Apps: Player retention and engagement metrics
- Service Industries: Customer satisfaction and long-term relationships
- Subscription Businesses: Monthly/yearly subscription analysis
Key Requirements
Install required packages:
pip install pandas numpy matplotlib seaborn scikit-learn lifelines
Core Workflow
1. Data Preparation
Your data should include:
- User identifiers: Unique user/customer IDs
- Time variables: Registration date, activity dates, subscription period
- Event indicators: Churn status (1=churned, 0=active)
- User attributes: Demographics, behavior, subscription details
- Optional: Usage metrics, payment history, engagement data
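As a minimal sketch of how raw records might be turned into the time and event variables listed above, the snippet below derives a duration and a churn indicator with pandas. The column names (`signup_date`, `cancel_date`), the sample rows, and the helper `prepare_retention_data` are hypothetical and only illustrate the shape of the data.

```python
import pandas as pd

# Hypothetical raw subscription records; column names are illustrative only.
raw = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "signup_date": ["2024-01-05", "2024-01-20", "2024-02-10", "2024-03-01"],
    "cancel_date": ["2024-03-05", None, "2024-02-25", None],
})

def prepare_retention_data(df, observation_end):
    """Add duration_days and event (1=churned, 0=active) columns."""
    out = df.copy()
    out["signup_date"] = pd.to_datetime(out["signup_date"])
    out["cancel_date"] = pd.to_datetime(out["cancel_date"])
    end = pd.Timestamp(observation_end)
    # A missing cancel date means the user is still active (censored).
    out["event"] = out["cancel_date"].notna().astype(int)
    # Active users are observed only up to the end of the observation window.
    last_seen = out["cancel_date"].fillna(end)
    out["duration_days"] = (last_seen - out["signup_date"]).dt.days
    return out[["user_id", "signup_date", "duration_days", "event"]]

prepared = prepare_retention_data(raw, "2024-04-01")
print(prepared)
```

Users who have not churned keep `event = 0` and a duration measured up to the observation end, which is the form survival methods expect for censored observations.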
2. Analysis Process
- Data preprocessing: Clean and prepare retention data
- Survival analysis: Build Kaplan-Meier curves
- Cohort analysis: Group users by acquisition time
- Risk modeling: Identify churn drivers with Cox regression
- Churn prediction: Build machine learning prediction models
- Insight generation: Create actionable recommendations
3. Output Deliverables
- Retention rate tables and charts
- Survival curves with confidence intervals
- Cohort heatmaps and behavior patterns
- Churn risk scores and feature importance
- Retention optimization strategies
Example Usage Scenarios
SaaS Subscription Analysis
# Analyze monthly subscription renewal patterns
# Predict which users are likely to churn
# Identify features that drive long-term retention
Membership Program Analysis
# Track member engagement over time
# Compare retention across membership tiers
# Analyze payment method impact on retention
E-commerce Customer Retention
# Analyze repeat purchase patterns
# Calculate customer lifetime value
# Identify high-value customer segments
Key Analysis Methods
Survival Analysis
- Kaplan-Meier Estimator: Non-parametric survival curve
- Log-rank Test: Compare survival between groups
- Cox Proportional Hazards: Multi-variable risk modeling
- Median Survival Time: Time by which 50% of users have churned
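To make the Kaplan-Meier estimator and the median survival time concrete, here is a small from-scratch sketch in plain Python. It is meant only to show what the curve computes. In practice a library such as lifelines (`KaplanMeierFitter`) would normally be used, which also provides confidence intervals. The function names and sample data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each time t where at least one churn occurred.

    durations: time observed per user
    events:    1 = churned, 0 = still active (censored)
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        churned = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        leaving = sum(1 for d in durations if d == t)
        if churned:
            # Multiply by the share of at-risk users who survive time t.
            survival *= 1 - churned / at_risk
            curve.append((t, survival))
        # Churned and censored users both leave the risk set after t.
        at_risk -= leaving
    return curve

def median_survival_time(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # more than half of users are still retained

# Hypothetical sample: months observed and churn flags for eight users.
durations = [1, 2, 2, 3, 4, 5, 5, 6]
events = [1, 1, 0, 1, 0, 1, 1, 0]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.3f}")
print("median survival time:", median_survival_time(curve))
```

Censored users (event 0) never reduce the survival estimate directly. They only shrink the risk set for later time points, which is what distinguishes this estimate from a naive churn percentage.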
Cohort Analysis
- Time-based Cohorts: Group by acquisition month/quarter
- Behavior-based Cohorts: Group by usage patterns
- Retention Matrix: Visualize retention over time periods
- Cohort Comparison: Compare different cohort behaviors
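A retention matrix for time-based cohorts can be sketched with pandas as follows. The activity log layout (one row per user per active month), the column names, and the helper `retention_matrix` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical activity log: one row per user per active month.
activity = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 4],
    "month": ["2024-01", "2024-02", "2024-03",
              "2024-01", "2024-03",
              "2024-02", "2024-03",
              "2024-02"],
})

def retention_matrix(df):
    """Rows: acquisition cohort. Columns: months since acquisition.
    Values: share of the cohort active in that month."""
    out = df.copy()
    out["month"] = pd.to_datetime(out["month"], format="%Y-%m")
    # A user's cohort is the first month in which they were active.
    out["cohort"] = out.groupby("user_id")["month"].transform("min")
    out["period"] = ((out["month"].dt.year - out["cohort"].dt.year) * 12
                     + (out["month"].dt.month - out["cohort"].dt.month))
    out["cohort"] = out["cohort"].dt.strftime("%Y-%m")
    counts = out.pivot_table(index="cohort", columns="period",
                             values="user_id", aggfunc="nunique")
    # Divide each row by its cohort size (users active in period 0).
    return counts.div(counts[0], axis=0)

matrix = retention_matrix(activity)
print(matrix)
```

Cells left as NaN correspond to periods a cohort has not reached yet. The resulting frame can be passed to a heatmap function such as `seaborn.heatmap` for visualization.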
Machine Learning Prediction
- Logistic Regression: Binary churn classification
- Random Forest: Non-linear pattern detection
- Gradient Boosting: Often the most accurate option on tabular data, at the cost of more tuning
- Feature Importance: Identify key churn drivers
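The sketch below shows one way binary churn classification with logistic regression might be set up in scikit-learn. The data is synthetic, and the two features (`logins`, `tickets`) and their assumed effects are invented for the demonstration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Synthetic, hypothetical features: monthly logins and support tickets.
logins = rng.poisson(8, n)
tickets = rng.poisson(1, n)

# Assumed ground truth for the demo:
# fewer logins and more tickets raise the odds of churn.
logit = 1.0 - 0.4 * logins + 0.8 * tickets
churned = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([logins, tickets])
X_train, X_test, y_train, y_test = train_test_split(
    X, churned, test_size=0.25, random_state=0, stratify=churned
)

model = LogisticRegression().fit(X_train, y_train)

# Probability of the positive class serves as a churn risk score.
risk = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk)

print(f"hold-out AUC: {auc:.2f}")
print(dict(zip(["logins", "tickets"], model.coef_[0].round(2))))
```

The coefficient signs give a first indication of churn drivers. Tree-based models such as `RandomForestClassifier` expose a comparable `feature_importances_` attribute.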
Common Business Questions Answered
- What is our overall retention rate?
- How does retention vary by user segment?
- What factors most influence customer churn?
- Which users are at highest risk of leaving?
- How can we improve long-term retention?
- What is the typical customer lifetime?
Integration Examples
See examples/ directory for:
- basic_retention.py - Survival analysis basics
- cohort_analysis.py - Cohort-based retention analysis
- churn_prediction.py - ML-based churn prediction
- Sample datasets for testing
Best Practices
- Data Quality: Ensure accurate churn definitions and time measurements
- Event Definition: Clearly define what constitutes "churn"
- Time Windows: Choose appropriate analysis periods
- Segmentation: Analyze different user groups separately
- Validation: Always evaluate models on held-out test data that was not used for training
- Business Context: Consider operational constraints and costs
Advanced Features
- Competing Risks Analysis: Different types of churn
- Time-varying Covariates: Dynamic feature analysis
- Customer Lifetime Value: Integrate retention with revenue
- Retention Forecasting: Predict future retention trends
- A/B Testing: Measure retention improvement impact
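As one simple way to integrate retention with revenue for the Customer Lifetime Value item above, the sketch below sums expected margin over a survival curve with optional discounting. The function name, the margin figure, and the survival values are hypothetical, and real CLV models usually account for more than this.

```python
def customer_lifetime_value(survival, monthly_margin, monthly_discount_rate=0.0):
    """Expected discounted margin per user.

    survival[t] is the estimated probability that a user is still
    subscribed in month t (survival[0] is normally 1.0).
    """
    return sum(
        monthly_margin * s / (1 + monthly_discount_rate) ** t
        for t, s in enumerate(survival)
    )

# Hypothetical survival curve over four months and a 10.0 monthly margin.
survival = [1.0, 0.8, 0.6, 0.5]
clv = customer_lifetime_value(survival, monthly_margin=10.0)
print(f"CLV over {len(survival)} months: {clv:.2f}")
```

The survival values could come from a Kaplan-Meier estimate. The result covers only the months included in the curve, so a truncated curve understates lifetime value.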