| name | data-analysis |
| description | Analyze datasets and create visualizations |
| version | 1.0.0 |
| author | Minion Team |
| tags | data, analysis, visualization, pandas |
| requirements | pandas>=2.0.0, matplotlib>=3.7.0, numpy>=1.24.0 |
Data Analysis Skill
Description
This skill analyzes datasets and creates visualizations. It reads CSV files, computes descriptive statistics, and generates plots such as line, bar, and scatter charts.
Usage Instructions
When a user requests data analysis:
- Load the dataset: Use pandas to read the data file
- Inspect the data: Check shape, columns, data types, and basic statistics
- Clean the data: Handle missing values and outliers if necessary
- Perform analysis: Calculate the statistics relevant to the user's question
- Create visualizations: Generate appropriate plots (line, bar, scatter, etc.)
- Save results: Export results and visualizations
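The six steps above can be sketched end to end with pandas and matplotlib. This is a minimal illustration, not the skill's actual implementation: the inline CSV data, the column names, the median fill, and the output filenames are all assumptions made for the example.

```python
import io
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical inline data standing in for a user's CSV file.
CSV_TEXT = """month,sales
2024-01,100
2024-02,120
2024-03,
2024-04,160
"""

# 1. Load the dataset.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# 2. Inspect the data: shape, column types, basic statistics.
print(df.shape)
print(df.dtypes)
print(df.describe())

# 3. Clean the data: fill the missing value with the column median
#    (one possible strategy; the right choice depends on the dataset).
df["sales"] = df["sales"].fillna(df["sales"].median())

# 4. Perform analysis: a single statistic as a stand-in for the user's question.
mean_sales = df["sales"].mean()

# 5. Create a visualization.
fig, ax = plt.subplots()
ax.plot(df["month"], df["sales"], marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Sales")
ax.set_title("Sales by month")

# 6. Save results: the plot as PNG and the cleaned data as CSV.
out_dir = Path(tempfile.mkdtemp())
fig.savefig(out_dir / "sales_trend.png")
df.to_csv(out_dir / "sales_clean.csv", index=False)
plt.close(fig)
```

A temporary directory is used here only so the example leaves no files behind; a real run would write to a location the user can find.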
Available Resources
Scripts
scripts/analyze.py: Core analysis functions
- load_dataset(filepath): Load data from various formats
- basic_statistics(df): Calculate descriptive statistics
- detect_outliers(df, column): Identify outliers
- correlation_analysis(df): Compute correlations
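The contents of scripts/analyze.py are not shown here, so the following is only a sketch of what two of these functions might look like. It assumes outliers are flagged with the common 1.5 × IQR rule and that correlations are Pearson correlations over the numeric columns; the `k` parameter and the return types are assumptions, not the script's documented behavior.

```python
import pandas as pd


def detect_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Return rows whose value in `column` falls outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumed implementation: the real script may use a different rule.
    """
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    mask = (df[column] < q1 - k * iqr) | (df[column] > q3 + k * iqr)
    return df[mask]


def correlation_analysis(df: pd.DataFrame) -> pd.DataFrame:
    """Return the pairwise Pearson correlation matrix of the numeric columns."""
    return df.select_dtypes(include="number").corr()


# Hypothetical data: one price is far from the rest.
prices = pd.DataFrame(
    {"price": [10, 11, 12, 13, 14, 200], "quantity": [1, 2, 3, 4, 5, 6]}
)
outliers = detect_outliers(prices, "price")
corr = correlation_analysis(prices)
print(outliers)
print(corr)
```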
scripts/visualize.py: Visualization utilities
- plot_distribution(df, column): Create distribution plots
- plot_correlation_matrix(df): Visualize correlation heatmap
- plot_time_series(df, date_col, value_col): Time series plots
- save_plot(fig, filename): Save figure to file
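As with the analysis functions, the source of scripts/visualize.py is not included, so this is a hedged sketch of how plot_distribution and save_plot could be written with matplotlib. The histogram choice, the `bins` and `dpi` parameters, and the returned values are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_distribution(df: pd.DataFrame, column: str, bins: int = 10):
    """Draw a histogram of `column` and return the figure (assumed behavior)."""
    fig, ax = plt.subplots()
    ax.hist(df[column].dropna(), bins=bins)
    ax.set_xlabel(column)
    ax.set_ylabel("Count")
    ax.set_title(f"Distribution of {column}")
    return fig


def save_plot(fig, filename, dpi: int = 150) -> Path:
    """Save the figure, creating parent directories first, and return the path."""
    path = Path(filename)
    path.parent.mkdir(parents=True, exist_ok=True)
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(fig)  # release the figure once it is on disk
    return path


# Hypothetical data and output location.
data = pd.DataFrame({"price": [9.5, 10.0, 10.2, 11.1, 12.4, 15.0]})
fig = plot_distribution(data, "price", bins=5)
saved = save_plot(fig, Path(tempfile.mkdtemp()) / "plots" / "price_distribution.png")
print(saved)
```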
References
- references/examples.md: Usage examples and common patterns
- references/best_practices.md: Data analysis best practices
Example Prompts
- "Analyze this CSV file and show me the trends"
- "Create a visualization of the sales data by month"
- "Find correlations in this dataset"
- "Identify outliers in the price column"
- "Generate a statistical summary of the data"
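To make one of these prompts concrete, here is a sketch of the aggregation step that "sales data by month" might translate to. The column names `date` and `amount` and the sample rows are hypothetical; a real dataset would need its own column mapping.

```python
import pandas as pd

# Hypothetical sales records with one row per transaction.
sales = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-17", "2024-03-01"]
        ),
        "amount": [100.0, 50.0, 80.0, 40.0, 70.0],
    }
)

# Group transactions by calendar month and total the amounts.
monthly = sales.groupby(sales["date"].dt.to_period("M"))["amount"].sum()
print(monthly)
```

The resulting Series can then be passed to a plotting call, for example `monthly.plot(kind="bar")`, to produce the chart the prompt asks for.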
Output Format
Analysis results should include:
- Data overview (shape, columns, types)
- Statistical summary
- Key insights and findings
- Visualizations (saved as PNG files)
- Recommendations or next steps
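One way to keep results consistent with the list above is to collect them in a single structure before presenting them. The `build_report` helper below is hypothetical and not part of the skill's scripts; its key names simply mirror the five items listed.

```python
import pandas as pd


def build_report(df: pd.DataFrame, insights, plot_files, next_steps) -> dict:
    """Assemble the five output sections into one dictionary (hypothetical helper)."""
    return {
        "overview": {
            "shape": df.shape,
            "columns": list(df.columns),
            "dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
        },
        "summary": df.describe().to_dict(),
        "insights": insights,
        "visualizations": plot_files,
        "next_steps": next_steps,
    }


# Hypothetical inputs for illustration.
items = pd.DataFrame({"item": ["a", "b", "c"], "price": [10, 20, 30]})
report = build_report(
    items,
    insights=["Prices rise steadily across the three items."],
    plot_files=["price_distribution.png"],
    next_steps=["Collect more rows before drawing conclusions."],
)
print(report["overview"])
```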
Notes
- Always inspect data before analysis
- Handle missing values appropriately
- Choose visualizations that match the data type
- Provide clear explanations of findings
- Save all outputs for user reference
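Two of these notes, inspecting missing values and matching the plot to the data type, can be checked programmatically. The sketch below is an assumption about how that might be done: `suggest_plot` is a hypothetical helper, and its mapping (datetime to line, numeric to histogram, everything else to bar) is a simple heuristic rather than a rule taken from the skill.

```python
import pandas as pd
from pandas.api import types as ptypes


def suggest_plot(series: pd.Series) -> str:
    """Suggest a plot type from the column's dtype (heuristic, hypothetical)."""
    if ptypes.is_datetime64_any_dtype(series):
        return "line"
    if ptypes.is_bool_dtype(series):
        return "bar"  # checked before numeric, since booleans count as numeric
    if ptypes.is_numeric_dtype(series):
        return "histogram"
    return "bar"


# Hypothetical dataset with one missing date and one missing price.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-01", "2024-01-02", None]),
        "price": [1.0, None, 3.0],
        "category": ["a", "b", "a"],
    }
)

# Count of missing values per column, to review before choosing a strategy.
missing_counts = df.isna().sum().to_dict()
suggestions = {col: suggest_plot(df[col]) for col in df.columns}
print(missing_counts)
print(suggestions)
```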