name	data-viz-insight
description	Interactive data exploration and visualization skill. Use when users ask to visualize data, analyze datasets, create charts, or explore data files (CSV, Excel, Parquet, JSON). This skill guides through data exploration, proposes visualization strategies based on data characteristics, creates interactive Plotly charts in marimo notebooks, and generates analytical conclusions.

Data Viz Insight

This skill enables interactive data exploration, visualization planning, and automated chart generation in marimo notebooks.

Prerequisites

Marimo server must be running with MCP support:

uv run marimo edit main.py --mcp --no-token

Workflow

Follow this 5-step interactive process:

Step 1: Data Input

If the user hasn't provided a data file path, ask for it:

"Which data file would you like to visualize?"
Accept: CSV, Excel (.xlsx), Parquet (.parquet), JSON files

Verify the file exists before proceeding.

Step 2: Auto-Explore Data

Option 1: Use the exploration script (recommended for comprehensive analysis)

uv run python .claude/skills/data-viz-insight/scripts/explore_data.py data.csv

The script provides:

Data shape and schema
Summary statistics for numeric columns
Value counts for categorical columns
Missing data analysis
Date ranges for temporal columns
Sample rows
Visualization recommendations

Option 2: Use Polars directly for custom exploration

import polars as pl

# Read data (format detected automatically)
df = pl.read_csv("data.csv")  # or read_excel, read_parquet, read_json

# Gather key information
schema = df.schema  # Column names and types
shape = (df.height, df.width)  # Rows, columns
stats = df.describe()  # Summary statistics
nulls = df.null_count()  # Missing values

Present findings in a structured summary:

Data shape (rows × columns)
Column types breakdown (numeric, categorical, temporal)
Key statistics for numeric columns
Unique values for categorical columns
Date ranges for temporal data
Notable patterns or data quality issues

For detailed Polars patterns, see references/polars-patterns.md.

Step 3: Understand User Interest

After sharing initial insights, explicitly ask what aspects interest the user:

"What aspects of this data would you like to explore?"
"Are you interested in trends over time, category breakdowns, or relationships between variables?"
"Which specific fields or patterns caught your attention?"

Listen for specific interests like:

Time-based trends
Category comparisons
Distribution analysis
Correlation between variables
Metric vs metric comparisons
Top/bottom performers
Anomaly detection

Step 4: Propose Visualizations

Based on data characteristics and user interests, propose 3-5 specific charts with rationale:

Example proposal format:

Based on your data analysis, I propose these visualizations:

1. **[Metric] by [Category] (Bar Chart)** - Compare values across different groups
2. **[Metric] Over Time (Line Chart)** - Show trends and patterns
3. **[Category] Distribution (Pie Chart)** - Visualize proportions of the whole
4. **[Value] Distribution (Histogram)** - Understand the spread of values
5. **[Variable A] vs [Variable B] (Scatter)** - Explore relationships

Would you like me to create these visualizations?

Adapt to data type:

Sales data: "Revenue by Region", "Monthly Sales Trend", "Product Mix"
Web analytics: "Traffic by Source", "Daily Visitors", "Bounce Rate Distribution"
Scientific data: "Measurements by Condition", "Temperature Over Time", "Correlation Matrix"
Financial data: "Spending by Category", "Transaction Trend", "Amount Distribution"

For chart type selection guidance, see references/plotly-charts.md.

Step 5: Execute & Conclude

Once approved, create visualizations in marimo notebook and write conclusions.

Adding Visualization Cells

Use marimo MCP tools to inspect the notebook:

mcp__marimo__get_active_notebooks - Get session ID
mcp__marimo__get_lightweight_cell_map - View structure

Add cells directly to the marimo file using the Edit tool. Each chart follows this pattern:

@app.cell
def _(df, go, pl):
    import plotly.graph_objects as go

    # Group data for visualization
    category_totals = df.group_by("category").agg(
        pl.col("amount").sum().alias("total")
    ).sort("total", descending=True)

    # Create chart using Graph Objects
    fig = go.Figure(data=[
        go.Bar(
            x=category_totals["category"].to_list(),
            y=category_totals["total"].to_list(),
            marker=dict(
                color=category_totals["total"].to_list(),
                colorscale='Blues',
                showscale=False
            )
        )
    ])
    fig.update_layout(
        title="Spending by Category",
        xaxis_title="Category",
        yaxis_title="Total Amount (TWD)",
        showlegend=False
    )
    return fig

Cell guidelines:

ALWAYS use Plotly Graph Objects (plotly.graph_objects), NOT Plotly Express
Import go from plotly.graph_objects in the imports cell
Reference data from previous cells (e.g., df)
Convert Polars columns to lists using .to_list() before passing to plotly
Return the figure object
Use descriptive titles and axis labels with update_layout()
One visualization per cell for reactivity

Why Graph Objects over Express:

No numpy dependency required
More control over chart customization
Explicit data handling with .to_list()
Better performance with Polars DataFrames

Writing Conclusions

Add a conclusion cell summarizing key findings:

@app.cell
def _():
    import marimo as mo
    mo.md("""
    ## Data Analysis Summary

    **Key Findings:**
    - [Finding 1: e.g., "Category A accounts for 45% of total (12,450 units)"]
    - [Finding 2: e.g., "Peak activity occurs on [day/time] - 2.3x above average"]
    - [Finding 3: e.g., "Largest value: 3,196 in [category] on [date]"]

    **Insights:**
    - [Pattern or trend observed from the data]
    - [Notable anomaly or outlier identified]
    """)
    return

Keep conclusions:

Brief (3-5 bullet points for findings + 2-3 insights)
Data-driven (include specific numbers from analysis)
Actionable (suggest patterns or next steps when relevant)

After creating visualizations, use MCP tools to verify:

mcp__marimo__get_cell_outputs - View rendered charts
mcp__marimo__lint_notebook - Validate notebook structure

Marimo MCP Tools Reference

When marimo runs with MCP enabled, these tools are available:

mcp__marimo__get_marimo_rules - Get marimo best practices
mcp__marimo__get_active_notebooks - List active sessions and file paths
mcp__marimo__get_lightweight_cell_map - Preview notebook structure
mcp__marimo__get_tables_and_variables - Inspect data in session
mcp__marimo__get_cell_outputs - View visualization results
mcp__marimo__lint_notebook - Validate changes

Data Format Support

Polars supports these formats natively:

CSV:

df = pl.read_csv("data.csv")

Excel:

df = pl.read_excel("data.xlsx", sheet_name="Sheet1")

Parquet:

df = pl.read_parquet("data.parquet")

JSON:

df = pl.read_json("data.json")

Visualization Selection Guide

Quick reference for choosing chart types:

Categorical comparisons → Bar chart, horizontal bar Proportions → Pie chart (<7 categories), treemap Time series → Line chart, area chart Distributions → Histogram, box plot, violin plot Relationships → Scatter plot, bubble chart Correlations → Heatmap Multi-category → Grouped bar, stacked bar

For detailed examples and customization patterns, see references/plotly-charts.md.

Resources

This skill includes reference documentation for detailed patterns:

references/polars-patterns.md - Complete Polars data exploration patterns, filtering, transformations, and date operations
references/plotly-charts.md - Chart type examples, customization patterns, and interactive features

data-viz-insight

Install Skill

SKILL.md