Claude Code Plugins

Community-maintained marketplace


earthquake_data_tsunami-query

@aayush-gc/claude-agent-skill-utilities

Query and analyze earthquake_data_tsunami.xlsx data using conversational AI. Automatically generated skill for an Excel file with 783 rows and 13 columns across one sheet.

Install Skill

1. Download skill
2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reviewing its instructions before using it.

SKILL.md

name: earthquake_data_tsunami-query
description: Query and analyze earthquake_data_tsunami.xlsx data using conversational AI. Automatically generated skill for an Excel file with 783 rows and 13 columns across one sheet.

earthquake_data_tsunami Query Skill

Auto-generated skill for querying earthquake_data_tsunami.xlsx

Dataset Overview

  • Original File: earthquake_data_tsunami.xlsx
  • File Size: 0.05 MB
  • Sheets: 1
  • Total Rows: 783
  • Total Columns: 13
  • Formulas: 0
  • Data Format: Parquet (optimized for fast querying)

Available Sheets

earthquake_data_tsunami

  • Rows: 783
  • Columns: 13
  • Key columns: magnitude, cdi, mmi, sig, nst

Query Capabilities

This skill enables natural language querying of the Excel data. You can:

Filtering and Selection

  • Filter rows based on conditions
  • Select specific columns
  • Combine multiple conditions

Aggregations

  • Group by categories
  • Calculate sums, averages, counts
  • Find min/max values

Analysis

  • Compare across groups
  • Identify trends and patterns
  • Generate insights from the data

Example Queries

"Show me total sales by age group"
"What's the average revenue for customers over 25?"
"Filter rows where status is 'active' and created after 2024-01-01"
"Group by category and calculate sum of revenue"
"Find the top 10 products by sales volume"
"Compare performance across different regions"

Formula Information

This dataset contains no formulas.

Key Formulas

No formulas were found in this workbook. When formulas are present, they are documented in formula_map.json with their cell locations and dependencies.

Technical Details

  • Storage Format: Parquet (columnar, compressed)
  • Query Engine: Polars with streaming support
  • Memory Efficiency: Lazy loading, data loaded on-demand
  • Performance: ~30x faster than direct Excel queries
  • Data Location: earthquake_data_tsunami_parquet/
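
The Parquet files can also be scanned directly with Polars when the bundled query_helper is not needed. The glob below assumes the data directory contains one Parquet file per sheet; list the contents of earthquake_data_tsunami_parquet/ first to confirm the actual layout.

import polars as pl

# Lazily scan every Parquet file in the skill's data directory (file layout assumed)
lf = pl.scan_parquet("earthquake_data_tsunami_parquet/*.parquet")

# Nothing is read from disk until .collect(); streaming keeps memory usage low
print(lf.select(pl.len()).collect(streaming=True))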

Instructions for Claude

When a user requests to query this data:

Step 1: Load Required Resources

import polars as pl
from pathlib import Path
import json

# Load data dictionary to understand schema
with open('data_dictionary.json', 'r') as f:
    schema = json.load(f)

# Load formula map if needed
with open('formula_map.json', 'r') as f:
    formulas = json.load(f)

Step 2: Use Query Helper

from query_helper import QueryHelper

# Initialize helper with the skill's Parquet data directory
helper = QueryHelper('earthquake_data_tsunami_parquet')

# Load a sheet lazily (sheet name taken from the Available Sheets section;
# confirm with helper.list_sheets() if unsure)
df = helper.load_sheet('earthquake_data_tsunami', lazy=True)

# Execute query with streaming: strong earthquakes grouped by the tsunami flag
result = df.filter(
    pl.col("magnitude") > 6.5
).group_by("tsunami").agg([
    pl.len().alias("count"),
    pl.mean("depth").alias("avg_depth")
]).collect(streaming=True)

# Display results (paginated)
print(result.head(100))

Step 3: Handle Large Results

  • Always use .head(100) or .limit(100) for initial results
  • Offer to show more if user requests
  • Use streaming mode for queries: .collect(streaming=True)
  • Paginate large outputs
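
A minimal pagination sketch for the points above, continuing from the lazy frame df created in Step 2 (the page_size and offset names are illustrative):

# Fetch rows [offset, offset + page_size) from the lazy query
page_size = 100
offset = 0  # advance by page_size when the user asks for more rows

page = df.slice(offset, page_size).collect(streaming=True)
print(page)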

Step 4: Reference Documentation

  • Check data_dictionary.json for:
    • Column names and data types
    • Sample values
    • Formula indicators
  • Check formula_map.json for:
    • Excel formula definitions
    • Cell locations
    • Dependencies
  • Check sample_data.json for:
    • Representative data examples
    • Data patterns and formats
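
The sketch below shows one way to inspect data_dictionary.json before building a query. The key layout (sheet name -> column_details -> name) is inferred from the complete workflow further down; adjust it if the actual file differs.

import json

with open('data_dictionary.json', 'r') as f:
    schema = json.load(f)

# Print each sheet with its column names (layout assumed, see note above)
for sheet_name, sheet_info in schema.items():
    columns = [col['name'] for col in sheet_info.get('column_details', [])]
    print(f"{sheet_name}: {', '.join(columns)}")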

Column Reference

Sheet                     Column      Type      Has Formulas
earthquake_data_tsunami   magnitude   numeric   No
earthquake_data_tsunami   cdi         numeric   No
earthquake_data_tsunami   mmi         numeric   No
earthquake_data_tsunami   sig         numeric   No
earthquake_data_tsunami   nst         numeric   No
earthquake_data_tsunami   dmin        numeric   No
earthquake_data_tsunami   gap         numeric   No
earthquake_data_tsunami   depth       numeric   No
earthquake_data_tsunami   latitude    numeric   No
earthquake_data_tsunami   longitude   numeric   No
earthquake_data_tsunami   Year        numeric   No
earthquake_data_tsunami   Month       numeric   No
earthquake_data_tsunami   tsunami     numeric   No

For complete column information, see data_dictionary.json.

Data Dictionary Location

All detailed schema information is in data_dictionary.json:

  • Column names, types, and sample values
  • Formula locations and definitions
  • Sheet relationships
  • Data statistics

Best Practices

  1. Always use lazy loading with pl.scan_parquet() for large datasets
  2. Stream results with .collect(streaming=True) to avoid memory issues
  3. Limit initial results to 100 rows, offer pagination
  4. Check data dictionary before constructing queries
  5. Handle nulls gracefully in user-facing outputs
  6. Validate column names from schema before querying
  7. Use appropriate aggregations based on data types
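
A short sketch for practices 5 and 6 above, reusing the schema and df objects from the earlier steps. The column-validation approach assumes the data_dictionary.json layout described in Step 4; adapt it if the actual file differs.

# Practice 6: validate the requested column against the schema before querying
requested = "magnitude"
sheet_key = next(iter(schema))  # single-sheet workbook
valid_columns = [col['name'] for col in schema[sheet_key].get('column_details', [])]
if requested not in valid_columns:
    raise ValueError(f"Unknown column: {requested}")

# Practice 5: handle nulls gracefully before presenting results
result = (
    df.select(pl.col(requested))
      .drop_nulls()           # or .fill_null(...) when a default is more appropriate
      .head(100)
      .collect(streaming=True)
)
print(result)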

Example Complete Workflow

import polars as pl
import json
from query_helper import QueryHelper

# 1. Initialize
helper = QueryHelper('earthquake_data_tsunami_parquet')

# 2. Load schema
with open('data_dictionary.json', 'r') as f:
    schema = json.load(f)

# 3. Check available columns (this workbook has a single sheet)
sheet_name = next(iter(schema))
sheet_info = schema.get(sheet_name, {})
columns = [col['name'] for col in sheet_info.get('column_details', [])]
print(f"Available columns: {', '.join(columns[:10])}")

# 4. Execute query (columns from the Column Reference above)
df = helper.load_sheet(sheet_name, lazy=True)
result = df.filter(
    pl.col('magnitude') > 6.5
).select(['Year', 'magnitude', 'depth', 'tsunami']).head(100).collect(streaming=True)

# 5. Display formatted results
total_rows = df.select(pl.len()).collect().item()
print(result)
print(f"\nShowing {len(result)} of {total_rows} total rows")

Troubleshooting

Q: Column not found error? A: Check data_dictionary.json for the exact column names (they are case-sensitive).

Q: Memory issues with large queries? A: Use .head() to limit results and .collect(streaming=True).

Q: Formula not working? A: Check formula_map.json; formula results are pre-computed in the Parquet data.

Q: Sheet name not found? A: List the available sheets from the schema or use helper.list_sheets().


Note: This is an auto-generated skill. The quality of query results depends on the data quality in the source Excel file.