Claude Code Plugins

Community-maintained marketplace

Feedback

Comprehensive guide for using BigQuery CLI (bq) to query and inspect tables in Monzo's BigQuery projects, with emphasis on data sensitivity and INFORMATION_SCHEMA queries.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name bigquery
description Comprehensive guide for using BigQuery CLI (bq) to query and inspect tables in Monzo's BigQuery projects, with emphasis on data sensitivity and INFORMATION_SCHEMA queries.

BigQuery CLI Skill

This skill provides comprehensive guidance on using the BigQuery CLI (bq) for querying and inspecting data in Monzo's BigQuery projects.

Core Principles

  1. Always specify the project explicitly using --project_id=PROJECT_NAME
  2. Always use Standard SQL with --use_legacy_sql=false
  3. Respect data sensitivity - avoid querying actual content from sensitive tables
  4. Use INFORMATION_SCHEMA for metadata queries (schemas, columns, tables)

Common Query Patterns

1. Check Table Schema (INFORMATION_SCHEMA)

Use this to inspect column names, types, and structure without accessing sensitive data:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT column_name, data_type, is_nullable
   FROM \`monzo-analytics.DATASET_NAME.INFORMATION_SCHEMA.COLUMNS\`
   WHERE table_name = 'TABLE_NAME'
   ORDER BY ordinal_position"

Examples:

# Check dims dataset table schema
bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT column_name, data_type FROM \`monzo-analytics.dims.INFORMATION_SCHEMA.COLUMNS\`
   WHERE table_name = 'vulnerable_customer_logs_dim' ORDER BY ordinal_position"

# Check prod dataset table schema
bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT column_name, data_type FROM \`monzo-analytics.prod.INFORMATION_SCHEMA.COLUMNS\`
   WHERE table_name = 'transactions' ORDER BY ordinal_position"

2. Count Rows (Safe for Sensitive Tables)

Use COUNT(*) to check table size without exposing data:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT COUNT(*) as row_count FROM \`monzo-analytics.DATASET.TABLE_NAME\`"

Example:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT COUNT(*) as row_count FROM \`monzo-analytics.dims.vulnerable_customer_logs_dim\`"

3. List All Tables in a Dataset

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT table_name, table_type
   FROM \`monzo-analytics.DATASET_NAME.INFORMATION_SCHEMA.TABLES\`
   ORDER BY table_name"

Example:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT table_name FROM \`monzo-analytics.dims.INFORMATION_SCHEMA.TABLES\`
   ORDER BY table_name"

4. Export Schema to File

Useful for programmatic processing of table schemas:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  --format=csv --quiet \
  "SELECT column_name FROM \`monzo-analytics.DATASET.INFORMATION_SCHEMA.COLUMNS\`
   WHERE table_name = 'TABLE_NAME' ORDER BY ordinal_position" \
  | tail -n +2 > /tmp/columns.txt

Example:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  --format=csv --quiet \
  "SELECT column_name FROM \`monzo-analytics.dims.INFORMATION_SCHEMA.COLUMNS\`
   WHERE table_name = 'vulnerable_customer_logs_dim' ORDER BY ordinal_position" \
  | tail -n +2 > /tmp/columns.txt

5. Check Table Metadata

Get table creation time, size, and other metadata:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT
     table_name,
     creation_time,
     ROUND(size_bytes/1024/1024/1024, 2) as size_gb,
     row_count
   FROM \`monzo-analytics.DATASET_NAME.INFORMATION_SCHEMA.TABLES\`
   WHERE table_name = 'TABLE_NAME'"

6. Find Tables by Pattern

Search for tables matching a naming pattern:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT table_name
   FROM \`monzo-analytics.DATASET_NAME.INFORMATION_SCHEMA.TABLES\`
   WHERE table_name LIKE '%PATTERN%'
   ORDER BY table_name"

Example:

# Find all customer-related tables
bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT table_name FROM \`monzo-analytics.dims.INFORMATION_SCHEMA.TABLES\`
   WHERE table_name LIKE '%customer%' ORDER BY table_name"

7. Get Detailed Column Information

Get comprehensive column metadata including descriptions:

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT
     column_name,
     data_type,
     is_nullable,
     is_partitioning_column
   FROM \`monzo-analytics.DATASET.INFORMATION_SCHEMA.COLUMNS\`
   WHERE table_name = 'TABLE_NAME'
   ORDER BY ordinal_position"

8. Sample Data (Non-Sensitive Tables Only)

⚠️ WARNING: Only use this on non-sensitive tables. Never query actual content from people/staff/PII tables.

bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT * FROM \`monzo-analytics.DATASET.TABLE_NAME\` LIMIT 10"

Output Formatting Options

Control how results are displayed:

# CSV format
--format=csv

# JSON format
--format=json

# Pretty table format (default)
--format=prettyjson

# Quiet mode (no status messages)
--quiet

# Maximum rows to return
--max_rows=100

Common Projects and Datasets

Main Analytics Projects

  • monzo-analytics - Main analytics warehouse
  • monzo-analytics-v2 - New OOM architecture models
  • monzo-analytics-pii - PII-containing data (use with caution)
  • sanitized-events-prod - Sanitised event data
  • raw-analytics-events-prod - Raw event data

Common Datasets

  • dims - Dimension tables
  • prod - Production tables
  • lending - Lending-specific tables
  • slurpee - Slurpee data

Data Sensitivity Guidelines

✅ SAFE Operations (Always Allowed)

  1. INFORMATION_SCHEMA queries - These only return metadata, not actual data
  2. COUNT(*) queries - These only return row counts
  3. Schema inspection - Column names, types, table structure

⚠️ RESTRICTED Operations (Use with Caution)

  1. Querying actual content from:

    • People/staff data tables
    • PII-containing tables
    • Customer financial data
    • Authentication/security tables
  2. When in doubt:

    • Stick to INFORMATION_SCHEMA queries
    • Use COUNT(*) to verify table exists
    • Ask the user before querying actual content

🚫 NEVER Do This

  • Query actual rows from people, staff, hibob tables
  • Export PII data to local files
  • Query authentication credentials or tokens
  • Access customer financial details without explicit permission

Error Handling

Common Errors and Solutions

Error: "Not found: Table"

# Solution: Check the table exists first
bq query --project_id=monzo-analytics --use_legacy_sql=false \
  "SELECT table_name FROM \`monzo-analytics.DATASET.INFORMATION_SCHEMA.TABLES\`
   WHERE table_name LIKE '%SEARCH_TERM%'"

Error: "Access Denied"

# Solution: You may not have permissions for that project/dataset
# Try a different project or ask the user about access

Error: "Syntax error"

# Solution: Ensure you're using Standard SQL (--use_legacy_sql=false)
# Check backtick usage around project.dataset.table identifiers

Best Practices

  1. Always use fully-qualified table names with backticks:

    `project-id.dataset.table`
    
  2. Use LIMIT for exploratory queries to avoid large result sets:

    SELECT * FROM `project.dataset.table` LIMIT 10
    
  3. Check row counts before running expensive queries:

    # First check size
    bq query --project_id=monzo-analytics --use_legacy_sql=false \
      "SELECT COUNT(*) FROM \`project.dataset.table\`"
    
    # Then run full query if reasonable
    
  4. Use dry-run for cost estimation (for expensive queries):

    bq query --dry_run --use_legacy_sql=false "YOUR_QUERY_HERE"
    
  5. Export large results to file:

    bq query --project_id=monzo-analytics --use_legacy_sql=false \
      --format=csv "YOUR_QUERY" > output.csv
    

Quick Reference Commands

# Schema check
bq query --project_id=PROJECT --use_legacy_sql=false \
  "SELECT column_name, data_type FROM \`PROJECT.DATASET.INFORMATION_SCHEMA.COLUMNS\`
   WHERE table_name = 'TABLE' ORDER BY ordinal_position"

# Row count
bq query --project_id=PROJECT --use_legacy_sql=false \
  "SELECT COUNT(*) FROM \`PROJECT.DATASET.TABLE\`"

# List tables
bq query --project_id=PROJECT --use_legacy_sql=false \
  "SELECT table_name FROM \`PROJECT.DATASET.INFORMATION_SCHEMA.TABLES\`
   ORDER BY table_name"

# Table metadata
bq query --project_id=PROJECT --use_legacy_sql=false \
  "SELECT table_name, row_count, size_bytes
   FROM \`PROJECT.DATASET.INFORMATION_SCHEMA.TABLES\`
   WHERE table_name = 'TABLE'"

When to Use This Skill

Invoke this skill when you need to:

  • Query BigQuery tables or datasets
  • Inspect table schemas or column types
  • Count rows or check table existence
  • Export table metadata
  • Verify data before running dbt models
  • Investigate data issues or table structures
  • Find tables by naming patterns

Integration with dbt Workflow

When working on dbt models in the analytics repository:

  1. Before creating import models - Use BigQuery CLI to inspect source schemas
  2. Before running dbt - Verify source tables exist and have expected structure
  3. Debugging dbt failures - Query actual tables to understand data issues
  4. Validating generators - Check that column types match between source and generator

Remember: Always respect data sensitivity guidelines and use INFORMATION_SCHEMA when possible.