Claude Code Plugins

Community-maintained marketplace

Feedback

Complete guide to GA4 BigQuery export including setup, schema documentation, SQL query patterns, and data analysis. Use when exporting GA4 data to BigQuery, writing SQL queries for GA4 data, analyzing event-level data, working with nested/repeated fields (UNNEST), or building custom reports from raw data. Covers BigQuery linking, events_* tables, SQL patterns, and performance optimization.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name ga4-bigquery
description Complete guide to GA4 BigQuery export including setup, schema documentation, SQL query patterns, and data analysis. Use when exporting GA4 data to BigQuery, writing SQL queries for GA4 data, analyzing event-level data, working with nested/repeated fields (UNNEST), or building custom reports from raw data. Covers BigQuery linking, events_* tables, SQL patterns, and performance optimization.

GA4 BigQuery Export and Analysis

Overview

GA4 BigQuery export provides raw, event-level data access for advanced analysis, custom reporting, machine learning, and long-term data warehousing.

When to Use This Skill

Invoke this skill when:

  • Exporting GA4 raw data to BigQuery
  • Writing SQL queries for GA4 event data
  • Analyzing unsampled event-level data
  • Working with nested/repeated fields using UNNEST
  • Building custom reports beyond GA4 UI limits
  • Creating attribution models with raw data
  • Performing user journey analysis across all events
  • Integrating GA4 data with other data sources
  • Building machine learning models on GA4 data
  • Analyzing historical data beyond GA4 retention limits
  • Optimizing BigQuery query performance
  • Working with events_* table schema
  • Extracting event parameters from nested structures

Core Capabilities

BigQuery Export Setup

Requirements:

  • GA4 property (standard or 360)
  • Google Cloud project
  • BigQuery API enabled
  • Appropriate permissions

Setup Steps:

  1. Create Google Cloud Project:

    • Go to console.cloud.google.com
    • Create new project or select existing
    • Enable BigQuery API
  2. Link GA4 to BigQuery:

    • GA4 Admin → Product Links → BigQuery Links
    • Click "Link"
    • Choose Google Cloud project
    • Select dataset location (US, EU, etc.)
    • Configure export:
      • Daily: Complete export once per day (~9AM property timezone)
      • Streaming: Real-time export (360 only)
    • Click "Next"
    • Confirm setup

Export Options:

  • Daily Export: Free for standard GA4, once per day
  • Streaming Export: GA4 360 only, near real-time
  • Include Advertising IDs: Optional, for Ads integration

Data Availability:

  • Daily tables: ~24 hours after day ends
  • Intraday tables: ~3 updates per day
  • Streaming: Minutes after event collection (360)

BigQuery Table Structure

Table Naming:

  • project.dataset.events_YYYYMMDD - Daily export
  • project.dataset.events_intraday_YYYYMMDD - Intraday (partial day)
  • project.dataset.events_* - Wildcard for all dates

Key Schema Fields:

Event Fields:

  • event_date: YYYYMMDD format (STRING)
  • event_timestamp: Microseconds since epoch (INTEGER)
  • event_name: Event name (STRING)
  • event_params: Event parameters (RECORD, REPEATED)
  • event_value_in_usd: Event value in USD (FLOAT)

User Fields:

  • user_id: User ID if set (STRING)
  • user_pseudo_id: Anonymous user ID (STRING)
  • user_properties: User properties (RECORD, REPEATED)
  • user_first_touch_timestamp: First visit timestamp (INTEGER)

Device Fields:

  • device.category: desktop, mobile, tablet
  • device.operating_system: Windows, iOS, Android
  • device.browser: Chrome, Safari, etc.

Geo Fields:

  • geo.country: Country name
  • geo.region: State/region
  • geo.city: City name

Traffic Source Fields:

  • traffic_source.source: Source (google, direct)
  • traffic_source.medium: Medium (organic, cpc)
  • traffic_source.name: Campaign name

E-commerce Fields:

  • ecommerce.transaction_id: Transaction ID (STRING)
  • ecommerce.purchase_revenue_in_usd: Purchase revenue (FLOAT)
  • items: Items array (RECORD, REPEATED)

Basic SQL Query Patterns

Query 1: Event Count by Name

SELECT
  event_name,
  COUNT(*) as event_count
FROM
  `project.dataset.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'
GROUP BY
  event_name
ORDER BY
  event_count DESC

Query 2: Extract Event Parameters

SELECT
  event_date,
  event_name,
  user_pseudo_id,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') as page_location,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_title') as page_title
FROM
  `project.dataset.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'
  AND event_name = 'page_view'
LIMIT 1000

Query 3: Purchase Analysis

SELECT
  event_date,
  COUNT(DISTINCT user_pseudo_id) as purchasers,
  COUNT(DISTINCT ecommerce.transaction_id) as transactions,
  SUM(ecommerce.purchase_revenue_in_usd) as total_revenue,
  AVG(ecommerce.purchase_revenue_in_usd) as avg_order_value
FROM
  `project.dataset.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'
  AND event_name = 'purchase'
  AND ecommerce.transaction_id IS NOT NULL
GROUP BY
  event_date
ORDER BY
  event_date

Query 4: UNNEST Items Array

SELECT
  event_date,
  item.item_name,
  item.item_category,
  SUM(item.quantity) as total_quantity,
  SUM(item.item_revenue_in_usd) as total_revenue
FROM
  `project.dataset.events_*`,
  UNNEST(items) as item
WHERE
  _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'
  AND event_name = 'purchase'
GROUP BY
  event_date,
  item.item_name,
  item.item_category
ORDER BY
  total_revenue DESC

Advanced Query Patterns

User Journey Analysis

WITH user_events AS (
  SELECT
    user_pseudo_id,
    event_timestamp,
    event_name,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') as page_location
  FROM
    `project.dataset.events_*`
  WHERE
    _TABLE_SUFFIX = '20250115'
)
SELECT
  user_pseudo_id,
  ARRAY_AGG(
    STRUCT(event_name, page_location, event_timestamp)
    ORDER BY event_timestamp
  ) as event_sequence
FROM
  user_events
GROUP BY
  user_pseudo_id
LIMIT 100

Session Attribution

SELECT
  event_date,
  traffic_source.source,
  traffic_source.medium,
  traffic_source.name as campaign,
  COUNT(DISTINCT user_pseudo_id) as users,
  COUNT(DISTINCT CONCAT(user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) as sessions
FROM
  `project.dataset.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'
GROUP BY
  event_date,
  traffic_source.source,
  traffic_source.medium,
  traffic_source.name
ORDER BY
  sessions DESC

Helper Functions

-- Create reusable functions for parameter extraction
CREATE TEMP FUNCTION GetParamString(params ANY TYPE, target_key STRING)
RETURNS STRING
AS (
  (SELECT value.string_value FROM UNNEST(params) WHERE key = target_key)
);

CREATE TEMP FUNCTION GetParamInt(params ANY TYPE, target_key STRING)
RETURNS INT64
AS (
  (SELECT value.int_value FROM UNNEST(params) WHERE key = target_key)
);

-- Use in query
SELECT
  event_date,
  GetParamString(event_params, 'page_location') as page_location,
  GetParamInt(event_params, 'engagement_time_msec') as engagement_time
FROM
  `project.dataset.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'

Query Optimization

Best Practices:

  1. Use _TABLE_SUFFIX Filtering:
WHERE _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'

NOT:

WHERE event_date BETWEEN '20250101' AND '20250131'
  1. Filter on Clustered Columns: GA4 tables clustered by event_name and event_timestamp:
WHERE event_name IN ('page_view', 'purchase')
  1. Select Specific Columns:
SELECT event_name, user_pseudo_id, event_timestamp

NOT:

SELECT *
  1. Limit UNNEST Operations:
-- Good: inline UNNEST
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location')

-- Avoid: full UNNEST in FROM
FROM table, UNNEST(event_params) as param
WHERE param.key = 'page_location'
  1. Use LIMIT During Development:
LIMIT 1000  -- Test query first

Cost Management

BigQuery Pricing:

  • Storage: ~$0.02/GB/month
  • Queries: ~$5/TB scanned
  • Streaming inserts: ~$0.05/GB (360 only)

Reducing Costs:

  • Partition by date using _TABLE_SUFFIX
  • Select only needed columns
  • Use LIMIT for testing
  • Create materialized views for frequent queries
  • Set up cost alerts in Google Cloud

Free Tier:

  • 10 GB storage free/month
  • 1 TB queries free/month

Data Retention

GA4 Export Retention:

  • Standard GA4: 2 months or 14 months (Admin setting)
  • BigQuery: Unlimited (until manually deleted)
  • Set table expiration if needed (optional)

Setting Expiration:

ALTER TABLE `project.dataset.events_20250101`
SET OPTIONS (
  expiration_timestamp=TIMESTAMP "2026-01-01 00:00:00 UTC"
)

Common Use Cases

1. Unsampled Reporting:

  • GA4 UI may sample large datasets
  • BigQuery = full, unsampled data
  • Use for accurate reporting

2. Custom Attribution:

  • Access full user journey
  • Build custom attribution models
  • Credit touchpoints as needed

3. Data Integration:

  • Join GA4 with CRM data
  • Combine with product catalog
  • Enrich with external sources

4. Machine Learning:

  • Export to ML tools
  • Predict churn, LTV, conversions
  • Train custom models

5. Long-term Analysis:

  • Historical analysis beyond GA4 limits
  • Year-over-year comparisons
  • Trend analysis

Integration with Other Skills

  • ga4-setup - Initial property setup before BigQuery export
  • ga4-recommended-events - Event structure in BigQuery tables
  • ga4-custom-events - Custom event parameters in BigQuery
  • ga4-custom-dimensions - Custom dimensions in event_params
  • ga4-reporting - Comparing BigQuery vs GA4 UI reports
  • ga4-measurement-protocol - Server-side events in BigQuery

References

  • references/bigquery-setup-complete.md - Step-by-step BigQuery linking
  • references/schema-reference.md - Complete table schema documentation
  • references/sql-patterns.md - Common SQL query patterns and examples
  • references/optimization-guide.md - Performance and cost optimization

Quick Reference

Table Names:

  • Daily: events_YYYYMMDD
  • Intraday: events_intraday_YYYYMMDD
  • Wildcard: events_*

Filter by Date:

WHERE _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'

Extract Parameter:

(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'param_name')

UNNEST Items:

FROM table, UNNEST(items) as item

Costs:

  • Storage: $0.02/GB/month
  • Queries: $5/TB scanned