name	window-functions-deep-dive
description	Master OPAL window functions for row-relative calculations, rankings, and moving aggregates. Covers lag(), lead(), row_number(), rank(), dense_rank(), moving averages, first(), and last(). Use when comparing rows to neighbors, ranking within partitions, calculating rate of change, or computing time-based moving windows. CRITICAL - OPAL uses window() function wrapper, NOT SQL OVER clause.

Window Functions Deep Dive

Overview

Window functions perform calculations across sets of rows related to the current row, without collapsing rows like aggregate does. They enable:

Row comparisons: Compare current row to previous/next rows (lag/lead)
Rankings: Assign ranks within partitions (row_number/rank/dense_rank)
Moving calculations: Rolling averages, sums over time windows
Boundary values: First/last values in windows or partitions

CRITICAL: OPAL vs SQL Syntax

❌ WRONG - SQL OVER Syntax (Does NOT Work):

SUM(val) OVER (PARTITION BY subject ORDER BY time ROWS UNBOUNDED PRECEDING)
# Error: Unknown function 'over()'

✅ CORRECT - OPAL window() Function:

window(sum(val), group_by(subject), order_by(time))

Key Difference: OPAL uses window() function wrapper, NOT SQL OVER clause. The Observe documentation sometimes shows SQL syntax for reference, but you MUST use the window() syntax in actual OPAL queries.

Window Function Categories

1. Offset Functions (lag/lead)

Access values from previous or next rows relative to current row.

Functions:

lag(column, offset) - Get value from N rows back
lead(column, offset) - Get value from N rows ahead

Syntax:

window(lag(column, N), group_by(partition_columns))
window(lead(column, N), group_by(partition_columns))

Key Points:

Offset defaults to 1 if not specified
Returns null when offset goes beyond partition boundary
MUST specify group_by() for partitioning (even if single partition)
Optional order_by() for explicit ordering (default: unspecified)

2. Ranking Functions

Assign ranks or row numbers within partitions.

Functions:

row_number() - Sequential unique numbers (1, 2, 3, 4, ...)
rank() - Ranks with gaps after ties (1, 2, 2, 4, ...)
dense_rank() - Ranks without gaps (1, 2, 2, 3, ...)

Syntax:

window(row_number(), group_by(partition), order_by(sort_col))
window(rank(), group_by(partition), order_by(sort_col))
window(dense_rank(), group_by(partition), order_by(sort_col))

Key Differences:

Value	row_number()	rank()	dense_rank()
59	1	1	1
53	2	2	2
43	3	3	3
3	4	4	4
3	5	4	4
2	6	6	5

When to Use:

row_number(): Need unique IDs, pagination, sampling
rank(): Olympic-style ranking (ties share rank, gaps for fairness)
dense_rank(): Category ranking (no gaps, continuous numbering)

3. Aggregate Functions with Time Windows

Calculate rolling aggregates over time-based sliding windows.

Functions:

avg(column) - Moving average
sum(column) - Moving sum
min(column) - Moving minimum
max(column) - Moving maximum
count(column) - Moving count

Syntax:

window(aggregate_func(column), group_by(partition), order_by(time_col), frame(back:duration))

Frame Durations:

Minutes: 5m, 15m, 30m
Hours: 1h, 6h, 12h, 24h
Days: 1d, 7d, 30d

IMPORTANT: OPAL only supports time-based frames (back:duration), NOT row-based frames like SQL's ROWS N PRECEDING.

4. Value Functions (first/last)

Retrieve first or last value within partition or window.

Functions:

first(column) - First value (by order_by)
last(column) - Last value (by order_by)

Syntax:

# Entire partition
window(first(column), group_by(partition), order_by(sort_col))
window(last(column), group_by(partition), order_by(sort_col))

# Sliding window
window(first(column), group_by(partition), order_by(sort_col), frame(back:duration))
window(last(column), group_by(partition), order_by(sort_col), frame(back:duration))

Pattern 1: Rate of Change Detection (lag/lead)

Use Case: Detect sudden spikes or drops in metrics.

align 5m, rate:sum(m("span_call_count_5m"))
| make_col previous_rate:window(lag(rate, 1), group_by(service_name))
| make_col next_rate:window(lead(rate, 1), group_by(service_name))
| make_col rate_change:rate - previous_rate
| make_col pct_change:if(previous_rate > 0, (rate_change / previous_rate) * 100, 0)
| make_col is_spike:if(pct_change > 100 or pct_change < -50, true, false)
| filter is_spike = true
| topk 20, max(pct_change)

Result:

service_name  previous_rate  rate  pct_change  is_spike
frontend      2              50    2400%       true      # 24x increase!
frontend      50             2     -96%        true      # 96% drop
cartservice   10             80    700%        true      # 8x increase

How It Works:

lag(rate, 1) gets previous time bucket's rate
lead(rate, 1) gets next time bucket's rate (for forward detection)
Calculate percentage change: (current - previous) / previous * 100
Flag spikes: >100% increase or >50% decrease
topk sorts by largest changes

When to Use:

Detect traffic spikes or drops
Identify service anomalies
Alert on sudden behavior changes
Compare bucket-to-bucket changes

Pattern 2: Top-N Per Time Bucket (Ranking)

Use Case: Find top 5 busiest services in each 5-minute window.

align 5m, rate:sum(m("span_call_count_5m"))
| aggregate total:sum(rate), group_by(service_name, _c_bucket)
| make_col rank:window(rank(), group_by(_c_bucket), order_by(desc(total)))
| filter rank <= 5
| sort asc(_c_bucket), asc(rank)

Result:

_c_bucket  service_name             total  rank
5877542    frontend-proxy           59     1
5877542    frontend                 53     2
5877542    featureflagservice       43     3
5877542    productcatalogservice    31     4
5877542    cartservice              8      5
5877543    frontend                 61     1
5877543    frontend-proxy           55     2
...

How It Works:

Aggregate to get total per service per bucket
rank() within each _c_bucket partition
order_by(desc(total)) ranks highest first
Filter to top 5
Sort for readable output

When to Use:

Top-N queries within time windows
Identify busiest/slowest services per period
Compare rankings across time
Dashboard "top services" widgets

Variation - Use dense_rank() for No Gaps:

make_col dense_rank:window(dense_rank(), group_by(_c_bucket), order_by(desc(total)))

Use dense_rank() when ties should count as one rank (no gaps).

Pattern 3: Moving Average Baseline

Use Case: Detect anomalies by comparing current value to 30-minute moving average.

align 5m, rate:sum(m("span_call_count_5m"))
| make_col baseline:window(avg(rate), group_by(service_name), order_by(asc(valid_from)), frame(back:30m))
| make_col deviation:rate - baseline
| make_col pct_deviation:if(baseline > 0, (deviation / baseline) * 100, 0)
| make_col is_anomaly:if(pct_deviation > 50 or pct_deviation < -50, true, false)
| filter is_anomaly = true
| topk 20, max(pct_deviation)

Result:

service_name        rate  baseline  deviation  pct_deviation  is_anomaly
featureflagservice  150   30        120        400%           true
frontend            2     26        -24        -92%           true
cartservice         80    20        60         300%           true

How It Works:

frame(back:30m) creates 30-minute sliding window
avg(rate) calculates average within that window
Compare current rate to baseline
Flag deviations >50% above or below baseline

When to Use:

Anomaly detection with dynamic baseline
Smooth out noise in metrics
Detect sustained changes vs temporary blips
Performance monitoring (latency vs moving average)

Frame Duration Guidance:

back:15m - Very responsive, more false positives
back:30m - Balanced (recommended for most use cases)
back:1h - Stable baseline, less sensitive
back:24h - Day-over-day trending

Pattern 4: Detect Return to Baseline

Use Case: Alert when error rate exceeds initial rate by 2x for more than 1 hour.

align 5m, error_rate:sum(m("span_error_count_5m"))
| make_col first_rate:window(first(error_rate), group_by(service_name), order_by(asc(valid_from)), frame(back:1h))
| make_col current_elevated:if(error_rate > first_rate * 2, true, false)
| filter current_elevated = true
| statsby incidents:count(), avg_elevation:avg(error_rate / first_rate), group_by(service_name)
| sort desc(incidents)

Result:

service_name     incidents  avg_elevation
checkoutservice  15         3.2x          # Error rate 3.2x higher than 1h ago
paymentservice   8          2.8x
fraudservice     3          2.1x

How It Works:

first(error_rate) with frame(back:1h) gets rate from 1 hour ago
Compare current to first: error_rate > first_rate * 2
Aggregate to count incidents and average elevation

When to Use:

Detect if metric returned to baseline
Track sustained elevations
SLO violations (errors above initial threshold)
Performance degradation alerts

Pattern 5: Smooth Noisy Metrics

Use Case: Smooth CPU usage with 10-minute moving average for charting.

align 1m, cpu:avg(m("cpu_usage_percent"))
| make_col smooth_cpu:window(avg(cpu), group_by(host), order_by(asc(valid_from)), frame(back:10m))
| pick_col valid_from, valid_to, host, cpu, smooth_cpu

Result:

valid_from     host       cpu    smooth_cpu
1763259000...  web-01     85.2   78.5       # Raw vs smoothed
1763259060...  web-01     92.1   79.2       # Spike smoothed out
1763259120...  web-01     76.4   79.8       # Averages over 10m

How It Works:

Align to 1-minute buckets (high resolution)
Calculate 10-minute moving average
Use smooth_cpu for charting, cpu for alerts

When to Use:

Smooth spiky metrics for visualization
Reduce noise in dashboards
Calculate trends without losing detail
Separate signal from noise

Pattern 6: Consecutive Threshold Violations

Use Case: Alert only if latency exceeds SLO for 3 consecutive buckets.

align 5m, p95:percentile(duration_ms, 0.95)
| make_col prev1:window(lag(p95, 1), group_by(service_name))
| make_col prev2:window(lag(p95, 2), group_by(service_name))
| make_col all_high:if(p95 > 500 and prev1 > 500 and prev2 > 500, true, false)
| filter all_high = true
| statsby first_violation:min(valid_from), duration_mins:count() * 5, group_by(service_name)

Result:

service_name     first_violation       duration_mins
checkoutservice  2025-11-15 10:30:00   45            # Violated for 45 min
paymentservice   2025-11-15 11:15:00   20            # Violated for 20 min

How It Works:

lag(p95, 1) and lag(p95, 2) get two previous buckets
Check if current AND both previous exceed 500ms
Aggregate to find first violation and total duration

When to Use:

Reduce alert fatigue from transient spikes
Require sustained violations before alerting
SLO monitoring with grace period
Stability checks (N consecutive failures)

Common Window Function Patterns

Quick Reference

Pattern	Window Function	Use Case
Bucket-to-bucket change	`lag(col, 1)`	Spike detection, rate of change
Forward prediction	`lead(col, 1)`	Pre-emptive alerts
Top-N per period	`rank()`, `dense_rank()`	Rankings, leaderboards
Unique row IDs	`row_number()`	Sampling, pagination
Moving average	`avg()` + `frame(back:30m)`	Smoothing, baseline
Rolling sum	`sum()` + `frame(back:1h)`	Hourly totals
Start of period	`first()` + `frame(back:24h)`	Day-over-day comparison
End of period	`last()` + `frame(back:24h)`	Final value in window

window() Syntax Reference

Complete Syntax

window(
  expression,                     # Function to apply
  group_by(partition_columns),    # Partition into groups
  order_by(sort_expression),      # Order within partitions
  frame(back:duration)            # Optional: time-based window
)

Parameters

expression (required):

Offset: lag(col, N), lead(col, N)
Ranking: row_number(), rank(), dense_rank()
Aggregate: avg(col), sum(col), min(col), max(col), count(col)
Value: first(col), last(col)

group_by() (required for most functions):

Partitions data into groups
Window function operates independently per partition
Example: group_by(service_name) - separate windows per service

order_by() (required for ranking, optional for others):

Defines row order within partitions
Example: order_by(asc(valid_from)) - chronological order
Example: order_by(desc(total)) - highest first

frame() (optional, only for aggregates and value functions):

Time-based sliding window: frame(back:duration)
Examples: frame(back:5m), frame(back:1h), frame(back:24h)
NOT supported: Row-based frames (ROWS N PRECEDING in SQL)

Limitations and Workarounds

❌ Limitation 1: No Row-Based Frames

Problem: Cannot specify "last 3 rows" or "next 5 rows".

SQL Syntax (Does NOT Work):

ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
ROWS 5 FOLLOWING

OPAL Reality:

Only time-based frames: frame(back:duration)
No row-count-based frames

Workaround:

Use fixed time window: frame(back:15m) includes ~3 rows at 5m align
Use multiple lag() calls: lag(col, 1), lag(col, 2), lag(col, 3)

❌ Limitation 2: No Cumulative Sums

Problem: Cannot create running totals using window functions.

SQL Syntax (Does NOT Work):

SUM(val) OVER (ORDER BY time ROWS UNBOUNDED PRECEDING)

OPAL Behavior:

window(sum(val), order_by(time))  # Returns total, NOT cumulative

Why: Without frame, sums entire partition. With frame(back:duration), sums time window.

Workaround:

Use subqueries for cumulative calculations
Process outside OPAL (in application layer)
Request feature from Observe team

Example - Partial Workaround:

# NOT cumulative, but moving 1-hour sum
window(sum(val), group_by(service), order_by(time), frame(back:1h))

❌ Limitation 3: No OVER Clause

Problem: SQL OVER clause syntax fails in OPAL.

SQL Syntax (Does NOT Work):

SUM(val) OVER (PARTITION BY service ORDER BY time)
LAG(val, 1) OVER (PARTITION BY service)

Error: Unknown function 'over()'

Fix: Always use window() function wrapper.

Correct OPAL:

window(sum(val), group_by(service), order_by(time))
window(lag(val, 1), group_by(service))

Decision Tree: Which Window Function?

┌─────────────────────────────────────┐
│ What do you need to calculate?     │
└─────────────────────────────────────┘
           │
           ├─ Compare to previous/next row?
           │  └─> Use: lag() or lead()
           │      Examples:
           │      - Spike detection (current vs previous)
           │      - Rate of change (delta between buckets)
           │      - Consecutive violations (check last N via multiple lags)
           │
           ├─ Assign rankings or row numbers?
           │  └─> Use: row_number(), rank(), or dense_rank()
           │      Choose based on tie handling:
           │      - Unique IDs → row_number()
           │      - Olympic ranking (gaps after ties) → rank()
           │      - Continuous numbering (no gaps) → dense_rank()
           │
           ├─ Calculate over time window?
           │  └─> Use: avg/sum/min/max + frame(back:duration)
           │      Examples:
           │      - Moving average baseline
           │      - Rolling sum (hourly totals)
           │      - Smoothing spiky metrics
           │
           └─ Get first/last value in window?
              └─> Use: first() or last()
                  - Entire partition: omit frame()
                  - Sliding window: use frame(back:duration)
                  Examples:
                  - Compare to start of day (first in 24h window)
                  - Detect if returned to baseline

Performance Considerations

Window Function Performance

Fast ✅:

lag() and lead() with single offset
row_number(), rank(), dense_rank() within small partitions
Time-based frames with reasonable durations (<1h)

Slower ⚠️:

Large partitions (millions of rows per service)
Very long time windows (back:30d over large datasets)
Multiple window functions in single query

Optimization Tips:

Filter Before Windowing:

# GOOD: Filter first, then window
filter service_name = "frontend"
| make_col prev:window(lag(rate, 1), group_by(service_name))

# SLOW: Window on all services, then filter
make_col prev:window(lag(rate, 1), group_by(service_name))
| filter service_name = "frontend"

Use Appropriate Frame Duration:

# GOOD: 30m for 5m buckets = ~6 rows
frame(back:30m)

# SLOW: 24h for 5m buckets = ~288 rows
frame(back:24h)

Limit Output Early:

# GOOD: Filter anomalies first
filter pct_change > 100
| limit 100

# SLOW: Calculate all, limit at end
make_col pct_change:...
| limit 100

Common Mistakes and Fixes

Mistake 1: Using SQL OVER Syntax

❌ WRONG:

make_col prev:lag(rate, 1) over (partition by service_name)

Error: Unknown function 'over()'

✅ FIX:

make_col prev:window(lag(rate, 1), group_by(service_name))

Mistake 2: Forgetting group_by()

❌ WRONG:

make_col prev:window(lag(rate, 1))

Error: Undefined behavior or incorrect partitioning

✅ FIX:

# Partition by service
make_col prev:window(lag(rate, 1), group_by(service_name))

# Single partition (all rows together)
make_col prev:window(lag(rate, 1), group_by())

Mistake 3: Expecting Cumulative Sum

❌ WRONG EXPECTATION:

window(sum(value), order_by(time))  # Expecting running total

Reality: Returns total sum of entire partition for ALL rows.

✅ WORKAROUND:

# Use moving sum instead (not cumulative, but windowed)
window(sum(value), group_by(service), order_by(time), frame(back:1h))

Mistake 4: Using Row-Based Frame Syntax

❌ WRONG:

frame(rows: 3 preceding)              # No rows parameter
frame(between: 3 preceding and current)  # No SQL syntax

✅ FIX:

frame(back:15m)  # Time-based only

Cross-References

Related Skills:

detecting-anomalies - Uses lag() for rate-of-change detection
aggregating-gauge-metrics - Metrics aggregation before windowing
time-series-analysis - Temporal analysis with timechart
subquery-patterns-and-union - Advanced multi-stage calculations

Common Workflows:

Spike Detection: aggregating-gauge-metrics → window-functions (lag) → detecting-anomalies
Top-N Dashboards: aggregating-gauge-metrics → window-functions (rank) → filtering
Baseline Comparison: aggregating-gauge-metrics → window-functions (frame) → anomaly filtering
Trend Analysis: time-series-analysis → window-functions (moving avg) → visualization

Examples Summary

Example 1: Rate of Change Detection

Use Case: Find services with >100% request rate increase

align 5m, rate:sum(m("span_call_count_5m"))
| make_col prev:window(lag(rate, 1), group_by(service_name))
| make_col pct_change:if(prev > 0, (rate - prev) / prev * 100, 0)
| filter pct_change > 100

Example 2: Top 10 Services Per Time Bucket

Use Case: Busiest services in each 5-minute window

align 5m, rate:sum(m("span_call_count_5m"))
| make_col rank:window(rank(), group_by(_c_bucket), order_by(desc(rate)))
| filter rank <= 10

Example 3: Anomaly Detection with Moving Baseline

Use Case: Alert when current exceeds 30m average by 50%

align 5m, rate:sum(m("span_call_count_5m"))
| make_col baseline:window(avg(rate), group_by(service_name), order_by(asc(valid_from)), frame(back:30m))
| make_col anomaly:if(rate > baseline * 1.5, true, false)
| filter anomaly = true

Example 4: Smoothed Metric for Dashboards

Use Case: 15-minute moving average for charts

align 1m, latency:percentile(duration_ms, 0.95)
| make_col smooth_p95:window(avg(latency), group_by(service_name), order_by(asc(valid_from)), frame(back:15m))
| pick_col valid_from, valid_to, service_name, smooth_p95

Example 5: Detect Sustained Violations

Use Case: Alert if SLO violated for 3+ consecutive buckets

align 5m, p95:percentile(duration_ms, 0.95)
| make_col prev1:window(lag(p95, 1), group_by(service_name))
| make_col prev2:window(lag(p95, 2), group_by(service_name))
| make_col sustained:if(p95 > 500 and prev1 > 500 and prev2 > 500, true, false)
| filter sustained = true

Quick Syntax Reference Card

Function	Basic Syntax	With Frame
lag	`window(lag(col, 1), group_by(dim))`	N/A
lead	`window(lead(col, 1), group_by(dim))`	N/A
row_number	`window(row_number(), group_by(dim), order_by(col))`	N/A
rank	`window(rank(), group_by(dim), order_by(col))`	N/A
dense_rank	`window(dense_rank(), group_by(dim), order_by(col))`	N/A
avg	`window(avg(col), group_by(dim))`	`window(avg(col), group_by(dim), order_by(time), frame(back:30m))`
sum	`window(sum(col), group_by(dim))`	`window(sum(col), group_by(dim), order_by(time), frame(back:1h))`
first	`window(first(col), group_by(dim), order_by(time))`	`window(first(col), group_by(dim), order_by(time), frame(back:24h))`
last	`window(last(col), group_by(dim), order_by(time))`	`window(last(col), group_by(dim), order_by(time), frame(back:24h))`

Remember: ALWAYS use window() wrapper. NEVER use SQL OVER clause.

Install Skill

SKILL.md

Window Functions Deep Dive

Overview

CRITICAL: OPAL vs SQL Syntax

Window Function Categories

1. Offset Functions (lag/lead)

2. Ranking Functions

3. Aggregate Functions with Time Windows

4. Value Functions (first/last)

Pattern 1: Rate of Change Detection (lag/lead)

Pattern 2: Top-N Per Time Bucket (Ranking)

Pattern 3: Moving Average Baseline

Pattern 4: Detect Return to Baseline

Pattern 5: Smooth Noisy Metrics

Pattern 6: Consecutive Threshold Violations

Common Window Function Patterns

Quick Reference

window() Syntax Reference

Complete Syntax

Parameters

Limitations and Workarounds

❌ Limitation 1: No Row-Based Frames

❌ Limitation 2: No Cumulative Sums

❌ Limitation 3: No OVER Clause

Decision Tree: Which Window Function?

Performance Considerations

Window Function Performance

Common Mistakes and Fixes

Mistake 1: Using SQL OVER Syntax

Mistake 2: Forgetting group_by()

Mistake 3: Expecting Cumulative Sum

Mistake 4: Using Row-Based Frame Syntax

Cross-References

Examples Summary

Example 1: Rate of Change Detection

Example 2: Top 10 Services Per Time Bucket

Example 3: Anomaly Detection with Moving Baseline

Example 4: Smoothed Metric for Dashboards

Example 5: Detect Sustained Violations

Quick Syntax Reference Card