---
name: working-with-intervals
description: Work with Interval datasets (time-bounded data) using OPAL. Use when analyzing data with start and end timestamps like distributed traces, batch jobs, or CI/CD pipeline runs. Covers duration calculations, temporal filtering, and aggregating by time properties. Intervals are immutable completed activities with two timestamps, distinct from Events (single timestamp) and Resources (mutable state).
---
# Working with Intervals
Interval datasets represent time-bounded, immutable activities with start and end timestamps. This skill teaches the core patterns for working with any interval data in OPAL.
## When to Use This Skill
- Working with data that has both start_time and end_time fields
- Calculating and analyzing durations (how long things took)
- Filtering by temporal properties (when something started, ended, or its duration)
- Understanding the difference between Intervals, Events, and Resources
- Querying any time-bounded activity data
## Prerequisites

- Access to an Observe tenant via MCP
- Understanding that Intervals have TWO timestamps, usually `start_time` and `end_time`
- A dataset with the Interval type (check via `discover_context()`)
## Key Concepts

### What Are Interval Datasets?
Interval datasets represent completed activities with defined start and end points:
Temporal Structure: Two timestamps

- `start_time`, `ValidFrom`, `eventStart`, etc. - when the activity began
- `end_time`, `ValidTo`, `eventEnd`, etc. - when the activity ended

Datasets that use alternate names, or that lack a precomputed duration, can be normalized as sketched below.
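A minimal sketch of that normalization, assuming hypothetical `eventStart`/`eventEnd` timestamp columns and that subtracting two timestamps in OPAL yields a duration value:

```
// eventStart/eventEnd are hypothetical names - substitute your dataset's columns
make_col start_time:eventStart, end_time:eventEnd
// assumption: subtracting two timestamps yields a duration
| make_col dur:end_time - start_time
| make_col dur_ms:dur / 1ms
```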
Mutability: Immutable once ended
- Once an interval ends, it never changes
- Represents a completed activity
Common Examples:
- Distributed trace spans (HTTP requests, database queries, RPC calls)
- Batch job runs (ETL jobs, data processing tasks)
- CI/CD pipeline executions (build steps, deployment stages)
- Process lifetimes (container start to stop)
### Dataset Type Comparison
| Type | Timestamps | Mutability | Example |
|---|---|---|---|
| Event | Single (`timestamp`) | Immutable | Log entry, audit event |
| Interval | Two (`start_time`, `end_time`) | Immutable | Span, batch job, session |
| Resource | Two (`Valid From`, `Valid To`) | Mutable | K8s pod, service state |
Critical distinction:
- Intervals = completed activities that happened (immutable)
- Resources = evolving entity state (mutable, receives updates)
### Duration in OPAL
OPAL stores durations as a duration type (internally nanoseconds).
Duration unit conversions:

```
duration / 1ms    // Convert to milliseconds
duration / 1s     // Convert to seconds
duration / 1m     // Convert to minutes
duration / 1h     // Convert to hours
```
No manual math needed - OPAL handles the conversion.
## Discovery Workflow
Step 1: Find interval datasets

```
discover_context("span trace")    # For distributed traces
discover_context("batch job")     # For batch processing
discover_context("pipeline run")  # For CI/CD data
```

Step 2: Get detailed schema

```
discover_context(dataset_id="YOUR_DATASET_ID")
```
Step 3: Verify interval structure

Look for:
- `start_time` field
- `end_time` field
- `duration` field (usually present)
- Descriptive fields for grouping

The spot-check below is a quick way to confirm this shape.
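A sketch of that spot-check, assuming the three fields above exist (`pick_col` keeps only the named columns):

```
// eyeball a handful of rows before aggregating
pick_col start_time, end_time, duration
| limit 5
```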
## Basic Patterns

### Pattern 1: Calculate Duration Statistics

Get overall duration statistics:

```
make_col dur_ms:duration / 1ms
| statsby count:count(),
    avg_dur:avg(dur_ms),
    min_dur:min(dur_ms),
    max_dur:max(dur_ms)
```
Output: Single row with duration statistics in milliseconds.
### Pattern 2: Duration Percentiles

Understand duration distribution:

```
make_col dur_sec:duration / 1s
| statsby count:count(),
    p50:percentile(dur_sec, 0.50),
    p95:percentile(dur_sec, 0.95),
    p99:percentile(dur_sec, 0.99)
```
Output: Percentiles show distribution better than averages.
Why percentiles: Less affected by outliers, align with SLO definitions.
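To see how strongly outliers pull the mean, you can compute the average and p95 side by side; a large gap signals a long tail worth investigating:

```
make_col dur_sec:duration / 1s
| statsby avg_sec:avg(dur_sec),
    p95_sec:percentile(dur_sec, 0.95)
```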
### Pattern 3: Filter by Duration

Find long-running activities:

```
make_col dur_min:duration / 1m
| filter dur_min > 5
| statsby long_running:count()
```
Output: Count of intervals exceeding 5 minutes.
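Since `1m` is a duration literal, the conversion column can often be skipped by comparing `duration` against a duration literal directly; a sketch, assuming duration-to-duration comparison is supported:

```
// 5m is a duration literal, so no unit-conversion column is needed
filter duration > 5m
| statsby long_running:count()
```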
### Pattern 4: Group by Duration Range

Categorize intervals by duration:

```
make_col dur_ms:duration / 1ms
| make_col category:if(dur_ms < 100, "fast",
    if(dur_ms < 1000, "medium", "slow"))
| statsby count:count(), group_by(category)
```
Output: Count in each duration category.
Note: Must use separate make_col statements - can't reference newly created column in same statement.
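Nested `if()` calls get unwieldy beyond two or three tiers. If your OPAL version provides a `case()` function taking condition/value pairs, the same bucketing reads flatter; a sketch under that assumption:

```
make_col dur_ms:duration / 1ms
// case() pairs conditions with values; a final true acts as the default branch
| make_col category:case(dur_ms < 100, "fast",
    dur_ms < 1000, "medium",
    true, "slow")
| statsby count:count(), group_by(category)
```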
### Pattern 5: Aggregate by Grouping Field

Compare durations across categories:

```
make_col dur_sec:duration / 1s
| statsby count:count(),
    avg:avg(dur_sec),
    p95:percentile(dur_sec, 0.95),
    group_by(service_name)
| sort desc(p95)
| limit 10
```

Output: Duration metrics per service, sorted by p95.

Portable note: Replace `service_name` with your dataset's grouping field.
## Common Use Cases

### Finding Slowest Individual Intervals

```
make_col dur_sec:duration / 1s
| sort desc(dur_sec)
| limit 20
```
Use case: Identify specific slow instances for investigation.
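To make the output easier to scan, you can project only identifying columns alongside the duration. A sketch, where `service_name` and `trace_id` are stand-ins for whatever identifiers your dataset carries:

```
make_col dur_sec:duration / 1s
| sort desc(dur_sec)
| limit 20
// service_name and trace_id are hypothetical - use your dataset's identifiers
| pick_col start_time, end_time, dur_sec, service_name, trace_id
```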
### Duration Distribution Analysis

```
make_col dur_ms:duration / 1ms
| statsby count:count(),
    p50:percentile(dur_ms, 0.50),
    p75:percentile(dur_ms, 0.75),
    p90:percentile(dur_ms, 0.90),
    p95:percentile(dur_ms, 0.95),
    p99:percentile(dur_ms, 0.99)
```
Use case: Understand full duration distribution for SLO analysis.
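For a rough histogram rather than point percentiles, you can bucket durations into fixed-width bins with `floor()`; a sketch using 100 ms bins:

```
make_col dur_ms:duration / 1ms
| make_col bucket_ms:floor(dur_ms / 100) * 100   // 100 ms wide bins
| statsby count:count(), group_by(bucket_ms)
| sort asc(bucket_ms)
```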
### Filtering by Temporal Window

Find intervals that started in a specific time range by setting the time range on the MCP tool call; the OPAL pipeline itself stays the same:

```
make_col dur_min:duration / 1m
| statsby count:count(), avg_dur:avg(dur_min)
```
Use case: Analyze intervals from specific time period (incident investigation).
### Combining Duration and Other Filters

```
make_col dur_sec:duration / 1s
| filter dur_sec > 10
| filter environment = "production"
| statsby count:count(), group_by(job_type)
```
Use case: Complex filtering by duration AND attributes.
## Complete Example

Scenario: You have CI/CD pipeline run data and want to identify slow build stages.

Step 1: Discover dataset

```
discover_context("pipeline build")
```

Found dataset ID: `12345678` with fields:

- `start_time`, `end_time`, `duration`
- `stage_name` (build, test, deploy)
- `pipeline_id`
Step 2: Analyze stage durations

```
make_col dur_min:duration / 1m
| statsby run_count:count(),
    avg_min:avg(dur_min),
    p95_min:percentile(dur_min, 0.95),
    max_min:max(dur_min),
    group_by(stage_name)
| sort desc(p95_min)
```
Step 3: Interpret results
| stage_name | run_count | avg_min | p95_min | max_min |
|---|---|---|---|---|
| build | 150 | 3.2 | 5.8 | 12.3 |
| test | 150 | 8.5 | 15.2 | 22.1 |
| deploy | 150 | 2.1 | 3.5 | 6.8 |
Insight: Test stage has highest p95 (15.2 min) and should be optimized.
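A natural follow-up is to drill into the worst stage and pull its slowest individual runs for investigation. A sketch against the same hypothetical fields from Step 1:

```
filter stage_name = "test"
| make_col dur_min:duration / 1m
| sort desc(dur_min)
| limit 10
```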
## Common Pitfalls

### Pitfall 1: Referencing Column in Same `make_col`

❌ Wrong:

```
make_col dur_ms:duration / 1ms,
    is_slow:if(dur_ms > 1000, 1, 0)
```

✅ Correct:

```
make_col dur_ms:duration / 1ms
| make_col is_slow:if(dur_ms > 1000, 1, 0)
```
Why: OPAL processes columns in order - can't reference column being created in same statement.
### Pitfall 2: Using Averages Instead of Percentiles

❌ Wrong (for duration analysis):

```
statsby avg_dur:avg(duration / 1s)
```

✅ Correct:

```
make_col dur_sec:duration / 1s
| statsby p50:percentile(dur_sec, 0.50),
    p95:percentile(dur_sec, 0.95),
    p99:percentile(dur_sec, 0.99)
```
Why: Averages are skewed by outliers. Percentiles show true distribution.
### Pitfall 3: Confusing Intervals with Events

❌ Wrong assumption: Intervals have a single timestamp field

✅ Correct understanding: Intervals always have `start_time` AND `end_time` (two timestamps); Events have a single `timestamp`
Why: Different dataset types have different temporal structures.
### Pitfall 4: Confusing Intervals with Resources

❌ Wrong assumption: Intervals track state changes over time

✅ Correct understanding: Intervals = immutable completed activities; Resources = mutable state tracking
Why: Once an interval ends, it never changes. Resources receive updates and track evolving state.
## Duration Unit Reference

Common duration conversions:

```
make_col dur_ms:duration / 1ms    // Milliseconds
make_col dur_sec:duration / 1s    // Seconds
make_col dur_min:duration / 1m    // Minutes
make_col dur_hr:duration / 1h     // Hours
```

You can also filter directly on a converted value:

```
make_col dur_mins:duration / 1m
| filter dur_mins > 30            // Over 30 minutes
```
## Best Practices
- Convert duration for human readability
- Use percentiles, not averages, for duration analysis
- Use separate `make_col` statements when referencing derived columns
- Choose appropriate time units (ms for fast operations, min/hr for long-running)
- Filter by duration thresholds to focus on interesting cases
- Include a count in aggregations to understand volume
- Sort by meaningful metrics (p95, p99) in addition to averages
- Use `limit` or `topk` to avoid overwhelming results (a combined example follows this list)
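Putting several of these practices together (converted units, a volume count, percentile-based sorting, a limit) yields a query like this sketch; swap `service_name` for your dataset's grouping field:

```
make_col dur_ms:duration / 1ms
| statsby count:count(),
    p50:percentile(dur_ms, 0.50),
    p95:percentile(dur_ms, 0.95),
    p99:percentile(dur_ms, 0.99),
    group_by(service_name)
| sort desc(p95)
| limit 10
```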
## Related Skills

- `filtering-event-datasets` - For single-timestamp log data (Events)
- `aggregating-event-datasets` - For summarizing event data with `statsby`
- `time-series-analysis` - For trending intervals over time with `timechart`
- `working-with-resources` - For mutable state tracking (Resources)
## Summary

Intervals represent time-bounded, immutable activities:

- Two timestamps: `start_time` and `end_time`
- Duration field: automatically calculated
- Immutable: once ended, never changes
- Core operation: convert duration using `/ 1ms`, `/ 1s`, `/ 1m`, `/ 1h`
- Analysis: use percentiles (p50, p95, p99) to understand the distribution
Key distinction: Intervals are completed activities (happened once, done), not evolving state (Resources) or point-in-time occurrences (Events).
Last Updated: November 14, 2025
Version: 2.0
Tested With: Observe OPAL (OpenTelemetry Span dataset)