Claude Code Plugins

Community-maintained marketplace

Feedback

working-with-reference-tables

@rustomax/observe-community-mcp
1
0

Work with Reference Tables (static CSV lookup data) using OPAL to enrich datasets with descriptive information. Use when you need to map IDs to human-readable names, add static metadata from CSV uploads, or perform lookups without temporal considerations. Covers both explicit and implicit lookup patterns, column name matching, and when to choose Reference Tables vs Resources vs Correlation Tags.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name working-with-reference-tables
description Work with Reference Tables (static CSV lookup data) using OPAL to enrich datasets with descriptive information. Use when you need to map IDs to human-readable names, add static metadata from CSV uploads, or perform lookups without temporal considerations. Covers both explicit and implicit lookup patterns, column name matching, and when to choose Reference Tables vs Resources vs Correlation Tags.

Working with Reference Tables

Work with Reference Tables (static lookup data) using OPAL to enrich datasets with descriptive information. Reference Tables store static mappings (max 10MB CSV) that don't track changes over time, providing an alternative to Resources when no temporal aspect is needed.

Use when you need to:

  • Map IDs to human-readable names (product IDs → product names, error codes → descriptions)
  • Enrich logs/spans with static metadata
  • Add dimension data from CSV uploads
  • Lookup values without temporal considerations

Covers reference table fundamentals, both explicit and implicit lookup patterns, and when to choose Reference Tables vs Resources vs Correlation Tags.

Key Concepts

What Are Reference Tables?

Reference Tables are static lookup datasets created from CSV uploads:

  • No timestamps - Static data
  • No change tracking - Subsequent uploads of the same table overwrite previous state
  • Max 10MB CSV file size
  • Uploaded manually via Observe UI or API
  • Primary key column for joining
  • Value columns with descriptive data

Example: Product reference table

app_product_id,app_product_name
"OLJCESPC7Z","National Park Foundation Explorascope"
"L9ECAV7KIM","Lens Cleaning Kit"
"6E92ZMYYFZ","Solar Filter"

When to Use Reference Tables

  • Data is static (doesn't change over time, or if it does change, the state tracking is not required)
  • Need simple ID-to-name mappings
  • No temporal aspect required
  • Dataset size under 10MB

Lookup Patterns

Reference Tables support two lookup approaches: explicit join (recommended) and implicit join (requires column name matching).

Pattern A: Explicit Lookup (Recommended)

Most flexible - specify join condition directly without column name matching:

# Join Product Logs with Product reference table using alias
lookup @product.app_product_id=product_id, pid:product_id, product_name:@product.app_product_name
| statsby count(), group_by(product_name)
| topk 10, max(product_name)

How it works:

  1. Use alias for reference table: @product
  2. Specify join condition: @product.app_product_id=product_id
  3. Select columns to retrieve: pname:@product.app_product_name
  4. No need to match column names!

MCP Parameters:

{
  "primary_dataset_id": "42782295",
  "secondary_dataset_ids": ["42782294"],
  "dataset_aliases": {"product": "42782294"}
}

Key advantages:

  • Column names don't need to match
  • Clear and explicit join condition
  • Full control over retrieved columns
  • No extra make_col needed

Pattern B: Implicit Lookup (Column Name Matching)

Requires exact column name matching - simpler syntax but less flexible:

make_col app_product_id:product_id
| lookup @product_ref
| make_col pid:product_id, pname:app_product_name
| limit 10

How it works:

  1. Reference table has primary key app_product_id
  2. Source dataset creates matching column: make_col app_product_id:product_id
  3. lookup automatically joins on matching column name
  4. All reference table columns added to result

MCP Parameters:

{
  "primary_dataset_id": "42782295",
  "secondary_dataset_ids": ["42782294"],
  "dataset_aliases": {"product_ref": "42782294"}
}

Why implicit requires matching:

❌ WRONG - Mismatched names fail:

lookup @product_ref  # Source has 'product_id', reference has 'app_product_id'

Error: "implicit lookup requires all primary key columns in the other dataset to match columns in the source dataset; missing columns from the source dataset: app_product_id"

✅ CORRECT - Create matching column first:

make_col app_product_id:product_id
| lookup @product_ref

When to use implicit:

  • Simple joins where column names already match
  • When you want automatic inclusion of all reference columns
  • Legacy queries or established patterns

When to use explicit (Pattern A):

  • Column names don't match (most common!)
  • Want control over which columns to retrieve
  • Clearer, more maintainable queries

Pattern C: Using on() with Column Bindings

Full control over join conditions and column selection:

# Join with explicit on() syntax
lookup on(product_id=@product.app_product_id), product_name:@product.app_product_name
| statsby count(), group_by(product_name)

Use case: Complex join conditions or when you need precise control over the join and column bindings

Lookup Behavior

  • Join type: Left outer join (keeps all rows)
  • No match: Reference table columns are NULL
  • Multiple matches: Returns all matching rows (Cartesian product)
  • Performance: Fast for small reference tables (<10MB)

Common Patterns

Pattern: Enrich with Descriptive Names (Explicit Join)

# Using explicit lookup - no column name matching needed!
lookup @product.app_product_id=product_id, pname:@product.app_product_name
| filter not is_null(pname)
| make_col pid:product_id, name:pname, service:container
| limit 20

Use case: Add human-readable product names to logs

Result: Logs with "National Park Foundation Explorascope" instead of "OLJCESPC7Z"

Alternative (implicit join):

make_col app_product_id:product_id
| lookup @product_ref
| filter not is_null(app_product_name)
| make_col pid:product_id, name:app_product_name, service:container
| limit 20

Pattern: Aggregate with Reference Data (Explicit Join)

# Using explicit lookup - cleaner and more maintainable
lookup @product.app_product_id=product_id, pname:@product.app_product_name
| statsby log_count:count(), group_by(pname)
| sort desc(log_count)

Use case: Count events by descriptive name

Result:

pname,log_count
"National Park Foundation Explorascope",864

Alternative (implicit join):

make_col app_product_id:product_id
| lookup @product_ref
| statsby log_count:count(), group_by(app_product_name)
| sort desc(log_count)

Pattern: Browse Reference Table Contents

make_col id:app_product_id, name:app_product_name
| limit 50

Dataset: Query reference table directly (use reference table as primary_dataset_id)

Use case: See available lookup values

Result: Complete list of products in reference table

Pattern: Handle Missing Lookups (Explicit Join)

# Using explicit lookup
lookup @product.app_product_id=product_id, pname:@product.app_product_name
| make_col pid:product_id,
          name:if(is_null(pname), "Unknown Product", pname)
| limit 10

Use case: Provide default value when reference lookup fails

Behavior: Shows "Unknown Product" when pname is NULL

Pattern: Filter to Matched Rows Only (Explicit Join)

# Using explicit lookup
lookup @product.app_product_id=product_id, pname:@product.app_product_name
| filter not is_null(pname)
| make_col pid:product_id, name:pname
| limit 20

Use case: Exclude rows without reference table matches

Behavior: Only returns rows with successful lookups

Troubleshooting

Issue: "Missing columns from source dataset"

Error: "implicit lookup requires all primary key columns in the other dataset to match columns in the source dataset; missing columns from the source dataset: app_product_id"

Cause: Source dataset doesn't have column matching reference table's primary key

Solution: Create matching column with make_col:

make_col app_product_id:product_id
| lookup @product_ref

Key insight: Column names must match exactly (case-sensitive!)

Issue: All reference columns are NULL

Cause: No matches found (lookup is left outer join)

Diagnosis: Check if join values actually exist in reference table:

filter app_product_id = "OLJCESPC7Z"
| limit 1

(Query reference table directly to verify value exists)

Solutions:

  1. Verify product_id values in source match app_product_id in reference
  2. Check for typos or case sensitivity issues
  3. Ensure reference table uploaded correctly

Issue: "Implicit lookup does not support additional arguments"

Error: "implicit lookup does not support additional arguments like explicit join predicates or column bindings"

Cause: Mixing implicit and explicit syntax:

lookup @product_ref on product_id = @product_ref.app_product_id  ❌

Solution Option 1: Use explicit lookup (recommended):

lookup @product.app_product_id=product_id, pname:@product.app_product_name  ✅

Solution Option 2: Use implicit lookup with column name matching:

make_col app_product_id:product_id
| lookup @product_ref  ✅

Key Takeaways

  1. Reference Tables are for static CSV data (no timestamps, no change tracking)
  2. Three lookup patterns available:
    • Explicit (recommended): lookup @product.app_product_id=product_id, pname:@product.app_product_name
    • Implicit: Requires exact column name matching via make_col
    • on() syntax: Full control with lookup on(condition), bindings
  3. Explicit lookup advantages: No column name matching needed, clearer syntax, full control over retrieved columns
  4. Left outer join behavior - keeps all rows, NULL when no match
  5. Use for static enrichment - alternative to Resources when no temporal aspect needed
  6. 10MB size limit per reference table
  7. Fast and efficient for small lookup datasets
  8. Column matching is case-sensitive when using implicit lookup
  9. Use is_null() to check for failed lookups - provides default values or filters unmatched rows
  10. Query reference tables directly to browse available lookup values

When to Choose Reference Tables vs Resources

Scenario Use Reference Tables Use Resources
Static data that doesn't change
Data changes over time and you need state history
Simple ID-to-name mappings
Need temporal joins with Events/Intervals
CSV upload (max 10MB)
Track mutable state evolution
Fast lookups without timestamps

References

  • Reference Tables created via CSV upload in Observe UI
  • Use lookup verb with @ alias for joining
  • Explicit lookup: @alias.ref_column=source_column, result:@alias.value_column
  • Implicit lookup: Requires matching column names (automatic join)
  • on() syntax: lookup on(condition), column_bindings
  • Maximum 10MB CSV size per table
  • Interface type typically shows as "unknown" in discovery