| name | observability-patterns |
| description | Comprehensive observability setup patterns for Google ADK agents including logging configuration, Cloud Trace integration, BigQuery Agent Analytics, and third-party observability tools (AgentOps, Phoenix, Weave). Use when implementing monitoring, debugging agent behavior, analyzing agent performance, setting up tracing, or when user mentions observability, logging, tracing, BigQuery analytics, AgentOps, Phoenix, Arize, or Weave. |
| allowed-tools | Bash, Read, Write, Edit |
Observability Patterns Skill
This skill provides comprehensive templates and configurations for implementing observability in Google ADK agents. Includes logging, tracing, BigQuery analytics, Cloud Trace integration, and third-party observability platforms.
Overview
Google ADK supports multiple observability approaches for monitoring, debugging, and analyzing agent behavior:
- Cloud Trace - Google Cloud native tracing with OpenTelemetry
- BigQuery Agent Analytics - Comprehensive event logging and analysis
- AgentOps - Session replays and unified tracing analytics
- Phoenix (Arize) - Open-source observability with self-hosted control
- Weave (W&B) - Weights & Biases platform for tracking and visualization
This skill covers production-ready observability implementations with security and scalability.
Available Scripts
1. Setup Cloud Trace
Script: scripts/setup-cloud-trace.sh <project-id>
Purpose: Configures Cloud Trace integration for ADK agents
Parameters:
project-id- Google Cloud project ID (required)
Usage:
# Setup Cloud Trace for local development
./scripts/setup-cloud-trace.sh my-project-id
# Setup with ADK CLI deployment
adk deploy agent_engine --project=my-project-id --trace_to_cloud ./agent
Environment Variables:
GOOGLE_CLOUD_PROJECT- Project ID for Cloud TraceGOOGLE_APPLICATION_CREDENTIALS- Path to service account key
Output: Cloud Trace enabled, traces visible in console.cloud.google.com
2. Setup BigQuery Agent Analytics
Script: scripts/setup-bigquery-analytics.sh <project-id> <dataset-id> [bucket-name]
Purpose: Configures BigQuery Agent Analytics plugin for comprehensive event logging
Parameters:
project-id- Google Cloud project ID (required)dataset-id- BigQuery dataset name (required)bucket-name- GCS bucket for multimodal content (optional)
Usage:
# Setup basic BigQuery analytics
./scripts/setup-bigquery-analytics.sh my-project agent-analytics
# Setup with GCS for multimodal content
./scripts/setup-bigquery-analytics.sh my-project agent-analytics my-content-bucket
# Create dataset and table
bq mk --dataset my-project:agent-analytics
bq mk --table agent-analytics.agent_events_v2 templates/bigquery-schema.json
IAM Requirements:
roles/bigquery.jobUser- Required for BigQuery operationsroles/bigquery.dataEditor- Required for writing dataroles/storage.objectCreator- Required if using GCS offloading
Output: BigQuery table created, events streaming to dataset
3. Setup AgentOps
Script: scripts/setup-agentops.sh
Purpose: Configures AgentOps integration for session replays and metrics
Usage:
# Install AgentOps
pip install -U agentops
# Setup with API key
AGENTOPS_API_KEY=your_api_key_here ./scripts/setup-agentops.sh
# Verify setup
python -c "import agentops; agentops.init(); print('AgentOps ready')"
Environment Variables:
AGENTOPS_API_KEY- AgentOps API key from app.agentops.ai/settings/projects
Output: AgentOps initialized, sessions visible in dashboard
4. Setup Phoenix
Script: scripts/setup-phoenix.sh
Purpose: Configures Phoenix (Arize) integration for open-source observability
Usage:
# Install Phoenix packages
pip install openinference-instrumentation-google-adk arize-phoenix-otel
# Setup Phoenix with API key
PHOENIX_API_KEY=your_key_here \
PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/s/your-space \
./scripts/setup-phoenix.sh
# Verify Phoenix connection
python scripts/verify-phoenix.py
Environment Variables:
PHOENIX_API_KEY- Phoenix API key from phoenix.arize.comPHOENIX_COLLECTOR_ENDPOINT- Phoenix collector endpoint URL
Output: Phoenix tracer initialized, traces visible in Phoenix dashboard
5. Setup Weave
Script: scripts/setup-weave.sh <entity> <project>
Purpose: Configures Weave (W&B) integration for observability
Parameters:
entity- W&B entity name (visible in Teams sidebar)project- W&B project name
Usage:
# Install Weave dependencies
pip install opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
# Setup Weave with API key
WANDB_API_KEY=your_wandb_key_here ./scripts/setup-weave.sh my-team my-project
# Verify Weave connection
python scripts/verify-weave.py
Environment Variables:
WANDB_API_KEY- W&B API key from wandb.ai/authorize
Output: Weave tracer initialized, traces visible in Weave dashboard
6. Validate Observability Setup
Script: scripts/validate-observability.sh
Purpose: Validates observability configuration and connectivity
Checks:
- Cloud Trace connectivity
- BigQuery dataset and table existence
- AgentOps initialization
- Phoenix endpoint reachability
- Weave endpoint reachability
- IAM permissions
- Environment variables set
Usage:
# Validate all observability configurations
./scripts/validate-observability.sh
# Validate specific tool
./scripts/validate-observability.sh --tool=bigquery
./scripts/validate-observability.sh --tool=cloud-trace
./scripts/validate-observability.sh --tool=agentops
Exit Codes:
0- All checks passed1- Configuration missing2- Connectivity failed3- Permission issues
Available Templates
1. Cloud Trace Configuration
Template: templates/cloud-trace-config.py
Purpose: Cloud Trace integration for ADK agents
Features:
- OpenTelemetry configuration
- Automatic span creation for agent runs
- LLM and tool call tracing
- Error and latency tracking
Usage:
# Enable Cloud Trace via ADK CLI
adk deploy agent_engine --project=$GOOGLE_CLOUD_PROJECT --trace_to_cloud ./agent
# Or via Python SDK
from google.adk.app import AdkApp
app = AdkApp(
agent=my_agent,
enable_tracing=True
)
Span Labels:
invocation- Top-level agent invocationagent_run- Individual agent executioncall_llm- LLM API callsexecute_tool- Tool executions
2. BigQuery Analytics Configuration
Template: templates/bigquery-analytics-config.py
Purpose: Complete BigQuery Agent Analytics plugin configuration
Features:
- Asynchronous event logging
- Multimodal content with GCS offloading
- OpenTelemetry-style tracing (trace_id, span_id)
- Event filtering and batching
- Custom content formatting
Usage:
from google.adk.plugins.bigquery_agent_analytics_plugin import (
BigQueryAgentAnalyticsPlugin, BigQueryLoggerConfig
)
bq_config = BigQueryLoggerConfig(
enabled=True,
gcs_bucket_name="your-bucket-name",
max_content_length=500 * 1024, # 500KB inline limit
batch_size=1, # Low latency
event_allowlist=["LLM_RESPONSE", "TOOL_COMPLETED"]
)
plugin = BigQueryAgentAnalyticsPlugin(
project_id="your-project-id",
dataset_id="your-dataset-id",
config=bq_config
)
app = App(root_agent=agent, plugins=[plugin])
Configuration Options:
enabled- Toggle logging on/offgcs_bucket_name- GCS bucket for large contentmax_content_length- Inline text limit (default 500KB)batch_size- Events per write (default 1)event_allowlist- Whitelist specific event typesevent_denylist- Blacklist specific event typescontent_formatter- Custom formatting function
3. BigQuery Schema
Template: templates/bigquery-schema.json
Purpose: BigQuery table schema for agent_events_v2
Schema Fields:
timestamp- Event recording timeevent_type- Event category (LLM_REQUEST, TOOL_STARTING, etc.)content- Event-specific JSON payloadcontent_parts- Structured multimodal datatrace_id- OpenTelemetry trace IDspan_id- OpenTelemetry span IDagent- Agent nameuser_id- User identifier
Partitioning: By DATE(timestamp) for cost optimization
Clustering: By event_type, agent, user_id for query performance
4. AgentOps Configuration
Template: templates/agentops-config.py
Purpose: AgentOps integration for session replays
Features:
- Minimal two-line integration
- Hierarchical span visualization
- LLM call tracking with prompts and completions
- Token count and latency metrics
- Cost tracking
Usage:
import agentops
# Initialize AgentOps (before ADK imports)
agentops.init()
# Your ADK agent code
from google.adk.app import App
app = App(root_agent=my_agent)
Span Hierarchy:
- Agent spans: Named
adk.agent.{AgentName} - LLM spans: Capture prompts, completions, tokens
- Tool spans: Record parameters and results
5. Phoenix Configuration
Template: templates/phoenix-config.py
Purpose: Phoenix (Arize) integration for open-source observability
Features:
- Self-hosted data control
- OpenInference instrumentation
- Trace evaluation
- Performance debugging
- Custom evaluators
Usage:
import os
from phoenix.otel import register
# Set Phoenix credentials
os.environ["PHOENIX_API_KEY"] = "your_api_key_here"
os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "https://app.phoenix.arize.com/s/your-space"
# Register Phoenix tracer
tracer_provider = register(
project_name="my-adk-agent",
auto_instrument=True
)
# Your ADK agent code (Phoenix auto-captures traces)
from google.adk.app import App
app = App(root_agent=my_agent)
Auto-Instrumentation: Phoenix automatically traces all ADK operations
6. Weave Configuration
Template: templates/weave-config.py
Purpose: Weave (W&B) integration for observability
Features:
- Timeline of agent calls
- Tool invocation tracking
- Reasoning process analysis
- Span hierarchy visualization
- Dashboard integration
Usage:
import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
import base64
# Setup Weave exporter
wandb_api_key = os.environ["WANDB_API_KEY"]
entity = "your-entity"
project = "your-project"
auth_string = f"api:{wandb_api_key}"
encoded_auth = base64.b64encode(auth_string.encode()).decode()
exporter = OTLPSpanExporter(
endpoint="https://trace.wandb.ai/otel/v1/traces",
headers={
"Authorization": f"Basic {encoded_auth}",
"project_id": f"{entity}/{project}"
}
)
# Configure tracer provider (BEFORE ADK imports)
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)
# Your ADK agent code
from google.adk.app import App
app = App(root_agent=my_agent)
Critical: Set tracer provider before importing ADK components
Available Examples
1. Complete Observability Setup
Example: examples/complete-observability.md
Covers:
- Multi-tool observability setup
- Cloud Trace + BigQuery combination
- Third-party tool integration
- Production deployment patterns
- Cost optimization strategies
Step-by-Step Guide:
- Enable Cloud Trace for distributed tracing
- Configure BigQuery for event logging
- Add AgentOps for session replays
- Optional: Phoenix or Weave for additional insights
- Validate all configurations
- Deploy to production
Production Checklist:
- Cloud Trace enabled in production
- BigQuery dataset created with proper IAM
- GCS bucket configured for multimodal content
- Event filtering configured to control costs
- Alert rules defined for error rates
- Dashboard created for key metrics
- Retention policies set for cost control
2. BigQuery Analytics Queries
Example: examples/bigquery-queries.md
Covers:
- Conversation trace retrieval
- Token usage analysis
- Error rate tracking
- Tool usage statistics
- Performance metrics
- Cost analysis
Query Examples:
-- Retrieve conversation traces
SELECT timestamp, event_type, JSON_VALUE(content, '$.response')
FROM agent_events_v2
WHERE trace_id = 'your-trace-id'
ORDER BY timestamp ASC;
-- Token usage by agent
SELECT
agent,
AVG(CAST(JSON_VALUE(content, '$.usage.total') AS INT64)) as avg_tokens,
SUM(CAST(JSON_VALUE(content, '$.usage.total') AS INT64)) as total_tokens
FROM agent_events_v2
WHERE event_type = 'LLM_RESPONSE'
GROUP BY agent;
-- Error rate by event type
SELECT
event_type,
COUNT(*) as error_count,
DATE(timestamp) as day
FROM agent_events_v2
WHERE event_type LIKE '%ERROR%'
GROUP BY event_type, day
ORDER BY day DESC, error_count DESC;
-- Tool usage frequency
SELECT
JSON_VALUE(content, '$.tool_name') as tool,
COUNT(*) as usage_count
FROM agent_events_v2
WHERE event_type = 'TOOL_COMPLETED'
GROUP BY tool
ORDER BY usage_count DESC;
-- Access multimodal content from GCS
SELECT
part.mime_type,
part.object_ref.uri as gcs_uri
FROM agent_events_v2,
UNNEST(content_parts) AS part
WHERE part.storage_mode = 'GCS_REFERENCE';
3. Multi-Tool Integration
Example: examples/multi-tool-integration.md
Covers:
- Using multiple observability tools together
- Cloud Trace + BigQuery + AgentOps
- Data correlation across platforms
- Tool selection criteria
- Cost vs. insight tradeoffs
Integration Patterns:
Pattern 1: Google Cloud Native
- Cloud Trace for distributed tracing
- BigQuery for detailed event analysis
- Best for: GCP-centric deployments
Pattern 2: Comprehensive Monitoring
- Cloud Trace for infrastructure tracing
- AgentOps for session replays
- BigQuery for analytics
- Best for: Production monitoring with detailed debugging
Pattern 3: Open Source
- Phoenix for self-hosted observability
- BigQuery for long-term storage
- Best for: Data sovereignty requirements
Pattern 4: ML-Focused
- Weave for experiment tracking
- BigQuery for analytics
- Best for: Research and experimentation
4. Production Deployment
Example: examples/production-deployment.md
Covers:
- Production-ready observability configuration
- IAM role setup
- Cost optimization
- Alert configuration
- Dashboard creation
- Incident response
Production Setup:
IAM Configuration:
- Service account with minimal permissions
- Separate dev/staging/prod credentials
- Workload Identity for GKE deployments
Cost Controls:
- Event filtering to reduce BigQuery writes
- GCS lifecycle policies for multimodal content
- Table partitioning and clustering
- Retention policies (30-90 days)
Monitoring:
- Cloud Monitoring alerts for error rates
- BigQuery query dashboard in Looker Studio
- AgentOps session replay for debugging
- Trace analysis for performance issues
Security:
- No credentials in code (environment variables only)
- VPC Service Controls for data protection
- Customer-managed encryption keys (CMEK)
- Audit logging for compliance
Security Compliance
CRITICAL: This skill follows strict security rules:
❌ NEVER hardcode:
- API keys (AgentOps, Phoenix, Weave, W&B)
- Google Cloud credentials
- Service account keys
- OAuth tokens
- BigQuery connection strings
✅ ALWAYS:
- Use environment variables for secrets
- Generate
.env.examplewith placeholders - Add
.env*to.gitignore - Use Google Application Default Credentials
- Document credential acquisition process
- Use IAM roles instead of service account keys when possible
Placeholder format:
# .env.example
GOOGLE_CLOUD_PROJECT=your-project-id
AGENTOPS_API_KEY=your_agentops_key_here
PHOENIX_API_KEY=your_phoenix_key_here
PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/s/your-space
WANDB_API_KEY=your_wandb_key_here
Progressive Disclosure
This skill provides immediate setup guidance with references to detailed documentation:
- Quick Start: Use setup scripts for immediate configuration
- Production: Reference
production-deployment.mdfor complete guide - Analytics: Use
bigquery-queries.mdfor query templates - Integration: Reference
multi-tool-integration.mdfor advanced patterns
Load additional files only when specific customization is needed.
Common Workflows
1. Local Development Setup
# Enable Cloud Trace for local debugging
export GOOGLE_CLOUD_PROJECT=your-project-id
./scripts/setup-cloud-trace.sh your-project-id
# Start agent with tracing
python my_agent.py
# View traces at console.cloud.google.com/traces
2. Production Deployment with BigQuery
# 1. Create BigQuery dataset
bq mk --dataset my-project:agent-analytics
# 2. Create events table
bq mk --table agent-analytics.agent_events_v2 templates/bigquery-schema.json
# 3. Create GCS bucket for multimodal content
gsutil mb gs://my-agent-content/
# 4. Setup BigQuery analytics
./scripts/setup-bigquery-analytics.sh my-project agent-analytics my-agent-content
# 5. Deploy agent
adk deploy agent_engine --project=my-project ./agent
# 6. Validate setup
./scripts/validate-observability.sh --tool=bigquery
3. Multi-Tool Integration
# 1. Setup Cloud Trace
export GOOGLE_CLOUD_PROJECT=your-project-id
./scripts/setup-cloud-trace.sh your-project-id
# 2. Setup BigQuery Analytics
./scripts/setup-bigquery-analytics.sh your-project agent-analytics my-bucket
# 3. Setup AgentOps
export AGENTOPS_API_KEY=your_key_here
./scripts/setup-agentops.sh
# 4. Validate all configurations
./scripts/validate-observability.sh
Troubleshooting
Cloud Trace Not Showing Traces
Check:
GOOGLE_CLOUD_PROJECTenvironment variable is set- Cloud Trace API is enabled
- Service account has
roles/cloudtrace.agent - Tracer initialized before ADK imports
Debug:
# Check Cloud Trace API status
gcloud services list --enabled | grep cloudtrace
# Enable Cloud Trace API
gcloud services enable cloudtrace.googleapis.com
# Test trace export
python scripts/test-cloud-trace.py
BigQuery Events Not Appearing
Check:
- Dataset and table exist
- Service account has correct IAM roles
- BigQuery API is enabled
- Plugin configuration is correct
- No event filtering blocking events
Debug:
# Check dataset exists
bq ls my-project:
# Check table schema
bq show --schema agent-analytics.agent_events_v2
# Check IAM permissions
gcloud projects get-iam-policy my-project \
--flatten="bindings[].members" \
--filter="bindings.members:serviceAccount:YOUR_SA_EMAIL"
# Test plugin manually
python scripts/test-bigquery-plugin.py
AgentOps Not Capturing Traces
Check:
- AgentOps initialized before ADK imports
- API key is valid
- Network connectivity to app.agentops.ai
- AgentOps package version is latest
Fix:
# Update AgentOps
pip install -U agentops
# Test initialization
python -c "import agentops; agentops.init(); print('Success')"
# Check for conflicts with other tracers
# Ensure AgentOps is initialized first
Phoenix Connection Failed
Check:
- Phoenix API key is valid
- Collector endpoint URL is correct
- Network access to Phoenix endpoint
- Required packages installed
Debug:
# Test Phoenix endpoint
curl -H "Authorization: Bearer YOUR_KEY" \
https://app.phoenix.arize.com/s/YOUR_SPACE
# Verify package versions
pip list | grep -E "(openinference|phoenix)"
# Run verification script
python scripts/verify-phoenix.py
Weave Traces Not Appearing
Check:
- Tracer provider set BEFORE ADK imports
- W&B API key is valid
- Entity and project names are correct
- OTEL exporter configured properly
Fix:
# Verify initialization order
# 1. Import OTEL packages
# 2. Configure and set tracer provider
# 3. THEN import ADK
# Correct order:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
trace.set_tracer_provider(TracerProvider()) # FIRST
from google.adk.app import App # THEN
Dependencies
Required:
google-adk>=1.21.0- ADK framework (version 1.21.0+ for full BigQuery features)google-cloud-trace>=1.13.0- Cloud Trace client (optional)google-cloud-bigquery>=3.0.0- BigQuery client (optional)
Optional (Third-party tools):
agentops>=0.3.0- AgentOps integrationopeninference-instrumentation-google-adk>=0.1.0- Phoenix instrumentationarize-phoenix-otel>=0.1.0- Phoenix OTEL exporteropentelemetry-sdk>=1.20.0- OpenTelemetry SDK for Weaveopentelemetry-exporter-otlp-proto-http>=1.20.0- OTLP exporter for Weave
Installation:
# Core ADK with Cloud Trace
pip install google-adk google-cloud-trace
# With BigQuery Analytics
pip install google-adk google-cloud-bigquery
# With AgentOps
pip install google-adk agentops
# With Phoenix
pip install google-adk openinference-instrumentation-google-adk arize-phoenix-otel
# With Weave
pip install google-adk opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
# All observability tools
pip install google-adk google-cloud-trace google-cloud-bigquery agentops \
openinference-instrumentation-google-adk arize-phoenix-otel \
opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
Best Practices
- Multi-Layer Observability: Use Cloud Trace for infrastructure, BigQuery for analytics, and AgentOps for debugging
- Cost Control: Implement event filtering and retention policies to manage BigQuery costs
- Security: Never hardcode credentials; use environment variables and IAM roles
- Progressive Rollout: Start with Cloud Trace, add BigQuery when analytics needed
- Tool Selection: Choose tools based on requirements (open-source vs. managed, cost vs. features)
- Data Correlation: Use trace_id across all tools for unified debugging
- Alert Configuration: Set up alerts for error rates, latency spikes, and cost anomalies
- Dashboard Creation: Build custom dashboards in Looker Studio, Grafana, or tool-native UIs
Additional Resources
- Cloud Trace: https://cloud.google.com/trace/docs
- BigQuery Agent Analytics: https://google.github.io/adk-docs/observability/bigquery-agent-analytics/
- AgentOps: https://app.agentops.ai/
- Phoenix (Arize): https://arize.com/docs/phoenix/
- Weave (W&B): https://docs.wandb.ai/weave/
- ADK Observability Guide: https://google.github.io/adk-docs/observability/
- OpenTelemetry: https://opentelemetry.io/docs/
Tool Comparison
| Feature | Cloud Trace | BigQuery | AgentOps | Phoenix | Weave |
|---|---|---|---|---|---|
| Hosting | Google Cloud | Google Cloud | SaaS | SaaS/Self-hosted | SaaS |
| Cost | Free tier + usage | Storage + queries | Free tier + paid | Free tier + paid | Free tier + paid |
| Setup Complexity | Low | Medium | Very Low | Low | Medium |
| Data Control | Google Cloud | Google Cloud | Third-party | Self-host option | Third-party |
| Query Flexibility | Low | Very High | Medium | High | Medium |
| Real-time | Yes | Near real-time | Yes | Yes | Yes |
| Custom Dashboards | Limited | Full (Looker) | Built-in | Built-in | Built-in |
| Best For | Infrastructure tracing | Deep analytics | Quick debugging | Open-source, control | ML experiments |