name	retellai-advanced-troubleshooting
description	Apply Retell AI advanced debugging techniques for hard-to-diagnose issues. Use when standard troubleshooting fails, when investigating complex race conditions, or when preparing evidence bundles for Retell AI support escalation. Trigger with phrases like "retellai hard bug", "retellai mystery error", "retellai impossible to debug", "difficult retellai issue", "retellai deep debug".
allowed-tools	Read, Grep, Bash(kubectl:*), Bash(curl:*), Bash(tcpdump:*)
version	1.0.0
license	MIT
author	Jeremy Longshore <jeremy@intentsolutions.io>

Retell AI Advanced Troubleshooting

Overview

Deep debugging techniques for complex Retell AI issues that resist standard troubleshooting.

Prerequisites

Access to production logs and metrics
kubectl access to clusters
Network capture tools available
Understanding of distributed tracing

Evidence Collection Framework

Comprehensive Debug Bundle

#!/bin/bash
# advanced-retellai-debug.sh

BUNDLE="retellai-advanced-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE"/{logs,metrics,network,config,traces}

# 1. Extended logs (1 hour window)
kubectl logs -l app=retellai-integration --since=1h > "$BUNDLE/logs/pods.log"
journalctl -u retellai-service --since "1 hour ago" > "$BUNDLE/logs/system.log"

# 2. Metrics dump
# URLs are quoted so the shell does not treat "?" as a glob character
curl -s "localhost:9090/api/v1/query?query=retellai_requests_total" > "$BUNDLE/metrics/requests.json"
curl -s "localhost:9090/api/v1/query?query=retellai_errors_total" > "$BUNDLE/metrics/errors.json"

# 3. Network capture (30 seconds)
timeout 30 tcpdump -i any port 443 -w "$BUNDLE/network/capture.pcap" &

# 4. Distributed traces
curl -s "localhost:16686/api/traces?service=retellai" > "$BUNDLE/traces/jaeger.json"

# 5. Configuration state
kubectl get cm retellai-config -o yaml > "$BUNDLE/config/configmap.yaml"
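# Record key names and sizes only; `-o yaml` would write the base64-encoded secret values into the bundle
kubectl describe secret retellai-secrets > "$BUNDLE/config/secrets-redacted.txt"

# Wait for the background network capture to finish before archiving
wait
tar -czf "$BUNDLE.tar.gz" "$BUNDLE"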
echo "Advanced debug bundle: $BUNDLE.tar.gz"
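
Because the bundle is meant to be shared with support, it may be worth scrubbing credentials from the collected logs before archiving. The following is a minimal sketch, not part of the original script: the `redactSecrets` helper and its patterns are hypothetical, and they assume credentials show up as `Bearer` headers or as `api_key` / `token` / `secret` key-value pairs. Adjust the patterns to how your logs actually record credentials.

```typescript
// Hypothetical helper: masks credential-like values in log text.
// The patterns are assumptions about common log shapes, not a documented Retell AI format.
const SECRET_PATTERNS: RegExp[] = [
  // Authorization headers: keep the "Bearer " prefix, mask the token
  /(Bearer\s+)[A-Za-z0-9._~+\/=-]+/g,
  // key=value or "key": "value" pairs for a few common credential names
  /((?:api[_-]?key|token|secret)["']?\s*[:=]\s*["']?)[^\s"',}]+/gi,
];

function redactSecrets(text: string): string {
  return SECRET_PATTERNS.reduce(
    (acc, pattern) => acc.replace(pattern, "$1[REDACTED]"),
    text,
  );
}
```

A log line such as `Authorization: Bearer abc.def-123` would come out as `Authorization: Bearer [REDACTED]`. Pattern-based redaction can miss unusual formats, so a manual review of the bundle before sending it is still advisable.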

Systematic Isolation

Layer-by-Layer Testing

// Test each layer independently
async function diagnoseRetellAIIssue(): Promise<DiagnosisReport> {
  const results: DiagnosisResult[] = [];

  // Layer 1: Network connectivity
  results.push(await testNetworkConnectivity());

  // Layer 2: DNS resolution
  results.push(await testDNSResolution('api.retellai.com'));

  // Layer 3: TLS handshake
  results.push(await testTLSHandshake('api.retellai.com'));

  // Layer 4: Authentication
  results.push(await testAuthentication());

  // Layer 5: API response
  results.push(await testAPIResponse());

  // Layer 6: Response parsing
  results.push(await testResponseParsing());

  return { results, firstFailure: results.find(r => !r.success) };
}
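
The layer functions and result types used above are not defined in this document. One possible shape is sketched below, under the assumption that each layer check is an async function that throws on failure. The names `DiagnosisResult`, `DiagnosisReport`, `runLayer`, and `buildReport` are illustrative, not part of any Retell AI SDK.

```typescript
// Assumed shapes for the types the diagnosis function returns.
interface DiagnosisResult {
  layer: string;
  success: boolean;
  durationMs: number;
  error?: string;
}

interface DiagnosisReport {
  results: DiagnosisResult[];
  firstFailure?: DiagnosisResult;
}

// Wraps one layer check so that a thrown error becomes a failed result
// instead of aborting the whole diagnosis.
async function runLayer(
  layer: string,
  check: () => Promise<void>,
): Promise<DiagnosisResult> {
  const start = Date.now();
  try {
    await check();
    return { layer, success: true, durationMs: Date.now() - start };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { layer, success: false, durationMs: Date.now() - start, error };
  }
}

// Collects results and points at the lowest layer that failed.
function buildReport(results: DiagnosisResult[]): DiagnosisReport {
  return { results, firstFailure: results.find((r) => !r.success) };
}
```

With this wrapper, a DNS check could be written as `runLayer('dns', async () => { await lookup('api.retellai.com'); })` using `lookup` from Node's `dns/promises` module. Since layers are ordered from lowest to highest, `firstFailure` is usually the most useful starting point: failures in later layers are often just consequences of it.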

Minimal Reproduction

// Strip down to absolute minimum
async function minimalRepro(): Promise<void> {
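  // 1. Fresh client, no customization
  //    (RetellAIClient and ping() are placeholders; substitute your SDK's client and its cheapest read-only call)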
  const client = new RetellAIClient({
    apiKey: process.env.RETELLAI_API_KEY!,
  });

  // 2. Simplest possible call
  try {
    const result = await client.ping();
    console.log('Ping successful:', result);
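  } catch (error) {
    // Catch variables are `unknown` under strict TypeScript; narrow before reading fields
    const err = error as Error & { code?: string };
    console.error('Ping failed:', {
      message: err.message,
      code: err.code,
      stack: err.stack,
    });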
  }
}

Timing Analysis

class TimingAnalyzer {
  private timings: Map<string, number[]> = new Map();

  async measure<T>(label: string, fn: () => Promise<T>): Promise<T> {
    const start = performance.now();
    try {
      return await fn();
    } finally {
      const duration = performance.now() - start;
      const existing = this.timings.get(label) || [];
      existing.push(duration);
      this.timings.set(label, existing);
    }
  }

  report(): TimingReport {
    const report: TimingReport = {};
    for (const [label, times] of this.timings) {
      report[label] = {
        count: times.length,
        min: Math.min(...times),
        max: Math.max(...times),
        avg: times.reduce((a, b) => a + b, 0) / times.length,
        p95: this.percentile(times, 95),
      };
    }
    return report;
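  }

  // Nearest-rank percentile over a sorted copy of the samples
  private percentile(values: number[], p: number): number {
    const sorted = [...values].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }
}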
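
Once a report exists, one way to read it is to look for operations whose tail latency is far above their average, which tends to indicate intermittent slowness rather than a uniformly slow call. The sketch below assumes the report shape implied by the analyzer above. `TimingStats`, `TimingReport`, and `findTailHeavy` are illustrative names, and the default ratio of 3 and minimum of 20 samples are arbitrary starting points.

```typescript
// Assumed shape of the report produced by the timing analyzer.
interface TimingStats {
  count: number;
  min: number;
  max: number;
  avg: number;
  p95: number;
}
type TimingReport = Record<string, TimingStats>;

// Returns labels whose p95 is at least `ratio` times the average.
// Labels with fewer than `minCount` samples are skipped because their
// percentiles are too noisy to interpret.
function findTailHeavy(
  report: TimingReport,
  ratio = 3,
  minCount = 20,
): string[] {
  return Object.entries(report)
    .filter(([, s]) => s.count >= minCount && s.avg > 0 && s.p95 >= ratio * s.avg)
    .map(([label]) => label)
    .sort();
}
```

Labels flagged this way are good candidates for the minimal reproduction step, since the slow samples can be correlated with request IDs or trace spans.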

Memory and Resource Analysis

// Detect memory leaks in Retell AI client usage
const heapUsed: number[] = [];

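setInterval(() => {
  const usage = process.memoryUsage();
  heapUsed.push(usage.heapUsed);

  // Keep a sliding window of the last 60 samples (1 hour at 1/min)
  // so the sample buffer itself does not grow without bound
  if (heapUsed.length > 60) {
    heapUsed.shift();
  }

  // Alert on sustained growth across the full window
  if (heapUsed.length === 60) {
    const trend = heapUsed[heapUsed.length - 1] - heapUsed[0];
    if (trend > 100 * 1024 * 1024) { // 100MB growth
      console.warn('Potential memory leak in retellai integration');
    }
  }
}, 60000);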
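
Comparing only the first and last sample is sensitive to where garbage collection happened to fall. As an alternative sketch, a least-squares slope over all samples in the window gives a steadier growth estimate. `heapSlope` is an illustrative helper, not part of any SDK, and it assumes samples are taken at a fixed interval.

```typescript
// Least-squares slope of the samples, in bytes per sample interval.
// Assumes evenly spaced samples; returns 0 when there are fewer than two.
function heapSlope(samples: number[]): number {
  const n = samples.length;
  if (n < 2) return 0;

  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((a, b) => a + b, 0) / n;

  let numerator = 0;
  let denominator = 0;
  samples.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return numerator / denominator;
}
```

At one sample per minute, multiplying the slope by 60 estimates growth per hour, which can be compared against the same 100MB threshold used above. A short window can still show a positive slope from normal allocation churn, so this works best over the full hour of samples.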

Race Condition Detection

// Detect concurrent access issues
class RetellAIConcurrencyChecker {
  private inProgress: Set<string> = new Set();

  async execute<T>(key: string, fn: () => Promise<T>): Promise<T> {
    if (this.inProgress.has(key)) {
      console.warn(`Concurrent access detected for ${key}`);
    }

    this.inProgress.add(key);
    try {
      return await fn();
    } finally {
      this.inProgress.delete(key);
    }
  }
}
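
The checker above only warns while the process is running. When the evidence is already in logs or traces, overlapping operations can also be found after the fact. The sketch below assumes each operation has been recorded with a resource key plus start and end timestamps. The `Span` shape and `findOverlaps` function are hypothetical and would need mapping from your own log or trace format.

```typescript
interface Span {
  key: string;   // resource identifier, e.g. a call or agent ID
  start: number; // epoch milliseconds when the operation began
  end: number;   // epoch milliseconds when the operation finished
}

// Returns pairs of spans on the same key whose time ranges overlap.
// Spans that merely touch (one ends exactly when the next starts) are not reported.
function findOverlaps(spans: Span[]): Array<[Span, Span]> {
  const byKey = new Map<string, Span[]>();
  spans.forEach((span) => {
    const list = byKey.get(span.key) || [];
    list.push(span);
    byKey.set(span.key, list);
  });

  const overlaps: Array<[Span, Span]> = [];
  byKey.forEach((list) => {
    // Sorted by start time, so the inner scan can stop at the first span
    // that begins after the current one has ended.
    const sorted = [...list].sort((a, b) => a.start - b.start);
    sorted.forEach((current, i) => {
      for (let j = i + 1; j < sorted.length && sorted[j].start < current.end; j++) {
        overlaps.push([current, sorted[j]]);
      }
    });
  });
  return overlaps;
}
```

An overlap is not proof of a race on its own, but if the reported failures line up with overlapping spans on the same key, that is strong supporting evidence for the escalation.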

Support Escalation Template

## Retell AI Support Escalation

**Severity:** P[1-4]
**Request ID:** [from error response]
**Timestamp:** [ISO 8601]

### Issue Summary
[One paragraph description]

### Steps to Reproduce
1. [Step 1]
2. [Step 2]

### Expected vs Actual
- Expected: [behavior]
- Actual: [behavior]

### Evidence Attached
- [ ] Debug bundle (retellai-advanced-debug-*.tar.gz)
- [ ] Minimal reproduction code
- [ ] Timing analysis
- [ ] Network capture (if relevant)

### Workarounds Attempted
1. [Workaround 1] - Result: [outcome]
2. [Workaround 2] - Result: [outcome]

Instructions

Step 1: Collect Evidence Bundle

Run the comprehensive debug script to gather all relevant data.

Step 2: Systematic Isolation

Test each layer independently to identify the failure point.

Step 3: Create Minimal Reproduction

Strip down to the simplest failing case.

Step 4: Escalate with Evidence

Use the support template with all collected evidence.

Output

Comprehensive debug bundle collected
Failure layer identified
Minimal reproduction created
Support escalation submitted

Error Handling

Issue	Cause	Solution
Can't reproduce	Race condition	Add timing analysis
Intermittent failure	Timing-dependent	Increase sample size
No useful logs	Missing instrumentation	Add debug logging
Memory growth	Resource leak	Use heap profiling
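
For the "increase sample size" advice, it helps to estimate how many attempts are needed before an intermittent failure is likely to show up at least once. If each attempt fails independently with probability p, the chance of seeing at least one failure in n attempts is 1 - (1 - p)^n, so n must be at least ln(1 - c) / ln(1 - p) for confidence c. The `trialsNeeded` helper below is an illustrative sketch of that arithmetic. Independence between attempts is an assumption that real systems may not satisfy.

```typescript
// Number of independent attempts needed to observe at least one failure
// with the given confidence, assuming a fixed per-attempt failure rate.
function trialsNeeded(failureRate: number, confidence = 0.95): number {
  if (!(failureRate > 0 && failureRate < 1)) {
    throw new RangeError("failureRate must be between 0 and 1 (exclusive)");
  }
  if (!(confidence > 0 && confidence < 1)) {
    throw new RangeError("confidence must be between 0 and 1 (exclusive)");
  }
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - failureRate));
}
```

For a failure that occurs roughly once in a hundred requests, `trialsNeeded(0.01)` returns 299, so a reproduction loop of a few dozen iterations is unlikely to be enough.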

Examples

Quick Layer Test

# Test each layer in sequence
curl -v https://api.retellai.com/health 2>&1 | grep -E "(Connected|TLS|HTTP)"

Resources

Next Steps

For load testing, see retellai-load-scale.
