Claude Code Plugins

Community-maintained marketplace

Feedback

error-handling-patterns

@tipmaster/claude-config
1
0

Use when adding error handling, designing APIs, or debugging failures - guides selection of fail-fast vs graceful degradation, error boundaries, retry strategies, and user-facing messages to build resilient systems

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name error-handling-patterns
description Use when adding error handling, designing APIs, or debugging failures - guides selection of fail-fast vs graceful degradation, error boundaries, retry strategies, and user-facing messages to build resilient systems

Error Handling Patterns

Overview

Systematic approach to error handling that balances resilience with debuggability through appropriate strategies for different error types and contexts.

When to Use

Use this skill when:

  • Adding error handling to new code
  • Designing API error responses
  • Implementing retry logic
  • Debugging production errors
  • Reviewing error handling in code
  • Deciding between fail-fast and graceful degradation

Symptoms that trigger this skill:

  • "Add error handling for..."
  • "Handle this failure..."
  • "Retry when X fails..."
  • try/catch blocks, error objects
  • Production errors in logs
  • Discussing failure modes

Don't use when:

  • Validation errors (straightforward)
  • Expected control flow (not errors)

Quick Reference: Error Handling Decision Tree

Use TodoWrite for ALL items below when handling errors:

Is this error recoverable?
├─ No → Fail fast (crash, log, alert)
└─ Yes → Can user take action?
    ├─ Yes → Return actionable error message
    └─ No → Should we retry automatically?
        ├─ Yes → Retry with backoff
        └─ No → Graceful degradation or fail

Implementation

Step 1: Create TodoWrite Checklist

☐ Classify error type (recoverable vs unrecoverable)
☐ Choose strategy (fail-fast, retry, graceful degradation)
☐ Implement error boundary if needed
☐ Add user-facing error message (actionable)
☐ Add technical error details (for logging)
☐ Add request/trace ID for debugging
☐ Log error with context (stack trace, input, state)
☐ Set up monitoring/alerting if critical
☐ Test error scenarios (unit tests, integration tests)
☐ Document error behavior in API docs

Step 2: Error Classification

Unrecoverable errors (fail fast):

  • Programmer errors (bugs)
  • Missing required config
  • Invalid state (corrupted data)
  • Out of memory
  • Missing dependencies

Recoverable errors (handle gracefully):

  • Network timeouts
  • Rate limits
  • User input validation
  • Resource temporarily unavailable
  • Third-party API failures

Decision:

// Unrecoverable → Let it crash
if (!process.env.DATABASE_URL) {
  throw new Error('DATABASE_URL is required');
}

// Recoverable → Handle gracefully
try {
  const data = await fetchFromAPI();
} catch (error) {
  if (error.code === 'TIMEOUT') {
    return fallbackData;
  }
  throw error; // Re-throw if unexpected
}

Step 3: Fail-Fast Pattern

When to use:

  • Configuration errors on startup
  • Programmer errors (bugs)
  • Invalid state that can't be recovered
  • Security violations

How:

// Example: Fail fast on startup
function validateConfig() {
  if (!process.env.DATABASE_URL) {
    console.error('FATAL: DATABASE_URL not set');
    process.exit(1);
  }
  if (!process.env.API_KEY) {
    console.error('FATAL: API_KEY not set');
    process.exit(1);
  }
}

validateConfig(); // Run before starting server

// Example: Fail fast on invalid state
function processOrder(order) {
  if (!order || !order.id) {
    throw new Error('Invalid order: missing id');
  }
  // Process order
}

Why fail fast:

  • Bugs surface immediately (not hidden)
  • Clear error messages (not mysterious failures)
  • Prevents cascading failures
  • Easier to debug (fails at root cause)

Step 4: Retry Pattern

When to use:

  • Transient network failures
  • Rate limiting (with backoff)
  • Temporary resource unavailability
  • Idempotent operations

When NOT to use:

  • Non-idempotent operations (e.g., charging credit card)
  • Permanent failures (404, 401, validation errors)
  • Operations with side effects

Exponential backoff with jitter:

async function retryWithBackoff(fn, options = {}) {
  const {
    maxRetries = 3,
    initialDelayMs = 1000,
    maxDelayMs = 10000,
    factor = 2,
  } = options;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Don't retry on permanent errors
      if (error.statusCode === 404 || error.statusCode === 400) {
        throw error;
      }

      // Last attempt, give up
      if (attempt === maxRetries) {
        throw new Error(`Failed after ${maxRetries} retries: ${error.message}`);
      }

      // Calculate delay with exponential backoff and jitter
      const delay = Math.min(
        initialDelayMs * Math.pow(factor, attempt),
        maxDelayMs
      );
      const jitter = delay * 0.1 * Math.random();
      const totalDelay = delay + jitter;

      console.warn(`Retry attempt ${attempt + 1} after ${totalDelay}ms`);
      await sleep(totalDelay);
    }
  }
}

// Usage
const data = await retryWithBackoff(() => fetchFromAPI(), {
  maxRetries: 5,
  initialDelayMs: 1000,
});

Retry rules:

  • ✅ Exponential backoff (1s, 2s, 4s, 8s, ...)
  • ✅ Add jitter (prevent thundering herd)
  • ✅ Max delay cap (don't wait hours)
  • ✅ Max retry count (eventually give up)
  • ❌ Don't retry non-idempotent operations
  • ❌ Don't retry permanent errors (400, 404)

Step 5: Graceful Degradation

When to use:

  • Non-critical features
  • Optional enhancements
  • Features with acceptable fallbacks

Patterns:

Pattern 1: Fallback value

async function getUserPreferences(userId) {
  try {
    return await db.getUserPreferences(userId);
  } catch (error) {
    console.warn(`Failed to load preferences: ${error.message}`);
    return DEFAULT_PREFERENCES; // Fallback
  }
}

Pattern 2: Feature flag fallback

async function enhanceWithAI(text) {
  if (!AI_FEATURE_ENABLED) {
    return text; // Gracefully degrade
  }

  try {
    return await aiService.enhance(text);
  } catch (error) {
    console.warn(`AI enhancement failed: ${error.message}`);
    return text; // Fallback to original
  }
}

Pattern 3: Partial failure

async function fetchDashboardData() {
  const [users, orders, analytics] = await Promise.allSettled([
    fetchUsers(),
    fetchOrders(),
    fetchAnalytics(),
  ]);

  return {
    users: users.status === 'fulfilled' ? users.value : [],
    orders: orders.status === 'fulfilled' ? orders.value : [],
    analytics: analytics.status === 'fulfilled' ? analytics.value : null,
  };
}

Step 6: Error Boundaries (React Example)

Component-level error isolation:

class ErrorBoundary extends React.Component {
  state = { hasError: false, error: null };

  static getDerivedStateFromError(error) {
    return { hasError: true, error };
  }

  componentDidCatch(error, errorInfo) {
    console.error('Caught error:', error, errorInfo);
    // Send to error tracking (Sentry, etc.)
    trackError(error, errorInfo);
  }

  render() {
    if (this.state.hasError) {
      return (
        <div className="error-fallback">
          <h2>Something went wrong</h2>
          <button onClick={() => this.setState({ hasError: false })}>
            Try again
          </button>
        </div>
      );
    }

    return this.props.children;
  }
}

// Usage: Wrap risky components
<ErrorBoundary>
  <RiskyComponent />
</ErrorBoundary>

Step 7: User-Facing Error Messages

Good error messages:

  • ✅ Explain what went wrong
  • ✅ Tell user what to do next
  • ✅ Avoid technical jargon
  • ✅ Include request ID for support

Bad error messages:

  • ❌ Generic: "An error occurred"
  • ❌ Technical: "500 Internal Server Error"
  • ❌ No action: "Failed"

Examples:

// Bad
throw new Error('Invalid input');

// Good
throw new Error('Email address is invalid. Please use format: user@example.com');

// Bad
return { error: 'Database error' };

// Good
return {
  error: {
    message: 'We couldn\'t save your changes. Please try again in a moment.',
    code: 'DATABASE_UNAVAILABLE',
    retryAfter: 5, // seconds
    requestId: 'req_abc123',
  }
};

// Bad
console.error('Error');

// Good
console.error({
  message: 'Failed to fetch user data',
  userId: userId,
  endpoint: '/api/users',
  statusCode: response.status,
  requestId: response.headers.get('X-Request-ID'),
  timestamp: new Date().toISOString(),
  stack: error.stack,
});

Step 8: Logging with Context

Always log errors with context:

try {
  await processOrder(order);
} catch (error) {
  console.error({
    message: 'Order processing failed',
    error: error.message,
    stack: error.stack,
    orderId: order.id,
    userId: order.userId,
    orderTotal: order.total,
    timestamp: new Date().toISOString(),
    requestId: req.id,
  });

  // Re-throw or handle
  throw error;
}

Context to include:

  • Input parameters
  • Current state
  • Request/trace ID
  • Timestamp
  • Stack trace
  • Error type/code

Common Mistakes

Mistake Why It's Wrong Fix
Swallowing errors Silent failures, hard to debug Log errors, re-throw if needed
Generic error messages User can't take action Specific, actionable messages
Retrying non-idempotent ops Duplicate charges, double emails Only retry safe operations
Infinite retries Never gives up, wastes resources Max retry count and timeout
No exponential backoff Thundering herd, overwhelms server Exponential backoff with jitter
Catching all errors blindly Masks bugs, hides real issues Only catch expected errors
No logging context Can't reproduce or debug Log inputs, state, request ID
Failing entire request on partial failure All-or-nothing, poor UX Graceful degradation, partial success

Rationalization Counters

"I'll add error handling later" → Later never comes. Errors happen in production immediately. Handle them now.

"Just catch and log, it's fine" → Catching and logging isn't handling. Decide: retry, degrade, or fail. Logging alone helps no one.

"Users don't need details" → Generic errors frustrate users. "Something went wrong" is useless. Tell them what and how to fix.

"Retry everything, it'll work eventually" → Non-idempotent retries cause duplicates. Permanent errors never succeed. Be selective.

"Error handling adds too much code" → Unhandled errors add outages, angry users, and 3am debugging. Error handling is core logic, not optional.

"This won't fail in production" → Famous last words. Everything fails in production. Plan for failure.

Decision Guide: Which Pattern to Use?

Startup/Configuration errors:

  • ✅ Fail fast (exit immediately)

Network requests (idempotent):

  • ✅ Retry with exponential backoff

Network requests (non-idempotent):

  • ❌ Don't auto-retry
  • ✅ Return error, let user retry

Non-critical features:

  • ✅ Graceful degradation with fallback

Critical operations (payment, data integrity):

  • ✅ Fail fast, alert, require manual intervention

User input validation:

  • ✅ Return specific error message
  • ❌ Don't retry or degrade

Third-party API failures:

  • ✅ Retry if transient (503, timeout)
  • ✅ Degrade if optional feature
  • ✅ Fail if critical dependency

Error Handling by Layer

API Layer (Express example)

// Global error handler
app.use((error, req, res, next) => {
  // Log with context
  console.error({
    error: error.message,
    stack: error.stack,
    path: req.path,
    method: req.method,
    userId: req.user?.id,
    requestId: req.id,
  });

  // User-facing response
  res.status(error.statusCode || 500).json({
    error: {
      message: error.userMessage || 'An unexpected error occurred',
      code: error.code || 'INTERNAL_ERROR',
      requestId: req.id,
    },
  });
});

Service Layer

class UserService {
  async getUser(userId) {
    try {
      return await this.db.users.findById(userId);
    } catch (error) {
      if (error.code === 'NOT_FOUND') {
        throw new NotFoundError(`User ${userId} not found`);
      }
      throw new ServiceError('Failed to fetch user', { cause: error });
    }
  }
}

Database Layer

async function queryWithRetry(query, params) {
  return retryWithBackoff(
    () => db.query(query, params),
    {
      maxRetries: 3,
      shouldRetry: (error) => error.code === 'CONNECTION_ERROR',
    }
  );
}

Integration with Existing Workflows

With TDD:

  • Write tests for error scenarios
  • Test retry logic
  • Test graceful degradation fallbacks

With monitoring:

  • Send errors to error tracking (Sentry, Rollbar)
  • Set up alerts for critical errors
  • Track error rates in metrics

With APIs:

  • Document error responses in OpenAPI
  • Include error codes and messages
  • Provide retry guidance

Real-World Impact

Without this skill:

  • Silent failures (errors swallowed)
  • Mysterious production issues (no context logged)
  • Thundering herd (no backoff on retries)
  • Duplicate charges (retrying non-idempotent ops)
  • Frustrated users ("An error occurred")

With this skill:

  • Fast failure on bugs (easy debugging)
  • Resilient systems (smart retries)
  • Graceful degradation (partial failures OK)
  • Clear error messages (users know what to do)
  • Rich logs (easy debugging with context)

Required Background

None. This skill is self-contained.

Cross-References

  • Use superpowers:systematic-debugging when debugging error scenarios
  • Use superpowers:test-driven-development to test error paths
  • Use superpowers:api-design-review for API error design