Validating Inputs

@obra/clank
Check all external inputs for validity - garbage in, nothing out, never garbage out

SKILL.md

name: Validating Inputs
description: Check all external inputs for validity - garbage in, nothing out, never garbage out
when_to_use: Before implementing any function that receives external data. When writing functions that take parameters from users, APIs, databases, files, or other untrusted sources. When you see missing validation, no error handling, or silent failures. When implementing without thinking "what could go wrong?". When code throws TypeError, ValueError, KeyError from missing validation. When crashes or runtime errors occur. When security vulnerabilities exist (injection attacks, buffer overflow). When data corruption happens. When code fails with unexpected inputs.
version: 1.0.0
languages: all

Validating Inputs

Overview

Professional-grade software never outputs garbage regardless of what it receives. "Garbage in, garbage out" is the mark of sloppy, insecure code.

Core principle: Check all data from external sources. Validate all routine parameters from untrusted sources. Decide consciously how to handle invalid data.

Modern standard: "Garbage in, nothing out" OR "Garbage in, error message out" OR "No garbage allowed in"

Violating the letter of this rule is violating the spirit of defensive programming.

When to Use

Always use when writing functions that receive:

  • User input (forms, command-line args, uploaded files)
  • External API responses
  • Database query results
  • File contents
  • Network data
  • Configuration files
  • Any data from outside your direct control

Warning signs you need this:

  • Function assumes inputs are valid
  • No validation beyond empty/null checks
  • No assertions documenting assumptions
  • Spec mentions constraints but code doesn't check them
  • Silent failures or wrong results with bad data
  • Security vulnerabilities (injection, overflow, etc.)
  • Functions accept any input without question

Don't skip when:

  • "Inputs will always be valid" (they won't)
  • "Validation happens elsewhere" (defense in depth - check anyway)
  • "It's just internal code" (today's internal is tomorrow's API)
  • Under time pressure (validation prevents longer debugging)

The Two-Level Defense

Level 1: Assertions (Should NEVER Happen)

Use for: Conditions that indicate bugs in YOUR code

def calculate_velocity(distance: float, time: float) -> float:
    # Preconditions: These should NEVER be violated if caller is correct
    assert distance >= 0, "distance cannot be negative"
    assert time > 0, "time must be positive"

    result = distance / time

    # Postcondition: Result should be reasonable
    assert result >= 0, f"velocity cannot be negative: {result}"

    return result

Assertions are:

  • Executable documentation
  • Typically disabled in production builds (in Python, skipped when the interpreter runs with -O)
  • For catching programmer errors during development
  • If one fires, there is a bug in the code that needs fixing
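
Because assertions can be disabled, they cannot be the only guard on external data. The sketch below illustrates this in Python using the built-in compile() with its optimize argument, which mirrors what the -O flag does; the names SOURCE and run are made up for this illustration.

```python
# The same source is compiled twice: once with assertions kept (optimize=0)
# and once with assertions stripped (optimize=1, equivalent to running with -O).
SOURCE = (
    "assert value >= 0, 'value cannot be negative'\n"
    "result = value * 2\n"
)


def run(optimize: int) -> str:
    """Execute SOURCE with value=-5 and report what happened."""
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    namespace = {"value": -5}
    try:
        exec(code, namespace)
    except AssertionError as exc:
        return f"AssertionError: {exc}"
    return f"result = {namespace['result']}"


print(run(0))  # assertion kept: AssertionError: value cannot be negative
print(run(1))  # assertion stripped: result = -10
```

With assertions stripped, the bad value flows straight through, which is why Level 2 checks use explicit raise statements.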

Level 2: Error Handling (MIGHT Happen)

Use for: Conditions you expect might occur in production

def calculate_average_score(scores: list[float]) -> float:
    """Calculate average of test scores (must be 0-100)."""

    # Error handling: Validate external data
    if scores is None:
        raise ValueError("scores cannot be None")

    if not scores:
        raise ValueError("Cannot calculate average of empty score list")

    # Validate each score
    for i, score in enumerate(scores):
        if not isinstance(score, (int, float)):
            raise TypeError(f"Score {i} is not a number: {score}")
        if score < 0 or score > 100:
            raise ValueError(f"Score {i} out of range [0-100]: {score}")

    result = sum(scores) / len(scores)

    # Postcondition: Verify result is valid
    assert 0 <= result <= 100, f"Calculated average out of range: {result}"

    return result

Error handling:

  • Stays in production code
  • Handles expected anomalies gracefully
  • Validates external/untrusted data
  • If it triggers, the error needs handling; the code itself is not at fault
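
A minimal sketch of how boundary code might consume such errors, assuming a raw string arriving from a form field. The function names parse_score and handle_submission are hypothetical, not part of the skill.

```python
def parse_score(raw: str) -> float:
    """Convert one raw form field into a score in [0, 100], or raise ValueError."""
    try:
        score = float(raw)
    except (TypeError, ValueError):
        raise ValueError(f"Score is not a number: {raw!r}") from None
    # This comparison is also False for NaN, so NaN is rejected here.
    if not 0 <= score <= 100:
        raise ValueError(f"Score out of range [0-100]: {score}")
    return score


def handle_submission(raw: str) -> str:
    """Boundary code: turn a validation error into a message for the user."""
    try:
        score = parse_score(raw)
    except ValueError as exc:
        return f"Rejected: {exc}"
    return f"Accepted: {score}"


print(handle_submission("85"))   # Accepted: 85.0
print(handle_submission("abc"))  # Rejected: Score is not a number: 'abc'
```

The validation function stays strict, and the caller at the boundary decides how the failure is presented.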

Quick Reference

| Situation | Approach | Example |
|---|---|---|
| External data | Validate everything | Check ranges, types, formats, lengths |
| Routine parameters | Check if from untrusted source | Validate or document assumptions |
| Internal invariants | Assert they hold | Assert postconditions, state assumptions |
| Null/None | Check explicitly | if value is None: raise ValueError() |
| Empty collections | Decide if valid or error | Empty list error or return default? |
| Type mismatches | Check with isinstance | if not isinstance(score, (int, float)) |
| Range violations | Check bounds | if score < 0 or score > 100 |
| Invalid formats | Use regex/validators | Email, phone, URLs |
| Security risks | Validate aggressively | SQL injection, buffer overflow, path traversal |

Validation Checklist

Before implementing any function receiving external data:

1. Identify all inputs

  • What data comes from outside my control?
  • Which parameters could be bad?
  • What are the data sources? (user, API, DB, file, network)

2. Document constraints

  • What are valid ranges? (0-100, positive only, etc.)
  • What are valid types? (int, float, string)
  • What are valid formats? (email, phone, date)
  • What are valid lengths? (string max, array min/max)
  • Are nulls allowed?
  • Are empties allowed?

3. Think "what could go wrong?"

  • Wrong type passed
  • Null/None passed
  • Empty collection passed
  • Negative where positive expected
  • Out of range values
  • Invalid format (malformed email, etc.)
  • Security attacks (injection, overflow)

4. Implement validation

  • Check each constraint explicitly
  • Use error handling for expected problems
  • Use assertions for programmer errors
  • Provide clear error messages
  • Document assumptions in assertions

5. Decide error response

  • Return neutral value? (0, empty string, None)
  • Raise exception with clear message?
  • Log and continue?
  • Substitute closest valid value?
  • Shut down? (safety-critical)
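
The five steps can be walked through on a small case. The sketch below assumes a port number read as text from a configuration file; parse_port, MIN_PORT and MAX_PORT are illustrative names.

```python
MIN_PORT = 1
MAX_PORT = 65535


def parse_port(raw: str) -> int:
    # Step 1: `raw` comes from a config file, so it is outside our control.
    # Step 2: constraints are a string holding a base-10 integer in [1, 65535].
    # Steps 3 and 4: check each thing that could go wrong, explicitly.
    if not isinstance(raw, str):
        raise TypeError(f"port must be a string, got {type(raw).__name__}")

    text = raw.strip()
    if not text:
        raise ValueError("port cannot be empty")

    # isdigit() alone accepts some non-ASCII digits, so require ASCII as well.
    if not text.isascii() or not text.isdigit():
        raise ValueError(f"port must be a whole number, got {raw!r}")

    port = int(text)
    if not MIN_PORT <= port <= MAX_PORT:
        raise ValueError(f"port {port} out of range [{MIN_PORT}, {MAX_PORT}]")

    # Step 5: the error response chosen here is "raise with a clear message".
    return port


print(parse_port(" 443 "))  # 443
```

A lenient variant could return a default port instead of raising; the point of step 5 is that the choice is made deliberately.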

Robustness vs Correctness

Consciously choose based on domain:

Correctness (Never Return Wrong Answer)

Prefer when:

  • Safety-critical (medical, aviation, financial)
  • Security-critical
  • Data integrity critical
  • Wrong result is worse than no result

Strategy: Validate aggressively, fail fast with errors

def calculate_radiation_dosage(params):
    # Medical system: wrong dosage could kill patient
    # Better to refuse than to guess
    if not all_params_valid(params):
        raise ValueError("Cannot calculate dosage with invalid parameters")
    # If ANY doubt, raise error

Robustness (Keep Operating)

Prefer when:

  • Consumer applications
  • Non-critical features
  • User convenience matters
  • Some result better than crash

Strategy: Substitute reasonable values, log issues, continue

def get_user_theme_color(color_code):
    # UI preference: wrong color annoying but not critical
    # Better to show default than crash
    if not is_valid_color(color_code):
        logger.warning(f"Invalid color code {color_code}, using default")
        return DEFAULT_COLOR
    return color_code

Make this choice explicit in your design. Don't just fall into one approach without thinking.
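
One way to make the choice visible is to share a single validity check and expose the policy in the function name. This is a sketch under assumed names (is_valid_volume, volume_strict, volume_lenient, DEFAULT_VOLUME); none of them come from the skill itself.

```python
import logging

logger = logging.getLogger(__name__)

DEFAULT_VOLUME = 50  # hypothetical fallback value


def is_valid_volume(value: object) -> bool:
    """A volume is an int (not a bool) between 0 and 100 inclusive."""
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and 0 <= value <= 100
    )


def volume_strict(value: object) -> int:
    """Correctness policy: refuse to guess, raise on bad input."""
    if not is_valid_volume(value):
        raise ValueError(f"volume must be an int in [0, 100], got {value!r}")
    return value


def volume_lenient(value: object) -> int:
    """Robustness policy: log the problem and fall back to a default."""
    if not is_valid_volume(value):
        logger.warning("Invalid volume %r, using default %d", value, DEFAULT_VOLUME)
        return DEFAULT_VOLUME
    return value


print(volume_lenient(250))  # 50
print(volume_strict(30))    # 30
```

The validity rule lives in one place, so the two policies cannot drift apart.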

Common Input Validation Patterns

Pattern 1: Validate Numeric Ranges

def process_temperature(temp_celsius: float) -> float:
    # Range validation
    if not isinstance(temp_celsius, (int, float)):
        raise TypeError(f"Temperature must be numeric, got {type(temp_celsius)}")

    if temp_celsius < -273.15:  # Absolute zero
        raise ValueError(f"Temperature cannot be below absolute zero: {temp_celsius}")

    if temp_celsius > 1000:  # Sanity check
        raise ValueError(f"Temperature seems unrealistic: {temp_celsius}")

    return temp_celsius + 273.15  # Convert to Kelvin

Pattern 2: Validate String Formats

import re

def send_email(email_address: str) -> None:
    # Format validation
    if not email_address or not isinstance(email_address, str):
        raise ValueError("Email address required")

    email_address = email_address.strip()

    if not re.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$', email_address):
        raise ValueError(f"Invalid email format: {email_address}")

    if len(email_address) > 254:  # RFC 5321 limit
        raise ValueError("Email address too long")

    # Proceed with valid email
    ...

Pattern 3: Validate Collections

def process_batch(items: list) -> None:
    # Collection validation
    if items is None:
        raise ValueError("items cannot be None")

    if not isinstance(items, list):
        raise TypeError(f"items must be a list, got {type(items)}")

    if not items:
        raise ValueError("items list cannot be empty")

    if len(items) > 1000:  # Sanity check
        raise ValueError(f"Batch too large: {len(items)} items (max 1000)")

    for i, item in enumerate(items):
        if item is None:
            raise ValueError(f"Item {i} cannot be None")
        # Validate each item...

Pattern 4: Validate Required Fields

def create_user(data: dict) -> None:
    # Required fields validation
    required_fields = ['username', 'email', 'password']

    for field in required_fields:
        if field not in data:
            raise ValueError(f"Missing required field: {field}")

        if not data[field] or not isinstance(data[field], str):
            raise ValueError(f"Field '{field}' must be non-empty string")

        if not data[field].strip():
            raise ValueError(f"Field '{field}' cannot be whitespace only")

Pattern 5: Preconditions and Postconditions

def withdraw_money(account_id: str, amount: float) -> float:
    # Preconditions (assertions for internal invariants)
    assert account_id, "account_id should never be empty"
    assert amount > 0, "amount should be positive (checked by caller)"

    # Validation (error handling for external data)
    balance = get_balance(account_id)

    if balance < amount:
        raise ValueError(f"Insufficient funds: balance {balance}, requested {amount}")

    new_balance = balance - amount

    # Postcondition (assertion for internal invariant)
    assert new_balance >= 0, "Balance should never be negative"
    assert new_balance == balance - amount, "Math error in withdrawal"

    update_balance(account_id, new_balance)
    return new_balance

Security Validation

Especially check for:

  • SQL Injection: Validate/sanitize database inputs, use parameterized queries
  • Command Injection: Never pass user input directly to system calls
  • Path Traversal: Resolve file paths and confirm they stay inside the allowed directory; checking for ../ alone is not enough
  • Buffer Overflow: Check string/array lengths against limits
  • Integer Overflow: Validate arithmetic won't overflow
  • XSS/HTML Injection: Sanitize user content before display
  • XML/JSON Injection: Validate structure and content

Rule: Be especially paranoid with anything that could attack your system.
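
Two of these checks are sketched below using only the Python standard library: a parameterized query with sqlite3, and a path containment check with pathlib. The table layout and the helper names find_user and resolve_inside are assumptions made for the example.

```python
import sqlite3
from pathlib import Path
from typing import Optional


def find_user(conn: sqlite3.Connection, username: str) -> Optional[tuple]:
    """Look up a user with a parameterized query; the value is never spliced into SQL."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchone()


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir and reject anything that escapes it."""
    base = base_dir.resolve()
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)  # raises ValueError if candidate is outside base
    except ValueError:
        raise ValueError(f"Path escapes base directory: {user_path!r}") from None
    return candidate


# Demo with an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

print(find_user(conn, "alice"))        # (1, 'alice')
print(find_user(conn, "' OR '1'='1"))  # None: the input is treated as data
```

Resolving before comparing means inputs such as "../etc/passwd" or an absolute path are rejected even when they contain no obvious marker.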

Common Mistakes

❌ Only checking for null/empty:

if not scores:
    return 0.0
return sum(scores) / len(scores)  # Doesn't check constraints!

✅ Check ALL constraints:

if not scores:
    raise ValueError("Cannot calculate average of empty list")
for score in scores:
    if score < 0 or score > 100:
        raise ValueError(f"Score out of range: {score}")
return sum(scores) / len(scores)

❌ Assuming types are correct:

def add(a, b):
    return a + b  # What if a or b are strings? None? Lists?

✅ Validate types:

def add(a: float, b: float) -> float:
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError(f"Arguments must be numeric: {type(a)}, {type(b)}")
    return a + b

❌ Silent failure or wrong default:

if not scores:
    return 0.0  # Is 0.0 the right answer for empty? Or should it error?

✅ Explicit decision:

if not scores:
    raise ValueError("Cannot calculate average of empty list")
    # OR if 0.0 is intentional:
    # return 0.0  # Intentionally return 0 for empty list per business rules

❌ No error message context:

if age < 18:
    raise ValueError("Invalid age")  # Which age? What was the value?

✅ Informative error messages:

if age < 18:
    raise ValueError(f"Age must be 18+, got {age}")

Red Flags - STOP and Add Validation

Before implementing:

  • Haven't thought "what could go wrong?"
  • No validation code written yet
  • Only checking null/empty
  • Assuming inputs are valid
  • "Validation happens elsewhere" (maybe, but check anyway)

After implementing:

  • Function accepts any input without checking
  • No assertions documenting assumptions
  • Spec mentions constraints but code doesn't enforce them
  • Could pass wrong type and function wouldn't catch it
  • Security review would fail

All of these mean: Add comprehensive validation now.

Common Rationalizations

| Excuse | Reality |
|---|---|
| "Inputs will always be valid" | They won't. Users make mistakes, APIs change, bugs happen. |
| "Validation happens elsewhere" | Defense in depth. Check at every layer. |
| "It's just internal code" | Today's internal is tomorrow's API. Validate anyway. |
| "Adds too much code" | 5 lines of validation prevents hours of debugging. |
| "Slows down the code" | Correctness > speed. Optimize later if needed. |
| "Trust the caller" | Trust but verify. Catch bugs at boundaries. |
| "Users know what they're doing" | Users make mistakes. Software should help, not crash. |
| "I'll add validation later" | Later never comes. Add it now. |

Three Levels of Validation

Level 1: Type Validation

Check data is the expected type:

if not isinstance(value, expected_type):
    raise TypeError(f"Expected {expected_type}, got {type(value)}")

Level 2: Constraint Validation

Check data meets business rules:

if value < min_value or value > max_value:
    raise ValueError(f"Value {value} out of range [{min_value}, {max_value}]")

Level 3: Format/Semantic Validation

Check data is semantically valid:

if not re.match(email_pattern, email):
    raise ValueError(f"Invalid email format: {email}")

Apply all three levels to external data.
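
The three levels might be combined in one function as follows. This is a sketch for a username field; the length limits and the pattern are hypothetical rules chosen for the example.

```python
import re

# Hypothetical rules: lowercase letter first, then letters, digits or underscores.
USERNAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")
MIN_LENGTH = 3
MAX_LENGTH = 20


def validate_username(value: object) -> str:
    # Level 1: type validation
    if not isinstance(value, str):
        raise TypeError(f"Expected str, got {type(value).__name__}")

    # Level 2: constraint validation
    if not MIN_LENGTH <= len(value) <= MAX_LENGTH:
        raise ValueError(
            f"Username length {len(value)} out of range [{MIN_LENGTH}, {MAX_LENGTH}]"
        )

    # Level 3: format validation (fullmatch requires the whole string to match)
    if USERNAME_PATTERN.fullmatch(value) is None:
        raise ValueError(f"Invalid username format: {value!r}")

    return value


print(validate_username("ada_99"))  # ada_99
```

Running the checks in this order keeps each error message specific: a wrong type is never reported as a format problem.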

Assertions vs Error Handling

Use Assertions When:

  • Documenting internal invariants
  • Checking preconditions from trusted callers
  • Verifying postconditions you guarantee
  • Catching programmer errors (bugs in YOUR code)
  • Development/debugging (typically disabled in production; in Python, skipped under -O)

def withdraw(self, amount):
    assert self.balance >= 0, "Balance invariant violated"  # Should never happen
    assert amount > 0, "Caller should have checked amount"   # Caller's bug

Use Error Handling When:

  • Validating external/untrusted data
  • Handling expected anomalies
  • User input could be wrong
  • API might return bad data
  • Production code must handle gracefully

def withdraw(self, amount):
    if amount <= 0:  # User might request $0 or negative
        raise ValueError(f"Withdrawal amount must be positive, got {amount}")

    if amount > self.balance:  # User might request too much
        raise ValueError(f"Insufficient funds: {amount} requested, {self.balance} available")

Rule: Assertions for bugs, error handling for anomalies.
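
Both mechanisms can live in the same class. In the sketch below the public method validates with exceptions, and the private helper asserts what the public method has already guaranteed. The Account class and its use of integer cents are assumptions made for the example.

```python
class Account:
    """Toy account holding its balance in integer cents (an assumption of this sketch)."""

    def __init__(self, balance_cents: int) -> None:
        self._check_cents(balance_cents, "balance_cents")
        if balance_cents < 0:
            raise ValueError(f"balance_cents cannot be negative, got {balance_cents}")
        self._balance_cents = balance_cents

    @staticmethod
    def _check_cents(value: object, name: str) -> None:
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int, got {type(value).__name__}")

    def withdraw(self, amount_cents: int) -> int:
        """Public entry point: the amount is untrusted, so use error handling."""
        self._check_cents(amount_cents, "amount_cents")
        if amount_cents <= 0:
            raise ValueError(f"Withdrawal must be positive, got {amount_cents}")
        if amount_cents > self._balance_cents:
            raise ValueError(
                f"Insufficient funds: {amount_cents} requested, "
                f"{self._balance_cents} available"
            )
        self._debit(amount_cents)
        return self._balance_cents

    def _debit(self, amount_cents: int) -> None:
        """Internal helper: a violated precondition here is a bug in this class."""
        assert 0 < amount_cents <= self._balance_cents, "withdraw() must validate first"
        self._balance_cents -= amount_cents
        assert self._balance_cents >= 0, "Balance invariant violated"


account = Account(10_000)
print(account.withdraw(2_500))  # 7500
```

If the assertions are stripped, the public checks still protect the account; if a future change bypasses the public checks, the assertions flag it during development.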

Validation Strategy by Source

| Data Source | Trust Level | Validation Approach |
|---|---|---|
| User input | Untrusted | Validate everything aggressively |
| External API | Untrusted | Validate responses, handle failures |
| Database | Semi-trusted | Check for corruption, missing data |
| Config file | Semi-trusted | Validate format and values |
| Internal parameters | Trusted | Use assertions to document assumptions |
| Your own methods | Trusted | Assertions for preconditions |
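
For the untrusted "External API" row, validation might look like the sketch below. It assumes a hypothetical endpoint that returns a JSON object with an integer id and a string name; the function name parse_user_response is also illustrative.

```python
import json


def parse_user_response(body: str) -> dict:
    """Validate the JSON body of a hypothetical external API response."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Response is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise ValueError(f"Expected a JSON object, got {type(payload).__name__}")

    user_id = payload.get("id")
    # JSON true/false decode to bool, which is a subclass of int, so exclude it.
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        raise ValueError(f"'id' must be a positive integer, got {user_id!r}")

    name = payload.get("name")
    if not isinstance(name, str) or not name.strip():
        raise ValueError(f"'name' must be a non-empty string, got {name!r}")

    # Return only the fields that were validated.
    return {"id": user_id, "name": name.strip()}


print(parse_user_response('{"id": 7, "name": " Ada "}'))  # {'id': 7, 'name': 'Ada'}
```

Returning a fresh dictionary with only the checked fields keeps unvalidated extras from leaking further into the program.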

Common Validation Scenarios

Validating Numeric Input

import math

# Check type, range, special values
if not isinstance(value, (int, float)):
    raise TypeError(f"Expected number, got {type(value)}")

if math.isnan(value) or math.isinf(value):
    raise ValueError(f"Value cannot be NaN or Inf: {value}")

if value < minimum or value > maximum:
    raise ValueError(f"Value {value} out of range [{minimum}, {maximum}]")
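
One Python-specific gap in the isinstance check above: bool is a subclass of int, so True and False pass as numbers. A sketch of a reusable helper that closes the gap follows; validate_number is a hypothetical name.

```python
import math


def validate_number(value: object, minimum: float, maximum: float) -> float:
    """Return value unchanged if it is a finite number in [minimum, maximum]."""
    # Reject bool first: isinstance(True, int) is True in Python.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"Expected number, got {type(value).__name__}")

    # Only floats can be NaN or infinite; ints of any size skip this check.
    if isinstance(value, float) and not math.isfinite(value):
        raise ValueError(f"Value cannot be NaN or Inf: {value}")

    if not minimum <= value <= maximum:
        raise ValueError(f"Value {value} out of range [{minimum}, {maximum}]")

    return value


print(isinstance(True, int))        # True
print(validate_number(42, 0, 100))  # 42
```

Whether booleans should count as numbers is a domain decision; the helper simply makes that decision explicit.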

Validating String Input

# Check type, emptiness, length, format
if not isinstance(value, str):
    raise TypeError(f"Expected string, got {type(value)}")

value = value.strip()

if not value:
    raise ValueError("Value cannot be empty or whitespace only")

if len(value) > max_length:
    raise ValueError(f"Value too long: {len(value)} chars (max {max_length})")

if not pattern.fullmatch(value):  # pattern: a compiled regex; fullmatch requires the whole string to match
    raise ValueError(f"Value doesn't match required format: {value}")

Validating Collections

# Check type, emptiness, size, element validity
if not isinstance(items, list):
    raise TypeError(f"Expected list, got {type(items)}")

if not items:
    raise ValueError("List cannot be empty")

if len(items) > max_items:
    raise ValueError(f"Too many items: {len(items)} (max {max_items})")

for i, item in enumerate(items):
    if item is None:
        raise ValueError(f"Item {i} cannot be None")
    # Validate each element...

Error Response Strategies

Choose consciously based on domain:

1. Return Neutral Value

When: Non-critical, user convenience matters

def get_color_preference(color_code):
    if not is_valid_color(color_code):
        return DEFAULT_COLOR  # Neutral, harmless
    return color_code

2. Substitute Valid Value

When: Can safely substitute without data loss

def clamp_temperature(temp):
    # Thermometer calibrated 0-100°C
    if temp < 0:
        return 0  # Closest valid value
    if temp > 100:
        return 100
    return temp

3. Raise Exception

When: Caller must handle the error

def charge_payment(amount):
    if amount <= 0:
        raise ValueError(f"Payment amount must be positive: {amount}")
    # Process payment

4. Log and Continue

When: Error isn't critical, want visibility

def sync_data(data):
    if not is_valid(data):
        logger.warning(f"Invalid data encountered, skipping: {data}")
        return
    # Process valid data

5. Shut Down

When: Safety-critical, wrong result is dangerous

class CriticalError(Exception):
    """Raised when continuing to operate would be unsafe."""

def control_reactor(params):
    if not params_within_safe_limits(params):
        emergency_shutdown()
        raise CriticalError("Unsafe parameters detected, reactor shut down")

Verification Before Shipping

Before marking validation complete:

  • Identified ALL external data sources
  • Validated ALL constraints from spec
  • Used assertions for internal invariants
  • Used error handling for external anomalies
  • Provided clear, informative error messages
  • Consciously chose: robustness vs correctness
  • Tested with invalid inputs (not just valid ones)
  • Security-reviewed for injection/overflow/attacks
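
The "tested with invalid inputs" item might be covered with a few unittest cases like the ones sketched here. The average_score function is a compact stand-in written for this example, not the skill's reference implementation.

```python
import unittest


def average_score(scores):
    """Average of scores, each required to be a number in [0, 100]."""
    if not isinstance(scores, list):
        raise TypeError(f"scores must be a list, got {type(scores).__name__}")
    if not scores:
        raise ValueError("Cannot calculate average of empty score list")
    for i, score in enumerate(scores):
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            raise TypeError(f"Score {i} is not a number: {score!r}")
        if not 0 <= score <= 100:  # also rejects NaN
            raise ValueError(f"Score {i} out of range [0-100]: {score}")
    return sum(scores) / len(scores)


class AverageScoreInvalidInputTests(unittest.TestCase):
    def test_rejects_none(self):
        with self.assertRaises(TypeError):
            average_score(None)

    def test_rejects_empty_list(self):
        with self.assertRaises(ValueError):
            average_score([])

    def test_rejects_out_of_range_score(self):
        with self.assertRaises(ValueError):
            average_score([50, 101])

    def test_rejects_non_numeric_score(self):
        with self.assertRaises(TypeError):
            average_score([50, "90"])

    def test_accepts_valid_scores(self):
        self.assertEqual(average_score([80, 90]), 85.0)
```

Saved in a test module, these cases run with python -m unittest. Each invalid-input test pins down which exception type the caller can rely on.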

Real-World Impact

From Code Complete and baseline testing:

Baseline test results:

  • Agent only checked empty list (most basic edge case)
  • Ignored spec constraint (scores must be 0-100)
  • No type checking, no assertions, no comprehensive validation
  • Grade: D- for defensive programming

With validation:

  • Catches bad data at boundary (not deep in call stack)
  • Clear error messages aid debugging
  • Assertions catch programmer errors early
  • Production code is robust and secure

Industry impact:

  • Security vulnerabilities often stem from missing input validation
  • Defensive programming prevents "impossible" errors
  • Validating early is cheaper than debugging later

Integration with Other Skills

For multi-layer validation: See skills/debugging/defense-in-depth for validating at every layer data passes through

For systematic debugging: If validation fails in production, see skills/debugging/systematic-debugging for root cause analysis