| name | debugging-and-profiling |
| description | Debugging fundamentals, debugpy/VS Code, pdb, CPU profiling, memory profiling, profiling async code, performance optimization, systematic diagnosis, common bottlenecks |
Debugging and Profiling
Overview
Core Principle: Profile before optimizing. Humans are terrible at guessing where code is slow. Always measure before making changes.
Python debugging and profiling enables systematic problem diagnosis and performance optimization. Use debugpy/pdb for step-through debugging, cProfile for CPU profiling, memory_profiler for memory analysis. The biggest mistake: optimizing code without profiling first—you'll likely optimize the wrong thing.
When to Use
Use this skill when:
- "Code is slow"
- "How to profile Python?"
- "Memory leak"
- "Debugging not working"
- "Find bottleneck"
- "Optimize performance"
- "Step through code"
- "Where is my code spending time?"
Don't use when:
- Setting up project (use project-structure-and-tooling)
- Already know what to optimize (but still profile to verify!)
- Algorithm selection (different skill domain)
Symptoms triggering this skill:
- Code runs slower than expected
- Memory usage growing over time
- Need to understand execution flow
- Performance degraded after changes
Debugging Fundamentals
Using debugpy with VS Code
# ✅ CORRECT: debugpy for remote debugging
import debugpy
# Allow VS Code to attach
debugpy.listen(5678)
print("Waiting for debugger to attach...")
debugpy.wait_for_client()
# Your code here
def process_data(data):
result = []
for item in data:
# Set breakpoint in VS Code on this line
transformed = transform(item)
result.append(transformed)
return result
# VS Code launch.json configuration:
"""
{
"version": "0.2.0",
"configurations": [
{
"name": "Python: Attach",
"type": "python",
"request": "attach",
"connect": {
"host": "localhost",
"port": 5678
}
}
]
}
"""
Using pdb (Python Debugger)
# ✅ CORRECT: pdb for interactive debugging
import pdb
def buggy_function(data):
result = []
for i, item in enumerate(data):
# Drop into debugger
pdb.set_trace() # Or: breakpoint() in Python 3.7+
processed = item * 2
result.append(processed)
return result
# pdb commands:
# n (next): Execute next line
# s (step): Step into function
# c (continue): Continue execution
# p variable: Print variable
# pp variable: Pretty print variable
# l (list): Show current location in code
# w (where): Show stack trace
# q (quit): Quit debugger
Conditional Breakpoints
# ❌ WRONG: Breaking on every iteration
def process_items(items):
for item in items:
pdb.set_trace() # Breaks 10000 times!
process(item)
# ✅ CORRECT: Conditional breakpoint
def process_items(items):
for i, item in enumerate(items):
if i == 5000: # Only break on specific iteration
breakpoint()
process(item)
# ✅ BETTER: Break on a data-dependent condition
def process_items(items):
for item in items:
if item.value < 0: # Break only when problematic
breakpoint()
process(item)
Post-Mortem Debugging
# ✅ CORRECT: Debug after exception
import pdb
def main():
try:
# Code that might raise exception
result = risky_operation()
except Exception:
# Drop into debugger at exception point
pdb.post_mortem()
# ✅ CORRECT: Auto post-mortem for unhandled exceptions
import sys
def custom_excepthook(exc_type, exc_value, tb):  # Avoid shadowing built-in names
    pdb.post_mortem(tb)
sys.excepthook = custom_excepthook
# Now unhandled exceptions drop into pdb automatically
Why this matters: Breakpoints let you inspect state at the exact point of failure, conditional breakpoints avoid noise, and post-mortem debugging lets you examine a crash after the fact.
CPU Profiling
cProfile for Function-Level Profiling
import cProfile
import pstats
# ❌ WRONG: Guessing which function is slow
def slow_program():
# "I think this loop is the problem..."
for i in range(1000):
process_data(i)
# ✅ CORRECT: Profile to find actual bottleneck
def slow_program():
for i in range(1000):
process_data(i)
# Profile the function
cProfile.run('slow_program()', 'profile_stats')
# Analyze results
stats = pstats.Stats('profile_stats')
stats.strip_dirs()
stats.sort_stats('cumulative')
stats.print_stats(20) # Top 20 functions by cumulative time
# ✅ CORRECT: Profile with context manager
from contextlib import contextmanager
import cProfile
import pstats
@contextmanager
def profiled():
pr = cProfile.Profile()
pr.enable()
yield
pr.disable()
stats = pstats.Stats(pr)
stats.strip_dirs()
stats.sort_stats('cumulative')
stats.print_stats(20)
# Usage
with profiled():
slow_program()
Profiling Specific Code Blocks
# ✅ CORRECT: Profile specific section
import cProfile
pr = cProfile.Profile()
# Normal code
setup_data()
# Profile this section
pr.enable()
expensive_operation()
pr.disable()
# More normal code
cleanup()
# View results
pr.print_stats(sort='cumulative')
Line-Level Profiling with line_profiler
# Install: pip install line_profiler
# ✅ CORRECT: Line-by-line profiling
from line_profiler import LineProfiler
@profile  # Injected by kernprof at runtime; remove it (it raises NameError) when running normally
def slow_function():
    total = 0
    for i in range(10000):
        total += i ** 2
    return total
# Run with kernprof:
# kernprof -l -v script.py
# Or programmatically (the @profile decorator is not needed here):
lp = LineProfiler()
lp.add_function(slow_function)
lp.enable()
slow_function()
lp.disable()
lp.print_stats()
# Output shows time spent per line:
# Line # Hits Time Per Hit % Time Line Contents
# ==============================================================
# 1 def slow_function():
# 2 1 2.0 2.0 0.0 total = 0
# 3 10001 15234.0 1.5 20.0 for i in range(10000):
# 4 10000 60123.0 6.0 80.0 total += i ** 2
# 5 1 1.0 1.0 0.0 return total
Why this matters: cProfile shows which functions are slow; line_profiler shows which lines within those functions are slow. Both are essential for targeted optimization.
Visualizing Profiles with SnakeViz
# Install: pip install snakeviz
# Profile code
python -m cProfile -o program.prof script.py
# Visualize
snakeviz program.prof
# Opens browser with interactive visualization:
# - Sunburst chart showing call hierarchy
# - Icicle chart showing time distribution
# - Click functions to zoom in
Memory Profiling
Memory Usage with memory_profiler
# Install: pip install memory_profiler
from memory_profiler import profile
# ✅ CORRECT: Track memory usage per line
@profile
def memory_hungry_function():
# Line-by-line memory usage shown
    big_list = [i for i in range(1000000)]        # List of one million ints
    big_dict = {i: i**2 for i in range(1000000)}  # Dict of one million entries
return len(big_list), len(big_dict)
# Run with:
# python -m memory_profiler script.py
# Output:
# Line # Mem usage Increment Line Contents
# ================================================
# 3 38.3 MiB 38.3 MiB @profile
# 4 def memory_hungry_function():
# 5 45.2 MiB 6.9 MiB big_list = [i for i in range(1000000)]
# 6 83.1 MiB 37.9 MiB big_dict = {i: i**2 for i in range(1000000)}
# 7 83.1 MiB 0.0 MiB return len(big_list), len(big_dict)
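For memory over the whole run rather than per line, memory_profiler also ships the mprof command; a minimal sketch of that workflow, with script.py as a placeholder:
# Record and plot memory usage over time
mprof run script.py   # Writes samples to an mprofile_*.dat file
mprof plot            # Plots the recorded samples (requires matplotlib)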
Finding Memory Leaks
# ✅ CORRECT: Detect memory leaks with tracemalloc
import tracemalloc
# Start tracing
tracemalloc.start()
# Take snapshot before
snapshot1 = tracemalloc.take_snapshot()
# Run code that might leak
problematic_function()
# Take snapshot after
snapshot2 = tracemalloc.take_snapshot()
# Compare snapshots
top_stats = snapshot2.compare_to(snapshot1, 'lineno')
print("Top 10 memory increases:")
for stat in top_stats[:10]:
print(stat)
tracemalloc.stop()
# ✅ CORRECT: Track specific objects
import gc
import sys
def find_memory_leak():
# Force garbage collection
gc.collect()
# Track objects before
before = len(gc.get_objects())
# Run potentially leaky code
for _ in range(100):
leaky_operation()
# Force GC again
gc.collect()
# Track objects after
after = len(gc.get_objects())
if after > before:
print(f"Potential leak: {after - before} objects not collected")
# Find what's keeping objects alive
for obj in gc.get_objects():
if isinstance(obj, MyClass): # Suspect class
print(f"Found {type(obj)}: {sys.getrefcount(obj)} references")
print(gc.get_referrers(obj))
Profiling Memory with objgraph
# Install: pip install objgraph
import objgraph
# ✅ CORRECT: Find most common objects
def analyze_memory():
objgraph.show_most_common_types()
# Output:
# dict 12453
# function 8234
# list 6789
# ...
# ✅ CORRECT: Track object growth
objgraph.show_growth()
potentially_leaky_function()
objgraph.show_growth() # Shows objects that increased
# ✅ CORRECT: Visualize object references
import objgraph
objgraph.show_refs([my_object], filename='refs.png')
# Creates graph showing what references my_object
Why this matters: Memory leaks cause gradual performance degradation. tracemalloc and memory_profiler help find exactly where memory is allocated.
Profiling Async Code
Profiling Async Functions
import asyncio
import cProfile
import pstats
# ❌ WRONG: cProfile doesn't work well with async
async def slow_async():
await asyncio.sleep(1)
await process_data()
cProfile.run('asyncio.run(slow_async())') # Misleading results
# ✅ CORRECT: Use yappi for async profiling
# Install: pip install yappi
import yappi
async def slow_async():
await asyncio.sleep(1)
await process_data()
yappi.set_clock_type("wall") # Use wall time, not CPU time
yappi.start()
asyncio.run(slow_async())
yappi.stop()
# Print stats
stats = yappi.get_func_stats()
stats.sort("totaltime", "desc")
stats.print_all()
# ✅ CORRECT: Profile coroutines specifically
stats = yappi.get_func_stats(filter_callback=lambda x: 'coroutine' in x.name)
stats.print_all()
Detecting Blocking Code in Async
# ✅ CORRECT: Detect event loop blocking
import asyncio
import time
class LoopMonitor:
def __init__(self, threshold: float = 0.1):
self.threshold = threshold
async def monitor(self):
while True:
start = time.monotonic()
await asyncio.sleep(0.01) # Very short sleep
elapsed = time.monotonic() - start
if elapsed > self.threshold:
print(f"WARNING: Event loop blocked for {elapsed:.3f}s")
async def main():
# Start monitor
monitor = LoopMonitor(threshold=0.1)
monitor_task = asyncio.create_task(monitor.monitor())
# Run your async code
await your_async_function()
monitor_task.cancel()
# ✅ CORRECT: Use asyncio debug mode
asyncio.run(main(), debug=True)
# Warns about slow callbacks (>100ms)
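Once the monitor or debug mode points at a blocking call, the usual fix is to push that work off the event loop. A minimal sketch, assuming blocking_io is the synchronous function you identified (asyncio.to_thread requires Python 3.9+):
# ✅ CORRECT: Move blocking work off the event loop
import asyncio
def blocking_io(path):  # Hypothetical synchronous, blocking function
    with open(path, "rb") as f:
        return f.read()
async def handler(path):
    # Runs in a worker thread, so the event loop keeps servicing other tasks
    data = await asyncio.to_thread(blocking_io, path)
    return len(data)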
Performance Optimization Strategies
Optimization Workflow
# ✅ CORRECT: Systematic optimization approach
# 1. Profile to find bottleneck
import cProfile
cProfile.run('main()', 'profile_stats')
# 2. Analyze results
stats = pstats.Stats('profile_stats')
stats.sort_stats('cumulative')
stats.print_stats(10) # Focus on top 10
# 3. Identify specific slow function
def slow_function(data):
# Original implementation
result = []
for item in data:
if is_valid(item):
result.append(transform(item))
return result
# 4. Create benchmark
import timeit
data = create_test_data(10000)
def benchmark(func):
    time_taken = timeit.timeit(lambda: func(data), number=100)
    print(f"Average time: {time_taken / 100:.4f}s")
benchmark(slow_function)  # Baseline: 0.1234s
# 5. Optimize
def optimized_function(data):
# Use list comprehension (faster)
return [transform(item) for item in data if is_valid(item)]
# 6. Benchmark again
benchmark(optimized_function)  # 0.0789s - 36% faster!
# 7. Verify correctness
assert slow_function(data) == optimized_function(data)
# 8. Re-profile entire program to verify improvement
cProfile.run('main()', 'profile_stats_optimized')
Why this matters: Without profiling, you might optimize code that takes 1% of runtime, ignoring the 90% bottleneck. Always measure.
Common Optimizations
import re
# ❌ WRONG: Repeated expensive operations
def process_items(items):
for item in items:
# Regex compiled every iteration!
pattern = re.compile(r'\d+')
match = pattern.search(item)
# ✅ CORRECT: Move expensive operations outside loop
def process_items(items):
pattern = re.compile(r'\d+') # Compile once
for item in items:
match = pattern.search(item)
# ❌ WRONG: Growing list with repeated concatenation
def build_large_list():
result = []
for i in range(100000):
result = result + [i] # Creates new list each time! O(n²)
# ✅ CORRECT: Use append
def build_large_list():
result = []
for i in range(100000):
result.append(i) # O(n)
# ❌ WRONG: Checking membership in list
def filter_items(items, blacklist):
return [item for item in items if item not in blacklist]
# O(n * m) if blacklist is list
# ✅ CORRECT: Use set for membership checks
def filter_items(items, blacklist):
blacklist_set = set(blacklist) # O(m)
return [item for item in items if item not in blacklist_set]
# O(n) for iteration + O(1) per lookup = O(n)
Caching Results
from functools import lru_cache
# ❌ WRONG: Recomputing expensive results
def fibonacci(n):
if n < 2:
return n
return fibonacci(n-1) + fibonacci(n-2)
# O(2^n) - recalculates same values repeatedly
# ✅ CORRECT: Cache results
@lru_cache(maxsize=None)
def fibonacci(n):
if n < 2:
return n
return fibonacci(n-1) + fibonacci(n-2)
# O(n) - each value computed once
# ✅ CORRECT: Custom caching for unhashable arguments
import hashlib
from functools import wraps
def cache_dataframe_results(func):
cache = {}
@wraps(func)
def wrapper(df):
# Use hash of dataframe content as key
key = hashlib.md5(df.to_csv(index=False).encode()).hexdigest()
if key not in cache:
cache[key] = func(df)
return cache[key]
return wrapper
@cache_dataframe_results
def expensive_dataframe_operation(df):
# Complex computation
return df.groupby('category').agg({'value': 'sum'})
Systematic Diagnosis
Performance Degradation Diagnosis
# ✅ CORRECT: Diagnose performance regression
import cProfile
import pstats
def diagnose_slowdown():
"""Compare current vs baseline performance."""
# Profile current code
cProfile.run('main()', 'current_profile.prof')
# Load baseline profile (from git history or previous run)
# git show main:profile.prof > baseline_profile.prof
current = pstats.Stats('current_profile.prof')
baseline = pstats.Stats('baseline_profile.prof')
print("=== CURRENT ===")
current.sort_stats('cumulative')
current.print_stats(10)
print("\n=== BASELINE ===")
baseline.sort_stats('cumulative')
baseline.print_stats(10)
# Look for functions that got slower
# Compare cumulative times
Memory Leak Diagnosis
# ✅ CORRECT: Systematic memory leak detection
import tracemalloc
import gc
def diagnose_memory_leak():
"""Run function multiple times and check memory growth."""
gc.collect()
tracemalloc.start()
# Baseline
snapshot1 = tracemalloc.take_snapshot()
# Run 100 times
for _ in range(100):
potentially_leaky_function()
gc.collect()
# Check memory
snapshot2 = tracemalloc.take_snapshot()
top_stats = snapshot2.compare_to(snapshot1, 'lineno')
print("Top 10 memory allocations:")
for stat in top_stats[:10]:
print(f"{stat.traceback}: +{stat.size_diff / 1024:.1f} KB")
tracemalloc.stop()
I/O vs CPU Bound Diagnosis
# ✅ CORRECT: Determine if I/O or CPU bound
import time
def diagnose_bottleneck():
    """Determine whether the program is I/O bound or CPU bound."""
    # Measure wall-clock time and CPU time over the same run
    start_wall = time.perf_counter()
    start_cpu = time.process_time()
    main()
    cpu_time = time.process_time() - start_cpu
    wall_time = time.perf_counter() - start_wall
    print(f"Wall time: {wall_time:.2f}s")
    print(f"CPU time: {cpu_time:.2f}s")
    if cpu_time / wall_time > 0.9:
        print("CPU bound - optimize computation")
        # Consider: Cython, NumPy, multiprocessing
    else:
        print("I/O bound - optimize I/O")
        # Consider: async/await, caching, batching
Common Bottlenecks and Solutions
String Concatenation
# ❌ WRONG: String concatenation in loop
def build_string(items):
result = ""
for item in items:
result += str(item) + "\n" # Creates new string each time
return result
# O(n²) time complexity
# ✅ CORRECT: Use join
def build_string(items):
return "\n".join(str(item) for item in items)
# O(n) time complexity
# Benchmark:
# 1000 items: 0.0015s (join) vs 0.0234s (concatenation) - 15x faster
# 10000 items: 0.015s (join) vs 2.341s (concatenation) - 156x faster
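Those figures are illustrative and vary by machine; a small timeit sketch for reproducing the comparison yourself, assuming the two versions above have been renamed build_string_concat and build_string_join so both can exist at once:
import timeit
items = list(range(10000))
concat_time = timeit.timeit(lambda: build_string_concat(items), number=100)
join_time = timeit.timeit(lambda: build_string_join(items), number=100)
print(f"concat: {concat_time:.3f}s  join: {join_time:.3f}s  speedup: {concat_time / join_time:.0f}x")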
List Comprehension vs Map/Filter
import timeit
# ✅ CORRECT: List comprehension (usually fastest)
def with_list_comp(data):
return [x * 2 for x in data if x > 0]
# ✅ CORRECT: Generator (memory efficient for large data)
def with_generator(data):
return (x * 2 for x in data if x > 0)
# Map/filter (sometimes faster for simple operations)
def with_map_filter(data):
return map(lambda x: x * 2, filter(lambda x: x > 0, data))
# Benchmark
data = list(range(1000000))
print(timeit.timeit(lambda: list(with_list_comp(data)), number=10))
print(timeit.timeit(lambda: list(with_generator(data)), number=10))
print(timeit.timeit(lambda: list(with_map_filter(data)), number=10))
# Results: List comprehension usually fastest for complex logic
# Generator best when you don't need all results at once
Dictionary Lookups vs List Searches
# ❌ WRONG: Searching in list
def find_users_list(user_ids, all_users_list):
results = []
for user_id in user_ids:
for user in all_users_list: # O(n) per lookup
if user['id'] == user_id:
results.append(user)
break
return results
# O(n * m) time complexity
# ✅ CORRECT: Use dictionary
def find_users_dict(user_ids, all_users_dict):
return [all_users_dict[uid] for uid in user_ids if uid in all_users_dict]
# O(n) time complexity
# Benchmark:
# 1000 lookups in 10000 items:
# List: 1.234s
# Dict: 0.001s - 1234x faster!
DataFrame Iteration Anti-Pattern
import pandas as pd
# ❌ WRONG: Iterating over DataFrame rows
def process_rows_iterrows(df):
results = []
for idx, row in df.iterrows(): # VERY SLOW
if row['value'] > 0:
results.append(row['value'] * 2)
return results
# ✅ CORRECT: Vectorized operations
def process_rows_vectorized(df):
mask = df['value'] > 0
return (df.loc[mask, 'value'] * 2).tolist()
# Benchmark with 100,000 rows:
# iterrows: 15.234s
# vectorized: 0.015s - 1000x faster!
Profiling Tools Comparison
When to Use Which Tool
| Tool | Use Case | Output |
|---|---|---|
| cProfile | Function-level CPU profiling | Which functions take the most time |
| line_profiler | Line-level CPU profiling | Which lines within a function are slow |
| memory_profiler | Line-level memory profiling | Memory usage per line |
| tracemalloc | Memory allocation tracking | Where memory allocated |
| yappi | Async/multithreaded profiling | Profile concurrent code |
| py-spy | Sampling profiler (no code changes) | Profile running processes |
| scalene | CPU+GPU+memory profiling | Comprehensive profiling |
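scalene, listed above, is driven from the command line much like py-spy and SnakeViz; the basic invocations below are standard, but check scalene --help for version-specific options.
# Install: pip install scalene
# Profile CPU, memory, and (where supported) GPU in one pass
scalene script.py
# Or as a module
python -m scalene script.py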
py-spy for Production Profiling
# Install: pip install py-spy
# Profile running process (no code changes needed!)
py-spy record -o profile.svg --pid 12345
# Profile for 60 seconds
py-spy record -o profile.svg --duration 60 -- python script.py
# Top-like view of running process
py-spy top --pid 12345
# Why use py-spy:
# - No code changes needed
# - Minimal overhead
# - Can attach to running process
# - Great for production debugging
Anti-Patterns
Premature Optimization
# ❌ WRONG: Optimizing before measuring
def process_data(data):
    # "Let me make this fast with complex caching..."
    # Hours spent optimizing a function that takes 0.1% of runtime
    ...
# ✅ CORRECT: Profile first
cProfile.run('main()', 'profile.prof')
# Oh, process_data only takes 0.1% of time
# The real bottleneck is database queries (90% of time)
# Optimize database queries instead!
Micro-Optimizations
# ❌ WRONG: Micro-optimizing at expense of readability
def calculate(x, y):
# "Using bit shift instead of multiply by 2 for speed!"
return (x << 1) + (y << 1)
# Saved: ~0.0000001 seconds per call
# Cost: Unreadable code
# ✅ CORRECT: Clear code first
def calculate(x, y):
    return 2 * x + 2 * y
# The savings are negligible next to ordinary interpreter overhead
# Only optimize if the profiler shows this is a bottleneck
Not Benchmarking Changes
# ❌ WRONG: Assuming optimization worked
def slow_function():
# Original code
pass
def optimized_function():
# "Optimized" code
pass
# Assume optimized_function is faster without measuring
# ✅ CORRECT: Benchmark before and after
import timeit
before = timeit.timeit(slow_function, number=1000)
after = timeit.timeit(optimized_function, number=1000)
print(f"Before: {before:.4f}s")
print(f"After: {after:.4f}s")
print(f"Speedup: {before/after:.2f}x")
# Verify correctness
assert slow_function() == optimized_function()
Decision Trees
What Tool to Use for Profiling?
What do I need to profile?
├─ CPU time
│ ├─ Function-level → cProfile
│ ├─ Line-level → line_profiler
│ └─ Async code → yappi
├─ Memory usage
│ ├─ Line-level → memory_profiler
│ ├─ Allocation tracking → tracemalloc
│ └─ Object types → objgraph
└─ Running process (no code changes) → py-spy
Optimization Strategy
Is code slow?
├─ Yes → Profile to find bottleneck
│ ├─ CPU bound → Profile with cProfile
│ │ └─ Optimize hot functions (vectorize, cache, algorithms)
│ └─ I/O bound → Profile with timing
│ └─ Use async/await, caching, batching
└─ No → Don't optimize (focus on features/correctness)
Memory Issue Diagnosis
Is memory usage high?
├─ Yes → Profile with memory_profiler
│ ├─ Growing over time → Memory leak
│ │ └─ Use tracemalloc to find leak
│ └─ High but stable → Large data structures
│ └─ Optimize data structures (generators, efficient types)
└─ No → Monitor but don't optimize yet
Integration with Other Skills
After using this skill:
- If I/O bound → See @async-patterns-and-concurrency for async optimization
- If data processing slow → See @scientific-computing-foundations for vectorization
- If need to track improvements → See @ml-engineering-workflows for metrics
Before using this skill:
- If unsure code is slow → Use this skill to profile and confirm!
- If setting up profiling → See @project-structure-and-tooling for dependencies
Quick Reference
Essential Profiling Commands
# CPU profiling
import cProfile
cProfile.run('main()', 'profile.prof')
# View results
import pstats
stats = pstats.Stats('profile.prof')
stats.sort_stats('cumulative')
stats.print_stats(20)
# Memory profiling
import tracemalloc
tracemalloc.start()
# ... code ...
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:10]:
print(stat)
Debugging Commands
# Set breakpoint
breakpoint() # Python 3.7+
# or
import pdb; pdb.set_trace()
# pdb commands:
# n - next line
# s - step into
# c - continue
# p var - print variable
# l - list code
# w - where am I
# q - quit
Optimization Checklist
- Profile before optimizing (use cProfile)
- Identify bottleneck (top 20% of time)
- Create benchmark for bottleneck
- Optimize bottleneck
- Benchmark again to verify improvement
- Re-profile entire program
- Verify correctness (tests still pass)
Common Optimizations
| Problem | Solution | Speedup |
|---|---|---|
| String concatenation in loop | Use str.join() | 10-100x |
| List membership checks | Use set | 100-1000x |
| DataFrame iteration | Vectorize with NumPy/pandas | 100-1000x |
| Repeated expensive computation | Cache with @lru_cache | ∞ (depends on cache hits) |
| I/O bound | Use async/await | 10-100x |
| CPU bound with parallelizable work | Use multiprocessing | ~number of cores |
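The last table row recommends multiprocessing for CPU-bound, parallelizable work; a minimal sketch, where cpu_heavy is a hypothetical pure function over independent inputs:
# ✅ CORRECT: Spread CPU-bound work across cores
from multiprocessing import Pool
def cpu_heavy(n):  # Hypothetical pure, CPU-bound function
    return sum(i * i for i in range(n))
if __name__ == "__main__":
    with Pool() as pool:  # Defaults to one worker per CPU core
        results = pool.map(cpu_heavy, [2_000_000] * 8)
    print(sum(results))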
Red Flags
If you find yourself:
- Optimizing before profiling → STOP, profile first
- Spending hours on micro-optimizations → Check if it's bottleneck
- Making code unreadable for speed → Benchmark the benefit
- Assuming what's slow → Profile to verify
Always measure. Never assume.