name	debug-troubleshoot
description	Systematic debugging approach for Rust async code with Tokio, Turso, and redb. Use when diagnosing runtime issues, performance problems, async deadlocks, database connection issues, or panics.

Debug and Troubleshoot

Systematic debugging approach for Rust async code with Tokio, Turso, and redb.

Purpose

Diagnose and resolve runtime issues, performance problems, and unexpected behavior in the memory system.

Debugging Tools

1. Logging with tracing

Setup

use tracing::{debug, info, warn, error, instrument};

#[instrument]
async fn problematic_function(id: &str) -> Result<Data> {
    debug!("Starting operation for id: {}", id);

    let data = fetch_data(id).await
        .map_err(|e| {
            error!("Failed to fetch data: {:?}", e);
            e
        })?;

    info!("Successfully fetched data");
    Ok(data)
}

Run with Logging

# Info level
RUST_LOG=info cargo run

# Debug level
RUST_LOG=debug cargo run

# Trace level (very verbose)
RUST_LOG=trace cargo run

# Specific module
RUST_LOG=memory_core::storage=debug cargo run

# Multiple modules
RUST_LOG=memory_core=debug,memory_storage_turso=trace cargo run

2. Console Debugging

Add Tracing Subscriber

use tracing_subscriber::{fmt, prelude::*, EnvFilter};

fn init_tracing() {
    tracing_subscriber::registry()
        .with(fmt::layer())
        .with(EnvFilter::from_default_env())
        .init();
}

Pretty Printing

# Colored output
RUST_LOG=debug cargo run 2>&1 | less -R

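# JSON format (for parsing). tracing-subscriber does not read RUST_LOG_FORMAT itself;
# this works only if the application checks the variable and installs a JSON formatter.
RUST_LOG_FORMAT=json RUST_LOG=debug cargo run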

3. Tokio Console

Enable

# Cargo.toml
[dependencies]
tokio = { version = "1", features = ["full", "tracing"] }
console-subscriber = "0.1"

fn main() {
    console_subscriber::init();
    // Rest of code
}

Run

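# Terminal 1: Run app. Tokio's task instrumentation needs the tokio_unstable cfg.
# The tokio-console feature is assumed to be defined by this project; drop it if not.
RUSTFLAGS="--cfg tokio_unstable" cargo run --features tokio-console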

# Terminal 2: Run console
tokio-console

Shows:

Active tasks
Task spawn history
Resource usage
Blocking detection

4. LLDB/GDB Debugger

# Build with debug symbols
cargo build

# Run under debugger
rust-lldb target/debug/memory-core

# Set breakpoint
(lldb) b src/lib.rs:42

# Run
(lldb) run

# Inspect variables
(lldb) p variable_name

# Stack trace
(lldb) bt

Common Issues

1. Async Deadlocks

Symptoms

Program hangs
No errors, just stops responding
CPU idle while stuck (a true deadlock), or high CPU usage with no progress (a busy loop or livelock)

Diagnosis

// Add timeout to detect hangs
use tokio::time::{timeout, Duration};

let result = timeout(
    Duration::from_secs(5),
    potentially_hanging_operation()
).await;

match result {
    Ok(Ok(_data)) => println!("Success"),
    Ok(Err(e)) => println!("Operation failed: {}", e),
    Err(_) => println!("TIMEOUT! Possible deadlock"),
}

Common Causes

Cause 1: Holding lock across await

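// WRONG - Can deadlock if async_operation() (or anything it awaits) needs the same mutex
let mut data = mutex.lock().await;
async_operation().await;  // Lock held during async operation
data.update();

// RIGHT - Copy what you need, release the lock, then await
let value = {
    let data = mutex.lock().await;
    data.clone()  // Clone needed data
}; // Lock released
async_operation().await;
mutex.lock().await.update();  // Re-acquire briefly to apply the update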
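
The scoping rule above can be checked without an async runtime. The following is a minimal sketch using `std::sync::Mutex` as a stand-in for the async mutex; `snapshot` and `is_unlocked` are hypothetical helper names introduced only for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Copies the protected value out and releases the lock before returning.
fn snapshot(shared: &Arc<Mutex<Vec<u32>>>) -> Vec<u32> {
    let guard = shared.lock().expect("mutex poisoned");
    guard.clone()
} // `guard` is dropped here, so the lock is released

/// Returns true if nobody currently holds the lock.
fn is_unlocked(shared: &Arc<Mutex<Vec<u32>>>) -> bool {
    shared.try_lock().is_ok()
}
```

After `snapshot` returns, `is_unlocked` reports `true`, which confirms the guard did not outlive the block. The same probe returns `false` while a guard is still alive, which makes it a quick way to spot a lock that is held longer than intended.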

Cause 2: Circular task dependencies

// Task A waits for B, B waits for A
let (tx1, rx1) = oneshot::channel();
let (tx2, rx2) = oneshot::channel();

tokio::spawn(async move {
    rx2.await;  // Waits for task 2
    tx1.send(());
});

tokio::spawn(async move {
    rx1.await;  // Waits for task 1
    tx2.send(());
});
// DEADLOCK!

2. Database Connection Issues

Turso Connection Failures

#[instrument]
async fn check_turso_health(client: &TursoClient) -> Result<()> {
    debug!("Checking Turso connection");

    match client.execute("SELECT 1").await {
        Ok(_) => {
            info!("Turso connection healthy");
            Ok(())
        }
        Err(e) => {
            error!("Turso connection failed: {:?}", e);

            // Check specific errors
            if e.to_string().contains("timeout") {
                warn!("Connection timeout - network issue?");
            } else if e.to_string().contains("auth") {
                error!("Authentication failed - check token");
            }

            Err(e)
        }
    }
}
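
The string checks in the health check above can be pulled into one small function so they are easy to test and extend. This is only a sketch: `ConnectionIssue` and `classify_error` are hypothetical names, not part of any Turso client API, and matching on message text is an assumption about how the errors are worded.

```rust
/// Hypothetical categories for connection failures (not a Turso type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ConnectionIssue {
    Timeout,
    Auth,
    Other,
}

/// Classifies an error by case-insensitive substring match on its message.
fn classify_error(message: &str) -> ConnectionIssue {
    let lower = message.to_lowercase();
    if lower.contains("timeout") || lower.contains("timed out") {
        ConnectionIssue::Timeout
    } else if lower.contains("auth") {
        ConnectionIssue::Auth
    } else {
        ConnectionIssue::Other
    }
}
```

Calling `classify_error(&e.to_string())` inside the `Err` arm keeps the logging code short. Substring matching is fragile (for example, "auth" also matches "author"), so prefer matching on typed error variants if the client library exposes them.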

redb Lock Issues

// Diagnose lock problems
#[instrument]
fn debug_redb_locks(db: &Database) {
    debug!("Checking redb transactions");

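    // redb allows one write transaction at a time. begin_write() blocks until any
    // other open write transaction finishes, so a hang here points to a stuck writer.
    match db.begin_write() {
        Ok(txn) => {
            debug!("Write lock acquired successfully");
            drop(txn);
        }
        Err(e) => {
            // An error here is a storage-level failure, not lock contention.
            error!("Cannot begin write transaction: {:?}", e);
            warn!("Check for storage or I/O errors");
        }
    }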
    }
}
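
redb's API is synchronous, so a call that blocks never reaches an await point and an async timeout cannot interrupt it. One way to notice a stuck call is to run it on a helper thread and wait with a deadline. The sketch below uses only the standard library; `run_with_timeout` is a hypothetical helper, not a redb or Tokio function.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `op` on a helper thread and waits at most `limit` for its result.
/// Returns None on timeout; the helper thread keeps running in that case.
fn run_with_timeout<T, F>(limit: Duration, op: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // After a timeout the receiver is gone; ignore the send error.
        let _ = tx.send(op());
    });
    rx.recv_timeout(limit).ok()
}
```

A `None` result means the operation did not finish in time, which is a hint to look for another open transaction. The helper cannot cancel the blocked call, so treat it as a diagnostic aid rather than a recovery mechanism.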

Fix: Ensure short-lived transactions

// WRONG - Long-lived transaction
let read_txn = db.begin_read()?;
expensive_computation();  // Transaction held too long
let value = read_txn.get(...)?;

// RIGHT - Short transaction
let value = {
    let read_txn = db.begin_read()?;
    read_txn.get(...)?
}; // Transaction dropped
expensive_computation();

3. Memory Leaks

Detect with Valgrind

cargo build
valgrind --leak-check=full \
         --show-leak-kinds=all \
         ./target/debug/memory-core

Common Causes

Cause 1: Circular Arc references

// Can leak if not careful
struct Node {
    next: Option<Arc<Mutex<Node>>>,
}

// Use Weak to break cycles
struct Node {
    next: Option<Weak<Mutex<Node>>>,
}
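
To see why `Weak` breaks the cycle, here is a small sketch with a strong forward link and a weak back link. The `prev` field and the `build_pair` helper are illustrative additions, not part of the structs above.

```rust
use std::sync::{Arc, Mutex, Weak};

struct Node {
    next: Option<Arc<Mutex<Node>>>,  // strong link: keeps the next node alive
    prev: Option<Weak<Mutex<Node>>>, // weak link: does not keep the target alive
}

/// Builds a <-> b where only the forward link is strong.
/// Returns `a` plus a weak handle to `b` for observing when `b` is freed.
fn build_pair() -> (Arc<Mutex<Node>>, Weak<Mutex<Node>>) {
    let a = Arc::new(Mutex::new(Node { next: None, prev: None }));
    let b = Arc::new(Mutex::new(Node {
        next: None,
        prev: Some(Arc::downgrade(&a)),
    }));
    let weak_b = Arc::downgrade(&b);
    a.lock().unwrap().next = Some(b);
    (a, weak_b)
}
```

Dropping `a` frees both nodes, because `b` holds only a weak reference back to `a`. If `prev` were an `Arc`, each node would keep the other's strong count above zero and neither would ever be dropped.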

Cause 2: Unbounded channels

// WRONG - Can grow indefinitely
let (tx, mut rx) = mpsc::unbounded_channel();

// RIGHT - Bounded with backpressure
let (tx, mut rx) = mpsc::channel(100);
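
Backpressure from a bounded channel can be observed directly. The sketch below uses the standard library's `sync_channel` as a stand-in for Tokio's bounded channel so it runs without a runtime; `fill_until_full` is a hypothetical helper.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Sends up to `count` items into a channel of `capacity` while no consumer
/// is draining it, and returns how many were accepted before it was full.
fn fill_until_full(capacity: usize, count: usize) -> usize {
    // `_rx` keeps the receiver alive so sends fail with Full, not Disconnected.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    for i in 0..count {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

With a capacity of 100, the 101st send is rejected instead of growing the queue, so memory use stays bounded. An awaiting `send` on a bounded async channel applies the same limit by suspending the producer until space is available.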

4. Performance Issues

Profile with Flamegraph

# Install
cargo install flamegraph

# Run profiler
cargo flamegraph --dev

# View flamegraph.svg
firefox flamegraph.svg

Common Bottlenecks

Issue: Excessive cloning

// Use Arc for shared ownership
let data = Arc::new(expensive_data);

// Clone Arc (cheap), not data
for _ in 0..100 {
    let data_clone = data.clone();  // Just increments refcount
    tokio::spawn(async move {
        process(data_clone).await;
    });
}

Issue: Synchronous redb in async context

// WRONG - Blocks executor
async fn save_data(db: &Database, data: Data) {
    let txn = db.begin_write().unwrap();  // Sync, blocks!
    // ...
}

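// RIGHT - Use spawn_blocking (share the database via Arc, return a Result so `?` works)
async fn save_data(db: Arc<Database>, data: Data) -> Result<()> {
    tokio::task::spawn_blocking(move || -> Result<()> {
        let txn = db.begin_write()?;
        // ... sync operations ...
        txn.commit()?;
        Ok(())
    }).await??;  // First ? is the JoinError, second is the closure's own error
    Ok(())
}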

Issue: Too many concurrent tasks

// WRONG - Spawns thousands of tasks
for item in huge_list {
    tokio::spawn(process(item));
}

// RIGHT - Limit concurrency
use tokio::sync::Semaphore;

let semaphore = Arc::new(Semaphore::new(10));
for item in huge_list {
    let permit = semaphore.clone().acquire_owned().await?;
    tokio::spawn(async move {
        let _permit = permit;  // Released on drop
        process(item).await
    });
}

5. Panic Debugging

Get Full Backtrace

RUST_BACKTRACE=full cargo run

Add Panic Hook

use std::panic;
use tracing::error;

fn main() {
    panic::set_hook(Box::new(|panic_info| {
        error!("PANIC: {:?}", panic_info);
        // Log to file, send to monitoring, etc.
    }));

    // Rest of code
}
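
A hook can also extract the panic location and message instead of dumping the whole info struct. The following is a std-only sketch that records entries into a shared vector; in the real application the `push` would be a `tracing` call. `install_recording_hook` is a hypothetical helper.

```rust
use std::panic;
use std::sync::{Arc, Mutex};

/// Installs a panic hook that appends "file:line: message" to `log`.
fn install_recording_hook(log: Arc<Mutex<Vec<String>>>) {
    panic::set_hook(Box::new(move |info| {
        let location = info
            .location()
            .map(|l| format!("{}:{}", l.file(), l.line()))
            .unwrap_or_else(|| "unknown location".to_string());
        // Panic payloads are usually &str or String; handle both.
        let message = info
            .payload()
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| info.payload().downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "non-string panic payload".to_string());
        if let Ok(mut entries) = log.lock() {
            entries.push(format!("{location}: {message}"));
        }
    }));
}
```

The hook runs before unwinding starts, so it sees panics that are later caught by `catch_unwind` or by a task's `JoinHandle` as well. Install it once, early in `main`, because `set_hook` replaces any previously installed hook.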

Common Panics

Unwrap on None/Err

// WRONG
let value = option.unwrap();  // Panics if None

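// BETTER - still panics if None, but the message documents the invariant
let value = option.expect("Expected value to be present");
// RIGHT - propagate an error instead of panicking
let value = option.ok_or_else(|| anyhow!("Missing value"))?;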

Index out of bounds

// WRONG
let item = vec[index];  // Panics if index >= len

// RIGHT
let item = vec.get(index)
    .ok_or_else(|| anyhow!("Index {} out of bounds", index))?;

Debugging Workflow

1. Reproduce the Issue

Create minimal reproduction
Identify conditions that trigger it
Make it deterministic if possible

2. Add Instrumentation

#[instrument(skip(self))]
async fn problematic_function(&self, id: &str) -> Result<Data> {
    debug!("Starting with id: {}", id);

    let step1 = self.step1(id).await?;
    debug!("Step1 complete: {:?}", step1);

    let step2 = self.step2(step1).await?;
    debug!("Step2 complete: {:?}", step2);

    Ok(step2)
}

3. Run with Logging

RUST_LOG=debug cargo run 2>&1 | tee debug.log

4. Analyze Logs

Look for patterns before failure
Check timing (slow operations?)
Verify expected flow
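
For the timing check, it can help to measure steps explicitly rather than comparing log timestamps by eye. This is a minimal std-only sketch; `timed` and `slow_steps` are hypothetical helpers, and the threshold is whatever counts as slow for the operation in question.

```rust
use std::time::{Duration, Instant};

/// Runs `op` and returns its result together with the elapsed time.
fn timed<T>(op: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = op();
    (value, start.elapsed())
}

/// Returns the labels of all measurements that took longer than `threshold`.
fn slow_steps<'a>(
    measurements: &[(&'a str, Duration)],
    threshold: Duration,
) -> Vec<&'a str> {
    measurements
        .iter()
        .filter(|(_, elapsed)| *elapsed > threshold)
        .map(|(label, _)| *label)
        .collect()
}
```

Wrap each suspect step in `timed`, collect the `(label, duration)` pairs, and pass them to `slow_steps` to get a short list of candidates. For async code the same idea applies by taking `Instant::now()` before an await and calling `elapsed()` after it.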

5. Form Hypothesis

What could cause this behavior?
Is it related to concurrency?
Is it a logic error?
Is it an external dependency?

6. Test Hypothesis

Add specific logging
Add assertions
Create targeted test
Modify code to test theory

7. Fix and Verify

Implement fix
Add regression test
Verify fix works
Verify no new issues

Testing for Bugs

Stress Tests

#[tokio::test]
async fn stress_test_concurrent_access() {
    let memory = create_test_memory().await;
    let mut handles = vec![];

    // Spawn 1000 concurrent operations
    for i in 0..1000 {
        let mem = memory.clone();
        handles.push(tokio::spawn(async move {
            mem.operation(i).await
        }));
    }

    // All should succeed
    for handle in handles {
        handle.await.unwrap().unwrap();
    }
}

Troubleshooting Checklist

Can you reproduce the issue?
Is it in production, test, or both?
Recent changes related to the issue?
Error messages or panics?
Logs show expected flow?
Performance degradation?
Resource usage (CPU, memory, connections)?
External dependencies healthy?
Database connections working?
Locks or deadlocks?
Async tasks completing?
