name	performance-profiler
description	Analyzes BazBOM scan performance, identifies bottlenecks, and provides optimization recommendations. Use when scans are slow, investigating performance regressions, or optimizing for large repositories.

Performance Profiler Skill

Analyzes execution time and memory usage, and identifies performance bottlenecks in BazBOM scans.

When to Use

Activate when you hear:

"Why is this scan so slow?"
"Performance test"
"How fast should this be?"
"Memory usage too high"
"Optimize the scan"

Performance Metrics

Expected Performance Baselines

Repository Size	Packages	Expected Time	Expected Memory
Small (<10MB)	<100	<1s	~50MB
Medium (10-50MB)	100-1K	1-3s	~100MB
Large (50-100MB)	1K-5K	3-10s	~150MB
Huge (>100MB)	5K+	10-30s	~200MB

Baselines measured on Apple Silicon (M1/M2).
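
The table above can be turned into a small lookup so a script can report which baseline applies to a scan. This is a minimal sketch: `baseline_for_packages` is a hypothetical helper, not part of BazBOM, and sending the boundary values (100, 1000, 5000) to the larger tier is an assumption, since the table's ranges overlap at 1K.

```shell
# Map a package count to the matching row of the baseline table.
# Assumption: boundary values (100, 1000, 5000) fall into the larger tier.
baseline_for_packages() {
    count=$1
    if [ "$count" -lt 100 ]; then
        echo "small: <1s, ~50MB"
    elif [ "$count" -lt 1000 ]; then
        echo "medium: 1-3s, ~100MB"
    elif [ "$count" -lt 5000 ]; then
        echo "large: 3-10s, ~150MB"
    else
        echo "huge: 10-30s, ~200MB"
    fi
}
```

For example, `baseline_for_packages 59` selects the small-repository row.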

Profiling Commands

Basic Timing

# macOS time with memory stats
/usr/bin/time -l bazbom scan . 2>&1

# Key metrics:
#   - real time (wall clock)
#   - user time (CPU)
#   - maximum resident set size (memory)
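
To pull the key metrics out of the timing output automatically, something like the following may help. It is a sketch that assumes the macOS `/usr/bin/time -l` layout (a `<sec> real <sec> user <sec> sys` line, and a peak RSS line reported in bytes); `summarize_time_output` is a hypothetical helper name, and other platforms format this output differently.

```shell
# Read `/usr/bin/time -l` output on stdin and print wall time and peak RSS.
# Assumes the macOS layout, where maximum resident set size is in bytes.
summarize_time_output() {
    awk '
        $2 == "real" && $4 == "user" { real = $1 }
        /maximum resident set size/   { rss = $1 }
        END { printf "real_s=%s rss_mb=%.1f\n", real, rss / 1048576 }
    '
}
```

Usage: `/usr/bin/time -l bazbom scan . 2>&1 | summarize_time_output`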

Detailed Profiling

# With debug logging
RUST_LOG=debug /usr/bin/time -l bazbom scan . 2>&1 | tee scan-profile.log

# Trace-level logging (very verbose)
RUST_LOG=trace /usr/bin/time -l bazbom scan . 2>&1 | tee scan-trace.log

# Specific module profiling
RUST_LOG=bazbom::bazel=debug /usr/bin/time -l bazbom scan .

Comparative Profiling

# Compare different scan modes
for mode in quick check scan full; do
    echo "=== Testing $mode mode ==="
    /usr/bin/time -l bazbom $mode . 2>&1 | grep -E "real|maximum"
done

# Progressive limits
for limit in 10 50 100 500 1000; do
    echo "=== Limit: $limit ==="
    /usr/bin/time -l bazbom full --limit $limit . 2>&1 | grep -E "real|maximum"
done

Bottleneck Detection

Common Bottlenecks

1. maven_install.json Parsing

Symptom: Slow startup, high CPU during initial phase

Detection:

RUST_LOG=bazbom::bazel=debug bazbom scan . 2>&1 | grep -E "parsing|extracting"

Optimization: Consider caching parsed results

2. SBOM Generation

Symptom: Slow after dependency extraction

Detection:

RUST_LOG=bazbom_formats=debug bazbom scan . 2>&1 | grep -E "spdx|cyclonedx"

Optimization: Stream output instead of building in memory

3. Polyglot Scanning

Symptom: Delays when scanning non-Java ecosystems

Detection:

RUST_LOG=bazbom_polyglot=debug bazbom scan . 2>&1

Optimization: Parallel ecosystem scanning

4. Reachability Analysis

Symptom: Very slow, high memory usage

Detection:

# Check if auto-enabled
bazbom scan . 2>&1 | grep "reachability"

Optimization:

# Disable if not needed
export BAZBOM_NO_SMART_DEFAULTS=1
bazbom scan .

Performance Analysis Workflow

1. Baseline Measurement

# Clean run
cd /path/to/repo
/usr/bin/time -l bazbom scan . 2>&1 | tee baseline.txt

# Extract metrics
grep "real" baseline.txt
grep "maximum resident set size" baseline.txt

2. Identify Slow Phases

# Enable debug logging
RUST_LOG=debug bazbom scan . 2>&1 | tee debug.log

# List BazBOM log lines in order; compare their timestamps to find slow phases
grep -E "\[bazbom\]" debug.log

3. Isolate Bottleneck

# Test without polyglot
# (not possible today; would need a feature flag)

# Test with smaller dataset
bazbom scan --limit 10 .

# Test parsing only
# (not possible today; would need something like a --parse-only flag)

4. Measure Improvement

# Before optimization
/usr/bin/time -l bazbom scan . 2>&1 | tee before.txt

# Apply fix
# ... make changes ...

# After optimization
/usr/bin/time -l bazbom scan . 2>&1 | tee after.txt

# Compare
diff <(grep -E "real|maximum" before.txt) <(grep -E "real|maximum" after.txt)

Performance Regression Detection

Test Suite

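#!/bin/bash
# performance-regression-test.sh

# Use $HOME rather than "~": a tilde inside quotes is not expanded.
REPOS=(
    "$HOME/Documents/BazBOM_Testing/real-repos/bazel-examples"
    "$HOME/Documents/BazBOM_Testing/generated"
)

for repo in "${REPOS[@]}"; do
    echo "Testing: $repo"
    cd "$repo" || { echo "Skipping: cannot enter $repo" >&2; continue; }

    # Run 3 times and print the median wall-clock time (middle of 3 sorted values)
    for i in 1 2 3; do
        /usr/bin/time -l bazbom scan . 2>&1 | awk '$2 == "real" { print $1 }'
    done | sort -n | sed -n '2p'
done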

Acceptable Thresholds

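Treat a run as a regression when any of these limits is crossed:

Time regression: more than 20% slower than baseline
Memory regression: more than 30% more memory than baseline
Throughput regression: packages/second below 50% of baseline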
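
The time and memory limits above could be enforced in a script along these lines. This is only a sketch: `check_regression` is a hypothetical helper, the baseline and current values are assumed to have been collected already (for example from `/usr/bin/time -l`), and the throughput limit is left out for brevity.

```shell
# Exit non-zero when the current run breaches the time or memory limit.
# Arguments: baseline_seconds current_seconds baseline_rss current_rss
# (both RSS values must use the same unit).
check_regression() {
    awk -v bt="$1" -v ct="$2" -v bm="$3" -v cm="$4" 'BEGIN {
        status = 0
        time_pct = (ct - bt) / bt * 100
        mem_pct  = (cm - bm) / bm * 100
        if (time_pct > 20) { printf "time regression: +%.0f%%\n", time_pct; status = 1 }
        if (mem_pct  > 30) { printf "memory regression: +%.0f%%\n", mem_pct; status = 1 }
        exit status
    }'
}
```

Because the function returns a non-zero status on a breach, it can gate a CI step directly, e.g. `check_regression 1.00 1.25 50 55 || exit 1`.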

Optimization Recommendations

For Large Repositories (>1000 packages)

# Use incremental mode
bazbom scan --incremental .

# Limit scope
bazbom scan --ecosystems java .

# Disable reachability
export BAZBOM_NO_SMART_DEFAULTS=1

For Repeated Scans

# Use caching (if implemented)
bazbom scan --cache .

# Scan only changed files
bazbom scan --bazel-affected-by-files $(git diff --name-only HEAD~1)

For CI/CD

# Quick mode for PRs
bazbom quick .

# Check mode for fast validation
bazbom check .

# Full scan only on main branch
if [ "$BRANCH" = "main" ]; then
    bazbom full .
fi

Performance Report Format

Performance Profile Report
=========================

Repository: bazel-examples
Size: 8.2MB
Packages: 59

Timing:
  Real time: 0.92s
  User time: 0.45s
  System time: 0.28s

Memory:
  Maximum RSS: 52MB
  Peak memory: 48MB
  Average memory: 35MB

Phases:
  Detection: 0.02s (2%)
  Parsing: 0.15s (16%)
  Extraction: 0.35s (38%)
  SBOM gen: 0.25s (27%)
  Polyglot: 0.15s (16%)

Bottlenecks:
  ✅ Largest phase (Extraction, 38%) is below the 50% limit
  ✅ Memory usage within normal range
  ✅ No apparent bottlenecks

Recommendations:
  - Performance is within expected range
  - Consider caching parsed maven_install.json for repeated runs
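
The percentages in the Phases section of the report can be derived from per-phase timings. The sketch below assumes the timings are already available as `<phase> <seconds>` lines, which is a hypothetical input format (BazBOM is not known to emit it directly), and flags any phase above the 50% limit listed under Success Criteria.

```shell
# Read "<phase> <seconds>" lines on stdin, print each phase's share of the
# total, and mark any phase that exceeds 50% of the total.
phase_shares() {
    awk '
        { name[NR] = $1; secs[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++) {
                pct = secs[i] / total * 100
                printf "%s %.0f%%%s\n", name[i], pct, (pct > 50 ? " OVER" : "")
            }
        }
    '
}
```

Usage: `printf 'Parsing 0.15\nExtraction 0.35\n' | phase_shares`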

Profiling Tools

Rust and Native Tools

# Cargo flamegraph (requires cargo-flamegraph)
cargo flamegraph --bin bazbom -- scan .

# Valgrind (if available)
valgrind --tool=massif target/release/bazbom scan .

macOS Tools

# Instruments via xctrace (the standalone `instruments` CLI was removed in Xcode 13)
xcrun xctrace record --template "Time Profiler" --output trace.trace --launch -- target/release/bazbom scan .

# Activity Monitor
# (manual observation during execution)

Success Criteria

Performance is acceptable when:

✅ Meets baseline expectations for repository size
✅ No single phase >50% of total time
✅ Memory usage linear with package count
✅ No memory leaks (RSS stays stable)
✅ Throughput >100 packages/second
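
The throughput criterion is simply package count divided by wall-clock time. A minimal sketch, using a hypothetical `throughput` helper and assuming both numbers come from an earlier scan:

```shell
# Print packages per second; return non-zero when the rate does not exceed
# the 100 packages/second target.
# Arguments: package_count wall_clock_seconds
throughput() {
    awk -v pkgs="$1" -v secs="$2" 'BEGIN {
        rate = pkgs / secs
        printf "%.1f\n", rate
        exit (rate > 100 ? 0 : 1)
    }'
}
```

Usage: `throughput 1200 4` prints the rate and succeeds only if it is above the target.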

Remember: Optimize for the common case, not edge cases. Most repos are small to medium size.
