| name | ci-fix-loop |
| description | Autonomous CI fix loop with background monitoring and retry logic. Runs up to 10 fix-commit-push-wait cycles until CI passes or max retries reached. |
CI Fix Loop Skill
Orchestrates autonomous CI repair: analyze → fix → commit → push → monitor → repeat until success.
When to Use
This skill is invoked when:
- User runs
/fix-ci --loopor/fix-ci --auto - Multiple CI fix iterations are needed
- User wants hands-off CI repair
Configuration
| Setting | Value | Description |
|---|---|---|
| max_attempts | 10 | Maximum fix iterations |
| poll_interval | 60 | Seconds between CI status checks |
| ci_start_timeout | 120 | Seconds to wait for CI run to start |
| ci_run_timeout | 1800 | Max seconds to wait for CI completion (30 min) |
Workflow
Phase 1: Initialize
Get context and validate:
BRANCH=$(git branch --show-current)
REPO=$(gh repo view --json nameWithOwner -q .nameWithOwner 2>/dev/null || echo "unknown")
Safety checks:
- Block on protected branches:
if [[ "$BRANCH" == "main" || "$BRANCH" == "master" ]]; then
echo "Cannot run autonomous fixes on $BRANCH"
echo "Create a feature branch: git checkout -b fix/ci-errors"
# STOP - do not proceed
fi
- Handle uncommitted changes:
if [[ -n $(git status --porcelain) ]]; then
echo "Stashing uncommitted changes..."
git stash push -m "pre-ci-fix-loop-$(date +%Y%m%d_%H%M%S)"
fi
Initialize state:
attempt = 1
max_attempts = 10
last_errors = []
history = []
started_at = now
Phase 2: Fix Loop
For each attempt from 1 to 10:
Step 2.1: Display Progress
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CI Fix Loop - Attempt ${attempt}/${max_attempts}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Branch: ${branch}
Repository: ${repo}
Step 2.2: Fetch CI Logs
Get most recent failed run:
RUN_ID=$(gh run list --branch "$BRANCH" --limit 5 --json databaseId,conclusion \
--jq '[.[] | select(.conclusion == "failure")][0].databaseId')
if [ -z "$RUN_ID" ]; then
echo "No failed runs found - checking if CI is passing..."
# May already be fixed, verify
fi
Fetch logs for failed jobs:
FAILED_JOBS=$(gh run view $RUN_ID --json jobs --jq '.jobs[] | select(.conclusion == "failure") | .databaseId')
for JOB_ID in $FAILED_JOBS; do
gh api repos/${REPO}/actions/jobs/${JOB_ID}/logs > /tmp/ci-logs-${JOB_ID}.txt 2>/dev/null || true
done
Step 2.3: Analyze Errors
Invoke the ci-log-analyzer agent:
- Parse CI logs from /tmp/ci-logs-*.txt
- Extract structured error list with type, file, line, message
- Returns JSON with errors categorized by type (lint/test/type/build)
Step 2.4: Check for Progress
Compare current errors with previous attempt:
if current_errors == last_errors AND attempt > 1:
# Same errors after fix attempt = likely unfixable
consecutive_same_errors += 1
if consecutive_same_errors >= 2:
echo "Same errors detected after 2 fix attempts - aborting"
echo "These errors may require manual intervention"
# STOP - exit loop with failure report
fi
if current_errors is empty:
# No errors found - CI might be passing
# Skip to monitoring phase
Step 2.5: Apply Fixes
Invoke the ci-error-fixer agent with error list:
- Applies targeted fixes based on error type
- Shows diffs for each change
- Reports fixed vs flagged-for-manual-review counts
Track results:
errors_fixed = count of successfully fixed errors
errors_flagged = count of errors needing manual review
Step 2.6: Commit & Push
Stage and commit changes:
git add .
# Create descriptive commit message
git commit -m "fix(ci): automated fix attempt ${attempt}
Errors addressed:
- ${error_summary_list}
Attempt ${attempt} of ${max_attempts} (ci-fix-loop)"
Push to trigger CI:
PUSH_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ)
git push origin ${BRANCH}
Step 2.7: Wait for CI Run to Start
Poll until new run appears (max 2 minutes):
TIMEOUT=120
START=$(date +%s)
while true; do
RUN_JSON=$(gh run list --branch "$BRANCH" --limit 1 --json databaseId,status,createdAt)
CREATED=$(echo "$RUN_JSON" | jq -r '.[0].createdAt')
# Check if this run was created after our push
if [[ "$CREATED" > "$PUSH_TIME" ]]; then
NEW_RUN_ID=$(echo "$RUN_JSON" | jq -r '.[0].databaseId')
echo "CI run started: $NEW_RUN_ID"
break
fi
ELAPSED=$(($(date +%s) - START))
if [ $ELAPSED -gt $TIMEOUT ]; then
echo "Warning: No CI run started after ${TIMEOUT}s"
echo "Check if workflows are enabled for this branch"
break
fi
sleep 5
done
Step 2.8: Monitor CI (Background)
Spawn the ci-monitor agent with run_in_background: true:
The monitor will:
- Poll
gh run listevery 60 seconds - Return when CI reaches terminal state
- Output:
SUCCESS|RUN_ID,FAILURE|RUN_ID,CANCELLED|RUN_ID, orTIMEOUT|RUN_ID
Wait for monitor result using TaskOutput tool.
Step 2.9: Handle Result
Parse monitor output:
case "$RESULT" in
SUCCESS*)
# CI passed! Exit loop with success
;;
FAILURE*)
# CI still failing - continue to next attempt
;;
CANCELLED*)
# Run was cancelled - warn and exit
echo "CI run was cancelled externally"
# EXIT with warning
;;
TIMEOUT*)
# Exceeded 30 min wait
echo "CI run timed out after 30 minutes"
# Ask if should continue waiting or abort
;;
esac
Step 2.10: Record History
history.append({
attempt: attempt,
errors_found: len(current_errors),
errors_fixed: errors_fixed,
errors_flagged: errors_flagged,
run_id: run_id,
result: conclusion,
duration: attempt_duration
})
last_errors = current_errors
attempt += 1
Phase 3: Final Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CI Fix Loop Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Result: [SUCCESS|FAILURE] after ${attempts} attempt(s)
Summary:
Total time: ${total_duration}
Commits created: ${commit_count}
Errors fixed: ${total_errors_fixed}
History:
For each entry in history:
Attempt ${n}: Found ${errors_found} errors, fixed ${fixed} → ${result}
If FAILURE:
Remaining Issues (require manual intervention):
- ${file}:${line} - ${message}
Type: ${type}
Suggested next steps:
1. Review errors above
2. Check CI logs: gh run view ${last_run_id} --log-failed
3. Fix manually and push
If SUCCESS:
CI is now passing!
Next steps:
1. Review automated commits: git log --oneline -${commit_count}
2. Squash if desired: git rebase -i HEAD~${commit_count}
3. Create PR: /github:create-pr
Error Handling
Network/API Failures
- Retry
ghcommands 3 times with 5s backoff - If persistent, abort and report
Git Conflicts
- If push fails due to upstream changes:
echo "Upstream changes detected"
echo "Pull and retry: git pull --rebase && /fix-ci --loop"
- Abort loop
Unfixable Errors
- Track errors persisting across 2+ attempts
- Mark as "unfixable" in final report
- Continue attempting other errors
Timeout
- CI run timeout (30 min): report and suggest
gh run watch - CI start timeout (2 min): check workflow configuration
Safety Mechanisms
- Branch protection: Never run on main/master
- Max attempts: Hard limit of 10 iterations
- Stash protection: Uncommitted changes are preserved
- Progress detection: Abort if same errors repeat twice
- Timeout limits: 30 min max CI wait per attempt
- Commit tracking: Report all commits for easy revert
Token Efficiency
Estimated per iteration:
- Analysis (sonnet): ~2000 tokens
- Fix application (sonnet): ~3000 tokens
- CI monitoring (haiku): ~500 tokens
- State/reporting: ~500 tokens
- Total: ~6000 tokens/iteration
- 10 iterations max: ~60,000 tokens
Key optimizations:
- Haiku model for CI polling (10x cheaper than sonnet)
- No context accumulation between iterations
- Minimal state tracking
- Background execution frees terminal