name	atft-autonomy
description	Coordinate Claude Code skills with OpenAI Codex autonomous workflows for end-to-end ATFT-GAT-FAN maintenance.
proactive	true

ATFT Autonomy Skill

Mission

Seamlessly orchestrate Claude and Codex agents to keep the ATFT-GAT-FAN stack production-ready.
Decide when to run deep Codex analyses versus fast Claude remediations.
Maintain shared context (AGENTS.md, health snapshots, run logs) across agents.

Requests to “run full autonomous maintenance”, “coordinate Claude and Codex”, or “schedule daily self-healing”.
Situations where an issue spans multiple domains (dataset + training + quality).
Need to generate or refresh .mcp.json, AGENTS.md, or other shared configs.

Confirm tool availability:
- command -v codex (expect npm package @openai/codex).
- command -v ./tools/claude-code.sh.
Verify configuration files:
- .mcp.json exists with filesystem/git servers (tools/codex.sh recreates if missing).
- AGENTS.md up to date with latest guidelines.
Run snapshot diagnostics:
- tools/project-health-check.sh --summary (logs to _logs/health-checks/).
- nvidia-smi and df -h to ensure resource headroom before long autonomous jobs.

./tools/claude-code.sh --no-check "Run proactive maintenance checklist" — quick fixes + TODO updates.
./tools/codex.sh --max --exec "Perform deep optimization sweep across dataset, training, and research modules" — longer Codex session with MCP.
Append combined findings to docs/ops/autonomy_log.md (capture commands run, deltas, next steps).
git status --short → review diffs, ensure no unintended drift.

Launch Claude for rapid triage: ./tools/claude-code.sh "Investigate failed training and prepare briefing for Codex".
Pass Claude findings to Codex:
./tools/codex.sh --exec "Read docs/ops/incident_brief.md and propose remedial plan".
Execute agreed actions (dataset rebuild, training rerun, quality checks).
Update docs/ops/incident_log.md with both agents’ actions and resolutions.

Add cron entry (example):
0 2 * * * cd /workspace/gogooku3 && ./tools/codex.sh --max >> _logs/daily-optimization.log 2>&1.
Pair with weekly Claude sweep:
0 7 * * MON cd /workspace/gogooku3 && ./tools/claude-code.sh --no-check >> _logs/weekly-claude.log 2>&1.
Summarize improvements weekly by aggregating _logs/codex-sessions/*.log and _logs/claude-code/*.log; record highlights in docs/ops/weekly_autonomy_report.md.

Ensure both agents read CLAUDE.md and AGENTS.md before editing high-risk files.
Rotate docs/SKILLS_GUIDE.md and claude/skills/ when procedures change.
Store cross-agent decisions in EXPERIMENT_STATUS.md or docs/ops/autonomy_log.md.

Codex CLI missing → npm install -g @openai/codex.
MCP server errors → regenerate .mcp.json via ./tools/codex.sh and restart run.
Conflicting edits → consolidate drafts under docs/ops/autonomy_pending/ and resolve manually.
Resource saturation → stagger Claude/Codex runs, cap concurrency in cron.

After combined sessions, notify stakeholders by updating docs/ops/weekly_autonomy_report.md.
Attach session logs _logs/codex-sessions/*.log and _logs/claude-code/*.log for audit.
Re-sync ~/.claude/skills/ so Claude picks up any procedural updates discovered during Codex runs.