Claude Code Plugins

Community-maintained marketplace

Feedback
26
0

Patterns for using Codex CLI as an automated code reviewer. Covers review prompts, structured output parsing, issue tracking, and configuration. Use when implementing review gates or automated quality checks.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name codex-reviewer
description Patterns for using Codex CLI as an automated code reviewer. Covers review prompts, structured output parsing, issue tracking, and configuration. Use when implementing review gates or automated quality checks.

Codex as Code Reviewer

Use OpenAI's Codex CLI to automate code review as a quality gate. This skill covers review prompt design, output parsing, and integration patterns.

Prerequisites

Codex CLI must be installed and available in PATH:

which codex  # verify installation

Basic Invocation

# Execute with prompt from stdin
echo "Review this code..." | codex exec - --sandbox read-only -o output.txt

# Key flags
codex exec - \
  --sandbox read-only \                    # read-only | workspace-write | danger-full-access
  -c 'approval_policy="never"' \           # untrusted | on-failure | on-request | never
  -o /tmp/review-output.txt                # output file (stdout is messy)

Configuration

Store user preferences in ~/.claude/codex/config.json:

interface CodexConfig {
  sandbox?: "read-only" | "workspace-write" | "danger-full-access";
  approval_policy?: "untrusted" | "on-failure" | "on-request" | "never";
  bypass_sandbox?: boolean;      // --dangerously-bypass-approvals-and-sandbox
  extra_args?: string[];         // additional CLI args (DO NOT override -o)
  timeout_seconds?: number;      // default: 1200 (20 min), must be < hook timeout
}

Recommended defaults for review:

  • sandbox: "read-only" - reviewer should inspect, not modify
  • approval_policy: "never" - no human approval needed for read-only
  • timeout_seconds: 1200 - 20 minutes for thorough review (must be < hook timeout, typically 1800s)

Review Prompt Structure

Structure prompts with clear sections for context and expectations:

# Code Review

Review work completed by Claude in an iterative loop. Claude claims the task is complete.

## Assignment
${originalPrompt}

## Git Context
**Working Directory**: `pwd`
**Repository**: `basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)"`
**Branch**: `git branch --show-current 2>/dev/null || echo "detached/unknown"`
**Uncommitted changes**: `git diff --stat 2>/dev/null || echo "None"`
**Staged changes**: `git diff --cached --stat 2>/dev/null || echo "None"`
**Recent commits (last 4 hours)**: `git log --oneline -5 --since="4 hours ago" 2>/dev/null || echo "None"`

${previousReviewHistory}

## Review Process
1. Understand the task (read referenced files as needed)
2. Review git changes (`git diff`, `git diff --cached`, `git log`, etc.)
3. Run verification commands from success criteria if applicable
4. Check ALL requirements - be thorough, not superficial

## Output Format

If approved:
\`\`\`
<review>APPROVE</review>
<notes>Optional notes for the record</notes>
\`\`\`

If issues found:
\`\`\`
<review>REJECT</review>
<resolved>
[ISSUE-1] How you verified this previous issue is now fixed
</resolved>
<issues>
[ISSUE-1] severity: Description of the issue
[ISSUE-2] severity: Description of another issue
</issues>
<notes>Optional notes visible to future review cycles</notes>
\`\`\`

- Severity levels: `critical` (blocking), `major` (significant), `minor` (nice to fix)
- Issue IDs must be unique across all cycles - continue numbering from previous reviews (don't restart at ISSUE-1)
- `<resolved>` section: List any previous issues you verified as fixed (omit if none or first review)
- `<notes>` section: Optional, visible to future review cycles
- Be thorough - report ALL issues found

Review ${currentCycle}/${maxCycles}.

Output Format Specification

Verdict Tags

<!-- Approved - no blocking issues -->
<review>APPROVE</review>
<notes>Optional commentary for the record</notes>

<!-- Rejected - issues require resolution -->
<review>REJECT</review>
<resolved>
[ISSUE-1] Verified: test now passes after fix in auth.ts:45
[ISSUE-3] Verified: error handling added in api.ts:120
</resolved>
<issues>
[ISSUE-4] critical: API endpoint returns 500 on empty input
[ISSUE-5] major: Missing input validation in UserForm component
[ISSUE-6] minor: Typo in error message "recieved" -> "received"
</issues>
<notes>Good progress on previous issues. Focus on input validation.</notes>

Issue Format

[ISSUE-{id}] {severity}: {description}
  • id: Unique integer, monotonically increasing across cycles
  • severity: critical | major | minor
  • description: Clear, actionable description (can be multi-line)

Severity Definitions

Severity Meaning Action
critical Blocking bug, security issue, data loss risk Must fix before approval
major Significant functionality gap, poor UX Should fix before approval
minor Code quality, style, minor improvements Nice to fix, non-blocking

Parsing Codex Output

Parse from the END of output to avoid matching echoed examples in the prompt:

interface ReviewIssue {
  id: number;
  severity: "critical" | "major" | "minor";
  description: string;
}

interface ResolvedIssue {
  id: number;
  verification: string;
}

interface ReviewResult {
  approved: boolean;
  issues: ReviewIssue[];
  resolved: ResolvedIssue[];
  notes: string | null;
}

function parseCodexOutput(output: string): ReviewResult {
  // Find LAST occurrence of each tag (avoids matching prompt examples)
  const reviewMatches = [...output.matchAll(/<review>\s*(APPROVE|REJECT)\s*<\/review>/gi)];
  const lastReview = reviewMatches.length > 0 ? reviewMatches[reviewMatches.length - 1] : null;
  const verdict = lastReview ? lastReview[1].toUpperCase() : null;

  // Parse notes
  const notesMatches = [...output.matchAll(/<notes>([\s\S]*?)<\/notes>/gi)];
  const lastNotes = notesMatches.length > 0 ? notesMatches[notesMatches.length - 1] : null;
  const notes = lastNotes ? lastNotes[1].trim() : null;

  if (verdict === "APPROVE") {
    return { approved: true, issues: [], resolved: [], notes };
  }

  if (verdict === "REJECT") {
    // Parse issues from last <issues> block
    const issues: ReviewIssue[] = [];
    const issuesMatches = [...output.matchAll(/<issues>([\s\S]*?)<\/issues>/gi)];
    const lastIssuesMatch = issuesMatches.length > 0 ? issuesMatches[issuesMatches.length - 1] : null;
    const issuesBlock = lastIssuesMatch ? lastIssuesMatch[1] : null;

    if (issuesBlock) {
      const issuePattern = /\[ISSUE-(\d+)\]\s*(critical|major|minor):\s*([\s\S]+?)(?=\[ISSUE-|\s*$)/gi;
      for (const match of issuesBlock.matchAll(issuePattern)) {
        issues.push({
          id: parseInt(match[1], 10),
          severity: match[2].toLowerCase() as "critical" | "major" | "minor",
          description: match[3].trim(),
        });
      }
    }

    // Parse resolved from last <resolved> block
    const resolved: ResolvedIssue[] = [];
    const resolvedMatches = [...output.matchAll(/<resolved>([\s\S]*?)<\/resolved>/gi)];
    const lastResolvedMatch = resolvedMatches.length > 0 ? resolvedMatches[resolvedMatches.length - 1] : null;
    const resolvedBlock = lastResolvedMatch ? lastResolvedMatch[1] : null;

    if (resolvedBlock) {
      const resolvedPattern = /\[ISSUE-(\d+)\]\s*([\s\S]+?)(?=\[ISSUE-|\s*$)/gi;
      for (const match of resolvedBlock.matchAll(resolvedPattern)) {
        resolved.push({
          id: parseInt(match[1], 10),
          verification: match[2].trim(),
        });
      }
    }

    // Handle REJECT with no parseable issues - auto-approve to avoid deadlock
    if (issues.length === 0) {
      return {
        approved: true,
        issues: [],
        resolved: [],
        notes: notes
          ? `[AUTO-APPROVED: REJECT with unparseable issues] ${notes}`
          : "[AUTO-APPROVED: REJECT with unparseable issues]",
      };
    }

    return { approved: false, issues, resolved, notes };
  }

  // Unclear response - default to approve
  return { approved: true, issues: [], resolved: [], notes: null };
}

Review History Tracking

Maintain history across review cycles for context:

interface ReviewHistoryEntry {
  cycle: number;
  decision: "APPROVE" | "REJECT";
  issues: ReviewIssue[];
  resolved: ResolvedIssue[];
  notes: string | null;
}

function buildReviewHistorySection(history: ReviewHistoryEntry[]): string {
  if (history.length === 0) return "";

  const sections = history.map((entry) => {
    const parts: string[] = [`### Cycle ${entry.cycle}: ${entry.decision}`];

    if (entry.resolved.length > 0) {
      parts.push(`**Resolved:**\n${entry.resolved.map(r =>
        `  - [ISSUE-${r.id}] ✓ ${r.verification}`
      ).join("\n")}`);
    }

    if (entry.issues.length > 0) {
      parts.push(`**Issues:**\n${entry.issues.map(i =>
        `  - [ISSUE-${i.id}] ${i.severity}: ${i.description}`
      ).join("\n")}`);
    }

    if (entry.notes) {
      parts.push(`**Notes:** ${entry.notes}`);
    }

    return parts.join("\n");
  });

  return `## Previous Reviews\n\n${sections.join("\n\n")}\n\n`;
}

Spawning Codex CLI

import { spawnSync } from "node:child_process";
import { existsSync, readFileSync } from "node:fs";

function callCodexReview(
  prompt: string,
  cwd: string,
  config: CodexConfig
): ReviewResult {
  // Check availability
  const which = spawnSync("which", ["codex"], { encoding: "utf-8" });
  if (which.status !== 0) {
    console.error("Codex CLI not found, approving by default");
    return { approved: true, issues: [], resolved: [], notes: null };
  }

  const outputFile = `/tmp/codex-review-${Date.now()}.txt`;
  const args: string[] = ["exec", "-"];  // read from stdin

  if (config.bypass_sandbox) {
    args.push("--dangerously-bypass-approvals-and-sandbox");
  } else {
    args.push("--sandbox", config.sandbox ?? "read-only");
    args.push("-c", `approval_policy="${config.approval_policy ?? "never"}"`);
  }

  args.push("-o", outputFile);

  // WARNING: extra_args can override -o, which breaks output parsing
  // Filter out -o/--output to prevent this, or validate config upstream
  if (Array.isArray(config.extra_args)) {
    args.push(...config.extra_args.filter(a =>
      typeof a === "string" && a !== "-o" && !a.startsWith("--output")
    ));
  }

  // NOTE: timeout must be less than hook timeout (typically 1800s for Claude Code hooks)
  const timeoutMs = (config.timeout_seconds ?? 1200) * 1000;

  const result = spawnSync("codex", args, {
    cwd,
    encoding: "utf-8",
    timeout: timeoutMs,
    maxBuffer: 1024 * 1024,
    input: prompt,
  });

  if (!existsSync(outputFile)) {
    console.error("No output file created");
    return { approved: true, issues: [], resolved: [], notes: null };
  }

  const output = readFileSync(outputFile, "utf-8");
  return parseCodexOutput(output);
}

Error Handling

Fail Safe, Not Stuck

When errors occur, default to APPROVE to avoid trapping users in infinite loops:

try {
  return callCodexReview(prompt, cwd, config);
} catch (e) {
  console.error("Codex review failed:", e);
  // Clean up any state files
  // Default to approve - don't block on review failure
  return { approved: true, issues: [], resolved: [], notes: null };
}

Handle Unclear Responses

// REJECT with no parseable issues = auto-approve to prevent deadlock
if (verdict === "REJECT" && issues.length === 0) {
  return {
    approved: true,
    issues: [],
    resolved: [],
    notes: "[AUTO-APPROVED: REJECT with unparseable issues]",
  };
}

// No clear verdict found = approve by default
if (!verdict) {
  return { approved: true, issues: [], resolved: [], notes: null };
}

Feedback Loop Pattern

When review rejects, format structured feedback for the next iteration:

function buildFeedbackPrompt(
  result: ReviewResult,
  originalPrompt: string,
  iteration: number,
  maxIterations: number,
  reviewCycle: number,
  maxReviewCycles: number,
  completionPromise: string
): string {
  const issuesList = result.issues
    .map(i => `- [ISSUE-${i.id}] ${i.severity}: ${i.description}`)
    .join("\n");

  const resolvedSection = result.resolved.length > 0
    ? `\n\n**Resolved from previous cycle:**\n${result.resolved
        .map(r => `- [ISSUE-${r.id}] ✓ ${r.verification}`)
        .join("\n")}`
    : "";

  const notesSection = result.notes
    ? `\n\n**Reviewer notes:** ${result.notes}`
    : "";

  return `# Ralph Loop - Iteration ${iteration}/${maxIterations}

## Review Feedback (Cycle ${reviewCycle}/${maxReviewCycles})

Your previous completion was reviewed and requires changes.
${resolvedSection}

**Open Issues:**
${issuesList}
${notesSection}

Address ALL open issues above, then output <promise>${completionPromise}</promise> when truly complete.

---

${originalPrompt}`;
}

Integration Checklist

  • Check Codex CLI availability before calling
  • Use read-only sandbox for review tasks
  • Set appropriate timeout (20+ minutes for complex reviews)
  • Parse from END of output to avoid prompt echo matches
  • Handle REJECT with no issues as auto-approve
  • Handle unclear responses as approve (fail safe)
  • Track issue IDs across cycles (monotonic, never restart)
  • Include git context in review prompt
  • Include previous review history for multi-cycle reviews
  • Clean up temp files and state on errors
  • Log for debugging (errors, timing, verdict)