---
name: consulting-peer-llms
description: Use when user explicitly requests feedback from other LLM tools (Gemini, Codex) on current work - executes peer reviews in parallel and synthesizes responses into actionable insights
---
# Consulting Peer LLMs

## Overview
Get feedback from other LLM CLI tools (Gemini, Codex) on your current work. This skill executes multiple LLM reviews in parallel and synthesizes their responses into one actionable report.
**Core principle:** Use peer LLMs for external validation and diverse perspectives on implementation quality.
## When to Use

Use this skill when the user explicitly requests:
- "Validate this with codex"
- "Get feedback from gemini"
- "I want a review from other LLMs"
- "Do a peer review"
**Do NOT use this skill:**
- Automatically without user request
- For every piece of code (it's heavyweight)
- When quick internal review is sufficient
## Prerequisites

**Supported CLI tools:**
- `gemini` - Google Gemini CLI tool
- `codex` - OpenAI Codex CLI tool
- `claude` - Anthropic Claude CLI tool
- (Add more as needed)

**Detection and usage:**
- With arguments: Use only the specified CLIs (e.g., `/review gemini codex`)
- Without arguments: Auto-detect all installed CLIs and use them
## Quick Start

**Minimal invocation:**

```
User: "Review this code with gemini and codex"

You:
1. Collect context (conversation + git changes)
2. Generate review prompt
3. Execute both CLIs in parallel
4. Show raw Gemini response
5. Show raw Codex response
6. Synthesize final assessment
```
## Workflow

### Step 0: Determine Target CLIs

```bash
# Arguments passed to /review command
REQUESTED_CLIS=("$@")

if [ ${#REQUESTED_CLIS[@]} -eq 0 ]; then
  # No arguments - default to the standard set
  TARGET_CLIS=("gemini" "codex" "claude")
  echo "Using default CLIs: ${TARGET_CLIS[*]}"
else
  # Arguments provided - use the specified CLIs
  TARGET_CLIS=("${REQUESTED_CLIS[@]}")
  echo "Using specified CLIs: ${TARGET_CLIS[*]}"
fi
```
**Key behaviors:**
- No arguments → Use all installed CLIs (see the detection sketch below)
- With arguments → Use only the specified CLIs (skip if not installed)
- Always synthesize a report (even for a single CLI)
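
Both the auto-detect and "skip if not installed" behaviors boil down to checking `PATH`. A minimal filtering sketch, assuming each CLI is invocable by the name stored in `TARGET_CLIS`:

```bash
# Keep only the CLIs that are actually installed (found on PATH)
AVAILABLE_CLIS=()
for cli in "${TARGET_CLIS[@]}"; do
  if command -v "$cli" >/dev/null 2>&1; then
    AVAILABLE_CLIS+=("$cli")
  else
    echo "Skipping $cli: not installed"
  fi
done
TARGET_CLIS=("${AVAILABLE_CLIS[@]}")
```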
### Step 1: Collect Context

Gather comprehensive context from the current session:

**Work Summary:**
- What was implemented (extract from recent conversation)
- User's stated purpose and requirements
- Any mentioned constraints or design decisions
**Code Changes:**

```bash
# Get git information
BASE_SHA=$(git rev-parse origin/main 2>/dev/null || git rev-parse origin/master 2>/dev/null || echo "HEAD~1")
CURRENT_SHA=$(git rev-parse HEAD)

# Note: We do not pass file lists or diffs explicitly.
# Peer agents are expected to check `git status` or `git diff $BASE_SHA..$CURRENT_SHA` themselves.
```
**Project Context:**
- Programming language and frameworks (from files; see the sketch below)
- Key architecture patterns (from memory if available)
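
A rough heuristic sketch for inferring the stack, keyed on common marker files (the list here is illustrative, not exhaustive):

```bash
# Infer the primary stack from marker files in the repo root
[ -f package.json ]   && echo "Stack: Node.js/JavaScript"
[ -f pyproject.toml ] && echo "Stack: Python"
[ -f Cargo.toml ]     && echo "Stack: Rust"
[ -f go.mod ]         && echo "Stack: Go"
```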
### Step 2: Generate Review Prompt

Use a structured prompt based on code-reviewer patterns.

**Full template:** See reference/prompt-template.md

**Key sections:**
- Output language (if project/conversation uses non-English)
- What was implemented
- Requirements/plan
- Changes (base and current SHAs; per the Step 1 note, peers inspect the file list and diff themselves)
- Review focus areas (quality, architecture, testing, requirements)
- Expected output format (critical/important/minor issues + strengths)
**Language detection:**
- Automatically detect the conversation language (Korean, Japanese, etc.)
- Add an "Output Language" section if a non-English language is detected
- Omit it if English or no clear preference
- See the template for the detection logic (a crude shell proxy is sketched below)
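
As a crude proxy only (the real logic lives in the template), any non-ASCII character in the recent conversation hints at a non-English preference. `CONVERSATION_SUMMARY` is a hypothetical variable holding that text:

```bash
# Crude proxy: non-ASCII characters suggest a non-English conversation
# (strip tabs/newlines first so they don't trigger false positives)
if printf '%s' "$CONVERSATION_SUMMARY" | tr -d '\t\n' | LC_ALL=C grep -q '[^ -~]'; then
  echo "Non-English text detected: add an Output Language section"
fi
```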
### Step 3: Execute in Parallel

Run the target CLIs simultaneously using bash background execution:

```bash
# Generate full prompt
# NOTE: The agent executing this must replace {What Was Implemented} and {Requirements} with actual context.
PROMPT=$(cat <<EOF
# Code Review Request
## Context
- **Implemented**: {Extract from conversation context}
- **Requirements**: {User's stated requirements}
## Changes
Please check the changes between **$BASE_SHA** and **$CURRENT_SHA** using your git tools (e.g., \`git diff $BASE_SHA..$CURRENT_SHA\`).
## Review Focus
Please evaluate:
1. **Critical Issues**: Bugs, Security, Data Loss.
2. **Code Quality**: Maintainability, Error Handling.
3. **Architecture**: Design soundness, Scalability.
## Output Constraints
- **NO Thinking Process**: Do not include internal thinking or logs.
- **Concise**: Focus on actionable feedback.
EOF
)
# Prepare result storage
declare -A CLI_RESULTS
declare -A CLI_PIDS
declare -A CLI_FILES

# Timeout settings
TIMEOUT="600s"  # 10 minutes for all CLIs

# Launch all target CLIs in parallel
for cli in "${TARGET_CLIS[@]}"; do
  OUTPUT_FILE="/tmp/${cli}-review.txt"
  CLI_FILES[$cli]="$OUTPUT_FILE"

  # Execute based on CLI type (each has different command syntax)
  case "$cli" in
    gemini)
      timeout "$TIMEOUT" gemini "$PROMPT" > "$OUTPUT_FILE" 2>/dev/null &
      ;;
    codex)
      timeout "$TIMEOUT" codex exec "$PROMPT" > "$OUTPUT_FILE" 2>/dev/null &
      ;;
    claude)
      timeout "$TIMEOUT" claude "$PROMPT" > "$OUTPUT_FILE" 2>/dev/null &
      ;;
    *)
      echo "Warning: Unknown CLI $cli, attempting generic execution"
      timeout "$TIMEOUT" "$cli" "$PROMPT" > "$OUTPUT_FILE" 2>/dev/null &
      ;;
  esac

  CLI_PIDS[$cli]=$!
done

# Wait for all CLIs to complete and collect results
for cli in "${TARGET_CLIS[@]}"; do
  wait "${CLI_PIDS[$cli]}"
  EXIT_CODE=$?
  OUTPUT_FILE="${CLI_FILES[$cli]}"

  if [ $EXIT_CODE -eq 0 ] && [ -s "$OUTPUT_FILE" ]; then
    CLI_RESULTS[$cli]=$(cat "$OUTPUT_FILE")
  else
    CLI_RESULTS[$cli]="[${cli} CLI failed or returned empty response]"
  fi
done

# Cleanup temp files
for cli in "${TARGET_CLIS[@]}"; do
  rm -f "${CLI_FILES[$cli]}"
done
```
**Key points:**
- Dynamic execution based on the `TARGET_CLIS` array
- Each CLI may have different command syntax (handled in the case statement)
- Parallel execution for efficiency
- Graceful handling of failures

**Detailed CLI usage:** See reference/cli-commands.md
### Step 4: Present Raw Responses

Show the original responses first for transparency:

```bash
# Display each CLI result
for cli in "${TARGET_CLIS[@]}"; do
  echo "# ${cli^} Review"
  echo ""
  echo "${CLI_RESULTS[$cli]}"
  echo ""
  echo "---"
  echo ""
done
```
**Output format example:**

```markdown
# Gemini Review

{Gemini response content}

---

# Codex Review

{Codex response content}

---

# Claude Review

{Claude response content}

---
```
### Step 5: Synthesize Final Report

Analyze all CLI responses and generate a synthesized assessment.

**IMPORTANT:** Always create a synthesized report, even for a single CLI. Analyze and structure the feedback regardless of how many CLIs were executed.

**Report structure:**

```markdown
# Final Assessment
## Critical Issues Requiring Immediate Attention
{Synthesized critical issues - bugs, security, data loss}
- **Issue**: {Clear description}
- **Impact**: {Why it's critical}
- **Location**: {file:line references}
- **Recommendation**: {Specific fix}
## Architecture & Design Concerns
{Architectural improvements needed}
## Code Quality Issues
{Code quality improvements}
- Error handling
- Edge cases
- Maintainability
## Testing Gaps
{Test coverage or strategy problems}
## Security & Performance Considerations
{Security and performance items}
## What's Working Well
{Positive feedback}
## Actionable Recommendations
1. **Immediate**: {Fix right now}
2. **Before Merge**: {Must handle before merging}
3. **Future Enhancement**: {Consider for later}
## Summary
{Overall assessment and next steps}
```

**Synthesis principles:**
- **Always synthesize**: Even single-CLI responses get analyzed and structured
- **Consolidate duplicates**: The same issue mentioned by multiple CLIs = one entry
- **Filter for validity**: Only include legitimate concerns
- **Prioritize by impact**: Not by which or how many LLMs mentioned it
- **Make actionable**: Concrete recommendations, not vague advice
- **Remove noise**: Focus on essentials
- **Add context**: Analyze and add insights beyond the raw responses
**Example reports:** See reference/report-format.md
## Error Handling

**Some CLIs fail:**
- Continue with the successful CLIs
- Note failures in the final report (see the sketch below)
- Still provide a synthesized assessment from the available feedback
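
A minimal sketch for surfacing failures in the report, keyed on the sentinel string set in Step 3:

```bash
# Collect CLIs whose result carries the failure sentinel from Step 3
FAILED_CLIS=()
for cli in "${TARGET_CLIS[@]}"; do
  if [[ "${CLI_RESULTS[$cli]}" == "[${cli} CLI failed"* ]]; then
    FAILED_CLIS+=("$cli")
  fi
done
if [ ${#FAILED_CLIS[@]} -gt 0 ]; then
  echo "Note: no feedback received from: ${FAILED_CLIS[*]}"
fi
```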
**Timeout issues (exit code 124):**
- The default timeout is 600s (10 minutes): `timeout 600s` allows time for complex reviews
- Reduce prompt size if reviews still time out: focus on key changes only
- Check CLI responsiveness: `time gemini "test"`

**No retries:** Keep execution fast and simple.
## Context Window Management

Keep prompts focused to avoid token bloat (a gathering sketch follows these lists):

**Include:**
- Last 5-10 conversation exchanges (recent context)
- Git diff summary (`--stat` output) to inform the work summary
- List of changed files
- Key implementation details from the conversation

**Exclude:**
- Full git diff (too verbose)
- Entire file contents (unless critical)
- Historical conversation (only recent exchanges)
- Unrelated project files
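
A minimal gathering sketch, reusing `BASE_SHA`/`CURRENT_SHA` from Step 1. Per the note there, this feeds your own work summary; the peer agents fetch the full diff themselves:

```bash
# Compact change summary for building the work summary - never the full diff
CHANGED_FILES=$(git diff --name-only "$BASE_SHA..$CURRENT_SHA")
DIFF_STAT=$(git diff --stat "$BASE_SHA..$CURRENT_SHA")
echo "$DIFF_STAT"
```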
## Integration with Other Skills

**Typical workflow:**
1. Implement feature with Claude
2. User requests: "Validate with gemini"
3. Use this skill → Get peer feedback
4. Address critical/important issues
5. Use `managing-git-workflow` to commit/push/PR

**Complements:**
- Internal code review processes
- Test-driven development
- Systematic debugging
## Common Patterns

**Before major commit:**

```
User: "I want a review from codex before committing"
→ Run: /consulting-peer-llms:review codex
→ Analyze synthesized feedback
→ Address issues
→ Then commit
```

**Second opinion:**

```
User: "Ask gemini if this architecture is okay"
→ Run: /consulting-peer-llms:review gemini
→ Evaluate synthesized feedback
→ Refine if needed
```

**Cross-validation:**

```
User: "Check if other LLMs agree"
→ Run: /consulting-peer-llms:review   (auto-detect all)
→ Check consensus in synthesized report
```

**Specific multi-model review:**

```
User: "Get feedback from both gemini and claude"
→ Run: /consulting-peer-llms:review gemini claude
→ Compare perspectives in synthesized report
```
## Red Flags - STOP Immediately

If you catch yourself doing these, STOP:
- ❌ Running peer review without explicit user request
- ❌ Skipping raw response output (always show originals first)
- ❌ Just showing raw responses without synthesis
- ❌ Skipping synthesis for a single CLI (always synthesize!)
- ❌ Including the full git diff in the prompt (use a summary)
- ❌ Forgetting to check CLI exit codes
- ❌ Not cleaning up temp files (see the `trap` sketch below)
- ❌ Hardcoding the CLI list instead of using the dynamic `TARGET_CLIS`
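
One way to make temp-file cleanup robust even when a run is interrupted - a sketch, not part of the workflow above:

```bash
# Remove temp files on any exit path, including Ctrl-C and timeouts
cleanup() {
  for cli in "${TARGET_CLIS[@]}"; do
    rm -f "/tmp/${cli}-review.txt"
  done
}
trap cleanup EXIT
```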
## Limitations

This skill does NOT:
- Automatically trigger on every code change
- Replace internal review processes
- Guarantee 100% correct feedback (LLMs can be wrong)
- Handle interactive CLI prompts (only single-shot commands)
**Remember:**
- Peer LLM feedback is additional perspective, not absolute truth
- Critical issues from LLMs should still be validated
- Different models have different strengths/weaknesses
## Troubleshooting

**"Empty response from CLI"**
- Check the CLI can run at all: `gemini "test"`
- Verify API keys/auth if required
- Check the prompt isn't too long

**"All CLIs failed"**
- Run diagnostics: `gemini --version && codex --version`
- Check system logs for errors
- Verify network connectivity if cloud-based
"Response is truncated"
- CLIs may have output limits
- Try reducing prompt length
- Focus on specific concerns in prompt
## Quick Reference

**Typical execution time:** 10-30 seconds when run in parallel (complex reviews can take longer; hence the 600s timeout)

**Command usage:**
- `/consulting-peer-llms:review` - Auto-detect all installed CLIs
- `/consulting-peer-llms:review gemini` - Use Gemini only
- `/consulting-peer-llms:review gemini codex` - Use Gemini and Codex
- `/consulting-peer-llms:review gemini codex claude` - Use all three
**Output stages:**
- Raw responses from each executed CLI
- Synthesized final report (always, even for single CLI)
**Temp files used:** `/tmp/{cli-name}-review.txt` (one per CLI)

**Git commands:**

```bash
git rev-parse HEAD~1               # Base SHA (fallback; prefer origin/main as in Step 1)
git rev-parse HEAD                 # Current SHA
git diff --name-only $BASE..$HEAD  # Changed files
git diff --stat $BASE..$HEAD       # Change summary
```
## See Also
- reference/prompt-template.md - Full prompt structure
- reference/cli-commands.md - CLI execution details
- reference/report-format.md - Report examples