name	gemini
description	Use when the user asks to run Gemini CLI (gemini -p) or needs visual analysis, UI testing, web research, frontend compilation checks, image/PDF/video analysis, or MCP server integration

Gemini Skill Guide

To analyze visual content, test frontends, or research packages, run Gemini CLI with prompts and file attachments.

Models Available:

gemini-2.5-pro - Complex reasoning, visual analysis, large context (1M tokens)
gemini-2.5-flash - Quick queries, simple analysis, real-time responses

Approval Modes:

Default (no flags) - Manual approval for each action (safest)
--approval-mode auto_edit - Auto-approve file edits only
--yolo - Auto-approve everything (⚠️ test environments only)

Always ask user which model to use via AskUserQuestion before running. Recommend gemini-2.5-pro for visual/complex tasks, gemini-2.5-flash for quick queries.

Running a Task

Ask the user (via AskUserQuestion) which model to run (gemini-2.5-pro or gemini-2.5-flash) in a single prompt. Recommend gemini-2.5-pro for visual/complex tasks, gemini-2.5-flash for quick queries.
Select the approval mode required for the task; default to no flags (manual approval) unless file edits or auto-execution are necessary.
Assemble the command with appropriate options:
- POSITIONAL PROMPT - Use gemini "prompt text" (NOT -p flag which is deprecated)
- -m, --model <MODEL> - Specify model
- --output-format json - For structured, parseable output
- --approval-mode <default|auto_edit|yolo> - Control destructive operations
- --include-directories <DIRS> - Focus context on specific directories
- @filepath - Attach files (images, PDFs, audio, video, code files, diffs)
For visual tasks: Always attach images using @filepath syntax (e.g., gemini "Review UI" @screenshot.png).
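For code review tasks: Pass file contents via @filepath (e.g., gemini "Review code" @src/file.ts) OR embed contents in the prompt using command substitution (e.g., gemini "Review this diff: $(cat diff.txt)"). Inside double quotes the shell does not turn \n into a newline, so put real line breaks in the prompt if you need them.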
For web research: Results are AI-generated summaries with citations, not raw search results.
Run the command, capture output (parse JSON if using --output-format json), and summarize for the user.
After Gemini completes, inform the user: "You can ask follow-up questions or request additional analysis. Gemini can leverage MCP servers for deeper integration."
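
The assembly steps above can be sketched as a small shell function. This is only an illustration: `run_gemini` is a hypothetical helper name, and always adding `--output-format json` is an assumption made here so the output can be parsed afterwards. Only flags already listed in this guide are used.

```shell
# run_gemini is a hypothetical wrapper, not part of Gemini CLI.
# Usage: run_gemini MODEL APPROVAL_MODE PROMPT [FILE...]
#   MODEL          gemini-2.5-pro or gemini-2.5-flash (the user's choice)
#   APPROVAL_MODE  default, auto_edit, or yolo
run_gemini() {
  if [ "$#" -lt 3 ]; then
    echo "usage: run_gemini MODEL APPROVAL_MODE PROMPT [FILE...]" >&2
    return 2
  fi
  model=$1 approval=$2 prompt=$3
  shift 3

  # Rewrite each remaining argument (a file path) as an @filepath attachment.
  for f in "$@"; do
    shift
    set -- "$@" "@$f"
  done

  # Prompt first, then attachments, then flags, as in the examples in this guide.
  set -- "$prompt" "$@" -m "$model" --output-format json

  # "default" is manual approval and needs no flag.
  if [ "$approval" != "default" ]; then
    set -- "$@" --approval-mode "$approval"
  fi

  gemini "$@"
}
```

For example, `run_gemini gemini-2.5-pro default "Review this UI for accessibility" screenshot.png` would run the screenshot review with manual approval.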

Quick Reference

| Use case | Key flags | Example command |
| --- | --- | --- |
| Analyze screenshot | `@image` | `gemini "Review this UI for accessibility" @screenshot.png` |
| Code review | `@code` or `$(cat)` | `gemini "Review code quality" @src/auth.ts` |
| Review git diff | `$(cat)` or `@file` | `gemini "Review this diff: $(cat /tmp/diff.txt)"` |
| Web research | `--output-format json` | `gemini "React hooks best practices 2025?" --output-format json` |
| Test compilation | `--approval-mode auto_edit` | `gemini "Build project, report errors" --approval-mode auto_edit` |
| Visual regression | `@multiple-images` | `gemini "Compare screenshots" @before.png @after.png` |
| PDF extraction | `@pdf` | `gemini "Extract requirements" @spec.pdf` |
| Frontend health check | `--approval-mode auto_edit --include-directories` | `gemini "Build, check TypeScript, start server, test routes" --approval-mode auto_edit --include-directories src` |
| Parse output | `--output-format json` then `jq` | `result=$(gemini "Analyze deps" --output-format json); echo "$result" \| jq -r '.response'` |

Example: Visual UI Testing

Scenario: Test login page after design changes

# Step 1: Capture current state
gemini "Start dev server and screenshot the login page at /login; save it as current-login.png" --approval-mode auto_edit

# Step 2: Compare with baseline and capture the JSON result
result=$(gemini "Compare this screenshot with baseline, report visual differences" @baseline-login.png @current-login.png --output-format json)

# Step 3: Parse results
echo "$result" | jq -r '.response' > visual-diff-report.md

Example: Web Research → Implementation

Scenario: Find and apply modern state management patterns

# Step 1: Research (returns AI summary with citations)
gemini "Modern React state management for large apps" --output-format json > research.json

# Step 2: Extract recommendations
jq -r '.response' research.json > recommendations.md

# Step 3: Apply learnings
gemini "Refactor state management based on recommendations.md" @recommendations.md @src/store/ --approval-mode auto_edit

Example: Code Review

Scenario: Review code changes on a feature branch

# Step 1: Get diff of changes
git diff main...HEAD > /tmp/branch-diff.txt

# Step 2: Review with Gemini (method 1: embed diff in prompt)
gemini "Review this code diff for quality, bugs, and security issues:

$(cat /tmp/branch-diff.txt)

Focus on:
1. Code quality and best practices
2. Potential bugs or security vulnerabilities
3. Suggestions for improvements" --output-format json > review.json

# Step 3: Parse and display results
jq -r '.response' review.json

# Method 2: save the diff inside the project and attach it with @filepath
git diff main...HEAD > diff.txt
gemini "Review this code diff" @diff.txt
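
A possible refinement of method 1 is to skip the Gemini call when the branch has no changes, which avoids spending quota on an empty diff. The sketch below assumes a hypothetical helper named `review_branch`; the prompt wording is taken from the example above.

```shell
# review_branch is a hypothetical helper, not part of Gemini CLI.
# Usage: review_branch [BASE_BRANCH]   (BASE_BRANCH defaults to main)
review_branch() {
  base=${1:-main}
  # Stop if git itself fails (not a repository, unknown branch, ...).
  diff_text=$(git diff "$base...HEAD") || return 1
  if [ -z "$diff_text" ]; then
    echo "No changes against $base; nothing to review." >&2
    return 0
  fi
  # Embed the diff in the prompt after a real blank line.
  gemini "Review this code diff for quality, bugs, and security issues:

$diff_text" --output-format json
}
```

Very large diffs may exceed the shell's argument-length limit, in which case the @filepath method is the safer choice.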

Common Pitfalls

❌ Don't use deprecated -p flag - Use positional prompt: gemini "prompt" not gemini -p "prompt"
❌ Don't expect Gemini to access file paths directly - Use @filepath or $(cat file) to pass contents
❌ Don't expect raw Google Search results - Gemini returns AI-generated summaries with citations
✅ Do use positional prompts: gemini "your prompt here"
✅ Do pass file contents via @filepath or command substitution
✅ Do use --output-format json for structured responses you can parse with jq

Best Practices

Use modern syntax: Always use positional prompts gemini "prompt", NOT deprecated -p flag
Pass file contents properly:
- For files in project: @filepath syntax (e.g., gemini "Review" @src/file.ts)
- For temp/external files: Command substitution (e.g., gemini "Review: $(cat /tmp/diff.txt)")
- For visual/binary: Always use @filepath (images, PDFs, audio, video)
Context focus: Use --include-directories to limit scope and save tokens
Structured output: Add --output-format json when you need to parse results
Multi-modal: Combine @image.png @code.ts @spec.pdf for comprehensive analysis
Token efficiency: Use gemini-2.5-flash for simple queries to save quota
Working directory: Gemini runs in current directory; cd before invoking if needed
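
The choice between @filepath and command substitution can be sketched as follows. `review_file` is a hypothetical helper, and the path test is a rough heuristic assumed here: absolute or `../` paths are treated as outside the project. The embed branch is only suitable for text files; images, PDFs, audio, and video should always go through @filepath.

```shell
# review_file is a hypothetical helper, not part of Gemini CLI.
# Usage: review_file PROMPT FILE
review_file() {
  prompt=$1 file=$2
  if [ ! -r "$file" ]; then
    echo "cannot read: $file" >&2
    return 1
  fi
  case $file in
    /*|../*)
      # Rough heuristic: absolute or parent-relative paths are treated as
      # outside the project, so embed the (text) contents in the prompt.
      gemini "$prompt

$(cat "$file")"
      ;;
    *)
      # Path relative to the current directory: attach with @filepath.
      gemini "$prompt" "@$file"
      ;;
  esac
}
```

With this sketch, `review_file "Review" src/file.ts` attaches the file, while `review_file "Review" /tmp/diff.txt` embeds its contents.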

When to Use Gemini vs Codex

Use Gemini for:

🖼️ Visual analysis (screenshots, UI, wireframes, designs)
🌐 Web research (package docs, best practices via Google Search)
📱 Frontend testing (compilation, deployment verification)
📄 Multi-modal tasks (images, PDFs, audio, video)
🔧 MCP integration (GitHub, databases, external APIs)

Use Codex for:

💻 Deep code refactoring across multiple files
🏗️ Architectural changes requiring codebase understanding
🔧 Complex backend logic and algorithms

Following Up

After every gemini command, immediately use AskUserQuestion to confirm next steps or collect clarifications.
For follow-ups, run a new gemini "prompt" command each time; unlike Codex, Gemini has no resume feature.
When using --output-format json, parse with jq to extract specific fields:
- Response: jq -r '.response'
- Token usage: jq '.stats.models'
- Errors: jq '.error'
- Tool calls: jq '.stats.tools.byName'
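
The field lookups above can be combined into one helper that checks for an error before printing the response. This is a sketch only: `extract_response` is a hypothetical name, and it assumes the JSON shape listed above (`.response` and `.error` fields) and that jq is installed.

```shell
# extract_response is a hypothetical helper; it needs jq on PATH.
# Reads Gemini's JSON output on stdin and prints the response text,
# or prints the error object to stderr and returns 1.
extract_response() {
  json=$(cat)
  # jq -e exits 0 only when the selected value is neither null nor false,
  # so this branch runs only when an .error field is present.
  if printf '%s\n' "$json" | jq -e '.error' >/dev/null 2>&1; then
    printf '%s\n' "$json" | jq -c '.error' >&2
    return 1
  fi
  printf '%s\n' "$json" | jq -r '.response'
}
```

A typical call would be `gemini "Analyze deps" --output-format json | extract_response`.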

Error Handling

Stop and report failures when gemini exits non-zero; request direction before retrying.
Before using --yolo or --approval-mode auto_edit, ask user permission via AskUserQuestion unless already given.
API quota exceeded → Suggest gemini-2.5-flash or retry later.
MCP connection issues → Suggest checking server status with /mcp command in interactive mode.
Partial results → Summarize warnings and ask how to adjust via AskUserQuestion.
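
The first rule above (stop and report on a non-zero exit) can be sketched as a wrapper. `run_or_report` is a hypothetical helper name; it simply passes its arguments to gemini and never retries on its own.

```shell
# run_or_report is a hypothetical wrapper, not part of Gemini CLI.
# It runs gemini with the given arguments. On a non-zero exit it prints
# the exit status and captured stderr, then returns that status so the
# caller can stop and ask the user for direction instead of retrying.
run_or_report() {
  errfile=$(mktemp) || return 1
  if output=$(gemini "$@" 2>"$errfile"); then
    rc=0
  else
    rc=$?
  fi
  if [ "$rc" -ne 0 ]; then
    echo "gemini exited with status $rc; stopping before any retry." >&2
    cat "$errfile" >&2
  else
    printf '%s\n' "$output"
  fi
  rm -f "$errfile"
  return "$rc"
}
```

Usage follows the normal command line, for example `run_or_report "Build project, report errors" --approval-mode auto_edit`.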

Reference Files

Advanced topics available in separate files:

MCP_INTEGRATION.md - External services (GitHub, databases, custom APIs)
MEMORY_CONTEXT.md - Persistent memory, GEMINI.md files, project context
ADVANCED_CONFIG.md - Approval modes, sandbox execution, tool restrictions
TOOLS_REFERENCE.md - Built-in tools, web search capabilities
