name	gemini-image-generator
description	Generate images using Google Gemini NanoBanana via browser automation. Use this skill for general-purpose AI image generation from text prompts. Includes persistent authentication, automatic environment setup, and reference image support for style matching.

Gemini Image Generator

Gemini NanoBananaを使った汎用AI画像生成スキル。

When to Use This Skill

Trigger when user:

Asks to generate/create images with AI
Mentions "Gemini image", "generate picture", "create artwork"
Requests visual content from text descriptions
Wants to produce illustrations or graphics
Wants to create images matching a reference image's style (NEW!)

For specific use cases, use specialized skills:

LP/セールスレター画像 → gemini-lp-generator
ウェビナースライド → gemini-slide-generator

Quick Start

cd /path/to/gemini-image-generator

# 1. Check authentication
python scripts/run.py auth_manager.py status

# 2. Authenticate (if needed)
python scripts/run.py auth_manager.py setup

# 3. Generate image (basic)
python scripts/run.py image_generator.py \
  --prompt "sunset over mountains, watercolor style" \
  --output output/my_image.png

# 4. Generate with reference image (NEW!)
python scripts/run.py image_generator.py \
  --prompt "犬を描いて" \
  --reference-image "/path/to/reference.png" \
  --output output/styled_dog.png

How It Works

Basic Mode (テキストのみ)

Navigate to gemini.google.com
Click "ツール" (Tools) button
Select "画像を作成" (Create Image) - Activates NanoBanana
Enter prompt and generate
Download generated image

Reference Image Mode (参考画像あり) - NEW!

Upload reference image to Gemini
AI analyzes visual elements (style, colors, lighting, etc.)
Extract analysis as YAML format
Generate optimized meta-prompt
Create new image with matching style

┌────────────────┐     ┌────────────────┐     ┌────────────────┐
│  📷 Reference  │ →   │  📋 YAML       │ →   │  📝 Optimized  │
│     Image      │     │    Analysis    │     │     Prompt     │
└────────────────┘     └────────────────┘     └────────────────┘
                                                      │
                                                      ▼
                                              ┌────────────────┐
                                              │  🖼️ Generated  │
                                              │     Image      │
                                              └────────────────┘

Parameters

Parameter	Required	Default	Description
`--prompt`	Yes	-	Image generation prompt
`--output`	No	`output/generated_image.png`	Output file path
`--reference-image`	No	-	Reference image for style extraction
`--yaml-output`	No	-	Save YAML analysis to file
`--show-browser`	No	False	Show browser for debugging
`--timeout`	No	180	Max wait time in seconds

Prompt Examples

Basic Examples (テキストのみ)

# Landscape
python scripts/run.py image_generator.py \
  --prompt "serene sunset over snow-capped mountains, warm orange sky, photorealistic"

# Art style
python scripts/run.py image_generator.py \
  --prompt "watercolor painting of a cat sitting by window, soft colors"

# Product photo
python scripts/run.py image_generator.py \
  --prompt "professional product photography, white background, soft lighting"

Reference Image Examples (参考画像あり) - NEW!

# Match style of reference image
python scripts/run.py image_generator.py \
  --prompt "犬を描いて" \
  --reference-image "examples/watercolor_cat.png" \
  --output output/watercolor_dog.png

# Save YAML analysis for review
python scripts/run.py image_generator.py \
  --prompt "森の風景" \
  --reference-image "examples/sunset.jpg" \
  --yaml-output output/analysis.yaml \
  --output output/forest.png

# Debug mode with browser visible
python scripts/run.py image_generator.py \
  --prompt "カフェの内装" \
  --reference-image "examples/cozy_room.png" \
  --show-browser \
  --output output/cafe.png

Standalone Tools

# Extract YAML only (without generating image)
python scripts/run.py prompt_extractor.py \
  --image "examples/reference.png" \
  --output analysis.yaml

# Generate prompt from YAML
python scripts/run.py meta_prompt.py \
  --yaml analysis.yaml \
  --request "猫を描いて"

Authentication

This skill manages browser authentication for all Gemini-based skills:

gemini-slide-generator (shares browser profile)
gemini-lp-generator (shares browser profile)

# Check status
python scripts/run.py auth_manager.py status

# Setup (opens browser for Google login)
python scripts/run.py auth_manager.py setup

# Clear session
python scripts/run.py auth_manager.py clear

Troubleshooting

Problem	Solution
Not authenticated	Run `auth_manager.py setup`
Timeout	Increase with `--timeout 300`
UI not found	Use `--show-browser` to debug
Generation refused	Modify prompt (avoid restricted content)

Data Storage

data/browser_profile/ - Browser session (shared with other Gemini skills)
data/state.json - Authentication state
output/ - Generated images

Architecture

scripts/
├── config.py           # Centralized settings
├── browser_utils.py    # BrowserFactory and StealthUtils
├── auth_manager.py     # Authentication management
├── image_generator.py  # Image generation (with reference image support)
├── prompt_extractor.py # Extract visual elements as YAML (NEW!)
├── meta_prompt.py      # Generate optimized prompts from YAML (NEW!)
└── run.py              # Wrapper script for venv

docs/
└── UPGRADE_SPEC.md     # Feature specification with diagrams

Notes

First generation takes longer (browser startup)
Subsequent generations faster (session reuse)
Authentication persists ~7 days
UI selectors may break when Gemini updates

gemini-image-generator

Install Skill

SKILL.md

Gemini Image Generator

When to Use This Skill

Quick Start

How It Works

Basic Mode (テキストのみ)

Reference Image Mode (参考画像あり) - NEW!

Parameters

Prompt Examples

Basic Examples (テキストのみ)

Reference Image Examples (参考画像あり) - NEW!

Standalone Tools

Authentication

Troubleshooting

Data Storage

Architecture

Notes