| name | gemini-image-generator |
| description | Generate images using Google Gemini NanoBanana via browser automation. Use this skill for general-purpose AI image generation from text prompts. Includes persistent authentication, automatic environment setup, and reference image support for style matching. |
Gemini Image Generator
A general-purpose AI image generation skill that uses Gemini NanoBanana.
When to Use This Skill
Trigger when the user:
- Asks to generate/create images with AI
- Mentions "Gemini image", "generate picture", "create artwork"
- Requests visual content from text descriptions
- Wants to produce illustrations or graphics
- Wants to create images matching a reference image's style (NEW!)
For specific use cases, use the specialized skills:
- Landing page (LP) / sales letter images → gemini-lp-generator
- Webinar slides → gemini-slide-generator
Quick Start
cd /path/to/gemini-image-generator
# 1. Check authentication
python scripts/run.py auth_manager.py status
# 2. Authenticate (if needed)
python scripts/run.py auth_manager.py setup
# 3. Generate image (basic)
python scripts/run.py image_generator.py \
--prompt "sunset over mountains, watercolor style" \
--output output/my_image.png
# 4. Generate with reference image (NEW!)
python scripts/run.py image_generator.py \
--prompt "draw a dog" \
--reference-image "/path/to/reference.png" \
--output output/styled_dog.png
How It Works
Basic Mode (text only)
- Navigate to gemini.google.com
- Click the "ツール" (Tools) button
- Select "画像を作成" (Create Image) - Activates NanoBanana
- Enter prompt and generate
- Download generated image
Reference Image Mode (with reference image) - NEW!
- Upload reference image to Gemini
- AI analyzes visual elements (style, colors, lighting, etc.)
- Extract the analysis as YAML
- Generate optimized meta-prompt
- Create new image with matching style
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│  📷 Reference  │ → │  📋 YAML       │ → │  📝 Optimized  │
│     Image      │   │     Analysis   │   │     Prompt     │
└────────────────┘   └────────────────┘   └────────────────┘
                                                  │
                                                  ▼
                                          ┌────────────────┐
                                          │  🖼️ Generated  │
                                          │     Image      │
                                          └────────────────┘
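The last two steps of the reference-image flow (analysis in, optimized prompt out) can be sketched in a few lines. This is only an illustration under stated assumptions: the analysis keys (`style`, `colors`, `lighting`) and the function name `build_meta_prompt` are hypothetical, and the real meta_prompt.py may structure its prompt quite differently.

```python
# Hypothetical sketch: merge a style analysis with the user's request.
# The dict stands in for the parsed YAML analysis; its keys are assumed.

def build_meta_prompt(analysis: dict, request: str) -> str:
    """Return a prompt that asks for `request` in the analyzed style."""
    parts = [request.strip()]
    # Append only the visual elements that the analysis actually contains.
    for key in ("style", "colors", "lighting"):
        value = analysis.get(key)
        if not value:
            continue
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        parts.append(f"{key}: {value}")
    return "; ".join(parts)


analysis = {
    "style": "watercolor",
    "colors": ["soft blue", "warm orange"],
    "lighting": "diffuse morning light",
}
prompt = build_meta_prompt(analysis, "a dog sitting by a window")
print(prompt)
```

Keeping the request first and the extracted attributes after it is one possible ordering; it makes the subject explicit while the style details act as modifiers.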
Parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
| --prompt | Yes | - | Image generation prompt |
| --output | No | output/generated_image.png | Output file path |
| --reference-image | No | - | Reference image for style extraction |
| --yaml-output | No | - | Save YAML analysis to file |
| --show-browser | No | False | Show browser for debugging |
| --timeout | No | 180 | Max wait time in seconds |
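For readers who want to see how these flags and defaults fit together, the table can be mirrored with `argparse`. This is a sketch built only from the table above, not a copy of image_generator.py; the function name `build_parser` is hypothetical, and treating `--show-browser` as a value-less switch is an assumption based on how the flag is used in the examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the parameter table (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(prog="image_generator.py")
    parser.add_argument("--prompt", required=True,
                        help="Image generation prompt")
    parser.add_argument("--output", default="output/generated_image.png",
                        help="Output file path")
    parser.add_argument("--reference-image", default=None,
                        help="Reference image for style extraction")
    parser.add_argument("--yaml-output", default=None,
                        help="Save YAML analysis to file")
    # Assumed to be a switch: present means True, absent means False.
    parser.add_argument("--show-browser", action="store_true",
                        help="Show browser for debugging")
    parser.add_argument("--timeout", type=int, default=180,
                        help="Max wait time in seconds")
    return parser


# Only --prompt is supplied, so every other value falls back to its default.
args = build_parser().parse_args(["--prompt", "sunset over mountains"])
print(args.output, args.timeout, args.show_browser)
```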
Prompt Examples
Basic Examples (text only)
# Landscape
python scripts/run.py image_generator.py \
--prompt "serene sunset over snow-capped mountains, warm orange sky, photorealistic"
# Art style
python scripts/run.py image_generator.py \
--prompt "watercolor painting of a cat sitting by window, soft colors"
# Product photo
python scripts/run.py image_generator.py \
--prompt "professional product photography, white background, soft lighting"
Reference Image Examples (with reference image) - NEW!
# Match style of reference image
python scripts/run.py image_generator.py \
--prompt "draw a dog" \
--reference-image "examples/watercolor_cat.png" \
--output output/watercolor_dog.png
# Save YAML analysis for review
python scripts/run.py image_generator.py \
--prompt "forest landscape" \
--reference-image "examples/sunset.jpg" \
--yaml-output output/analysis.yaml \
--output output/forest.png
# Debug mode with browser visible
python scripts/run.py image_generator.py \
--prompt "cafe interior" \
--reference-image "examples/cozy_room.png" \
--show-browser \
--output output/cafe.png
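When several images are needed, the commands above can be assembled from Python instead of being typed by hand. The following is a minimal sketch, assuming it runs from the skill's root directory; `build_command` and `generate` are hypothetical helpers, and using `sys.executable` in place of `python` is a choice made here, not something the skill prescribes. What the script's exit code means is not documented, so the sketch only passes it through.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(
    prompt: str,
    output: str,
    reference_image: Optional[str] = None,
    yaml_output: Optional[str] = None,
    show_browser: bool = False,
    timeout: Optional[int] = None,
) -> List[str]:
    """Assemble the argument list for one image_generator.py run."""
    cmd = [sys.executable, "scripts/run.py", "image_generator.py",
           "--prompt", prompt, "--output", output]
    # Optional flags are added only when the caller asks for them.
    if reference_image:
        cmd += ["--reference-image", reference_image]
    if yaml_output:
        cmd += ["--yaml-output", yaml_output]
    if show_browser:
        cmd.append("--show-browser")
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    return cmd


def generate(**kwargs) -> int:
    """Run one generation and return the process exit code unchanged."""
    return subprocess.run(build_command(**kwargs)).returncode


cmd = build_command(
    prompt="forest landscape",
    output="output/forest.png",
    reference_image="examples/sunset.jpg",
    yaml_output="output/analysis.yaml",
)
print(cmd[1:])
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.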
Standalone Tools
# Extract YAML only (without generating image)
python scripts/run.py prompt_extractor.py \
--image "examples/reference.png" \
--output analysis.yaml
# Generate prompt from YAML
python scripts/run.py meta_prompt.py \
--yaml analysis.yaml \
--request "draw a cat"
Authentication
This skill manages browser authentication for all Gemini-based skills:
- gemini-slide-generator (shares browser profile)
- gemini-lp-generator (shares browser profile)
# Check status
python scripts/run.py auth_manager.py status
# Setup (opens browser for Google login)
python scripts/run.py auth_manager.py setup
# Clear session
python scripts/run.py auth_manager.py clear
Troubleshooting
| Problem | Solution |
|---|---|
| Not authenticated | Run python scripts/run.py auth_manager.py setup |
| Timeout | Increase with --timeout 300 |
| UI not found | Use --show-browser to debug |
| Generation refused | Modify prompt (avoid restricted content) |
Data Storage
- data/browser_profile/ - Browser session (shared with other Gemini skills)
- data/state.json - Authentication state
- output/ - Generated images
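A quick way to see which of these locations exist is a small pre-flight check. This is a sketch only: `storage_report` is a hypothetical helper, and the presence of the files says nothing about whether the session is still valid, so `auth_manager.py status` remains the authoritative check. The demo builds a throwaway directory so that it runs anywhere.

```python
import tempfile
from pathlib import Path


def storage_report(root: Path) -> dict:
    """Report which of the documented storage paths exist under `root`."""
    return {
        "browser_profile": (root / "data" / "browser_profile").is_dir(),
        "state_file": (root / "data" / "state.json").is_file(),
        "output_dir": (root / "output").is_dir(),
    }


# Demo in a temporary directory that has a profile and state file
# but no output/ directory yet.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data" / "browser_profile").mkdir(parents=True)
    (root / "data" / "state.json").write_text("{}")
    demo = storage_report(root)
print(demo)
```

In practice the skill's root directory would be passed instead, for example `storage_report(Path("/path/to/gemini-image-generator"))`.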
Architecture
scripts/
├── config.py # Centralized settings
├── browser_utils.py # BrowserFactory and StealthUtils
├── auth_manager.py # Authentication management
├── image_generator.py # Image generation (with reference image support)
├── prompt_extractor.py # Extract visual elements as YAML (NEW!)
├── meta_prompt.py # Generate optimized prompts from YAML (NEW!)
└── run.py # Wrapper script for venv
docs/
└── UPGRADE_SPEC.md # Feature specification with diagrams
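The role of run.py as a venv wrapper can be illustrated as follows. This is a guess at the general pattern rather than the file's actual contents: the venv directory name `.venv` and both function names are assumptions, and the real wrapper may also create the environment and install dependencies.

```python
import os
from pathlib import Path
from typing import List


def venv_python(skill_root: Path, venv_name: str = ".venv") -> Path:
    """Return the interpreter path inside the venv (layout differs on Windows)."""
    if os.name == "nt":
        return skill_root / venv_name / "Scripts" / "python.exe"
    return skill_root / venv_name / "bin" / "python"


def wrapped_command(skill_root: Path, script: str, args: List[str]) -> List[str]:
    """Build the command that runs scripts/<script> with the venv interpreter."""
    return [
        str(venv_python(skill_root)),
        str(skill_root / "scripts" / script),
        *args,
    ]


cmd = wrapped_command(Path("."), "auth_manager.py", ["status"])
print(cmd)
```

A wrapper along these lines would hand the resulting list to `subprocess.run`, so that each script executes with the venv's packages regardless of which Python launched run.py.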
Notes
- The first generation takes longer because the browser has to start
- Subsequent generations are faster because the session is reused
- Authentication persists for about 7 days
- UI selectors may break when Gemini updates its interface