name

gemini-image-gen

description

Generate and edit images using Google's Gemini API (gemini-3-pro-image-preview model). Use when users request (1) Generating images from text prompts, (2) Editing existing images with AI instructions, (3) Creating images with specific styles or templates, (4) Generating multiple variations of images, (5) Creating images with reference images for style consistency, (6) Any image generation task mentioning Gemini, Google AI, or requiring professional image output. Supports aspect ratios (1:1, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9), 7 style templates (3 glass styles, 4 other creative styles), reference images (up to 14), and batch processing.

Gemini Image Generation Skill

Generate and edit professional images using Google's Gemini API with support for style templates, reference images, and batch processing.

Overview

This skill enables image generation and editing through Google's gemini-3-pro-image-preview model. It provides:

Text-to-Image generation - Create images from text descriptions
Image editing - Modify existing images with text instructions
Style templates - Apply consistent styling using markdown templates with {subject} placeholders
Reference images - Use up to 14 reference images for style/composition guidance
Batch processing - Generate multiple variations in a single command
Aspect ratio control - Choose from 8 aspect ratios for different use cases

Use this skill when users request image creation, editing, or generation tasks involving Google's Gemini AI.

Prerequisites & Setup

API Key Acquisition

Visit https://aistudio.google.com/apikey
Sign in with your Google account
Create a new API key
Copy the key immediately

Note: Paid tier is recommended to avoid rate limits in production.

API Key Configuration

Option 1: Environment Variable (Recommended)

# Linux/macOS - add to ~/.bashrc or ~/.zshrc
export GOOGLE_AI_API_KEY="your_key_here"

# Windows PowerShell - add to $PROFILE
$env:GOOGLE_AI_API_KEY="your_key_here"

Option 2: .env File (Project-specific)

Create a .env file in the user's working directory:

GOOGLE_AI_API_KEY=your_key_here

The script automatically loads .env files using python-dotenv.

Installing uv (Package Manager)

This skill requires uv for dependency management:

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or on macOS
brew install uv

Installing Dependencies

Dependencies install automatically on first run!

When you first use the skill, it will detect missing dependencies and run uv sync automatically. You'll see:

Installing required dependencies... (one-time setup)
Dependencies installed successfully!
Please run the command again.

Manual installation (optional):

cd <skill-dir>/scripts
uv sync

This installs:

google-genai>=1.0.0 (Gemini API client)
python-dotenv>=1.0.0 (environment variable loading)
pillow>=10.0.0 (image processing)

Note: Python 3.10 or higher is required.

Core Workflow

Path Resolution

Always use absolute paths to the skill directory when invoking the script:

# You will resolve <skill-dir> to the actual installation path
uv run python <skill-dir>/scripts/main.py <args>

Output files are saved relative to the user's current working directory.

Basic Command Structure

uv run python <skill-dir>/scripts/main.py <output> <prompts...> [OPTIONS]

Arguments:

<output> - Output file path (e.g., images/output.png, images/icon.png)
<prompts...> - One or more subject prompts

Default Output Location: Images are saved to images/ folder in the project root

Options:

--style <path> or -s - Path to style template .md file
--edit <path> or -e - Edit existing image instead of generating
--ref <path> or -r - Reference image (repeatable, up to 14 total)
--aspect <ratio> or -a - Aspect ratio (default: 1:1)

Decision Tree: Generate vs Edit

Generate new image:

User wants to create something from scratch
Use prompts to describe the desired image
Optionally use --style and --ref

Edit existing image:

User has an image to modify
Use --edit <image-path>
Provide prompt describing the changes
Optionally use --ref for style guidance

Command Reference

Basic Generation

uv run python <skill-dir>/scripts/main.py images/sunset.png \
  "A vibrant sunset over mountain peaks with orange and purple sky" \
  --aspect 16:9

Output: Creates images/sunset.png in project root

Using Bundled Style Templates

# Purple glass 3D style
uv run python <skill-dir>/scripts/main.py images/rocket.png "rocket" \
  --style <skill-dir>/assets/styles/purple_glass_3d.md

# Neon wireframe style
uv run python <skill-dir>/scripts/main.py images/gear.png "gear" \
  --style <skill-dir>/assets/styles/neon_wireframe.md

# Gold metallic 3D style
uv run python <skill-dir>/scripts/main.py images/cube.png "cube" \
  --style <skill-dir>/assets/styles/gold_metallic_3d.md

What happens:

Template loads with full prompt template
{subject} placeholder replaced with your subject
Full prompt used for generation

Output: Styled image matching the template aesthetic in images/ folder

Batch Processing (Multiple Subjects)

uv run python <skill-dir>/scripts/main.py images/icon.png \
  "cube" "sphere" "pyramid" "cylinder" \
  --style <skill-dir>/assets/styles/emerald_glass_3d.md

Output: Creates images/icon_1.png, images/icon_2.png, images/icon_3.png, images/icon_4.png in emerald glass style

Image Editing

uv run python <skill-dir>/scripts/main.py logo_edited.png \
  "Change the background to solid blue (#0066CC)" \
  --edit logo.png

Requirements:

logo.png must exist in current directory (or provide full path)
Edit prompt describes the desired change

Output: Creates logo_edited.png with modified background

Using Reference Images

uv run python <skill-dir>/scripts/main.py gear.png \
  "A gear icon matching the style of the reference" \
  --ref example.png

Use case: Match style/aesthetics of an existing image

With bundled examples:

uv run python <skill-dir>/scripts/main.py icon.png "CPU chip" \
  --ref <skill-dir>/assets/styles/purple_glass_3d/examples/1.png

Aspect Ratio Selection

Note: The default aspect ratio is 1:1 (square). Specify --aspect to use a different ratio.

# YouTube thumbnail
uv run python <skill-dir>/scripts/main.py thumbnail.png \
  "AI coding tutorial thumbnail with vibrant colors" \
  --aspect 16:9

# Instagram post
uv run python <skill-dir>/scripts/main.py insta_post.png \
  "Product photo on clean white background" \
  --aspect 1:1

# Vertical video
uv run python <skill-dir>/scripts/main.py story.png \
  "Mobile-first vertical design" \
  --aspect 9:16

Maximum Consistency (Template + References)

uv run python <skill-dir>/scripts/main.py icon.png "database" \
  --style <skill-dir>/assets/styles/amber_glass_3d.md \
  --ref <skill-dir>/assets/styles/amber_glass_3d/examples/1.png \
  --ref <skill-dir>/assets/styles/amber_glass_3d/examples/2.png

Result: Highest possible consistency with established style

Style Templates

Using Bundled Templates

This skill includes 14 professional style templates:

Glass Styles (3D frosted glass with rim lighting):

purple_glass_3d.md - Royal purple (#7C3AED) glass with violet rim lighting
emerald_glass_3d.md - Deep emerald (#059669) glass with lime green rim lighting
amber_glass_3d.md - Rich amber (#D97706) glass with golden rim lighting

Other Styles: 4. neon_wireframe.md - Hot pink/cyan glowing wireframe outlines on black 5. gold_metallic_3d.md - Brushed gold metal (#D4AF37) with warm highlights 6. minimalist_flat.md - Soft pastel flat 2D design on white background 7. gradient_holographic.md - Iridescent purple-pink-cyan gradients on white

Usage:

--style <skill-dir>/assets/styles/purple_glass_3d.md
--style <skill-dir>/assets/styles/neon_wireframe.md
--style <skill-dir>/assets/styles/gold_metallic_3d.md

Creating Custom Templates

Create a .md file with this structure:

# My Style Name

## Prompt Template

Your detailed prompt here with {subject} placeholder. All styling, colors, materials described explicitly. Background treatment, lighting, composition specified.

Key requirements:

Include {subject} placeholder where the variable content goes
Be explicit about colors (use hex codes)
Describe materials, lighting, and composition
List prohibitions (NO gradients, NO text, etc.)

Example custom template:

## Prompt Template

Premium photograph of {subject}. Professional studio lighting with soft shadows. Clean white background (#FFFFFF). High-resolution DSLR quality. NO text, NO graphics, NO distractions.

Template Path Resolution

Bundled templates:

--style <skill-dir>/assets/styles/purple_glass_3d.md
--style <skill-dir>/assets/styles/neon_wireframe.md
--style <skill-dir>/assets/styles/gold_metallic_3d.md

User's custom template:

--style ./my-styles/custom-style.md
--style /absolute/path/to/template.md

How Templates Work

The script:

Reads the template file
Finds the code block after ## Prompt Template
Replaces {subject} with your provided subject
Uses the complete prompt for generation

Example:

Template: "Premium 3D {subject} made of glass"
Your subject: "cube"
Final prompt: "Premium 3D cube made of glass"

Reference Images

When to Use References

Style consistency:

Match aesthetics of existing images
Maintain brand visual identity
Replicate specific rendering technique

Composition guidance:

Specific layout or arrangement
Particular perspective or angle
Element positioning requirements

Content specification:

Exact shapes or icons to include
Specific objects to emulate
Visual elements to incorporate

Important: WHAT vs HOW

Reference images define WHAT:

Specific shapes, icons, symbols
Objects and elements to include
Compositional layout and arrangement

Text prompts define HOW:

Visual style and aesthetics
Material properties (glass, metal, etc.)
Colors, lighting, rendering technique
Background treatment

Reference Image Limits

Maximum: 14 reference images per request
Edit mode: 13 references + 1 main image = 14 total
More isn't always better: Start with 1-2, add more only if consistency requires it

Using Bundled Examples

Each style template includes 3 example images for reference consistency:

# Purple glass examples
--ref <skill-dir>/assets/styles/purple_glass_3d/examples/1.png
--ref <skill-dir>/assets/styles/purple_glass_3d/examples/2.png
--ref <skill-dir>/assets/styles/purple_glass_3d/examples/3.png

# Neon wireframe examples
--ref <skill-dir>/assets/styles/neon_wireframe/examples/1.png

# Gold metallic examples
--ref <skill-dir>/assets/styles/gold_metallic_3d/examples/1.png

Use these to maintain consistency with your chosen style.

Multiple References

uv run python <skill-dir>/scripts/main.py output.png "rocket" \
  --ref example1.png \
  --ref example2.png \
  --ref example3.png \
  --style <skill-dir>/assets/styles/gradient_holographic.md

Priority: Earlier references have stronger influence on the output.

Error Handling

Missing API Key

Error message:

Error: GOOGLE_AI_API_KEY not found in environment
Create a .env file with: GOOGLE_AI_API_KEY=your_key_here
Get your API key from: https://aistudio.google.com/apikey

Your response:

Provide step-by-step API key setup instructions
Offer to create .env file in user's working directory
Link to API key acquisition URL
Remind about paid tier recommendation

Invalid Aspect Ratio

Error message:

error: argument --aspect/-a: invalid choice: '5:3'

Your response:

Show the valid aspect ratios: 1:1, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Suggest the closest match to what user requested
Explain use cases for the recommended ratio

File Not Found

Error message:

Error: Input image not found: logo.png

Your response:

Verify the filename with the user
Check if file is in current directory
Suggest using absolute path if needed
Offer to list image files in current directory

Too Many Reference Images

Script behavior: Uses first 14 references, ignores rest

Your response:

Inform user of the 14-image limit
Ask which references to prioritize if more than 14 provided
Explain that earlier references have stronger influence

Module Import Errors

Error message:

ModuleNotFoundError: No module named 'google.genai'

Your response:

This shouldn't happen (auto-install should catch it)
If it does, the script will show: "Installing required dependencies..."
User should run the command again after auto-install completes
If auto-install fails, guide user to run manually: cd <skill-dir>/scripts && uv sync

API Quota Exceeded

Error message:

google.api_core.exceptions.ResourceExhausted: 429 Quota exceeded

Your response:

Explain free tier limits have been reached
Suggest upgrading to paid tier: https://aistudio.google.com/
Offer to retry after a brief delay
Recommend checking usage dashboard

Python Version Error

Error message:

requires-python = >=3.10

Your response:

Inform user Python 3.10+ is required
Check current version: python --version
Provide upgrade instructions for their platform

Output Management

Output Path Handling

Default output location:

images/output.png      → ./images/output.png
images/icon.png        → ./images/icon.png

Absolute paths:

/home/user/projects/output.png → Exact location specified

Directory creation: The script automatically creates the images/ directory and any parent directories if they don't exist.

Batch Processing Outputs

Single subject → Single file:

# Input
uv run python <skill-dir>/scripts/main.py images/output.png "cube"

# Output
images/output.png

Multiple subjects → Numbered files:

# Input
uv run python <skill-dir>/scripts/main.py images/icon.png "cube" "sphere" "pyramid"

# Output
images/icon_1.png (cube)
images/icon_2.png (sphere)
images/icon_3.png (pyramid)

Informing the User

Always tell the user:

Where the image(s) were saved
The filename(s) created
For batch processing, which file corresponds to which subject

Example response:

I've generated 3 emerald glass icons:
- images/icon_1.png (cube)
- images/icon_2.png (sphere)
- images/icon_3.png (pyramid)

All files are in the images/ folder.

Advanced Usage

Iterative Refinement with Style Consistency

Use successful outputs as references for subsequent generations:

# First generation
uv run python <skill-dir>/scripts/main.py icon_v1.png "gear" \
  --style <skill-dir>/assets/styles/purple_glass_3d.md

# If result is good, use it as reference for related icons
uv run python <skill-dir>/scripts/main.py icon_v2.png "cog" \
  --style <skill-dir>/assets/styles/purple_glass_3d.md \
  --ref icon_v1.png

Combining Templates and Multiple References

uv run python <skill-dir>/scripts/main.py output.png "circuit board" \
  --style <skill-dir>/assets/styles/neon_wireframe.md \
  --ref <skill-dir>/assets/styles/neon_wireframe/examples/1.png \
  --ref <skill-dir>/assets/styles/neon_wireframe/examples/2.png \
  --ref <skill-dir>/assets/styles/neon_wireframe/examples/3.png

Use case: Maximum consistency for brand-critical assets

Gradient Degradation Warning

Important: Avoid gradients if you plan to iterate on outputs.

Gradients compound compression artifacts through multiple generations. For iterative workflows:

Use solid backgrounds for elements you'll refine
Generate gradients/atmospherics last (single-pass only)
Or generate background separately and composite externally

See: references/best_practices.md for detailed explanation and examples

Additional Resources

For detailed information, consult the reference files:

API Capabilities (references/api_capabilities.md):

Complete aspect ratio guide with use cases
Reference image limits and best practices
API authentication and rate limits
Error codes and troubleshooting
Response structure details

Best Practices (references/best_practices.md):

Style consistency techniques
Prompt engineering principles
Gradient degradation lesson (with showcase examples)
Background generation strategies
Reference image techniques (WHAT vs HOW)
Batch processing patterns
Template design guidelines

Aspect Ratio Quick Reference

Ratio	Use Case	Example
1:1	Social media, icons, profiles	Instagram grid, app icons
3:4	Portrait photos, prints	Traditional portrait
4:3	Landscape photos, presentations	Older monitors, prints
4:5	Instagram portrait posts	Taller feed presence
5:4	Instagram landscape posts	Landscape feed
9:16	Vertical video, stories	TikTok, IG Stories
16:9	YouTube, presentations, monitors	Thumbnails, slides
21:9	Ultra-wide, cinematic	YouTube banners, headers

Default aspect ratio: 1:1 (square)

Decision making:

YouTube thumbnail? → 16:9
Instagram post? → 1:1 or 4:5
Profile picture? → 1:1
Vertical video? → 9:16
Presentation? → 16:9
Square icon? → 1:1
Cinematic/banner? → 21:9

Usage Examples Summary

Most Common Patterns

# 1. Basic generation with style
uv run python <skill-dir>/scripts/main.py images/icon.png "database" \
  --style <skill-dir>/assets/styles/purple_glass_3d.md

# 2. Batch generation with different style
uv run python <skill-dir>/scripts/main.py images/icon.png \
  "cube" "sphere" "pyramid" \
  --style <skill-dir>/assets/styles/gold_metallic_3d.md

# 3. Edit existing image
uv run python <skill-dir>/scripts/main.py images/edited.png \
  "Change background to white" \
  --edit images/original.png

# 4. With reference for consistency
uv run python <skill-dir>/scripts/main.py images/icon.png "gear" \
  --style <skill-dir>/assets/styles/neon_wireframe.md \
  --ref <skill-dir>/assets/styles/neon_wireframe/examples/1.png

# 5. Custom aspect ratio with flat style
uv run python <skill-dir>/scripts/main.py images/thumbnail.png \
  "Tutorial thumbnail with code editor" \
  --style <skill-dir>/assets/styles/minimalist_flat.md \
  --aspect 16:9

Workflow Tips

First-time setup: Dependencies auto-install on first run (no manual setup needed!)
API key: Verify GOOGLE_AI_API_KEY is set before generation
Path resolution: Always use absolute paths to skill directory
Output location: Images save to user's current working directory
Error messages: Parse script output for clear error indications
Consistency: Use templates + references for maximum style consistency
Iteration: Use solid backgrounds if you plan to iterate (avoid gradients)
Batch processing: Generate similar assets together for efficiency

Security Reminders

Never commit .env files to version control
Add .env to .gitignore
Use environment variables for production deployments
Rotate API keys periodically
Monitor API usage in Google AI Studio dashboard

Install Skill

SKILL.md

Gemini Image Generation Skill

Overview

Prerequisites & Setup

API Key Acquisition

API Key Configuration

Installing uv (Package Manager)

Installing Dependencies

Core Workflow

Path Resolution

Basic Command Structure

Decision Tree: Generate vs Edit

Command Reference

Basic Generation

Using Bundled Style Templates

Batch Processing (Multiple Subjects)

Image Editing

Using Reference Images

Aspect Ratio Selection

Maximum Consistency (Template + References)

Style Templates

Using Bundled Templates

Creating Custom Templates

Template Path Resolution

How Templates Work

Reference Images

When to Use References

Important: WHAT vs HOW

Reference Image Limits

Using Bundled Examples

Multiple References

Error Handling

Missing API Key

Invalid Aspect Ratio

File Not Found

Too Many Reference Images

Module Import Errors

API Quota Exceeded

Python Version Error

Output Management

Output Path Handling

Batch Processing Outputs

Informing the User

Advanced Usage

Iterative Refinement with Style Consistency

Combining Templates and Multiple References

Gradient Degradation Warning

Additional Resources

Aspect Ratio Quick Reference

Usage Examples Summary

Most Common Patterns

Workflow Tips

Security Reminders