| name | gemini-image-generator |
| description | Generate images using Google's Gemini API. Use when creating images from text prompts, editing existing images, or combining reference images for AI-generated visual content. |
Gemini Image Generator
Overview
Generate images using Google's Gemini API with support for text-to-image generation, image editing, and multi-image reference inputs. Two models are supported: the fast Gemini 2.5 Flash model and the high-quality Gemini 3 Pro model, which outputs at up to 4K resolution.
When to Use
- Generating app icons, logos, and UI assets
- Creating marketing visuals and promotional graphics
- Prototyping UI designs with AI-generated placeholders
- Generating game sprites and 2D assets
- Creating concept art and mood boards
- Editing or modifying existing images with text prompts
- Style transfer using reference images
Prerequisites
- Python 3.9+
- google-genai package
- GEMINI_API_KEY environment variable
Installation
pip install google-genai
Getting an API Key
- Go to Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy the key and set it as an environment variable:
export GEMINI_API_KEY="your-api-key"
Add to your shell profile (~/.zshrc or ~/.bashrc) for persistence:
echo 'export GEMINI_API_KEY="your-api-key"' >> ~/.zshrc
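To confirm the key is visible to Python before running the script, a small check along these lines may help. This is only a sketch: the helper name `get_api_key` is hypothetical and is not taken from `scripts/generate_image.py`.

```python
import os


def get_api_key(env=None):
    """Return the Gemini API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    (Hypothetical helper, not part of the original script.)
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY environment variable not set")
    return key
```

Calling `get_api_key()` in a fresh terminal is a quick way to verify that the shell profile change took effect.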
Quick Start
Generate a simple image:
python scripts/generate_image.py -p "A fluffy orange cat sitting on a windowsill, warm sunlight, cozy atmosphere"
Generate with specific aspect ratio:
python scripts/generate_image.py -p "Modern tech startup banner" -a 16:9 -o banner.png
Edit an existing image:
python scripts/generate_image.py -p "Make the sky more dramatic with sunset colors" -i photo.jpg -o edited.png
Command Reference
python scripts/generate_image.py [options]
Required:
-p, --prompt TEXT Text prompt describing the image
Optional:
-o, --output PATH Output file path (default: auto-generated)
-m, --model MODEL Model to use (default: gemini-3-pro-image-preview)
-a, --aspect-ratio RATIO Aspect ratio (default: 1:1)
-s, --size SIZE Image size: 1K, 2K, 4K (default: 1K, Pro only)
-i, --input-image PATH Input image for editing mode
-r, --reference-images Reference image(s), can be repeated (max 14)
-v, --verbose Show detailed progress
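For readers who want to see how such an interface could be declared, the options above map naturally onto `argparse`. The following is a sketch under that assumption; the actual parser in `scripts/generate_image.py` is not shown here and may differ, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="generate_image.py")
    parser.add_argument("-p", "--prompt", required=True,
                        help="Text prompt describing the image")
    parser.add_argument("-o", "--output", default=None,
                        help="Output file path (auto-generated when omitted)")
    parser.add_argument("-m", "--model", default="gemini-3-pro-image-preview",
                        help="Model to use")
    parser.add_argument("-a", "--aspect-ratio", default="1:1",
                        help="Aspect ratio, e.g. 16:9")
    parser.add_argument("-s", "--size", default="1K", choices=["1K", "2K", "4K"],
                        help="Image size (Pro only)")
    parser.add_argument("-i", "--input-image", default=None,
                        help="Input image for editing mode")
    # action="append" lets -r be repeated; the value is None when -r is absent.
    parser.add_argument("-r", "--reference-images", action="append", default=None,
                        help="Reference image, can be repeated (max 14)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Show detailed progress")
    return parser
```

With this layout, `build_parser().parse_args(["-p", "Quick concept sketch"])` fills in the documented defaults for every other option.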
Models
| Model | Resolution | Best For |
|---|---|---|
| gemini-3-pro-image-preview | Up to 4K | Final assets, high quality, professional work |
| gemini-2.5-flash-image | 1024px | Quick iterations, prototyping, batch generation |
The Pro model is used by default. Use the Flash model for faster generation when quality is less critical:
python scripts/generate_image.py -p "Quick concept sketch" -m gemini-2.5-flash-image
Aspect Ratios
| Ratio | Use Case |
|---|---|
| 1:1 | App icons, profile pictures, thumbnails |
| 2:3 | Portrait photos, book covers |
| 3:2 | Landscape photos, postcards |
| 3:4 | Portrait photos, social media posts |
| 4:3 | Traditional photos, presentations |
| 4:5 | Instagram posts, portrait social media |
| 5:4 | Large format prints |
| 9:16 | Stories, vertical videos, mobile wallpapers |
| 16:9 | Widescreen banners, video thumbnails, headers |
| 21:9 | Ultrawide banners, cinematic headers |
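When the target dimensions are known in pixels, one way to pick a value for `-a` is to choose the listed ratio closest to the desired width and height. The sketch below assumes the ten ratios in the table are the full supported set; `closest_aspect_ratio` is a hypothetical helper, not part of the script.

```python
import math

# Ratios taken from the table above.
SUPPORTED_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3",
                    "4:5", "5:4", "9:16", "16:9", "21:9"]


def closest_aspect_ratio(width, height):
    """Return the supported ratio string nearest to width/height.

    Distance is measured on a log scale so that 2:1 and 1:2 are treated
    as equally far from 1:1.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = width / height

    def distance(ratio):
        w, h = ratio.split(":")
        return abs(math.log((int(w) / int(h)) / target))

    return min(SUPPORTED_RATIOS, key=distance)
```

For example, a 1920x1080 target maps to `16:9` and a 1080x1350 target maps to `4:5`.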
Image Sizes
Available for the Gemini 3 Pro model only:
| Size | Resolution | Use Case |
|---|---|---|
| 1K | 1024px | Web graphics, thumbnails |
| 2K | 2048px | Print materials, detailed graphics |
| 4K | 4096px | High-resolution prints, large displays |
python scripts/generate_image.py -p "Detailed landscape" -s 4K -o landscape_4k.png
Prompt Engineering Guide
Prompt Structure
Use this formula for effective prompts:
[Subject] + [Style] + [Details] + [Quality modifiers]
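The formula can be applied mechanically when prompts are assembled in code. The helper below is a minimal sketch of that idea; `build_prompt` and its parameter names are illustrative assumptions, not part of the script.

```python
def build_prompt(subject, style="", details=(), quality=()):
    """Join the four formula parts into one comma-separated prompt.

    Empty or whitespace-only parts are skipped so optional pieces can be omitted.
    """
    parts = [subject, style, *details, *quality]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a fluffy orange tabby cat sitting on a windowsill",
    style="watercolor",
    details=["golden hour lighting"],
    quality=["high quality", "detailed"],
)
```

The resulting string can be passed directly to `-p`.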
Techniques
1. Be Specific About the Subject
Bad: "a cat"
Good: "a fluffy orange tabby cat sitting on a windowsill"
2. Specify Art Style
- Photorealistic, cartoon, anime, oil painting, watercolor
- Digital art, 3D render, pixel art, vector illustration
- Specific styles: "in the style of Studio Ghibli", "cyberpunk aesthetic"
3. Include Environment and Lighting
- "golden hour lighting", "dramatic shadows", "soft ambient light"
- "neon-lit cityscape", "cozy interior", "misty forest"
4. Add Quality Modifiers
- "high quality", "detailed", "professional"
- "sharp focus", "studio lighting", "cinematic"
5. Specify Composition
- "centered composition", "rule of thirds"
- "close-up", "wide shot", "bird's eye view", "isometric"
Example Prompts by Use Case
App Icon
Minimalist app icon for a weather app, blue gradient background,
white cloud with golden sun rays, flat design, rounded corners,
iOS style, clean and modern
Marketing Banner
Professional tech startup banner, abstract geometric shapes
flowing from left to right, purple and blue gradient,
modern and clean aesthetic, corporate style
Game Sprite
Pixel art character sprite, fantasy warrior with glowing sword,
32x32 style, transparent background, retro 16-bit game aesthetic,
vibrant colors
Product Photo
Professional product photo of wireless earbuds on white background,
soft shadows, studio lighting, minimalist composition,
commercial photography style
Concept Art
Futuristic city skyline at sunset, flying vehicles between
towering skyscrapers, neon lights reflecting on wet streets,
cyberpunk atmosphere, cinematic composition, detailed
UI Mockup Asset
Abstract gradient background for mobile app, soft purple to pink
transition, subtle geometric patterns, modern and minimal,
suitable for dark text overlay
Generation Modes
Text-to-Image
Generate images from text descriptions:
python scripts/generate_image.py -p "Your description here" -o output.png
Image Editing
Modify an existing image with a text prompt:
python scripts/generate_image.py \
-p "Change the background to a tropical beach at sunset" \
-i original.jpg \
-o edited.png
Multi-Image Reference
Use up to 14 reference images to guide style or content:
python scripts/generate_image.py \
-p "Create a new character in this art style" \
-r style_ref1.png \
-r style_ref2.png \
-o new_character.png
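When the list of reference images is built programmatically, it can help to assemble the command in Python and enforce the 14-image limit before invoking the script. This sketch uses only the flags documented above; the function name `build_reference_command` is hypothetical.

```python
import sys

MAX_REFERENCE_IMAGES = 14  # limit stated in the command reference


def build_reference_command(prompt, reference_images, output):
    """Return an argv list that passes each reference image with its own -r flag."""
    reference_images = list(reference_images)
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"at most {MAX_REFERENCE_IMAGES} reference images are supported, "
            f"got {len(reference_images)}"
        )
    command = [sys.executable, "scripts/generate_image.py", "-p", prompt]
    for path in reference_images:
        command += ["-r", str(path)]
    command += ["-o", str(output)]
    return command
```

The returned list can be handed to `subprocess.run(command, check=True)`, which avoids shell quoting problems with long prompts.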
Examples
Generate App Icons
# iOS-style weather icon
python scripts/generate_image.py \
-p "Minimalist weather app icon, blue sky gradient, white fluffy cloud, sun peeking out, flat design, rounded square, iOS 17 style" \
-a 1:1 \
-o weather_icon.png
# Fitness app icon
python scripts/generate_image.py \
-p "Fitness app icon, running figure silhouette, orange to red gradient background, energetic and dynamic, modern flat design" \
-a 1:1 \
-o fitness_icon.png
Create Marketing Assets
# Website hero banner
python scripts/generate_image.py \
-p "Abstract tech hero banner, flowing data visualization, dark blue background with glowing cyan accents, futuristic and professional" \
-a 21:9 \
-s 2K \
-o hero_banner.png
# Social media post
python scripts/generate_image.py \
-p "Motivational quote background, soft sunrise gradient, minimalist mountain silhouette, peaceful and inspiring" \
-a 4:5 \
-o social_post_bg.png
Generate Game Assets
# Character sprite
python scripts/generate_image.py \
-p "Pixel art hero character, knight with blue cape and silver armor, idle pose, transparent background, 16-bit retro style" \
-a 1:1 \
-o knight_sprite.png
# Environment tile
python scripts/generate_image.py \
-p "Grass tile for top-down RPG, seamless pattern, vibrant green with small flowers, pixel art style, 32x32 aesthetic" \
-a 1:1 \
-o grass_tile.png
Edit Photos
# Change background
python scripts/generate_image.py \
-p "Replace background with a cozy coffee shop interior" \
-i portrait.jpg \
-o portrait_coffee_shop.png
# Style enhancement
python scripts/generate_image.py \
-p "Enhance with dramatic cinematic color grading, increase contrast, add film grain" \
-i landscape.jpg \
-o landscape_cinematic.png
Troubleshooting
"GEMINI_API_KEY environment variable not set"
Set your API key:
export GEMINI_API_KEY="your-api-key"
"Rate limit exceeded"
Wait a few minutes and try again. For batch operations, add delays between requests.
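For batch jobs, the delay can be automated with a retry loop that waits longer after each failure. The following is a sketch only: it assumes a rate-limit failure can be recognized from the error message, which may not match how the script or the API actually reports it, and `retry_with_backoff` is a hypothetical helper.

```python
import time


def retry_with_backoff(call, attempts=4, base_delay=2.0, sleep=time.sleep,
                       is_rate_limit=lambda exc: "rate limit" in str(exc).lower()):
    """Run call(), retrying with exponentially growing delays on rate-limit errors.

    Delays are base_delay, 2*base_delay, 4*base_delay, ... seconds.
    Errors that do not look like rate limits are re-raised immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on unrelated errors.
            if attempt == attempts - 1 or not is_rate_limit(exc):
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the behavior can be tested without actually waiting.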
"Content policy violation"
Modify your prompt to avoid content that violates Google's usage policies. Try:
- Using more generic descriptions
- Avoiding specific brand names or copyrighted characters
- Removing potentially sensitive content
"No image in response"
The model sometimes returns text instead of an image. Try:
- Making your prompt more specific
- Adding "generate an image of" to your prompt
- Using a different aspect ratio
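If you call the API directly, a text-only reply can be detected by scanning the response parts for image data. The sketch below assumes each part exposes `inline_data` (with a `data` bytes field) for images and `text` for text, as in the google-genai response objects; `first_image_bytes` is a hypothetical helper and the script may handle this differently.

```python
def first_image_bytes(parts):
    """Return (image_bytes, None) for the first image part found.

    If no part carries image data, return (None, text) where text is the
    concatenated text the model sent instead, which often explains why.
    """
    texts = []
    for part in parts:
        inline = getattr(part, "inline_data", None)
        if inline is not None and getattr(inline, "data", None):
            return inline.data, None
        text = getattr(part, "text", None)
        if text:
            texts.append(text)
    return None, " ".join(texts)
```

Printing the returned text when no image is found usually shows whether the model asked for clarification or declined the request.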
"Unsupported image format"
Supported formats for input images: PNG, JPEG, WebP
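A quick pre-flight check on the file extension can catch this before a request is sent. This sketch assumes the extension reflects the real file type and that `.jpg` and `.jpeg` both count as JPEG; `is_supported_image` is a hypothetical helper.

```python
from pathlib import Path

# Extensions for the formats listed above: PNG, JPEG, WebP.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def is_supported_image(path):
    """Return True if the file extension matches a supported input format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

Files in other formats need to be converted to one of the supported formats first.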
Size option not working
The size option (2K, 4K) is only available for gemini-3-pro-image-preview. The Flash model always generates 1024px images.
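The rule can be expressed as a small guard that reports which size will actually apply. This is a sketch of the documented behavior, not the script's own logic; `effective_size` is a hypothetical name.

```python
PRO_MODEL = "gemini-3-pro-image-preview"
VALID_SIZES = ("1K", "2K", "4K")


def effective_size(model, size="1K"):
    """Return the size that applies for the given model.

    The Pro model honors 1K, 2K, and 4K; any other model falls back to 1K,
    matching the note that Flash always generates 1024px images.
    """
    if size not in VALID_SIZES:
        raise ValueError(f"size must be one of {VALID_SIZES}, got {size!r}")
    return size if model == PRO_MODEL else "1K"
```

Comparing the requested size with the returned value is an easy way to warn users that `-s 4K` has no effect with the Flash model.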
Best Practices
- Start simple: Begin with clear, concise prompts and iterate
- Use the right model: Flash for speed, Pro for quality
- Match aspect ratio to use case: 16:9 for banners, 1:1 for icons
- Save high-quality versions: Use 4K when you need detailed assets
- Iterate on prompts: Small changes can significantly affect results
- Use reference images: For consistent style across multiple generations
- Add quality modifiers: "high quality", "detailed", "professional"
- Specify what you don't want: "no text", "no people", or ask for a "simple background" to avoid clutter