| name | database-query-optimization |
| description | Improve database query performance through indexing, query optimization, and execution plan analysis. Reduce response times and database load. |
Database Query Optimization
Overview
Slow database queries are a common performance bottleneck. Optimization through indexing, efficient queries, and caching dramatically improves application performance.
When to Use
- Slow response times
- High database CPU usage
- Performance regression
- New feature deployment
- Regular maintenance
Instructions
1. Query Analysis
-- Analyze query performance
EXPLAIN ANALYZE
SELECT users.id, users.name, COUNT(orders.id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
WHERE users.created_at > '2024-01-01'
GROUP BY users.id, users.name
ORDER BY order_count DESC;
-- Results show:
-- - Seq Scan (slow) vs Index Scan (fast)
-- - Rows: actual vs planned (high variance = bad)
-- - Execution time (milliseconds)
-- Key metrics:
-- - Sequential Scan: Full table read (slow)
-- - Index Scan: Uses index (fast)
-- - Nested Loop: Joins with loops
-- - Sort: In-memory or disk sort
2. Indexing Strategy
Index Types:
Single Column:
CREATE INDEX idx_users_email ON users(email);
Use: WHERE email = ?
Size: Small, quick to create
Composite Index:
CREATE INDEX idx_orders_user_date
ON orders(user_id, created_at);
Use: WHERE user_id = ? AND created_at > ?
Order: Most selective first
Covering Index:
CREATE INDEX idx_orders_covering
ON orders(user_id) INCLUDE (total_amount);
Benefit: No table lookup needed
Partial Index:
CREATE INDEX idx_active_users
ON users(id) WHERE status = 'active';
Benefit: Smaller, faster
Full Text:
CREATE FULLTEXT INDEX idx_search
ON articles(title, content);
Use: Text search queries
---
Index Rules:
- Create indexes for WHERE conditions
- Create indexes for JOIN columns
- Create indexes for ORDER BY
- Don't over-index (slows writes)
- Monitor index usage
- Remove unused indexes
- Update statistics regularly
- Partial indexes for filtered queries
Missing Index Query:
SELECT object_name, equality_columns
FROM sys.dm_db_missing_index_details
ORDER BY equality_columns;
3. Query Optimization Techniques
# Common optimization patterns
# BEFORE (N+1 queries)
for user in users:
orders = db.query("SELECT * FROM orders WHERE user_id = ?", user.id)
# 1 + N queries
# AFTER (single query with JOIN)
orders = db.query("""
SELECT u.*, o.* FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > ?
""", date_threshold)
# BEFORE (inefficient WHERE)
SELECT * FROM users
WHERE LOWER(email) = LOWER('Test@Example.com')
# Can't use index (function used)
# AFTER (index-friendly)
SELECT * FROM users
WHERE email = 'test@example.com'
# Case-insensitive constraint + index
# BEFORE (wildcard at start)
SELECT * FROM users WHERE email LIKE '%example.com'
# Can't use index (wildcard at start)
# AFTER (wildcard at end)
SELECT * FROM users WHERE email LIKE 'user%'
# Can use index
# BEFORE (slow aggregation)
SELECT user_id, COUNT(*) as cnt
FROM orders
GROUP BY user_id
ORDER BY cnt DESC
LIMIT 10
# AFTER (pre-aggregated)
SELECT user_id, order_count
FROM user_order_stats
WHERE order_count IS NOT NULL
ORDER BY order_count DESC
LIMIT 10
4. Optimization Checklist
Analysis:
[ ] Run EXPLAIN ANALYZE on slow queries
[ ] Check actual vs estimated rows
[ ] Look for sequential scans
[ ] Identify expensive operations
[ ] Compare execution plans
Indexing:
[ ] Index WHERE columns
[ ] Index JOIN columns
[ ] Index ORDER BY columns
[ ] Check unused indexes
[ ] Remove duplicate indexes
[ ] Create composite indexes strategically
[ ] Analyze index statistics
Query Optimization:
[ ] Remove unnecessary columns (SELECT *)
[ ] Use JOINs instead of subqueries
[ ] Avoid functions in WHERE
[ ] Use wildcards carefully (avoid %)
[ ] Batch operations
[ ] Use LIMIT for result sets
[ ] Archive old data
Caching:
[ ] Implement query caching
[ ] Cache aggregations
[ ] Use Redis for hot data
[ ] Invalidate strategically
Monitoring:
[ ] Track slow queries
[ ] Monitor index usage
[ ] Set up alerts
[ ] Regular statistics update
[ ] Measure improvements
---
Expected Improvements:
With Proper Indexing:
- Sequential Scan → Index Scan
- Response time: 5 seconds → 50ms (100x faster)
- CPU usage: 80% → 20%
- Concurrent users: 100 → 1000
Quick Wins:
- Add index to frequently filtered column
- Fix N+1 queries
- Use LIMIT for large results
- Archive old data
- Expected: 20-50% improvement
Key Points
- EXPLAIN ANALYZE shows query execution
- Indexes must match WHERE/JOIN/ORDER BY
- Avoid functions in WHERE clauses
- Fix N+1 queries (join instead of loop)
- Monitor slow query log regularly
- Stats updates needed for accuracy
- Pre-calculate aggregations
- Archive historical data
- Use explain plans before/after
- Measure and monitor continuously