| name | database-advanced-optimization |
| category | database |
| difficulty | advanced |
| estimated_time | 60 minutes |
| prerequisites | Basic SQL knowledge, Database design fundamentals, Understanding of indexes and query execution |
| tags | postgresql, mongodb, redis, performance, optimization, scaling, caching, sharding |
| learning_objectives | Master database-specific optimization techniques, Implement effective caching strategies, Design scalable database architectures, Optimize query performance across multiple database systems, Configure connection pooling and monitoring |
| version | 1.0.0 |
| last_updated | Fri Jan 17 2025 00:00:00 GMT+0000 (Coordinated Universal Time) |
| description | Use SQL (PostgreSQL) when: |
Database Advanced Optimization
Level 1: Quick Reference (5 minutes)
Database Selection Guide
Use SQL (PostgreSQL) when:
- ACID compliance is critical
- Complex joins and transactions required
- Data has clear relational structure
- Strong consistency needed
- Rich query capabilities required
Use NoSQL (MongoDB) when:
- Flexible schema needed
- Horizontal scaling is priority
- Document-oriented data model fits
- High write throughput required
- Eventual consistency acceptable
Use In-Memory (Redis) when:
- Sub-millisecond latency required
- Caching layer needed
- Real-time features (pub/sub, streams)
- Session management
- Rate limiting or counters
Common Optimization Patterns
-- PostgreSQL: Create covering index
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at)
INCLUDE (status, total_amount);
-- PostgreSQL: Analyze query plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 123;
// MongoDB: Create compound index
db.orders.createIndex({ userId: 1, createdAt: -1 }, { background: true });
// MongoDB: Use aggregation pipeline efficiently
db.orders.aggregate([
{ $match: { status: "pending" } },
{ $sort: { createdAt: -1 } },
{ $limit: 100 }
]);
# Redis: Implement cache-aside pattern
def get_user(user_id):
cache_key = f"user:{user_id}"
user = redis.get(cache_key)
if user is None:
user = db.query("SELECT * FROM users WHERE id = %s", user_id)
redis.setex(cache_key, 3600, json.dumps(user))
return json.loads(user)
Essential Optimization Checklist
PostgreSQL
- Create appropriate indexes (B-tree, GIN, GiST)
- Analyze query plans with EXPLAIN ANALYZE
- Configure autovacuum appropriately
- Set up connection pooling (PgBouncer)
- Monitor with pg_stat_statements
- Optimize shared_buffers and work_mem
- Configure appropriate WAL settings
MongoDB
- Design effective shard keys
- Create compound indexes for common queries
- Enable profiler for slow queries
- Configure replica sets for read scaling
- Optimize aggregation pipelines
- Set appropriate write concerns
- Monitor with MongoDB Compass/Atlas
Redis
- Implement appropriate caching strategy
- Configure maxmemory and eviction policies
- Use pipelining for bulk operations
- Set appropriate TTLs
- Monitor memory usage and hit rates
- Configure persistence (RDB/AOF) appropriately
- Use Redis Cluster for scaling
Quick Wins
Immediate Impact:
- Add indexes for frequently queried columns
- Enable query result caching
- Implement connection pooling
- Add Redis cache layer for hot data
- Optimize N+1 queries with batch loading
Performance Monitoring:
# PostgreSQL query stats
SELECT query, calls, total_time, mean_time
FROM pg_stat_statements
ORDER BY mean_time DESC LIMIT 10;
# MongoDB profiler
db.setProfilingLevel(1, { slowms: 100 });
db.system.profile.find().sort({ ts: -1 }).limit(10);
# Redis stats
redis-cli INFO stats | grep -E 'keyspace_hits|keyspace_misses'
Level 2:
📚 Full Examples: See REFERENCE.md for complete code samples, detailed configurations, and production-ready implementations.
Implementation Guide (45 minutes)
PostgreSQL Advanced Optimization
Index Strategies
PostgreSQL supports multiple index types, each optimized for specific use cases:
B-tree Indexes (Default)
See REFERENCE.md for complete implementation.
GIN Indexes (Full-text and Array Search)
See REFERENCE.md for complete implementation.
GiST Indexes (Geometric and Range Data)
See REFERENCE.md for complete implementation.
Query Optimization with EXPLAIN ANALYZE
See REFERENCE.md for complete implementation.
Understanding EXPLAIN Output:
See REFERENCE.md for complete implementation.
Optimization Decisions:
- Index Scan = Good (using indexes)
- Buffers shared hit = Good (data in cache)
- Nested Loop = Appropriate for small result sets
- Hash Join = Better for large joins
VACUUM and Maintenance
See REFERENCE.md for complete implementation.
Connection Pooling with PgBouncer
See REFERENCE.md for complete implementation.
Connection Pooling Best Practices:
- Use
transactionmode for most applications - Set
default_pool_size= (CPU cores × 2) + effective_spindle_count - Monitor pool usage:
SHOW POOLS;in PgBouncer console - Application connection string:
postgresql://user:pass@pgbouncer:6432/myapp
PostgreSQL Performance Tuning
See REFERENCE.md for complete implementation.
MongoDB Advanced Optimization
Sharding Strategies
Shard Key Selection:
See REFERENCE.md for complete implementation.
Shard Key Patterns:
- Range-based Sharding (ordered data):
// Good for time-series data
sh.shardCollection("analytics.events", { "date": 1, "userId": 1 })
// Queries benefit from targeted routing
db.events.find({ date: ISODate("2025-01-17"), userId: 123 })
- Hash-based Sharding (random distribution):
// Good for even distribution, poor for range queries
sh.shardCollection("users.profiles", { "_id": "hashed" })
// All shards queried for range
db.profiles.find({ _id: { $gt: 1000, $lt: 2000 } })
- Zone Sharding (geographic distribution):
See REFERENCE.md for complete implementation.
Index Optimization
See REFERENCE.md for complete implementation.
Index Analysis:
See REFERENCE.md for complete implementation.
Aggregation Pipeline Optimization
See REFERENCE.md for complete implementation.
Replica Set Configuration
See REFERENCE.md for complete implementation.
Redis Advanced Patterns
Caching Strategies
1. Cache-Aside (Lazy Loading)
See REFERENCE.md for complete implementation.
2. Write-Through Caching
See REFERENCE.md for complete implementation.
3. Write-Behind (Write-Back) Caching
See REFERENCE.md for complete implementation.
4. Read-Through Caching
See REFERENCE.md for complete implementation.
Redis Data Structures and Use Cases
See REFERENCE.md for complete implementation.
Pub/Sub and Streams
Pub/Sub Pattern:
See REFERENCE.md for complete implementation.
Redis Streams (Preferred for Reliable Messaging):
See REFERENCE.md for complete implementation.
Redis Cluster Configuration
See REFERENCE.md for complete implementation.
Redis Configuration (redis.conf):
See REFERENCE.md for complete implementation.
Query Optimization Techniques
Solving the N+1 Problem
Bad: N+1 Queries
See REFERENCE.md for complete implementation.
Good: Join or Batch Loading
See REFERENCE.md for complete implementation.
Database Scaling Strategies
1. Vertical Scaling (Scale Up)
- Increase CPU, RAM, storage
- Simple but has limits
- Good for initial growth
2. Horizontal Scaling (Scale Out)
Read Replicas:
See REFERENCE.md for complete implementation.
Sharding:
See REFERENCE.md for complete implementation.
Connection Pooling Best Practices
HikariCP (Java):
See REFERENCE.md for complete implementation.
Python (SQLAlchemy):
See REFERENCE.md for complete implementation.
Node.js (pg-pool):
See REFERENCE.md for complete implementation.
Database Monitoring and Profiling
PostgreSQL Monitoring
pg_stat_statements Extension:
See REFERENCE.md for complete implementation.
Key Metrics to Monitor:
See REFERENCE.md for complete implementation.
MongoDB Profiling
Enable Profiler:
See REFERENCE.md for complete implementation.
Analyze Slow Queries:
See REFERENCE.md for complete implementation.
Server Status Metrics:
See REFERENCE.md for complete implementation.
Redis Monitoring
INFO Command:
See REFERENCE.md for complete implementation.
Key Metrics:
See REFERENCE.md for complete implementation.
Slow Log:
See REFERENCE.md for complete implementation.
Backup and Recovery Strategies
PostgreSQL Backup
Physical Backup (pg_basebackup):
# Full backup
pg_basebackup -h localhost -U postgres -D /backup/pg_data -Fp -Xs -P
# Compressed backup
pg_basebackup -h localhost -U postgres -D /backup/pg_data -Ft -z -P
Logical Backup (pg_dump):
See REFERENCE.md for complete implementation.
Restore:
See REFERENCE.md for complete implementation.
Point-in-Time Recovery (PITR):
See REFERENCE.md for complete implementation.
MongoDB Backup
mongodump:
See REFERENCE.md for complete implementation.
mongorestore:
See REFERENCE.md for complete implementation.
Filesystem Snapshots (Replica Set):
See REFERENCE.md for complete implementation.
Redis Backup
RDB (Snapshot):
See REFERENCE.md for complete implementation.
AOF (Append-Only File):
# Enable AOF (redis.conf)
appendonly yes
appendfsync everysec
# Rewrite AOF
redis-cli BGREWRITEAOF
# Backup AOF file
cp /var/lib/redis/appendonly.aof /backup/appendonly-$(date +%Y%m%d).aof
Restore:
# Stop Redis
systemctl stop redis
# Restore RDB or AOF
cp /backup/dump.rdb /var/lib/redis/
# or
cp /backup/appendonly.aof /var/lib/redis/
# Start Redis
systemctl start redis
Level 3: Deep Dive Resources (10 minutes)
Official Documentation
PostgreSQL:
MongoDB:
- MongoDB Performance Best Practices
- Sharding Guide
- Index Strategies
- Aggregation Pipeline Optimization
Redis:
Books and Courses
- "High Performance PostgreSQL" by Gregory Smith
- "PostgreSQL Query Optimization" by Henrietta Dombrovskaya
- "MongoDB: The Definitive Guide" by Shannon Bradshaw
- "Redis in Action" by Josiah Carlson
- "Designing Data-Intensive Applications" by Martin Kleppmann
Tools and Utilities
PostgreSQL:
- pgAdmin 4 - Database management
- pg_top - Real-time monitoring
- pgBadger - Log analyzer
- pgtune - Configuration generator
MongoDB:
- MongoDB Compass - GUI client
- MongoDB Atlas - Managed service
- mongostat/mongotop - Command-line monitoring
- Studio 3T - Advanced GUI
Redis:
- RedisInsight - GUI client
- redis-cli - Command-line interface
- redis-stat - Real-time monitoring
- redis-rdb-tools - RDB file analysis
Practice Exercises
- Index Optimization Challenge: Given a slow query, design optimal indexes
- Sharding Strategy: Design shard key for multi-tenant application
- Cache Implementation: Build cache-aside pattern with TTL and invalidation
- Query Analysis: Use EXPLAIN to optimize complex join queries
- Backup Strategy: Design comprehensive backup/recovery plan
Related Skills
- Database Design Fundamentals
- API Performance Optimization
- System Architecture Patterns
- Monitoring and Observability
Examples
Basic Usage
// TODO: Add basic example for advanced-optimization
// This example demonstrates core functionality
Advanced Usage
// TODO: Add advanced example for advanced-optimization
// This example shows production-ready patterns
Integration Example
// TODO: Add integration example showing how advanced-optimization
// works with other systems and services
See examples/advanced-optimization/ for complete working examples.
Integration Points
This skill integrates with:
Upstream Dependencies
- Tools: Common development tools and frameworks
- Prerequisites: Basic understanding of general concepts
Downstream Consumers
- Applications: Production systems requiring advanced-optimization functionality
- CI/CD Pipelines: Automated testing and deployment workflows
- Monitoring Systems: Observability and logging platforms
Related Skills
- See other skills in this category
Common Integration Patterns
- Development Workflow: How this skill fits into daily development
- Production Deployment: Integration with production systems
- Monitoring & Alerting: Observability integration points
Common Pitfalls
Pitfall 1: Insufficient Testing
Problem: Not testing edge cases and error conditions leads to production bugs
Solution: Implement comprehensive test coverage including:
- Happy path scenarios
- Error handling and edge cases
- Integration points with external systems
Prevention: Enforce minimum code coverage (80%+) in CI/CD pipeline
Pitfall 2: Hardcoded Configuration
Problem: Hardcoding values makes applications inflexible and environment-dependent
Solution: Use environment variables and configuration management:
- Separate config from code
- Use environment-specific configuration files
- Never commit secrets to version control
Prevention: Use tools like dotenv, config validators, and secret scanners
Pitfall 3: Ignoring Security Best Practices
Problem: Security vulnerabilities from not following established security patterns
Solution: Follow security guidelines:
- Input validation and sanitization
- Proper authentication and authorization
- Encrypted data transmission (TLS/SSL)
- Regular security audits and updates
Prevention: Use security linters, SAST tools, and regular dependency updates
Best Practices:
- Follow established patterns and conventions for advanced-optimization
- Keep dependencies up to date and scan for vulnerabilities
- Write comprehensive documentation and inline comments
- Use linting and formatting tools consistently
- Implement proper error handling and logging
- Regular code reviews and pair programming
- Monitor production metrics and set up alerts
Bundled Resources
This skill includes 6 production-ready templates and scripts:
- templates/postgres-optimization.sql - PostgreSQL optimization queries and index strategies
- templates/mongodb-sharding.yaml - MongoDB sharding configuration
- templates/redis-caching.py - Redis caching pattern implementations
- scripts/query-profiling.sh - Automated database query profiling
- config/pgbouncer.ini - PgBouncer connection pooling configuration
- templates/monitoring-dashboard.json - Grafana dashboard for database metrics
All resources are located in the skills/database/advanced-optimization/ directory.