| name | databases-architecture-skill |
| description | Master database design (SQL, NoSQL), system architecture, API design (REST, GraphQL), and building scalable systems. Learn PostgreSQL, MongoDB, system design patterns, and enterprise architectures. |
Databases & Architecture Skill
Complete guide to designing databases, systems, and APIs that scale.
Quick Start
Learning Path
Data → Schema → APIs → Architecture
↓ ↓ ↓ ↓
SQL Normalize REST Microservices
NoSQL Indexes GraphQL Patterns
Get Started in 5 Steps
SQL Fundamentals (2-3 weeks)
- SELECT, INSERT, UPDATE, DELETE
- Joins and aggregations
Database Design (3-4 weeks)
- Normalization
- Entity-relationship modeling
- Indexing
NoSQL Databases (2-3 weeks)
- Document stores (MongoDB)
- Key-value (Redis)
- When to use each
API Design (3-4 weeks)
- REST principles
- GraphQL basics
- Error handling
System Architecture (ongoing)
- Scalability patterns
- Caching strategies
- Distributed systems
SQL Databases
SQL Fundamentals
-- CREATE TABLE
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE,
age INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- INSERT
INSERT INTO users (name, email, age)
VALUES ('Alice', 'alice@example.com', 25);
-- SELECT (basic)
SELECT * FROM users;
SELECT name, email FROM users;
-- WHERE (filtering)
SELECT * FROM users WHERE age > 25;
SELECT * FROM users WHERE age >= 25 AND age <= 35;
-- LIKE (pattern matching)
SELECT * FROM users WHERE name LIKE 'A%'; -- Starts with A
-- ORDER BY (sorting)
SELECT * FROM users ORDER BY age DESC; -- Highest first
-- LIMIT (pagination)
SELECT * FROM users LIMIT 10 OFFSET 20; -- Skip 20, show 10
Advanced SQL
-- JOINS
SELECT users.name, orders.amount
FROM users
INNER JOIN orders ON users.id = orders.user_id;
-- LEFT JOIN (include nulls)
SELECT users.name, COUNT(orders.id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
GROUP BY users.id, users.name;
-- GROUP BY & AGGREGATION
SELECT age, COUNT(*) as count, AVG(salary) as avg_salary
FROM users
GROUP BY age
HAVING COUNT(*) > 5; -- Filter groups
-- Window functions
SELECT name, salary,
AVG(salary) OVER (PARTITION BY department) as dept_avg,
RANK() OVER (ORDER BY salary DESC) as salary_rank
FROM employees;
-- CTEs (Common Table Expressions)
WITH high_earners AS (
SELECT * FROM employees WHERE salary > 100000
)
SELECT department, COUNT(*) as count
FROM high_earners
GROUP BY department;
-- UPDATE
UPDATE users SET age = 26 WHERE name = 'Alice';
-- DELETE
DELETE FROM users WHERE age < 18;
Database Design
Normalization (Reduce data redundancy):
1NF: Each column has atomic value
2NF: Remove partial dependencies
3NF: Remove transitive dependencies
BCNF: Every determinant is a candidate key
Example - Poor vs Good Design:
-- POOR (denormalized)
CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(255),
course1 VARCHAR(255),
course2 VARCHAR(255),
course3 VARCHAR(255),
teacher1 VARCHAR(255),
teacher2 VARCHAR(255)
);
-- GOOD (normalized)
CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(255)
);
CREATE TABLE courses (
id INT PRIMARY KEY,
name VARCHAR(255),
teacher_id INT FOREIGN KEY
);
CREATE TABLE enrollments (
student_id INT FOREIGN KEY,
course_id INT FOREIGN KEY,
PRIMARY KEY (student_id, course_id)
);
Indexing & Performance
-- Create index
CREATE INDEX idx_email ON users(email);
CREATE INDEX idx_age_salary ON users(age, salary); -- Composite
-- Analyze query performance
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
-- Index types
-- B-tree: General purpose (default)
-- Hash: Exact matches only
-- GiST: Geospatial, full-text search
-- BRIN: Large datasets, sequential data
-- When to index
-- ✓ Columns in WHERE clause
-- ✓ Columns in JOIN ON clause
-- ✗ Low cardinality (yes/no, status)
-- ✗ Small tables
PostgreSQL Advanced
-- JSONB (JSON with indexing)
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
metadata JSONB
);
INSERT INTO products VALUES (1, 'Laptop', '{"color": "silver", "specs": {"cpu": "M1"}}');
-- Query JSONB
SELECT * FROM products WHERE metadata->>'color' = 'silver';
SELECT * FROM products WHERE metadata->'specs'->>'cpu' = 'M1';
-- Array columns
CREATE TABLE tags (
id SERIAL PRIMARY KEY,
article_id INT,
tags TEXT[]
);
SELECT * FROM tags WHERE 'database' = ANY(tags);
-- Full-text search
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title VARCHAR(255),
content TEXT,
search_vector tsvector
);
UPDATE articles SET search_vector = to_tsvector('english', title || ' ' || content);
SELECT * FROM articles WHERE search_vector @@ to_tsquery('database');
NoSQL Databases
MongoDB Document Storage
// Insert documents
db.users.insertOne({
_id: ObjectId(),
name: "Alice",
email: "alice@example.com",
age: 25,
tags: ["developer", "python"],
address: {
street: "123 Main St",
city: "New York",
zip: "10001"
}
});
// Query documents
db.users.find({ name: "Alice" });
db.users.find({ age: { $gt: 25 } }); // Greater than
db.users.find({ tags: "python" }); // Array contains
// Update
db.users.updateOne(
{ name: "Alice" },
{ $set: { age: 26 } }
);
db.users.updateOne(
{ _id: ObjectId(...) },
{ $push: { tags: "javascript" } } // Add to array
);
// Aggregation pipeline
db.users.aggregate([
{ $match: { age: { $gt: 20 } } },
{ $group: { _id: null, avg_age: { $avg: "$age" } } },
{ $sort: { avg_age: -1 } }
]);
// Indexes
db.users.createIndex({ email: 1 });
db.users.createIndex({ name: 1, age: 1 });
db.users.createIndex({ search: "text" }); // Full-text search
Redis Caching
import redis
r = redis.Redis(host='localhost', port=6379)
# Strings
r.set('user:1:name', 'Alice')
r.get('user:1:name') # b'Alice'
r.incr('page:views') # Increment counter
# TTL (Time to live)
r.setex('token:xyz', 3600, 'valid') # Expires in 1 hour
# Lists
r.lpush('queue:jobs', 'job1', 'job2')
r.rpop('queue:jobs') # Dequeue
r.llen('queue:jobs') # Length
# Sets
r.sadd('tags:post:1', 'python', 'database', 'backend')
r.smembers('tags:post:1')
r.sismember('tags:post:1', 'python') # Is member?
# Hashes
r.hset('user:1', mapping={'name': 'Alice', 'email': 'alice@example.com'})
r.hgetall('user:1')
# Pub/Sub
r.publish('channel:notifications', 'New message')
# Transactions
pipe = r.pipeline()
pipe.set('key1', 'value1')
pipe.set('key2', 'value2')
pipe.execute()
API Design
REST API Best Practices
HTTP Methods:
GET - Retrieve resource (safe, idempotent)
POST - Create resource
PUT - Replace entire resource (idempotent)
PATCH - Partial update
DELETE - Remove resource (idempotent)
Status Codes:
200 OK - Success
201 Created - Resource created
204 No Content - Success, no body
400 Bad Request - Client error
401 Unauthorized - Auth required
403 Forbidden - Not allowed
404 Not Found - Resource missing
500 Internal Server Error
Resource URLs:
GET /api/users # List all
GET /api/users/:id # Get one
POST /api/users # Create
PUT /api/users/:id # Update (full)
PATCH /api/users/:id # Update (partial)
DELETE /api/users/:id # Delete
// Nested resources
GET /api/users/:id/posts # User's posts
POST /api/users/:id/posts # Create post for user
Request/Response Example:
POST /api/users
Content-Type: application/json
{
"name": "Alice",
"email": "alice@example.com",
"age": 25
}
Response (201 Created):
{
"id": 123,
"name": "Alice",
"email": "alice@example.com",
"age": 25,
"created_at": "2024-01-15T10:30:00Z"
}
GraphQL
# Schema
type User {
id: ID!
name: String!
email: String!
posts: [Post!]!
}
type Post {
id: ID!
title: String!
content: String!
author: User!
}
type Query {
user(id: ID!): User
users(limit: Int): [User!]!
post(id: ID!): Post
}
type Mutation {
createUser(name: String!, email: String!): User!
updateUser(id: ID!, name: String): User
deleteUser(id: ID!): Boolean!
}
# Query
query GetUserWithPosts {
user(id: "123") {
name
email
posts {
title
id
}
}
}
# Mutation
mutation CreateUser {
createUser(name: "Alice", email: "alice@example.com") {
id
name
email
}
}
GraphQL vs REST:
| Aspect | REST | GraphQL |
|---|---|---|
| Over-fetching | Common | None |
| Under-fetching | Need multiple requests | Single query |
| Caching | Easy (HTTP caching) | More complex |
| Learning curve | Low | High |
| Use case | Simple CRUD | Complex, nested data |
System Design & Architecture
Scalability Patterns
Vertical Scaling (Scale Up):
- Add more CPU, RAM, storage
- Simple but has limits
- Single point of failure
Horizontal Scaling (Scale Out):
- Add more servers
- Load balancing needed
- Better resilience
Caching Strategy
Cache Levels:
1. Client-side (browser cache)
2. CDN (edge caching)
3. Application cache (Redis, Memcached)
4. Database (query caching)
Cache Invalidation Strategies:
1. TTL (Time to Live) - Automatic expiration
2. Event-based - Invalidate on change
3. Purge - Manual invalidation
Microservices Architecture
Advantages:
✓ Independent scaling
✓ Technology diversity
✓ Faster deployment
Challenges:
✗ Network latency
✗ Distributed transactions
✗ Operational complexity
Pattern:
API Gateway → Services → Databases
↓
Service Discovery
Message Queue
Logging/Monitoring
Database Sharding
Split data across multiple databases
- Range-based: User ID 1-1000 → DB1, 1001-2000 → DB2
- Hash-based: hash(user_id) % num_shards
- Directory-based: Lookup table maps to shard
Tradeoffs:
✓ Horizontal scaling
✗ Complex queries
✗ Operational overhead
Learning Checklist
- Understand SQL SELECT with WHERE, JOIN
- Can design normalized schema
- Know when to use indexes
- Understand NoSQL document stores
- Built API with proper status codes
- Know REST vs GraphQL trade-offs
- Understand caching strategies
- Know sharding and replication
- Understand microservices patterns
- Ready for architect role!
Source: https://roadmap.sh/sql, https://roadmap.sh/system-design, https://roadmap.sh/api-design