| name | nosql-database-design |
| description | Design NoSQL database schemas for MongoDB and DynamoDB. Use when modeling document structures, designing collections, or planning NoSQL data architectures. |
NoSQL Database Design
Overview
Design scalable NoSQL schemas for MongoDB (document) and DynamoDB (key-value). Covers data modeling patterns, denormalization strategies, and query optimization for NoSQL systems.
When to Use
- MongoDB collection design
- DynamoDB table and index design
- Document structure modeling
- Embedding vs. referencing decisions
- Query pattern optimization
- NoSQL indexing strategies
- Data denormalization planning
MongoDB Schema Design
Document Structure Design
MongoDB - Embedded Documents:
// Single document with embedded arrays
db.createCollection("users")
db.users.insertOne({
_id: ObjectId("..."),
email: "john@example.com",
name: "John Doe",
createdAt: new Date(),
// Embedded address
address: {
street: "123 Main St",
city: "New York",
state: "NY",
zipCode: "10001"
},
// Embedded array of items
orders: [
{
orderId: ObjectId("..."),
date: new Date(),
total: 149.99
},
{
orderId: ObjectId("..."),
date: new Date(),
total: 89.99
}
]
})
MongoDB - Referenced Documents:
// Separate collections with references
db.createCollection("users")
db.createCollection("orders")
db.users.insertOne({
_id: ObjectId("..."),
email: "john@example.com",
name: "John Doe"
})
db.orders.insertMany([
{
_id: ObjectId("..."),
userId: ObjectId("..."), // Reference to user
orderDate: new Date(),
total: 149.99
},
{
_id: ObjectId("..."),
userId: ObjectId("..."),
orderDate: new Date(),
total: 89.99
}
])
// Query with $lookup for JOINs
db.orders.aggregate([
{
$match: { userId: ObjectId("...") }
},
{
$lookup: {
from: "users",
localField: "userId",
foreignField: "_id",
as: "user"
}
}
])
Indexing in MongoDB
// Single field index
db.users.createIndex({ email: 1 })
db.orders.createIndex({ createdAt: -1 })
// Compound index
db.orders.createIndex({ userId: 1, createdAt: -1 })
// Text index for search
db.products.createIndex({ name: "text", description: "text" })
// Geospatial index
db.stores.createIndex({ location: "2dsphere" })
// TTL index for auto-expiration
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
// Sparse index (only documents with field)
db.users.createIndex({ phone: 1 }, { sparse: true })
// Check index usage
db.users.aggregate([{ $indexStats: {} }])
Schema Validation
// Define collection validation schema
db.createCollection("products", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "price", "category"],
properties: {
_id: { bsonType: "objectId" },
name: {
bsonType: "string",
description: "Product name (required)"
},
price: {
bsonType: "decimal",
minimum: 0,
description: "Price must be positive"
},
category: {
enum: ["electronics", "clothing", "food"],
description: "Category must be one of listed values"
},
tags: {
bsonType: "array",
items: { bsonType: "string" }
},
createdAt: {
bsonType: "date"
}
}
}
}
})
DynamoDB Schema Design
Table Structure
// DynamoDB table with single primary key
const TableName = "users"
const params = {
TableName,
KeySchema: [
{ AttributeName: "userId", KeyType: "HASH" } // Partition key
],
AttributeDefinitions: [
{ AttributeName: "userId", AttributeType: "S" } // String
],
BillingMode: "PAY_PER_REQUEST" // On-demand
}
// DynamoDB table with composite primary key
const ordersParams = {
TableName: "orders",
KeySchema: [
{ AttributeName: "userId", KeyType: "HASH" }, // Partition key
{ AttributeName: "orderId", KeyType: "RANGE" } // Sort key
],
AttributeDefinitions: [
{ AttributeName: "userId", AttributeType: "S" },
{ AttributeName: "orderId", AttributeType: "S" }
],
BillingMode: "PAY_PER_REQUEST"
}
Global Secondary Indexes (GSI)
// Add GSI for querying by email
const gsiParams = {
TableName: "users",
AttributeDefinitions: [
{ AttributeName: "email", AttributeType: "S" }
],
GlobalSecondaryIndexes: [
{
IndexName: "emailIndex",
KeySchema: [
{ AttributeName: "email", KeyType: "HASH" }
],
Projection: {
ProjectionType: "ALL" // Return all attributes
},
BillingMode: "PAY_PER_REQUEST"
}
]
}
// GSI with composite key for time-based queries
const timeIndexParams = {
GlobalSecondaryIndexes: [
{
IndexName: "userCreatedIndex",
KeySchema: [
{ AttributeName: "userId", KeyType: "HASH" },
{ AttributeName: "createdAt", KeyType: "RANGE" }
],
Projection: { ProjectionType: "ALL" },
BillingMode: "PAY_PER_REQUEST"
}
]
}
DynamoDB Item Operations
// Put item (insert/update)
const putParams = {
TableName: "users",
Item: {
userId: { S: "user-123" },
email: { S: "john@example.com" },
name: { S: "John Doe" },
createdAt: { N: Date.now().toString() },
metadata: {
M: {
joinDate: { N: Date.now().toString() },
source: { S: "web" }
}
}
}
}
// Query using GSI
const queryParams = {
TableName: "users",
IndexName: "emailIndex",
KeyConditionExpression: "email = :email",
ExpressionAttributeValues: {
":email": { S: "john@example.com" }
}
}
// Batch get items
const batchGetParams = {
RequestItems: {
"users": {
Keys: [
{ userId: { S: "user-123" } },
{ userId: { S: "user-456" } }
]
}
}
}
Denormalization Patterns
MongoDB - Embedding for Performance:
// Embed frequently accessed data to avoid lookups
db.orders.insertOne({
_id: ObjectId("..."),
userId: ObjectId("..."),
userEmail: "john@example.com", // Denormalized
userName: "John Doe", // Denormalized
createdAt: new Date(),
items: [
{
productId: ObjectId("..."),
productName: "Laptop", // Denormalized
productPrice: 999.99, // Denormalized
quantity: 1
}
]
})
DynamoDB - Denormalization with Consistency:
// Store related data in same item to ensure consistency
const params = {
TableName: "orders",
Item: {
userId: { S: "user-123" },
orderId: { S: "order-456" },
orderDate: { N: Date.now().toString() },
// User data snapshot at order time
userSnapshot: {
M: {
email: { S: "john@example.com" },
address: { S: "123 Main St" }
}
},
// Items with product information
items: {
L: [
{
M: {
productId: { S: "prod-789" },
name: { S: "Laptop" },
price: { N: "999.99" },
quantity: { N: "1" }
}
}
]
}
}
}
Design Patterns
MongoDB - Time-Series Pattern:
// Efficient time-series data storage
db.sensor_data.insertOne({
_id: ObjectId("..."),
sensorId: "sensor-123",
date: ISODate("2024-01-15"),
measurements: [
{ time: "12:00", temperature: 72.5, humidity: 45 },
{ time: "12:01", temperature: 72.6, humidity: 45.2 },
{ time: "12:02", temperature: 72.4, humidity: 44.8 }
]
})
// Index for efficient queries
db.sensor_data.createIndex({ sensorId: 1, date: -1 })
DynamoDB - One-to-Many Relationship:
// Store one-to-many relationships efficiently
// User comments using userId as partition key, commentId as sort key
const commentParams = {
TableName: "comments",
Item: {
userId: { S: "user-123" }, // Partition key
commentId: { S: "comment-789" }, // Sort key
postId: { S: "post-456" },
content: { S: "Great article!" },
createdAt: { N: Date.now().toString() }
}
}
Capacity Planning
MongoDB - Horizontal Scaling:
// Sharding for large collections
sh.shardCollection("ecommerce.orders", { userId: "hashed" })
// Monitor shard distribution
db.orders.aggregate([
{ $group: { _id: "$userId", count: { $sum: 1 } } },
{ $sort: { count: -1 } }
])
DynamoDB - Partition Key Design:
// Good: Distribute across many keys
// Partition key: "USER#123" (spreads across partitions)
// Sort key: "ORDER#2024-01"
// Bad: Hot partition
// Partition key: "ADMIN" (all admin operations hit same partition)
// Solution: Add timestamp or random suffix
// Partition key: "ADMIN#20240115#random"
Migration Considerations
- Plan data migration strategy
- Consider consistency requirements
- Test query patterns before finalizing schema
- Monitor performance after deployment
- Document relationships and access patterns
- Plan for schema evolution