| name | mongodb-schema-design |
| version | 2.1.0 |
| description | Master MongoDB schema design and data modeling patterns. Learn embedding vs referencing, relationships, normalization, and schema evolution. Use when designing databases, normalizing data, or optimizing queries. |
| sasmp_version | 1.3.0 |
| bonded_agent | 03-mongodb-data-modeling |
| bond_type | PRIMARY_BOND |
| capabilities | embedding-strategy, referencing-strategy, relationship-modeling, schema-patterns, evolution-planning |
| input_validation | [object Object] |
| output_format | [object Object] |
| error_handling | [object Object] |
| prerequisites | [object Object] |
| testing | [object Object] |
MongoDB Schema Design
Master data modeling and schema patterns.
Quick Start
One-to-One: Embedded
// User with single address - embed if always accessed together
{
_id: ObjectId('...'),
name: 'John',
email: 'john@example.com',
address: {
street: '123 Main St',
city: 'New York',
zip: '10001'
}
}
One-to-Many: Embed Array
// User with multiple tags - embed if limited size
{
_id: ObjectId('...'),
name: 'John',
tags: ['mongodb', 'database', 'nosql'],
posts: [
{ _id: 1, title: 'Post 1', content: '...' },
{ _id: 2, title: 'Post 2', content: '...' }
]
}
One-to-Many: Reference
// User with many orders - reference if potentially large
{
_id: ObjectId('user1'),
name: 'John',
email: 'john@example.com'
}
// Orders collection
{
_id: ObjectId('order1'),
customerId: ObjectId('user1'),
total: 99.99
}
Many-to-Many: Array of References
// Products with categories
{
_id: ObjectId('product1'),
name: 'Laptop',
categoryIds: [
ObjectId('electronics'),
ObjectId('computers')
]
}
// Categories collection
{
_id: ObjectId('electronics'),
name: 'Electronics'
}
Schema Patterns
Attribute Pattern
// Store variant attributes flexibly
{
_id: ObjectId('...'),
productName: 'T-Shirt',
attributes: [
{ key: 'color', value: 'blue' },
{ key: 'size', value: 'L' },
{ key: 'material', value: 'cotton' }
]
}
Polymorphic Pattern
// Different document types in same collection
{
_id: ObjectId('...'),
type: 'email',
to: 'user@example.com',
subject: 'Hello'
}
{
_id: ObjectId('...'),
type: 'sms',
phoneNumber: '+1234567890',
message: 'Hi there'
}
Tree Structures: Adjacency List
// Parent-child relationships
{
_id: ObjectId('...'),
name: 'Electronics',
parent: null
}
{
_id: ObjectId('...'),
name: 'Computers',
parent: ObjectId('electronics')
}
Versioned Pattern
// Track document history
{
_id: ObjectId('...'),
name: 'Product',
description: 'Latest description',
versions: [
{ v: 1, name: 'Product', description: 'Original', date: ISODate(...) },
{ v: 2, name: 'Product', description: 'Updated', date: ISODate(...) }
]
}
Design Principles
Embedding Advantages
- Single query to fetch related data
- Atomic updates for related documents
- No joins needed
Referencing Advantages
- Avoid data duplication
- Smaller documents
- Flexible relationships
- Can grow independently
Decision Tree
Does the related data grow unbounded?
YES → Use referencing
NO → Consider embedding
Is the related data frequently accessed separately?
YES → Use referencing
NO → Consider embedding
Do updates need to be atomic across documents?
YES → Use embedding
NO → Use referencing
Python Design Example
# User with embedded address
users.insert_one({
'name': 'John',
'email': 'john@example.com',
'address': {
'street': '123 Main St',
'city': 'New York'
}
})
# User with references to orders
users.insert_one({
'_id': ObjectId('...'),
'name': 'John'
})
orders.insert_one({
'userId': ObjectId('...'),
'total': 99.99
})
# Query with $lookup
users.aggregate([
{ '$lookup': {
'from': 'orders',
'localField': '_id',
'foreignField': 'userId',
'as': 'orders'
}}
])
Best Practices
✅ Embed when data is always accessed together ✅ Reference for unbounded arrays ✅ Keep document size under 16MB ✅ Consider query patterns when designing ✅ Denormalize carefully for performance ✅ Plan for schema evolution ✅ Use validation schemas ✅ Document your design decisions