| name | system-design |
| description | System design, software architecture, API design, cybersecurity, and threat modeling. Build secure, scalable systems. |
| triggers | architecture, system design, security, api, scalability, owasp, threat modeling, compliance |
| level | advanced |
| prerequisites | core-development, data-structures |
| sasmp_version | 1.3.0 |
| bonded_agent | 01-core-paths |
| bond_type | PRIMARY_BOND |
System Design Skill
Quick Reference
| Pattern | Best For | Complexity | Scaling |
|---|---|---|---|
| Monolith | Startups, MVPs | Low | Limited |
| Microservices | Large teams | High | Excellent |
| Serverless | Event-driven | Medium | Auto |
| Event-Driven | High throughput | High | Excellent |
Scalability Progression
Level 1: Single Server
│
▼ Bottleneck: CPU/Memory
Level 2: Load Balancer + Multiple Servers
│
▼ Bottleneck: Database reads
Level 3: Caching Layer (Redis)
│
▼ Bottleneck: Read load from cache misses
Level 4: Read Replicas
│
▼ Bottleneck: Database writes, single-DB limits
Level 5: Sharding / Partitioning
│
▼ Bottleneck: Cross-shard queries
Level 6: CQRS + Event Sourcing
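The caching level in this progression is commonly implemented as a cache-aside read path. The following is a minimal sketch, not a production client: a plain dict stands in for Redis, and `CacheAside`, `loader`, and the TTL value are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside read path with a TTL. A dict stands in for Redis here."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader      # hypothetical database read
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the database, then fill the cache.
        self.misses += 1
        value = self._loader(key)
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write so later reads do not return stale data."""
        self._store.pop(key, None)
```

Only misses reach the database, so the hit ratio (`hits / (hits + misses)`) is a rough indicator of how much read load the cache absorbs.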
Architecture Decision Tree
What's your team size and product stage?
│
├─► Team < 10, product unclear
│ └─► Monolith (start simple)
│
├─► Team > 10, clear domain boundaries
│ └─► Microservices
│
├─► Variable workloads, pay-per-use
│ └─► Serverless
│
└─► High throughput, async workflows
└─► Event-Driven
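The tree above can be read as a first-match rule list. A minimal sketch, assuming the branches are checked top to bottom and that a monolith is the fallback when nothing matches; both the ordering and the fallback are assumptions, and the function and parameter names are hypothetical.

```python
def recommend_architecture(team_size: int,
                           product_unclear: bool,
                           clear_domain_boundaries: bool,
                           variable_workload: bool,
                           high_throughput_async: bool) -> str:
    """Return the first matching branch of the decision tree."""
    if team_size < 10 and product_unclear:
        return "Monolith"
    if team_size > 10 and clear_domain_boundaries:
        return "Microservices"
    if variable_workload:
        return "Serverless"
    if high_throughput_async:
        return "Event-Driven"
    # Assumption: the tree has no explicit default, so start simple.
    return "Monolith"
```

For example, `recommend_architecture(25, False, True, False, False)` returns `"Microservices"`.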
API Design
REST Best Practices
GET    /api/v1/users              # List users
GET    /api/v1/users/{id}         # Get one user
POST   /api/v1/users              # Create a user
PUT    /api/v1/users/{id}         # Replace a user
PATCH  /api/v1/users/{id}         # Partially update a user
DELETE /api/v1/users/{id}         # Delete a user
GET    /api/v1/users/{id}/orders  # List a user's orders (nested resource)
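To show how the method and path together identify an operation, here is a small routing sketch using only the standard library. It mirrors the endpoints above; the `ROUTES` table, the action names, and `resolve` are hypothetical, and a real service would normally rely on its web framework's router instead.

```python
import re
from typing import Dict, Optional, Tuple

# (method, path pattern, action) mirroring the endpoints above.
ROUTES = [
    ("GET",    r"^/api/v1/users$",                      "list"),
    ("POST",   r"^/api/v1/users$",                      "create"),
    ("GET",    r"^/api/v1/users/(?P<id>[^/]+)$",        "get"),
    ("PUT",    r"^/api/v1/users/(?P<id>[^/]+)$",        "replace"),
    ("PATCH",  r"^/api/v1/users/(?P<id>[^/]+)$",        "update"),
    ("DELETE", r"^/api/v1/users/(?P<id>[^/]+)$",        "delete"),
    ("GET",    r"^/api/v1/users/(?P<id>[^/]+)/orders$", "list_orders"),
]


def resolve(method: str, path: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (action, path_params) for a request, or None if nothing matches."""
    for route_method, pattern, action in ROUTES:
        match = re.match(pattern, path)
        if match and route_method == method.upper():
            return action, match.groupdict()
    return None
```

The same path maps to different actions depending on the verb, which is why the URLs name resources (nouns) rather than operations.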
HTTP Status Codes
| Code | Meaning | Use When |
|---|---|---|
| 200 | OK | GET/PUT/PATCH success |
| 201 | Created | POST success |
| 204 | No Content | DELETE success |
| 400 | Bad Request | Invalid input |
| 401 | Unauthorized | No/invalid auth |
| 403 | Forbidden | No permission |
| 404 | Not Found | Resource missing |
| 429 | Too Many Requests | Rate limited |
| 500 | Internal Server Error | Server failure |
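The sketch below shows where several of these codes would typically be returned by a handler. It is illustrative only: the in-memory `USERS` store, the `"admin-token"` check, and the handler names are assumptions rather than a real authentication scheme.

```python
from http import HTTPStatus
from typing import Dict, Optional, Tuple

USERS: Dict[str, dict] = {}  # in-memory stand-in for a real data store

Response = Tuple[HTTPStatus, dict]


def create_user(token: Optional[str], payload: dict) -> Response:
    if token is None:
        return HTTPStatus.UNAUTHORIZED, {"error": "missing credentials"}   # 401
    if token != "admin-token":  # hypothetical permission check
        return HTTPStatus.FORBIDDEN, {"error": "not allowed"}              # 403
    name = payload.get("name")
    if not isinstance(name, str) or not name.strip():
        return HTTPStatus.BAD_REQUEST, {"error": "name is required"}       # 400
    user_id = str(len(USERS) + 1)
    USERS[user_id] = {"id": user_id, "name": name}
    return HTTPStatus.CREATED, USERS[user_id]                              # 201


def get_user(user_id: str) -> Response:
    user = USERS.get(user_id)
    if user is None:
        return HTTPStatus.NOT_FOUND, {"error": "user not found"}           # 404
    return HTTPStatus.OK, user                                             # 200


def delete_user(user_id: str) -> Response:
    if USERS.pop(user_id, None) is None:
        return HTTPStatus.NOT_FOUND, {"error": "user not found"}           # 404
    return HTTPStatus.NO_CONTENT, {}                                       # 204
```

Note the order of the checks in `create_user`: authentication (401) comes before authorization (403), and both come before input validation (400).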
Database Selection
| Use Case | Best Choice | Notes |
|---|---|---|
| Transactions | PostgreSQL | ACID, most versatile |
| High write | Cassandra | Write-optimized |
| Caching | Redis | Sub-millisecond |
| Search | Elasticsearch | Full-text search |
| Analytics | BigQuery | Column-store |
| Time-series | TimescaleDB | Time-based data |
| Graph | Neo4j | Relationships |
Security: OWASP Top 10 (2021)
| # | Vulnerability | Prevention |
|---|---|---|
| 1 | Broken Access Control | Check authorization on every request |
| 2 | Cryptographic Failures | TLS 1.3, AES-256, Argon2 |
| 3 | Injection | Parameterized queries |
| 4 | Insecure Design | Threat modeling |
| 5 | Security Misconfiguration | Harden defaults |
| 6 | Vulnerable and Outdated Components | Dependency scanning |
| 7 | Identification and Authentication Failures | MFA, rate limiting |
| 8 | Software and Data Integrity Failures | Sign data, verify sources |
| 9 | Security Logging and Monitoring Failures | Comprehensive logging |
| 10 | Server-Side Request Forgery (SSRF) | Allowlist URLs |
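As an illustration of the "Parameterized queries" row, the sketch below uses Python's built-in `sqlite3` module. The `users` table and the `find_user` helper are hypothetical; the same placeholder idea applies to other database drivers, though their placeholder syntax may differ.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list:
    # The ? placeholder passes the value separately from the SQL text,
    # so quotes inside `username` cannot change the query structure.
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()


# Hypothetical schema and data, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)",
                 [("alice",), ("bob",)])
conn.commit()
```

A classic injection string such as `' OR '1'='1` is treated as a literal username here and matches no rows, whereas building the query with string concatenation would return every user.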
Encryption Standards
| Layer | Standard | Notes |
|---|---|---|
| In Transit | TLS 1.3 | HTTPS everywhere |
| At Rest | AES-256 | Encrypt sensitive data |
| Passwords | Argon2id | bcrypt acceptable |
| API Keys | SHA-256 | Store hashed |
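The "API Keys" row can be sketched with the standard library alone. This assumes keys are long random strings generated by the server; the function names are hypothetical. A fast hash like SHA-256 is reasonable here only because the key is high-entropy, which is why passwords get a dedicated slow hash (Argon2id) instead.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_api_key(key: str) -> str:
    """SHA-256 hex digest of the key; this is the only value persisted."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def generate_api_key() -> Tuple[str, str]:
    """Return (plaintext_key, stored_hash). Show the plaintext to the user once."""
    key = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
    return key, hash_api_key(key)


def verify_api_key(presented: str, stored_hash: str) -> bool:
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

With this layout, a leaked database exposes only digests, and a lost key is rotated rather than recovered.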
Threat Modeling: STRIDE
┌─────────────────────────────────────────┐
│              STRIDE MODEL               │
├─────────────────────────────────────────┤
│  S - Spoofing                           │
│      → Strong auth, MFA                 │
│                                         │
│  T - Tampering                          │
│      → Integrity checks, signatures     │
│                                         │
│  R - Repudiation                        │
│      → Audit logging                    │
│                                         │
│  I - Information Disclosure             │
│      → Encryption, access control       │
│                                         │
│  D - Denial of Service                  │
│      → Rate limiting, DDoS protection   │
│                                         │
│  E - Elevation of Privilege             │
│      → Least privilege, RBAC            │
└─────────────────────────────────────────┘
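Rate limiting, the mitigation listed for Denial of Service, is often built on a token bucket. The following is a minimal single-process sketch under that assumption; the class name and parameters are hypothetical, and a multi-instance deployment would need shared state (for example in Redis) rather than local fields.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill based on elapsed time, never exceeding capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False  # caller would typically respond with HTTP 429
```

A typical use is one bucket per client identifier (API key or IP address), with `allow()` called at the start of each request.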
Compliance Requirements
| Standard | Domain | Key Requirements |
|---|---|---|
| GDPR | EU Data | Consent, right to delete |
| HIPAA | Healthcare | PHI encryption, audit logs |
| SOC 2 | Services | Security controls |
| PCI DSS | Payments | Card data protection |
| CCPA | CA Privacy | Consumer rights |
Disaster Recovery
| Strategy | RTO | RPO | Cost |
|---|---|---|---|
| Backup/Restore | Hours | Hours | Low |
| Pilot Light | Tens of minutes | Minutes | Medium |
| Warm Standby | Minutes | Seconds | High |
| Active-Active | Seconds | Near zero | Very High |
Troubleshooting
System not scaling?
├─► Database bottleneck? → Add caching, replicas
├─► Single point of failure? → Add redundancy
├─► Stateful services? → Make stateless
└─► Network limits? → CDN, optimize payloads
Security incident response?
├─► 1. CONTAIN: Isolate affected systems
├─► 2. IDENTIFY: Scope and entry point
├─► 3. ERADICATE: Remove threat, patch
├─► 4. RECOVER: Restore from clean backup
└─► 5. LEARN: Post-mortem, improve
Common Failure Modes
| Symptom | Root Cause | Recovery |
|---|---|---|
| Cascading failures | Tight coupling | Circuit breakers |
| Works locally | Env differences | Containers, IaC |
| Data breach | Missing controls | Audit, RBAC |
| Audit failed | Missing compliance | Gap analysis |
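The circuit breaker named as the recovery for cascading failures can be sketched as follows. This is a simplified illustration, assuming a consecutive-failure threshold and a fixed reset timeout; the class names and default values are hypothetical, and production libraries usually add metrics, per-exception policies, and thread safety.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens it.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail fast instead of piling up on a struggling dependency, which is what stops one slow service from dragging down the services that call it.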
Next Actions
Describe your system requirements for architecture recommendations.