| name | aws-prescriptive-guidance |
| description | AWS Prescriptive Guidance for best practices and architectural patterns. Use for AWS architecture recommendations, SageMaker AI endpoints guidance, deployment patterns, and AWS solution architectures. |
AWS Prescriptive Guidance Skill
Comprehensive AWS architectural patterns, ML deployment strategies, and cloud design best practices from official AWS documentation.
When to Use This Skill
This skill should be triggered when:
- Designing AWS architectures - Multi-tier applications, microservices, serverless systems
- Deploying ML models - SageMaker endpoints, inference pipelines, MLOps workflows
- Implementing cloud patterns - Circuit breakers, saga patterns, event sourcing, API routing
- Modernizing applications - Strangler fig migrations, anti-corruption layers, hexagonal architecture
- Building resilient systems - Retry with backoff, scatter-gather, publish-subscribe patterns
- Creating MLOps pipelines - Model training, deployment automation, cross-cloud workflows
- Integrating preprocessing with inference - SageMaker inference pipelines, feature engineering
- Implementing DevOps practices - CI/CD patterns, infrastructure as code, deployment strategies
Specific triggers:
- Questions about SageMaker deployment patterns
- Architecture design for AWS services
- MLOps workflow implementation
- Microservices design patterns
- System modernization strategies
- DevOps and CI/CD best practices
Quick Reference
1. SageMaker Inference Pipeline Setup
Deploy preprocessing and ML model in a single endpoint:
```python
import sagemaker
from sagemaker import get_execution_role

# Initialize SageMaker session
sagemaker_session = sagemaker.Session()
role = get_execution_role()
bucket = sagemaker_session.default_bucket()

# Upload training data
train_input = sagemaker_session.upload_data(
    path="data/training.csv",
    bucket=bucket,
    key_prefix="myproject/train",
)
```
2. Create SKLearn Preprocessor
Build a preprocessing stage for your inference pipeline:
```python
from sagemaker.sklearn.estimator import SKLearn

sklearn_preprocessor = SKLearn(
    entry_point="preprocessing.py",
    role=role,
    framework_version="0.23-1",
    instance_count=1,
    instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session,
)
sklearn_preprocessor.fit({"train": train_input})
```
3. Train Linear Learner Model
Create and train a regression model:
```python
import boto3
import sagemaker
from sagemaker.image_uris import retrieve

ll_image = retrieve("linear-learner", boto3.Session().region_name)
ll_estimator = sagemaker.estimator.Estimator(
    ll_image,
    role,
    instance_count=1,
    instance_type="ml.m4.2xlarge",
    output_path=f"s3://{bucket}/model-output",
    sagemaker_session=sagemaker_session,
)
ll_estimator.set_hyperparameters(
    feature_dim=10,
    predictor_type="regressor",
    mini_batch_size=32,
)
# preprocessed_data is the batch transform output from step 6 below
ll_estimator.fit({"train": preprocessed_data})
```
4. Deploy Pipeline Model
Combine preprocessing and inference in single endpoint:
```python
from time import gmtime, strftime

from sagemaker.pipeline import PipelineModel

# Create models from the trained estimators
preprocessing_model = sklearn_preprocessor.create_model()
inference_model = ll_estimator.create_model()

# Create the pipeline model
timestamp = strftime("%Y-%m-%d-%H-%M-%S", gmtime())
pipeline = PipelineModel(
    name=f"pipeline-{timestamp}",
    role=role,
    models=[preprocessing_model, inference_model],
)

# Deploy both containers behind a single real-time endpoint
pipeline.deploy(
    initial_instance_count=1,
    instance_type="ml.c4.xlarge",
    endpoint_name=f"pipeline-endpoint-{timestamp}",
)
```
5. Test Inference Endpoint
Send data to your deployed pipeline:
```python
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

# Use the timestamped endpoint name from the deploy step
predictor = Predictor(
    endpoint_name="pipeline-endpoint-2024-01-15",
    sagemaker_session=sagemaker_session,
    serializer=CSVSerializer(),
)

# Raw CSV input; the preprocessing container runs automatically
payload = "0.44,0.365,0.125,0.516,0.2155,0.114"
prediction = predictor.predict(payload)
print(f"Prediction: {prediction}")
```
6. Batch Transform for Testing
Test preprocessing before full pipeline deployment:
```python
# Create a transformer from the fitted preprocessor
transformer = sklearn_preprocessor.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
    assemble_with="Line",
    accept="text/csv",
)

# Transform the training data
transformer.transform(train_input, content_type="text/csv")
transformer.wait()

# S3 location of the preprocessed output (consumed in step 3)
preprocessed_data = transformer.output_path
```
7. Circuit Breaker Pattern
Implement resilient service calls with circuit breaker:
```python
import time
from enum import Enum

class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"

class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.state = CircuitState.CLOSED
        self.failures = 0
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.last_failure_time = None

    def call(self, func, *args, **kwargs):
        if self.state == CircuitState.OPEN:
            if time.time() - self.last_failure_time > self.timeout:
                # Timeout elapsed: allow one trial call
                self.state = CircuitState.HALF_OPEN
            else:
                raise Exception("Circuit breaker is OPEN")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.on_failure()
            raise
        self.on_success()
        return result

    def on_success(self):
        self.failures = 0
        self.state = CircuitState.CLOSED

    def on_failure(self):
        self.failures += 1
        self.last_failure_time = time.time()
        # A failed trial call, or too many failures, opens the circuit
        if self.state == CircuitState.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = CircuitState.OPEN
```
8. Retry with Exponential Backoff
Implement resilient API calls:
```python
import random
import time

def retry_with_backoff(func, max_retries=5, base_delay=1, max_delay=60):
    """Retry a callable with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff, capped at max_delay, plus up to 10% jitter
            delay = min(base_delay * (2 ** attempt), max_delay)
            jitter = random.uniform(0, delay * 0.1)
            sleep_time = delay + jitter
            print(f"Attempt {attempt + 1} failed. Retrying in {sleep_time:.2f}s")
            time.sleep(sleep_time)

# Usage (api_call is a placeholder for your own call)
result = retry_with_backoff(lambda: api_call())
```
9. Publish-Subscribe Pattern (SNS/SQS)
Decouple services with pub-sub messaging:
```python
import json

import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

# Create topic
topic_response = sns.create_topic(Name="order-events")
topic_arn = topic_response["TopicArn"]

# Create queue
queue_response = sqs.create_queue(QueueName="order-processing")
queue_url = queue_response["QueueUrl"]
queue_arn = sqs.get_queue_attributes(
    QueueUrl=queue_url,
    AttributeNames=["QueueArn"],
)["Attributes"]["QueueArn"]

# Allow the topic to deliver to the queue; without this policy,
# SNS-to-SQS deliveries fail silently
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sns.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
    }],
}
sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)})

# Subscribe queue to topic
sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)

# Publish message
sns.publish(
    TopicArn=topic_arn,
    Message='{"order_id": "12345", "status": "created"}',
    Subject="OrderCreated",
)
```
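On the consuming side, a worker polls the queue. A minimal continuation of the snippet above (reusing the same sqs client and queue_url):

```python
# Consumer side: long-poll the queue and delete processed messages
messages = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20,  # long polling reduces empty receives
)
for msg in messages.get("Messages", []):
    print(msg["Body"])  # SNS envelope wrapping the published message
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```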
10. API Gateway Lambda Integration
Serverless API with request routing:
```python
import json

def lambda_handler(event, context):
    """Lambda function behind an API Gateway proxy integration."""
    # Extract request details; body is None for bodyless requests,
    # so fall back to an empty JSON object before parsing
    http_method = event.get("httpMethod")
    path = event.get("path")
    body = json.loads(event.get("body") or "{}")

    # Route based on method and path
    if http_method == "GET" and path == "/items":
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"items": ["item1", "item2"]}),
        }
    if http_method == "POST" and path == "/items":
        # Process new item
        item_id = body.get("id")
        return {
            "statusCode": 201,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"id": item_id, "status": "created"}),
        }
    return {
        "statusCode": 404,
        "body": json.dumps({"error": "Not found"}),
    }
```
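For a quick local check, the handler can be driven with a hand-built event; this is a simplified stand-in for the full proxy event API Gateway actually sends:

```python
# Minimal stand-in for an API Gateway proxy event
event = {"httpMethod": "GET", "path": "/items", "body": None}
print(lambda_handler(event, None))  # {'statusCode': 200, ...}
```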
Key Concepts
Cloud Design Patterns
AWS Prescriptive Guidance covers primary architectural patterns for modern cloud applications, including:
- Anti-corruption Layer - Mediates between legacy systems and modern microservices, translating data formats and protocols
- API Routing Patterns - Three strategies for routing requests (hostname, path, HTTP header based)
- Circuit Breaker - Prevents cascading failures by detecting faults and stopping requests to failing services
- Event Sourcing - Captures all state changes as a sequence of events for audit trails and replay capability
- Hexagonal Architecture - Separates core business logic from external dependencies and infrastructure
- Publish-Subscribe - Enables asynchronous, decoupled communication between services
- Retry with Backoff - Handles transient failures with progressive delay increases
- Saga Patterns - Manages distributed transactions across microservices (choreography vs orchestration)
- Scatter-Gather - Aggregates results from parallel requests to multiple services (see the sketch after this list)
- Strangler Fig - Incrementally replaces legacy systems without complete rewrites
- Transactional Outbox - Ensures reliable message publishing with database transactions
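Circuit breaker, retry with backoff, and publish-subscribe appear in the Quick Reference above. Scatter-gather does not, so here is a minimal sketch using concurrent.futures; the service names and simulated latency are hypothetical stand-ins for real downstream calls.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def query_service(name):
    """Stand-in for a call to one downstream service."""
    time.sleep(random.uniform(0.05, 0.2))  # simulated network latency
    return {"service": name, "quote": round(random.uniform(10, 99), 2)}

def scatter_gather(service_names, timeout=1.0):
    """Fan requests out in parallel and gather whatever returns in
    time; a slow or failing service does not block the aggregate."""
    with ThreadPoolExecutor(max_workers=len(service_names)) as pool:
        futures = [pool.submit(query_service, n) for n in service_names]
        results = []
        for future in futures:
            try:
                results.append(future.result(timeout=timeout))
            except Exception:
                continue  # skip services that fail or time out
    return results

print(scatter_gather(["carrier-a", "carrier-b", "carrier-c"]))
```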
SageMaker ML Deployment Patterns
Inference Pipelines:
- Combine preprocessing and model inference in single endpoint
- Reduce latency by eliminating intermediate data storage
- Simplify deployment by bundling feature engineering with prediction
- Support real-time and batch inference workflows
MLOps Workflows:
- Automated model training and deployment pipelines
- Cross-platform integration (Azure DevOps, GitHub Actions)
- Self-service templates with Backstage for standardized deployments
- GPU-optimized training for custom deep learning models
Architecture Benefits:
- Single endpoint for preprocessing + inference = lower latency
- Automatic scaling of both preprocessing and model inference (see the auto-scaling sketch after this list)
- Consistent feature engineering between training and inference
- Simplified monitoring and logging
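As a sketch of the auto-scaling benefit noted above, target-tracking scaling is attached to an endpoint variant through the Application Auto Scaling API; the endpoint and variant names, capacities, and target value here are illustrative assumptions.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Hypothetical endpoint/variant; substitute your deployed names
resource_id = "endpoint/pipeline-endpoint/variant/AllTraffic"

# Register the variant's instance count as a scalable target
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Track invocations per instance to scale out under load
autoscaling.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)
```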
Microservices Architecture Principles
Modern applications emphasize:
- Scalability - Horizontal scaling of independent services
- Release Velocity - Fast, independent deployments per service
- Fault Isolation - Failures contained to individual services
- Technology Diversity - Different services can use different tech stacks
Design Pattern Categories
Integration Patterns:
- Anti-corruption layer, API routing, event sourcing
Resilience Patterns:
- Circuit breaker, retry with backoff, scatter-gather
Data Patterns:
- Event sourcing, transactional outbox, saga orchestration (a transactional outbox sketch follows this list)
Migration Patterns:
- Strangler fig, hexagonal architecture
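As an illustration of the transactional outbox listed above, here is a minimal sketch using SQLite as a stand-in database: the business row and the outbox row commit in one transaction, and a separate relay publishes pending rows. Table names and the relay loop are illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         payload TEXT, published INTEGER DEFAULT 0);
""")

def create_order(order_id):
    # Business write and event record commit atomically:
    # either both rows exist or neither does
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, "created"))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"order_id": order_id, "status": "created"}),),
        )

def relay_outbox(publish):
    # A separate process polls unpublished rows and forwards them,
    # e.g. to SNS; marking rows published gives at-least-once delivery
    for row_id, payload in conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0"
    ).fetchall():
        publish(payload)
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))

create_order("12345")
relay_outbox(print)  # swap print for an SNS publish in practice
```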
Reference Files
best_practices.md
Official AWS Prescriptive Guidance best practices for cloud architecture, covering:
- AWS Well-Architected Framework alignment
- Security best practices
- Cost optimization strategies
- Operational excellence guidelines
- Performance efficiency recommendations
- Reliability design patterns
When to use: Start here for general AWS architecture guidance and foundational principles.
patterns.md
Comprehensive catalog of AWS implementation patterns, including:
- MLOps and SageMaker deployment patterns
- Microservices architecture patterns
- DevOps and CI/CD patterns
- Data engineering patterns
- Serverless architecture patterns
- Migration and modernization strategies
When to use: Reference when implementing specific solutions or looking for proven architectural approaches.
Working with This Skill
For Beginners
Start with:
- Review best_practices.md for AWS foundational concepts
- Explore simple patterns like retry with backoff or publish-subscribe
- Use the Quick Reference section for copy-paste examples
- Focus on single-service patterns before multi-service architectures
Recommended learning path:
- Basic SageMaker deployment → Inference pipelines → MLOps workflows
- Simple Lambda functions → API Gateway integration → Microservices
- Single pattern → Combined patterns → Full architecture design
For Intermediate Users
Focus on:
- Multi-service integration patterns (circuit breaker, saga patterns; a saga orchestration sketch follows this list)
- SageMaker inference pipelines with custom preprocessing
- Event-driven architectures with SNS/SQS
- API Gateway patterns with Lambda integrations
- DevOps automation for ML model deployment
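For the saga patterns mentioned in this list, a minimal orchestration sketch follows: each step pairs an action with a compensating action, and a failure unwinds completed steps in reverse. The steps here are local placeholders for real service calls.

```python
def run_saga(steps):
    """Execute (action, compensation) pairs in order; on failure,
    run the compensations for completed steps in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            for undo in reversed(completed):
                undo()  # best-effort rollback of earlier steps
            raise

# Hypothetical local steps standing in for service calls
order = {"reserved": False, "charged": False}

run_saga([
    (lambda: order.update(reserved=True),
     lambda: order.update(reserved=False)),
    (lambda: order.update(charged=True),
     lambda: order.update(charged=False)),
])
```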
Navigation tips:
- Use patterns.md for comprehensive implementation guides
- Combine multiple patterns for complex architectures
- Reference Quick Reference for implementation details
- Study MLOps workflows for production ML systems
For Advanced Users
Deep dive into:
- Complex saga orchestration patterns
- Multi-region, multi-account architectures
- Custom SageMaker algorithms with GPU optimization
- Hexagonal architecture for domain-driven design
- Event sourcing with CQRS patterns
- Cross-cloud MLOps with Azure DevOps or GitHub Actions
Best practices:
- Combine multiple patterns for enterprise architectures
- Implement observability and monitoring from the start
- Use infrastructure as code (CloudFormation, CDK, Terraform)
- Design for failure with circuit breakers and retry logic
- Implement proper security boundaries and IAM policies
Technology Stack
Core AWS Services
- Amazon SageMaker - ML model training and deployment
- Amazon SageMaker Studio - Integrated ML development environment
- AWS Lambda - Serverless compute
- Amazon API Gateway - RESTful API management
- Amazon SNS - Pub-sub messaging
- Amazon SQS - Message queuing
- Amazon ECR - Container registry
- Amazon S3 - Object storage
- AWS CloudFormation - Infrastructure as code
ML Frameworks
- Scikit-learn - Preprocessing and traditional ML
- PyTorch - Deep learning models
- TensorFlow - Neural network frameworks
- XGBoost - Gradient boosting
- CatBoost - Categorical feature handling
DevOps Tools
- Azure DevOps - Cross-cloud CI/CD
- GitHub Actions - Workflow automation
- Backstage - Developer portal and templates
- Hydra - ML experiment configuration
Common Workflows
Deploy ML Model with Preprocessing
- Prepare data and upload to S3
- Create preprocessing estimator (SKLearn, custom containers)
- Train preprocessing on sample data
- Create ML model estimator (Linear Learner, XGBoost, custom)
- Train model on preprocessed data
- Combine into pipeline model
- Deploy pipeline to SageMaker endpoint
- Test with raw input data
Implement Resilient Microservice
- Design service boundaries (hexagonal architecture)
- Implement circuit breaker for external calls
- Add retry with exponential backoff for transient failures
- Use pub-sub pattern for async communication
- Implement health checks and monitoring (see the sketch after this list)
- Add API Gateway for routing and throttling
- Deploy with auto-scaling and load balancing
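A minimal sketch of the health-check step above; the dependency probes are hypothetical placeholders that should test real downstream resources.

```python
import json

def check_database():
    # Hypothetical probe; replace with a real connectivity check
    return True

def check_queue():
    # Hypothetical probe; replace with a real connectivity check
    return True

def health_handler(event, context):
    """Report per-dependency status; return 503 if anything is down
    so load balancers and orchestrators can rotate the instance out."""
    checks = {"database": check_database(), "queue": check_queue()}
    healthy = all(checks.values())
    return {
        "statusCode": 200 if healthy else 503,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"healthy": healthy, "checks": checks}),
    }
```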
Modernize Legacy System
- Analyze legacy system and identify boundaries
- Implement anti-corruption layer for integration
- Use strangler fig pattern for incremental migration
- Build new services with hexagonal architecture
- Route traffic gradually to new services (see the routing sketch after this list)
- Monitor and validate new implementations
- Decommission legacy components incrementally
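The gradual traffic shift above can be sketched as a weighted facade in front of the old and new implementations; the handlers and the 10% starting weight are illustrative. In production this routing typically lives in API Gateway, an ALB, or Route 53 weighted records rather than application code.

```python
import random

def legacy_handler(request):
    return {"source": "legacy", "request": request}

def modern_handler(request):
    return {"source": "modern", "request": request}

def strangler_facade(request, modern_weight=0.1):
    """Send a configurable fraction of traffic to the new service;
    raise the weight as confidence in the new path grows."""
    if random.random() < modern_weight:
        return modern_handler(request)
    return legacy_handler(request)

print(strangler_facade({"path": "/items"}))
```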
Prerequisites
For SageMaker Development
- Active AWS account with appropriate IAM permissions
- Python 3.9 or higher
- SageMaker Python SDK (v2.49.2+)
- Boto3 library for AWS API access
- Understanding of ML concepts and workflows
For Architecture Implementation
- AWS account with admin or PowerUser access
- Familiarity with AWS core services (EC2, S3, Lambda)
- Understanding of distributed systems concepts
- Experience with microservices architectures (helpful)
- Knowledge of IaC tools (CloudFormation, CDK, Terraform)
Development Environment
```bash
# Install required Python packages
pip install boto3 sagemaker scikit-learn pandas numpy

# Configure AWS credentials
aws configure
```
Resources
Official Documentation Links
- AWS Prescriptive Guidance: https://docs.aws.amazon.com/prescriptive-guidance/
- SageMaker Developer Guide: https://docs.aws.amazon.com/sagemaker/
- AWS Architecture Center: https://aws.amazon.com/architecture/
Pattern Categories
- MLOps Patterns - Model deployment, training automation, endpoint management
- DevOps Patterns - CI/CD pipelines, infrastructure automation, deployment strategies
- Microservices Patterns - Service communication, resilience, API design
- Data Patterns - Data pipelines, event streaming, batch processing
- Migration Patterns - Legacy modernization, cloud migration, hybrid architectures
Notes
- Patterns include step-by-step implementation guides with code examples
- Code examples target current AWS SDK versions; verify against the SDK releases you have installed
- SageMaker examples use Python 3.9+ and SageMaker SDK 2.x
- Architecture patterns follow AWS Well-Architected Framework
- Reference implementations available in AWS Solutions Library
- Examples assume standard AWS regions (us-east-1, us-west-2)
Best Practices
SageMaker Deployment
- Use inference pipelines to bundle preprocessing with models
- Implement batch transform for large-scale predictions
- Enable auto-scaling for production endpoints
- Use multi-model endpoints for cost optimization (see the invocation sketch after this list)
- Implement proper IAM roles with least privilege
- Enable CloudWatch logging and monitoring
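For the multi-model endpoint point above, each invocation names the model artifact to serve via TargetModel; the endpoint name, artifact name, and payload here are hypothetical.

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# Hypothetical multi-model endpoint; TargetModel picks the artifact
# (relative to the endpoint's S3 model prefix) to load and invoke
response = runtime.invoke_endpoint(
    EndpointName="multi-model-endpoint",
    ContentType="text/csv",
    TargetModel="model-a.tar.gz",
    Body="0.44,0.365,0.125,0.516,0.2155,0.114",
)
print(response["Body"].read())
```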
Architecture Design
- Design for failure - assume components will fail
- Implement retry logic with exponential backoff
- Use circuit breakers to prevent cascade failures
- Decouple services with asynchronous messaging
- Implement proper observability from day one
- Use infrastructure as code for reproducibility
- Follow the strangler fig pattern for migrations
Cost Optimization
- Use SageMaker Savings Plans for predictable workloads
- Leverage spot instances for training jobs (see the sketch after this list)
- Implement auto-scaling for endpoints
- Use multi-model endpoints to reduce costs
- Monitor and optimize compute instance types
- Clean up unused resources and endpoints
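Managed spot training, mentioned above, is enabled on an estimator with use_spot_instances plus a stopping condition and a checkpoint location. This sketch reuses ll_image and role from the Quick Reference; the limits and instance type are illustrative.

```python
import sagemaker

session = sagemaker.Session()
bucket = session.default_bucket()

# Managed spot with checkpointing, so an interrupted job can resume
spot_estimator = sagemaker.estimator.Estimator(
    ll_image,                        # image URI from the Quick Reference
    role,                            # execution role from the Quick Reference
    instance_count=1,
    instance_type="ml.m5.xlarge",
    use_spot_instances=True,
    max_run=3600,                    # cap on training seconds
    max_wait=7200,                   # cap on training plus spot-wait seconds
    checkpoint_s3_uri=f"s3://{bucket}/checkpoints",
    sagemaker_session=session,
)
```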
Updating
This skill is based on AWS Prescriptive Guidance documentation. AWS regularly updates patterns and best practices, so:
- Check AWS Prescriptive Guidance website for latest patterns
- Review SageMaker SDK changelogs for API updates
- Follow AWS architecture blog for new design patterns
- Subscribe to AWS What's New for service updates