| name | infrastructure-security |
| version | 2.0.0 |
| description | Securing AI/ML infrastructure including model storage, API endpoints, and compute resources |
| sasmp_version | 1.3.0 |
| bonded_agent | 06-api-security-tester |
| bond_type | PRIMARY_BOND |
| owasp_llm_2025 | LLM03, LLM10 |
| nist_ai_rmf | Govern, Manage |
AI Infrastructure Security
Protect AI/ML infrastructure from attacks targeting model storage, APIs, and compute resources.
Quick Reference
Skill: infrastructure-security
Agent: 06-api-security-tester
OWASP: LLM03 (Supply Chain), LLM10 (Unbounded Consumption)
NIST: Govern, Manage
Use Case: Secure AI deployment infrastructure
Infrastructure Attack Surface
[External Threats]
      ↓
[API Gateway] → [Load Balancer] → [Inference Servers]
      ↓                ↓                   ↓
[Rate Limit]   [DDoS Protection]    [Model Storage]
      ↓                ↓                   ↓
[Auth/AuthZ]   [TLS Termination]   [Secrets Manager]
Security Layers
1. API Security
Authentication:
  methods:
    - API keys (rotation: 90 days)
    - OAuth 2.0 / OIDC
    - mTLS for service-to-service
  requirements:
    - Strong key generation
    - Secure transmission
    - Revocation capability

Rate Limiting:
  per_user: 100 req/min
  per_ip: 1000 req/min
  burst: 50
  cost_based: true  # Token-aware limiting

Input Validation:
  max_length: 4096 tokens
  content_type: application/json
  schema_validation: strict
  encoding: UTF-8 normalized
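The input validation settings above could be enforced along these lines. This is a minimal sketch, not a definitive implementation: `validate_request`, `ValidationError`, `count_tokens`, and the single-field `prompt` schema are hypothetical names introduced for illustration, and the whitespace token count stands in for the model's real tokenizer.

```python
import json
import unicodedata

MAX_TOKENS = 4096  # mirrors max_length in the configuration above


class ValidationError(ValueError):
    """Raised when a request fails input validation (hypothetical type)."""


def count_tokens(text):
    # Assumption: whitespace splitting stands in for the real tokenizer.
    return len(text.split())


def validate_request(content_type, body):
    # content_type: only JSON is accepted (parameters such as charset are ignored).
    if content_type.split(";")[0].strip().lower() != "application/json":
        raise ValidationError("unsupported content type")
    # encoding: the body must decode as strict UTF-8 and parse as JSON.
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValidationError("body is not valid UTF-8 JSON") from exc
    # schema_validation: strict means no missing and no extra fields.
    if not isinstance(payload, dict) or set(payload) != {"prompt"}:
        raise ValidationError("unexpected or missing fields")
    if not isinstance(payload["prompt"], str):
        raise ValidationError("prompt must be a string")
    # Normalize to NFC so visually identical inputs compare equal downstream.
    prompt = unicodedata.normalize("NFC", payload["prompt"])
    # max_length: reject prompts over the token budget.
    if count_tokens(prompt) > MAX_TOKENS:
        raise ValidationError("prompt exceeds token limit")
    return {"prompt": prompt}
```

Normalizing before the length check and before any pattern matching means that deny-list rules see a single canonical form of each character.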
# API Security Configuration
class APISecurityConfig:
    def __init__(self):
        self.auth_config = {
            'type': 'oauth2',
            'token_expiry': 3600,  # seconds
            'refresh_enabled': True,
        }
        self.rate_limits = {
            # 'window' is in seconds
            'default': {'requests': 100, 'window': 60},
            'premium': {'requests': 1000, 'window': 60},
            'burst_multiplier': 2,
        }
        self.input_validation = {
            'max_tokens': 4096,
            'blocked_patterns': self._load_blocked_patterns(),
            'sanitization': True,
        }

    def _load_blocked_patterns(self):
        # Placeholder: load deny-list patterns from your own policy source.
        return []
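Token-aware (cost-based) rate limiting can be sketched with a token bucket whose charge per request depends on the tokens consumed. The sketch below is an illustration under stated assumptions: `TokenBucket`, `CostAwareLimiter`, and `request_cost` are hypothetical names, the bucket capacity reuses the `burst: 50` value and the refill rate reuses the 100-per-minute figure, and the pricing formula is made up for the example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    capacity: float        # burst size: the most budget a user can hold
    refill_per_sec: float  # budget restored per second
    updated: float = field(default_factory=time.monotonic)
    level: float = field(init=False)

    def __post_init__(self):
        self.level = self.capacity  # start with a full bucket

    def allow(self, cost, now):
        # Refill for the elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.level = min(self.capacity, self.level + elapsed * self.refill_per_sec)
        self.updated = now
        if cost > self.level:
            return False
        self.level -= cost
        return True


class CostAwareLimiter:
    """Keeps one bucket per user and charges each request its cost."""

    def __init__(self, capacity=50.0, per_minute=100.0):
        self.capacity = capacity
        self.refill_per_sec = per_minute / 60.0
        self.buckets = {}

    def allow(self, user_id, cost=1.0, now=None):
        now = time.monotonic() if now is None else now
        bucket = self.buckets.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_sec, updated=now)
            self.buckets[user_id] = bucket
        return bucket.allow(cost, now)


def request_cost(prompt_tokens, max_output_tokens):
    # Hypothetical pricing: one unit per request plus one per 1000 tokens.
    return 1.0 + (prompt_tokens + max_output_tokens) / 1000.0
```

A caller would compute `request_cost(...)` before inference and reject the request with HTTP 429 when `allow` returns `False`. In a multi-instance deployment the bucket state would need to live in shared storage rather than in process memory.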
2. Model Protection
Storage Security:
  encryption: AES-256-GCM
  access_control: RBAC
  audit_logging: enabled
  backup: encrypted, offsite

Theft Prevention:
  query_limits: 10000/day per user
  output_perturbation: enabled
  watermarking: model and output
  access_logging: all queries
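class RateLimitError(Exception):
    """Raised when a user exceeds the query limit."""


class ModelProtection:
    # log_query, exceeds_limit, add_perturbation, apply_watermark and
    # _generate_watermark are deployment-specific and not shown here.
    def __init__(self, model):
        self.model = model
        self.watermark = self._generate_watermark()

    def protected_inference(self, input_data, user_id):
        # Log the query (every query is logged, including rejected ones)
        self.log_query(user_id, input_data)
        # Check query limits
        if self.exceeds_limit(user_id):
            raise RateLimitError("Query limit exceeded")
        # Run inference
        output = self.model(input_data)
        # Add output perturbation (anti-extraction)
        output = self.add_perturbation(output)
        # Apply watermark
        output = self.apply_watermark(output)
        return output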
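One possible shape for the `add_perturbation` step, assuming the model returns a mapping of labels to probabilities, is to truncate, optionally jitter, and round the scores so that a caller learns less about the decision surface per query. The function name `perturb_probabilities` and its parameters are hypothetical, and the right amount of coarsening depends on how much accuracy the application can give up.

```python
import random


def perturb_probabilities(probs, top_k=3, decimals=2, noise=0.0, rng=None):
    """Coarsen a {label: probability} mapping before returning it to a caller."""
    rng = rng or random.Random()
    # 1. Truncate: expose only the top_k most likely labels.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # 2. Optional noise: jitter each score (this can reorder near-ties).
    noisy = {
        label: max(p + rng.uniform(-noise, noise), 1e-9)
        for label, p in ranked
    }
    # 3. Renormalize over the kept labels and round to limit precision.
    total = sum(noisy.values())
    return {label: round(p / total, decimals) for label, p in noisy.items()}
```

With `noise=0.0` the output is deterministic, which keeps responses cacheable; a non-zero `noise` trades some reproducibility for extra resistance to extraction.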
3. Network Security
Network Configuration:
  internal_only: true
  vpc_isolation: enabled
  firewall_rules:
    - allow: internal_services
    - deny: all_external (except API gateway)

TLS Configuration:
  version: "1.3"
  cipher_suites: [TLS_AES_256_GCM_SHA384]
  certificate_rotation: 90 days
  mtls: service_to_service
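For a Python service that terminates TLS itself, the TLS settings above map roughly onto the standard `ssl` module as sketched here. `build_server_context` and its parameter names are hypothetical, and the certificate paths are supplied by the caller. Note that `SSLContext.set_ciphers()` does not control TLS 1.3 cipher suites, so restricting the suite list to `TLS_AES_256_GCM_SHA384` would need to happen in the OpenSSL configuration or at the proxy in front of the service.

```python
import ssl


def build_server_context(certfile=None, keyfile=None, client_ca=None):
    """Build a server-side context that refuses anything below TLS 1.3."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    if certfile:
        # Server certificate and private key (paths supplied by the caller).
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if client_ca:
        # mTLS: require a client certificate signed by the internal CA.
        ctx.load_verify_locations(cafile=client_ca)
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

Certificate rotation then becomes a matter of rebuilding the context from the new files and swapping it in for new connections.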
4. Compute Security
Container Security:
  base_image: distroless
  user: non-root
  filesystem: read-only
  capabilities: minimal
  seccomp: enabled

Resource Limits:
  cpu: 4 cores max
  memory: 16GB max
  gpu_memory: 24GB max
  disk: ephemeral only

Isolation:
  runtime: gvisor
  network: namespace isolated
  secrets: mounted, not in env
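If the inference servers run on Kubernetes, the compute requirements above can be checked against a container spec before deployment. The sketch below assumes the spec has already been parsed into a dictionary; the field names (`runAsNonRoot`, `readOnlyRootFilesystem`, `allowPrivilegeEscalation`, `capabilities.drop`, `seccompProfile`) are standard Kubernetes container fields, while `audit_container` and the secret-name heuristic are hypothetical.

```python
SECRET_HINTS = ("SECRET", "TOKEN", "PASSWORD", "API_KEY")  # heuristic only


def audit_container(spec):
    """Return a list of policy violations for one container spec (a dict)."""
    sc = spec.get("securityContext", {})
    findings = []
    if not sc.get("runAsNonRoot"):
        findings.append("container may run as root")
    if not sc.get("readOnlyRootFilesystem"):
        findings.append("root filesystem is writable")
    if sc.get("allowPrivilegeEscalation", True):
        findings.append("privilege escalation is not disabled")
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        findings.append("capabilities are not dropped")
    if sc.get("seccompProfile", {}).get("type") not in ("RuntimeDefault", "Localhost"):
        findings.append("no seccomp profile set")
    limits = spec.get("resources", {}).get("limits", {})
    for key in ("cpu", "memory"):
        if key not in limits:
            findings.append(f"missing {key} limit")
    # Secrets should be mounted as files, not passed through the environment.
    for env in spec.get("env", []):
        if any(hint in env.get("name", "").upper() for hint in SECRET_HINTS):
            findings.append(f"possible secret in env: {env['name']}")
    return findings
```

An empty return value means the spec meets every rule checked here; it does not cover the sandboxed runtime or network isolation, which are configured at the pod and cluster level.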
Security Checklist
API Layer:
- [ ] Strong authentication (OAuth2/mTLS)
- [ ] Rate limiting implemented
- [ ] Input validation enabled
- [ ] Error messages sanitized
- [ ] Logging comprehensive
Storage Layer:
- [ ] Encryption at rest
- [ ] Access controls configured
- [ ] Audit logging enabled
- [ ] Backup encryption
Network Layer:
- [ ] TLS 1.3 enforced
- [ ] Internal VPC only
- [ ] Firewall rules configured
- [ ] DDoS protection enabled
Compute Layer:
- [ ] Non-root containers
- [ ] Resource limits set
- [ ] Secrets in vault
- [ ] Immutable infrastructure
Vulnerability Testing
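import requests
from dataclasses import dataclass


@dataclass
class Finding:
    name: str
    severity: str


class InfrastructureSecurityTester:
    # test_rate_limits, test_input_validation and test_error_disclosure
    # follow the same pattern and are not shown here.
    def test_api_security(self, endpoint):
        results = []
        # Test authentication bypass
        results.append(self.test_auth_bypass(endpoint))
        # Test rate limiting
        results.append(self.test_rate_limits(endpoint))
        # Test input validation
        results.append(self.test_input_validation(endpoint))
        # Test error handling
        results.append(self.test_error_disclosure(endpoint))
        # Keep only the checks that produced a finding
        return [r for r in results if r is not None]

    def test_auth_bypass(self, endpoint):
        payloads = [
            {'Authorization': ''},
            {'Authorization': 'Bearer invalid'},
            {'Authorization': 'Bearer ' + 'a' * 1000},
        ]
        for payload in payloads:
            response = requests.get(endpoint, headers=payload, timeout=10)
            # Anything other than 401/403 means the bad credential was not rejected
            if response.status_code not in (401, 403):
                return Finding("auth_bypass", "CRITICAL")
        return None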
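The `test_error_disclosure` check is not defined above. One way to approach it is to send malformed requests and scan the response bodies for implementation details. The sketch below covers only the scanning half: `find_error_leaks` and the pattern set are hypothetical, and the patterns are a small illustrative sample, not a complete list.

```python
import re

# Illustrative patterns for details that should not reach API clients.
LEAK_PATTERNS = {
    "python_traceback": re.compile(r"Traceback \(most recent call last\)"),
    "file_path": re.compile(r"/[\w./-]+\.py\b"),
    "sql_error": re.compile(r"(?i)\b(syntax error at or near|sqlstate|ORA-\d{5})"),
    "internal_ip": re.compile(r"\b(10\.\d{1,3}|192\.168)\.\d{1,3}\.\d{1,3}\b"),
}


def find_error_leaks(body):
    """Return the sorted names of leak patterns found in a response body."""
    return sorted(name for name, pat in LEAK_PATTERNS.items() if pat.search(body))
```

A tester method could call this on `response.text` for each malformed request and return a MEDIUM finding when the list is non-empty, in line with the "Verbose error messages" entry in the severity classification.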
Severity Classification
CRITICAL:
  - Authentication bypass
  - Model theft possible
  - Data exposure
HIGH:
  - Rate limiting bypassable
  - Weak encryption
  - Insufficient logging
MEDIUM:
  - Missing input validation
  - Verbose error messages
  - Outdated dependencies
LOW:
  - Non-optimal configurations
  - Minor policy gaps
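The classification above can be encoded so that findings are ordered and a pipeline gate can fail on them. This is a sketch under assumptions: the issue identifiers, `Severity`, `SEVERITY_BY_ISSUE`, and `triage` are hypothetical names, and the HIGH threshold is an example policy, not a stated requirement.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Issue identifiers are hypothetical; severities follow the classification above.
SEVERITY_BY_ISSUE = {
    "auth_bypass": Severity.CRITICAL,
    "model_theft": Severity.CRITICAL,
    "data_exposure": Severity.CRITICAL,
    "rate_limit_bypass": Severity.HIGH,
    "weak_encryption": Severity.HIGH,
    "insufficient_logging": Severity.HIGH,
    "missing_input_validation": Severity.MEDIUM,
    "verbose_errors": Severity.MEDIUM,
    "outdated_dependencies": Severity.MEDIUM,
    "non_optimal_config": Severity.LOW,
    "minor_policy_gap": Severity.LOW,
}


def triage(issues, fail_at=Severity.HIGH):
    """Return (passed, issues sorted most severe first)."""
    ranked = sorted(issues, key=lambda i: SEVERITY_BY_ISSUE[i], reverse=True)
    passed = all(SEVERITY_BY_ISSUE[i] < fail_at for i in ranked)
    return passed, ranked
```

An unknown identifier raises `KeyError`, which forces new issue types to be classified explicitly before they can pass a gate.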
Troubleshooting
Issue: API rate limiting not effective
Solution: Implement token-based limits, add IP reputation
Issue: Model extraction detected
Solution: Lower query limits, add output perturbation
Issue: High latency from security layers
Solution: Optimize validation, use caching, async logging
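For the latency issue, asynchronous logging can be sketched with the standard library's `QueueHandler` and `QueueListener`: the request thread only enqueues the record, and a background thread performs the slow write. `start_async_logging` and the logger name `inference.audit` are hypothetical; the target handler (file, syslog, SIEM forwarder) is whatever the deployment already uses.

```python
import logging
import logging.handlers
import queue


def start_async_logging(target_handler, logger_name="inference.audit"):
    """Route a logger through an in-memory queue to a background listener."""
    log_queue = queue.Queue(-1)  # unbounded; bound it if memory is a concern
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate synchronous writes via the root logger
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # The listener thread drains the queue and calls the slow handler.
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

Call `listener.stop()` during shutdown so that queued records are flushed before the process exits; otherwise the last audit entries can be lost.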
Integration Points
| Component | Purpose |
|---|---|
| Agent 06 | Security testing |
| Agent 08 | CI/CD security gates |
| /test api | Security scanning |
| SIEM | Security monitoring |
Protect AI infrastructure with defense-in-depth security.