SKILL.md

name infrastructure-security
version 2.0.0
description Securing AI/ML infrastructure including model storage, API endpoints, and compute resources
sasmp_version 1.3.0
bonded_agent 06-api-security-tester
bond_type PRIMARY_BOND
owasp_llm_2025 LLM03, LLM10
nist_ai_rmf Govern, Manage

AI Infrastructure Security

Protect AI/ML infrastructure from attacks targeting model storage, APIs, and compute resources.

Quick Reference

Skill:       infrastructure-security
Agent:       06-api-security-tester
OWASP:       LLM03 (Supply Chain), LLM10 (Unbounded Consumption)
NIST:        Govern, Manage
Use Case:    Secure AI deployment infrastructure

Infrastructure Attack Surface

                    [External Threats]
                          ↓
[API Gateway] → [Load Balancer] → [Inference Servers]
      ↓              ↓                    ↓
[Rate Limit]   [DDoS Protection]   [Model Storage]
      ↓              ↓                    ↓
[Auth/AuthZ]   [TLS Termination]   [Secrets Manager]

Security Layers

1. API Security

Authentication:
  methods:
    - API keys (rotation: 90 days)
    - OAuth 2.0 / OIDC
    - mTLS for service-to-service
  requirements:
    - Strong key generation
    - Secure transmission
    - Revocation capability
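The three requirements above (strong generation, rotation, revocation) can be sketched with the standard library alone. This is a minimal in-memory illustration, not a production key service: `ApiKeyStore`, its method names, and the choice to store only SHA-256 digests are assumptions made for the example.

```python
import hashlib
import secrets
import time
from typing import Optional

ROTATION_SECONDS = 90 * 24 * 3600  # 90-day rotation, matching the config above


class ApiKeyStore:
    """Hypothetical in-memory store; only SHA-256 digests of keys are kept."""

    def __init__(self):
        self._issued = {}  # key digest -> issue time (epoch seconds)

    @staticmethod
    def _digest(key: str) -> str:
        return hashlib.sha256(key.encode("utf-8")).hexdigest()

    def issue(self, now: Optional[float] = None) -> str:
        # 32 random bytes (256 bits) from the OS CSPRNG, URL-safe encoded
        key = secrets.token_urlsafe(32)
        self._issued[self._digest(key)] = time.time() if now is None else now
        return key

    def revoke(self, key: str) -> None:
        # Revoking an unknown key is a no-op
        self._issued.pop(self._digest(key), None)

    def is_valid(self, key: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        issued = self._issued.get(self._digest(key))
        # Keys older than the rotation period are treated as expired
        return issued is not None and now - issued < ROTATION_SECONDS
```

Storing digests rather than raw keys means a leaked store does not directly expose usable credentials; secure transmission (TLS) is still required when the key is handed to the client.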

Rate Limiting:
  per_user: 100 req/min
  per_ip: 1000 req/min
  burst: 50
  cost_based: true  # Token-aware limiting
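One common way to combine a steady rate, a burst allowance, and cost-based limiting is a token bucket. The sketch below assumes that `burst` is the bucket capacity and that `per_user` is the refill rate; the class name `TokenBucket` and the `cost` parameter are illustrative and do not come from the skill itself.

```python
import time


class TokenBucket:
    """Per-user bucket: refills at rate_per_min, holds at most `burst` units."""

    def __init__(self, rate_per_min: float, burst: float, clock=time.monotonic):
        self.rate = rate_per_min / 60.0  # refill units per second
        self.capacity = burst
        self.level = burst               # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity
        now = self.clock()
        self.level = min(self.capacity, self.level + (now - self.updated) * self.rate)
        self.updated = now
        if cost > self.level:
            return False  # caller should respond with HTTP 429
        self.level -= cost
        return True
```

For token-aware limiting, `cost` can be set in proportion to the prompt and completion size of a request rather than fixed at 1, so that a few very large requests consume the same budget as many small ones.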

Input Validation:
  max_length: 4096 tokens
  content_type: application/json
  schema_validation: strict
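  encoding: UTF-8 normalized

# API Security Configuration
class APISecurityConfig:
    def __init__(self):
        # OAuth 2.0 access tokens expire after one hour (3600 seconds)
        self.auth_config = {
            'type': 'oauth2',
            'token_expiry': 3600,
            'refresh_enabled': True,
        }

        # Requests allowed per window (in seconds), by tier
        self.rate_limits = {
            'default': {'requests': 100, 'window': 60},
            'premium': {'requests': 1000, 'window': 60},
            'burst_multiplier': 2,
        }

        self.input_validation = {
            'max_tokens': 4096,
            'blocked_patterns': self._load_blocked_patterns(),
            'sanitization': True,
        }

    def _load_blocked_patterns(self):
        # Placeholder: load the deployment's deny-list patterns here,
        # for example from a reviewed configuration file.
        return []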
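The input validation settings above could be enforced roughly as follows. This is a sketch under stated assumptions: the request schema (a single string field named `prompt`), the function name `validate_request`, the choice of NFC as the normalization form, and the whitespace-based token count are all placeholders; a real deployment would use the model's own tokenizer and its actual schema.

```python
import json
import unicodedata

MAX_TOKENS = 4096  # matches max_length / max_tokens in the config above


def validate_request(body: bytes, content_type: str,
                     count_tokens=lambda text: len(text.split())):
    """Return the normalized payload or raise ValueError."""
    # content_type: only application/json is accepted (parameters ignored)
    if content_type.split(";")[0].strip().lower() != "application/json":
        raise ValueError("unsupported content type")
    try:
        text = body.decode("utf-8")  # strict decoding rejects invalid bytes
    except UnicodeDecodeError as exc:
        raise ValueError("body is not valid UTF-8") from exc
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError("body is not valid JSON") from exc
    # Strict schema (hypothetical): exactly one string field, "prompt"
    if (not isinstance(payload, dict) or set(payload) != {"prompt"}
            or not isinstance(payload["prompt"], str)):
        raise ValueError("schema violation")
    # Normalize so visually identical strings compare equal downstream
    prompt = unicodedata.normalize("NFC", payload["prompt"])
    if count_tokens(prompt) > MAX_TOKENS:
        raise ValueError("prompt exceeds token limit")
    return {"prompt": prompt}
```

Rejecting unknown fields, rather than silently dropping them, is what makes the schema check "strict" in this sketch.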

2. Model Protection

Storage Security:
  encryption: AES-256-GCM
  access_control: RBAC
  audit_logging: enabled
  backup: encrypted, offsite

Theft Prevention:
  query_limits: 10000/day per user
  output_perturbation: enabled
  watermarking: model and output
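  access_logging: all queries

class RateLimitError(Exception):
    """Raised when a user exceeds the per-day query limit."""


class ModelProtection:
    # The helpers used below (_generate_watermark, log_query, exceeds_limit,
    # add_perturbation, apply_watermark) are deployment-specific and are
    # not shown here.

    def __init__(self, model):
        self.model = model
        self.watermark = self._generate_watermark()

    def protected_inference(self, input_data, user_id):
        # Log the query (all queries are logged, including rejected ones)
        self.log_query(user_id, input_data)

        # Check query limits
        if self.exceeds_limit(user_id):
            raise RateLimitError("Query limit exceeded")

        # Run inference
        output = self.model(input_data)

        # Add output perturbation (anti-extraction)
        output = self.add_perturbation(output)

        # Apply watermark
        output = self.apply_watermark(output)

        return output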
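The skill does not say what form output perturbation takes. One simple option for a classifier-style model, shown here purely as an illustration, is to return only the top few classes with coarsely rounded probabilities, which reduces the information an attacker gains per query. The function name and both parameters are assumptions.

```python
def perturb_probabilities(probs, decimals=2, top_k=3):
    """Return (class_index, rounded_probability) for the top_k classes only."""
    # Rank classes by probability, highest first, and truncate to top_k
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)[:top_k]
    # Coarse rounding hides the low-order digits useful for model extraction
    return [(index, round(p, decimals)) for index, p in ranked]


print(perturb_probabilities([0.1234, 0.6543, 0.2001, 0.0222]))
# [(1, 0.65), (2, 0.2), (0, 0.12)]
```

The trade-off is accuracy of the reported confidence: coarser rounding and a smaller `top_k` give less to an extractor but also less to legitimate clients.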

3. Network Security

Network Configuration:
  internal_only: true
  vpc_isolation: enabled
  firewall_rules:
    - allow: internal_services
    - deny: all_external (except API gateway)

TLS Configuration:
  version: "1.3"
  cipher_suites: [TLS_AES_256_GCM_SHA384]
  certificate_rotation: 90 days
  mtls: service_to_service
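For a Python service, the TLS settings above map onto the standard `ssl` module roughly as follows. The function name and its parameters are illustrative; certificate paths are left to the caller. Note that, as far as the standard library documentation describes, `SSLContext.set_ciphers()` does not control TLS 1.3 cipher suites, so restricting the suite list to `TLS_AES_256_GCM_SHA384` would have to be done at the proxy or OpenSSL configuration level rather than in this code.

```python
import ssl


def build_server_context(certfile=None, keyfile=None, client_ca=None):
    """Build a server-side context that refuses anything below TLS 1.3."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # refuse TLS 1.2 and older
    if certfile:
        # Server certificate and private key (rotated every 90 days)
        ctx.load_cert_chain(certfile, keyfile)
    if client_ca:
        # mTLS: require a client certificate signed by the given CA
        ctx.load_verify_locations(cafile=client_ca)
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

Passing `client_ca` turns on mutual TLS for service-to-service traffic; leaving it unset gives ordinary server-authenticated TLS.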

4. Compute Security

Container Security:
  base_image: distroless
  user: non-root
  filesystem: read-only
  capabilities: minimal
  seccomp: enabled

Resource Limits:
  cpu: 4 cores max
  memory: 16GB max
  gpu_memory: 24GB max
  disk: ephemeral only

Isolation:
  runtime: gvisor
  network: namespace isolated
  secrets: mounted, not in env
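With secrets mounted as files rather than passed through environment variables, the application reads them from the mount at startup. A minimal sketch, assuming the secrets are mounted under `/run/secrets` (a common convention, but an assumption here) and using a hypothetical helper name:

```python
from pathlib import Path


def read_mounted_secret(name: str, base_dir: str = "/run/secrets") -> str:
    """Read one secret from a mounted directory, refusing path traversal."""
    base = Path(base_dir).resolve()
    path = (base / name).resolve()
    # The resolved path must stay inside the mount directory
    if base not in path.parents:
        raise ValueError("secret name escapes the mount directory")
    # Strip the trailing newline that most secret files end with
    return path.read_text(encoding="utf-8").strip()
```

Unlike environment variables, file-mounted secrets are not inherited by child processes and do not show up in process listings or crash dumps of the environment.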

Security Checklist

API Layer:
  - [ ] Strong authentication (OAuth2/mTLS)
  - [ ] Rate limiting implemented
  - [ ] Input validation enabled
  - [ ] Error messages sanitized
  - [ ] Logging comprehensive

Storage Layer:
  - [ ] Encryption at rest
  - [ ] Access controls configured
  - [ ] Audit logging enabled
  - [ ] Backup encryption

Network Layer:
  - [ ] TLS 1.3 enforced
  - [ ] Internal VPC only
  - [ ] Firewall rules configured
  - [ ] DDoS protection enabled

Compute Layer:
  - [ ] Non-root containers
  - [ ] Resource limits set
  - [ ] Secrets in vault
  - [ ] Immutable infrastructure

Vulnerability Testing

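from dataclasses import dataclass

import requests


@dataclass
class Finding:
    name: str
    severity: str


class InfrastructureSecurityTester:
    # test_rate_limits, test_input_validation, and test_error_disclosure
    # follow the same pattern as test_auth_bypass and are not shown here.

    def test_api_security(self, endpoint):
        results = []

        # Test authentication bypass
        results.append(self.test_auth_bypass(endpoint))

        # Test rate limiting
        results.append(self.test_rate_limits(endpoint))

        # Test input validation
        results.append(self.test_input_validation(endpoint))

        # Test error handling
        results.append(self.test_error_disclosure(endpoint))

        # Each test returns a Finding or None; report only the findings
        return [r for r in results if r is not None]

    def test_auth_bypass(self, endpoint):
        # Missing, invalid, and oversized credentials must all be rejected
        payloads = [
            {'Authorization': ''},
            {'Authorization': 'Bearer invalid'},
            {'Authorization': 'Bearer ' + 'a' * 1000},
        ]
        for payload in payloads:
            response = requests.get(endpoint, headers=payload, timeout=10)
            # 401 and 403 are proper rejections; a server may also refuse
            # the oversized header with 400 or 431
            if response.status_code not in (400, 401, 403, 431):
                return Finding("auth_bypass", "CRITICAL")
        return None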
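The rate-limit test is not shown in the skill. One way it might look, sketched without any network dependency: the caller supplies a `send` function that performs one request and returns the HTTP status code, and the check flags a finding if no request is throttled after the limit has been exceeded. The function name, the tuple return value, and the `extra` parameter are assumptions for illustration.

```python
def check_rate_limit(send, limit, extra=10):
    """Send limit + extra requests; expect at least one HTTP 429 response.

    `send` is a caller-supplied function that performs one request and
    returns its status code.
    """
    statuses = [send() for _ in range(limit + extra)]
    if 429 not in statuses:
        # No throttling observed: classified as HIGH (rate limiting bypassable)
        return ("rate_limit_missing", "HIGH")
    return None
```

Run this only against systems you are authorized to test, since it deliberately exceeds the configured limit.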

Severity Classification

CRITICAL:
  - Authentication bypass
  - Model theft possible
  - Data exposure

HIGH:
  - Rate limiting bypassable
  - Weak encryption
  - Insufficient logging

MEDIUM:
  - Missing input validation
  - Verbose error messages
  - Outdated dependencies

LOW:
  - Non-optimal configurations
  - Minor policy gaps
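The classification above can be encoded so that findings are sorted and a pipeline gate can be applied automatically. The finding identifiers, the `triage` helper, and the decision to treat unknown findings as LOW are assumptions made for this sketch.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical identifiers for the finding types listed above
SEVERITY_BY_FINDING = {
    "auth_bypass": Severity.CRITICAL,
    "model_theft": Severity.CRITICAL,
    "data_exposure": Severity.CRITICAL,
    "rate_limit_bypass": Severity.HIGH,
    "weak_encryption": Severity.HIGH,
    "insufficient_logging": Severity.HIGH,
    "missing_input_validation": Severity.MEDIUM,
    "verbose_errors": Severity.MEDIUM,
    "outdated_dependencies": Severity.MEDIUM,
}


def triage(finding_names, gate=Severity.HIGH):
    """Return findings sorted most severe first, and whether the gate blocks."""
    def severity(name):
        return SEVERITY_BY_FINDING.get(name, Severity.LOW)  # unknown -> LOW

    ranked = sorted(finding_names, key=severity, reverse=True)
    blocked = any(severity(name) >= gate for name in finding_names)
    return ranked, blocked
```

For example, `triage(["verbose_errors", "auth_bypass"])` places `auth_bypass` first and reports the gate as blocked, while a run with only MEDIUM findings passes a HIGH gate.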

Troubleshooting

Issue: API rate limiting not effective
Solution: Add token-aware (cost-based) limits and IP reputation checks

Issue: Model extraction detected
Solution: Lower query limits, add output perturbation

Issue: High latency from security layers
Solution: Optimize validation, use caching, async logging
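For the async logging suggestion, Python's standard library provides `QueueHandler` and `QueueListener`, which move the slow write off the request path. A minimal sketch follows; the helper name and the logger name `security.audit` are assumptions.

```python
import logging
import logging.handlers
import queue


def start_async_logging(target_handler, logger_name="security.audit"):
    """Route a logger through a queue so request threads never block on I/O."""
    log_queue = queue.Queue(-1)  # unbounded queue
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    # Request threads only enqueue the record, which is fast
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # A background thread drains the queue into the real (slow) handler
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

Call `listener.stop()` during shutdown so that queued records are flushed to the target handler before the process exits.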

Integration Points

Component    Purpose
Agent 06     Security testing
Agent 08     CI/CD security gates
/test api    Security scanning
SIEM         Security monitoring

Protect AI infrastructure with defense-in-depth security.