name	Terraform IaC Expert
description	Infrastructure as Code for AI workloads using Terraform across AWS, Azure, GCP, and OCI
version	1.1.0
last_updated	Tue Jan 06 2026 00:00:00 GMT+0000 (Coordinated Universal Time)
external_version	Terraform 1.10+
resources	resources/modules.tf
triggers	terraform, infrastructure as code, IaC, provisioning

Terraform IaC Expert

Expert in Terraform and Infrastructure as Code for deploying AI infrastructure across multi-cloud environments.

Project Structure

infrastructure/
├── modules/
│   ├── aws-bedrock/
│   ├── azure-openai/
│   ├── oci-genai/
│   └── vector-store/
├── environments/
│   ├── dev/
│   ├── staging/
│   └── prod/
├── shared/
│   ├── networking/
│   └── security/
└── scripts/

Module Overview

Module	Provider	Purpose
`aws-bedrock`	AWS	Bedrock, Knowledge Bases, VPC Endpoints
`azure-openai`	Azure	Azure OpenAI, AI Search, Private Endpoints
`oci-genai`	OCI	DACs, Endpoints, Agents, Knowledge Bases
`vector-store`	Multi	OpenSearch Serverless, Qdrant, Milvus

Full module code: resources/modules.tf

AWS AI Infrastructure

Bedrock Module

module "aws_ai" {
  source = "./modules/aws-bedrock"

  prefix             = "prod"
  region             = "us-east-1"
  vpc_id             = data.aws_vpc.main.id
  private_subnet_ids = data.aws_subnets.private.ids

  enable_private_endpoint = true
  create_knowledge_base   = true
}

Key Resources

IAM roles for Bedrock access
VPC endpoints for private connectivity
Knowledge bases with OpenSearch

Azure AI Infrastructure

Azure OpenAI Module

module "azure_ai" {
  source = "./modules/azure-openai"

  openai_name         = "prod-openai"
  location            = "eastus"
  resource_group_name = azurerm_resource_group.ai.name

  gpt4o_capacity      = 100  # TPM
  embedding_capacity  = 50

  enable_private_endpoint = true
}

Key Resources

Cognitive Account (OpenAI kind)
Model deployments (GPT-4o, embeddings)
Private endpoints
Azure AI Search

OCI AI Infrastructure

GenAI Module

module "oci_ai" {
  source = "./modules/oci-genai"

  prefix         = "prod"
  compartment_id = var.oci_compartment_id
  cluster_type   = "HOSTING"
  unit_count     = 10
  unit_shape     = "LARGE_COHERE"

  create_agent          = true
  create_knowledge_base = true
}

Key Resources

Dedicated AI Clusters (DAC)
Model endpoints
GenAI Agents
Knowledge bases

Vector Store Infrastructure

OpenSearch Serverless (AWS)

module "vectors" {
  source = "./modules/vector-store/aws-opensearch"

  prefix           = "prod"
  vpc_endpoint_ids = [aws_vpc_endpoint.opensearch.id]
  allowed_principals = [aws_iam_role.bedrock.arn]
}

Multi-Cloud Environment

# environments/prod/main.tf

terraform {
  backend "s3" {
    bucket = "terraform-state-ai-infra"
    key    = "prod/terraform.tfstate"
    encrypt = true
  }
}

provider "aws" { region = var.aws_region }
provider "azurerm" { features {} }
provider "oci" { ... }

# Deploy to all clouds
module "aws_ai" { source = "../../modules/aws-bedrock" ... }
module "azure_ai" { source = "../../modules/azure-openai" ... }
module "oci_ai" { source = "../../modules/oci-genai" ... }

Best Practices

State Management

Remote state (S3, Azure Blob, OCI Object Storage)
State locking (DynamoDB, Cosmos DB)
Encrypt state at rest
Separate state per environment

Variable Validation

variable "environment" {
  type = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

Sensitive Data

Use sensitive = true for outputs
Reference secrets from secret managers
Never commit .tfvars with secrets

Tagging

default_tags {
  tags = {
    Environment = var.environment
    Project     = "ai-platform"
    ManagedBy   = "terraform"
  }
}

CI/CD Integration

GitHub Actions

- uses: hashicorp/setup-terraform@v3
- run: terraform init
- run: terraform plan -out=tfplan
- run: terraform apply -auto-approve tfplan
  if: github.ref == 'refs/heads/main'

Key Patterns

Plan on PR, apply on merge
Use workspaces or directories for environments
Lock state during apply
Store plan artifacts

Managed Kubernetes GPU

EKS GPU Nodes

eks_managed_node_groups = {
  gpu = {
    instance_types = ["g5.2xlarge"]
    ami_type = "AL2_x86_64_GPU"
    taints = [{ key = "nvidia.com/gpu" ... }]
  }
}

AKS GPU Nodes

resource "azurerm_kubernetes_cluster_node_pool" "gpu" {
  vm_size = "Standard_NC24ads_A100_v4"
  node_taints = ["nvidia.com/gpu=true:NoSchedule"]
}

OKE GPU Nodes

resource "oci_containerengine_node_pool" "gpu" {
  node_shape = "BM.GPU.A100-v2.8"
}

Full examples: resources/modules.tf

Resources

Infrastructure as Code for enterprise AI deployments.

Terraform IaC Expert

Install Skill

SKILL.md

Terraform IaC Expert

Project Structure

Module Overview

AWS AI Infrastructure

Bedrock Module

Key Resources

Azure AI Infrastructure

Azure OpenAI Module

Key Resources

OCI AI Infrastructure

GenAI Module

Key Resources

Vector Store Infrastructure

OpenSearch Serverless (AWS)

Multi-Cloud Environment

Best Practices

State Management

Variable Validation

Sensitive Data

Tagging

CI/CD Integration

GitHub Actions

Key Patterns

Managed Kubernetes GPU

EKS GPU Nodes

AKS GPU Nodes

OKE GPU Nodes

Resources