---
name: generating-ovn-topology
description: Generates and displays OVN-Kubernetes network topology diagrams showing logical switches, routers, ports with IP/MAC addresses in Mermaid format
tools: Bash, Read, Write
---
# Quick Start - OVN Topology Generation
IMMEDIATE ACTIONS (follow these steps in order):
**Step 1 - Detect Cluster**: Find the OVN-Kubernetes cluster kubeconfig

Run:

```bash
scripts/detect-cluster.sh 2>/dev/null
```

The script discovers OVN-Kubernetes clusters:
- Scans all kubeconfig files: current KUBECONFIG env, ~/.kube/kind-config, ~/ovn.conf, ~/.kube/config
- Tests ALL contexts in each kubeconfig (not just current-context)
- Returns a parseable list on stdout: `index|kubeconfig|cluster_name|node_count|namespace`
- Diagnostics go to stderr
- Exit code: 0=success, 1=no clusters found
How to handle the output:

The script returns pipe-delimited lines on stdout, one per cluster found, e.g.:

```
1|/home/user/.kube/kind-config|kind-ovn|3|ovn-kubernetes
2|/home/user/.kube/config|prod-cluster|12|openshift-ovn-kubernetes
```

Decision logic:
- If one cluster found → automatically use it (extract the kubeconfig path from column 2)
- If multiple clusters found → show the list to the user and ask them to choose by number
- After selection, extract the kubeconfig path from column 2 of the chosen line
- Store the selected kubeconfig path in the variable `KC` for use in subsequent steps
Output column reference:
- Column 1: Index number (for user selection)
- Column 2: Kubeconfig file path (this is what you need for `$KC`)
- Column 3: Cluster display name
- Column 4: Number of nodes
- Column 5: OVN namespace name
Important: Parse the output using standard text processing. The exact implementation is up to you - use whatever approach works best (awk, Python, inline parsing, etc.).
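For example, a minimal bash sketch of this selection step (assumes a single cluster was found; with multiple results the user is asked for an index first):

```bash
# Hypothetical sketch: pick the kubeconfig path (column 2) from detect-cluster.sh output.
CLUSTERS=$(scripts/detect-cluster.sh 2>/dev/null) || { echo "No OVN clusters found" >&2; exit 1; }
if [ "$(printf '%s\n' "$CLUSTERS" | wc -l)" -eq 1 ]; then
  KC=$(printf '%s\n' "$CLUSTERS" | awk -F'|' '{print $2}')
else
  printf '%s\n' "$CLUSTERS"    # show the list and ask the user to choose an index N, then:
  # KC=$(printf '%s\n' "$CLUSTERS" | awk -F'|' -v n="$N" '$1 == n {print $2}')
fi
echo "Selected kubeconfig: $KC"
```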
**Step 2 - Check Permissions**: Verify the user's Kubernetes access level and inform them about write permissions

Run:

```bash
scripts/check_permissions.py "$KC"
```

The script returns:
- Exit 0: Read-only access or user confirmed → proceed
- Exit 1: Error or user cancelled → stop
- Exit 2: Write permissions detected → AI must ask user for confirmation
When exit code 2 is returned:
- Parse the stdout to get the list of write permissions
- Display the permissions clearly to the user using a formatted message
- Explain that:
- This skill performs ONLY read-only operations
- No cluster modifications will be made
- The warning is for transparency about their access level
- List read-only operations: kubectl get, kubectl exec (ovn-nbctl list), local file writes
- List forbidden operations: kubectl create/delete/patch, ovn-nbctl modifications
- Ask the user explicitly: "You have cluster admin permissions. This command will only perform read-only operations. Do you want to proceed?"
- If user says yes → continue, if no → stop
Example of proper user communication:

```
⚠️ WARNING: Write Permissions Detected

Your kubeconfig has cluster admin permissions:
• Delete pods, deployments, services
• Create and modify resources
• Full cluster access

📋 IMPORTANT: This command will ONLY perform read-only operations:
✅ kubectl get (pods, nodes)
✅ kubectl exec (to run read-only ovn-nbctl list commands)
✅ Local file writes (topology diagram)

Operations that will NEVER be performed:
❌ kubectl create/delete/patch/apply
❌ ovn-nbctl modifications
❌ Any cluster state changes

Do you want to proceed with read-only topology generation?
```

Security Note: This step ensures informed consent. The user must be explicitly aware that their cluster admin credentials are accessible to the AI agent (acting on their behalf), even though only read-only operations will be performed. This transparency is critical for security and trust.
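A minimal bash sketch of branching on these exit codes (assumes `$KC` has already been selected):

```bash
# Hypothetical sketch: act on the check_permissions.py exit code.
PERMS=$(scripts/check_permissions.py "$KC")
case "$?" in
  0) echo "Read-only access (or already confirmed); proceeding." ;;
  2) echo "$PERMS" ;;   # show the detected write permissions, then ask the user for explicit confirmation
  *) echo "Permission check failed or was cancelled; stopping." >&2 ;;
esac
```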
**Step 3 - Check Output File**: If `ovn-topology-diagram.md` already exists, ask the user how to proceed: (1) Overwrite, (2) Custom path, (3) Timestamp, (4) Cancel
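If the user picks option (3), a timestamped path can be derived, for example (variable name is illustrative):

```bash
# Hypothetical sketch: timestamped output path for option (3).
OUTPUT_FILE="ovn-topology-diagram-$(date +%Y%m%d-%H%M%S).md"
```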
**Step 4 - Create Private Temp Directory**: Create a private temporary directory (via `mktemp -d`) and use it for all temporary files:

```bash
TMPDIR=$(mktemp -d)
```

**Step 5 - Collect OVN Data**: Get full topology data from the cluster

Run:

```bash
scripts/collect_ovn_data.py "$KC" "$TMPDIR"
```

Detail files written to `$TMPDIR`:
- `ovn_switches_detail.txt` - node|uuid|name|other_config
- `ovn_routers_detail.txt` - node|uuid|name|external_ids|options
- `ovn_lsps_detail.txt` - node|name|addresses|type|options
- `ovn_lrps_detail.txt` - node|name|mac|networks|options
- `ovn_pods_detail.txt` - namespace|name|ip|node
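An optional sanity check that the collector produced all expected files (a sketch, not required):

```bash
# Hypothetical sketch: verify the detail files exist and are non-empty.
for f in ovn_switches_detail.txt ovn_routers_detail.txt ovn_lsps_detail.txt ovn_lrps_detail.txt ovn_pods_detail.txt; do
  [ -s "$TMPDIR/$f" ] || echo "warning: $TMPDIR/$f is missing or empty" >&2
done
wc -l "$TMPDIR"/ovn_*_detail.txt   # rough row counts per file
```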
**Step 6 - Analyze Placement**: Determine per-node vs cluster-wide components

Run:

```bash
scripts/analyze_placement.py "$TMPDIR"
```

Placement results written to `$TMPDIR`:
- `ovn_switch_placement.txt` - name|placement (per-node|cluster-wide|cluster-wide-visual)
- `ovn_router_placement.txt` - name|placement (per-node|cluster-wide|cluster-wide-visual)
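For instance, the placement files can be split like this when deciding where each component goes (a sketch):

```bash
# Hypothetical sketch: separate per-node from cluster-wide switches (format: name|placement).
awk -F'|' '$2 == "per-node" {print $1}' "$TMPDIR/ovn_switch_placement.txt"   # go inside node subgraphs
awk -F'|' '$2 != "per-node" {print $1}' "$TMPDIR/ovn_switch_placement.txt"   # cluster-wide / cluster-wide-visual
```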
**Step 7 - Generate Diagram**: Create a Mermaid `graph BT` diagram
- Read `$TMPDIR/ovn_switch_placement.txt` to determine where each switch goes
- Read `$TMPDIR/ovn_router_placement.txt` to determine where each router goes
- Read the detail files directly (ovn_switches_detail.txt, ovn_routers_detail.txt, etc.)
- Skip the UUID column when parsing the switches/routers detail files (see the sketch after this list)
- If placement is `per-node` → put inside the node subgraph
- If placement is `cluster-wide` or `cluster-wide-visual` → put outside all subgraphs
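A minimal sketch of reading switch details without the UUID column:

```bash
# Hypothetical sketch: drop the UUID column when reading switch details
# (ovn_switches_detail.txt format: node|uuid|name|other_config).
awk -F'|' '{print $1 "|" $3 "|" $4}' "$TMPDIR/ovn_switches_detail.txt"
```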
**Step 8 - Save & Report**: Write the diagram to a file, show a summary, and clean up temporary files
CRITICAL RULES:
- ❌ NO codebase searching for IPs/MACs
- ❌ NO synthetic/example data
- ❌ NO inline multi-line bash (use helper scripts)
- ❌ NO direct kubectl commands (must use helper scripts only)
- ✅ Use helper scripts for all kubectl interactions and architecture discovery
- ✅ For helper scripts only: If kubectl is required, use `KUBECONFIG="$KC" kubectl --kubeconfig="$KC"`
- ✅ SECURITY: Create a private temp directory with `TMPDIR=$(mktemp -d)` - never use `/tmp` directly
- ✅ Temporary files use `$TMPDIR` (private directory created with mkdtemp)
- ✅ Clean up temporary files when done: `rm -rf "$TMPDIR"`
# Safety & Security Guarantees
## 🔒 Read-Only Operations
This skill performs ONLY read-only operations against your Kubernetes cluster. No cluster state is modified.
Allowed Operations:
- ✅ `kubectl get` - Query resources
- ✅ `kubectl exec ... ovn-nbctl list` - Query OVN database (read-only)
- ✅ Local file writes (temporary files in `$TMPDIR`, output diagram)
Forbidden Operations (NEVER used):
- ❌ `kubectl create/apply/delete/patch` - No resource modifications
- ❌ `kubectl scale/drain/cordon` - No cluster operations
- ❌ `ovn-nbctl create/set/add/remove/destroy` - No OVN modifications
- ❌ No pod restarts or service disruptions
Privacy Consideration: The generated diagram contains network topology information. Control sharing appropriately based on your security policies.
# Architecture Concepts

## Interconnect Mode (Distributed NBDB)
In interconnect mode, each node runs its own NBDB with local copies of components:
Per-Node Components (different UUIDs on each node):
- `ovn_cluster_router` - Each node has its own cluster router instance
- `join` switch - Each node has its own join switch instance
- `transit_switch` - Each node has its own transit switch instance
- Node switches (e.g., `ovn-control-plane`, `ovn-worker`)
- External switches (e.g., `ext_ovn-control-plane`)
- Gateway routers (e.g., `GR_ovn-control-plane`)
Visualization Overrides:
- `transit_switch`: PER-NODE in reality → shown CLUSTER-WIDE for visualization clarity
- `join`: PER-NODE → kept PER-NODE (no override)
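One way to see the per-node pattern is to compare UUIDs for the same logical name across nodes; a minimal sketch over the detail files from Step 5 (illustrative only, the real analysis lives in analyze_placement.py):

```bash
# Hypothetical sketch: count distinct UUIDs per router name across nodes
# (ovn_routers_detail.txt format: node|uuid|name|external_ids|options).
# More than one UUID for the same name indicates a per-node component.
awk -F'|' '{print $3 "|" $2}' "$TMPDIR/ovn_routers_detail.txt" | sort -u |
  awk -F'|' '{count[$1]++} END {for (name in count) print name ": " count[name] " distinct UUID(s)"}'
```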
# Helper Scripts
All helper scripts are in the scripts/ directory.
| Script | Purpose | Input | Output |
|---|---|---|---|
| detect-cluster.sh | Find OVN cluster kubeconfig across all contexts. Scans multiple kubeconfig files and all their contexts. Returns parseable list. | None | Parseable list to stdout: index\|kubeconfig\|cluster\|nodes\|namespace. Exit: 0=success, 1=none found |
| check_permissions.py | Check user permissions and warn if write access detected. | KUBECONFIG path | Exit: 0=proceed, 1=cancelled/error, 2=write perms (needs user confirmation) |
| collect_ovn_data.py | Data collector: Queries each node for all data, with graceful degradation (continues on node failures). Writes detail files. | KUBECONFIG path, TMPDIR | Detail files: ovn_switches_detail.txt, ovn_routers_detail.txt, ovn_lsps_detail.txt, ovn_lrps_detail.txt, ovn_pods_detail.txt |
| analyze_placement.py | Placement analyzer: Analyzes UUID patterns from detail files to determine per-node vs cluster-wide placement. | TMPDIR (reads detail files) | Placement files: ovn_switch_placement.txt, ovn_router_placement.txt |
# Diagram Generation Rules

## Structure

```mermaid
graph BT
subgraph node1["<b style='color:black'>Node: name (node_ip)</b>"]
direction BT
%% LAYER 1 (Bottom): Pods and Management Ports
POD_example["Pod: pod-name<br/>Namespace: ns<br/>IP: x.x.x.x"]
MGMT["Management Port: k8s-node<br/>IP: x.x.x.x"]
%% LAYER 2: Pod LSPs
LSP_pod["LSP: namespace_pod-name<br/>MAC: xx:xx:xx:xx:xx:xx<br/>IP: x.x.x.x"]
LSP_mgmt["LSP: k8s-node<br/>MAC: xx:xx:xx:xx:xx:xx<br/>IP: x.x.x.x"]
%% LAYER 3: Node Switch
LS_node["Logical Switch: node-name<br/>Subnet: x.x.x.x/24"]
%% LAYER 4: Node Switch LSPs
LSP_stor["LSP: stor-node<br/>Type: router"]
%% LAYER 5: Cluster Router
LR_cluster["Logical Router: ovn_cluster_router"]
LRP_rtos["LRP: rtos-node<br/>MAC: xx:xx<br/>IP: x.x.x.x/24"]
LRP_rtoj["LRP: rtoj-ovn_cluster_router<br/>IP: x.x.x.x/16"]
LRP_rtots["LRP: rtots-node<br/>IP: x.x.x.x/16"]
%% LAYER 6: Join Switch
LS_join["Logical Switch: join"]
LSP_jtor_cr["LSP: jtor-ovn_cluster_router<br/>Type: router"]
LSP_jtor_gr["LSP: jtor-GR_node<br/>Type: router"]
%% LAYER 7: Gateway Router
LR_gr["Gateway Router: GR_node"]
LRP_rtoj_gr["LRP: rtoj-GR_node<br/>IP: x.x.x.x/16"]
LRP_rtoe["LRP: rtoe-GR_node<br/>IP: x.x.x.x/24"]
%% LAYER 8: External Switch
LS_ext["Logical Switch: ext_node"]
LSP_etor["LSP: etor-GR_node<br/>Type: router"]
LSP_breth0["LSP: breth0_node<br/>Type: localnet"]
%% LAYER 9 (Top): External Network
EXT_NET["External Network<br/>Physical bridge: breth0<br/>Node IP: x.x.x.x"]
%% Connections (bottom-to-top flow)
POD_example --> LSP_pod --> LS_node
MGMT --> LSP_mgmt --> LS_node
LS_node --> LSP_stor
LSP_stor -.->|peer| LRP_rtos --> LR_cluster
LR_cluster --> LRP_rtoj -.->|peer| LSP_jtor_cr --> LS_join
LS_join --> LSP_jtor_gr -.->|peer| LRP_rtoj_gr --> LR_gr
LR_gr --> LRP_rtoe -.->|peer| LSP_etor --> LS_ext
LS_ext --> LSP_breth0 -.->|physical| EXT_NET
LR_cluster --> LRP_rtots
end
%% Cluster-wide components (AFTER nodes to appear on top)
%% Only include components with placement=cluster-wide or cluster-wide-visual
%% Example: transit_switch in interconnect mode
LS_cluster_component["Component Name<br/>Details"]
%% Connections from nodes to cluster-wide components
LRP_from_node -.->|connects to| LSP_cluster_port --> LS_cluster_component
```
## Key Requirements
Graph Direction: Always `graph BT` (bottom-to-top)

Component Placement: Determined by `$TMPDIR/ovn_*_placement.txt`
- `per-node` → INSIDE the node subgraph
- `cluster-wide` or `cluster-wide-visual` → OUTSIDE all subgraphs, defined AFTER all node subgraphs
- CRITICAL: Define ALL cluster-wide components AFTER all nodes to position them at the TOP
  - Prevents connection lines from overlapping with node subgraphs
  - Applies to ANY component with `cluster-wide` or `cluster-wide-visual` placement
Node Subgraphs: Each physical node gets a subgraph with `direction BT`
- Node Ordering: ALWAYS order nodes as: control-plane node first, then worker nodes sorted alphabetically by name
- Title format: `"<b style='color:black'>Node: {node_name} ({external_ip})</b>"`
- Get the external IP from the `rtoe-GR_{node}` router port's network field (see the sketch after this list)
- Use different background colors for each node for visual distinction
- CRITICAL: Define components in BOTTOM-TO-TOP order (matches packet flow):
- Bottom Layer: Pods and Management Ports (traffic originates here)
- Layer 2: Pod LSPs
- Layer 3: Node Switch (where pods connect)
- Layer 4: Node Switch LSPs (stor, breth0)
- Layer 5: Cluster Router + Router Ports (rtos, rtoj, rtots)
- Layer 6: Join Switch + Join LSPs
- Layer 7: Gateway Router + Router Ports (rtoj, rtoe)
- Layer 8: External Switch + External LSPs (etor, breth0)
- Top Layer: External Network (physical bridge)
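A minimal sketch of the external-IP lookup mentioned above (node name is illustrative):

```bash
# Hypothetical sketch: read a node's external IP from its rtoe-GR_{node} router port
# (ovn_lrps_detail.txt format: node|name|mac|networks|options).
NODE="ovn-worker"   # illustrative node name
awk -F'|' -v port="rtoe-GR_${NODE}" '$2 == port {print $4}' "$TMPDIR/ovn_lrps_detail.txt"
```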
Pod Representation:
- CRITICAL: Show ALL pods from `$TMPDIR/ovn_pods_detail.txt` as SEPARATE entities
- DO NOT discover pods from LSPs - many pods (host-network) don't have individual LSPs
- Pod format: `POD_{id}["Pod: {name}<br/>Namespace: {ns}<br/>IP: {ip}"]`
- Connect pods to their respective LSPs:
  - Host-network pods (IP == node IP): `POD_{id} --> MGMT_{node} --> LSP_k8s_{node}`
  - Pod-network pods (IP in 10.244.x.x): `POD_{id} --> LSP_{namespace}_{podname}`
Physical Network Layer:
- Show explicit external network entities per node
- Format: `EXT_NET_{node}["External Network<br/>Physical bridge: breth0<br/>Node IP: {external_ip}"]`
- Connect localnet LSPs to the external network: `LSP_breth0_{node} -.->|physical| EXT_NET_{node}`
Colors (apply using classDef with color:#000 for black text):
- Pods: `fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#000`
- Switches: `fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000`
- Routers: `fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#000`
- LSPs (ALL types): `fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000`
- LRPs: `fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px,color:#000`
- External Network: `fill:#e0e0e0,stroke:#424242,stroke-width:2px,color:#000`
- Management Ports (MGMT entities only, not LSPs): `fill:#fff8e1,stroke:#f57c00,stroke-width:2px,color:#000`
- Node Subgraph Backgrounds (use `style` statements, NOT classDef): Apply at the END of the diagram after all class assignments. Rotate through these 3 colors:
  - Node 1 (index 0): `style node1 fill:#e8f5e9,stroke:#2e7d32,stroke-width:3px,stroke-dasharray: 5 5,color:#000` (green)
  - Node 2 (index 1): `style node2 fill:#e1f5fe,stroke:#0277bd,stroke-width:3px,stroke-dasharray: 5 5,color:#000` (blue)
  - Node 3 (index 2): `style node3 fill:#fff3e0,stroke:#ef6c00,stroke-width:3px,stroke-dasharray: 5 5,color:#000` (orange)
  - Node 4+ repeats: Use index % 3 to rotate colors (0=green, 1=blue, 2=orange)
Required Info:
- Pods: Name, Namespace, IP
- Switches: Name, Subnet (from other_config), Gateway (optional)
- LSPs: Name, MAC, IP (or Type + Router Port for router ports)
- LRPs: Name, MAC, Network (IP/CIDR)
- Routers: Name, Description
- External Network: Physical bridge name (breth0), Node IP
Connections:
- Solid arrows: Layer connections
- Dashed arrows with `|peer|`: Peer port relationships
# Pod Discovery
CRITICAL: Discover pods from $TMPDIR/ovn_pods_detail.txt, NOT from LSPs
The file $TMPDIR/ovn_pods_detail.txt contains ALL running pods in the cluster (populated by collect_ovn_data.py).
Format: namespace|pod_name|pod_ip|node_name
Pod-to-LSP Mapping Logic:
1. Read ALL pods from `$TMPDIR/ovn_pods_detail.txt`
2. For each pod, determine which LSP it connects to:
a. Host-Network Pods (pod IP == node IP, e.g., 10.89.0.10):
   - Connect to the management port LSP: `k8s-{node}`
   - These pods share the management port's IP address
   - MUST be included in the diagram despite not having individual LSPs
b. Pod-Network Pods (pod IP in pod network range, e.g., 10.244.x.x):
   - Connect to an individual LSP: `{namespace}_{pod-name-with-hash}`
   - LSP name format (example): `kube-system_coredns-674b8bbfcf-qhfrq`
   - Filter LSPs where `type=""` (empty string)
   - Extract the MAC and IP from the LSP addresses field
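A minimal bash sketch of this classification for one node (node name and IP are illustrative; real values come from the detail files):

```bash
# Hypothetical sketch: classify pods on one node as host-network vs pod-network
# ($TMPDIR/ovn_pods_detail.txt format: namespace|name|ip|node).
NODE="ovn-worker"       # node currently being rendered (illustrative)
NODE_IP="10.89.0.11"    # that node's IP, e.g. taken from the rtoe-GR_{node} networks field
while IFS='|' read -r ns pod ip node; do
  [ "$node" = "$NODE" ] || continue
  if [ "$ip" = "$NODE_IP" ]; then
    echo "host-network: $ns/$pod -> MGMT_${NODE} -> LSP k8s-${NODE}"
  else
    echo "pod-network:  $ns/$pod -> LSP ${ns}_${pod}"
  fi
done < "$TMPDIR/ovn_pods_detail.txt"
```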
Diagram representation:
```mermaid
POD_id["Pod: {name}<br/>Namespace: {ns}<br/>IP: {ip}"]
POD_id --> LSP_id
```
Special LSPs (NOT pods, treat as infrastructure):
- `k8s-{node}`: Management port LSP (multiple host-network pods connect to this)
- `stor-{node}`: Router port (type="router")
- `breth0_{node}`: LocalNet port (type="localnet")
- `jtor-*`, `etor-*`, `tstor-*`: Router ports (type="router")
## Example: Control Plane Node with Host-Network Pods

```mermaid
%% 8 host-network pods sharing management port
POD_etcd["Pod: etcd-ovn-control-plane<br/>Namespace: kube-system<br/>IP: 10.89.0.10"]
POD_apiserver["Pod: kube-apiserver-ovn-control-plane<br/>Namespace: kube-system<br/>IP: 10.89.0.10"]
POD_controller["Pod: kube-controller-manager-ovn-control-plane<br/>Namespace: kube-system<br/>IP: 10.89.0.10"]
POD_scheduler["Pod: kube-scheduler-ovn-control-plane<br/>Namespace: kube-system<br/>IP: 10.89.0.10"]
POD_ovnkube_cp["Pod: ovnkube-control-plane-ovn-control-plane<br/>Namespace: ovn-kubernetes<br/>IP: 10.89.0.10"]
POD_ovnkube_id["Pod: ovnkube-identity-ovn-control-plane<br/>Namespace: ovn-kubernetes<br/>IP: 10.89.0.10"]
POD_ovnkube_node["Pod: ovnkube-node-xyz<br/>Namespace: ovn-kubernetes<br/>IP: 10.89.0.10"]
POD_ovs_node["Pod: ovs-node-xyz<br/>Namespace: ovn-kubernetes<br/>IP: 10.89.0.10"]
%% Management port LSP (shared by all host-network pods)
MGMT_cp["Management Port: k8s-ovn-control-plane<br/>IP: 10.89.0.10"]
LSP_mgmt["LSP: k8s-ovn-control-plane<br/>MAC: xx:xx:xx:xx:xx:xx<br/>IP: 10.244.0.2"]
%% All host-network pods connect to same management port
POD_etcd --> MGMT_cp
POD_apiserver --> MGMT_cp
POD_controller --> MGMT_cp
POD_scheduler --> MGMT_cp
POD_ovnkube_cp --> MGMT_cp
POD_ovnkube_id --> MGMT_cp
POD_ovnkube_node --> MGMT_cp
POD_ovs_node --> MGMT_cp
MGMT_cp --> LSP_mgmt --> LS_node
```
# Final Steps
- Generate complete Mermaid diagram following structure above
- Save to file chosen by user
- Show summary: nodes, switches, routers, ports, mode
- Clean up the temporary directory: `rm -rf "$TMPDIR"`
- Tell the user to open the file in an IDE to view the rendered diagram
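A minimal sketch of the summary and cleanup, using the column layouts of the detail files from Step 5 (the summary lines shown are illustrative):

```bash
# Hypothetical sketch: print a short summary from the detail files, then clean up.
echo "Nodes:    $(cut -d'|' -f1 "$TMPDIR/ovn_switches_detail.txt" | sort -u | wc -l)"
echo "Switches: $(cut -d'|' -f3 "$TMPDIR/ovn_switches_detail.txt" | sort -u | wc -l)"
echo "Routers:  $(cut -d'|' -f3 "$TMPDIR/ovn_routers_detail.txt" | sort -u | wc -l)"
echo "Pods:     $(wc -l < "$TMPDIR/ovn_pods_detail.txt")"
rm -rf "$TMPDIR"
```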