installation-orchestrator

@tbartel74/Vigil-Code

Expert management of install.sh (2000+ lines). Use for installation troubleshooting, idempotency checks, secret generation, volume migration, 11 services startup order (including heuristics and semantic), and user onboarding.

SKILL.md

name: installation-orchestrator
description: Expert management of install.sh (2000+ lines). Use for installation troubleshooting, idempotency checks, secret generation, volume migration, 11 services startup order (including heuristics and semantic), and user onboarding.
version: 2.0.0
allowed-tools: Read, Write, Edit, Bash, Grep, Glob

Installation Orchestrator (v2.0.0)

Overview

Expert management of install.sh (2000+ lines of bash): idempotency, secret generation, volume migration, startup orchestration for 11 services (including the 3-branch detection services), and troubleshooting of installation failures.

When to Use This Skill

  • Troubleshooting installation failures
  • Managing install.sh modifications
  • Secret generation and validation
  • Volume migration between versions
  • Idempotency checks
  • User onboarding flow
  • 3-branch service startup order (v2.0.0)

v2.0.0 Architecture

11 Docker Services

Core Services:
  - clickhouse (data storage, port 8123)
  - grafana (monitoring, port 3001)
  - n8n (workflow engine, port 5678)

3-Branch Detection (v2.0.0):
  - heuristics-service (Branch A, port 5005, 30% weight)
  - semantic-service (Branch B, port 5006, 35% weight)
  - prompt-guard-api (Branch C, port 8000, 35% weight)

PII Detection:
  - presidio-pii-api (port 5001)
  - language-detector (port 5002)

Web Interface:
  - web-ui-backend (port 8787)
  - web-ui-frontend (via proxy)
  - proxy (Caddy, port 80)

Installation Flow

1. Pre-flight Checks

- Docker installed and running
- Ports available (80, 5678, 8123, 3001, 8787, 5001, 5002, 5005, 5006, 8000)
- Disk space >10GB
- No existing .install-state.lock
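The checks above could be scripted roughly as follows. This is a minimal sketch, not the code in install.sh: the function names (`port_in_use`, `free_gb`, `no_install_lock`, `preflight`) are hypothetical, and the port list is copied from the Port Reference table.

```bash
#!/bin/bash
# Pre-flight sketch. Function names are illustrative, not taken from install.sh.

REQUIRED_PORTS="80 5678 8123 3001 8787 5001 5002 5005 5006 8000"
MIN_FREE_GB=10

# Succeeds if a process is listening on the given TCP port.
port_in_use() {
  lsof -iTCP:"$1" -sTCP:LISTEN -t >/dev/null 2>&1
}

# Prints the free space, in whole GB, of the filesystem holding the directory.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeeds when the directory holds no lock file from a previous run.
no_install_lock() {
  [ ! -e "$1/.install-state.lock" ]
}

# Runs all four checks against the given directory (default: current one).
preflight() {
  dir="${1:-.}"
  docker info >/dev/null 2>&1 || { echo "Docker is not running" >&2; return 1; }
  for port in $REQUIRED_PORTS; do
    if port_in_use "$port"; then
      echo "Port $port is in use" >&2
      return 1
    fi
  done
  [ "$(free_gb "$dir")" -gt "$MIN_FREE_GB" ] || { echo "Need more than ${MIN_FREE_GB}GB free" >&2; return 1; }
  no_install_lock "$dir" || { echo "Found .install-state.lock from an earlier run" >&2; return 1; }
}
```

Calling `preflight .` from the repository root returns non-zero at the first failed check, so it can gate the rest of an install run.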

2. Secret Generation

CLICKHOUSE_PASSWORD=$(openssl rand -base64 32)
GF_SECURITY_ADMIN_PASSWORD=$(openssl rand -base64 32)
SESSION_SECRET=$(openssl rand -base64 64 | tr -d '\n')  # 64 bytes wraps onto two base64 lines; strip the newline
JWT_SECRET=$(openssl rand -base64 32)
WEB_UI_ADMIN_PASSWORD=$(openssl rand -base64 24)
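For idempotent re-runs, secrets should only be generated when they are not already stored. The sketch below shows one way to do that, assuming the secrets live in a `.env` file as `NAME=value` lines; `gen_secret` and `ensure_secret` are hypothetical helpers, and the `/dev/urandom` fallback is an assumption for hosts without `openssl`.

```bash
#!/bin/bash
# Idempotent secret generation sketch. Helper names are illustrative.

# Prints N random bytes as single-line base64.
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -base64 "$1"
  else
    # Fallback (assumption): coreutils base64 over /dev/urandom.
    head -c "$1" /dev/urandom | base64
  fi | tr -d '\n'
}

# Appends NAME=<secret> to the env file unless NAME is already defined there.
# A newly created file is readable by its owner only.
ensure_secret() {
  name=$1
  bytes=$2
  env_file=${3:-.env}
  if [ -f "$env_file" ] && grep -q "^${name}=" "$env_file"; then
    return 0
  fi
  ( umask 077; printf '%s=%s\n' "$name" "$(gen_secret "$bytes")" >> "$env_file" )
}
```

For example, `ensure_secret JWT_SECRET 32` followed by `ensure_secret SESSION_SECRET 64` leaves existing values untouched on a second run, so containers that already hold the old secrets keep working.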

3. Service Startup Order (v2.0.0)

Phase 1 - Data Layer:
  1. clickhouse (data storage)
  2. grafana (monitoring)

Phase 2 - Detection Core:
  3. n8n (workflow engine)
  4. heuristics-service (Branch A - fast pattern matching)
  5. semantic-service (Branch B - embedding analysis)
  6. prompt-guard-api (Branch C - LLM validation, optional)

Phase 3 - PII Services:
  7. presidio-pii-api (dual-language PII)
  8. language-detector (hybrid detection)

Phase 4 - Web Interface:
  9. web-ui-backend (Express API)
  10. web-ui-frontend (React app)
  11. proxy (Caddy reverse proxy)
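The four phases can be expressed as a small lookup plus a loop. This is a sketch under assumptions: `phase_services` and `start_all_phases` are hypothetical names, and the `COMPOSE` override is added here for illustration; install.sh may sequence startup differently.

```bash
#!/bin/bash
# Phased startup sketch. Function names are illustrative, not from install.sh.

# Prints the services of a startup phase (1-4), space-separated.
phase_services() {
  case "$1" in
    1) echo "clickhouse grafana" ;;
    2) echo "n8n heuristics-service semantic-service prompt-guard-api" ;;
    3) echo "presidio-pii-api language-detector" ;;
    4) echo "web-ui-backend web-ui-frontend proxy" ;;
    *) return 1 ;;
  esac
}

# Starts each phase in turn and stops at the first failure.
# COMPOSE may be overridden, e.g. COMPOSE="docker compose".
start_all_phases() {
  for phase in 1 2 3 4; do
    echo "Starting phase $phase: $(phase_services "$phase")"
    # Word splitting on the service list is intentional.
    ${COMPOSE:-docker-compose} up -d $(phase_services "$phase") || return 1
  done
}
```

Running `COMPOSE=echo start_all_phases` prints the commands instead of executing them, which is a cheap way to review the order before touching Docker.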

4. Health Checks (v2.0.0)

# Core services
for service in clickhouse grafana n8n web-ui-backend; do
  wait_for_health "$service" 120s || fail
done

# 3-Branch detection services (v2.0.0)
wait_for_health heuristics-service 60s || warn "Branch A degraded"
wait_for_health semantic-service 90s || warn "Branch B degraded"
wait_for_health prompt-guard-api 120s || warn "Branch C degraded"

# PII services
wait_for_health presidio-pii-api 90s || warn "PII detection degraded"
wait_for_health language-detector 30s || warn "Language detection degraded"
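The snippet above relies on a `wait_for_health` helper whose body is not shown. One plausible shape is a polling loop with a timeout, sketched below. The probe is an assumption: it expects containers named `vigil-<service>` (the pattern used by the `docker logs` commands in this document) with a Docker HEALTHCHECK defined; the real helper may probe HTTP endpoints instead.

```bash
#!/bin/bash
# Polling sketch for a wait_for_health helper. The probe strategy is an assumption.

# Succeeds when Docker reports the container as healthy.
probe_health() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' "vigil-$1" 2>/dev/null)" = "healthy" ]
}

# wait_for_health SERVICE TIMEOUT   (TIMEOUT such as "120s" or "120")
# Polls once per second and returns 0 as soon as the probe succeeds.
wait_for_health() {
  service=$1
  limit=${2%s}   # strip a trailing "s"
  elapsed=0
  while [ "$elapsed" -lt "$limit" ]; do
    if probe_health "$service"; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  # Final probe, so a timeout of 0 still checks once.
  probe_health "$service"
}
```

Keeping the probe in its own function makes it easy to swap in a `curl` check against a `/health` endpoint for services that expose one.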

5. Idempotency Lock

# Write the lock in one step so a repeated write never leaves duplicate keys
{
  echo "INSTALL_DATE=$(date)"
  echo "VERSION=2.0.0"
  echo "SERVICES=11"
} > .install-state.lock
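Reading the lock back is the other half of idempotency. The following sketch shows how a script might query it; `lock_value` and `already_installed` are hypothetical helpers, not functions known to exist in install.sh.

```bash
#!/bin/bash
# Lock-file lookup sketch. Helper names are illustrative.

LOCK_FILE=".install-state.lock"

# Prints the value stored for KEY, or nothing if the key or file is absent.
# If a key appears more than once, the last definition wins.
lock_value() {
  sed -n "s/^$1=//p" "${2:-$LOCK_FILE}" 2>/dev/null | tail -n 1
}

# Succeeds if the lock records a completed install of the given version.
already_installed() {
  [ "$(lock_value VERSION "${2:-$LOCK_FILE}")" = "$1" ]
}
```

A guard such as `if already_installed 2.0.0; then echo "Already installed"; exit 0; fi` near the top of a script would skip the destructive steps on a second run.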

Common Tasks

Task 1: Fresh Installation

./install.sh

# Prompts:
# 1. Generate secrets? [Y/n]
# 2. Set admin password (or auto-generate)
# 3. Delete existing vigil_data? [y/N]
# 4. Download Llama model? [Y/n] (for Branch C)

Task 2: Troubleshoot Failed Installation

# Check state
cat .install-state.lock

# View logs
docker-compose logs --tail=100

# Check 3-branch services specifically (v2.0.0)
docker logs vigil-heuristics-service --tail 50
docker logs vigil-semantic-service --tail 50
docker logs vigil-prompt-guard-api --tail 50

# Retry specific service
docker-compose up -d heuristics-service
docker logs vigil-heuristics-service

# Clean slate
rm -rf .install-state.lock .env vigil_data
./install.sh

Task 3: Validate Environment

./scripts/validate-env.sh

# Checks:
# - All required env vars present
# - Passwords meet requirements (min 8 chars)
# - Ports not in use (including 5005, 5006 for branches)
# - Docker network exists (vigil-net)
# - 11 services defined in docker-compose.yml
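The first two checks (required variables present, passwords at least 8 characters) can be sketched as below. The contents of scripts/validate-env.sh are not shown in this document, so this is an illustration only: the variable list is taken from the Secret Generation step, and `env_value` and `validate_env_file` are hypothetical names.

```bash
#!/bin/bash
# Env-file validation sketch. Not the actual scripts/validate-env.sh.

REQUIRED_VARS="CLICKHOUSE_PASSWORD GF_SECURITY_ADMIN_PASSWORD SESSION_SECRET JWT_SECRET WEB_UI_ADMIN_PASSWORD"

# Prints the value of NAME from an env file (last definition wins).
env_value() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Prints one line per problem and returns non-zero if any were found.
validate_env_file() {
  file=$1
  problems=0
  for name in $REQUIRED_VARS; do
    value=$(env_value "$name" "$file")
    if [ -z "$value" ]; then
      echo "missing: $name"
      problems=$((problems + 1))
    else
      case "$name" in
        *_PASSWORD)
          # Only variables ending in _PASSWORD get the length check.
          if [ "${#value}" -lt 8 ]; then
            echo "too short: $name"
            problems=$((problems + 1))
          fi
          ;;
      esac
    fi
  done
  [ "$problems" -eq 0 ]
}
```

`validate_env_file .env` prints nothing and returns 0 for a healthy file, which makes it convenient in an `if` guard or a CI step.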

Task 4: Migrate Volumes (v1.x → v2.0.0)

# Backup old data
docker run --rm -v vigil_clickhouse_data:/data -v "$(pwd)":/backup alpine \
  tar czf "/backup/clickhouse-v1.x-$(date +%Y%m%d).tar.gz" /data

# Run v2.0.0 migration SQL (adds branch columns)
# -i keeps stdin attached; without it the redirected SQL never reaches clickhouse-client
docker exec -i vigil-clickhouse clickhouse-client < services/monitoring/sql/migrations/v2.0.0.sql

# Verify migration (branch columns added)
docker exec vigil-clickhouse clickhouse-client -q "
  DESCRIBE n8n_logs.events_processed
" | grep -E "branch_[abc]_score|arbiter_decision"

# Expected output:
# branch_a_score    Float32
# branch_b_score    Float32
# branch_c_score    Float32
# arbiter_decision  String
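Eyeballing the `grep` output can miss a single absent column. A stricter check is sketched below: it reads the `DESCRIBE` output on stdin and lists any expected column that did not appear. `missing_columns` is a hypothetical helper; the column names come from the expected output above.

```bash
#!/bin/bash
# Migration verification sketch. missing_columns is an illustrative helper.

EXPECTED_COLUMNS="branch_a_score branch_b_score branch_c_score arbiter_decision"

# Reads DESCRIBE output on stdin (column name in the first field) and
# prints each expected column that is absent. Prints nothing when all exist.
missing_columns() {
  described=$(awk '{ print $1 }')
  for col in $EXPECTED_COLUMNS; do
    printf '%s\n' "$described" | grep -qx "$col" || echo "$col"
  done
}
```

It can be fed directly from the container, for example `docker exec vigil-clickhouse clickhouse-client -q "DESCRIBE n8n_logs.events_processed" | missing_columns`, and empty output then means the migration added every column.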

Task 5: Verify 3-Branch Services (v2.0.0)

#!/bin/bash
# scripts/verify-branches.sh

echo "🔍 Verifying 3-Branch Detection Services..."

# Branch A: Heuristics
BRANCH_A=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:5005/health)
if [ "$BRANCH_A" == "200" ]; then
  echo "✅ Branch A (Heuristics): Healthy"
else
  echo "❌ Branch A (Heuristics): Down (HTTP $BRANCH_A)"
fi

# Branch B: Semantic
BRANCH_B=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:5006/health)
if [ "$BRANCH_B" == "200" ]; then
  echo "✅ Branch B (Semantic): Healthy"
else
  echo "❌ Branch B (Semantic): Down (HTTP $BRANCH_B)"
fi

# Branch C: LLM Guard
BRANCH_C=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8000/health)
if [ "$BRANCH_C" == "200" ]; then
  echo "✅ Branch C (LLM Guard): Healthy"
else
  echo "⚠️  Branch C (LLM Guard): Down (HTTP $BRANCH_C) - Optional"
fi

echo ""
echo "3-Branch Status: $([ "$BRANCH_A" == "200" ] && [ "$BRANCH_B" == "200" ] && echo "OPERATIONAL" || echo "DEGRADED")"
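As written, the script always exits 0, so callers cannot tell OPERATIONAL from DEGRADED without parsing its output. The sketch below keeps the same rule (Branch A and B must return HTTP 200, Branch C is optional) and also returns an exit code; `branch_status` is a hypothetical helper, not part of the original script.

```bash
#!/bin/bash
# Exit-code sketch for the 3-branch check. branch_status is an illustrative helper.

# Takes the HTTP codes of Branch A and Branch B.
# Prints the overall status and returns 0 only when both are 200.
branch_status() {
  if [ "$1" = "200" ] && [ "$2" = "200" ]; then
    echo "OPERATIONAL"
    return 0
  fi
  echo "DEGRADED"
  return 1
}
```

Ending the script with `branch_status "$BRANCH_A" "$BRANCH_B"` makes its exit status usable in CI or in an install step that should stop when detection is degraded.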

Troubleshooting

Issue: Port already in use

# Check all v2.0.0 ports
for port in 80 5678 8123 3001 8787 5001 5002 5005 5006 8000; do
  lsof -i :$port && echo "Port $port in use"
done

# Stop the process holding a port (SIGTERM first; use kill -9 only if it refuses to exit)
kill $(lsof -t -i:5005)

Issue: Branch service won't start

# Check heuristics-service
docker logs vigil-heuristics-service --tail 100
# Common issue: missing patterns directory
# Fix: docker-compose build heuristics-service

# Check semantic-service
docker logs vigil-semantic-service --tail 100
# Common issue: model download failed
# Fix: docker exec vigil-semantic-service python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"

Issue: ClickHouse won't start

# Check volume permissions
ls -la vigil_data/clickhouse/

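# Reset the ClickHouse volume only (this deletes all stored ClickHouse data)
# Plain "down" is used here because "down -v" would also remove every other named volume
docker-compose down
docker volume rm vigil_clickhouse_data
./install.sh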

Issue: Secrets not loaded

# Verify .env file
grep -E "^(CLICKHOUSE|JWT|SESSION)_" .env   # note: prints secret values to the terminal

# Reload
docker-compose down
docker-compose up -d

Issue: Semantic service model download fails

# Pre-download model (run before install)
docker run --rm -v vigil_semantic_models:/models python:3.11-slim bash -c "
  pip install sentence-transformers &&
  python -c \"from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2', cache_folder='/models')\"
"

# Restart semantic service
docker-compose restart semantic-service

Port Reference (v2.0.0)

| Port | Service            | Description                      |
|------|--------------------|----------------------------------|
| 80   | proxy              | Caddy reverse proxy (main entry) |
| 3001 | grafana            | Monitoring dashboard             |
| 5001 | presidio-pii-api   | Dual-language PII detection      |
| 5002 | language-detector  | Hybrid language detection        |
| 5005 | heuristics-service | Branch A (30% weight)            |
| 5006 | semantic-service   | Branch B (35% weight)            |
| 5678 | n8n                | Workflow engine                  |
| 8000 | prompt-guard-api   | Branch C (35% weight)            |
| 8123 | clickhouse         | Analytics database               |
| 8787 | web-ui-backend     | Configuration API                |

Quick Reference

# Fresh install
./install.sh

# Status check (all 11 services)
./scripts/status.sh

# Verify 3-branch detection (v2.0.0)
./scripts/verify-branches.sh

# View logs
./scripts/logs.sh

# Restart
./scripts/restart.sh

# Uninstall
docker-compose down -v
rm -rf vigil_data .env .install-state.lock

Integration Points

With docker-vigil-orchestration:

when: Service won't start
action:
  1. Check vigil-net network connectivity
  2. Verify service dependencies
  3. Check port conflicts
  4. Review Docker resource limits

With clickhouse-grafana-monitoring:

when: Migration to v2.0.0
action:
  1. Run SQL migration script
  2. Verify branch columns exist
  3. Test ClickHouse queries
  4. Update Grafana dashboards

Last Updated: 2025-12-09
Install Script: 2000+ lines bash
Services: 11 containers (v2.0.0)
3-Branch Ports: 5005 (Heuristics), 5006 (Semantic), 8000 (LLM Guard)

Version History

  • v2.0.0 (Current): 11 services, 3-branch detection startup, migration scripts
  • v1.6.11: 9 services, sequential detection