
SKILL.md

name: tailscale-sshsync-agent
description: Manages distributed workloads and file sharing across Tailscale SSH-connected machines. Automates remote command execution, intelligent load balancing, file synchronization workflows, host health monitoring, and multi-machine orchestration using sshsync. Activates when discussing remote machines, Tailscale SSH, workload distribution, file sharing, or multi-host operations.

Tailscale SSH Sync Agent

When to Use This Skill

This skill automatically activates when you need to:

Distribute workloads across multiple machines

  • "Run this on my least loaded machine"
  • "Execute this task on the machine with most resources"
  • "Balance work across my Tailscale network"

Share files between Tailscale-connected hosts

  • "Push this directory to all my development machines"
  • "Sync code across my homelab servers"
  • "Deploy configuration to production group"

Execute commands remotely across host groups

  • "Run system updates on all servers"
  • "Check disk space across web-servers group"
  • "Restart services on database hosts"

Monitor machine availability and health

  • "Which machines are online?"
  • "Show status of my Tailscale network"
  • "Check connectivity to remote hosts"

Automate multi-machine workflows

  • "Deploy to staging, test, then production"
  • "Backup files from all machines"
  • "Synchronize development environment across laptops"

How It Works

This agent provides intelligent workload distribution and file sharing management across Tailscale SSH-connected machines using the sshsync CLI tool.

Core Architecture:

  1. SSH Sync Wrapper: Python interface to sshsync CLI operations
  2. Tailscale Manager: Tailscale-specific connectivity and status management
  3. Load Balancer: Intelligent task distribution based on machine resources
  4. Workflow Executor: Common multi-machine workflow automation
  5. Validators: Parameter, host, and connection validation
  6. Helpers: Temporal context, formatting, and utilities

Key Features:

  • Automatic host discovery via Tailscale and SSH config
  • Intelligent load balancing based on CPU, memory, and current load
  • Group-based operations (execute on all web servers, databases, etc.)
  • Dry-run mode for preview before execution
  • Parallel execution across multiple hosts
  • Comprehensive error handling and retry logic
  • Connection validation before operations
  • Progress tracking for long-running operations

Data Sources

sshsync CLI Tool

What is sshsync?

sshsync is a Python CLI tool for managing SSH connections and executing operations across multiple hosts. It provides:

  • Group-based host management
  • Remote command execution with timeouts
  • File push/pull operations (single or recursive)
  • Integration with existing SSH config (~/.ssh/config)
  • Status checking and connectivity validation

Installation:

pip install sshsync

Configuration:

sshsync uses two configuration sources:

  1. SSH Config (~/.ssh/config): Host connection details
  2. sshsync Config (~/.config/sshsync/config.yaml): Group assignments

Example SSH Config:

Host homelab-1
  HostName 100.64.1.10
  User admin
  IdentityFile ~/.ssh/id_ed25519

Host prod-web-01
  HostName 100.64.1.20
  User deploy
  Port 22

Example sshsync Config:

groups:
  homelab:
    - homelab-1
    - homelab-2
  production:
    - prod-web-01
    - prod-web-02
    - prod-db-01
  development:
    - dev-laptop
    - dev-desktop

sshsync Commands Used:

| Command | Purpose | Example |
| --- | --- | --- |
| `sshsync all` | Execute on all hosts | `sshsync all "df -h"` |
| `sshsync group` | Execute on group | `sshsync group web "systemctl status nginx"` |
| `sshsync push` | Push files to hosts | `sshsync push --group prod ./app /var/www/` |
| `sshsync pull` | Pull files from hosts | `sshsync pull --host db /var/log/mysql ./logs/` |
| `sshsync ls` | List hosts | `sshsync ls --with-status` |
| `sshsync sync` | Sync ungrouped hosts | `sshsync sync` |
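These invocations can be driven from Python with `subprocess`; a minimal sketch (the `build_sshsync_cmd` helper below is illustrative, not part of sshsync):

```python
import subprocess

def build_sshsync_cmd(action, target=None, args=()):
    """Build an sshsync argv list; action is 'all', 'group', 'push', etc.

    Illustrative helper mirroring the commands in the table above.
    """
    cmd = ["sshsync", action]
    if action == "group" and target:
        cmd.append(target)
    cmd.extend(args)
    return cmd

def run_sshsync(action, target=None, args=()):
    """Run sshsync and capture its output (requires sshsync on PATH)."""
    return subprocess.run(
        build_sshsync_cmd(action, target, args),
        capture_output=True, text=True, timeout=60,
    )
```

For example, `build_sshsync_cmd("group", "web", ["df -h"])` yields `["sshsync", "group", "web", "df -h"]`.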

Tailscale Integration

What is Tailscale?

Tailscale is a zero-config VPN that creates a secure network between your devices. It provides:

  • Automatic peer-to-peer connections via WireGuard
  • Magic DNS for easy host addressing (e.g., machine-name.tailnet-name.ts.net)
  • SSH built into the Tailscale CLI
  • ACLs for access control

Tailscale SSH:

Tailscale includes SSH functionality that works seamlessly with standard SSH:

# Standard SSH via Tailscale
ssh user@machine-name

# Tailscale-specific SSH command
tailscale ssh machine-name

Integration with sshsync:

Since Tailscale SSH speaks the standard SSH protocol, it works with sshsync out of the box; just point your SSH config entries at Tailscale hostnames:

Host homelab-1
  HostName homelab-1.tailnet.ts.net
  User admin

Tailscale Commands Used:

| Command | Purpose | Example |
| --- | --- | --- |
| `tailscale status` | Show network status | Lists all connected machines |
| `tailscale ping` | Check connectivity | `tailscale ping machine-name` |
| `tailscale ssh` | SSH to machine | `tailscale ssh user@machine` |

Workflows

1. Host Health Monitoring

User Query: "Which of my machines are online?"

Workflow:

  1. Load SSH config and sshsync groups
  2. Execute sshsync ls --with-status
  3. Parse connectivity results
  4. Query Tailscale status for additional context
  5. Return formatted health report with:
    • Online/offline status per host
    • Group memberships
    • Tailscale connection state
    • Last seen timestamp

Implementation: scripts/sshsync_wrapper.py → get_host_status()

Output Format:

🟢 homelab-1 (homelab) - Online - Tailscale: Connected
🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
🔴 dev-laptop (development) - Offline - Last seen: 2h ago
🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected

Summary: 3/4 hosts online (75%)
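Step 4's Tailscale lookup can be sketched by parsing `tailscale status --json`, whose `Peer` map exposes per-machine `HostName` and `Online` fields (the `sample` data below is illustrative):

```python
import json
import subprocess

def parse_online_peers(status: dict) -> dict:
    """Map peer hostname -> online flag from `tailscale status --json` output."""
    peers = status.get("Peer") or {}
    return {p["HostName"]: bool(p.get("Online")) for p in peers.values()}

def tailscale_status() -> dict:
    """Fetch live status (requires the tailscale CLI on PATH)."""
    out = subprocess.run(["tailscale", "status", "--json"],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

# Illustrative sample of the JSON shape:
sample = {"Peer": {
    "key1": {"HostName": "homelab-1", "Online": True},
    "key2": {"HostName": "dev-laptop", "Online": False},
}}
```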

2. Intelligent Load Balancing

User Query: "Run this task on the least loaded machine"

Workflow:

  1. Get list of candidate hosts (from group or all)
  2. For each online host, check:
    • CPU load (via uptime or top)
    • Memory usage (via free or vm_stat)
    • Disk space (via df)
  3. Calculate composite load score
  4. Select host with lowest score
  5. Execute task on selected host
  6. Return result with performance metrics

Implementation: scripts/load_balancer.py → select_optimal_host()

Load Score Calculation:

score = (cpu_pct * 0.4) + (mem_pct * 0.3) + (disk_pct * 0.3)

Lower score = better candidate for task execution.
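As a sketch, the formula and selection step translate directly to (the per-host metrics below are illustrative):

```python
def load_score(cpu_pct, mem_pct, disk_pct):
    """Composite load score per the formula above; lower is better."""
    return (cpu_pct * 0.4 + mem_pct * 0.3 + disk_pct * 0.3) / 100

def pick_least_loaded(metrics):
    """metrics: {host: (cpu_pct, mem_pct, disk_pct)} -> host with lowest score."""
    return min(metrics, key=lambda h: load_score(*metrics[h]))

metrics = {
    "web-01": (45, 60, 40),
    "web-02": (85, 70, 65),
    "web-03": (20, 35, 30),
}
```

Here `pick_least_loaded(metrics)` selects `web-03`, the host with the lowest composite score.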

Output Format:

✓ Selected host: prod-web-02
  Reason: Lowest load score (0.32)
  - CPU: 15% (vs avg 45%)
  - Memory: 30% (vs avg 60%)
  - Disk: 40% (vs avg 55%)

Executing: npm run build
[Task output...]

✓ Completed in 2m 15s

3. File Synchronization Workflows

User Query: "Sync my code to all development machines"

Workflow:

  1. Validate source path exists locally
  2. Identify target group ("development")
  3. Check connectivity to all group members
  4. Show dry-run preview (files to be synced, sizes)
  5. Execute parallel push to all hosts
  6. Validate successful transfer on each host
  7. Return summary with per-host status

Implementation: scripts/sshsync_wrapper.py → push_to_hosts()

Supported Operations:

  • Push to all: Sync files to every configured host
  • Push to group: Sync to specific group (dev, prod, etc.)
  • Pull from host: Retrieve files from single host
  • Pull from group: Collect files from multiple hosts
  • Recursive sync: Entire directory trees with --recurse

Output Format:

📤 Syncing: ~/projects/myapp → /var/www/myapp
Group: development (3 hosts)

Preview (dry-run):
  - dev-laptop: 145 files, 12.3 MB
  - dev-desktop: 145 files, 12.3 MB
  - dev-server: 145 files, 12.3 MB

Execute? [Proceeding...]

✓ dev-laptop: Synced 145 files in 8s
✓ dev-desktop: Synced 145 files in 6s
✓ dev-server: Synced 145 files in 10s

Summary: 3/3 successful (435 files, 36.9 MB total)

4. Remote Command Orchestration

User Query: "Check disk space on all web servers"

Workflow:

  1. Identify target group ("web-servers")
  2. Validate group exists and has members
  3. Check connectivity to group members
  4. Execute command in parallel across group
  5. Collect and parse outputs
  6. Format results with per-host breakdown

Implementation: scripts/sshsync_wrapper.py → execute_on_group()

Features:

  • Parallel execution: Commands run simultaneously on all hosts
  • Timeout handling: Configurable per-command timeout (default 10s)
  • Error isolation: Failure on one host doesn't stop others
  • Output aggregation: Collect and correlate all outputs
  • Dry-run mode: Preview what would execute without running
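The parallel-execution and error-isolation behavior can be sketched with a `ThreadPoolExecutor` (here `run_on_host` is a stand-in for the real SSH call):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_parallel(hosts, run_on_host, max_workers=10, timeout=10):
    """Run a per-host callable concurrently; a failure on one host is
    recorded and does not stop the others."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_on_host, h): h for h in hosts}
        for fut in as_completed(futures, timeout=timeout):
            host = futures[fut]
            try:
                results[host] = {"ok": True, "output": fut.result()}
            except Exception as exc:  # error isolation: record, keep going
                results[host] = {"ok": False, "error": str(exc)}
    return results
```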

Output Format:

🔧 Executing on group 'web-servers': df -h /var/www

web-01:
  Filesystem: /dev/sda1
  Size: 100G, Used: 45G, Available: 50G (45% used)

web-02:
  Filesystem: /dev/sda1
  Size: 100G, Used: 67G, Available: 28G (67% used) ⚠️

web-03:
  Filesystem: /dev/sda1
  Size: 100G, Used: 52G, Available: 43G (52% used)

⚠️ Alert: web-02 is above 60% disk usage

5. Multi-Stage Deployment Workflow

User Query: "Deploy to staging, test, then production"

Workflow:

  1. Stage 1 - Staging Deploy:

    • Push code to staging group
    • Run build process
    • Execute automated tests
    • If tests fail: STOP and report error
  2. Stage 2 - Validation:

    • Check staging health endpoints
    • Validate database migrations
    • Run smoke tests
  3. Stage 3 - Production Deploy:

    • Push to production group (one at a time for zero-downtime)
    • Restart services gracefully
    • Verify each host before proceeding to next
  4. Stage 4 - Verification:

    • Check production health
    • Monitor for errors
    • Rollback if issues detected

Implementation: scripts/workflow_executor.py → deploy_workflow()
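The stop-on-failure control flow can be sketched as a list of named stage callables (the stages themselves are placeholders for the real push/build/test steps):

```python
def run_pipeline(stages):
    """stages: list of (name, callable) run in order. A stage that returns
    False or raises stops the pipeline, mirroring the
    'if tests fail: STOP' rule above."""
    completed = []
    for name, stage in stages:
        try:
            ok = stage()
        except Exception:
            ok = False
        if not ok:
            return {"ok": False, "failed_stage": name, "completed": completed}
        completed.append(name)
    return {"ok": True, "completed": completed}
```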

Output Format:

🚀 Multi-Stage Deployment Workflow

Stage 1: Staging Deployment
  ✓ Pushed code to staging-01
  ✓ Build completed (2m 15s)
  ✓ Tests passed (145/145)

Stage 2: Validation
  ✓ Health check passed
  ✓ Database migration OK
  ✓ Smoke tests passed (12/12)

Stage 3: Production Deployment
  ✓ prod-web-01: Deployed & verified
  ✓ prod-web-02: Deployed & verified
  ✓ prod-web-03: Deployed & verified

Stage 4: Verification
  ✓ All health checks passed
  ✓ No errors in logs (5min window)

✅ Deployment completed successfully in 12m 45s

Available Scripts

scripts/sshsync_wrapper.py

Purpose: Python wrapper around sshsync CLI for programmatic access

Functions:

  • get_host_status(group=None): Get online/offline status of hosts
  • execute_on_all(command, timeout=10, dry_run=False): Run command on all hosts
  • execute_on_group(group, command, timeout=10, dry_run=False): Run on specific group
  • execute_on_host(host, command, timeout=10): Run on single host
  • push_to_hosts(local_path, remote_path, hosts=None, group=None, recurse=False, dry_run=False): Push files
  • pull_from_host(host, remote_path, local_path, recurse=False, dry_run=False): Pull files
  • list_hosts(with_status=True): List all configured hosts
  • get_groups(): Get all defined groups and their members
  • add_hosts_to_group(group, hosts): Add hosts to a group

Usage Example:

from sshsync_wrapper import execute_on_group, push_to_hosts

# Execute command
result = execute_on_group(
    group="web-servers",
    command="systemctl status nginx",
    timeout=15
)

# Push files
push_to_hosts(
    local_path="./dist",
    remote_path="/var/www/app",
    group="production",
    recurse=True
)

scripts/tailscale_manager.py

Purpose: Tailscale-specific operations and status management

Functions:

  • get_tailscale_status(): Get Tailscale network status (all peers)
  • check_connectivity(host): Ping host via Tailscale
  • get_peer_info(hostname): Get detailed info about peer
  • list_online_machines(): List all online Tailscale machines
  • get_machine_ip(hostname): Get Tailscale IP for machine
  • validate_tailscale_ssh(host): Check if Tailscale SSH is working

Usage Example:

from tailscale_manager import get_tailscale_status, check_connectivity

# Get network status
status = get_tailscale_status()
print(f"Online machines: {status['online_count']}")

# Check specific host
is_online = check_connectivity("homelab-1")

scripts/load_balancer.py

Purpose: Intelligent task distribution based on machine resources

Functions:

  • get_machine_load(host): Get CPU, memory, disk metrics
  • calculate_load_score(metrics): Calculate composite load score
  • select_optimal_host(candidates, prefer_group=None): Pick best host
  • get_group_capacity(): Get aggregate capacity of group
  • distribute_tasks(tasks, hosts): Distribute multiple tasks optimally

Usage Example:

from load_balancer import select_optimal_host
from sshsync_wrapper import execute_on_host

# Find best machine for task
best_host = select_optimal_host(
    candidates=["web-01", "web-02", "web-03"],
    prefer_group="production"
)

# Execute on selected host
execute_on_host(best_host, "npm run build")

scripts/workflow_executor.py

Purpose: Common multi-machine workflow automation

Functions:

  • deploy_workflow(code_path, staging_group, prod_group): Full deployment pipeline
  • backup_workflow(hosts, backup_paths, destination): Backup from multiple hosts
  • sync_workflow(source_host, target_group, paths): Sync from one to many
  • rolling_restart(group, service_name): Zero-downtime service restart
  • health_check_workflow(group, endpoint): Check health across group

Usage Example:

from workflow_executor import deploy_workflow, backup_workflow

# Deploy with testing
deploy_workflow(
    code_path="./dist",
    staging_group="staging",
    prod_group="production"
)

# Backup from all databases
backup_workflow(
    hosts=["db-01", "db-02"],
    backup_paths=["/var/lib/mysql"],
    destination="./backups"
)

scripts/utils/helpers.py

Purpose: Common utilities and formatting functions

Functions:

  • format_bytes(bytes): Human-readable byte formatting (1.2 GB)
  • format_duration(seconds): Human-readable duration (2m 15s)
  • parse_ssh_config(): Parse ~/.ssh/config for host details
  • parse_sshsync_config(): Parse sshsync group configuration
  • get_timestamp(): Get ISO timestamp for logging
  • safe_execute(func, *args, **kwargs): Execute with error handling
  • validate_path(path): Check if path exists and is accessible
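Plausible implementations of the two formatters, assuming 1024-based size units and a minutes-and-seconds breakdown:

```python
def format_bytes(n: float) -> str:
    """Human-readable size, e.g. 1536 -> '1.5 KB' (1024-based units)."""
    for unit in ("B", "KB", "MB", "GB"):
        if abs(n) < 1024:
            return f"{n:.1f} {unit}"
        n /= 1024
    return f"{n:.1f} TB"

def format_duration(seconds: int) -> str:
    """Human-readable duration, e.g. 135 -> '2m 15s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"
```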

scripts/utils/validators/parameter_validator.py

Purpose: Validate user inputs and parameters

Functions:

  • validate_host(host, valid_hosts=None): Validate host exists
  • validate_group(group, valid_groups=None): Validate group exists
  • validate_path_exists(path): Check local path exists
  • validate_timeout(timeout): Ensure timeout is reasonable
  • validate_command(command): Basic command safety validation
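A sketch of `validate_host` using `difflib` to suggest close matches on failure (the suggestion wording is an assumption, consistent with the error-handling guidance later in this document):

```python
import difflib

def validate_host(host, valid_hosts):
    """Return host if known; otherwise raise with a close-match suggestion."""
    if host in valid_hosts:
        return host
    close = difflib.get_close_matches(host, valid_hosts, n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    raise ValueError(f"Unknown host '{host}'.{hint}")
```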

scripts/utils/validators/host_validator.py

Purpose: Validate host configuration and availability

Functions:

  • validate_ssh_config(host): Check host has SSH config entry
  • validate_host_reachable(host, timeout=5): Check host is reachable
  • validate_group_members(group): Ensure group has valid members
  • get_invalid_hosts(hosts): Find hosts without valid config

scripts/utils/validators/connection_validator.py

Purpose: Validate SSH and Tailscale connections

Functions:

  • validate_ssh_connection(host): Test SSH connection works
  • validate_tailscale_connection(host): Test Tailscale connectivity
  • validate_ssh_key(host): Check SSH key authentication
  • get_connection_diagnostics(host): Comprehensive connection testing

Available Analyses

1. Host Availability Analysis

Function: analyze_host_availability(group=None)

Objective: Determine which machines are online and accessible

Inputs:

  • group (optional): Specific group to check (None = all hosts)

Outputs:

{
    'total_hosts': 10,
    'online_hosts': 8,
    'offline_hosts': 2,
    'availability_pct': 80.0,
    'by_group': {
        'production': {'online': 3, 'total': 3, 'pct': 100.0},
        'development': {'online': 2, 'total': 3, 'pct': 66.7},
        'homelab': {'online': 3, 'total': 4, 'pct': 75.0}
    },
    'offline_hosts_details': [
        {'host': 'dev-laptop', 'last_seen': '2h ago', 'groups': ['development']},
        {'host': 'homelab-4', 'last_seen': '1d ago', 'groups': ['homelab']}
    ]
}

Interpretation:

  • > 90%: Excellent availability
  • 70-90%: Good availability, monitor offline hosts
  • < 70%: Poor availability, investigate issues

2. Load Distribution Analysis

Function: analyze_load_distribution(group=None)

Objective: Understand resource usage across machines

Inputs:

  • group (optional): Specific group to analyze

Outputs:

{
    'hosts': [
        {
            'host': 'web-01',
            'cpu_pct': 45,
            'mem_pct': 60,
            'disk_pct': 40,
            'load_score': 0.49,
            'status': 'moderate'
        },
        # ... more hosts
    ],
    'aggregate': {
        'avg_cpu': 35,
        'avg_mem': 55,
        'avg_disk': 45,
        'total_capacity': 1200  # GB
    },
    'recommendations': [
        {
            'host': 'web-02',
            'issue': 'High CPU usage (85%)',
            'action': 'Consider migrating workloads'
        }
    ]
}

Load Status:

  • Low (score < 0.4): Good capacity for more work
  • Moderate (0.4-0.7): Normal operation
  • High (> 0.7): May need to offload work

3. File Sync Status Analysis

Function: analyze_sync_status(local_path, remote_path, group)

Objective: Compare local files with remote versions

Inputs:

  • local_path: Local directory to compare
  • remote_path: Remote directory path
  • group: Group to check

Outputs:

{
    'local_files': 145,
    'local_size': 12582912,  # bytes
    'hosts': [
        {
            'host': 'web-01',
            'status': 'in_sync',
            'files_match': 145,
            'files_different': 0,
            'missing_files': 0
        },
        {
            'host': 'web-02',
            'status': 'out_of_sync',
            'files_match': 140,
            'files_different': 3,
            'missing_files': 2,
            'details': ['config.json modified', 'index.html modified', ...]
        }
    ],
    'sync_percentage': 96.7,
    'recommended_action': 'Push to web-02'
}

4. Network Latency Analysis

Function: analyze_network_latency(hosts=None)

Objective: Measure connection latency to hosts

Inputs:

  • hosts (optional): Specific hosts to test (None = all)

Outputs:

{
    'hosts': [
        {'host': 'web-01', 'latency_ms': 15, 'status': 'excellent'},
        {'host': 'web-02', 'latency_ms': 45, 'status': 'good'},
        {'host': 'db-01', 'latency_ms': 150, 'status': 'fair'}
    ],
    'avg_latency': 70,
    'min_latency': 15,
    'max_latency': 150,
    'recommendations': [
        {'host': 'db-01', 'issue': 'High latency', 'action': 'Check network path'}
    ]
}

Latency Classification:

  • Excellent (< 50ms): Ideal for interactive tasks
  • Good (50-100ms): Suitable for most operations
  • Fair (100-200ms): May impact interactive workflows
  • Poor (> 200ms): Investigate network issues
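These thresholds translate directly into a classifier:

```python
def classify_latency(ms: float) -> str:
    """Bucket a round-trip time per the thresholds above."""
    if ms < 50:
        return "excellent"
    if ms < 100:
        return "good"
    if ms < 200:
        return "fair"
    return "poor"
```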

5. Comprehensive Infrastructure Report

Function: comprehensive_infrastructure_report(group=None)

Objective: One-stop function for complete infrastructure overview

Inputs:

  • group (optional): Limit to specific group (None = all)

Outputs:

{
    'report_timestamp': '2025-10-19T19:43:41Z',
    'group': 'production',  # or 'all'
    'metrics': {
        'availability': {...},  # from analyze_host_availability
        'load_distribution': {...},  # from analyze_load_distribution
        'network_latency': {...},  # from analyze_network_latency
        'tailscale_status': {...}  # from Tailscale integration
    },
    'summary': "Production infrastructure: 3/3 hosts online, avg load 45%, network latency 35ms",
    'alerts': [
        "⚠ web-02: High CPU usage (85%)",
        "⚠ db-01: Elevated latency (150ms)"
    ],
    'recommendations': [
        "Consider rebalancing workload from web-02",
        "Investigate network path to db-01"
    ],
    'overall_health': 'good'  # excellent | good | fair | poor
}

Overall Health Classification:

  • Excellent: All metrics green, no alerts
  • Good: Most metrics healthy, minor alerts
  • Fair: Some concerning metrics, action recommended
  • Poor: Critical issues, immediate action required

Error Handling

Connection Errors

Error: Cannot connect to host

Causes:

  • Host is offline
  • Tailscale not connected
  • SSH key missing/invalid
  • Firewall blocking connection

Handling:

try:
    execute_on_host("web-01", "ls")
except ConnectionError as e:
    # Try Tailscale ping first
    if not check_connectivity("web-01"):
        return {
            'error': 'Host unreachable',
            'suggestion': 'Check Tailscale connection',
            'diagnostics': get_connection_diagnostics("web-01")
        }
    # Then check SSH
    if not validate_ssh_connection("web-01"):
        return {
            'error': 'SSH authentication failed',
            'suggestion': 'Check SSH keys: ssh-add -l'
        }

Timeout Errors

Error: Operation timed out

Causes:

  • Command taking too long
  • Network latency
  • Host overloaded

Handling:

  • Automatic retry with exponential backoff (3 attempts)
  • Increase timeout for known slow operations
  • Fall back to alternative host if available
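The retry policy can be sketched as follows (`base_delay` and the injectable `sleep` are illustrative choices; injection keeps the policy testable without real waits):

```python
import time

def with_retry(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn, retrying on failure with exponential backoff (1s, 2s, ...).
    Re-raises the last error after the final attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```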

File Transfer Errors

Error: File sync failed

Causes:

  • Insufficient disk space
  • Permission denied
  • Path doesn't exist

Handling:

  • Pre-check disk space on target
  • Validate permissions before transfer
  • Create directories if needed
  • Partial transfer recovery

Validation Errors

Error: Invalid parameter

Examples:

  • Unknown host
  • Non-existent group
  • Invalid path

Handling:

  • Validate all inputs before execution
  • Provide suggestions for similar valid options
  • Clear error messages with corrective actions

Mandatory Validations

Before Any Operation

  1. Parameter Validation:

    host = validate_host(host, valid_hosts=get_all_hosts())
    group = validate_group(group, valid_groups=get_groups())
    timeout = validate_timeout(timeout)
    
  2. Connection Validation:

    if not validate_host_reachable(host, timeout=5):
        raise ConnectionError(f"Host {host} is not reachable")
    
  3. Path Validation (for file operations):

    if not validate_path_exists(local_path):
        raise ValueError(f"Path does not exist: {local_path}")
    

During Operation

  1. Timeout Monitoring: Every operation has configurable timeout
  2. Progress Tracking: Long operations show progress
  3. Error Isolation: Failure on one host doesn't stop others

After Operation

  1. Result Validation:

    report = validate_operation_result(result)
    if report.has_critical_issues():
        raise OperationError(report.get_summary())
    
  2. State Verification: Confirm operation succeeded

  3. Logging: Record all operations for audit trail

Performance and Caching

Caching Strategy

Host Status Cache:

  • TTL: 60 seconds
  • Why: Host status doesn't change rapidly
  • Invalidation: Manual invalidation when connectivity changes

Load Metrics Cache:

  • TTL: 30 seconds
  • Why: Load changes frequently
  • Invalidation: Automatic on timeout

Group Configuration Cache:

  • TTL: 5 minutes
  • Why: Group membership rarely changes
  • Invalidation: Manual when groups modified
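All three caches fit a single TTL pattern; a minimal sketch with an injectable clock (the class name and API are illustrative):

```python
import time

class TTLCache:
    """Tiny time-based cache: entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self.ttl, self.clock = ttl, clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry and entry[0] > self.clock():
            return entry[1]
        self._store.pop(key, None)  # drop expired entry, if any
        return default

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def invalidate(self, key=None):
        """Manual invalidation (e.g. when connectivity or groups change)."""
        if key is None:
            self._store.clear()
        else:
            self._store.pop(key, None)
```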

Performance Optimizations

  1. Parallel Execution:

    • Commands execute concurrently across hosts
    • ThreadPoolExecutor with max 10 workers
    • Prevents sequential bottleneck
  2. Connection Pooling:

    • Reuse SSH connections when possible
    • ControlMaster in SSH config
  3. Lazy Loading:

    • Only fetch data when needed
    • Don't load all host status unless required
  4. Progressive Results:

    • Stream results as they complete
    • Don't wait for the slowest host

Automatic Detection Keywords

This skill automatically activates when you mention:

Hosts & Machines:

  • remote machine, remote host, remote server
  • Tailscale machine, Tailscale host
  • SSH host, SSH server
  • homelab, servers, cluster

Tools:

  • sshsync, ssh sync
  • Tailscale, tailnet
  • SSH, secure shell

Operations:

  • distribute workload, load balance
  • file sharing, file sync, file transfer
  • push files, pull files, sync files
  • remote execution, execute remotely
  • deploy to, deploy across
  • backup from, sync to

Groups:

  • host group, machine group, server group
  • production servers, development machines
  • web servers, database servers
  • staging environment, production environment

Actions:

  • check status, check connectivity
  • run on all, execute on group
  • sync across, distribute to
  • deploy to, push to, pull from

Monitoring:

  • which machines are online
  • host availability, server status
  • machine health, connectivity check
  • network status, latency check

Workflows:

  • multi-stage deployment
  • rolling restart, zero-downtime
  • backup workflow, sync workflow
  • distributed task, parallel execution

Usage Examples

Example 1: Check Host Status

User: "Which of my Tailscale machines are online?"

Agent Response:

Checking Tailscale network status...

🟢 homelab-1 (homelab) - Online - Tailscale: Connected
  IP: 100.64.1.10 | Latency: 15ms

🟢 homelab-2 (homelab) - Online - Tailscale: Connected
  IP: 100.64.1.11 | Latency: 18ms

🔴 dev-laptop (development) - Offline
  Last seen: 2 hours ago | Tailscale: Not connected

🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
  IP: 100.64.1.20 | Latency: 25ms

🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected
  IP: 100.64.1.30 | Latency: 30ms

Summary: 4/5 hosts online (80%)
Average latency: 22ms

Example 2: Execute on Group

User: "Check disk space on all web servers"

Agent Response:

Executing on group 'web-servers': df -h /var/www

🔧 web-01:
  Filesystem: /dev/sda1
  Size: 100G | Used: 45G | Available: 50G (45% used)
  Status: ✓ Healthy

🔧 web-02:
  Filesystem: /dev/sda1
  Size: 100G | Used: 67G | Available: 28G (67% used)
  Status: ⚠ Warning - Above 60%

🔧 web-03:
  Filesystem: /dev/sda1
  Size: 100G | Used: 52G | Available: 43G (52% used)
  Status: ✓ Healthy

⚠️ Alert: web-02 is above 60% disk usage
Recommendation: Clear old logs or increase disk space

Example 3: Load-Balanced Execution

User: "Run this npm build on the machine with most available resources"

Agent Response:

Analyzing resource availability across 5 machines...

Load Analysis:
  web-01: CPU 45%, Mem 60%, Disk 40% → Score: 0.49 (moderate)
  web-02: CPU 85%, Mem 70%, Disk 65% → Score: 0.75 (high load)
  web-03: CPU 20%, Mem 35%, Disk 30% → Score: 0.28 (low load) ✓

✓ Selected: web-03
  Reason: Lowest load score (0.28)
  Available: CPU 80%, Memory 65%, Disk 70%

Executing on web-03: npm run build

[Build output...]
  ✓ Build completed successfully
  Duration: 2m 15s
  Output: ./dist (145 files, 12.3 MB)

Final load: CPU 65%, Mem 50% (still below average)

Example 4: File Sync

User: "Push my app directory to all production servers"

Agent Response:

📤 Syncing: ~/projects/myapp → /var/www/myapp
Target group: production (3 hosts)

Analyzing source...
  145 files, 12.3 MB total

Preview (dry-run):
  prod-web-01: 145 files → /var/www/myapp
  prod-web-02: 145 files → /var/www/myapp
  prod-db-01: Skipped (database host; app files not deployed here)

Checking target capacity...
  ✓ prod-web-01: 50 GB available
  ✓ prod-web-02: 28 GB available

Executing push...

✓ prod-web-01: Synced 145 files in 8.2s
  Modified: 12 files
  Added: 3 files

✓ prod-web-02: Synced 145 files in 6.5s
  Modified: 12 files
  Added: 3 files

Summary: 2/2 successful (290 files, 24.6 MB total)

Example 5: Multi-Stage Deployment

User: "Deploy this to staging, run tests, then push to production"

Agent Response:

🚀 Multi-Stage Deployment Workflow

Stage 1: Staging Deployment
  📤 Pushing to staging-01...
    ✓ Code synced (145 files, 12.3 MB) - 8s

  🔨 Running build...
    $ npm run build
    ✓ Build completed - 2m 15s

  🧪 Running tests...
    $ npm test
    ✓ All tests passed (145/145) - 45s

Stage 2: Staging Validation
  ✓ Health endpoint: 200 OK
  ✓ Database migrations: Up to date
  ✓ Smoke tests: 12/12 passed

Stage 3: Production Deployment
  Rolling deployment to 3 hosts...

  ⏳ prod-web-01:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 10s

  ⏳ prod-web-02:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 5s

  ⏳ prod-web-03:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 12s

Stage 4: Production Verification
  ✓ All health checks passed (3/3)
  ✓ No errors in logs (5min window)
  ✓ Response times normal (avg 45ms)

✅ Deployment completed successfully!
  Total duration: 12m 45s
  Hosts updated: 3
  Zero downtime: ✓

Example 6: Comprehensive Infrastructure Report

User: "Give me a complete status report of my infrastructure"

Agent Response:

📊 Infrastructure Report
Generated: 2025-10-19 19:43:41

=== AVAILABILITY ===
Total Hosts: 10
Online: 8 (80%)
Offline: 2

By Group:
  production: 3/3 (100%) ✓
  development: 2/3 (66.7%) ⚠
  homelab: 3/4 (75%)

Offline Hosts:
  • dev-laptop (development) - Last seen: 2h ago
  • homelab-4 (homelab) - Last seen: 1d ago

=== LOAD DISTRIBUTION ===
Average Load: 45% CPU, 55% Memory, 45% Disk

Top 3 Loaded Hosts:
  1. web-02: 85% CPU, 70% Mem, 65% Disk (Score: 0.75) ⚠
  2. db-01: 60% CPU, 75% Mem, 55% Disk (Score: 0.65)
  3. web-01: 45% CPU, 60% Mem, 40% Disk (Score: 0.49)

Top 3 Available Hosts:
  1. web-03: 20% CPU, 35% Mem, 30% Disk (Score: 0.28) ✓
  2. homelab-1: 25% CPU, 40% Mem, 35% Disk (Score: 0.33)
  3. homelab-2: 30% CPU, 45% Mem, 40% Disk (Score: 0.38)

=== NETWORK LATENCY ===
Average: 35ms
Range: 15ms - 150ms

Excellent (< 50ms): 6 hosts
Good (50-100ms): 1 host
Fair (100-200ms): 1 host (db-01: 150ms) ⚠

=== TAILSCALE STATUS ===
Network: Connected
Peers Online: 8/10
Exit Node: None
MagicDNS: Enabled

=== ALERTS ===
⚠ web-02: High CPU usage (85%) - Consider load balancing
⚠ db-01: Elevated latency (150ms) - Check network path
⚠ dev-laptop: Offline for 2 hours - May need attention

=== RECOMMENDATIONS ===
1. Rebalance workload from web-02 to web-03
2. Investigate network latency to db-01
3. Check status of dev-laptop and homelab-4
4. Consider scheduling maintenance for web-02

Overall Health: GOOD ✓

Installation

See INSTALLATION.md for detailed setup instructions.

Quick start:

# 1. Install sshsync
pip install sshsync

# 2. Configure SSH hosts
vim ~/.ssh/config

# 3. Sync host groups
sshsync sync

# 4. Install agent
/plugin marketplace add ./tailscale-sshsync-agent

# 5. Test
"Which of my machines are online?"

Version

Current version: 1.0.0

See CHANGELOG.md for release history.

Architecture Decisions

See DECISIONS.md for detailed rationale behind tool selection, architecture choices, and trade-offs considered.