
SKILL.md

name: tailscale-sshsync-agent
description: Manages distributed workloads and file sharing across Tailscale SSH-connected machines. Automates remote command execution, intelligent load balancing, file synchronization workflows, host health monitoring, and multi-machine orchestration using sshsync. Activates when discussing remote machines, Tailscale SSH, workload distribution, file sharing, or multi-host operations.

Tailscale SSH Sync Agent

When to Use This Skill

This skill automatically activates when you need to:

Distribute workloads across multiple machines

  • "Run this on my least loaded machine"
  • "Execute this task on the machine with most resources"
  • "Balance work across my Tailscale network"

Share files between Tailscale-connected hosts

  • "Push this directory to all my development machines"
  • "Sync code across my homelab servers"
  • "Deploy configuration to production group"

Execute commands remotely across host groups

  • "Run system updates on all servers"
  • "Check disk space across web-servers group"
  • "Restart services on database hosts"

Monitor machine availability and health

  • "Which machines are online?"
  • "Show status of my Tailscale network"
  • "Check connectivity to remote hosts"

Automate multi-machine workflows

  • "Deploy to staging, test, then production"
  • "Backup files from all machines"
  • "Synchronize development environment across laptops"

How It Works

This agent provides intelligent workload distribution and file sharing management across Tailscale SSH-connected machines using the sshsync CLI tool.

Core Architecture:

  1. SSH Sync Wrapper: Python interface to sshsync CLI operations
  2. Tailscale Manager: Tailscale-specific connectivity and status management
  3. Load Balancer: Intelligent task distribution based on machine resources
  4. Workflow Executor: Common multi-machine workflow automation
  5. Validators: Parameter, host, and connection validation
  6. Helpers: Temporal context, formatting, and utilities

Key Features:

  • Automatic host discovery via Tailscale and SSH config
  • Intelligent load balancing based on CPU, memory, and current load
  • Group-based operations (execute on all web servers, databases, etc.)
  • Dry-run mode for preview before execution
  • Parallel execution across multiple hosts
  • Comprehensive error handling and retry logic
  • Connection validation before operations
  • Progress tracking for long-running operations

Data Sources

sshsync CLI Tool

What is sshsync?

sshsync is a Python CLI tool for managing SSH connections and executing operations across multiple hosts. It provides:

  • Group-based host management
  • Remote command execution with timeouts
  • File push/pull operations (single or recursive)
  • Integration with existing SSH config (~/.ssh/config)
  • Status checking and connectivity validation

Installation:

pip install sshsync

Configuration:

sshsync uses two configuration sources:

  1. SSH Config (~/.ssh/config): Host connection details
  2. sshsync Config (~/.config/sshsync/config.yaml): Group assignments

Example SSH Config:

Host homelab-1
  HostName 100.64.1.10
  User admin
  IdentityFile ~/.ssh/id_ed25519

Host prod-web-01
  HostName 100.64.1.20
  User deploy
  Port 22

Example sshsync Config:

groups:
  homelab:
    - homelab-1
    - homelab-2
  production:
    - prod-web-01
    - prod-web-02
    - prod-db-01
  development:
    - dev-laptop
    - dev-desktop

sshsync Commands Used:

| Command | Purpose | Example |
| --- | --- | --- |
| `sshsync all` | Execute on all hosts | `sshsync all "df -h"` |
| `sshsync group` | Execute on group | `sshsync group web "systemctl status nginx"` |
| `sshsync push` | Push files to hosts | `sshsync push --group prod ./app /var/www/` |
| `sshsync pull` | Pull files from hosts | `sshsync pull --host db /var/log/mysql ./logs/` |
| `sshsync ls` | List hosts | `sshsync ls --with-status` |
| `sshsync sync` | Sync ungrouped hosts | `sshsync sync` |
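These invocations can be driven from Python with `subprocess`; a minimal sketch (the `build_sshsync_cmd` helper below is illustrative, not part of sshsync):

```python
import subprocess

def build_sshsync_cmd(action, target=None, args=()):
    """Build an sshsync argv list; action is 'all', 'group', 'push', etc.

    Illustrative helper mirroring the commands in the table above.
    """
    cmd = ["sshsync", action]
    if action == "group" and target:
        cmd.append(target)
    cmd.extend(args)
    return cmd

def run_sshsync(action, target=None, args=()):
    """Run sshsync and capture its output (requires sshsync on PATH)."""
    return subprocess.run(
        build_sshsync_cmd(action, target, args),
        capture_output=True, text=True, timeout=60,
    )
```

For example, `build_sshsync_cmd("group", "web", ["df -h"])` yields `["sshsync", "group", "web", "df -h"]`.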

Tailscale Integration

What is Tailscale?

Tailscale is a zero-config VPN that creates a secure network between your devices. It provides:

  • Automatic peer-to-peer connections via WireGuard
  • Magic DNS for easy host addressing (e.g., machine-name.tailnet-name.ts.net)
  • SSH built into the Tailscale CLI
  • ACLs for access control

Tailscale SSH:

Tailscale includes SSH functionality that works seamlessly with standard SSH:

# Standard SSH via Tailscale
ssh user@machine-name

# Tailscale-specific SSH command
tailscale ssh machine-name

Integration with sshsync:

Since Tailscale SSH speaks the standard SSH protocol, it works with sshsync out of the box; just point your SSH config entries at Tailscale hostnames:

Host homelab-1
  HostName homelab-1.tailnet.ts.net
  User admin

Tailscale Commands Used:

| Command | Purpose | Example |
| --- | --- | --- |
| `tailscale status` | Show network status | Lists all connected machines |
| `tailscale ping` | Check connectivity | `tailscale ping machine-name` |
| `tailscale ssh` | SSH to machine | `tailscale ssh user@machine` |

Workflows

1. Host Health Monitoring

User Query: "Which of my machines are online?"

Workflow:

  1. Load SSH config and sshsync groups
  2. Execute sshsync ls --with-status
  3. Parse connectivity results
  4. Query Tailscale status for additional context
  5. Return formatted health report with:
    • Online/offline status per host
    • Group memberships
    • Tailscale connection state
    • Last seen timestamp

Implementation: scripts/sshsync_wrapper.py → get_host_status()

Output Format:

🟢 homelab-1 (homelab) - Online - Tailscale: Connected
🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
🔴 dev-laptop (development) - Offline - Last seen: 2h ago
🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected

Summary: 3/4 hosts online (75%)
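Step 4's Tailscale lookup can be sketched by parsing `tailscale status --json`, whose `Peer` map exposes per-machine `HostName` and `Online` fields (the `sample` data below is illustrative):

```python
import json
import subprocess

def parse_online_peers(status: dict) -> dict:
    """Map peer hostname -> online flag from `tailscale status --json` output."""
    peers = status.get("Peer") or {}
    return {p["HostName"]: bool(p.get("Online")) for p in peers.values()}

def tailscale_status() -> dict:
    """Fetch live status (requires the tailscale CLI on PATH)."""
    out = subprocess.run(["tailscale", "status", "--json"],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

# Illustrative sample of the JSON shape:
sample = {"Peer": {
    "key1": {"HostName": "homelab-1", "Online": True},
    "key2": {"HostName": "dev-laptop", "Online": False},
}}
```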

2. Intelligent Load Balancing

User Query: "Run this task on the least loaded machine"

Workflow:

  1. Get list of candidate hosts (from group or all)
  2. For each online host, check:
    • CPU load (via uptime or top)
    • Memory usage (via free or vm_stat)
    • Disk space (via df)
  3. Calculate composite load score
  4. Select host with lowest score
  5. Execute task on selected host
  6. Return result with performance metrics

Implementation: scripts/load_balancer.py → select_optimal_host()

Load Score Calculation:

score = (cpu_pct * 0.4) + (mem_pct * 0.3) + (disk_pct * 0.3)

Lower score = better candidate for task execution.
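As a sketch, the formula and selection step translate directly to (the per-host metrics below are illustrative):

```python
def load_score(cpu_pct, mem_pct, disk_pct):
    """Composite load score per the formula above; lower is better."""
    return (cpu_pct * 0.4 + mem_pct * 0.3 + disk_pct * 0.3) / 100

def pick_least_loaded(metrics):
    """metrics: {host: (cpu_pct, mem_pct, disk_pct)} -> host with lowest score."""
    return min(metrics, key=lambda h: load_score(*metrics[h]))

metrics = {
    "web-01": (45, 60, 40),
    "web-02": (85, 70, 65),
    "web-03": (20, 35, 30),
}
```

Here `pick_least_loaded(metrics)` selects `web-03`, the host with the lowest composite score.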

Output Format:

✓ Selected host: prod-web-02
  Reason: Lowest load score (0.32)
  - CPU: 15% (vs avg 45%)
  - Memory: 30% (vs avg 60%)
  - Disk: 40% (vs avg 55%)

Executing: npm run build
[Task output...]

✓ Completed in 2m 15s

3. File Synchronization Workflows

User Query: "Sync my code to all development machines"

Workflow:

  1. Validate source path exists locally
  2. Identify target group ("development")
  3. Check connectivity to all group members
  4. Show dry-run preview (files to be synced, sizes)
  5. Execute parallel push to all hosts
  6. Validate successful transfer on each host
  7. Return summary with per-host status

Implementation: scripts/sshsync_wrapper.py → push_to_hosts()

Supported Operations:

  • Push to all: Sync files to every configured host
  • Push to group: Sync to specific group (dev, prod, etc.)
  • Pull from host: Retrieve files from single host
  • Pull from group: Collect files from multiple hosts
  • Recursive sync: Entire directory trees with --recurse

Output Format:

📤 Syncing: ~/projects/myapp → /var/www/myapp
Group: development (3 hosts)

Preview (dry-run):
  - dev-laptop: 145 files, 12.3 MB
  - dev-desktop: 145 files, 12.3 MB
  - dev-server: 145 files, 12.3 MB

Execute? [Proceeding...]

✓ dev-laptop: Synced 145 files in 8s
✓ dev-desktop: Synced 145 files in 6s
✓ dev-server: Synced 145 files in 10s

Summary: 3/3 successful (435 files, 36.9 MB total)

4. Remote Command Orchestration

User Query: "Check disk space on all web servers"

Workflow:

  1. Identify target group ("web-servers")
  2. Validate group exists and has members
  3. Check connectivity to group members
  4. Execute command in parallel across group
  5. Collect and parse outputs
  6. Format results with per-host breakdown

Implementation: scripts/sshsync_wrapper.py → execute_on_group()

Features:

  • Parallel execution: Commands run simultaneously on all hosts
  • Timeout handling: Configurable per-command timeout (default 10s)
  • Error isolation: Failure on one host doesn't stop others
  • Output aggregation: Collect and correlate all outputs
  • Dry-run mode: Preview what would execute without running
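The parallel-execution and error-isolation behavior can be sketched with a `ThreadPoolExecutor` (here `run_on_host` is a stand-in for the real SSH call):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_parallel(hosts, run_on_host, max_workers=10, timeout=10):
    """Run a per-host callable concurrently; a failure on one host is
    recorded and does not stop the others."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_on_host, h): h for h in hosts}
        for fut in as_completed(futures, timeout=timeout):
            host = futures[fut]
            try:
                results[host] = {"ok": True, "output": fut.result()}
            except Exception as exc:  # error isolation: record, keep going
                results[host] = {"ok": False, "error": str(exc)}
    return results
```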

Output Format:

🔧 Executing on group 'web-servers': df -h /var/www

web-01:
  Filesystem: /dev/sda1
  Size: 100G, Used: 45G, Available: 50G (45% used)

web-02:
  Filesystem: /dev/sda1
  Size: 100G, Used: 67G, Available: 28G (67% used) ⚠️

web-03:
  Filesystem: /dev/sda1
  Size: 100G, Used: 52G, Available: 43G (52% used)

⚠️ Alert: web-02 is above 60% disk usage

5. Multi-Stage Deployment Workflow

User Query: "Deploy to staging, test, then production"

Workflow:

  1. Stage 1 - Staging Deploy:

    • Push code to staging group
    • Run build process
    • Execute automated tests
    • If tests fail: STOP and report error
  2. Stage 2 - Validation:

    • Check staging health endpoints
    • Validate database migrations
    • Run smoke tests
  3. Stage 3 - Production Deploy:

    • Push to production group (one at a time for zero-downtime)
    • Restart services gracefully
    • Verify each host before proceeding to next
  4. Stage 4 - Verification:

    • Check production health
    • Monitor for errors
    • Rollback if issues detected

Implementation: scripts/workflow_executor.py → deploy_workflow()
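The stop-on-failure control flow can be sketched as a list of named stage callables (the stages themselves are placeholders for the real push/build/test steps):

```python
def run_pipeline(stages):
    """stages: list of (name, callable) run in order. A stage that returns
    False or raises stops the pipeline, mirroring the
    'if tests fail: STOP' rule above."""
    completed = []
    for name, stage in stages:
        try:
            ok = stage()
        except Exception:
            ok = False
        if not ok:
            return {"ok": False, "failed_stage": name, "completed": completed}
        completed.append(name)
    return {"ok": True, "completed": completed}
```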

Output Format:

🚀 Multi-Stage Deployment Workflow

Stage 1: Staging Deployment
  ✓ Pushed code to staging-01
  ✓ Build completed (2m 15s)
  ✓ Tests passed (145/145)

Stage 2: Validation
  ✓ Health check passed
  ✓ Database migration OK
  ✓ Smoke tests passed (12/12)

Stage 3: Production Deployment
  ✓ prod-web-01: Deployed & verified
  ✓ prod-web-02: Deployed & verified
  ✓ prod-web-03: Deployed & verified

Stage 4: Verification
  ✓ All health checks passed
  ✓ No errors in logs (5min window)

✅ Deployment completed successfully in 12m 45s

Available Scripts

scripts/sshsync_wrapper.py

Purpose: Python wrapper around sshsync CLI for programmatic access

Functions:

  • get_host_status(group=None): Get online/offline status of hosts
  • execute_on_all(command, timeout=10, dry_run=False): Run command on all hosts
  • execute_on_group(group, command, timeout=10, dry_run=False): Run on specific group
  • execute_on_host(host, command, timeout=10): Run on single host
  • push_to_hosts(local_path, remote_path, hosts=None, group=None, recurse=False, dry_run=False): Push files
  • pull_from_host(host, remote_path, local_path, recurse=False, dry_run=False): Pull files
  • list_hosts(with_status=True): List all configured hosts
  • get_groups(): Get all defined groups and their members
  • add_hosts_to_group(group, hosts): Add hosts to a group

Usage Example:

from sshsync_wrapper import execute_on_group, push_to_hosts

# Execute command
result = execute_on_group(
    group="web-servers",
    command="systemctl status nginx",
    timeout=15
)

# Push files
push_to_hosts(
    local_path="./dist",
    remote_path="/var/www/app",
    group="production",
    recurse=True
)

scripts/tailscale_manager.py

Purpose: Tailscale-specific operations and status management

Functions:

  • get_tailscale_status(): Get Tailscale network status (all peers)
  • check_connectivity(host): Ping host via Tailscale
  • get_peer_info(hostname): Get detailed info about peer
  • list_online_machines(): List all online Tailscale machines
  • get_machine_ip(hostname): Get Tailscale IP for machine
  • validate_tailscale_ssh(host): Check if Tailscale SSH is working

Usage Example:

from tailscale_manager import get_tailscale_status, check_connectivity

# Get network status
status = get_tailscale_status()
print(f"Online machines: {status['online_count']}")

# Check specific host
is_online = check_connectivity("homelab-1")

scripts/load_balancer.py

Purpose: Intelligent task distribution based on machine resources

Functions:

  • get_machine_load(host): Get CPU, memory, disk metrics
  • calculate_load_score(metrics): Calculate composite load score
  • select_optimal_host(candidates, prefer_group=None): Pick best host
  • get_group_capacity(): Get aggregate capacity of group
  • distribute_tasks(tasks, hosts): Distribute multiple tasks optimally

Usage Example:

from load_balancer import select_optimal_host
from sshsync_wrapper import execute_on_host

# Find best machine for task
best_host = select_optimal_host(
    candidates=["web-01", "web-02", "web-03"],
    prefer_group="production"
)

# Execute on selected host
execute_on_host(best_host, "npm run build")

scripts/workflow_executor.py

Purpose: Common multi-machine workflow automation

Functions:

  • deploy_workflow(code_path, staging_group, prod_group): Full deployment pipeline
  • backup_workflow(hosts, backup_paths, destination): Backup from multiple hosts
  • sync_workflow(source_host, target_group, paths): Sync from one to many
  • rolling_restart(group, service_name): Zero-downtime service restart
  • health_check_workflow(group, endpoint): Check health across group

Usage Example:

from workflow_executor import deploy_workflow, backup_workflow

# Deploy with testing
deploy_workflow(
    code_path="./dist",
    staging_group="staging",
    prod_group="production"
)

# Backup from all databases
backup_workflow(
    hosts=["db-01", "db-02"],
    backup_paths=["/var/lib/mysql"],
    destination="./backups"
)

scripts/utils/helpers.py

Purpose: Common utilities and formatting functions

Functions:

  • format_bytes(bytes): Human-readable byte formatting (1.2 GB)
  • format_duration(seconds): Human-readable duration (2m 15s)
  • parse_ssh_config(): Parse ~/.ssh/config for host details
  • parse_sshsync_config(): Parse sshsync group configuration
  • get_timestamp(): Get ISO timestamp for logging
  • safe_execute(func, *args, **kwargs): Execute with error handling
  • validate_path(path): Check if path exists and is accessible
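Plausible implementations of the two formatters, assuming 1024-based size units and a minutes-and-seconds breakdown:

```python
def format_bytes(n: float) -> str:
    """Human-readable size, e.g. 1536 -> '1.5 KB' (1024-based units)."""
    for unit in ("B", "KB", "MB", "GB"):
        if abs(n) < 1024:
            return f"{n:.1f} {unit}"
        n /= 1024
    return f"{n:.1f} TB"

def format_duration(seconds: int) -> str:
    """Human-readable duration, e.g. 135 -> '2m 15s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"
```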

scripts/utils/validators/parameter_validator.py

Purpose: Validate user inputs and parameters

Functions:

  • validate_host(host, valid_hosts=None): Validate host exists
  • validate_group(group, valid_groups=None): Validate group exists
  • validate_path_exists(path): Check local path exists
  • validate_timeout(timeout): Ensure timeout is reasonable
  • validate_command(command): Basic command safety validation
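A sketch of `validate_host` using `difflib` to suggest close matches on failure (the suggestion wording is an assumption, consistent with the error-handling guidance later in this document):

```python
import difflib

def validate_host(host, valid_hosts):
    """Return host if known; otherwise raise with a close-match suggestion."""
    if host in valid_hosts:
        return host
    close = difflib.get_close_matches(host, valid_hosts, n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    raise ValueError(f"Unknown host '{host}'.{hint}")
```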

scripts/utils/validators/host_validator.py

Purpose: Validate host configuration and availability

Functions:

  • validate_ssh_config(host): Check host has SSH config entry
  • validate_host_reachable(host, timeout=5): Check host is reachable
  • validate_group_members(group): Ensure group has valid members
  • get_invalid_hosts(hosts): Find hosts without valid config

scripts/utils/validators/connection_validator.py

Purpose: Validate SSH and Tailscale connections

Functions:

  • validate_ssh_connection(host): Test SSH connection works
  • validate_tailscale_connection(host): Test Tailscale connectivity
  • validate_ssh_key(host): Check SSH key authentication
  • get_connection_diagnostics(host): Comprehensive connection testing

Available Analyses

1. Host Availability Analysis

Function: analyze_host_availability(group=None)

Objective: Determine which machines are online and accessible

Inputs:

  • group (optional): Specific group to check (None = all hosts)

Outputs:

{
    'total_hosts': 10,
    'online_hosts': 8,
    'offline_hosts': 2,
    'availability_pct': 80.0,
    'by_group': {
        'production': {'online': 3, 'total': 3, 'pct': 100.0},
        'development': {'online': 2, 'total': 3, 'pct': 66.7},
        'homelab': {'online': 3, 'total': 4, 'pct': 75.0}
    },
    'offline_hosts_details': [
        {'host': 'dev-laptop', 'last_seen': '2h ago', 'groups': ['development']},
        {'host': 'homelab-4', 'last_seen': '1d ago', 'groups': ['homelab']}
    ]
}

Interpretation:

  • > 90%: Excellent availability
  • 70-90%: Good availability, monitor offline hosts
  • < 70%: Poor availability, investigate issues

2. Load Distribution Analysis

Function: analyze_load_distribution(group=None)

Objective: Understand resource usage across machines

Inputs:

  • group (optional): Specific group to analyze

Outputs:

{
    'hosts': [
        {
            'host': 'web-01',
            'cpu_pct': 45,
            'mem_pct': 60,
            'disk_pct': 40,
            'load_score': 0.49,
            'status': 'moderate'
        },
        # ... more hosts
    ],
    'aggregate': {
        'avg_cpu': 35,
        'avg_mem': 55,
        'avg_disk': 45,
        'total_capacity': 1200  # GB
    },
    'recommendations': [
        {
            'host': 'web-02',
            'issue': 'High CPU usage (85%)',
            'action': 'Consider migrating workloads'
        }
    ]
}

Load Status:

  • Low (score < 0.4): Good capacity for more work
  • Moderate (0.4-0.7): Normal operation
  • High (> 0.7): May need to offload work

3. File Sync Status Analysis

Function: analyze_sync_status(local_path, remote_path, group)

Objective: Compare local files with remote versions

Inputs:

  • local_path: Local directory to compare
  • remote_path: Remote directory path
  • group: Group to check

Outputs:

{
    'local_files': 145,
    'local_size': 12582912,  # bytes
    'hosts': [
        {
            'host': 'web-01',
            'status': 'in_sync',
            'files_match': 145,
            'files_different': 0,
            'missing_files': 0
        },
        {
            'host': 'web-02',
            'status': 'out_of_sync',
            'files_match': 140,
            'files_different': 3,
            'missing_files': 2,
            'details': ['config.json modified', 'index.html modified', ...]
        }
    ],
    'sync_percentage': 96.7,
    'recommended_action': 'Push to web-02'
}

4. Network Latency Analysis

Function: analyze_network_latency(hosts=None)

Objective: Measure connection latency to hosts

Inputs:

  • hosts (optional): Specific hosts to test (None = all)

Outputs:

{
    'hosts': [
        {'host': 'web-01', 'latency_ms': 15, 'status': 'excellent'},
        {'host': 'web-02', 'latency_ms': 45, 'status': 'good'},
        {'host': 'db-01', 'latency_ms': 150, 'status': 'fair'}
    ],
    'avg_latency': 70,
    'min_latency': 15,
    'max_latency': 150,
    'recommendations': [
        {'host': 'db-01', 'issue': 'High latency', 'action': 'Check network path'}
    ]
}

Latency Classification:

  • Excellent (< 50ms): Ideal for interactive tasks
  • Good (50-100ms): Suitable for most operations
  • Fair (100-200ms): May impact interactive workflows
  • Poor (> 200ms): Investigate network issues
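These thresholds translate directly into a classifier:

```python
def classify_latency(ms: float) -> str:
    """Bucket a round-trip time per the thresholds above."""
    if ms < 50:
        return "excellent"
    if ms < 100:
        return "good"
    if ms < 200:
        return "fair"
    return "poor"
```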

5. Comprehensive Infrastructure Report

Function: comprehensive_infrastructure_report(group=None)

Objective: One-stop function for complete infrastructure overview

Inputs:

  • group (optional): Limit to specific group (None = all)

Outputs:

{
    'report_timestamp': '2025-10-19T19:43:41Z',
    'group': 'production',  # or 'all'
    'metrics': {
        'availability': {...},  # from analyze_host_availability
        'load_distribution': {...},  # from analyze_load_distribution
        'network_latency': {...},  # from analyze_network_latency
        'tailscale_status': {...}  # from Tailscale integration
    },
    'summary': "Production infrastructure: 3/3 hosts online, avg load 45%, network latency 35ms",
    'alerts': [
        "⚠ web-02: High CPU usage (85%)",
        "⚠ db-01: Elevated latency (150ms)"
    ],
    'recommendations': [
        "Consider rebalancing workload from web-02",
        "Investigate network path to db-01"
    ],
    'overall_health': 'good'  # excellent | good | fair | poor
}

Overall Health Classification:

  • Excellent: All metrics green, no alerts
  • Good: Most metrics healthy, minor alerts
  • Fair: Some concerning metrics, action recommended
  • Poor: Critical issues, immediate action required

Error Handling

Connection Errors

Error: Cannot connect to host

Causes:

  • Host is offline
  • Tailscale not connected
  • SSH key missing/invalid
  • Firewall blocking connection

Handling:

try:
    execute_on_host("web-01", "ls")
except ConnectionError as e:
    # Try Tailscale ping first
    if not check_connectivity("web-01"):
        return {
            'error': 'Host unreachable',
            'suggestion': 'Check Tailscale connection',
            'diagnostics': get_connection_diagnostics("web-01")
        }
    # Then check SSH
    if not validate_ssh_connection("web-01"):
        return {
            'error': 'SSH authentication failed',
            'suggestion': 'Check SSH keys: ssh-add -l'
        }

Timeout Errors

Error: Operation timed out

Causes:

  • Command taking too long
  • Network latency
  • Host overloaded

Handling:

  • Automatic retry with exponential backoff (3 attempts)
  • Increase timeout for known slow operations
  • Fall back to alternative host if available
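The retry policy can be sketched as follows (`base_delay` and the injectable `sleep` are illustrative choices; injection keeps the policy testable without real waits):

```python
import time

def with_retry(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn, retrying on failure with exponential backoff (1s, 2s, ...).
    Re-raises the last error after the final attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```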

File Transfer Errors

Error: File sync failed

Causes:

  • Insufficient disk space
  • Permission denied
  • Path doesn't exist

Handling:

  • Pre-check disk space on target
  • Validate permissions before transfer
  • Create directories if needed
  • Partial transfer recovery

Validation Errors

Error: Invalid parameter

Examples:

  • Unknown host
  • Non-existent group
  • Invalid path

Handling:

  • Validate all inputs before execution
  • Provide suggestions for similar valid options
  • Clear error messages with corrective actions

Mandatory Validations

Before Any Operation

  1. Parameter Validation:

    host = validate_host(host, valid_hosts=get_all_hosts())
    group = validate_group(group, valid_groups=get_groups())
    timeout = validate_timeout(timeout)
    
  2. Connection Validation:

    if not validate_host_reachable(host, timeout=5):
        raise ConnectionError(f"Host {host} is not reachable")
    
  3. Path Validation (for file operations):

    if not validate_path_exists(local_path):
        raise ValueError(f"Path does not exist: {local_path}")
    

During Operation

  1. Timeout Monitoring: Every operation has configurable timeout
  2. Progress Tracking: Long operations show progress
  3. Error Isolation: Failure on one host doesn't stop others

After Operation

  1. Result Validation:

    report = validate_operation_result(result)
    if report.has_critical_issues():
        raise OperationError(report.get_summary())
    
  2. State Verification: Confirm operation succeeded

  3. Logging: Record all operations for audit trail

Performance and Caching

Caching Strategy

Host Status Cache:

  • TTL: 60 seconds
  • Why: Host status doesn't change rapidly
  • Invalidation: Manual invalidation when connectivity changes

Load Metrics Cache:

  • TTL: 30 seconds
  • Why: Load changes frequently
  • Invalidation: Automatic on timeout

Group Configuration Cache:

  • TTL: 5 minutes
  • Why: Group membership rarely changes
  • Invalidation: Manual when groups modified
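All three caches fit a single TTL pattern; a minimal sketch with an injectable clock (the class name and API are illustrative):

```python
import time

class TTLCache:
    """Tiny time-based cache: entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self.ttl, self.clock = ttl, clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry and entry[0] > self.clock():
            return entry[1]
        self._store.pop(key, None)  # drop expired entry, if any
        return default

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def invalidate(self, key=None):
        """Manual invalidation (e.g. when connectivity or groups change)."""
        if key is None:
            self._store.clear()
        else:
            self._store.pop(key, None)
```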

Performance Optimizations

  1. Parallel Execution:

    • Commands execute concurrently across hosts
    • ThreadPoolExecutor with max 10 workers
    • Prevents sequential bottleneck
  2. Connection Pooling:

    • Reuse SSH connections when possible
    • ControlMaster in SSH config
  3. Lazy Loading:

    • Only fetch data when needed
    • Don't load all host status unless required
  4. Progressive Results:

    • Stream results as they complete
    • Don't wait for the slowest host

Automatic Detection Keywords

This skill automatically activates when you mention:

Hosts & Machines:

  • remote machine, remote host, remote server
  • Tailscale machine, Tailscale host
  • SSH host, SSH server
  • homelab, servers, cluster

Tools:

  • sshsync, ssh sync
  • Tailscale, tailnet
  • SSH, secure shell

Operations:

  • distribute workload, load balance
  • file sharing, file sync, file transfer
  • push files, pull files, sync files
  • remote execution, execute remotely
  • deploy to, deploy across
  • backup from, sync to

Groups:

  • host group, machine group, server group
  • production servers, development machines
  • web servers, database servers
  • staging environment, production environment

Actions:

  • check status, check connectivity
  • run on all, execute on group
  • sync across, distribute to
  • deploy to, push to, pull from

Monitoring:

  • which machines are online
  • host availability, server status
  • machine health, connectivity check
  • network status, latency check

Workflows:

  • multi-stage deployment
  • rolling restart, zero-downtime
  • backup workflow, sync workflow
  • distributed task, parallel execution

Usage Examples

Example 1: Check Host Status

User: "Which of my Tailscale machines are online?"

Agent Response:

Checking Tailscale network status...

🟢 homelab-1 (homelab) - Online - Tailscale: Connected
  IP: 100.64.1.10 | Latency: 15ms

🟢 homelab-2 (homelab) - Online - Tailscale: Connected
  IP: 100.64.1.11 | Latency: 18ms

🔴 dev-laptop (development) - Offline
  Last seen: 2 hours ago | Tailscale: Not connected

🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
  IP: 100.64.1.20 | Latency: 25ms

🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected
  IP: 100.64.1.30 | Latency: 30ms

Summary: 4/5 hosts online (80%)
Average latency: 22ms

Example 2: Execute on Group

User: "Check disk space on all web servers"

Agent Response:

Executing on group 'web-servers': df -h /var/www

🔧 web-01:
  Filesystem: /dev/sda1
  Size: 100G | Used: 45G | Available: 50G (45% used)
  Status: ✓ Healthy

🔧 web-02:
  Filesystem: /dev/sda1
  Size: 100G | Used: 67G | Available: 28G (67% used)
  Status: ⚠ Warning - Above 60%

🔧 web-03:
  Filesystem: /dev/sda1
  Size: 100G | Used: 52G | Available: 43G (52% used)
  Status: ✓ Healthy

⚠️ Alert: web-02 is above 60% disk usage
Recommendation: Clear old logs or increase disk space

Example 3: Load-Balanced Execution

User: "Run this npm build on the machine with most available resources"

Agent Response:

Analyzing resource availability across 5 machines...

Load Analysis:
  web-01: CPU 45%, Mem 60%, Disk 40% → Score: 0.49 (moderate)
  web-02: CPU 85%, Mem 70%, Disk 65% → Score: 0.75 (high load)
  web-03: CPU 20%, Mem 35%, Disk 30% → Score: 0.28 (low load) ✓

✓ Selected: web-03
  Reason: Lowest load score (0.28)
  Available: CPU 80%, Memory 65%, Disk 70%

Executing on web-03: npm run build

[Build output...]
  ✓ Build completed successfully
  Duration: 2m 15s
  Output: ./dist (145 files, 12.3 MB)

Final load: CPU 65%, Mem 50% (still below average)

Example 4: File Sync

User: "Push my app directory to all production servers"

Agent Response:

📤 Syncing: ~/projects/myapp → /var/www/myapp
Target group: production (3 hosts)

Analyzing source...
  145 files, 12.3 MB total

Preview (dry-run):
  prod-web-01: 145 files → /var/www/myapp
  prod-web-02: 145 files → /var/www/myapp
  prod-db-01: Skipped (database host; app files not deployed here)

Checking target capacity...
  ✓ prod-web-01: 50 GB available
  ✓ prod-web-02: 28 GB available

Executing push...

✓ prod-web-01: Synced 145 files in 8.2s
  Modified: 12 files
  Added: 3 files

✓ prod-web-02: Synced 145 files in 6.5s
  Modified: 12 files
  Added: 3 files

Summary: 2/2 successful (290 files, 24.6 MB total)

Example 5: Multi-Stage Deployment

User: "Deploy this to staging, run tests, then push to production"

Agent Response:

🚀 Multi-Stage Deployment Workflow

Stage 1: Staging Deployment
  📤 Pushing to staging-01...
    ✓ Code synced (145 files, 12.3 MB) - 8s

  🔨 Running build...
    $ npm run build
    ✓ Build completed - 2m 15s

  🧪 Running tests...
    $ npm test
    ✓ All tests passed (145/145) - 45s

Stage 2: Staging Validation
  ✓ Health endpoint: 200 OK
  ✓ Database migrations: Up to date
  ✓ Smoke tests: 12/12 passed

Stage 3: Production Deployment
  Rolling deployment to 3 hosts...

  ⏳ prod-web-01:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 10s

  ⏳ prod-web-02:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 5s

  ⏳ prod-web-03:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 12s

Stage 4: Production Verification
  ✓ All health checks passed (3/3)
  ✓ No errors in logs (5min window)
  ✓ Response times normal (avg 45ms)

✅ Deployment completed successfully!
  Total duration: 12m 45s
  Hosts updated: 3
  Zero downtime: ✓

Example 6: Comprehensive Infrastructure Report

User: "Give me a complete status report of my infrastructure"

Agent Response:

📊 Infrastructure Report
Generated: 2025-10-19 19:43:41

=== AVAILABILITY ===
Total Hosts: 10
Online: 8 (80%)
Offline: 2

By Group:
  production: 3/3 (100%) ✓
  development: 2/3 (66.7%) ⚠
  homelab: 3/4 (75%)

Offline Hosts:
  • dev-laptop (development) - Last seen: 2h ago
  • homelab-4 (homelab) - Last seen: 1d ago

=== LOAD DISTRIBUTION ===
Average Load: 45% CPU, 55% Memory, 45% Disk

Top 3 Loaded Hosts:
  1. web-02: 85% CPU, 70% Mem, 65% Disk (Score: 0.75) ⚠
  2. db-01: 60% CPU, 75% Mem, 55% Disk (Score: 0.65)
  3. web-01: 45% CPU, 60% Mem, 40% Disk (Score: 0.49)

Top 3 Available Hosts:
  1. web-03: 20% CPU, 35% Mem, 30% Disk (Score: 0.28) ✓
  2. homelab-1: 25% CPU, 40% Mem, 35% Disk (Score: 0.33)
  3. homelab-2: 30% CPU, 45% Mem, 40% Disk (Score: 0.38)

=== NETWORK LATENCY ===
Average: 35ms
Range: 15ms - 150ms

Excellent (< 50ms): 6 hosts
Good (50-100ms): 1 host
Fair (100-200ms): 1 host (db-01: 150ms) ⚠

=== TAILSCALE STATUS ===
Network: Connected
Peers Online: 8/10
Exit Node: None
MagicDNS: Enabled

=== ALERTS ===
⚠ web-02: High CPU usage (85%) - Consider load balancing
⚠ db-01: Elevated latency (150ms) - Check network path
⚠ dev-laptop: Offline for 2 hours - May need attention

=== RECOMMENDATIONS ===
1. Rebalance workload from web-02 to web-03
2. Investigate network latency to db-01
3. Check status of dev-laptop and homelab-4
4. Consider scheduling maintenance for web-02

Overall Health: GOOD ✓

Installation

See INSTALLATION.md for detailed setup instructions.

Quick start:

# 1. Install sshsync
pip install sshsync

# 2. Configure SSH hosts
vim ~/.ssh/config

# 3. Sync host groups
sshsync sync

# 4. Install agent
/plugin marketplace add ./tailscale-sshsync-agent

# 5. Test
"Which of my machines are online?"

Version

Current version: 1.0.0

See CHANGELOG.md for release history.

Architecture Decisions

See DECISIONS.md for detailed rationale behind tool selection, architecture choices, and trade-offs considered.