---
name: executing-remote-plans
description: Distributed HEC-RAS execution across remote workers (PsExec, Docker, SSH, cloud). Handles worker initialization, queue scheduling, and result aggregation. Use when setting up remote execution, distributed computation, cloud workflows, scaling HEC-RAS across machines, parallel processing on multiple computers, Windows remote execution, container-based execution, session-based remote execution, PsExec configuration, Docker worker setup, or multi-machine HEC-RAS workflows. Triggers: remote execution, distributed execution, PsExec, Docker worker, SSH execution, cloud execution, parallel remote, multi-machine, session_id, remote worker, worker initialization, queue scheduling, network share, container execution.
---
# Executing Remote Plans

Execute HEC-RAS plans across multiple remote machines using distributed workers.
**PRIMARY SOURCES:** This skill is a lightweight navigator. For complete details, see:

- `ras_commander/remote/AGENTS.md` - Coding conventions, architecture
- `examples/23_remote_execution_psexec.ipynb` - Complete PsExec workflow
- `docs_old/feature_dev_notes/RasRemote/REMOTE_WORKER_SETUP_GUIDE.md` - Machine setup
## Quick Start

```python
from ras_commander import init_ras_project, init_ras_worker, compute_parallel_remote

# Initialize project
init_ras_project("/path/to/project", "6.6")

# Create PsExec worker (Windows remote)
worker = init_ras_worker(
    "psexec",
    hostname="192.168.1.100",
    share_path=r"\\192.168.1.100\RasRemote",
    session_id=2,  # CRITICAL: Query with "query session /server:hostname"
    cores_total=16,
    cores_per_plan=4
)

# Execute plans remotely
results = compute_parallel_remote(
    plan_numbers=["01", "02", "03"],
    workers=[worker],
    num_cores=4
)

# Check results
for plan_num, result in results.items():
    if result.success:
        print(f"Plan {plan_num}: SUCCESS ({result.execution_time:.1f}s)")
        print(f"  HDF: {result.hdf_path}")
    else:
        print(f"Plan {plan_num}: FAILED - {result.error_message}")
```
## CRITICAL: Session-Based Execution

HEC-RAS is a GUI application and REQUIRES session-based execution:

```python
worker = init_ras_worker(
    "psexec",
    hostname="192.168.1.100",
    share_path=r"\\192.168.1.100\RasRemote",
    session_id=2,  # CRITICAL: NOT the system account
    ...
)
```

NEVER use `system_account=True` - HEC-RAS will hang without a desktop session.
### Determining Session ID

Query from the controlling machine:

```shell
query session /server:192.168.1.100

# Output:
#  SESSIONNAME       USERNAME       ID  STATE
#  console           Administrator   2  Active
#                                    ^
#                                    Use this value
```
Typical Values:
- Session 0: SYSTEM (services only, NO DESKTOP)
- Session 1: Sometimes system (varies by Windows version)
- Session 2: Typical interactive user (MOST COMMON)
- Session 3+: Additional RDP sessions
See ras_commander/remote/AGENTS.md lines 97-101 for critical implementation notes.
## Worker Types

### Local Worker

Parallel execution on the local machine:

```python
worker = init_ras_worker(
    "local",
    worker_folder="C:/RasRemote",
    cores_total=8,
    cores_per_plan=2
)
```
### PsExec Worker

Windows remote execution via a network share:

```python
worker = init_ras_worker(
    "psexec",
    hostname="192.168.1.100",
    share_path=r"\\192.168.1.100\RasRemote",  # UNC path from controlling machine
    worker_folder=r"C:\RasRemote",            # Local path on remote machine
    session_id=2,                             # CRITICAL: Query with "query session"
    cores_total=16,
    cores_per_plan=4
)
```
**Setup Requirements** (see REMOTE_WORKER_SETUP_GUIDE.md):

- Network share created (`C:\RasRemote` shared as `\\hostname\RasRemote`)
- Group Policy configuration (network access, local logon, batch job rights)
- Registry key: `LocalAccountTokenFilterPolicy=1`
- Remote Registry service running
- User in Administrators group
### Docker Worker

Container execution (local or remote):

```python
# Local Docker
worker = init_ras_worker(
    "docker",
    docker_image="hecras:6.6",
    cores_total=8,
    cores_per_plan=4,
    preprocess_on_host=True  # Windows preprocessing, Linux execution
)

# Remote Docker via SSH
worker = init_ras_worker(
    "docker",
    docker_image="hecras:6.6",
    docker_host="ssh://user@192.168.1.100",
    ssh_key_path="~/.ssh/docker_worker",
    share_path=r"\\192.168.1.100\DockerShare",
    remote_staging_path=r"C:\DockerShare",
    cores_total=8,
    cores_per_plan=4
)
```
**Docker Prerequisites:**

- Docker Desktop installed (local) or Docker daemon running on the remote host
- Python packages: `pip install docker paramiko`
- HEC-RAS Docker image built (see the `ras-commander-cloud` repo)
- SSH key configured for remote execution
## Distributed Execution

```python
results = compute_parallel_remote(
    plan_numbers=["01", "02", "03"],  # Plans to execute
    workers=[worker1, worker2],       # List of initialized workers
    num_cores=4,                      # Cores per plan execution
    clear_geompre=False,              # Clear geometry preprocessor files
    max_concurrent=None,              # Max simultaneous executions (default: all slots)
    autoclean=True                    # Delete temp folders after execution
)
```
Returns `Dict[str, ExecutionResult]`, keyed by plan number. Each `ExecutionResult` has:

- `plan_number` - Plan that was executed
- `worker_id` - Worker that executed the plan
- `success` - True if successful
- `hdf_path` - Path to the output HDF file
- `error_message` - Error message if failed
- `execution_time` - Execution time in seconds
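For large batches it helps to summarize results before digging into individual failures. A sketch using a stand-in dataclass that mirrors the fields above (the real `ExecutionResult` comes from ras_commander; `summarize` is an illustrative helper, not a library function):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExecutionResult:  # stand-in mirroring the fields listed above
    plan_number: str
    worker_id: str
    success: bool
    hdf_path: Optional[str] = None
    error_message: Optional[str] = None
    execution_time: float = 0.0

def summarize(results: dict) -> dict:
    """Aggregate a plan->result dict into pass/fail counts and total runtime."""
    ok = [r for r in results.values() if r.success]
    failed = {p: r.error_message for p, r in results.items() if not r.success}
    return {
        "succeeded": len(ok),
        "failed": failed,
        "total_time_s": sum(r.execution_time for r in results.values()),
    }

results = {
    "01": ExecutionResult("01", "local-0", True, "plan01.hdf", None, 120.5),
    "02": ExecutionResult("02", "psexec-1", False, None, "HEC-RAS hung", 30.0),
}
summary = summarize(results)
print(summary["succeeded"], summary["failed"])  # 1 {'02': 'HEC-RAS hung'}
```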
## Queue-Aware Scheduling

Workers execute in priority order (lower `queue_priority` first):

```python
# Local workers execute first (priority 0)
local = init_ras_worker("local", queue_priority=0, cores_total=8, cores_per_plan=2)

# Remote workers are used when local slots are full (priority 1)
remote = init_ras_worker("psexec", hostname="...", queue_priority=1, ...)

# Cloud workers for overflow (priority 2)
cloud = init_ras_worker("docker", docker_host="...", queue_priority=2, ...)

# Plans fill local slots first, then remote, then cloud
results = compute_parallel_remote(
    plan_numbers=["01", "02", "03", "04", "05", "06"],
    workers=[local, remote, cloud]
)
```
**Worker Slots:** Workers with `max_parallel_plans > 1` create multiple slots:

- A worker with `cores_total=16` and `cores_per_plan=4` → 4 parallel slots
- Each slot can run one plan at a time
- Total slots = sum of all workers' `max_parallel_plans`
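The slot arithmetic above can be modeled in a few lines: expand each worker into `cores_total // cores_per_plan` slots, then fill slots in ascending `queue_priority`. This is an illustrative model of the scheduling behavior, not the library's actual implementation (the dict-shaped workers are hypothetical):

```python
def worker_slots(cores_total: int, cores_per_plan: int) -> int:
    """Number of plans a worker can run concurrently."""
    return cores_total // cores_per_plan

def assign_plans(plans, workers):
    """Greedy assignment: fill lower queue_priority workers first.

    Each worker is a dict with 'id', 'cores_total', 'cores_per_plan',
    and 'queue_priority' keys (hypothetical shape for this sketch).
    """
    slots = []
    for w in sorted(workers, key=lambda w: w["queue_priority"]):
        slots.extend([w["id"]] * worker_slots(w["cores_total"], w["cores_per_plan"]))
    # Pair plans with slots, wrapping round-robin over the ordered slot list
    return {plan: slots[i % len(slots)] for i, plan in enumerate(plans)}

workers = [
    {"id": "local", "cores_total": 8, "cores_per_plan": 2, "queue_priority": 0},
    {"id": "remote", "cores_total": 16, "cores_per_plan": 4, "queue_priority": 1},
]
assignment = assign_plans(["01", "02", "03", "04", "05"], workers)
print(worker_slots(16, 4))  # 4
print(assignment)           # plans 01-04 -> local, 05 -> remote
```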
## Mixed Worker Pools

Combine local, remote, and cloud workers:

```python
# Local worker (priority 0)
local = init_ras_worker(
    "local",
    worker_folder="C:/RasRemote",
    cores_total=8,
    cores_per_plan=2,
    queue_priority=0
)

# Remote PsExec worker (priority 1)
remote = init_ras_worker(
    "psexec",
    hostname="192.168.1.100",
    share_path=r"\\192.168.1.100\RasRemote",
    session_id=2,
    cores_total=16,
    cores_per_plan=4,
    queue_priority=1
)

# Docker worker (priority 2)
docker = init_ras_worker(
    "docker",
    docker_image="hecras:6.6",
    docker_host="ssh://user@192.168.1.200",
    ssh_key_path="~/.ssh/docker",
    cores_total=8,
    cores_per_plan=4,
    queue_priority=2
)

# Execute with queue-aware scheduling
results = compute_parallel_remote(
    plan_numbers=["01", "02", "03", "04", "05", "06", "07", "08"],
    workers=[local, remote, docker]
)
```
## Result Processing

```python
results = compute_parallel_remote(plan_numbers=["01", "02"], workers=[worker])

for plan_num, result in results.items():
    print(f"\nPlan {plan_num}:")
    print(f"  Worker: {result.worker_id}")
    print(f"  Success: {result.success}")
    print(f"  Time: {result.execution_time:.1f}s")

    if result.success:
        print(f"  HDF: {result.hdf_path}")

        # Verify HDF compute messages
        from ras_commander import HdfResultsPlan
        msgs = HdfResultsPlan.get_compute_messages(result.hdf_path)
        if "completed successfully" in msgs.lower():
            print("  Status: Verified successful")
    else:
        print(f"  Error: {result.error_message}")
```
## Common Issues

### PsExec Worker Hangs

**Symptom:** No error, HDF not created

**Diagnosis:**

```shell
# Check session ID
query session /server:192.168.1.100

# Verify the user is an Administrator
net localgroup Administrators

# Check the Remote Registry service
sc query RemoteRegistry
```

**Fix:** Ensure `session_id=2` (or the correct session for your machine) and that all configuration requirements are met. See REMOTE_WORKER_SETUP_GUIDE.md for complete setup instructions.
### Permission Denied

**Symptom:** "Access is denied"

**Fix:** Check:

- Registry key: `LocalAccountTokenFilterPolicy=1`
- Group Policy settings (network access, local logon, batch job)
- Share permissions (Share + NTFS)
### Network Path Not Found

**Symptom:** Cannot access `\\hostname\share`

**Diagnosis:**

```shell
# Test from the controlling machine
dir \\192.168.1.100\RasRemote

# Check firewall (port 445, SMB)
Test-NetConnection -ComputerName 192.168.1.100 -Port 445

# Verify Remote Registry is running
sc \\192.168.1.100 query RemoteRegistry
```
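The same SMB reachability check can be scripted from Python before initializing a worker, using only the standard library. A minimal sketch (the host value is an example; `port_open` is an illustrative helper, not a ras-commander function):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: verify SMB (port 445) on a remote worker before creating it
# if not port_open("192.168.1.100", 445):
#     raise RuntimeError("SMB share unreachable - check firewall / port 445")
```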
### Docker Connection Failed

**Symptom:** Cannot connect to the Docker daemon

**Diagnosis:**

```shell
# Test Docker locally
docker ps

# Test remote Docker via SSH
ssh user@192.168.1.100 "docker info"
```

**Fix:** Ensure Docker Desktop is running (local) or SSH keys are configured (remote).
## Architecture Overview

### Module Structure

See `ras_commander/remote/AGENTS.md` for the complete module structure and coding conventions.

```
ras_commander/remote/
├── __init__.py       # Exports all public classes and functions
├── RasWorker.py      # RasWorker base dataclass + init_ras_worker()
├── PsexecWorker.py   # PsexecWorker (IMPLEMENTED)
├── LocalWorker.py    # LocalWorker (IMPLEMENTED)
├── DockerWorker.py   # DockerWorker (IMPLEMENTED, requires docker+paramiko)
├── SshWorker.py      # SshWorker (stub, requires paramiko)
├── WinrmWorker.py    # WinrmWorker (stub, requires pywinrm)
├── SlurmWorker.py    # SlurmWorker (stub)
├── AwsEc2Worker.py   # AwsEc2Worker (stub, requires boto3)
├── AzureFrWorker.py  # AzureFrWorker (stub, requires azure-*)
├── Execution.py      # compute_parallel_remote() + helpers
└── Utils.py          # Shared utilities
```
### Factory Pattern

```python
# Top-level factory function (recommended)
worker = init_ras_worker(worker_type, **config)

# Lazily imports and routes to worker-specific functions
# See RasWorker.py lines 84-92 for implementation
```
### Lazy Loading

Workers with optional dependencies implement `check_*_dependencies()`:

- Docker: `pip install docker paramiko`
- SSH: `pip install paramiko`
- AWS: `pip install boto3`

See `ras_commander/remote/AGENTS.md` lines 58-68 for the lazy loading pattern.
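A dependency check in this style can be written with `importlib.util.find_spec`, which detects whether a package is installed without importing it. A minimal sketch (the function body is illustrative, not the library's code):

```python
import importlib.util

def check_docker_dependencies() -> None:
    """Raise a helpful error if optional Docker worker deps are missing."""
    missing = [
        pkg for pkg in ("docker", "paramiko")
        if importlib.util.find_spec(pkg) is None
    ]
    if missing:
        raise ImportError(
            f"Docker worker requires {', '.join(missing)}. "
            f"Install with: pip install {' '.join(missing)}"
        )
```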
## Primary Sources

- `ras_commander/remote/AGENTS.md` (156 lines)
  - Module structure and naming conventions
  - Import patterns and coding standards
  - Critical implementation notes (session_id, UNC paths, etc.)
  - Worker implementation pattern
  - Adding new workers
- `examples/23_remote_execution_psexec.ipynb`
  - Complete PsExec workflow
  - Session ID determination
  - Network share setup
  - Error handling and debugging
- `docs_old/feature_dev_notes/RasRemote/REMOTE_WORKER_SETUP_GUIDE.md`
  - Step-by-step remote machine setup
  - Group Policy configuration
  - Registry key setup
  - Troubleshooting common issues
  - Network share creation
## See Also

- Skill: `executing-plans` - Local plan execution
- Skill: `processing-hdf-results` - Results analysis
- CLAUDE.md: Remote Execution Subpackage section
- Documentation: ras-commander.readthedocs.io (remote execution)