| name | n8n-automation |
| description | n8n workflow automation for building analytics including SkySpark multi-agent systems, FastAPI tool servers, workflow orchestration, and automated building system alert triage |
n8n Automation Skill
This skill provides expertise in building automation workflows using n8n, specifically focused on SkySpark integration and multi-agent systems for building analytics.
Core Capabilities
SkySpark Multi-Agent Architecture
- Gatekeeper agent with specialist routing
- Automated spark triage and prioritization
- HVAC specialist for mechanical system analysis
- Energy calculation agent for savings estimates
- Automated report generation workflows
n8n Workflow Development
- REST API integrations with building systems
- Multi-agent coordination patterns
- Error handling and logging strategies
- Scheduled automation and notifications
SkySpark Integration
- Haystack REST API connectivity
- ZINC and JSON data format handling
- Spark and trend data extraction
- Authentication and security patterns
Available Scripts
- scripts/fastapi_tool_server.py: Template for FastAPI tool servers
- scripts/n8n_workflow_templates/: Common workflow patterns
- scripts/skyspark_mock_data.py: Generate test data for development
Reference Materials
-
eferences/multi_agent_patterns.md: Proven multi-agent architectures
eferences/skyspark_api.md: SkySpark Haystack API documentation
eferences/n8n_best_practices.md: Workflow development guidelines
eferences/roadmap.md: 5-phase project roadmap (Foundation → Production)
Asset Templates
- ssets/workflow_exports/: n8n workflow export files
- ssets/api_schemas/: OpenAPI schemas for tool servers
- ssets/mock_data/: Sample SkySpark data for testing
Current Project: SkySpark Multi-Agent System
5-Phase Roadmap
- Foundation: FastAPI tool server + basic agent interaction
- Mock Integration: Agent logic with simulated SkySpark data
- Real SkySpark: Live Haystack API connection
- Multi-Agent: Full specialist coordination
- Production: Error handling, scheduling, notifications
Multi-Agent Pattern
User Request → Gatekeeper Agent (routes by intent) ↓ ┌───────────┼───────────┬──────────────┐ ↓ ↓ ↓ ↓ Triage HVAC Energy Report Agent Specialist Calc Agent Generator
Agent Architecture for Future RL Orchestration
Stateless Agent Design Principles
All SkySpark agents should be designed with future dynamic orchestration in mind:
Statelessness Requirements:
- Agents receive complete diagnostic context as input (no persistent memory)
- All necessary context passed in, new state returned out
- Enables any agent to follow any agent (maximum routing flexibility)
- Simplifies debugging and testing
Standardized I/O Schema:
Input: DiagnosticState object containing:
alert_context: Current alert/issue details (Haystack tags, spark data)prior_actions: Array of prior agent outputs/reasoningavailable_tools: List of tool endpoints and capabilitiestoken_budget_remaining: Tokens available for continued diagnosisconfidence_score: Current diagnostic confidence (0.0-1.0)metadata: Alert severity, facility info, timestamps
Output: AgentResponse object containing:
updated_state: Refreshed DiagnosticState with new findingsagent_findings: This agent's analysis, calculations, or actionsrecommended_next_agents: Suggested routing (even if not used in Phase 0-1)token_cost: Tokens consumed by this agent callconfidence_delta: Change in diagnostic confidence (+/- 0.0-1.0)actions_taken: List of tools called or queries executed
Role-Based Architecture: Separate agent identity (role prompts) from action logic:
Tool-Use Agents:
SkySpark_Query_Agent: Execute Axon queries against SkySpark databasePython_Analysis_Agent: Statistical analysis, calculations, trendingFile_Reader_Agent: Parse equipment specs, CSV logs, JSON configsWeather_Data_Agent: Fetch external weather data via APIDocumentation_Agent: Search equipment manuals, ASHRAE standards
Reasoning Agents:
Diagnostic_Planner_Agent: Decompose complex issues into sub-diagnosticsRoot_Cause_Analyzer_Agent: Synthesize data to identify probable causesValidation_Agent: Verify diagnostic logic and calculationsReflection_Agent: Assess diagnostic trajectory, propose refinementsSummary_Agent: Generate concise diagnostic reportsResolution_Agent: Recommend corrective actionsModification_Agent: Correct errors in prior reasoning
Orchestration Evolution Roadmap
Phase 0-1: Static n8n Workflows (Current)
- Hardcoded agent routing via n8n nodes is acceptable
- BUT design workflows to call centralized routing functions
- Log every routing decision with full context
- Focus: Get agents working with standardized I/O
Phase 2-3: Rule-Based Orchestrator
- Implement
agent_routerfunction (Python/FastAPI endpoint) - Route based on: alert type/severity, confidence score, prior sequence
- Manually tune rules based on logged diagnostic patterns
- A/B test against static workflows
- Focus: Data-driven routing without ML complexity
Phase 4-5: RL-Based Orchestrator
- Replace rule-based router with learned policy
- Train on logged diagnostic trajectories (minimum 200 episodes)
- Optimize for:
R = accuracy - λ × token_cost - β × time_penalty - Enable online learning from production outcomes
- Focus: Continuous optimization and adaptation
Instrumentation & Logging Requirements
Critical: Start logging from Day 1, even with static workflows
Log every diagnostic episode with schema:
{
"episode_id": "alert_123456_2025-12-02",
"facility_name": "Building XYZ",
"alert_context": {
"alert_type": "High AHU static pressure",
"severity": "warning",
"haystack_refs": [...],
"triggered_at": "2025-12-02T10:15:00Z"
},
"agent_sequence": [
{
"timestamp": "2025-12-02T10:30:00Z",
"agent_type": "SkySpark_Query_Agent",
"input_state": {...},
"output_state": {...},
"tokens_used": 1500,
"duration_ms": 3400,
"tools_called": ["read_trends", "eval_axon"]
},
...
],
"final_outcome": {
"diagnostic_correct": true,
"root_cause_identified": "Filter 80% loaded",
"resolution_actions": ["Schedule filter replacement"],
"total_tokens": 8500,
"total_duration_ms": 15200,
"human_intervention_required": false,
"human_feedback": "Diagnosis accurate, action appropriate"
}
}
Storage Strategy:
- Store in n8n database OR dedicated TimescaleDB
- Retain minimum 200 episodes per alert type for RL training
- Enable easy export for offline analysis and model training
- Version control state schemas alongside agents
What to Log:
- Complete agent call sequence per diagnostic episode
- Full input/output states at each step
- Token consumption per agent (measure via API or estimate)
- Human validation/corrections (critical for reward labels)
- Final diagnostic outcome and resolution effectiveness
- Time to resolution relative to SLA thresholds
Reward Function Design
Preliminary specification for future RL training:
R = accuracy_score - λ × token_cost - β × time_penalty
Where:
accuracy_score ∈ [0, 1]- Validated against human review OR resolution outcometoken_cost = tokens_used / max_token_budget- Normalized efficiency metrictime_penalty = max(0, (time_to_resolve / sla_threshold) - 1)- Timeliness factorλ = 0.1- Efficiency weight (tunable based on production costs)β = 0.05- Timeliness weight (varies by alert severity)
Accuracy Validation:
- Binary: Did diagnostic identify correct root cause? (0 or 1)
- Graduated: Partial credit for narrowing to subsystem (0.0-1.0 scale)
- Ground truth: FM confirmation or system behavior post-resolution
Efficiency Considerations:
- Critical alerts: Prioritize accuracy over token cost (λ → 0)
- Informational alerts: Prioritize efficiency (λ → 0.2)
- Balance prevents both over-analysis and premature conclusions
Timeliness Thresholds:
- Critical: 15 min SLA → high β penalty if exceeded
- Warning: 60 min SLA → moderate β penalty
- Info: 4 hour SLA → minimal β penalty
SkySpark Agent Taxonomy
Agents aligned with Puppeteer research categories:
Tool-Use Agents (External data access):
- SkySpark_Query_Agent - Axon query execution
- Python_Analysis_Agent - Statistical calculations, trending
- File_Reader_Agent - Equipment specs, logs, configs
- Weather_Data_Agent - External weather APIs
- Documentation_Agent - Manuals, standards, knowledge base
Reasoning Agents (Internal cognition):
- Diagnostic_Planner_Agent - Task decomposition
- Root_Cause_Analyzer_Agent - Data synthesis
- Validation_Agent - Logic verification
- Reflection_Agent - Meta-reasoning
- Summary_Agent - Report generation
- Resolution_Agent - Action recommendation
- Modification_Agent - Error correction
Critical Don'ts
Avoid these pitfalls:
- ❌ Don't jump to RL before having logged data (minimum 200 episodes)
- ❌ Don't hardcode agent sequences into prompts (keep routing external)
- ❌ Don't skip instrumentation "for now" (you'll never add it later)
- ❌ Don't design agents that assume specific predecessors
- ❌ Don't sacrifice debuggability for sophistication
- ❌ Don't optimize prematurely (get working system first)
- ❌ Don't force-fit RL where rules work fine
Do these instead:
- ✅ Keep current n8n approach for Phase 0-1 (it's pragmatic)
- ✅ Design agents with standardized I/O from day 1
- ✅ Start logging immediately, even with static workflows
- ✅ Validate every agent output initially (build ground truth dataset)
- ✅ Build tooling to visualize agent sequences (aid debugging)
- ✅ Document reward function assumptions (evolve with data)
- ✅ Plan human override mechanisms (safety net)
Success Metrics by Phase
Phase 1 (Static n8n + Logging):
- 5+ agents implemented with standardized I/O
- 100% of diagnostic episodes logged with schema
- Static workflow resolves 70%+ of test alerts correctly
- Average 8-12k tokens per diagnostic episode
Phase 2 (Rule-Based Router):
- Rule-based router handles 80%+ of alert types
- 10% reduction in average tokens vs. static workflow
- Diagnostic accuracy maintained or improved (≥70%)
- Routing decisions explainable to FM stakeholders
Phase 3 (RL Orchestrator):
- RL policy trained on 200+ episodes
- Matches or exceeds rule-based accuracy
- 20%+ token reduction vs. static baseline
- Cyclic reasoning patterns emerge for complex diagnostics
Phase 4 (Production RL):
- Online learning maintains 90%+ diagnostic accuracy
- 30%+ token reduction vs. baseline
- Handles novel alert types through adaptation
- Human intervention required <10% of cases
Integration with Existing Architecture
Compatible with current project structure:
- Project instructions:
Project-Specific-Instructions/SkySpark-n8n-Workflow.md - n8n workflows:
n8n/ - Agent design aligns with FastAPI tool server architecture
- State schema accommodates SkySpark Haystack data model
- Reward function maps to ComEd program requirements
Backward Compatibility:
- Phase 0-1 static workflows remain fully functional
- Standardized I/O is additive, not breaking
- Logging infrastructure runs alongside existing logic
- Can defer RL indefinitely if rule-based routing suffices
Usage Examples
- "Build n8n workflow to triage SkySpark alerts"
- "Create FastAPI server for energy calculations"
- "Set up multi-agent routing for HVAC diagnostics"
- "Design workflow for automated building system reports"
- "Integrate n8n with SkySpark Haystack API"
- "Design stateless agent with standardized I/O for future RL"
- "Set up diagnostic episode logging infrastructure"
- "Implement rule-based agent router for Phase 2"