---
name: sf-ai-agentforce-testing
description: Comprehensive Agentforce testing skill with test execution, coverage analysis, and agentic fix loops. Run agent tests via the sf CLI, analyze topic/action coverage, generate test specs, and automatically fix failing agents with 100-point scoring.
license: MIT
compatibility: Requires API v65.0+ (Winter '26) and an Agentforce-enabled org
---
# sf-ai-agentforce-testing: Agentforce Test Execution & Coverage Analysis

Expert testing engineer specializing in Agentforce agent testing, topic/action coverage analysis, and agentic fix loops. Execute agent tests, analyze failures, and automatically fix issues via sf-ai-agentforce.
## Core Responsibilities

- **Test Execution**: Run agent tests via `sf agent test run` with coverage analysis
- **Test Spec Generation**: Create YAML test specifications for agents
- **Coverage Analysis**: Track topic selection accuracy and action invocation rates
- **Preview Testing**: Interactive simulated and live agent testing
- **Agentic Fix Loop**: Automatically fix failing agents and re-test
- **Cross-Skill Orchestration**: Delegate fixes to sf-ai-agentforce, test data to sf-data
## 📚 Document Map
| Need | Document | Description |
|---|---|---|
| CLI commands | cli-commands.md | Complete sf agent test/preview reference |
| Test spec format | test-spec-reference.md | YAML specification format and examples |
| Auto-fix workflow | agentic-fix-loops.md | Automated test-fix cycles and Python scripts |
| Live preview setup | connected-app-setup.md | OAuth for live preview mode |
| Coverage metrics | coverage-analysis.md | Topic/action coverage analysis |
| Fix decision tree | agentic-fix-loop.md | Detailed fix strategies |
**⚡ Quick Links:**
- Scoring System - 5-category validation
- CLI Command Reference - Essential commands
- Agentic Fix Loop - Auto-fix workflow
- Test Spec Reference - Complete YAML format guide
- Automated Testing - Python scripts and workflows
## ⚠️ CRITICAL: Orchestration Order

```
sf-metadata → sf-apex → sf-flow → sf-deploy → sf-ai-agentforce → sf-deploy → sf-ai-agentforce-testing (you are here)
```

**Why testing is LAST:**
- Agent must be published before running automated tests
- Agent must be activated for preview mode
- All dependencies (Flows, Apex) must be deployed first
- Test data (via sf-data) should exist before testing actions

**⚠️ MANDATORY Delegation:**
- **Fixes**: ALWAYS use `Skill(skill="sf-ai-agentforce")` for agent script fixes
- **Test Data**: Use `Skill(skill="sf-data")` for action test data
- **OAuth Setup**: Use `Skill(skill="sf-connected-apps")` for live preview
## ⚠️ CRITICAL: Org Requirements (Agent Testing Center)

Agent testing requires the Agent Testing Center feature, which is NOT enabled by default in all orgs.

### Check if Agent Testing Center is Enabled

```bash
# This will fail if Agent Testing Center is not enabled
sf agent test list --target-org [alias]

# Expected errors if NOT enabled:
# "Not available for deploy for this organization"
# "INVALID_TYPE: Cannot use: AiEvaluationDefinition in this organization"
```
### Orgs WITHOUT Agent Testing Center
| Org Type | Agent Testing | Workaround |
|---|---|---|
| Standard DevHub | ❌ Not available | Request feature enablement |
| SDO Demo Orgs | ❌ Not available | Use scratch org with feature |
| Scratch Orgs | ✅ If feature enabled | Include in scratch-def.json |
### Enabling Agent Testing Center

- **Scratch Org** - Add to `scratch-def.json`:

  ```json
  { "features": ["AgentTestingCenter", "EinsteinGPTForSalesforce"] }
  ```

- **Production/Sandbox** - Contact Salesforce to enable the feature
- **Fallback** - Use `sf agent preview` for manual testing (see Automated Testing Guide)
## ⚠️ CRITICAL: Prerequisites Checklist

Before running agent tests, verify:

| Check | Command | Why |
|---|---|---|
| Agent Testing Center enabled | `sf agent test list --target-org [alias]` | ⚠️ CRITICAL - tests will fail without this |
| Agent exists | `sf data query --use-tooling-api --query "SELECT Id FROM BotDefinition WHERE DeveloperName='X'"` | Can't test a non-existent agent |
| Agent published | `sf agent validate authoring-bundle --api-name X` | Must be published to test |
| Agent activated | Check activation status | Required for preview mode |
| Dependencies deployed | Flows and Apex in org | Actions will fail without them |
| Connected App (live) | OAuth configured | Required for `--use-live-actions` |
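For scripted workflows, the checklist can be collapsed into a single pre-flight pass. A hedged sketch using only commands documented in this skill; `ORG` and `AGENT_NAME` are placeholders for your values:

```python
# Run the prerequisite checks above in one pass and report pass/fail.
import subprocess

ORG = "dev"                              # placeholder org alias
AGENT_NAME = "Customer_Support_Agent"    # placeholder agent API name

CHECKS = {
    "Agent Testing Center enabled":
        ["sf", "agent", "test", "list", "--target-org", ORG],
    # NOTE: a zero-row query still exits 0; parse --json output for a stricter check
    "Agent exists":
        ["sf", "data", "query", "--use-tooling-api", "--target-org", ORG,
         "--query",
         f"SELECT Id FROM BotDefinition WHERE DeveloperName='{AGENT_NAME}'"],
    "Flows deployed":
        ["sf", "org", "list", "metadata", "--metadata-type", "Flow",
         "--target-org", ORG],
}

for name, cmd in CHECKS.items():
    ok = subprocess.run(cmd, capture_output=True).returncode == 0
    print(f"{'✅' if ok else '❌'} {name}")
```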
## Workflow (6-Phase Pattern)

### Phase 1: Prerequisites

Use AskUserQuestion to gather:
- Agent name/API name
- Target org alias
- Test mode (simulated vs live)
- Coverage threshold (default: 80%)
- Enable agentic fix loop?

Then:
- Verify the agent is published and activated
- Check for existing test specs: `Glob: **/*.yaml`, `Glob: **/tests/*.yaml`
- Create TodoWrite tasks
### Phase 2: Test Spec Creation

**Option A: Interactive Generation** (no automation available)

```bash
# Interactive test spec generation
sf agent generate test-spec --output-file ./tests/agent-spec.yaml
# ⚠️ NOTE: There is NO --api-name flag! The command is interactive-only.
```

**Option B: Automated Generation** (Python script)

```bash
# Generate from agent file
python3 hooks/scripts/generate-test-spec.py \
  --agent-file /path/to/Agent.agent \
  --output tests/agent-spec.yaml \
  --verbose
```

See Test Spec Reference for the complete YAML format guide.

**Create Test in Org:**

```bash
sf agent test create --spec ./tests/agent-spec.yaml --api-name MyAgentTest --target-org [alias]
```
### Phase 3: Test Execution

**Automated Tests:**

```bash
sf agent test run --api-name MyAgentTest --wait 10 --result-format json --target-org [alias]
```

**Interactive Preview (Simulated):**

```bash
sf agent preview --api-name AgentName --output-dir ./logs --target-org [alias]
```

**Interactive Preview (Live):**

```bash
sf agent preview --api-name AgentName --use-live-actions --client-app AppName --apex-debug --target-org [alias]
```
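If you prefer an asynchronous run (omit `--wait`), a thin wrapper can capture the job id and fetch results later with `sf agent test results`. This is a minimal sketch, not part of the skill's shipped scripts; the JSON field name `runId` is an assumption to verify against your CLI version's `--json` output:

```python
# Start a test run asynchronously, then fetch results by job id.
import json
import subprocess

def start_test(test_api_name: str, org: str) -> str:
    out = subprocess.run(
        ["sf", "agent", "test", "run", "--api-name", test_api_name,
         "--target-org", org, "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)["result"]["runId"]  # assumed field name

def fetch_results(job_id: str, org: str) -> dict:
    out = subprocess.run(
        ["sf", "agent", "test", "results", "--job-id", job_id,
         "--result-format", "json", "--target-org", org, "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)
```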
### Phase 4: Results Analysis

Parse the test results JSON and display a formatted summary:

```
📊 AGENT TEST RESULTS
════════════════════════════════════════════════════════════════
Agent: Customer_Support_Agent
Org: my-sandbox
Duration: 45.2s
Mode: Simulated

SUMMARY
───────────────────────────────────────────────────────────────
✅ Passed: 18
❌ Failed: 2
⏭️ Skipped: 0
📈 Topic Selection: 95%
🎯 Action Invocation: 90%

FAILED TESTS
───────────────────────────────────────────────────────────────
❌ test_complex_order_inquiry
   Utterance: "What's the status of orders 12345 and 67890?"
   Expected: get_order_status invoked 2 times
   Actual: get_order_status invoked 1 time
   Category: ACTION_INVOCATION_COUNT_MISMATCH

COVERAGE SUMMARY
───────────────────────────────────────────────────────────────
Topics Tested: 4/5 (80%) ⚠️
Actions Tested: 6/8 (75%) ⚠️
Guardrails Tested: 3/3 (100%) ✅
```
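A minimal sketch of what this parsing step can look like. The results schema is not pinned down here, so the field names (`testCases`, `status`, `failureReason`) are illustrative assumptions; adapt them to the JSON your CLI version actually emits:

```python
# Summarize a saved results JSON file into a pass/fail report.
import json
from collections import Counter

def summarize(results_path: str) -> None:
    """Print a pass/fail summary from a saved results JSON file."""
    with open(results_path) as f:
        data = json.load(f)
    cases = data.get("testCases", [])          # assumed key
    statuses = Counter(tc.get("status", "UNKNOWN") for tc in cases)
    print(f"✅ Passed: {statuses.get('PASS', 0)}")
    print(f"❌ Failed: {statuses.get('FAIL', 0)}")
    for tc in cases:
        if tc.get("status") == "FAIL":
            # failureReason is an assumed field; adapt to the actual schema
            print(f"  ❌ {tc.get('name')}: {tc.get('failureReason', 'no detail')}")

summarize("./results/myagent-results.json")    # hypothetical path
```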
### Phase 5: Agentic Fix Loop

When tests fail, automatically fix via sf-ai-agentforce:

| Error Category | Root Cause | Auto-Fix Strategy |
|---|---|---|
| `TOPIC_NOT_MATCHED` | Topic description doesn't match utterance | Add keywords to topic description |
| `ACTION_NOT_INVOKED` | Action description not triggered | Improve action description |
| `WRONG_ACTION_SELECTED` | Wrong action chosen | Differentiate action descriptions |
| `ACTION_FAILED` | Flow/Apex error | Delegate to sf-flow or sf-apex |
| `GUARDRAIL_NOT_TRIGGERED` | System instructions too permissive | Add explicit guardrails |

**Auto-Fix Command Example:**

```
Skill(skill="sf-ai-agentforce", args="Fix agent [AgentName] - Error: [category] - [details]")
```
See Agentic Fix Loops Guide for:
- Complete decision tree
- Detailed fix strategies for each error type
- Cross-skill orchestration workflow
- Python scripts for automated testing
- Example fix loop executions
### Phase 6: Coverage Improvement

If coverage < threshold:
- Identify untested topics/actions from the results
- Add test cases to the spec YAML
- Update the test: `sf agent test create --spec ./tests/agent-spec.yaml --force-overwrite`
- Re-run: `sf agent test run --api-name MyAgentTest --wait 10`
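A small sketch of the gap analysis in step 1, assuming you have already extracted the sets of defined and tested topics (or actions) from the agent file and the results JSON; the extraction itself is not shown, and the topic names below are illustrative:

```python
# Diff what the agent defines against what the test run exercised.
def coverage_gaps(defined: set[str], tested: set[str]) -> tuple[float, list[str]]:
    coverage = len(tested & defined) / len(defined) if defined else 1.0
    untested = sorted(defined - tested)
    return coverage, untested

coverage, untested = coverage_gaps(
    defined={"product_faq", "book_search", "order_status", "returns", "escalation"},
    tested={"product_faq", "book_search", "order_status", "returns"},
)
print(f"Topic coverage: {coverage:.0%}; add test cases for: {untested}")
```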
## Scoring System (100 Points)
| Category | Points | Key Rules |
|---|---|---|
| Topic Selection Coverage | 25 | All topics have test cases; various phrasings tested |
| Action Invocation | 25 | All actions tested with valid inputs/outputs |
| Edge Case Coverage | 20 | Negative tests; empty inputs; special characters; boundaries |
| Test Spec Quality | 15 | Proper YAML; descriptions provided; categories assigned |
| Agentic Fix Success | 15 | Auto-fixes resolve issues within 3 attempts |
**Scoring Thresholds:**

```
⭐⭐⭐⭐⭐ 90-100 pts → Production Ready
⭐⭐⭐⭐  80-89 pts  → Good, minor improvements
⭐⭐⭐   70-79 pts  → Acceptable, needs work
⭐⭐    60-69 pts  → Below standard
⭐     <60 pts    → BLOCKED - Major issues
```
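For reference, the scoring and thresholds above reduce to a few lines of arithmetic. This sketch covers only the math; the per-category point values are your own assessments of the test suite:

```python
# Clamp each category to its maximum, sum, and map to the star thresholds.
CATEGORY_MAX = {
    "topic_selection": 25, "action_invocation": 25,
    "edge_cases": 20, "spec_quality": 15, "fix_success": 15,
}

def score(points: dict[str, int]) -> tuple[int, str]:
    total = sum(min(points.get(cat, 0), cap) for cat, cap in CATEGORY_MAX.items())
    if total >= 90: verdict = "⭐⭐⭐⭐⭐ Production Ready"
    elif total >= 80: verdict = "⭐⭐⭐⭐ Good, minor improvements"
    elif total >= 70: verdict = "⭐⭐⭐ Acceptable, needs work"
    elif total >= 60: verdict = "⭐⭐ Below standard"
    else: verdict = "⭐ BLOCKED - Major issues"
    return total, verdict

# Example: 92 points -> Production Ready
print(score({"topic_selection": 25, "action_invocation": 22,
             "edge_cases": 15, "spec_quality": 15, "fix_success": 15}))
```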
## ⛔ TESTING GUARDRAILS (MANDATORY)

**BEFORE running tests, verify:**

| Check | Command | Why |
|---|---|---|
| Agent published | `sf agent list --target-org [alias]` | Can't test an unpublished agent |
| Agent activated | Check status | Preview requires activation |
| Flows deployed | `sf org list metadata --metadata-type Flow` | Actions need Flows |
| Connected App (live) | Check OAuth | Live mode requires auth |

**NEVER do these:**
| Anti-Pattern | Problem | Correct Pattern |
|---|---|---|
| Test unpublished agent | Tests fail silently | Publish first: sf agent publish authoring-bundle |
| Skip simulated testing | Live mode hides logic bugs | Always test simulated first |
| Ignore guardrail tests | Security gaps in production | Always test harmful/off-topic inputs |
| Single phrasing per topic | Misses routing failures | Test 3+ phrasings per topic |
## CLI Command Reference

### Test Lifecycle Commands

| Command | Purpose | Example |
|---|---|---|
| `sf agent generate test-spec` | Create test YAML | `sf agent generate test-spec --output-dir ./tests` |
| `sf agent test create` | Deploy test to org | `sf agent test create --spec ./tests/spec.yaml --target-org alias` |
| `sf agent test run` | Execute tests | `sf agent test run --api-name Test --wait 10 --target-org alias` |
| `sf agent test results` | Get results | `sf agent test results --job-id ID --result-format json` |
| `sf agent test resume` | Resume async test | `sf agent test resume --use-most-recent --target-org alias` |
| `sf agent test list` | List test runs | `sf agent test list --target-org alias` |
### Preview Commands

| Command | Purpose | Example |
|---|---|---|
| `sf agent preview` | Interactive testing | `sf agent preview --api-name Agent --target-org alias` |
| `--use-live-actions` | Use real Flows/Apex | `sf agent preview --use-live-actions --client-app App` |
| `--output-dir` | Save transcripts | `sf agent preview --output-dir ./logs` |
| `--apex-debug` | Capture debug logs | `sf agent preview --apex-debug` |
### Result Formats

| Format | Use Case | Flag |
|---|---|---|
| `human` | Terminal display (default) | `--result-format human` |
| `json` | CI/CD parsing | `--result-format json` |
| `junit` | Test reporting | `--result-format junit` |
| `tap` | Test Anything Protocol | `--result-format tap` |
## Test Spec Quick Reference

**Basic Template:**

```yaml
subjectType: AGENT
subjectName: <Agent_Name>
testCases:
  # Topic routing
  - utterance: "What's on your menu?"
    expectation:
      topic: product_faq
      actionSequence: []
  # Action invocation
  - utterance: "Search for Harry Potter books"
    expectation:
      topic: book_search
      actionSequence:
        - search_catalog
  # Edge case
  - utterance: ""
    expectation:
      graceful_handling: true
```

For the complete YAML format reference, see Test Spec Reference.
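Before `sf agent test create`, a quick structural check of the spec can catch typos early. A minimal sketch assuming PyYAML is installed (`pip install pyyaml`) and using the required keys from the template above; the real format accepts more fields than are checked here:

```python
# Sanity-check a test spec YAML against the template's required keys.
import yaml

def validate_spec(path: str) -> list[str]:
    with open(path) as f:
        spec = yaml.safe_load(f)
    errors = []
    for key in ("subjectType", "subjectName", "testCases"):
        if key not in spec:
            errors.append(f"missing top-level key: {key}")
    for i, case in enumerate(spec.get("testCases", [])):
        if "utterance" not in case:
            errors.append(f"testCases[{i}]: missing utterance")
        if "expectation" not in case:
            errors.append(f"testCases[{i}]: missing expectation")
    return errors

print(validate_spec("./tests/agent-spec.yaml") or "spec looks valid")
```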
## Cross-Skill Integration

**Required Delegations:**
| Scenario | Skill to Call | Command |
|---|---|---|
| Fix agent script | sf-ai-agentforce | Skill(skill="sf-ai-agentforce", args="Fix...") |
| Create test data | sf-data | Skill(skill="sf-data", args="Create...") |
| Fix failing Flow | sf-flow | Skill(skill="sf-flow", args="Fix...") |
| Setup OAuth | sf-connected-apps | Skill(skill="sf-connected-apps", args="Create...") |
| Analyze debug logs | sf-debug | Skill(skill="sf-debug", args="Analyze...") |
For the complete orchestration workflow, see Agentic Fix Loops.
## Automated Testing (Python Scripts)

This skill includes Python scripts for fully automated agent testing:

| Script | Purpose |
|---|---|
| `generate-test-spec.py` | Parse .agent files, generate YAML test specs |
| `run-automated-tests.py` | Orchestrate the full test workflow with fix suggestions |

**Quick Usage:**

```bash
# Generate test spec from agent file
python3 hooks/scripts/generate-test-spec.py \
  --agent-file /path/to/Agent.agent \
  --output specs/Agent-tests.yaml

# Run full automated workflow
python3 hooks/scripts/run-automated-tests.py \
  --agent-name MyAgent \
  --agent-dir /path/to/project \
  --target-org dev
```
For complete documentation, see the Agentic Fix Loops Guide.
## Templates Reference

| Template | Purpose | Location |
|---|---|---|
| `basic-test-spec.yaml` | Quick start (3-5 tests) | templates/ |
| `comprehensive-test-spec.yaml` | Full coverage (20+ tests) | templates/ |
| `guardrail-tests.yaml` | Security/safety scenarios | templates/ |
| `escalation-tests.yaml` | Human handoff scenarios | templates/ |
| `standard-test-spec.yaml` | Reference format | templates/ |
## 💡 Key Insights
| Problem | Symptom | Solution |
|---|---|---|
| Tests fail silently | No results returned | Agent not published - run sf agent publish authoring-bundle |
| Topic not matched | Wrong topic selected | Add keywords to topic description (see Fix Loops) |
| Action not invoked | Action never called | Improve action description, add explicit reference |
| Live preview 401 | Authentication error | Connected App not configured - use sf-connected-apps |
| Async tests stuck | Job never completes | Use sf agent test resume --use-most-recent |
| Empty responses | Agent doesn't respond | Check agent is activated |
| Agent Testing Center unavailable | "INVALID_TYPE" error | Use sf agent preview as fallback |
## Quick Start Example

```bash
# 1. Check if Agent Testing Center is enabled
sf agent test list --target-org dev

# 2. Generate test spec (automated)
python3 hooks/scripts/generate-test-spec.py \
  --agent-file ./agents/MyAgent.agent \
  --output ./tests/myagent-tests.yaml

# 3. Create test in org
sf agent test create \
  --spec ./tests/myagent-tests.yaml \
  --api-name MyAgentTest \
  --target-org dev

# 4. Run tests
sf agent test run \
  --api-name MyAgentTest \
  --wait 10 \
  --result-format json \
  --target-org dev

# 5. View results
sf agent test results \
  --use-most-recent \
  --verbose \
  --result-format json \
  --target-org dev
```
For complete workflows and fix loops, see:
- Agentic Fix Loops - Automated testing and fix workflows
- Test Spec Reference - Complete YAML format guide
## License

MIT License. See the LICENSE file. Copyright (c) 2024-2025 Jag Valaiyapathy