| name | Plexus Score Setup |
| description | Standard workflow for creating Plexus scorecard and score records via the GraphQL API. This is the administrative setup (name, external_id, description) - NOT guidelines or configuration work. |
Purpose
This skill provides the workflow for creating scorecard and score records in Plexus via the GraphQL API. This is the administrative setup - creating the records with metadata like names, external IDs, and descriptions. This is separate from content work like guidelines and configuration.
When to Use This Skill
Use this skill when:
- User needs to create a new scorecard
- User needs to add a new score to an existing scorecard
- User needs to set up the records in Plexus before working on guidelines/config
- User says they need to "set up" or "create" a scorecard or score
Important: What This Skill Does NOT Cover
This skill is ONLY about creating records via the GraphQL API. It does NOT cover:
- Guidelines development (use the plexus-score-guidelines-updater agent for that)
- Score configuration/YAML (use the plexus-score-config-updater agent for that)
- Classifier types, classes, labels, or any classification mechanics
Those are separate workflows that happen AFTER the basic records are created via the API.
Complete Workflow
Note: All operations use the Plexus GraphQL API via MCP tools. Do NOT use the plexus CLI command.
Phase 1: Information Gathering
Before doing ANY work, gather this basic metadata from the user:
If Creating a New Scorecard:
- Scorecard name (human-readable, e.g., "Acme Content Quality")
- Scorecard key (web slug: lowercase with hyphens, ideally starts with client name, e.g., "acme-content-quality")
- Scorecard external_id (required, can be same as key)
- Scorecard description (short description of what this scorecard is for)
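For example, the gathered metadata might look like this (values are hypothetical; the field labels below are for illustration, not API parameter names):

```yaml
# Hypothetical Phase 1 metadata for a new scorecard
name: "Acme Content Quality"          # human-readable name
key: "acme-content-quality"           # web slug: lowercase, hyphens, client name first
external_id: "acme-content-quality"   # required; can simply reuse the key
description: "Measures the quality of Acme's published content."
```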
If Creating a New Score:
- Which scorecard does this score belong to?
  - If unsure, use mcp__Plexus__plexus_scorecards_list to list available scorecards
- Score name (human-readable, e.g., "Medical Advice Detection")
- Score external_id (kebab-case, e.g., "medical-advice")
- Score description (short description of what this score measures)
That's it! Do NOT ask about:
- Classifier types (binary, multi-class, etc.)
- Valid classes or labels
- Target/default classes
- Classification mechanics
- Guidelines content
Those come later when working on guidelines and configuration.
IMPORTANT: Do NOT proceed until you have the basic metadata above. If information is missing, ask the user.
Phase 2: Scorecard/Score Creation
If Creating a Scorecard:
1. Check if the scorecard already exists
   - Use mcp__Plexus__plexus_scorecards_list to see existing scorecards
2. Create the scorecard record (if it doesn't exist)
   - Use mcp__Plexus__plexus_scorecard_create with:
     - scorecard_name: the human-readable name
     - key: the web slug (lowercase with hyphens)
     - external_id: required identifier
     - description: short description
3. Verify creation
   - Use mcp__Plexus__plexus_scorecard_info to confirm the scorecard was created
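As a sketch, the create call's arguments would look like this, using the parameter names listed above (values are hypothetical):

```yaml
# Sketch: arguments to mcp__Plexus__plexus_scorecard_create (hypothetical values)
scorecard_name: "Acme Content Quality"
key: "acme-content-quality"
external_id: "acme-content-quality"
description: "Measures the quality of Acme's published content."
```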
If Creating a Score:
1. Check if the score already exists
   - Use mcp__Plexus__plexus_scorecard_info with the scorecard name
   - Review the list of scores to see if this score already exists
2. Create the score record (if it doesn't exist)
   - Use mcp__Plexus__plexus_score_create with:
     - scorecard_name: the scorecard name
     - score_name: the human-readable name
     - score_external_id: the kebab-case identifier
     - description: short description
3. Verify creation
   - Use mcp__Plexus__plexus_score_info to confirm the score was created
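A corresponding sketch for the score create call, again using the parameter names above (values are hypothetical):

```yaml
# Sketch: arguments to mcp__Plexus__plexus_score_create (hypothetical values)
scorecard_name: "Acme Content Quality"
score_name: "Medical Advice Detection"
score_external_id: "medical-advice"
description: "Detects if content contains medical advice."
```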
Phase 3: Handoff
After creating the records via the API:
Summarize what was created
- Scorecard name, key, external_id (if applicable)
- Score name, external_id (if applicable)
- What records now exist in Plexus
Next steps (inform user of options):
- They can now work on guidelines (use the plexus-score-guidelines-updater agent)
- After guidelines, they can work on configuration (use the plexus-score-config-updater agent)
- Or they can leave them as stubs for now and return to them later
- Guidelines and config can be created separately from this setup
Critical Rules
Scope: This is Administrative Setup Only
- This skill ONLY covers creating records via the GraphQL API
- Do NOT gather information about classification mechanics
- Do NOT create guidelines or configuration as part of this workflow
- Guidelines and config are separate workflows that happen later (or never, if just doing stub setup)
Do Not Skip Steps
- Always gather basic metadata BEFORE creating anything
- Always verify creation after using create tools
- If user wants guidelines/config too, create the records FIRST, then hand off to agents
Use the Right Tools
- Scorecard creation: Use the mcp__Plexus__plexus_scorecard_create MCP tool (you, the main agent)
- Score creation: Use the mcp__Plexus__plexus_score_create MCP tool (you, the main agent)
- DO NOT use the plexus CLI command - only use MCP tools, which call the GraphQL API
Information Management
- Only ask for the basic metadata listed in Phase 1
- Do NOT ask about classifier types, classes, or classification logic
- If you're missing basic metadata, stop and ask the user
Common Mistakes to Avoid
- Asking about classification details - This workflow only needs basic metadata (name, external_id, description)
- Trying to create guidelines/config immediately - Those are separate workflows; just create the records
- Using CLI instead of MCP - Always use MCP tools
- Assuming guidelines/config must be complete - They can be stubs initially
- Conflating setup with content work - Setup = database records, Content = guidelines + config
Example Session Flow
Example 1: Creating Just the Score Record
User: "I need to set up a new score for medical advice detection in the content-quality scorecard"
Claude: I'll help you create the score record. I need some basic information:
1. Which scorecard? (you said content-quality - I'll verify it exists)
2. Score name? (e.g., "Medical Advice Detection")
3. Score external_id? (e.g., "medical-advice")
4. Short description? (what does this score measure?)
User: [Provides: "Medical Advice Detection", "medical-advice", "Detects if content contains medical advice"]
Claude: Great! Let me create the score record.
[Uses mcp__Plexus__plexus_score_create]
[Uses mcp__Plexus__plexus_score_info to verify]
Done! The score "Medical Advice Detection" has been created in the content-quality scorecard with external_id "medical-advice".
Next steps (when you're ready):
- Create guidelines using the plexus-score-guidelines-updater agent
- Create configuration using the plexus-score-config-updater agent
- Or leave as a stub for now
Example 2: Creating a New Scorecard
User: "I need to create a new scorecard for the Acme client"
Claude: I'll help you create the scorecard. I need:
1. Scorecard name? (e.g., "Acme Content Quality")
2. Scorecard key? (web slug, ideally starts with client name, e.g., "acme-content-quality")
3. External_id? (can be same as key)
4. Short description?
User: [Provides information]
Claude: Perfect! Let me create the scorecard.
[Uses mcp__Plexus__plexus_scorecard_create]
[Uses mcp__Plexus__plexus_scorecard_info to verify]
Done! The scorecard "Acme Content Quality" has been created with key "acme-content-quality".
You can now add scores to this scorecard when you're ready.
Troubleshooting
Scorecard Already Exists
If plexus_scorecard_create fails because the scorecard exists:
- Use mcp__Plexus__plexus_scorecard_info to check the current state
- Inform the user the scorecard already exists
- Ask if they want to update it or create scores within it instead
Score Already Exists
If plexus_score_create fails because the score exists:
- Use mcp__Plexus__plexus_score_info to check the current state
- Inform the user the score already exists
- Ask if they want to update metadata or work on guidelines/config instead
Missing Information
If you realize you're missing basic metadata:
- STOP and ask the user for the specific fields (name, external_id, description, key)
- Do not make assumptions about naming or identifiers
- Do not proceed without the required metadata
Working on Score Configuration (YAML)
CRITICAL: If the user wants to work on score configuration YAML code (creating or editing), you MUST use the plexus-score-config-updater agent. Do NOT attempt to edit score configuration YAML yourself.
When to Use the Config Updater Agent
Use plexus-score-config-updater agent when:
- User wants to create initial YAML configuration for a score
- User wants to update/modify existing score configuration
- User wants to sync configuration with updated guidelines
- User mentions "score config", "score YAML", "configuration code"
Why This Agent is Required
The score configuration YAML is complex and requires:
- Understanding of Plexus data source configurations
- Knowledge of classifier/extractor structures
- Proper YAML formatting and validation
- Running evaluations to verify changes
The agent MUST load comprehensive Plexus documentation before working on any configuration. The documentation explains data sources, classifier structures, YAML formatting, and scoring logic.
How to Use It
Use Task tool with subagent_type="plexus-score-config-updater"
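A minimal sketch of such an invocation (the prompt wording and the description field are illustrative assumptions, not a fixed schema):

```yaml
# Sketch: Task tool invocation (prompt text is illustrative)
subagent_type: "plexus-score-config-updater"
description: "Create initial score YAML config"
prompt: >
  Create the initial YAML configuration for the "Medical Advice Detection"
  score on the "Acme Content Quality" scorecard, based on its current
  guidelines. Validate with an evaluation before pushing.
```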
The agent will:
1. FIRST: Load documentation using get_plexus_documentation(filename="score-yaml-format")
2. Pull current configuration (or create from scratch)
3. Make changes based on guidelines and requirements
4. Validate with a 10-sample evaluation
5. Push only if evaluation passes
The agent knows to load the documentation automatically - you don't need to specify this.
Never edit score configuration YAML directly in the main conversation - always delegate to this agent.
Basic Score YAML Configuration Template
Most scores follow this basic structure using a LangGraphScore with a single Classifier node:
```yaml
name: Score Name
id: "score-id-from-api"
description: Brief description of what this score measures
class: LangGraphScore
model_provider: ChatOpenAI
model_name: gpt-4o-mini-2024-07-18
graph:
  - name: score_check
    class: Classifier
    valid_classes:
      - "Yes"   # quoted so YAML parsers don't read Yes/No as booleans
      - "No"
      # Add "NA" for binary with abstentions
    system_message: >
      System prompt that sets the context and role for the classifier.
    user_message: >
      User prompt that provides the specific task and instructions.

      CALL TRANSCRIPT: <transcript> {{text}} </transcript>

      [Detailed instructions about when to mark each class]

      Answer with the appropriate class.
    parse_from_start: true
    output:
      value: classification
      explanation: explanation
data:
  class: FeedbackItems
  scorecard: scorecard-id-number
  score: Score Name
  days: 90
  limit: 200
  limit_per_cell: 50
  balance: false
key: score-key-kebab-case
```
Key Components:
- LangGraphScore: The main score class for most classifiers
- Classifier node: Single node that runs the classification
- valid_classes: List of classes (Yes/No for binary, add NA for abstentions, or multiple classes)
- system_message & user_message: The prompts (can reference existing prompts from legacy scores)
- data.FeedbackItems: Standard data source configuration for evaluation
- {{text}}: Template variable that gets replaced with the transcript
When creating configuration, provide this template to the config-updater agent as a reference.
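For non-binary scores, only the valid_classes list changes; a minimal sketch with hypothetical class names:

```yaml
# Sketch: multi-class variant (class names are hypothetical)
valid_classes:
  - "Compliant"
  - "Minor Issue"
  - "Major Issue"
```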
Related Skills and Agents
- plexus-guidelines skill: Format and validation rules for guidelines documents
- plexus-score-guidelines-updater agent: Creates/updates guidelines
- plexus-score-config-updater agent: Creates/updates YAML configuration (USE THIS FOR ALL CONFIG WORK)
- plexus-alignment-analyzer agent: Analyzes feedback to improve configs (used later for iteration)
Success Criteria
You've successfully completed this workflow when:
- Scorecard/score record exists in Plexus (via the GraphQL API) with correct metadata (name, external_id, description, and key for scorecards)
- Creation is verified using the MCP info tools
- User is informed what was created
- User knows that the next steps (guidelines/config work) are separate and optional