| name | Confidence Check |
| description | Pre-implementation confidence assessment (≥90% required). Use before starting any implementation to verify readiness with duplicate check, architecture compliance, official docs verification, OSS references, and root cause identification. |
Confidence Check Skill
Purpose
Prevents wrong-direction execution by assessing confidence BEFORE starting implementation.
Requirement: ≥90% confidence to proceed with implementation.
Test Results (2025-10-21):
- Precision: 1.000 (no false positives)
- Recall: 1.000 (no false negatives)
- 8/8 test cases passed
When to Use
Use this skill BEFORE implementing any task to ensure:
- No duplicate implementations exist
- Architecture compliance verified
- Official documentation reviewed
- Working OSS implementations found
- Root cause properly identified
Confidence Assessment Criteria
Calculate confidence score (0.0 - 1.0) based on 5 checks:
1. No Duplicate Implementations? (25%)
Check: Search codebase for existing functionality
# Use Grep to search for similar functions
# Use Glob to find related modules
✅ Pass if no duplicates found ❌ Fail if similar implementation exists
2. Architecture Compliance? (25%)
Check: Verify tech stack alignment
- Read
CLAUDE.md,PLANNING.md - Confirm existing patterns used
- Avoid reinventing existing solutions
✅ Pass if uses existing tech stack (e.g., Supabase, UV, pytest) ❌ Fail if introduces new dependencies unnecessarily
3. Official Documentation Verified? (20%)
Check: Review official docs before implementation
- Use Context7 MCP for official docs
- Use WebFetch for documentation URLs
- Verify API compatibility
✅ Pass if official docs reviewed ❌ Fail if relying on assumptions
4. Working OSS Implementations Referenced? (15%)
Check: Find proven implementations
- Use Tavily MCP or WebSearch
- Search GitHub for examples
- Verify working code samples
✅ Pass if OSS reference found ❌ Fail if no working examples
5. Root Cause Identified? (15%)
Check: Understand the actual problem
- Analyze error messages
- Check logs and stack traces
- Identify underlying issue
✅ Pass if root cause clear ❌ Fail if symptoms unclear
Confidence Score Calculation
Total = Check1 (25%) + Check2 (25%) + Check3 (20%) + Check4 (15%) + Check5 (15%)
If Total >= 0.90: ✅ Proceed with implementation
If Total >= 0.70: ⚠️ Present alternatives, ask questions
If Total < 0.70: ❌ STOP - Request more context
Output Format
📋 Confidence Checks:
✅ No duplicate implementations found
✅ Uses existing tech stack
✅ Official documentation verified
✅ Working OSS implementation found
✅ Root cause identified
📊 Confidence: 1.00 (100%)
✅ High confidence - Proceeding to implementation
Implementation Details
This skill uses the airis-agent MCP server confidence_check tool.
Python API (direct import):
from airis_agent.api.confidence import ConfidenceRequest, evaluate_confidence
request = ConfidenceRequest(
task="Implement user authentication",
duplicate_check_complete=True,
architecture_check_complete=True,
official_docs_verified=True,
oss_reference_complete=True,
root_cause_identified=True
)
response = evaluate_confidence(request)
# response.score: 0.0-1.0
# response.action: "proceed" | "investigate" | "stop"
# response.checks: List[str]
MCP Tool (via airis-agent MCP server):
- Tool name:
confidence_check - Server:
airis-agent - Parameters: task (required), 5 boolean flags (optional)
- Returns: JSON with score, action, checks
ROI
Token Savings: Spend 100-200 tokens on confidence check to save 5,000-50,000 tokens on wrong-direction work.
Success Rate: 100% precision and recall in production testing.
MCP Invocation
Call the confidence_check tool on the airis-agent MCP server to execute the ABI directly:
use_tool("airis-agent", "confidence_check", {
"task": "{describe current assignment}",
"duplicate_check_complete": true,
"architecture_check_complete": true,
"official_docs_verified": true,
"oss_reference_complete": true,
"root_cause_identified": false
})
The response includes score, action, and the human-readable checklist above.