| name | learning-systems |
| description | Implicit feedback scoring, confidence decay, and anti-pattern detection. Use when understanding how the swarm plugin learns from outcomes, implementing learning loops, or debugging why patterns are being promoted or deprecated. Unique to opencode-swarm-plugin. |
Learning Systems
The swarm plugin learns from task outcomes to improve decomposition quality over time. Three interconnected systems track pattern effectiveness: implicit feedback scoring, confidence decay, and pattern maturity progression.
Implicit Feedback Scoring
Convert task outcomes into learning signals without explicit user feedback.
What Gets Scored
Duration signals:
- Fast (<5 min) = helpful (1.0)
- Medium (5-30 min) = neutral (0.6)
- Slow (>30 min) = harmful (0.2)
Error signals:
- 0 errors = helpful (1.0)
- 1-2 errors = neutral (0.6)
- 3+ errors = harmful (0.2)
Retry signals:
- 0 retries = helpful (1.0)
- 1 retry = neutral (0.7)
- 2+ retries = harmful (0.3)
Success signal:
- Success = 1.0 (40% weight)
- Failure = 0.0
Weighted Score Calculation
rawScore = success * 0.4 + duration * 0.2 + errors * 0.2 + retries * 0.2;
Thresholds:
- rawScore >= 0.7 → helpful
- rawScore <= 0.4 → harmful
- 0.4 < rawScore < 0.7 → neutral
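Taken together, the signals and weights above compose into a scoring function along these lines (a minimal sketch mirroring what scoreImplicitFeedback does; the return shape with a `type` field is an assumption based on how `scored.type` is read in the workflow example below):
type FeedbackType = "helpful" | "neutral" | "harmful";
interface OutcomeSignals {
  duration_ms: number;
  error_count: number;
  retry_count: number;
  success: boolean;
}
function scoreOutcome(signals: OutcomeSignals): { rawScore: number; type: FeedbackType } {
  // Map each raw signal onto [0, 1] using the thresholds above
  const duration =
    signals.duration_ms < 300_000 ? 1.0 : signals.duration_ms <= 1_800_000 ? 0.6 : 0.2;
  const errors = signals.error_count === 0 ? 1.0 : signals.error_count <= 2 ? 0.6 : 0.2;
  const retries = signals.retry_count === 0 ? 1.0 : signals.retry_count === 1 ? 0.7 : 0.3;
  const success = signals.success ? 1.0 : 0.0;
  // Weighted sum: success dominates at 40%, the other signals contribute 20% each
  const rawScore = success * 0.4 + duration * 0.2 + errors * 0.2 + retries * 0.2;
  const type: FeedbackType =
    rawScore >= 0.7 ? "helpful" : rawScore <= 0.4 ? "harmful" : "neutral";
  return { rawScore, type };
}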
Recording Outcomes
Call swarm_record_outcome after subtask completion:
swarm_record_outcome({
bead_id: "bd-123.1",
duration_ms: 180000, // 3 minutes
error_count: 0,
retry_count: 0,
success: true,
files_touched: ["src/auth.ts"],
strategy: "file-based",
});
Fields tracked:
- `bead_id` - subtask identifier
- `duration_ms` - time from start to completion
- `error_count` - errors encountered (from ErrorAccumulator)
- `retry_count` - number of retry attempts
- `success` - whether the subtask completed successfully
- `files_touched` - modified file paths
- `strategy` - decomposition strategy used (optional)
- `failure_mode` - classification if success=false (optional)
- `failure_details` - error context (optional)
Confidence Decay
Evaluation criterion weights fade over time unless revalidated, preventing stale patterns from dominating future decompositions.
Half-Life Formula
decayed_value = raw_value * 0.5^(age_days / 90)
Decay timeline:
- Day 0: 100% weight
- Day 90: 50% weight
- Day 180: 25% weight
- Day 270: 12.5% weight
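In code, the decay factor is a one-liner (the `decay` helper name matches the pseudocode in the next section; the 90-day default mirrors half_life_days):
// Exponential decay with a 90-day half-life, per the formula above
function decay(timestamp: string, halfLifeDays = 90): number {
  const ageDays = (Date.now() - new Date(timestamp).getTime()) / 86_400_000;
  return Math.pow(0.5, ageDays / halfLifeDays);
}
// Day 0 → 1.0, day 90 → 0.5, day 180 → 0.25, day 270 → 0.125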
Criterion Weight Calculation
Aggregate decayed feedback events:
const helpfulSum = helpful_events.reduce((sum, e) => sum + e.raw_value * decay(e.timestamp), 0);
const harmfulSum = harmful_events.reduce((sum, e) => sum + e.raw_value * decay(e.timestamp), 0);
const weight = Math.max(0.1, helpfulSum / (helpfulSum + harmfulSum));
Weight floor: minimum 0.1 prevents complete zeroing
Revalidation
Recording new feedback resets decay timer for that criterion:
{
criterion: "type_safe",
weight: 0.85,
helpful_count: 12,
harmful_count: 3,
last_validated: "2024-12-12T00:00:00Z", // Reset on new feedback
half_life_days: 90,
}
When Criteria Get Deprecated
total = helpful_count + harmful_count;
harmfulRatio = harmful_count / total;
if (total >= 3 && harmfulRatio > 0.3) {
// Deprecate criterion - reduce impact to 0
}
Pattern Maturity States
Patterns progress through lifecycle based on feedback accumulation:
candidate → established → proven (or deprecated)
State Transitions
candidate (initial state):
- Total feedback < 3 events
- Not enough data to judge
- Multiplier: 0.5x
established:
- Total feedback >= 3 events
- Has track record but not proven
- Multiplier: 1.0x
proven:
- Decayed helpful >= 5 AND
- Harmful ratio < 15%
- Multiplier: 1.5x
deprecated:
- Harmful ratio > 30% AND
- Total feedback >= 3 events
- Multiplier: 0x (excluded)
Decay Applied to State Calculation
State determination uses decayed counts, not raw counts:
const { decayedHelpful, decayedHarmful } =
calculateDecayedCounts(feedbackEvents);
const total = decayedHelpful + decayedHarmful;
const harmfulRatio = decayedHarmful / total;
// State logic applies to decayed values
Old feedback matters less: a pattern must maintain a recent positive signal to stay proven.
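A sketch of state determination with the thresholds above (the helper name and argument shape are illustrative):
type MaturityState = "candidate" | "established" | "proven" | "deprecated";
function determineMaturityState(decayedHelpful: number, decayedHarmful: number): MaturityState {
  const total = decayedHelpful + decayedHarmful;
  if (total < 3) return "candidate"; // not enough data to judge
  const harmfulRatio = decayedHarmful / total;
  if (harmfulRatio > 0.3) return "deprecated"; // too many recent failures
  if (decayedHelpful >= 5 && harmfulRatio < 0.15) return "proven";
  return "established";
}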
Manual State Changes
Promote to proven:
promotePattern(maturity); // External validation confirms effectiveness
Deprecate:
deprecatePattern(maturity, "Causes file conflicts in 80% of cases");
Deprecated patterns cannot be promoted; they must be reset first.
Multipliers in Decomposition
Apply maturity multiplier to pattern scores:
const multipliers = {
candidate: 0.5,
established: 1.0,
proven: 1.5,
deprecated: 0,
};
pattern_score = base_score * multipliers[maturity.state];
Proven patterns get a 50% boost; deprecated patterns are excluded entirely.
Anti-Pattern Inversion
Failed patterns auto-convert to anti-patterns once the failure rate reaches 60%.
Inversion Threshold
const total = pattern.success_count + pattern.failure_count;
if (total >= 3 && pattern.failure_count / total >= 0.6) {
invertToAntiPattern(pattern, reason);
}
- Minimum observations: 3 total (prevents hasty inversion)
- Failure ratio: 60% (e.g., 3+ failures in 5 attempts)
Inversion Process
Original pattern:
{
id: "pattern-123",
content: "Split by file type",
kind: "pattern",
is_negative: false,
success_count: 2,
failure_count: 5,
}
Inverted anti-pattern:
{
id: "anti-pattern-123",
content: "AVOID: Split by file type. Failed 5/7 times (71% failure rate)",
kind: "anti_pattern",
is_negative: true,
success_count: 2,
failure_count: 5,
reason: "Failed 5/7 times (71% failure rate)",
}
Recording Observations
Track pattern outcomes to accumulate success/failure counts:
recordPatternObservation(
  pattern,
  true, // success (or false)
  "bd-123.1", // beadId
);
// Returns:
{
pattern: updatedPattern,
inversion?: {
original: pattern,
inverted: antiPattern,
reason: "Failed 5/7 times (71% failure rate)",
}
}
Pattern Extraction
Auto-detect strategies from decomposition descriptions:
extractPatternsFromDescription(
"We'll split by file type, one file per subtask",
);
// Returns: ["Split by file type", "One file per subtask"]
Detected strategies:
- Split by file type
- Split by component
- Split by layer (UI/logic/data)
- Split by feature
- One file per subtask
- Handle shared types first
- Separate API routes
- Tests alongside implementation
- Tests in separate subtask
- Maximize parallelization
- Sequential execution order
- Respect dependency chain
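A keyword-matching sketch of the detector (the phrase table below is an illustrative subset; the real matcher covers all strategies listed above):
// Map telltale phrases in a decomposition description to named strategies
const STRATEGY_MATCHERS: Array<[RegExp, string]> = [
  [/split by file type/i, "Split by file type"],
  [/split by component/i, "Split by component"],
  [/split by feature/i, "Split by feature"],
  [/one file per subtask/i, "One file per subtask"],
  [/shared types first/i, "Handle shared types first"],
  [/parallel/i, "Maximize parallelization"],
];
function extractPatternsFromDescription(description: string): string[] {
  return STRATEGY_MATCHERS
    .filter(([pattern]) => pattern.test(description))
    .map(([, strategy]) => strategy);
}
// extractPatternsFromDescription("We'll split by file type, one file per subtask")
// → ["Split by file type", "One file per subtask"]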
Using Anti-Patterns in Prompts
Format for decomposition prompt inclusion:
formatAntiPatternsForPrompt(patterns);
Output:
## Anti-Patterns to Avoid
Based on past failures, avoid these decomposition strategies:
- AVOID: Split by file type. Failed 12/15 times (80% failure rate)
- AVOID: One file per subtask. Failed 8/10 times (80% failure rate)
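The formatter itself is simple string assembly; a sketch, assuming anti-patterns carry the content and is_negative fields shown earlier:
interface AntiPattern {
  content: string; // e.g. "AVOID: Split by file type. Failed 12/15 times (80% failure rate)"
  is_negative: boolean;
}
function formatAntiPatternsForPrompt(patterns: AntiPattern[]): string {
  const negatives = patterns.filter((p) => p.is_negative);
  if (negatives.length === 0) return "";
  return [
    "## Anti-Patterns to Avoid",
    "",
    "Based on past failures, avoid these decomposition strategies:",
    "",
    ...negatives.map((p) => `- ${p.content}`),
  ].join("\n");
}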
Error Accumulator
Track errors during subtask execution for retry prompts and outcome scoring.
Error Types
type ErrorType =
| "validation" // Schema/type errors
| "timeout" // Task exceeded time limit
| "conflict" // File reservation conflicts
| "tool_failure" // Tool invocation failed
| "unknown"; // Unclassified
Recording Errors
errorAccumulator.recordError(
  "bd-123.1", // beadId
  "validation", // errorType
  "Type error in src/auth.ts", // message
  {
    stack_trace: "...",
    tool_name: "typecheck",
    context: "After adding OAuth types",
  }, // options
);
Generating Error Context
Format accumulated errors for retry prompts:
const context = await errorAccumulator.getErrorContext(
  "bd-123.1", // beadId
  false, // includeResolved
);
Output:
## Previous Errors
The following errors were encountered during execution:
### validation (2 errors)
- **Type error in src/auth.ts**
- Context: After adding OAuth types
- Tool: typecheck
- Time: 12/12/2024, 10:30 AM
- **Missing import in src/session.ts**
- Tool: typecheck
- Time: 12/12/2024, 10:35 AM
**Action Required**: Address these errors before proceeding. Consider:
- What caused each error?
- How can you prevent similar errors?
- Are there patterns across error types?
Resolving Errors
Mark errors resolved after fixing:
await errorAccumulator.resolveError(errorId);
Resolved errors excluded from retry context by default.
Error Statistics
Get error counts for outcome tracking:
const stats = await errorAccumulator.getErrorStats("bd-123.1")
// Returns:
{
total: 5,
unresolved: 2,
by_type: {
validation: 3,
timeout: 1,
tool_failure: 1,
}
}
Use total for error_count in outcome signals.
Using the Learning System
Integration Points
1. During decomposition (swarm_plan_prompt):
- Query CASS for similar tasks
- Load pattern maturity records
- Include proven patterns in prompt
- Exclude deprecated patterns
2. During execution:
- ErrorAccumulator tracks errors
- Record retry attempts
- Track duration from start to completion
3. After completion (swarm_complete):
- Record outcome signals
- Score implicit feedback
- Update pattern observations
- Check for anti-pattern inversions
- Update maturity states
Full Workflow Example
// 1. Decomposition phase
const cass_results = cass_search({ query: "user authentication", limit: 5 });
const patterns = loadPatterns(); // Get maturity records
const prompt = swarm_plan_prompt({
task: "Add OAuth",
context: formatPatternsWithMaturityForPrompt(patterns),
query_cass: true,
});
// 2. Execution phase
const errorAccumulator = new ErrorAccumulator();
const startTime = Date.now();
let retryCount = 0;
try {
// Work happens...
await implement_subtask();
} catch (error) {
await errorAccumulator.recordError(
bead_id,
classifyError(error),
error.message,
);
retryCount++;
}
// 3. Completion phase
const duration = Date.now() - startTime;
const errorStats = await errorAccumulator.getErrorStats(bead_id);
swarm_record_outcome({
bead_id,
duration_ms: duration,
error_count: errorStats.total,
retry_count: retryCount,
success: true,
files_touched: modifiedFiles,
strategy: "file-based",
});
// 4. Learning updates
const scored = scoreImplicitFeedback({
bead_id,
duration_ms: duration,
error_count: errorStats.total,
retry_count: retryCount,
success: true,
timestamp: new Date().toISOString(),
strategy: "file-based",
});
// Update patterns
for (const pattern of extractedPatterns) {
const { pattern: updated, inversion } = recordPatternObservation(
pattern,
scored.type === "helpful",
bead_id,
);
if (inversion) {
console.log(`Pattern inverted: ${inversion.reason}`);
storeAntiPattern(inversion.inverted);
}
}
Configuration Tuning
Adjust thresholds based on project characteristics:
const learningConfig = {
halfLifeDays: 90, // Decay speed
minFeedbackForAdjustment: 3, // Min observations for weight adjustment
maxHarmfulRatio: 0.3, // Max harmful % before deprecating criterion
fastCompletionThresholdMs: 300000, // 5 min = fast
slowCompletionThresholdMs: 1800000, // 30 min = slow
maxErrorsForHelpful: 2, // Max errors before marking harmful
};
const antiPatternConfig = {
minObservations: 3, // Min before inversion
failureRatioThreshold: 0.6, // 60% failure triggers inversion
antiPatternPrefix: "AVOID: ",
};
const maturityConfig = {
minFeedback: 3, // Min for leaving candidate state
minHelpful: 5, // Decayed helpful threshold for proven
maxHarmful: 0.15, // Max 15% harmful for proven
deprecationThreshold: 0.3, // 30% harmful triggers deprecation
halfLifeDays: 90,
};
Debugging Pattern Issues
Why is pattern not proven?
Check decayed counts:
const feedback = await getFeedback(patternId);
const { decayedHelpful, decayedHarmful } = calculateDecayedCounts(feedback);
console.log({ decayedHelpful, decayedHarmful });
// Need: decayedHelpful >= 5 AND harmfulRatio < 0.15
Why was pattern inverted?
Check observation counts:
const total = pattern.success_count + pattern.failure_count;
const failureRatio = pattern.failure_count / total;
console.log({ total, failureRatio });
// Inverts if: total >= 3 AND failureRatio >= 0.6
Why is criterion weight low?
Check feedback events:
const events = await getFeedbackByCriterion("type_safe");
const weight = calculateCriterionWeight(events);
console.log({ weight, events });
// Check helpful vs harmful counts and event timestamps (older events have decayed)
Storage Interfaces
FeedbackStorage
Persist feedback events for criterion weight calculation:
interface FeedbackStorage {
store(event: FeedbackEvent): Promise<void>;
getByCriterion(criterion: string): Promise<FeedbackEvent[]>;
getByBead(beadId: string): Promise<FeedbackEvent[]>;
getAll(): Promise<FeedbackEvent[]>;
}
ErrorStorage
Persist errors for retry prompts:
interface ErrorStorage {
store(entry: ErrorEntry): Promise<void>;
getByBead(beadId: string): Promise<ErrorEntry[]>;
getUnresolvedByBead(beadId: string): Promise<ErrorEntry[]>;
markResolved(id: string): Promise<void>;
getAll(): Promise<ErrorEntry[]>;
}
PatternStorage
Persist decomposition patterns:
interface PatternStorage {
store(pattern: DecompositionPattern): Promise<void>;
get(id: string): Promise<DecompositionPattern | null>;
getAll(): Promise<DecompositionPattern[]>;
getAntiPatterns(): Promise<DecompositionPattern[]>;
getByTag(tag: string): Promise<DecompositionPattern[]>;
findByContent(content: string): Promise<DecompositionPattern[]>;
}
MaturityStorage
Persist pattern maturity records:
interface MaturityStorage {
store(maturity: PatternMaturity): Promise<void>;
get(patternId: string): Promise<PatternMaturity | null>;
getAll(): Promise<PatternMaturity[]>;
getByState(state: MaturityState): Promise<PatternMaturity[]>;
storeFeedback(feedback: MaturityFeedback): Promise<void>;
getFeedback(patternId: string): Promise<MaturityFeedback[]>;
}
In-memory implementations are provided for testing. Production should use persistent storage (file-based JSONL or SQLite).
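As a reference point, a minimal in-memory FeedbackStorage might look like this (testing only; it assumes FeedbackEvent carries criterion and bead_id fields, which the getBy* methods imply):
// Minimal in-memory FeedbackStorage, suitable for tests only
class InMemoryFeedbackStorage implements FeedbackStorage {
  private events: FeedbackEvent[] = [];
  async store(event: FeedbackEvent): Promise<void> {
    this.events.push(event);
  }
  async getByCriterion(criterion: string): Promise<FeedbackEvent[]> {
    return this.events.filter((e) => e.criterion === criterion);
  }
  async getByBead(beadId: string): Promise<FeedbackEvent[]> {
    return this.events.filter((e) => e.bead_id === beadId);
  }
  async getAll(): Promise<FeedbackEvent[]> {
    return [...this.events];
  }
}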