| name | supply-chain-security |
| description | A capability-oriented reasoning framework for identifying and describing security-relevant capability expansions in software supply-chain version transitions. Provides capability vocabulary, counterfactual reasoning scaffolds, and contextual analysis tools. Use when analyzing code version transitions to understand what new affordances emerge, not to classify code as malicious/benign. Atomic unit of analysis is the version transition (diff), not standalone code. |
Capability-Oriented Security Reasoning Framework
Non-goal: This framework does not attempt to classify code as malicious or benign. It enumerates potential capability changes and contextual signals that may support or refute security hypotheses.
Goal: Provide a constrained vocabulary and reasoning structure for describing what becomes possible when code changes, enabling systematic capability expansion analysis.
Atomic unit: Version transition (diff), not standalone code. Capabilities are attributed to added/modified hunks.
Core Principle: Capability-First Reasoning
Traditional approach:
"Does this match a known attack pattern?" → Binary classification
This framework:
"What new affordances does this create?" → Capability description → Contextual reasoning
Capability Taxonomy
Use this vocabulary to describe what code can do, not what it "is."
Capabilities should be attributed to added/modified hunks where possible. Existing capabilities present in both versions are background context, not delta.
Network Capabilities
- network.http_client - Can initiate HTTP/HTTPS requests
- network.socket - Can create raw network sockets
- network.dns - Can perform DNS queries
- network.alternate_protocol - Can use FTP, SMTP, etc.
Environment Capabilities
- environment.read_single - Can read a specific environment variable
- environment.read_wholesale - Can enumerate all environment variables
- environment.write - Can modify the environment
Filesystem Capabilities
- filesystem.read_generic - Can read files
- filesystem.read_sensitive - Can access .ssh, .aws, .env, etc.
- filesystem.write - Can create/modify files
- filesystem.permission_change - Can chmod/chown files
Process Capabilities
- process.spawn - Can create child processes
- process.exec - Can execute system commands
- process.eval - Can dynamically execute code
Data Transformation Capabilities
- encoding.base64 - Can encode/decode base64
- encoding.hex - Can encode/decode hexadecimal
- encoding.compress - Can compress/decompress (gzip, zlib)
- crypto.encrypt - Can encrypt data
- crypto.decrypt - Can decrypt data
Conditional Execution Capabilities
- conditional.environment_gated - Execution depends on environment variables
- conditional.time_gated - Execution depends on date/time
- conditional.platform_gated - Execution depends on OS/platform
- conditional.input_gated - Execution depends on function arguments
Execution Phase Capabilities (CRITICAL for supply-chain)
- phase.install_time - Runs during package installation (npm lifecycle hooks, setup.py)
- phase.import_time - Runs when module is imported (module-level side effects)
- phase.build_time - Runs during build/compilation (build scripts, webpack)
- phase.runtime - Runs when explicitly invoked via API
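Why phase matters: Install-time code runs automatically during installation, typically before anyone has reviewed or invoked it. Build-time code can make the shipped artifact diverge from the reviewed source, which is what enables XZ-style attacks.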
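The taxonomy above can be held as plain data so that analysis output is checked against the constrained vocabulary. The sketch below is one possible encoding: the identifiers come from the lists above, while `TAXONOMY`, `is_known_capability`, and `unknown_capabilities` are hypothetical names introduced only for illustration.

```python
# Hypothetical encoding of the taxonomy; identifiers are taken from the
# lists above, the structure and helper names are assumptions.
TAXONOMY = {
    "network": {"http_client", "socket", "dns", "alternate_protocol"},
    "environment": {"read_single", "read_wholesale", "write"},
    "filesystem": {"read_generic", "read_sensitive", "write", "permission_change"},
    "process": {"spawn", "exec", "eval"},
    "encoding": {"base64", "hex", "compress"},
    "crypto": {"encrypt", "decrypt"},
    "conditional": {"environment_gated", "time_gated", "platform_gated", "input_gated"},
    "phase": {"install_time", "import_time", "build_time", "runtime"},
}


def is_known_capability(identifier: str) -> bool:
    """Return True if `identifier` (e.g. "network.dns") is in the taxonomy."""
    family, _, name = identifier.partition(".")
    return name in TAXONOMY.get(family, set())


def unknown_capabilities(identifiers):
    """Return identifiers that fall outside the constrained vocabulary."""
    return [i for i in identifiers if not is_known_capability(i)]
```

A check like this makes it visible when an analysis drifts into labels the vocabulary does not define.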
Counterfactual Reasoning Framework
For each code change, systematically enumerate:
1. Capability Delta
- Before: List capabilities present in previous version
- After: List capabilities present in new version
- Added: Capabilities in After but not in Before (focus here)
- Removed: Capabilities in Before but not in After
Attribution: Link capabilities to specific hunks/lines where possible.
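The Before/After/Added/Removed breakdown is a pair of set differences. A minimal sketch, assuming capabilities are represented as taxonomy identifier strings (the function name `capability_delta` is hypothetical):

```python
def capability_delta(before, after):
    """Split capabilities into added, removed, and preexisting sets.

    `before` and `after` are iterables of taxonomy identifiers for the
    previous and new version respectively.
    """
    before_set, after_set = set(before), set(after)
    return {
        "added": sorted(after_set - before_set),        # focus of the analysis
        "removed": sorted(before_set - after_set),
        "preexisting": sorted(before_set & after_set),  # background context
    }
```

For example, `capability_delta(["filesystem.read_generic"], ["filesystem.read_generic", "network.http_client"])` reports `network.http_client` as the only added capability.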
2. Affordance Questions
For each added capability, ask:
- Reach: What data can this capability access?
- Transform: How can that data be modified?
- Transmit: Where can that data be sent?
- Persist: Can effects outlive the process?
- Trigger: Under what conditions does this activate?
- Phase: When does this execute (install/import/build/runtime)?
3. Composition Analysis
For capability combinations, describe:
- Data flow: A → B → C (e.g., env_read → encode → network)
- Control flow: IF condition THEN capability (e.g., if env.CI then network.http)
- Timing: Sequential, parallel, or conditional chains
- Phase interaction: Does install-time code enable runtime behavior?
4. Intent Alignment Assessment
Compare observed capabilities with stated package purpose:
- Stated purpose: From package description, README, documentation
- Implied capabilities: What capabilities does purpose require?
- Observed capabilities: What capabilities exist in code?
- Alignment gap: Capabilities present but not implied by purpose
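The alignment gap can be sketched as the observed capabilities that no implied capability covers. In this illustration (an assumption, not part of the framework's tooling) an implied entry may be either a full identifier or a bare family name such as "encoding", which covers every capability in that family.

```python
def alignment_gap(observed, implied):
    """Return observed capabilities not covered by any implied entry.

    An implied entry like "filesystem" covers every "filesystem.*"
    capability; an entry like "network.http_client" covers only itself.
    """
    def covered(cap):
        return any(cap == imp or cap.startswith(imp + ".") for imp in implied)

    return sorted(cap for cap in observed if not covered(cap))
```

The result is a list to describe, not a verdict: deciding what a stated purpose implies remains a judgment made by the analyst.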
5. Uncertainty Qualification
Observation Confidence:
- HIGH: Capability is explicit (imports + callsite visible in code)
- MEDIUM: Capability inferred (wrapper function, indirect call, dynamic import)
- LOW: Capability speculative (requires runtime resolution, obfuscated)
Dynamic Resolution Flag:
- requires_dynamic_resolution: true - Cannot determine statically (eval, computed imports)
- requires_dynamic_resolution: false - Statically observable
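One way to apply the confidence levels consistently is to derive them from a few explicit evidence flags. The sketch below is an assumption about how that mapping could look; the `Evidence` fields and the function `observation_confidence` are hypothetical and simplify the definitions above.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    has_import: bool = False     # module providing the capability is imported
    has_callsite: bool = False   # a direct call is visible in the hunk
    indirect: bool = False       # wrapper function, indirect call, dynamic import
    needs_runtime: bool = False  # obfuscated or only resolvable at runtime


def observation_confidence(ev: Evidence) -> str:
    """Map evidence flags to HIGH / MEDIUM / LOW (assumed mapping)."""
    if ev.needs_runtime:
        return "LOW"
    if ev.has_import and ev.has_callsite and not ev.indirect:
        return "HIGH"
    if ev.indirect or ev.has_import or ev.has_callsite:
        return "MEDIUM"
    return "LOW"  # no supporting evidence at all is treated as speculative
```

Keeping the flags separate from the label means the reasoning behind each confidence value can be logged alongside it.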
Context Budget Policy
To prevent hidden overfitting and ensure reproducible evaluation:
Default context (always provide):
- Changed files only (diffs)
- Minimal package metadata (name, version, 1-sentence description)
Escalation context (optional, must log):
- Full file context (not just diffs)
- Complete README
- Dependency tree
- Maintainer history
Logging requirement: If escalating beyond default context, document what additional context was used and why.
This ensures methods sections can accurately describe information available to the model.
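The logging requirement can be enforced mechanically by refusing any escalation that lacks a stated reason. This is a minimal sketch; the context labels and the `ContextLog` class are hypothetical names chosen to mirror the lists above.

```python
from dataclasses import dataclass, field

# Labels are assumptions that mirror the default and escalation lists above.
DEFAULT_CONTEXT = {"diff", "package_metadata"}
ESCALATION_CONTEXT = {"full_file", "readme", "dependency_tree", "maintainer_history"}


@dataclass
class ContextLog:
    used: set = field(default_factory=lambda: set(DEFAULT_CONTEXT))
    escalations: list = field(default_factory=list)

    def escalate(self, kind: str, reason: str) -> None:
        """Record use of extra context; an undocumented escalation is an error."""
        if kind not in ESCALATION_CONTEXT:
            raise ValueError(f"unknown context kind: {kind}")
        if not reason.strip():
            raise ValueError("escalation requires a documented reason")
        self.used.add(kind)
        self.escalations.append({"context": kind, "reason": reason})
```

The `escalations` list can then be attached to the analysis output so that the context available for each result is reproducible.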
Available Tools
Note: Tools are executable scripts in the tools/ directory. Call them via bash when needed.
1. extract_capabilities (REQUIRED)
Extracts security-relevant capabilities from code with diff-aware attribution.
Purpose: Build factual inventory of what code can do
When to use: Always, as first step in analysis
Returns: List of capabilities with:
- capability - Taxonomy identifier
- phase - Execution phase (if detectable)
- evidence_span - {file, hunk_id, start_line, end_line}
- origin - "added" | "removed" | "preexisting"
- confidence_obs - "HIGH" | "MEDIUM" | "LOW"
- requires_dynamic_resolution - true | false
- context - Code snippet showing capability
Example:
extract_capabilities(
    old_code="...",
    new_code="import os\nimport requests\nif os.environ.get('CI'): requests.get(...)",
    language="python"
)
# Returns: [
#   {
#     capability: "network.http_client",
#     phase: "import_time",
#     evidence_span: {file: "main.py", hunk_id: 1, start_line: 2, end_line: 2},
#     origin: "added",
#     confidence_obs: "HIGH",
#     requires_dynamic_resolution: false,
#     context: "import requests"
#   },
#   {
#     capability: "conditional.environment_gated",
#     phase: "import_time",  # module-level check, so it runs on import
#     evidence_span: {file: "main.py", hunk_id: 1, start_line: 3, end_line: 3},
#     origin: "added",
#     confidence_obs: "HIGH",
#     requires_dynamic_resolution: false,
#     context: "if os.environ.get('CI')"
#   }
# ]
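When post-processing tool output, it can help to load each returned item into a typed record that rejects values outside the documented enumerations. The sketch below assumes the field names listed under "Returns"; the classes `EvidenceSpan` and `CapabilityRecord` and the helper `added_only` are hypothetical and not part of the tool itself.

```python
from dataclasses import dataclass
from typing import Optional

ORIGINS = {"added", "removed", "preexisting"}
CONFIDENCES = {"HIGH", "MEDIUM", "LOW"}


@dataclass(frozen=True)
class EvidenceSpan:
    file: str
    hunk_id: int
    start_line: int
    end_line: int


@dataclass(frozen=True)
class CapabilityRecord:
    capability: str
    phase: Optional[str]  # None when the phase is not detectable
    evidence_span: EvidenceSpan
    origin: str
    confidence_obs: str
    requires_dynamic_resolution: bool
    context: str

    def __post_init__(self):
        # Reject values outside the enumerations listed above.
        if self.origin not in ORIGINS:
            raise ValueError(f"bad origin: {self.origin!r}")
        if self.confidence_obs not in CONFIDENCES:
            raise ValueError(f"bad confidence_obs: {self.confidence_obs!r}")


def added_only(records):
    """Keep the delta (origin == "added"); everything else is background."""
    return [r for r in records if r.origin == "added"]
```

Filtering on `origin` first keeps the rest of the analysis scoped to the version transition rather than the whole codebase.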
2. analyze_execution_paths (OPTIONAL - Confirmatory Only)
Surfaces potential execution paths through code.
Purpose: Understand how capabilities might compose
When to use: When you need to trace data/control flow
NOT for: Determining reachability or confirmed behavior
Returns:
- possible_paths - Sequences of capability nodes
- conditions - Normalized triggers
- note - Always includes "possible, not confirmed"
- Never returns "reachable: true" or definitive flow
Example:
analyze_execution_paths(
code="...",
language="javascript"
)
# Returns: {
# possible_paths: ["env_read → encode → network", "env_read → filesystem"],
# conditions: ["process.env.CI", "process.platform === 'linux'"],
# note: "These are possible paths based on static analysis, not confirmed execution"
# }
3. search_capability_examples (OPTIONAL - Explanatory Only)
Finds historical examples where capability overlap exists.
Purpose: Provide context, not classification
When to use: To explain or provide evidence for hypothesis
NOT for: Pattern matching, similarity scoring, or labeling
Returns (sanitized schema):
- example_name - Identifier only
- capabilities_overlap - List of overlapping capabilities
- why_relevant - One sentence explanation
- caution - Always included disclaimer
NO similarity scores. NO "this matches X" language.
Example:
search_capability_examples(
capabilities=["environment.read_wholesale", "network.http_client", "phase.install_time"]
)
# Returns: [
# {
# example_name: "ctx-2021",
# capabilities_overlap: ["environment.read_wholesale", "network.http_client"],
# why_relevant: "Historical example of wholesale env access + network transmission",
# caution: "Overlap exists for context. Does not indicate malicious intent."
# }
# ]
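The sanitized schema can be checked before an example is shown to the analyst, so that fields such as similarity scores never leak into the reasoning. A minimal sketch, assuming each example is a plain dictionary with the four documented keys (`check_sanitized` is a hypothetical helper):

```python
ALLOWED_KEYS = {"example_name", "capabilities_overlap", "why_relevant", "caution"}


def check_sanitized(example: dict) -> list:
    """Return a list of schema problems; an empty list means the example is clean."""
    problems = []
    extra = set(example) - ALLOWED_KEYS
    missing = ALLOWED_KEYS - set(example)
    if extra:
        problems.append(f"unexpected keys: {sorted(extra)}")
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    return problems
```

This only checks structure; wording such as "this matches X" inside `why_relevant` still has to be reviewed by a person.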
Capability Risk Composition Matrix
This describes potential security implications of capability combinations, not verdicts.
| Capabilities | Potential Implication | Why Notable |
|---|---|---|
| environment.read_wholesale + network.http_client | Data exfiltration channel | All env vars accessible + transmission capability |
| process.exec + network.http_client | Remote command execution channel | External input could control commands |
| filesystem.read_sensitive + encoding.base64 + network.http_client | Credential theft channel | Sensitive data + obfuscation + transmission |
| conditional.environment_gated + network.http_client | Selective activation | Behavior varies by environment (CI vs local) |
| phase.install_time + network.http_client | Pre-review execution | Runs before code review, in high-privilege context |
| phase.build_time + filesystem.write | Build-time injection | Can modify artifacts not in source control |
| encoding.base64 + process.eval | Obfuscated code execution | Hidden logic execution |
Note: These describe possibilities, not probabilities or intentions.
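The matrix can be applied as a subset test: a row is relevant when all of its capabilities appear in the observed set. The sketch below copies the rows from the table; `COMPOSITION_MATRIX` and `potential_implications` are hypothetical names, and the output is a list of descriptions, not a score.

```python
# Rows copied from the table above: (required capabilities, potential implication).
COMPOSITION_MATRIX = [
    ({"environment.read_wholesale", "network.http_client"}, "Data exfiltration channel"),
    ({"process.exec", "network.http_client"}, "Remote command execution channel"),
    ({"filesystem.read_sensitive", "encoding.base64", "network.http_client"},
     "Credential theft channel"),
    ({"conditional.environment_gated", "network.http_client"}, "Selective activation"),
    ({"phase.install_time", "network.http_client"}, "Pre-review execution"),
    ({"phase.build_time", "filesystem.write"}, "Build-time injection"),
    ({"encoding.base64", "process.eval"}, "Obfuscated code execution"),
]


def potential_implications(capabilities):
    """List implications whose required capabilities are all present."""
    present = set(capabilities)
    return [implication for combo, implication in COMPOSITION_MATRIX if combo <= present]
```

For instance, an added set of `environment.read_wholesale`, `network.http_client`, and `phase.install_time` surfaces both "Data exfiltration channel" and "Pre-review execution" as possibilities to describe.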
Historical Capability Pattern Examples
These are post-hoc explanations, not detection rules.
Example: event-stream (2018)
Capabilities observed:
- environment.read_single (npm_package_description)
- conditional.environment_gated
- crypto.decrypt
- phase.runtime
Use of this example: Illustrates that environment-gated execution can enable targeted attacks. Does NOT mean all env-gated code is malicious.
Example: ua-parser-js (2021)
Capabilities observed:
- conditional.platform_gated (process.platform)
- process.spawn
- phase.install_time
Use of this example: Shows install-time + platform-gating pattern. Does NOT mean install hooks indicate compromise.
Example: ctx/phpass (2022)
Capabilities observed:
- environment.read_wholesale (os.environ)
- encoding.base64
- network.http_client
- phase.install_time (setup.py)
Use of this example: Demonstrates wholesale env + encoding + network pattern. Does NOT make this combination automatically suspicious.
Example: XZ Utils (CVE-2024-3094, 2024)
Capabilities observed:
- phase.build_time (injection in release tarball, not git)
- conditional.environment_gated (SSH + systemd context)
- filesystem.write (binary blobs)
- Long-term social engineering
Use of this example: Illustrates build-time vs source-time capability divergence. Does NOT mean all build scripts are suspect.
False Positive Awareness
Benign code often has security-relevant capabilities:
Telemetry/Analytics
Capabilities: network.http_client + conditional.environment_gated
Benign when: Documented, opt-out available, analytics domain matches package
Check: Is DISABLE_ANALYTICS respected? Is domain in README?
Update Checks
Capabilities: network.http_client
Benign when: Checking version only, not sending user data
Check: Is request to package registry? Is response only version info?
License Validation
Capabilities: network.http_client + environment.read_single
Benign when: Commercial package, license endpoint documented
Check: Is package commercial? Is validation endpoint disclosed?
Handling Obfuscated Code
Malicious code is often heavily obfuscated to evade analysis. This framework includes strategies for analyzing obfuscated code.
Obfuscation Indicators
- Hex-encoded function names (_0x4e9bf4, _0x112fa8)
- Large arrays of encoded strings
- Self-modifying code patterns
- Computed property access (window[_0x4e9bf4(0x174)])
- Nested function calls with numeric offsets
- Unusual arithmetic expressions as array indices
De-Obfuscation Strategy
When encountering obfuscated code:
1. Identify String Arrays: Look for large arrays containing encoded strings
  - Often named _0xNNNN or similar patterns
  - Usually defined at module/function scope
2. Find Decoder Functions: Locate functions that map indices to strings
  - Pattern: function _0xNNNN(index) { return array[index - offset]; }
  - May include string transformations (base64, rot13, etc.)
3. Trace High-Value API Calls: Focus on capability-relevant APIs even if obfuscated
  - Look for patterns like window[...] (DOM access)
  - Network APIs: fetch, XMLHttpRequest, .get, .post, .send
  - Crypto APIs: wallet-related strings in arrays
  - Environment: process, env, global object access
4. Extract String Literals: Analyze string array contents
  - Cryptocurrency addresses (bc1, 0x, etc.)
  - Domain names and URLs
  - API endpoint patterns
  - Wallet-related terms (ethereum, solana, bitcoin)
5. Infer Capabilities from Context: Even without full de-obfuscation
  - window[encoded](encoded_method) → likely DOM/browser API
  - Conditional checks + network → environment-gated behavior
  - Large encoded arrays + network → likely data exfiltration
Obfuscated Code Analysis Workflow
1. Identify obfuscation pattern (array + decoder function)
↓
2. Extract string array contents (literal strings)
↓
3. Search for security-relevant keywords:
- wallet, ethereum, solana, bitcoin, crypto
- fetch, XMLHttpRequest, request, http
- window, document, navigator
- process.env, os.environ
↓
4. Map API patterns to capabilities:
- window.ethereum → credential_access (wallet interaction)
- fetch/XHR → network.http_client
- Conditionals → conditional.environment_gated
↓
5. Describe capabilities with:
- confidence: LOW/MEDIUM (due to obfuscation)
- requires_dynamic_resolution: true
- evidence: String literals found in array
Example: Obfuscated Wallet Stealer
const _0x112fa8=_0x180f;
function _0x180f(_0x240418,_0xdfe6b8){
const _0x3b4f1d=_0x550a();
return _0x3b4f1d[_0x240418-0x100];
}
function _0x550a(){
return ['ethereum','solana','bitcoin','fetch','send'];
}
typeof window[_0x112fa8(0x100)]!='undefined'?checkWallet():skip();
Capabilities identified (even without full de-obfuscation):
- network.http_client (confidence: MEDIUM) - 'fetch', 'send' in string array
- credential_access (confidence: MEDIUM) - 'ethereum', 'solana', 'bitcoin' + window access
- conditional.environment_gated (confidence: HIGH) - typeof check for window
- requires_dynamic_resolution: true - Obfuscated control flow
Evidence: Lines where string array contains wallet-related terms, lines where window[encoded] pattern appears
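Steps 2 to 4 of the obfuscated-code workflow (extract string literals, search for keywords, map to capabilities) can be approximated without running the code. The sketch below is a simplified illustration: the keyword table is a small assumed subset, `credential_access` is used only because the example above uses that label, and the regular expression handles ordinary quoted literals, not template strings.

```python
import re

# Assumed keyword-to-capability table; a small subset chosen for illustration.
KEYWORD_CAPABILITIES = {
    "fetch": "network.http_client",
    "XMLHttpRequest": "network.http_client",
    "send": "network.http_client",
    "ethereum": "credential_access",
    "solana": "credential_access",
    "bitcoin": "credential_access",
}

# Matches single- or double-quoted literals, allowing backslash escapes.
STRING_LITERAL = re.compile(r"""(['"])((?:\\.|(?!\1).)*)\1""")


def scan_string_literals(js_source: str) -> dict:
    """Group keyword hits found in string literals by capability."""
    hits = {}
    for match in STRING_LITERAL.finditer(js_source):
        literal = match.group(2)
        capability = KEYWORD_CAPABILITIES.get(literal)
        if capability:
            hits.setdefault(capability, []).append(literal)
    return hits
```

Hits from a scan like this are evidence for a MEDIUM or LOW confidence observation with `requires_dynamic_resolution: true`, since a string in an array does not show that the corresponding API is ever called.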
Confidence Levels for Obfuscated Code
- HIGH confidence: When string literals directly indicate capabilities (e.g., "https://evil.com" in array)
- MEDIUM confidence: When API patterns are recognizable despite obfuscation
- LOW confidence: When only structural patterns suggest capabilities
Always mark: requires_dynamic_resolution: true for heavily obfuscated code
Analysis Workflow
1. Extract capabilities (use extract_capabilities tool)
  - Get diff-attributed inventory
  - Note phase, origin, confidence for each
2. Compute capability delta
  - Focus on origin: "added"
  - Background context: origin: "preexisting"
3. Describe affordances (use counterfactual framework)
  - What becomes possible that wasn't before?
  - How do capabilities compose?
  - What phase do they execute in?
4. Assess intent alignment (compare to package purpose)
  - Do capabilities match stated purpose?
  - Is there an alignment gap?
5. (Optional) Check execution paths (use analyze_execution_paths)
  - How might capabilities connect?
  - What data flows are possible?
6. (Optional) Find examples (use search_capability_examples)
  - Has overlap occurred before?
  - What context do historical cases provide?
7. Render analysis (describe, don't classify)
  - Enumerate capabilities with evidence
  - Describe potential implications
  - State confidence and uncertainty
  - Provide context
Output Format
Your analysis should describe what is, not what it means:
✅ Good Output Format
Capability Delta:
- Added: network.http_client (line 15, hunk 2, confidence: HIGH, phase: import_time)
- Added: environment.read_wholesale (line 12, hunk 2, confidence: HIGH, phase: runtime)
- Added: conditional.environment_gated (line 11, hunk 2, confidence: HIGH, condition: os.environ.get('CI'))
Evidence Spans:
- File: main.py, hunk 2, lines 11-15 (new code added in this version)
Composition:
- Observed path: env_read → network (lines 12-15)
- Conditional: Only when os.environ.get('CI') is truthy
- Phase: import_time network setup, runtime execution
Affordance Description:
- This combination creates a channel for environment variable transmission
- Activation is selective (CI environments only)
- All environment variables are accessible (wholesale access)
- Executes when package is imported (phase.import_time for network import)
Intent Alignment:
- Package purpose: "Simple date formatting utility"
- Implied capabilities: String manipulation, date parsing
- Observed capabilities: Network transmission, environment access
- Gap: Network and environment capabilities not implied by "date formatting"
Uncertainty:
- Confidence (observation): HIGH - Direct evidence in added hunks
- Requires dynamic resolution: false - All capabilities statically observable
- Alternative interpretation: None identified
Historical Context (from search_capability_examples):
- Overlap exists with ctx-2021: env.read_wholesale + network.http_client
- Caution: Overlap provides context only, not classification
❌ Bad Output Format
VERDICT: CRITICAL RISK - Malicious credential theft detected
CONFIDENCE: 95%
Pattern match: ctx attack (similarity: 0.95)
BLOCK DEPLOYMENT
Key Constraints
- No autonomous conclusions: Tools surface data, YOU reason
- No risk scoring: Describe implications, don't score them
- No classification: Enumerate capabilities, don't label "malicious/benign"
- Pattern examples are explanatory: Historical overlap provides context, not verdicts
- Confidence is about observation: How certain are you about what code does, not what it "is"
- Diff-scoped attribution: Link capabilities to specific hunks where possible
- Phase-aware analysis: Always note when code executes (install/import/build/runtime)
- Context budget: Log any escalation beyond default context
This Framework Defines Your Dataset Labels
Direct mapping to annotation schema:
- capability_delta[] - List of added/removed capabilities
- trigger_surface[] - Conditional execution patterns
- phase_delta[] - Changes in execution phase
- alignment_gap - Qualitative intent mismatch description
- confidence_obs - HIGH/MEDIUM/LOW per capability
- evidence_span - Localization for each capability
- requires_dynamic_resolution - Static/dynamic analysis boundary
This Framework Is
✅ A capability vocabulary
✅ A reasoning scaffold
✅ An annotation ontology
✅ A dataset labeling schema
✅ A reviewer-legible explanation layer
This Framework Is NOT
❌ A malware detector
❌ A rules engine
❌ A source of truth
❌ A substitute for reasoning
❌ A pattern matching system