| name | malware-analysis |
| description | Professional malware analysis workflow for PE executables and suspicious files. Triggers on file uploads with requests like "analyze this malware", "analyze this sample", "what does this executable do", "check this file for malware", or any request to examine suspicious files. Performs static analysis, threat intelligence triage, behavioral inference, and produces analyst-grade reports with reasoned conclusions. |
Malware Analysis Skill
This skill produces analyst-grade threat reports — not data dumps. Every conclusion must be backed by evidence and reasoning.
Core Principles
- Evidence-based reasoning: Never state a conclusion without explaining WHY
- Connect the dots: Link indicators to behaviors to capabilities to impact
- Assess confidence: State how confident you are and why
- Actionable output: Reports should enable decisions, not just inform
Analysis Workflow
Step 1: Collect Data
Run all scripts to gather raw data:
# Static analysis - get hashes, PE info, strings, APIs, entropy
python3 scripts/static_analysis.py /path/to/sample -f json > static.json
# Threat intelligence - check reputation across sources
python3 scripts/triage.py -t file /path/to/sample -f json > triage.json
# IOC extraction - extract network/host indicators
python3 scripts/extract_iocs.py /path/to/sample -f json > iocs.json
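If you would rather drive the collection step from Python than from the shell, it might look like the sketch below. It assumes each script prints a single JSON document to stdout, which the `> file.json` redirects above imply. The helper names `build_commands` and `collect` are hypothetical and are not part of the skill's scripts.

```python
import json
import subprocess
from pathlib import Path


def build_commands(sample: str) -> dict[str, list[str]]:
    """Return the three Step 1 commands, keyed by their output file name."""
    return {
        "static.json": ["python3", "scripts/static_analysis.py", sample, "-f", "json"],
        "triage.json": ["python3", "scripts/triage.py", "-t", "file", sample, "-f", "json"],
        "iocs.json": ["python3", "scripts/extract_iocs.py", sample, "-f", "json"],
    }


def collect(sample: str, out_dir: str = ".") -> dict[str, object]:
    """Run each command, save its stdout, and return the parsed JSON.

    Assumption: every script writes exactly one JSON document to stdout.
    """
    results: dict[str, object] = {}
    for out_name, cmd in build_commands(sample).items():
        proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
        Path(out_dir, out_name).write_text(proc.stdout)
        results[out_name] = json.loads(proc.stdout)
    return results
```

`check=True` makes a failing script raise immediately, so a partial data set is never mistaken for a complete one.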
Step 2: Analyze and Reason (THIS IS THE KEY STEP)
Using the collected data, perform analyst-grade reasoning:
2.1 Threat Intelligence Assessment
Ask yourself:
- Is this sample known? A hit in MalwareBazaar or ThreatFox means the sample or its indicators have already been reported as malicious
- What's the VT detection rate?
  - 0 detections: New sample, false positive, or clean; requires behavioral analysis
  - 1-4 detections: Possibly a new variant or a targeted sample; suspicious
  - 5-14 detections: Flagged as malicious by multiple vendors
  - 15+ detections: Well-known malware
- What family is it attributed to? Research that family's typical behavior
- When was it first seen? A recent first-seen date suggests an active campaign
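The detection-rate bands can be written as a small lookup, sketched below. The function name `interpret_vt_detections` is hypothetical, and it assigns the boundary counts 5 and 15 to the higher band, which is an assumption about how the bands are meant to meet.

```python
def interpret_vt_detections(detections: int) -> str:
    """Map a VirusTotal detection count to the triage reading used in this skill.

    Assumption: boundary counts (5 and 15) belong to the higher band.
    """
    if detections < 0:
        raise ValueError("detection count cannot be negative")
    if detections == 0:
        return "new sample, false positive, or clean; requires behavioral analysis"
    if detections < 5:
        return "possibly a new variant or targeted sample; suspicious"
    if detections < 15:
        return "flagged as malicious by multiple vendors"
    return "well-known malware"
```

The returned text is a starting point for the written assessment. The report should still explain why the count supports the conclusion.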
Always explain your reasoning:
"This sample is identified as RedLine Stealer by MalwareBazaar with 45/70 VT detections. The high detection rate and presence in curated malware repositories confirm this is a known threat, not a false positive."
2.2 Behavioral Analysis from Static Indicators
API Analysis - Map APIs to behaviors:
| API Pattern | Likely Behavior | Reasoning |
|---|---|---|
| VirtualAlloc(Ex) + VirtualProtect(Ex) + WriteProcessMemory + CreateRemoteThread | Process Injection | This is the classic injection pattern: allocate memory (the Ex variants take a handle to the target process), make it executable, write code, execute in target |
| CredEnumerate, CryptUnprotectData | Credential Theft | These APIs specifically access Windows credential stores and DPAPI-protected data (browser passwords) |
| InternetOpen + URLDownloadToFile | Downloader | Initializes an internet session and downloads files to disk; classic downloader behavior |
| RegSetValueEx + Run key paths in strings | Persistence | Writing to Run keys ensures execution at startup |
| IsDebuggerPresent, GetTickCount, NtQuerySystemInformation | Anti-Analysis | Multiple evasion checks suggest the malware hides its behavior during analysis |
| CryptEncrypt + file enumeration APIs | Possible Ransomware | Encryption capability combined with file discovery — but could also be secure C2 |
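One way to apply the API-only rows of this table to an import list is sketched below. It is an illustration and does not describe how `static_analysis.py` works. The names `API_PATTERNS`, `ALIASES`, `normalize`, and `match_behaviors` are hypothetical. Stripping the ANSI/Unicode `A`/`W` suffix and folding `VirtualAllocEx`/`VirtualProtectEx` into their base names are simplifying assumptions. The persistence and ransomware rows are left out because they also depend on strings and file-enumeration context.

```python
API_PATTERNS = {
    "Process Injection": {"VirtualAlloc", "VirtualProtect", "WriteProcessMemory", "CreateRemoteThread"},
    "Credential Theft": {"CredEnumerate", "CryptUnprotectData"},
    "Downloader": {"InternetOpen", "URLDownloadToFile"},
    "Anti-Analysis": {"IsDebuggerPresent", "GetTickCount", "NtQuerySystemInformation"},
}

# Assumption: treat the remote-process variants as the same indicator.
ALIASES = {"VirtualAllocEx": "VirtualAlloc", "VirtualProtectEx": "VirtualProtect"}


def normalize(name: str) -> str:
    """Drop a trailing ANSI/Unicode suffix (InternetOpenA -> InternetOpen)."""
    if len(name) > 1 and name[-1] in "AW" and name[-2].islower():
        name = name[:-1]
    return ALIASES.get(name, name)


def match_behaviors(imports: list[str]) -> dict[str, list[str]]:
    """Return each behavior with the sorted list of its APIs found in imports."""
    seen = {normalize(name) for name in imports}
    matches = {}
    for behavior, apis in API_PATTERNS.items():
        hits = sorted(apis & seen)
        if hits:
            matches[behavior] = hits
    return matches
```

A match only nominates a behavior for review. A single hit such as `GetTickCount` is common in benign software, so the written reasoning still has to weigh the combination.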
Always explain your reasoning:
"The presence of VirtualAlloc, VirtualProtect, and CreateRemoteThread together strongly suggests process injection capability. Individually these APIs have legitimate uses, but this specific combination is the textbook pattern for injecting code into other processes."
Packing Analysis:
| Indicator | Meaning | Confidence |
|---|---|---|
| Entropy > 7.0 | Compressed/encrypted content | High |
| Section entropy > 7.0 (especially .text) | Packed code section | High |
| UPX0, UPX1, .aspack, .packed sections | Known packer signatures | Very High |
| RWX sections | Self-modifying code | Medium |
| Small import table with GetProcAddress/LoadLibrary only | Dynamic API resolution | High |
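The section-level and import-level indicators in this table can be checked mechanically, as in the sketch below. The `Section` record and the `packing_indicators` function are hypothetical, and the real JSON layout produced by `static_analysis.py` may differ. The whole-file entropy check is omitted; only per-section values are examined.

```python
from dataclasses import dataclass

KNOWN_PACKER_SECTIONS = {"UPX0", "UPX1", ".aspack", ".packed"}
LOADER_ONLY_PREFIXES = ("GetProcAddress", "LoadLibrary")


@dataclass
class Section:
    name: str
    entropy: float
    readable: bool
    writable: bool
    executable: bool


def packing_indicators(sections: list[Section], imports: list[str]) -> list[str]:
    """Return a human-readable finding for every packing indicator present."""
    found = []
    for s in sections:
        if s.name in KNOWN_PACKER_SECTIONS:
            found.append(f"known packer section name: {s.name}")
        if s.entropy > 7.0:
            found.append(f"high entropy section: {s.name} ({s.entropy:.1f})")
        if s.readable and s.writable and s.executable:
            found.append(f"RWX section: {s.name}")
    # An import table holding nothing but the loader APIs hints at dynamic resolution.
    if imports and all(name.startswith(LOADER_ONLY_PREFIXES) for name in imports):
        found.append("imports limited to GetProcAddress/LoadLibrary")
    return found
```

An empty result does not prove the sample is unpacked. It only means none of these specific indicators fired.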
If packed, state the implication:
"This sample shows multiple packing indicators (entropy 7.4, UPX sections). The static analysis findings represent the unpacker stub, NOT the actual payload. Dynamic analysis is required to reveal true functionality."
2.3 Capability Assessment
Based on the evidence, determine what the malware CAN DO:
| Capability | Required Evidence | Confidence Level |
|---|---|---|
| Process Injection | 2+ injection APIs | High if 3+, Medium if 2 |
| Credential Theft | Any cred access API | High (these are specific) |
| Keylogging | SetWindowsHookEx | Medium (has legit uses) |
| Network C2 | 2+ network APIs + extracted URLs/IPs | High |
| File Download | URLDownloadToFile or similar | High |
| Persistence | Registry/service APIs + relevant strings | Medium |
| Encryption/Ransomware | Crypto APIs + file enumeration | Medium (needs context) |
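Two rows of this table have thresholds concrete enough to sketch in code: process injection and network C2. The function names are hypothetical. `INJECTION_APIS` reuses the injection pattern from section 2.2, while `NETWORK_APIS` is an illustrative selection of WinINet, urlmon, and Winsock functions that the skill does not define itself.

```python
from typing import Iterable, Optional

INJECTION_APIS = frozenset(
    {"VirtualAlloc", "VirtualProtect", "WriteProcessMemory", "CreateRemoteThread"}
)
# Assumption: an illustrative set; extend it to match what the scripts report.
NETWORK_APIS = frozenset(
    {"InternetOpen", "InternetConnect", "HttpSendRequest", "URLDownloadToFile",
     "WSAStartup", "connect", "send", "recv"}
)


def injection_confidence(imports: Iterable[str]) -> Optional[str]:
    """High for 3+ injection APIs, Medium for exactly 2, otherwise no finding."""
    hits = len(INJECTION_APIS & set(imports))
    if hits >= 3:
        return "High"
    if hits == 2:
        return "Medium"
    return None


def c2_confidence(imports: Iterable[str], network_iocs: Iterable[str]) -> Optional[str]:
    """High only when 2+ network APIs coincide with extracted URLs or IPs."""
    hits = len(NETWORK_APIS & set(imports))
    if hits >= 2 and len(list(network_iocs)) > 0:
        return "High"
    return None
```

Returning `None` instead of "Low" keeps a missing finding distinct from a weak one, so the report only lists capabilities that have evidence behind them.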
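State confidence and reasoning:
"Credential Theft Capability: HIGH CONFIDENCE. CryptUnprotectData is present, which decrypts DPAPI-protected data, including browser-stored passwords. Legitimate software also calls this API to protect its own secrets, so the finding is strongest when it appears alongside strings that reference browser profile paths or credential stores."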
2.4 Risk Assessment
Determine risk level with justification:
| Risk Level | Criteria |
|---|---|
| CRITICAL | Credential theft APIs, process injection, confirmed malware family known for data theft/ransomware |
| HIGH | Multiple malicious capabilities, network C2, persistence mechanisms |
| MEDIUM | Suspicious indicators but no confirmed malicious capability, or packing hiding true behavior |
| LOW | Few indicators, possibly legitimate software with suspicious patterns |
| UNKNOWN | Insufficient evidence, heavily packed, or no TI hits |
Step 3: Write the Report
Structure your report as follows:
# Threat Analysis Report: [MALWARE_NAME or "Unknown Sample"]
| | |
|---|---|
| **Risk Level** | [CRITICAL/HIGH/MEDIUM/LOW/UNKNOWN] |
| **Confidence** | [High/Medium/Low] |
| **Analysis Date** | [DATE] |
---
## Executive Summary
[2-3 sentences: What is this? Is it malicious? What can it do? How do we know?]
**Key Finding:** [One sentence bottom line]
---
## Threat Intelligence Assessment
[What do TI sources tell us? Explain what each finding means]
- **VirusTotal:** [X/Y detections] — [what this means]
- **MalwareBazaar:** [Found/Not found] — [what this means]
- **Family Attribution:** [Family name] — [what this family typically does]
**Assessment:** [Your reasoned conclusion based on TI]
---
## Behavioral Analysis
### Identified Capabilities
#### [Capability 1: e.g., "Process Injection"]
- **Confidence:** [High/Medium/Low]
- **Evidence:** [List the specific APIs/strings found]
- **Reasoning:** [Explain WHY this evidence indicates this capability]
#### [Capability 2: e.g., "Credential Theft"]
...
### Packing Assessment
[Is it packed? What does this mean for the analysis?]
### Anti-Analysis Techniques
[What evasion techniques were identified?]
---
## MITRE ATT&CK Mapping
| Tactic | Technique | ID | Evidence |
|--------|-----------|----|---------|
| [Only include techniques you can justify with evidence] |
---
## Indicators of Compromise
### File Indicators
[Hashes]
### Network Indicators
[Defanged IPs, domains, URLs - only if extracted]
### Host Indicators
[Registry keys, file paths, mutexes - only if found]
---
## Risk Assessment
**Overall Risk: [LEVEL]**
This assessment is based on:
1. [Reason 1]
2. [Reason 2]
3. [Reason 3]
**Confidence in Assessment: [High/Medium/Low]**
- [Why this confidence level]
---
## Recommendations
### Immediate Actions
[What should be done RIGHT NOW based on risk level]
### Detection Opportunities
[How to detect this threat]
### Further Analysis Needed
[What questions remain unanswered]
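The Network Indicators section of the template asks for defanged values. A minimal sketch of one common convention follows: `http` becomes `hxxp` and every dot is bracketed. The function name `defang` is hypothetical, and teams differ on the exact convention, so treat this as one option and not a standard.

```python
import re


def defang(indicator: str) -> str:
    """Make a URL, domain, or IP non-clickable for inclusion in a report."""
    # Neutralize the scheme first so "http"/"https" become "hxxp"/"hxxps".
    out = re.sub(r"^http", "hxxp", indicator, flags=re.IGNORECASE)
    # Bracket every dot, including those in the path, to keep the rule simple.
    return out.replace(".", "[.]")
```

For example, `defang("http://evil.example.com/gate.php")` yields `hxxp://evil[.]example[.]com/gate[.]php`.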
Entropy Interpretation
| Entropy (bits per byte) | Meaning |
|---|---|
| 0-1 | Highly structured (empty, repetitive) |
| 4-5 | Plain text, readable strings |
| 5-6 | Compiled code (normal .text section) |
| 6-7 | Compressed data, some obfuscation |
| 7-8 | Encrypted/compressed (PACKED) |
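The values in this table are consistent with byte-level Shannon entropy, which ranges from 0 to 8 bits per byte. The sketch below shows that calculation. It assumes this is the definition in use and is not taken from `static_analysis.py`.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A buffer of one repeated byte scores 0, and a buffer in which all 256 byte values appear equally often scores 8. That is why well-compressed or encrypted sections approach the top of the scale.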
File Signatures
| Bytes | Type |
|---|---|
| 4D 5A (MZ) | PE executable |
| 50 4B (PK) | ZIP/Office document |
| 7F 45 4C 46 | ELF executable |
| D0 CF 11 E0 | OLE/Legacy Office |
| 25 50 44 46 (%PDF) | PDF document |
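Checking a file's leading bytes against these signatures takes only a few lines, as in the sketch below. The names `SIGNATURES` and `identify` are hypothetical. The last entry reads the bytes `25 50 44 46` as ASCII `%PDF`, the PDF header.

```python
SIGNATURES = [
    (bytes.fromhex("4D5A"), "PE executable"),
    (bytes.fromhex("504B"), "ZIP/Office document"),
    (bytes.fromhex("7F454C46"), "ELF executable"),
    (bytes.fromhex("D0CF11E0"), "OLE/Legacy Office"),
    (bytes.fromhex("25504446"), "PDF document"),
]


def identify(header: bytes) -> str:
    """Return the file type whose magic bytes prefix the header, else 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"
```

Read the first few bytes with `open(path, "rb").read(8)` and pass them in. The file extension plays no part in the check, which matters because samples are often renamed.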
Example Analysis Reasoning
BAD (data dump):
"Found APIs: VirtualAlloc, CreateRemoteThread, RegSetValueEx. Entropy: 7.2. VT: 34/70."
GOOD (analyst reasoning):
"This sample shows indicators of process injection (MEDIUM CONFIDENCE) based on the presence of VirtualAlloc and CreateRemoteThread. Used together, these APIs fit the classic code injection pattern in which memory is allocated and a thread is created in a target process to execute the injected code; only two of the pattern's APIs are visible, so confidence stays at medium. The high entropy (7.2) suggests the payload is packed, meaning the observed APIs may belong to the unpacker stub rather than the final payload. The 34/70 VirusTotal detection rate indicates this is recognized malware, with multiple vendors identifying it as a variant of Agent Tesla, an info-stealer known for credential harvesting. Given the injection indicators and the association with a credential-stealing family, this sample poses a CRITICAL risk to credential security on any system where it executes."
Scripts Reference
static_analysis.py
python3 scripts/static_analysis.py <file> -f [text|json]
Extracts: hashes, file type, PE headers, sections, entropy, imports, strings, suspicious indicators
triage.py
python3 scripts/triage.py <ioc> -f [text|json]
python3 scripts/triage.py -t file <filepath> -f json
python3 scripts/triage.py --status # Check API config
Queries: MalwareBazaar, ThreatFox, URLhaus, VirusTotal, AbuseIPDB
extract_iocs.py
python3 scripts/extract_iocs.py <file> -f [text|json|csv]
Extracts: IPs, domains, URLs, emails, hashes, registry keys, file paths, crypto wallets, mutexes