| name | idalib-analysis |
| description | Analyze binaries using IDA Pro's Python API (idalib) in headless mode. Use when examining program structure, functions, disassembly, cross-references, or strings without the GUI. |
IDA Pro Headless Analysis with idalib
Use this skill to analyze binary files with IDA Pro's Python API in headless mode.
Setup
First, ensure IDA Pro is installed by running:
$CLAUDE_PROJECT_DIR/.claude/skills/idalib-analysis/scripts/install-ida.sh
Wait for the script to complete before proceeding. This may take a few minutes on first run.
Use the IDA Domain API
Always prefer the IDA Domain API over the legacy low-level IDA Python SDK. The Domain API provides a clean, Pythonic interface that is easier to use and understand.
Documentation Resources
| Resource | URL |
|---|---|
| LLM-optimized overview | https://ida-domain.docs.hex-rays.com/llms.txt |
| Getting Started | https://ida-domain.docs.hex-rays.com/getting_started/index.md |
| Examples | https://ida-domain.docs.hex-rays.com/examples/index.md |
| API Reference | https://ida-domain.docs.hex-rays.com/ref/{module}/index.md |
Available API modules: bytes, comments, database, entries, flowchart, functions, heads, hooks, instructions, names, operands, segments, signature_files, strings, types, xrefs
To fetch specific API documentation, use URLs like:
- https://ida-domain.docs.hex-rays.com/ref/functions/index.md - Function analysis API
- https://ida-domain.docs.hex-rays.com/ref/xrefs/index.md - Cross-reference API
- https://ida-domain.docs.hex-rays.com/ref/strings/index.md - String analysis API
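The URL pattern above can be wrapped in a small helper so that a typo in a module name is caught before fetching. This is a minimal sketch: `api_reference_url` is a hypothetical name, and the module list is copied from the list above rather than queried from the documentation site.

```python
# Module names copied from the "Available API modules" list above.
API_MODULES = (
    "bytes", "comments", "database", "entries", "flowchart", "functions",
    "heads", "hooks", "instructions", "names", "operands", "segments",
    "signature_files", "strings", "types", "xrefs",
)

BASE_URL = "https://ida-domain.docs.hex-rays.com"


def api_reference_url(module: str) -> str:
    """Return the API reference URL for one Domain API module (hypothetical helper)."""
    if module not in API_MODULES:
        raise ValueError(f"Unknown module {module!r}; expected one of {API_MODULES}")
    return f"{BASE_URL}/ref/{module}/index.md"
```

For example, `api_reference_url("xrefs")` builds the cross-reference URL listed above.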
Opening a Database
from ida_domain import Database
from ida_domain.database import IdaCommandOptions
# Open with auto-analysis enabled and save database for faster subsequent runs
ida_options = IdaCommandOptions(auto_analysis=True, new_database=False)
with Database.open("path/to/binary", ida_options, save_on_close=True) as db:
    # Your analysis here
    pass
# Database is automatically closed and saved
Key Database Properties
with Database.open(path, ida_options) as db:
    db.minimum_ea    # Start address
    db.maximum_ea    # End address
    db.metadata      # Database metadata
    db.architecture  # Target architecture
    db.functions     # All functions (iterable)
    db.strings       # All strings (iterable)
    db.segments      # Memory segments
    db.names         # Symbols and labels
    db.entries       # Entry points
    db.types         # Type definitions
    db.comments      # All comments
    db.xrefs         # Cross-reference utilities
    db.bytes         # Byte manipulation
    db.instructions  # Instruction access
Common Analysis Tasks
List functions:
for func in db.functions:
    name = db.functions.get_name(func)
    print(f"{hex(func.start_ea)}: {name} ({func.size} bytes)")
Get function disassembly and pseudocode:
func = next(f for f in db.functions if db.functions.get_name(f) == "main")
for line in db.functions.get_disassembly(func):
    print(line)
# Pseudocode requires Hex-Rays decompiler license - handle gracefully
try:
    for line in db.functions.get_pseudocode(func):
        print(line)
except RuntimeError as e:
    print(f"Decompilation unavailable: {e}")
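The try/except pattern above can be factored into a reusable helper. This is a sketch under the assumption stated in this document, namely that `get_pseudocode()` raises `RuntimeError` when the decompiler is unavailable; `listing_with_fallback` is a hypothetical name, and its `functions` argument is expected to behave like `db.functions`.

```python
def listing_with_fallback(functions, func):
    """Return (kind, lines), where kind is "pseudocode" or "disassembly".

    `functions` is duck-typed: it only needs get_pseudocode(func) and
    get_disassembly(func), with get_pseudocode raising RuntimeError when
    decompilation is not possible (assumption taken from the text above).
    """
    try:
        return "pseudocode", list(functions.get_pseudocode(func))
    except RuntimeError:
        # Decompiler missing or failed: fall back to assembly listing.
        return "disassembly", list(functions.get_disassembly(func))
```

Inside a database context this would be called as `kind, lines = listing_with_fallback(db.functions, func)`.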
Find strings:
for s in db.strings:
    print(f"{hex(s.address)}: {s}")
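Printing every string is rarely what you want on a large binary. The sketch below filters by keyword instead; `find_strings` is a hypothetical helper that assumes only what the loop above already relies on, that each item exposes `.address` and converts to text with `str()`.

```python
from typing import Iterable, Iterator, Tuple


def find_strings(strings: Iterable, keyword: str) -> Iterator[Tuple[int, str]]:
    """Yield (address, text) for each string whose text contains keyword.

    Matching is case-insensitive. Items are duck-typed: only `.address`
    and str(item) are used.
    """
    needle = keyword.lower()
    for item in strings:
        text = str(item)
        if needle in text.lower():
            yield item.address, text
```

Inside a database context, `for addr, text in find_strings(db.strings, "password"): ...` keeps the output small enough to read.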
Cross-references:
# References TO an address
for xref in db.xrefs.to_ea(target_addr):
    print(f"Referenced from {hex(xref.from_ea)} (type: {xref.type.name})")
# References FROM an address
for xref in db.xrefs.from_ea(source_addr):
    print(f"References {hex(xref.to_ea)}")
# Specific xref types
for xref in db.xrefs.calls_to_ea(func_addr):
    print(f"Called from {hex(xref.from_ea)}")
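When an address has many references, a per-type count is often more useful than the full list. A minimal sketch, assuming each xref exposes `.type.name` as in the first loop above; `count_xref_types` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable


def count_xref_types(xrefs: Iterable) -> Counter:
    """Count cross-references by the name of their type.

    Each xref is duck-typed: only `xref.type.name` is read.
    """
    return Counter(xref.type.name for xref in xrefs)
```

Inside a database context, `count_xref_types(db.xrefs.to_ea(target_addr)).most_common()` would give the counts in descending order.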
Read bytes:
byte_val = db.bytes.get_byte_at(addr)
dword_val = db.bytes.get_dword_at(addr)
disasm = db.bytes.get_disassembly_at(addr)
Analysis Methodology
Write and execute small, focused scripts rather than reading large amounts of data from the binary. This approach is more efficient and produces better results:
- Form a hypothesis about what you're looking for
- Design a script to gather the minimum data needed to test the hypothesis
- Execute the script and analyze the results
- Iterate based on findings
Example: Investigating a suspicious function
Instead of dumping all disassembly, write targeted scripts:
# Script 1: Find functions that reference interesting strings
from ida_domain import Database
from ida_domain.database import IdaCommandOptions
ida_options = IdaCommandOptions(auto_analysis=True, new_database=False)
with Database.open("sample.exe", ida_options, save_on_close=True) as db:
    for s in db.strings:
        if "password" in str(s).lower():
            print(f"\nString at {hex(s.address)}: {s}")
            for xref in db.xrefs.to_ea(s.address):
                print(f"  Referenced from {hex(xref.from_ea)}")

# Script 2: Analyze a specific function found in Script 1
# (run as a separate script, so it repeats the imports and options)
from ida_domain import Database
from ida_domain.database import IdaCommandOptions
ida_options = IdaCommandOptions(auto_analysis=True, new_database=False)
with Database.open("sample.exe", ida_options, save_on_close=True) as db:
    target_addr = 0x401234  # Address from previous script
    for func in db.functions:
        # Find the function whose address range contains the target
        if func.start_ea <= target_addr < func.end_ea:
            print(f"Function: {db.functions.get_name(func)}")
            print(f"Signature: {db.functions.get_signature(func)}")
            # Try pseudocode first (requires Hex-Rays license)
            try:
                print("\nPseudocode:")
                for line in db.functions.get_pseudocode(func):
                    print(f"  {line}")
            except RuntimeError:
                # Fall back to disassembly if decompiler unavailable
                print("\nDisassembly (decompiler unavailable):")
                for line in db.functions.get_disassembly(func):
                    print(f"  {line}")
            break
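Script 2 locates the enclosing function with a range check inside the loop. If several scripts need the same lookup, it can be pulled out as below. This is only a sketch: `find_containing_function` is a hypothetical helper, and it assumes function objects expose `start_ea` and `end_ea` with `end_ea` exclusive, as the comparison in Script 2 implies.

```python
from typing import Iterable, Optional


def find_containing_function(functions: Iterable, addr: int) -> Optional[object]:
    """Return the first function whose [start_ea, end_ea) range contains addr.

    Returns None when no function covers the address, so callers can
    report a bad address instead of silently printing nothing.
    """
    for func in functions:
        if func.start_ea <= addr < func.end_ea:
            return func
    return None
```

Inside a database context, `func = find_containing_function(db.functions, 0x401234)` followed by an explicit `if func is None:` check makes the "address is not inside any function" case visible.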
Performance Tips
- Enable auto_analysis=True on first open to let IDA analyze the binary
- Use save_on_close=True to persist the analysis database (.idb/.i64)
- Subsequent opens are faster because analysis results are cached in the .idb
- Write focused scripts that gather specific data rather than iterating over everything
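To tell whether a run will hit the cached database or trigger a full analysis, a script can look for an existing database file first. The sketch below is an assumption-laden illustration: `find_saved_database` is a hypothetical helper, and because the exact file name IDA chooses is not stated here, it checks both the appended form (`sample.exe.i64`) and the replaced-extension form (`sample.i64`).

```python
from pathlib import Path
from typing import Optional


def find_saved_database(binary_path: str) -> Optional[Path]:
    """Return the path of an existing .i64/.idb next to the binary, or None.

    Assumption: the database sits alongside the binary and is named either
    <binary><ext> or <binary stem><ext>. Both layouts are checked.
    """
    binary = Path(binary_path)
    for ext in (".i64", ".idb"):
        for candidate in (Path(str(binary) + ext), binary.with_suffix(ext)):
            if candidate.is_file():
                return candidate
    return None
```

A `None` result suggests the first, slower run is still ahead, which is a good moment to confirm `save_on_close=True` is set.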
Troubleshooting
- Check /tmp/claude-idalib.log for installation and setup issues
- Database files (.idb/.i64) are created alongside the binary
- If imports fail, verify IDA Pro is installed and IDADIR is set
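The checks above can be gathered into one diagnostic call. This is a minimal sketch using only the standard library; `check_ida_environment` is a hypothetical helper, and the default log path is the one named in the list above.

```python
import os
from pathlib import Path


def check_ida_environment(log_path: str = "/tmp/claude-idalib.log",
                          tail_lines: int = 20) -> dict:
    """Collect basic diagnostics for a failing idalib setup.

    Reports whether IDADIR is set, whether it points at a directory,
    and the last few lines of the install log if the log exists.
    """
    idadir = os.environ.get("IDADIR")
    log = Path(log_path)
    tail = []
    if log.is_file():
        tail = log.read_text(errors="replace").splitlines()[-tail_lines:]
    return {
        "idadir_set": bool(idadir),
        "idadir_exists": bool(idadir) and Path(idadir).is_dir(),
        "log_tail": tail,
    }
```

Printing the returned dictionary before importing `ida_domain` shows at a glance whether the failure is an environment problem or an installation problem.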
Decompilation Not Working
Pseudocode/decompilation requires a Hex-Rays decompiler license, which is separate from the IDA Pro base license. If get_pseudocode() or get_microcode() fails with RuntimeError, check the license status:
import ida_hexrays
# Check if decompiler is available
def is_decompiler_available(db):
    """Check if Hex-Rays decompiler is licensed and available.

    `db` is an open ida_domain Database; the first function in it is
    used for a test decompilation.
    """
    if not ida_hexrays.init_hexrays_plugin():
        return False
    # Try a test decompilation - MERR_LICENSE (-23) means no license
    for func in db.functions:
        hf = ida_hexrays.hexrays_failure_t()
        cfunc = ida_hexrays.decompile(func.start_ea, hf)
        if cfunc:
            return True
        # Error code -23 is MERR_LICENSE
        if hf.code == -23:
            return False
        # Only the first function is tested
        break
    return False
Error codes reference:
- MERR_LICENSE (-23): No valid Hex-Rays decompiler license
- MERR_ONLY32 (-24): 32-bit decompiler not available (need hexx86 plugin)
- MERR_ONLY64 (-25): 64-bit decompiler not available (need hexx64 plugin)
Workaround when decompilation is unavailable: Use disassembly analysis instead - the get_disassembly() method always works and provides assembly-level insight.
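For readable log output, the error codes listed above can be kept in a lookup table. A minimal sketch: `describe_hexrays_error` is a hypothetical helper, and the table holds only the three codes from this document, so any other code is reported as unknown rather than guessed.

```python
# Codes and meanings copied from the error codes reference above.
HEXRAYS_ERRORS = {
    -23: ("MERR_LICENSE", "No valid Hex-Rays decompiler license"),
    -24: ("MERR_ONLY32", "32-bit decompiler not available (need hexx86 plugin)"),
    -25: ("MERR_ONLY64", "64-bit decompiler not available (need hexx64 plugin)"),
}


def describe_hexrays_error(code: int) -> str:
    """Format a Hex-Rays failure code as 'NAME (code): meaning'."""
    name, meaning = HEXRAYS_ERRORS.get(
        code, ("UNKNOWN", "not listed in this document")
    )
    return f"{name} ({code}): {meaning}"
```

With a `hexrays_failure_t` named `hf`, `describe_hexrays_error(hf.code)` turns the numeric code into a message suitable for printing.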
Exploring the API at Runtime
When the documentation doesn't answer your question, explore the API directly:
import inspect
from ida_domain import Database
from ida_domain.functions import Functions
# List all public methods on a class
for name, method in inspect.getmembers(Functions, predicate=inspect.isfunction):
    if not name.startswith('_'):
        print(f"{name}: {inspect.signature(method)}")
# Get docstring for a specific method
print(Functions.get_pseudocode.__doc__)
# Within a database context, explore available attributes
with Database.open(path, ida_options) as db:
    # List all database properties
    print([attr for attr in dir(db) if not attr.startswith('_')])
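When a class has many methods, searching by keyword is quicker than reading the full listing. The sketch below builds on the same `inspect` calls; `find_methods` is a hypothetical helper that works on any class, so it can be tried without IDA installed.

```python
import inspect


def find_methods(cls, keyword: str = "") -> dict:
    """Map public method names containing keyword to their signatures.

    An empty keyword matches every public method. Matching is
    case-insensitive; names starting with an underscore are skipped.
    """
    needle = keyword.lower()
    return {
        name: str(inspect.signature(member))
        for name, member in inspect.getmembers(cls, predicate=inspect.isfunction)
        if not name.startswith("_") and needle in name.lower()
    }
```

For instance, `find_methods(Functions, "pseudo")` would narrow the listing to decompiler-related methods on the `Functions` class imported above.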
Legacy API (Avoid)
The legacy idc, idautils, and ida_funcs APIs still work but are harder to use. Prefer the Domain API for new analysis scripts, and fall back to the legacy APIs only when the Domain API does not expose the functionality you need.