Reverse Engineering Fundamentals.
Quick-reference for disassemblers, debuggers, and the signatures to look for first — including an ARM reference for analysts moving from x86 and the modern browser as an attack surface.
Tool selection
- Ghidra. Free, decompiler-first workflow, scriptable in Python/Java. Best default for malware analysis and CTF.
- IDA Pro. Commercial, gold standard for large binaries, best graph view, FLIRT signatures for library identification. Hex-Rays decompiler still tops Ghidra's on complex code.
- Binary Ninja. Affordable, ILs (LLIL/MLIL/HLIL) are excellent for analysis automation. Headless API for batch processing.
- radare2 / rizin. Free, scriptable, CLI-first. Steep learning curve. Use when you need to script per-instruction analysis over thousands of binaries.
- x64dbg. Free Windows debugger. Default for unpacking malware on Windows.
- WinDbg. Microsoft's debugger. Required for kernel debugging and dump analysis. Modern WinDbg has a reasonable UI.
- gdb + GEF or pwndbg. Linux debugger with security-focused extensions. Standard for binary CTFs.
- rr. Time-travel debugger for Linux. Record once, run forwards and backwards. Game-changer for hard bugs.
First-look triage of an unknown binary
- File type. file unknown.bin → format + arch.
- Section anomalies. Ghidra/IDA section listing. .text very small + huge .data section = likely packer. Unusual section names (e.g., UPX0/UPX1) = packed.
- Imports. nm -D / dumpbin /imports. CryptDecrypt + VirtualAlloc + WriteProcessMemory = unpacker stub. Curl + json + base64 = exfil. Minimal imports + LoadLibrary/GetProcAddress = dynamic resolution to hide intent.
- Strings. strings -n 7 for ASCII; strings -el for UTF-16. Look for URLs, file paths, error messages, version strings.
- Entropy. binwalk -E. Sections at >7.0 entropy (bits per byte, max 8.0) = encrypted/compressed. A packed binary shows a low-entropy stub at the entry point, then the high-entropy region it unpacks.
- Packer signatures. Detect-It-Easy / PEiD. UPX → upx -d unpacks it. Themida/VMProtect = expect weeks of manual unpacking.
- Anti-debug telltales. IsDebuggerPresent, CheckRemoteDebuggerPresent, PEB BeingDebugged read, INT 3 scan over own code, timing checks (RDTSC delta).
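The entropy and strings checks above can be approximated in a few lines of Python when binwalk or strings is not at hand. This is a minimal sketch, not a replacement for those tools: the function names, the 4096-byte window, and the restriction of UTF-16 matching to the printable ASCII range are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def high_entropy_windows(data: bytes, window: int = 4096, threshold: float = 7.0):
    """Yield (offset, entropy) for fixed-size windows above the threshold.

    The window size is an illustrative choice; windows shorter than
    256 bytes cannot reach 8.0 bits per byte.
    """
    for off in range(0, len(data), window):
        h = shannon_entropy(data[off:off + window])
        if h > threshold:
            yield off, h


def ascii_strings(data: bytes, min_len: int = 7) -> list[str]:
    """Printable ASCII runs, roughly what `strings -n 7` reports."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def utf16le_strings(data: bytes, min_len: int = 7) -> list[str]:
    """UTF-16LE runs limited to printable ASCII, roughly `strings -el`."""
    pattern = rb"(?:[\x20-\x7e]\x00){%d,}" % min_len
    return [m.decode("utf-16-le") for m in re.findall(pattern, data)]
```

Reading the sample with open(path, "rb").read() and passing the bytes to high_entropy_windows gives a rough map of where a packed payload may sit; a low-entropy region followed by a long high-entropy run matches the stub-then-payload layout described above.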
ARM for x86 analysts
- Registers. ARM64: x0–x30 (x30=LR), SP, PC. ARM32: r0–r12, r13=SP, r14=LR, r15=PC. Calling convention: x0–x7 args (ARM64), r0–r3 (ARM32). Return in x0/r0.
- Endianness. ARM defaults to little-endian; some embedded targets are big-endian. Confirm with the ELF header (EI_DATA byte; readelf -h prints it as "Data").
- Common patterns.
stp x29, x30, [sp, #-16]! is the function prologue (save FP + LR). ldp x29, x30, [sp], #16 is the epilogue. bl func is a call (branch-and-link, sets LR). ret is a return (branch to LR).
- Syscalls. Linux ARM64: syscall number in x8, args in x0–x5, svc #0. macOS ARM64 uses a different convention (syscall number in x16, svc #0x80); consult /usr/share/man/man2 for the calls themselves.
- PAC (Pointer Authentication). ARMv8.3+. Return addresses signed before push, verified before use. Bypass: signing gadget (rare), forge via leak of signing key (keys are not readable from user space, so this effectively requires a kernel compromise), pivot away from signed returns to BR/BLR-style indirect branches (BTI may also be enabled).
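The prologue, epilogue, call, and return patterns above have fixed A64 encodings, so they can be spotted in raw code bytes without a disassembler. The sketch below assumes little-endian, 4-byte-aligned AArch64 code; the classify helper and its labels are hypothetical names. It matches only the exact #-16 / #16 frame forms quoted above, and real compilers often emit larger frame sizes, so treat a miss as inconclusive.

```python
import struct

# Fixed A64 encodings for the exact instructions quoted in the list above.
STP_FP_LR_PRE = 0xA9BF7BFD   # stp x29, x30, [sp, #-16]!
LDP_FP_LR_POST = 0xA8C17BFD  # ldp x29, x30, [sp], #16
RET_LR = 0xD65F03C0          # ret (branch to x30)


def classify(code: bytes, base: int = 0) -> list[tuple[int, str]]:
    """Label each aligned 32-bit word that matches a known pattern.

    `base` is the address the first byte of `code` would be loaded at.
    """
    hits = []
    for off in range(0, len(code) - 3, 4):
        (word,) = struct.unpack_from("<I", code, off)
        addr = base + off
        if word == STP_FP_LR_PRE:
            hits.append((addr, "prologue"))
        elif word == LDP_FP_LR_POST:
            hits.append((addr, "epilogue"))
        elif word == RET_LR:
            hits.append((addr, "ret"))
        elif word >> 26 == 0b100101:      # BL: opcode 100101 + imm26
            imm26 = word & 0x03FFFFFF
            if imm26 & 0x02000000:        # sign-extend the 26-bit offset
                imm26 -= 0x04000000
            # Target is PC-relative, in units of 4-byte instructions.
            hits.append((addr, f"bl {addr + imm26 * 4:#x}"))
    return hits
```

Run over a stripped binary's code section, the prologue hits give candidate function starts and the bl targets give a rough call graph, which is often enough to decide where to point the decompiler first.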
Browser as attack surface
- Process model. Chrome: one browser process + N renderer processes (sandboxed) + GPU + network process. V8 engine in renderer. Firefox similar with content processes. Safari multi-process with WebKit.
- Sandbox boundary. Renderer can't open files, network sockets directly; talks to broker via IPC (Mojo in Chrome). Bypass = sandbox escape via IPC vuln or via a browser-process-side bug.
- JIT engines. V8 TurboFan, JavaScriptCore FTL, SpiderMonkey IonMonkey. Common vuln class: type confusion via incorrect optimization assumption. JIT-spray for code-cache write/exec primitive.
- Mitigations. Site isolation (Chrome): separate renderer process per site (scheme + registrable domain), so Spectre-class leaks cannot reach cross-site data. CFI for indirect calls. JIT-isolation (renderer can't read/write JIT pages).
- Historic CVE clusters. V8 TurboFan type confusion from incorrect optimization assumptions (CVE-2018-17463, CVE-2020-6418). WebKit use-after-free (CVE-2021-30858). Use-after-free in DOM event-handler tear-down.
Rule of thumb
For malware analysis, run the binary in a snapshot VM with Wireshark + Sysmon first, then reverse what you saw. Pure-static analysis on packed/obfuscated samples takes 10x longer than dynamic-then-static. For exploit analysis, do the reverse: go dynamic only after you understand the static structure.
From reference to evidence