
Reverse Engineering Fundamentals

Quick reference for disassemblers, debuggers, and the signatures to look for first, plus an ARM reference for analysts moving from x86 and notes on the modern browser as an attack surface.

Tool selection

Ghidra. Free, decompiler-first workflow, scriptable in Python/Java. Best default for malware analysis and CTF.
IDA Pro. Commercial, gold standard for large binaries, best graph view, FLIRT signatures for library identification. Hex-Rays decompiler still tops Ghidra's on complex code.
Binary Ninja. Affordable, ILs (LLIL/MLIL/HLIL) are excellent for analysis automation. Headless API for batch processing.
radare2 / rizin. Free, scriptable, CLI-first. Steep learning curve. Use when you need to script per-instruction analysis over thousands of binaries.
x64dbg. Free Windows debugger. Default for unpacking malware on Windows.
WinDbg. Microsoft's debugger. Required for kernel debugging and crash-dump analysis. The modern WinDbg has a reasonable UI.
gdb + GEF or pwndbg. Linux debugger with security-focused extensions. Standard for binary CTFs.
rr. Time-travel debugger for Linux. Record once, run forwards and backwards. Game-changer for hard bugs.

First-look triage of an unknown binary

File type. file unknown.bin → format + arch.
Section anomalies. Check the Ghidra/IDA section listing. A very small .text with a huge .data section suggests a packer. Nonstandard section names (e.g., UPX0/UPX1) point to a specific packer.
Imports. nm -D (ELF) / dumpbin /imports (PE). CryptDecrypt + VirtualAlloc + WriteProcessMemory suggests an unpacker or injection stub. An HTTP client (e.g., libcurl) + JSON + base64 routines suggest exfiltration. Minimal imports + LoadLibrary/GetProcAddress = dynamic resolution to hide intent.
Strings. strings -n 7 for ASCII; strings -el for UTF-16. URLs, file paths, error messages, version strings.
Entropy. binwalk -E. Sections above roughly 7.0 bits per byte are likely encrypted or compressed. A packed binary typically shows a low-entropy stub at the entry point followed by the high-entropy region it unpacks.
Packer signatures. Detect-It-Easy / PEiD. UPX: upx -d usually unpacks it, unless the headers have been tampered with. Themida/VMProtect: expect manual unpacking to take weeks.
Anti-debug telltales. IsDebuggerPresent, CheckRemoteDebuggerPresent, direct reads of the PEB BeingDebugged flag, scans of the binary's own code for INT 3 (0xCC) breakpoints, timing checks (RDTSC delta).
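
The Strings and Entropy steps above can be approximated in a few lines when the usual tools are not at hand. The following is a minimal sketch, not a replacement for strings or binwalk: the function names, the 4096-byte window, and the printable-ASCII definition of a "string" are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(data: bytes, window: int = 4096):
    """Yield (offset, entropy) for consecutive fixed-size windows."""
    for off in range(0, len(data), window):
        yield off, shannon_entropy(data[off:off + window])


def ascii_strings(data: bytes, min_len: int = 7):
    """Printable ASCII runs, similar in spirit to `strings -n 7`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def utf16le_strings(data: bytes, min_len: int = 7):
    """Printable ASCII characters stored as UTF-16LE, similar to `strings -el`."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


def suspicious_windows(data: bytes, threshold: float = 7.0, window: int = 4096):
    """Offsets of windows whose entropy exceeds the threshold."""
    return [off for off, h in windowed_entropy(data, window) if h > threshold]
```

Reading a sample with `open(path, "rb").read()` and passing the bytes to `suspicious_windows` gives a rough map of where compressed or encrypted regions start. Small windows inflate noise, so the 7.0 threshold is only meaningful for windows of a few kilobytes or more.
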

ARM for x86 analysts

Registers. ARM64: x0–x30 (x29=FP, x30=LR), SP, PC (not accessible as a general-purpose register). ARM32: r0–r12, r13=SP, r14=LR, r15=PC. Calling convention: x0–x7 args (ARM64), r0–r3 (ARM32). Return in x0/r0.
Endianness. ARM defaults to little-endian; some embedded targets run big-endian. Confirm from the file header (for ELF, the EI_DATA byte of e_ident).
Common patterns.
- stp x29, x30, [sp, #-16]! — function prologue (save FP + LR).
- ldp x29, x30, [sp], #16 — epilogue.
- bl func — call (branch-and-link, sets LR).
- ret — return (branch to LR).
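
As a rough illustration of how these patterns look in raw bytes, the sketch below matches the exact encodings of the four instructions listed above in a little-endian AArch64 code buffer and decodes BL targets. The function names are hypothetical, and the exact-match constants only catch the 16-byte frame form shown here; compilers often emit other frame sizes, so a real workflow would use a disassembler such as Capstone.

```python
import struct
from typing import List, Optional, Tuple

# Fixed encodings of the exact instruction forms listed above.
STP_FP_LR_PRE = 0xA9BF7BFD   # stp x29, x30, [sp, #-16]!
LDP_FP_LR_POST = 0xA8C17BFD  # ldp x29, x30, [sp], #16
RET_LR = 0xD65F03C0          # ret (branch to x30)
BL_MASK = 0xFC000000         # top 6 bits select the BL opcode
BL_OPCODE = 0x94000000


def classify(word: int, pc: int) -> Optional[str]:
    """Label one 32-bit instruction word, or return None if unrecognised."""
    if word == STP_FP_LR_PRE:
        return "prologue"
    if word == LDP_FP_LR_POST:
        return "epilogue"
    if word == RET_LR:
        return "ret"
    if (word & BL_MASK) == BL_OPCODE:
        imm26 = word & 0x03FFFFFF
        if imm26 & 0x02000000:        # sign-extend the 26-bit offset
            imm26 -= 0x04000000
        return f"bl 0x{pc + imm26 * 4:x}"  # offset is in units of 4 bytes
    return None


def scan(code: bytes, base: int = 0) -> List[Tuple[int, str]]:
    """Walk 4-byte aligned words and collect (address, label) hits."""
    hits = []
    for off in range(0, len(code) - 3, 4):
        (word,) = struct.unpack_from("<I", code, off)
        label = classify(word, base + off)
        if label is not None:
            hits.append((base + off, label))
    return hits
```

Called as `scan(text_bytes, base=load_address)`, this gives a quick list of candidate function boundaries and call edges. Data embedded in code sections can produce false positives, since any word with the BL opcode bits will match.
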
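Syscalls. Linux ARM64: syscall number in x8, args in x0–x5, svc #0. macOS ARM64 differs: syscall number in x16, svc #0x80; consult the man pages under /usr/share/man/man2.
PAC (Pointer Authentication). ARMv8.3+. Return addresses are signed in the prologue before being spilled to the stack (paciasp) and verified in the epilogue before use (autiasp or retaa). Bypass options: find a signing gadget (rare); forge a signature from a leaked key (the keys live in system registers that user mode cannot read, so this needs a kernel compromise); or pivot away from signed returns to BR/BLR-style indirect branches (BTI may also be enabled and restrict landing sites).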
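
For ELF targets, the header check mentioned under Endianness can be sketched as below. This is a minimal sketch that reads only the first 20 bytes of the file; the function name and the small machine table are assumptions for illustration, and the table covers just a few common e_machine values.

```python
import struct

EI_CLASS, EI_DATA = 4, 5      # indices into e_ident
E_MACHINE_OFFSET = 18         # e_ident (16 bytes) + e_type (2 bytes)

# A few common e_machine values; anything else is reported as a number.
MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def elf_arch(header: bytes) -> dict:
    """Return word size, byte order, and machine from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    bits = {1: 32, 2: 64}.get(header[EI_CLASS])
    endian = {1: "little", 2: "big"}.get(header[EI_DATA])
    if bits is None or endian is None:
        raise ValueError("invalid EI_CLASS or EI_DATA")
    fmt = "<H" if endian == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, E_MACHINE_OFFSET)
    return {
        "bits": bits,
        "endian": endian,
        "machine": MACHINES.get(machine, f"unknown ({machine})"),
    }
```

Typical use is `elf_arch(open(path, "rb").read(20))`. Firmware blobs without an ELF wrapper carry no such header, so endianness there has to be inferred from the code itself.
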

Browser as attack surface

Process model. Chrome: one browser process + N renderer processes (sandboxed) + GPU + network process. V8 engine in renderer. Firefox similar with content processes. Safari multi-process with WebKit.
Sandbox boundary. The renderer cannot open files or network sockets directly; it talks to the broker via IPC (Mojo in Chrome). Bypass = sandbox escape via an IPC vulnerability or another browser-process-side bug.
JIT engines. V8 TurboFan, JavaScriptCore FTL, SpiderMonkey IonMonkey. Common vuln class: type confusion via an incorrect optimization assumption. JIT spraying abuses attacker-influenced constants in JIT-compiled code to place usable byte sequences in executable memory.
Mitigations. Site isolation (Chrome): a separate renderer process per site (scheme plus registrable domain, which is coarser than an origin) to defeat Spectre-class leaks. CFI for indirect calls. JIT isolation (renderer can't read/write JIT pages).
Historic CVE clusters. V8 TurboFan type confusion from incorrect side-effect or type modeling (CVE-2018-17463, CVE-2020-6418). WebKit use-after-free (CVE-2021-30858). Use-after-free in DOM event-handler tear-down.

Rule of thumb

For malware analysis, run the binary in a snapshot VM with Wireshark + Sysmon first, then reverse what you saw. Pure-static analysis on packed/obfuscated samples can take roughly 10x longer than dynamic-then-static. For exploit analysis, do the reverse: go dynamic only after you understand the static structure.
