Foreword |
|
xvii | |
|
Preface |
|
xxi | |
Acknowledgments |
|
xxiii | |
Introduction |
|
1 | (1) |
|
What Is Binary Analysis, and Why Do You Need It? |
|
|
2 | (1) |
|
What Makes Binary Analysis Challenging? |
|
|
3 | (1) |
|
Who Should Read This Book? |
|
|
4 | (1) |
|
|
4 | (1) |
|
|
5 | (1) |
|
Instruction Set Architecture |
|
|
6 | (1) |
|
|
6 | (1) |
|
Binary Format and Development Platform |
|
|
6 | (1) |
|
Code Samples and Virtual Machine |
|
|
7 | (1) |
|
|
8 | (2) |
|
|
10 | (1) |
|
|
11 | (20) |
|
1.1 The C Compilation Process |
|
|
12 | (6) |
|
1.1.1 The Preprocessing Phase |
|
|
12 | (2) |
|
1.1.2 The Compilation Phase |
|
|
14 | (2) |
|
|
16 | (1) |
|
|
17 | (1) |
|
1.2 Symbols and Stripped Binaries |
|
|
18 | (3) |
|
1.2.1 Viewing Symbolic Information |
|
|
18 | (2) |
|
1.2.2 Another Binary Turns to the Dark Side: Stripping a Binary |
|
|
20 | (1) |
|
1.3 Disassembling a Binary |
|
|
21 | (6) |
|
1.3.1 Looking Inside an Object File |
|
|
21 | (2) |
|
1.3.2 Examining a Complete Binary Executable |
|
|
23 | (4) |
|
1.4 Loading and Executing a Binary |
|
|
27 | (2) |
|
|
29 | (1) |
|
|
29 | (2) |
|
|
31 | (26) |
|
2.1 The Executable Header |
|
|
33 | (5) |
|
|
34 | (1) |
|
2.1.2 The e_type, e_machine, and e_version Fields |
|
|
35 | (1) |
|
|
36 | (1) |
|
2.1.4 The e_phoff and e_shoff Fields |
|
|
36 | (1) |
|
|
36 | (1) |
|
|
37 | (1) |
|
2.1.7 The e_*entsize and e_*num Fields |
|
|
37 | (1) |
|
2.1.8 The e_shstrndx Field |
|
|
37 | (1) |
|
|
38 | (3) |
|
|
39 | (1) |
|
|
39 | (1) |
|
|
40 | (1) |
|
2.2.4 The sh_addr, sh_offset, and sh_size Fields |
|
|
40 | (1) |
|
|
40 | (1) |
|
|
41 | (1) |
|
2.2.7 The sh_addralign Field |
|
|
41 | (1) |
|
2.2.8 The sh_entsize Field |
|
|
41 | (1) |
|
|
41 | (11) |
|
2.3.1 The .init and .fini Sections |
|
|
43 | (1) |
|
|
43 | (1) |
|
2.3.3 The .bss, .data, and .rodata Sections |
|
|
44 | (1) |
|
2.3.4 Lazy Binding and the .plt, .got, and .got.plt Sections |
|
|
45 | (3) |
|
2.3.5 The .rel.* and .rela.* Sections |
|
|
48 | (2) |
|
2.3.6 The .dynamic Section |
|
|
50 | (1) |
|
2.3.7 The .init_array and .fini_array Sections |
|
|
51 | (1) |
|
2.3.8 The .shstrtab, .symtab, .strtab, .dynsym, and .dynstr Sections |
|
|
52 | (1) |
|
|
52 | (3) |
|
|
54 | (1) |
|
|
54 | (1) |
|
2.4.3 The p_offset, p_vaddr, p_paddr, p_filesz, and p_memsz Fields |
|
|
54 | (1) |
|
|
55 | (1) |
|
|
55 | (1) |
|
|
3 | (54) |
|
3 The PE Format: A Brief Introduction |
|
|
57 | (10) |
|
3.1 The MS-DOS Header and MS-DOS Stub |
|
|
58 | (1) |
|
3.2 The PE Signature, File Header, and Optional Header |
|
|
58 | (4) |
|
|
61 | (1) |
|
|
61 | (1) |
|
3.2.3 The PE Optional Header |
|
|
62 | (1) |
|
3.3 The Section Header Table |
|
|
62 | (1) |
|
|
63 | (2) |
|
3.4.1 The .edata and .idata Sections |
|
|
64 | (1) |
|
3.4.2 Padding in PE Code Sections |
|
|
64 | (1) |
|
|
65 | (1) |
|
|
65 | (2) |
|
4 Building A Binary Loader Using libbfd |
|
|
67 | (22) |
|
|
68 | (1) |
|
4.2 A Simple Binary-Loading Interface |
|
|
68 | (4) |
|
|
71 | (1) |
|
|
71 | (1) |
|
|
71 | (1) |
|
4.3 Implementing the Binary Loader |
|
|
72 | (11) |
|
4.3.1 Initializing libbfd and Opening a Binary |
|
|
73 | (2) |
|
4.3.2 Parsing Basic Binary Properties |
|
|
75 | (3) |
|
|
78 | (3) |
|
|
81 | (2) |
|
4.4 Testing the Binary Loader |
|
|
83 | (2) |
|
|
85 | (1) |
|
|
86 | (3) |
|
PART II BINARY ANALYSIS FUNDAMENTALS |
|
|
|
5 Basic Binary Analysis In Linux |
|
|
89 | (26) |
|
5.1 Resolving Identity Crises Using file |
|
|
90 | (3) |
|
5.2 Using ldd to Explore Dependencies |
|
|
93 | (1) |
|
5.3 Viewing File Contents with xxd |
|
|
94 | (2) |
|
5.4 Parsing the Extracted ELF with readelf |
|
|
96 | (3) |
|
5.5 Parsing Symbols with nm |
|
|
99 | (3) |
|
5.6 Looking for Hints with strings |
|
|
102 | (2) |
|
5.7 Tracing System Calls and Library Calls with strace and ltrace |
|
|
104 | (5) |
|
5.8 Examining Instruction-Level Behavior Using objdump |
|
|
109 | (2) |
|
5.9 Dumping a Dynamic String Buffer Using gdb |
|
|
111 | (2) |
|
|
113 | (1) |
|
|
113 | (2) |
|
6 Disassembly And Binary Analysis Fundamentals |
|
|
115 | (40) |
|
|
116 | (6) |
|
|
117 | (1) |
|
6.1.2 Recursive Disassembly |
|
|
118 | (4) |
|
|
122 | (7) |
|
6.2.1 Example: Tracing a Binary Execution with gdb |
|
|
122 | (3) |
|
6.2.2 Code Coverage Strategies |
|
|
125 | (4) |
|
6.3 Structuring Disassembled Code and Data |
|
|
129 | (12) |
|
|
129 | (7) |
|
|
136 | (2) |
|
|
138 | (1) |
|
6.3.4 Intermediate Representations |
|
|
139 | (2) |
|
6.4 Fundamental Analysis Methods |
|
|
141 | (11) |
|
6.4.1 Binary Analysis Properties |
|
|
142 | (4) |
|
6.4.2 Control-Flow Analysis |
|
|
146 | (2) |
|
|
148 | (4) |
|
6.5 Effects of Compiler Settings on Disassembly |
|
|
152 | (1) |
|
|
153 | (1) |
|
|
153 | (2) |
|
7 Simple Code Injection Techniques For ELF |
|
|
155 | (36) |
|
7.1 Bare-Metal Binary Modification Using Hex Editing |
|
|
155 | (8) |
|
7.1.1 Observing an Off-by-One Bug in Action |
|
|
156 | (3) |
|
7.1.2 Fixing the Off-by-One Bug |
|
|
159 | (4) |
|
7.2 Modifying Shared Library Behavior Using LD_PRELOAD |
|
|
163 | (6) |
|
7.2.1 A Heap Overflow Vulnerability |
|
|
163 | (2) |
|
7.2.2 Detecting the Heap Overflow |
|
|
165 | (4) |
|
7.3 Injecting a Code Section |
|
|
169 | (6) |
|
7.3.1 Injecting an ELF Section: A High-Level Overview |
|
|
169 | (2) |
|
7.3.2 Using elfinject to Inject an ELF Section |
|
|
171 | (4) |
|
7.4 Calling Injected Code |
|
|
175 | (12) |
|
7.4.1 Entry Point Modification |
|
|
176 | (3) |
|
7.4.2 Hijacking Constructors and Destructors |
|
|
179 | (3) |
|
7.4.3 Hijacking GOT Entries |
|
|
182 | (3) |
|
7.4.4 Hijacking PLT Entries |
|
|
185 | (1) |
|
7.4.5 Redirecting Direct and Indirect Calls |
|
|
186 | (1) |
|
|
187 | (1) |
|
|
187 | (4) |
|
PART III ADVANCED BINARY ANALYSIS |
|
|
|
8 Customizing Disassembly |
|
|
191 | (32) |
|
8.1 Why Write a Custom Disassembly Pass? |
|
|
192 | (4) |
|
8.1.1 A Case for Custom Disassembly: Obfuscated Code |
|
|
192 | (3) |
|
8.1.2 Other Reasons to Write a Custom Disassembler |
|
|
195 | (1) |
|
8.2 Introduction to Capstone |
|
|
196 | (17) |
|
8.2.1 Installing Capstone |
|
|
196 | (2) |
|
8.2.2 Linear Disassembly with Capstone |
|
|
198 | (5) |
|
8.2.3 Exploring the Capstone C API |
|
|
203 | (1) |
|
8.2.4 Recursive Disassembly with Capstone |
|
|
204 | (9) |
|
8.3 Implementing a ROP Gadget Scanner |
|
|
213 | (8) |
|
8.3.1 Introduction to Return-Oriented Programming |
|
|
213 | (2) |
|
8.3.2 Finding ROP Gadgets |
|
|
215 | (6) |
|
|
221 | (1) |
|
|
221 | (2) |
|
|
223 | (42) |
|
9.1 What Is Binary Instrumentation? |
|
|
224 | (2) |
|
9.1.1 Binary Instrumentation APIs |
|
|
224 | (1) |
|
9.1.2 Static vs. Dynamic Binary Instrumentation |
|
|
225 | (1) |
|
9.2 Static Binary Instrumentation |
|
|
226 | (7) |
|
|
227 | (1) |
|
9.2.2 The Trampoline Approach |
|
|
228 | (5) |
|
9.3 Dynamic Binary Instrumentation |
|
|
233 | (4) |
|
9.3.1 Architecture of a DBI System |
|
|
233 | (2) |
|
9.3.2 Introduction to Pin |
|
|
235 | (2) |
|
|
237 | (14) |
|
9.4.1 The Profiler's Data Structures and Setup Code |
|
|
237 | (3) |
|
9.4.2 Parsing Function Symbols |
|
|
240 | (1) |
|
9.4.3 Instrumenting Basic Blocks |
|
|
241 | (2) |
|
9.4.4 Instrumenting Control Flow Instructions |
|
|
243 | (3) |
|
9.4.5 Counting Instructions, Control Transfers, and Syscalls |
|
|
246 | (1) |
|
9.4.6 Testing the Profiler |
|
|
247 | (4) |
|
9.5 Automatic Binary Unpacking with Pin |
|
|
251 | (12) |
|
9.5.1 Introduction to Executable Packers |
|
|
251 | (2) |
|
9.5.2 The Unpacker's Data Structures and Setup Code |
|
|
253 | (2) |
|
9.5.3 Instrumenting Memory Writes |
|
|
255 | (1) |
|
9.5.4 Instrumenting Control-Flow Instructions |
|
|
256 | (1) |
|
9.5.5 Tracking Memory Writes |
|
|
256 | (2) |
|
9.5.6 Detecting the Original Entry Point and Dumping the Unpacked Binary |
|
|
258 | (1) |
|
9.5.7 Testing the Unpacker |
|
|
259 | (4) |
|
|
263 | (1) |
|
|
264 | (1) |
|
10 Principles Of Dynamic Taint Analysis |
|
|
265 | (14) |
|
|
266 | (1) |
|
10.2 DTA in Three Steps: Taint Sources, Taint Sinks, and Taint Propagation |
|
|
266 | (2) |
|
10.2.1 Defining Taint Sources |
|
|
266 | (1) |
|
10.2.2 Defining Taint Sinks |
|
|
267 | (1) |
|
10.2.3 Tracking Taint Propagation |
|
|
267 | (1) |
|
10.3 Using DTA to Detect the Heartbleed Bug |
|
|
268 | (3) |
|
10.3.1 A Brief Overview of the Heartbleed Vulnerability |
|
|
268 | (1) |
|
10.3.2 Detecting Heartbleed Through Tainting |
|
|
269 | (2) |
|
10.4 DTA Design Factors: Taint Granularity, Taint Colors, and Taint Policies |
|
|
271 | (7) |
|
|
271 | (1) |
|
|
272 | (1) |
|
10.4.3 Taint Propagation Policies |
|
|
273 | (1) |
|
10.4.4 Overtainting and Undertainting |
|
|
274 | (1) |
|
10.4.5 Control Dependencies |
|
|
275 | (1) |
|
|
276 | (2) |
|
|
278 | (1) |
|
|
278 | (1) |
|
11 Practical Dynamic Taint Analysis With libdft |
|
|
279 | (30) |
|
|
279 | (4) |
|
11.1.1 Internals of libdft |
|
|
280 | (2) |
|
|
282 | (1) |
|
11.2 Using DTA to Detect Remote Control-Hijacking |
|
|
283 | (13) |
|
11.2.1 Checking Taint Information |
|
|
286 | (2) |
|
11.2.2 Taint Sources: Tainting Received Bytes |
|
|
288 | (2) |
|
11.2.3 Taint Sinks: Checking execve Arguments |
|
|
290 | (1) |
|
11.2.4 Detecting a Control-Flow Hijacking Attempt |
|
|
291 | (5) |
|
11.3 Circumventing DTA with Implicit Flows |
|
|
296 | (1) |
|
11.4 A DTA-Based Data Exfiltration Detector |
|
|
297 | (10) |
|
11.4.1 Taint Sources: Tracking Taint for Open Files |
|
|
299 | (4) |
|
11.4.2 Taint Sinks: Monitoring Network Sends for Data Exfiltration |
|
|
303 | (1) |
|
11.4.3 Detecting a Data Exfiltration Attempt |
|
|
304 | (3) |
|
|
307 | (1) |
|
|
307 | (2) |
|
12 Principles Of Symbolic Execution |
|
|
309 | (24) |
|
12.1 An Overview of Symbolic Execution |
|
|
309 | (12) |
|
12.1.1 Symbolic vs. Concrete Execution |
|
|
310 | (3) |
|
12.1.2 Variants and Limitations of Symbolic Execution |
|
|
313 | (6) |
|
12.1.3 Increasing the Scalability of Symbolic Execution |
|
|
319 | (2) |
|
12.2 Constraint Solving with Z3 |
|
|
321 | (9) |
|
12.2.1 Proving Reachability of an Instruction |
|
|
322 | (3) |
|
12.2.2 Proving Unreachability of an Instruction |
|
|
325 | (1) |
|
12.2.3 Proving Validity of a Formula |
|
|
325 | (2) |
|
12.2.4 Simplifying Expressions |
|
|
327 | (1) |
|
12.2.5 Modeling Constraints for Machine Code with Bitvectors |
|
|
327 | (2) |
|
12.2.6 Solving an Opaque Predicate Over Bitvectors |
|
|
329 | (1) |
|
|
330 | (1) |
|
|
330 | (3) |
|
13 Practical Symbolic Execution With Triton |
|
|
333 | (40) |
|
13.1 Introduction to Triton |
|
|
334 | (1) |
|
13.2 Maintaining Symbolic State with Abstract Syntax Trees |
|
|
335 | (2) |
|
13.3 Backward Slicing with Triton |
|
|
337 | (9) |
|
13.3.1 Triton Header Files and Configuring Triton |
|
|
340 | (1) |
|
13.3.2 The Symbolic Configuration File |
|
|
340 | (2) |
|
13.3.3 Emulating Instructions |
|
|
342 | (1) |
|
13.3.4 Setting Triton's Architecture |
|
|
343 | (1) |
|
13.3.5 Computing the Backward Slice |
|
|
344 | (2) |
|
13.4 Using Triton to Increase Code Coverage |
|
|
346 | (9) |
|
13.4.1 Creating Symbolic Variables |
|
|
348 | (1) |
|
13.4.2 Finding a Model for a New Path |
|
|
348 | (4) |
|
13.4.3 Testing the Code Coverage Tool |
|
|
352 | (3) |
|
13.5 Automatically Exploiting a Vulnerability |
|
|
355 | (15) |
|
13.5.1 The Vulnerable Program |
|
|
356 | (3) |
|
13.5.2 Finding the Address of the Vulnerable Call Site |
|
|
359 | (2) |
|
13.5.3 Building the Exploit Generator |
|
|
361 | (6) |
|
13.5.4 Getting a Root Shell |
|
|
367 | (3) |
|
|
370 | (1) |
|
|
370 | (3) |
|
|
|
A A Crash Course On x86 Assembly |
|
|
373 | (18) |
|
A.1 Layout of an Assembly Program |
|
|
374 | (2) |
|
A.1.1 Assembly Instructions, Directives, Labels, and Comments |
|
|
374 | (1) |
|
A.1.2 Separation Between Code and Data |
|
|
375 | (1) |
|
A.1.3 AT&T vs. Intel Syntax |
|
|
376 | (1) |
|
A.2 Structure of an x86 Instruction |
|
|
376 | (4) |
|
A.2.1 Assembly-Level Representation of x86 Instructions |
|
|
376 | (1) |
|
A.2.2 Machine-Level Structure of x86 Instructions |
|
|
376 | (1) |
|
|
377 | (2) |
|
|
379 | (1) |
|
|
380 | (1) |
|
A.3 Common x86 Instructions |
|
|
380 | (3) |
|
A.3.1 Comparing Operands and Setting Status Flags |
|
|
382 | (1) |
|
A.3.2 Implementing System Calls |
|
|
382 | (1) |
|
A.3.3 Implementing Conditional Jumps |
|
|
382 | (1) |
|
A.3.4 Loading Memory Addresses |
|
|
383 | (1) |
|
A.4 Common Code Constructs in Assembly |
|
|
383 | (8) |
|
|
383 | (1) |
|
A.4.2 Function Calls and Function Frames |
|
|
384 | (4) |
|
A.4.3 Conditional Branches |
|
|
388 | (1) |
|
|
389 | (2) |
|
B Implementing PT_NOTE Overwriting Using libelf |
|
|
391 | (22) |
|
|
392 | (1) |
|
B.2 Data Structures Used in elfinject |
|
|
392 | (1) |
|
|
393 | (4) |
|
B.4 Getting the Executable Header |
|
|
397 | (1) |
|
B.5 Finding the PT_NOTE Segment |
|
|
398 | (1) |
|
B.6 Injecting the Code Bytes |
|
|
399 | (1) |
|
B.7 Aligning the Load Address for the Injected Section |
|
|
400 | (1) |
|
B.8 Overwriting the .note.ABI-tag Section Header |
|
|
401 | (5) |
|
B.9 Setting the Name of the Injected Section |
|
|
406 | (2) |
|
B.10 Overwriting the PT_NOTE Program Header |
|
|
408 | (2) |
|
B.11 Modifying the Entry Point |
|
|
410 | (3) |
|
C List Of Binary Analysis Tools |
|
|
413 | (4) |
|
|
413 | (2) |
|
|
415 | (1) |
|
C.3 Disassembly Frameworks |
|
|
415 | (1) |
|
C.4 Binary Analysis Frameworks |
|
|
416 | (1) |
|
|
417 | (4) |
|
D.1 Standards and References |
|
|
417 | (1) |
|
|
418 | (2) |
|
|
420 | (1) |
Index |
|
421 | |