Preface |
|
xv | |
1 Computer Abstractions and Technology |
|
2 | (72) |
|
|
3 | (7) |
|
|
10 | (3) |
|
|
13 | (13) |
|
|
26 | (13) |
|
|
39 | (2) |
|
1.6 The Sea Change: The Switch from Uniprocessors to Multiprocessors |
|
|
41 | (3) |
|
1.7 Real Stuff: Manufacturing and Benchmarking the AMD Opteron X4 |
|
|
44 | (7) |
|
1.8 Fallacies and Pitfalls |
|
|
51 | (3) |
|
|
54 | (1) |
|
1.10 Historical Perspective and Further Reading |
|
|
55 | (1) |
|
|
56 | (18) |
2 Instructions: Language of the Computer |
|
74 | (148) |
|
|
76 | (1) |
|
2.2 Operations of the Computer Hardware |
|
|
77 | (3) |
|
2.3 Operands of the Computer Hardware |
|
|
80 | (7) |
|
2.4 Signed and Unsigned Numbers |
|
|
87 | (7) |
|
2.5 Representing Instructions in the Computer |
|
|
94 | (8) |
|
|
102 | (3) |
|
2.7 Instructions for Making Decisions |
|
|
105 | (7) |
|
2.8 Supporting Procedures in Computer Hardware |
|
|
112 | (10) |
|
2.9 Communicating with People |
|
|
122 | (6) |
|
2.10 MIPS Addressing for 32-Bit Immediates and Addresses |
|
|
128 | (9) |
|
2.11 Parallelism and Instructions: Synchronization |
|
|
137 | (2) |
|
2.12 Translating and Starting a Program |
|
|
139 | (10) |
|
2.13 A C Sort Example to Put It All Together |
|
|
149 | (8) |
|
2.14 Arrays versus Pointers |
|
|
157 | (4) |
|
2.15 Advanced Material: Compiling C and Interpreting Java |
|
|
161 | (1) |
|
2.16 Real Stuff: ARM Instructions |
|
|
161 | (4) |
|
2.17 Real Stuff: x86 Instructions |
|
|
165 | (9) |
|
2.18 Fallacies and Pitfalls |
|
|
174 | (2) |
|
|
176 | (3) |
|
2.20 Historical Perspective and Further Reading |
|
|
179 | (1) |
|
|
179 | (43) |
3 Arithmetic for Computers |
|
222 | (76) |
|
|
224 | (1) |
|
3.2 Addition and Subtraction |
|
|
224 | (6) |
|
|
230 | (6) |
|
|
236 | (6) |
|
|
242 | (28) |
|
3.6 Parallelism and Computer Arithmetic: Associativity |
|
|
270 | (2) |
|
3.7 Real Stuff: Floating Point in the x86 |
|
|
272 | (3) |
|
3.8 Fallacies and Pitfalls |
|
|
275 | (5) |
|
|
280 | (3) |
|
3.10 Historical Perspective and Further Reading |
|
|
283 | (1) |
|
|
283 | (15) |
4 The Processor |
|
298 | (152) |
|
|
300 | (3) |
|
4.2 Logic Design Conventions |
|
|
303 | (4) |
|
|
307 | (9) |
|
4.4 A Simple Implementation Scheme |
|
|
316 | (14) |
|
4.5 An Overview of Pipelining |
|
|
330 | (14) |
|
4.6 Pipelined Datapath and Control |
|
|
344 | (19) |
|
4.7 Data Hazards: Forwarding versus Stalling |
|
|
363 | (12) |
|
|
375 | (9) |
|
|
384 | (7) |
|
4.10 Parallelism and Advanced Instruction-Level Parallelism |
|
|
391 | (13) |
|
4.11 Real Stuff: the AMD Opteron X4 (Barcelona) Pipeline |
|
|
404 | (2) |
|
4.12 Advanced Topic: an Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations |
|
|
406 | (1) |
|
4.13 Fallacies and Pitfalls |
|
|
407 | (1) |
|
|
408 | (1) |
|
4.15 Historical Perspective and Further Reading |
|
|
409 | (1) |
|
|
409 | (41) |
5 Large and Fast: Exploiting Memory Hierarchy |
|
450 | (118) |
|
|
452 | (5) |
|
|
457 | (18) |
|
5.3 Measuring and Improving Cache Performance |
|
|
475 | (17) |
|
|
492 | (26) |
|
5.5 A Common Framework for Memory Hierarchies |
|
|
518 | (7) |
|
|
525 | (4) |
|
5.7 Using a Finite-State Machine to Control a Simple Cache |
|
|
529 | (5) |
|
5.8 Parallelism and Memory Hierarchies: Cache Coherence |
|
|
534 | (4) |
|
5.9 Advanced Material: Implementing Cache Controllers |
|
|
538 | (1) |
|
5.10 Real Stuff: the AMD Opteron X4 (Barcelona) and Intel Nehalem Memory Hierarchies |
|
|
539 | (4) |
|
5.11 Fallacies and Pitfalls |
|
|
543 | (4) |
|
|
547 | (1) |
|
5.13 Historical Perspective and Further Reading |
|
|
548 | (1) |
|
|
548 | (20) |
6 Storage and Other I/O Topics |
|
568 | (62) |
|
|
570 | (3) |
|
6.2 Dependability, Reliability, and Availability |
|
|
573 | (2) |
|
|
575 | (5) |
|
|
580 | (2) |
|
6.5 Connecting Processors, Memory, and I/O Devices |
|
|
582 | (4) |
|
6.6 Interfacing I/O Devices to the Processor, Memory, and Operating System |
|
|
586 | (10) |
|
6.7 I/O Performance Measures: Examples from Disk and File Systems |
|
|
596 | (2) |
|
6.8 Designing an I/O System |
|
|
598 | (1) |
|
6.9 Parallelism and I/O: Redundant Arrays of Inexpensive Disks |
|
|
599 | (7) |
|
6.10 Real Stuff: Sun Fire x4150 Server |
|
|
606 | (6) |
|
6.11 Advanced Topics: Networks |
|
|
612 | (1) |
|
6.12 Fallacies and Pitfalls |
|
|
613 | (4) |
|
|
617 | (1) |
|
6.14 Historical Perspective and Further Reading |
|
|
618 | (1) |
|
|
619 | (11) |
7 Multicores, Multiprocessors, and Clusters |
|
630 | |
|
|
632 | (2) |
|
7.2 The Difficulty of Creating Parallel Processing Programs |
|
|
634 | (4) |
|
7.3 Shared Memory Multiprocessors |
|
|
638 | (3) |
|
7.4 Clusters and Other Message-Passing Multiprocessors |
|
|
641 | (4) |
|
7.5 Hardware Multithreading |
|
|
645 | (3) |
|
7.6 SISD, MIMD, SIMD, SPMD, and Vector |
|
|
648 | (6) |
|
7.7 Introduction to Graphics Processing Units |
|
|
654 | (6) |
|
7.8 Introduction to Multiprocessor Network Topologies |
|
|
660 | (4) |
|
7.9 Multiprocessor Benchmarks |
|
|
664 | (3) |
|
7.10 Roofline: A Simple Performance Model |
|
|
667 | (8) |
|
7.11 Real Stuff: Benchmarking Four Multicores Using the Roofline Model |
|
|
675 | (9) |
|
7.12 Fallacies and Pitfalls |
|
|
684 | (2) |
|
|
686 | (2) |
|
7.14 Historical Perspective and Further Reading |
|
|
688 | (1) |
|
|
688 | |
Appendices |
|
|
A Graphics and Computing GPUs |
|
|
A-2 | |
|
|
A-3 | |
|
A.2 GPU System Architectures |
|
|
A-7 | |
|
|
A-12 | |
|
A.4 Multithreaded Multiprocessor Architecture |
|
|
A-25 | |
|
A.5 Parallel Memory System |
|
|
A-36 | |
|
A.6 Floating Point Arithmetic |
|
|
A-41 | |
|
A.7 Real Stuff: The NVIDIA GeForce 8800 |
|
|
A-46 | |
|
A.8 Real Stuff: Mapping Applications to GPUs |
|
|
A-55 | |
|
A.9 Fallacies and Pitfalls |
|
|
A-72 | |
|
|
A-76 | |
|
A.11 Historical Perspective and Further Reading |
|
|
A-77 | |
|
B Assemblers, Linkers, and the SPIM Simulator |
|
|
B-2 | |
|
|
B-3 | |
|
|
B-10 | |
|
|
B-18 | |
|
|
B-19 | |
|
|
B-20 | |
|
B.6 Procedure Call Convention |
|
|
B-22 | |
|
B.7 Exceptions and Interrupts |
|
|
B-33 | |
|
|
B-38 | |
|
|
B-40 | |
|
B.10 MIPS R2000 Assembly Language |
|
|
B-45 | |
|
|
B-81 | |
|
|
B-82 | |
Index |
|
I-1 | |