Preface |
|
xv | |
|
Computer Abstractions and Technology |
|
|
2 | (58) |
|
|
3 | (8) |
|
1.2 Eight Great Ideas in Computer Architecture |
|
|
11 | (2) |
|
|
13 | (3) |
|
|
16 | (8) |
|
1.5 Technologies for Building Processors and Memory |
|
|
24 | (4) |
|
|
28 | (12) |
|
|
40 | (3) |
|
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors |
|
|
43 | (3) |
|
1.9 Real Stuff: Benchmarking the Intel Core i7 |
|
|
46 | (3) |
|
1.10 Fallacies and Pitfalls |
|
|
49 | (3) |
|
|
52 | (2) |
|
1.12 Historical Perspective and Further Reading |
|
|
54 | (1) |
|
|
54 | (6) |
|
Instructions: Language of the Computer |
|
|
60 | (116) |
|
|
62 | (1) |
|
2.2 Operations of the Computer Hardware |
|
|
63 | (3) |
|
2.3 Operands of the Computer Hardware |
|
|
66 | (7) |
|
2.4 Signed and Unsigned Numbers |
|
|
73 | (7) |
|
2.5 Representing Instructions in the Computer |
|
|
80 | (7) |
|
|
87 | (3) |
|
2.7 Instructions for Making Decisions |
|
|
90 | (6) |
|
2.8 Supporting Procedures in Computer Hardware |
|
|
96 | (10) |
|
2.9 Communicating with People |
|
|
106 | (5) |
|
2.10 MIPS Addressing for 32-Bit Immediates and Addresses |
|
|
111 | (10) |
|
2.11 Parallelism and Instructions: Synchronization |
|
|
121 | (1) |
|
2.12 Translating and Starting a Program |
|
|
121 | (11) |
|
2.13 A C Sort Example to Put It All Together |
|
|
132 | (9) |
|
2.14 Arrays versus Pointers |
|
|
141 | (4) |
|
2.15 Advanced Material: Compiling C and Interpreting Java |
|
|
145 | (1) |
|
2.16 Real Stuff: ARMv7 (32-bit) Instructions |
|
|
145 | (4) |
|
2.17 Real Stuff: x86 Instructions |
|
|
149 | (9) |
|
2.18 Real Stuff: ARMv8 (64-bit) Instructions |
|
|
158 | (1) |
|
2.19 Fallacies and Pitfalls |
|
|
159 | (2) |
|
|
161 | (2) |
|
2.21 Historical Perspective and Further Reading |
|
|
163 | (1) |
|
|
164 | (12) |
|
|
176 | (66) |
|
|
178 | (1) |
|
3.2 Addition and Subtraction |
|
|
178 | (5) |
|
|
183 | (6) |
|
|
189 | (7) |
|
|
196 | (26) |
|
3.6 Parallelism and Computer Arithmetic: Subword Parallelism |
|
|
222 | (2) |
|
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86 |
|
|
224 | (1) |
|
3.8 Going Faster: Subword Parallelism and Matrix Multiply |
|
|
225 | (4) |
|
3.9 Fallacies and Pitfalls |
|
|
229 | (3) |
|
|
232 | (4) |
|
3.11 Historical Perspective and Further Reading |
|
|
236 | (1) |
|
|
237 | (5) |
|
|
242 | (130) |
|
|
244 | (4) |
|
4.2 Logic Design Conventions |
|
|
248 | (3) |
|
|
251 | (8) |
|
4.4 A Simple Implementation Scheme |
|
|
259 | (13) |
|
4.5 An Overview of Pipelining |
|
|
272 | (14) |
|
4.6 Pipelined Datapath and Control |
|
|
286 | (17) |
|
4.7 Data Hazards: Forwarding versus Stalling |
|
|
303 | (13) |
|
|
316 | (9) |
|
|
325 | (7) |
|
4.10 Parallelism via Instructions |
|
|
332 | (12) |
|
4.11 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Pipelines |
|
|
344 | (7) |
|
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply |
|
|
351 | (3) |
|
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations |
|
|
354 | (1) |
|
4.14 Fallacies and Pitfalls |
|
|
355 | (1) |
|
|
356 | (1) |
|
4.16 Historical Perspective and Further Reading |
|
|
357 | (1) |
|
|
357 | (15) |
|
Large and Fast: Exploiting Memory Hierarchy |
|
|
372 | (128) |
|
|
374 | (4) |
|
|
378 | (5) |
|
|
383 | (15) |
|
5.4 Measuring and Improving Cache Performance |
|
|
398 | (20) |
|
5.5 Dependable Memory Hierarchy |
|
|
418 | (6) |
|
|
424 | (3) |
|
|
427 | (27) |
|
5.8 A Common Framework for Memory Hierarchy |
|
|
454 | (7) |
|
5.9 Using a Finite-State Machine to Control a Simple Cache |
|
|
461 | (5) |
|
5.10 Parallelism and Memory Hierarchies: Cache Coherence |
|
|
466 | (4) |
|
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks |
|
|
470 | (1) |
|
5.12 Advanced Material: Implementing Cache Controllers |
|
|
470 | (1) |
|
5.13 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies |
|
|
471 | (4) |
|
5.14 Going Faster: Cache Blocking and Matrix Multiply |
|
|
475 | (3) |
|
5.15 Fallacies and Pitfalls |
|
|
478 | (4) |
|
|
482 | (1) |
|
5.17 Historical Perspective and Further Reading |
|
|
483 | (1) |
|
|
483 | (17) |
|
Parallel Processors from Client to Cloud |
|
|
500 | |
|
|
502 | (2) |
|
6.2 The Difficulty of Creating Parallel Processing Programs |
|
|
504 | (5) |
|
6.3 SISD, MIMD, SIMD, SPMD, and Vector |
|
|
509 | (7) |
|
6.4 Hardware Multithreading |
|
|
516 | (3) |
|
6.5 Multicore and Other Shared Memory Multiprocessors |
|
|
519 | (5) |
|
6.6 Introduction to Graphics Processing Units |
|
|
524 | (7) |
|
6.7 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors |
|
|
531 | (5) |
|
6.8 Introduction to Multiprocessor Network Topologies |
|
|
536 | (3) |
|
6.9 Communicating to the Outside World: Cluster Networking |
|
|
539 | (1) |
|
6.10 Multiprocessor Benchmarks and Performance Models |
|
|
540 | (10) |
|
6.11 Real Stuff: Benchmarking Intel Core i7 versus NVIDIA Tesla GPU |
|
|
550 | (5) |
|
6.12 Going Faster: Multiple Processors and Matrix Multiply |
|
|
555 | (3) |
|
6.13 Fallacies and Pitfalls |
|
|
558 | (2) |
|
|
560 | (3) |
|
6.15 Historical Perspective and Further Reading |
|
|
563 | (1) |
|
|
563 | |
|
|
|
Assemblers, Linkers, and the SPIM Simulator |
|
|
2 | (1) |
|
|
3 | (7) |
|
|
10 | (8) |
|
|
18 | (1) |
|
|
19 | (1) |
|
|
20 | (2) |
|
A.6 Procedure Call Convention |
|
|
22 | (11) |
|
A.7 Exceptions and Interrupts |
|
|
33 | (5) |
|
|
38 | (2) |
|
|
40 | (5) |
|
A.10 MIPS R2000 Assembly Language |
|
|
45 | (36) |
|
|
81 | (1) |
|
|
82 | |
|
The Basics of Logic Design |
|
|
2 | |
|
|
3 | (1) |
|
B.2 Gates, Truth Tables, and Logic Equations |
|
|
4 | (5) |
|
|
9 | (11) |
|
B.4 Using a Hardware Description Language |
|
|
20 | (6) |
|
B.5 Constructing a Basic Arithmetic Logic Unit |
|
|
26 | (12) |
|
B.6 Faster Addition: Carry Lookahead |
|
|
38 | (10) |
|
|
48 | (2) |
|
B.8 Memory Elements: Flip-Flops, Latches, and Registers |
|
|
50 | (8) |
|
B.9 Memory Elements: SRAMs and DRAMs |
|
|
58 | (9) |
|
B.10 Finite State Machines |
|
|
67 | (5) |
|
B.11 Timing Methodologies |
|
|
72 | (6) |
|
B.12 Field Programmable Devices |
|
|
78 | (1) |
|
|
79 | (1) |
|
|
80 | |
Index |
|
1 | |