Preface |
|
xvii | |
Acknowledgements |
|
xxi | |
Author |
|
xxiii | |
|
Part I Fundamental Computer Organisation |
|
|
|
1 Computer and Its Environment |
|
|
3 | (30) |
|
|
3 | (1) |
|
1.2 Computer Organisation and Architecture |
|
|
4 | (1) |
|
1.3 Hardware and Software: An Introductory Concept |
|
|
5 | (2) |
|
1.4 Hardware and Software: Their Roles and Characteristics |
|
|
7 | (1) |
|
1.5 Evolution of Computers: Salient Milestones |
|
|
8 | (18) |
|
1.5.1 The Generation of Computers: Electronic Era |
|
|
8 | (1) |
|
1.5.1.1 Von Neumann Architecture: Stored-Program Concept |
|
|
8 | (2) |
|
1.5.1.2 Second-Generation Systems (1955--1965) |
|
|
10 | (1) |
|
1.5.1.3 Integrated Circuits (ICs) and Moore's Law |
|
|
10 | (1) |
|
1.5.1.4 Third-Generation Systems (1965--1971): The MSI Era |
|
|
11 | (3) |
|
1.5.1.5 Fourth-Generation Systems (1972--1978): The LSI Era |
|
|
14 | (2) |
|
1.5.1.6 Fifth-Generation Systems (1978--1991): The VLSI Era |
|
|
16 | (3) |
|
1.5.1.7 Sixth-Generation Systems (1991--Present): The ULSI Era |
|
|
19 | (7) |
|
1.5.1.8 Grand Challenges: Tomorrow's Microprocessors |
|
|
26 | (1) |
|
1.6 Evolution of Operating System and System Software: Their Roles |
|
|
26 | (3) |
|
1.6.1 Modern Operating Systems |
|
|
27 | (2) |
|
1.7 Genesis of Computer Organisation and Architecture |
|
|
29 | (1) |
|
|
29 | (4) |
|
|
30 | (2) |
|
Suggested References and Websites |
|
|
32 | (1) |
|
2 Computer System Organisation |
|
|
33 | (28) |
|
2.1 Modular Design Levels |
|
|
34 | (3) |
|
|
37 | (20) |
|
2.2.1 The Processor Level |
|
|
37 | (1) |
|
|
38 | (1) |
|
2.2.1.2 Performance and Related Factors |
|
|
39 | (1) |
|
|
40 | (1) |
|
2.2.1.4 Performance Assessment: A Rough Estimation |
|
|
41 | (2) |
|
2.2.1.5 Design Principles: CISC and RISC |
|
|
43 | (1) |
|
2.2.1.6 Speed-Up Approach |
|
|
43 | (1) |
|
2.2.1.7 Performance Measurements |
|
|
44 | (1) |
|
|
45 | (1) |
|
2.2.2.1 Combinational Components |
|
|
45 | (1) |
|
2.2.2.2 Sequential Components |
|
|
46 | (1) |
|
2.2.2.3 General Representation |
|
|
46 | (1) |
|
2.2.2.4 Combinational Circuits |
|
|
47 | (1) |
|
2.2.2.5 Sequential Circuits |
|
|
47 | (7) |
|
2.2.2.6 Tri-State Buffers |
|
|
54 | (1) |
|
|
55 | (1) |
|
2.2.3.1 Basic Memory Components: Latches and Flip-Flops |
|
|
55 | (1) |
|
2.2.4 Genesis of Digital Systems |
|
|
55 | (2) |
|
|
57 | (4) |
|
|
58 | (1) |
|
Suggested References and Websites |
|
|
59 | (2) |
|
3 Processor Basics -- Structure and Function |
|
|
61 | (46) |
|
|
61 | (1) |
|
3.2 Processor (CPU) Organisation |
|
|
62 | (2) |
|
3.2.1 Fundamental Concepts |
|
|
62 | (2) |
|
3.3 Register Organisation |
|
|
64 | (4) |
|
3.3.1 User-Accessible Registers |
|
|
64 | (1) |
|
3.3.2 Control and Status Registers |
|
|
65 | (1) |
|
3.3.3 Register Organisation in Microprocessor: IA-32/64 and MC68000 |
|
|
66 | (1) |
|
3.3.3.1 Motorola MC68000 Series |
|
|
66 | (1) |
|
3.3.3.2 Intel IA-32 Architecture |
|
|
67 | (1) |
|
3.3.3.3 Intel IA-64 Architecture |
|
|
68 | (1) |
|
|
68 | (1) |
|
3.5 Generalized Structure of CPU |
|
|
69 | (1) |
|
3.6 CPU Operation: Instruction Execution |
|
|
70 | (1) |
|
|
71 | (2) |
|
3.7.1 Machine Instruction Elements |
|
|
71 | (1) |
|
3.7.2 Instruction Formats and Design Criteria |
|
|
72 | (1) |
|
|
73 | (1) |
|
3.9 Intel X-86 (IA-32 and IA-64) Data Types |
|
|
74 | (1) |
|
3.10 Types of Instructions and Related Operations |
|
|
74 | (13) |
|
|
75 | (1) |
|
|
75 | (1) |
|
|
76 | (1) |
|
|
77 | (2) |
|
3.10.5 Input/Output (I/O) |
|
|
79 | (1) |
|
3.10.6 Transfer of Control |
|
|
79 | (1) |
|
3.10.6.1 Branch Instructions |
|
|
79 | (1) |
|
3.10.6.2 Skip Instruction |
|
|
80 | (1) |
|
3.10.6.3 Subroutine Call Instruction |
|
|
80 | (3) |
|
|
83 | (1) |
|
3.10.7.1 System Call versus Subroutine Call |
|
|
84 | (1) |
|
3.10.8 Other Operations: IA-32 Instruction Set |
|
|
84 | (1) |
|
3.10.8.1 MMX (Multimedia Extension) Operation |
|
|
85 | (1) |
|
3.10.8.2 Streaming SIMD Extension (SSE) |
|
|
86 | (1) |
|
3.11 Instruction Addressing Scheme |
|
|
87 | (1) |
|
|
88 | (10) |
|
3.13 Intel X-86 Addressing Modes: IA-32 and IA-64 |
|
|
98 | (2) |
|
3.14 Register-Organised CPU |
|
|
100 | (1) |
|
3.14.1 Accumulator-Based CPU (Single Accumulator Organisation) |
|
|
100 | (1) |
|
3.14.2 General Register-Organised CPU (Multiple Register) |
|
|
100 | (1) |
|
3.14.2.1 The Intel IA-32/IA-64 Architecture |
|
|
101 | (1) |
|
3.14.2.2 Register Organisation |
|
|
101 | (1) |
|
3.15 Stack-Organised CPU (A Stack Processor) |
|
|
101 | (2) |
|
3.15.1 Expression Evaluation and Reverse Polish Notation |
|
|
102 | (1) |
|
3.16 Stack-Organised Symbolic LISP Processor |
|
|
103 | (1) |
|
|
103 | (4) |
|
|
104 | (2) |
|
|
106 | (1) |
|
|
107 | (120) |
|
4.1 Memory System Overview |
|
|
108 | (5) |
|
4.1.1 Key Characteristics of the Memory System |
|
|
108 | (5) |
|
|
113 | (2) |
|
4.3 Semiconductor Main Memory |
|
|
115 | (25) |
|
4.3.1 Random-Access Memory (RAM) |
|
|
115 | (2) |
|
|
117 | (1) |
|
|
117 | (1) |
|
|
118 | (3) |
|
4.3.4.1 Schemes for Refreshing DRAM |
|
|
121 | (1) |
|
4.3.5 SRAM vs DRAM: A Rough Comparison |
|
|
121 | (1) |
|
|
122 | (1) |
|
|
122 | (2) |
|
4.3.6.2 2½D (Word-Oriented) Organisation |
|
|
124 | (2) |
|
4.3.7 Advanced DRAM Organisation |
|
|
126 | (1) |
|
4.3.7.1 SDRAMs (Synchronous DRAMs) |
|
|
126 | (2) |
|
|
128 | (1) |
|
4.3.7.3 Rambus DRAM (RDRAM) |
|
|
129 | (2) |
|
4.3.7.4 Cache DRAM (CDRAM) |
|
|
131 | (1) |
|
4.3.8 Other Types of Random-Access Semiconductor Memory |
|
|
132 | (1) |
|
4.3.8.1 ROM (Read-Only Memory) |
|
|
132 | (2) |
|
4.3.8.2 PROM (Programmable ROM) |
|
|
134 | (1) |
|
4.3.8.3 EPROM (Erasable PROM) |
|
|
135 | (1) |
|
4.3.8.4 EEPROMs (Electrically Erasable PROM) |
|
|
136 | (1) |
|
4.3.8.5 Flash Memory (Flash EEPROM) |
|
|
137 | (1) |
|
4.3.8.6 USB Flash Drive: Pen Drive |
|
|
138 | (1) |
|
4.3.9 RAM Module Organisation |
|
|
139 | (1) |
|
4.4 Serial-Access Memory: External Memory |
|
|
140 | (9) |
|
|
140 | (3) |
|
4.4.2 Rotating Memory (Disk) Organisation |
|
|
143 | (1) |
|
4.4.2.1 Read--Write Mechanism |
|
|
144 | (1) |
|
4.4.3 Device Controller: Rotating Memory |
|
|
144 | (1) |
|
|
144 | (2) |
|
4.4.4.1 Commodity Disk Considerations |
|
|
146 | (1) |
|
4.4.4.2 RAID: Redundant Array of Inexpensive Disks |
|
|
146 | (2) |
|
|
148 | (1) |
|
|
148 | (1) |
|
4.5 Optical Memory: External Memory |
|
|
149 | (9) |
|
4.5.1 Compact Disk (CD) Technology |
|
|
149 | (3) |
|
4.5.2 CD-ROM (Compact Disk Read-Only Memory) |
|
|
152 | (2) |
|
4.5.3 CD-Recordable: CD-R (WORM) |
|
|
154 | (1) |
|
4.5.4 CD-Rewritable (CD-RW): Erasable Optical Disk |
|
|
154 | (1) |
|
4.5.5 Digital Versatile Disk (DVD) |
|
|
155 | (3) |
|
4.5.5.1 High-Definition Optical Disk: HD-DVD and Blu-Ray |
|
|
158 | (1) |
|
|
158 | (16) |
|
|
158 | (1) |
|
|
159 | (1) |
|
|
160 | (1) |
|
4.6.4 Types of Virtual Memory |
|
|
160 | (1) |
|
4.6.5 Address Translation Mechanisms |
|
|
161 | (1) |
|
4.6.6 Translation Lookaside Buffer (TLB) |
|
|
162 | (1) |
|
|
163 | (1) |
|
4.6.8 Demand Paging Systems |
|
|
164 | (1) |
|
4.6.9 Page Replacement Principles |
|
|
164 | (1) |
|
4.6.10 Page Replacement Policies |
|
|
165 | (1) |
|
4.6.10.1 Not Recently Used Page Replacement (NRU) |
|
|
166 | (1) |
|
4.6.10.2 First-In-First-Out (FIFO) |
|
|
167 | (1) |
|
4.6.10.3 Least Recently Used Page (LRU) |
|
|
167 | (1) |
|
4.6.10.4 Performance Comparison |
|
|
168 | (1) |
|
|
168 | (1) |
|
4.6.11.1 Pure Segmentation |
|
|
169 | (1) |
|
4.6.11.2 Segmentation with Paging |
|
|
170 | (1) |
|
4.6.11.3 Paged Segmentation in Mainframe (IBM 370/XA) |
|
|
170 | (2) |
|
4.6.11.4 Paged Segmentation in Microprocessor (Intel Pentium) |
|
|
172 | (2) |
|
|
174 | (37) |
|
|
174 | (1) |
|
|
174 | (2) |
|
|
176 | (1) |
|
|
176 | (2) |
|
4.7.5 Cache-Main Memory Hierarchy: Its Performance |
|
|
178 | (1) |
|
|
178 | (1) |
|
4.7.7 Cache Design Issues: Different Elements |
|
|
179 | (1) |
|
|
179 | (1) |
|
|
180 | (1) |
|
|
180 | (9) |
|
4.7.7.4 Cache Initialization |
|
|
189 | (1) |
|
4.7.7.5 Writing into Cache |
|
|
189 | (1) |
|
4.7.7.6 Replacement Policy (Algorithm) |
|
|
190 | (1) |
|
4.7.8 Multiple-Level Caches |
|
|
191 | (1) |
|
4.7.9 Unified Cache and Split Cache |
|
|
192 | (2) |
|
4.7.9.1 Implementation: PENTIUM Cache Organisation |
|
|
194 | (1) |
|
4.7.9.2 Motorola RISC MPC7450 Cache Organisation |
|
|
195 | (1) |
|
|
195 | (1) |
|
4.7.10.1 Physical Address Cache |
|
|
196 | (1) |
|
4.7.10.2 Virtual Address Cache |
|
|
197 | (1) |
|
4.7.11 Miss Rate and Miss Penalty |
|
|
198 | (1) |
|
4.7.11.1 Types of Cache Misses and Reduction Techniques |
|
|
199 | (2) |
|
4.7.11.2 Miss Penalty and Reduction Techniques |
|
|
201 | (1) |
|
4.7.12 Caches in Multiprocessor |
|
|
202 | (1) |
|
|
203 | (1) |
|
4.7.14 Reasons of Coherence Problem |
|
|
204 | (1) |
|
4.7.15 Cache Coherence Problem: Solution Methodologies |
|
|
204 | (1) |
|
4.7.15.1 No Private Cache |
|
|
205 | (1) |
|
4.7.15.2 Software Solution |
|
|
205 | (1) |
|
4.7.15.3 Hardware-Only Solution |
|
|
206 | (3) |
|
4.7.16 Two-Level Memory Performance: Cost Consideration |
|
|
209 | (2) |
|
4.7.17 Memory Hierarchy Design: Size and Cost Consideration |
|
|
211 | (1) |
|
4.8 Interleaved Memory Organisation |
|
|
211 | (5) |
|
|
211 | (1) |
|
4.8.2 Memory Interleaving |
|
|
212 | (1) |
|
4.8.3 Types of Interleaving |
|
|
212 | (2) |
|
4.8.4 Interleaving in Motorola 68040 |
|
|
214 | (1) |
|
|
215 | (1) |
|
4.9 Associative Memory Organisation |
|
|
216 | (4) |
|
|
216 | (1) |
|
|
217 | (1) |
|
4.9.2.1 Word-Organised Associative Memory |
|
|
217 | (3) |
|
|
220 | (7) |
|
|
221 | (5) |
|
Suggested References and Websites |
|
|
226 | (1) |
|
5 Input-Output Organisation |
|
|
227 | (34) |
|
|
228 | (1) |
|
5.2 I/O Module: I/O Interface |
|
|
228 | (2) |
|
|
229 | (1) |
|
5.3 Types of I/O Operations: Definitions and Differences |
|
|
230 | (13) |
|
5.3.1 Programmed I/O (Using Buffer) |
|
|
230 | (1) |
|
5.3.2 Interrupt-Driven I/O |
|
|
231 | (1) |
|
5.3.2.1 Interrupt-Driven I/O: Design Issues |
|
|
231 | (4) |
|
5.3.3 Direct Memory Access (DMA) I/O |
|
|
235 | (1) |
|
|
235 | (1) |
|
|
235 | (1) |
|
5.3.3.3 Essential Features |
|
|
236 | (1) |
|
5.3.3.4 Processing Details |
|
|
237 | (1) |
|
5.3.3.5 Different Transfer Types |
|
|
238 | (1) |
|
5.3.3.6 Implementation Mechanisms: Different Approaches |
|
|
239 | (1) |
|
5.3.4 I/O Processor (I/O Channels) |
|
|
240 | (1) |
|
|
240 | (1) |
|
|
241 | (1) |
|
5.3.4.3 I/O Processor (IOP) and Its Organisation |
|
|
241 | (2) |
|
5.4 Bus, Bus System and Bus Design |
|
|
243 | (11) |
|
|
243 | (1) |
|
|
243 | (1) |
|
|
243 | (1) |
|
5.4.4 Bus Design Parameters |
|
|
244 | (1) |
|
5.4.5 Bus Interfacing: Tri-State Devices |
|
|
244 | (1) |
|
5.4.6 Some Representative Bus Systems of Early Days |
|
|
245 | (1) |
|
5.4.7 PCI (Peripheral Component Interconnect): Local Bus |
|
|
245 | (2) |
|
5.4.8 SCSI (Small Computer System Interface) BUS |
|
|
247 | (1) |
|
5.4.9 Universal Serial Bus (USB) |
|
|
247 | (2) |
|
5.4.10 FireWire Serial Bus |
|
|
249 | (2) |
|
|
251 | (3) |
|
5.5 PORT and Its Different Types |
|
|
254 | (2) |
|
|
254 | (1) |
|
|
255 | (1) |
|
|
255 | (1) |
|
|
256 | (5) |
|
|
257 | (2) |
|
Suggested References and Websites |
|
|
259 | (2) |
|
6 Control Unit: Design and Operation |
|
|
261 | (22) |
|
|
262 | (1) |
|
6.2 Micro-Operations: Fetch Cycle |
|
|
263 | (1) |
|
|
264 | (2) |
|
6.4 Methods of Implementation |
|
|
266 | (15) |
|
|
266 | (2) |
|
6.4.1.1 Control Unit Logic |
|
|
268 | (1) |
|
6.4.1.2 Control Signals in Accumulator-Based CPU |
|
|
268 | (1) |
|
6.4.2 Microprogrammed Control |
|
|
268 | (1) |
|
6.4.2.1 Basic Concepts: Microinstructions |
|
|
268 | (2) |
|
6.4.2.2 Microprogrammed Control Unit Organisation |
|
|
270 | (2) |
|
6.4.2.3 Microinstruction Design Issues |
|
|
272 | (2) |
|
6.4.2.4 Horizontal versus Vertical |
|
|
274 | (1) |
|
|
275 | (1) |
|
6.4.2.6 Addressing Schemes |
|
|
276 | (1) |
|
|
277 | (1) |
|
6.4.2.8 Merits and Drawbacks |
|
|
277 | (1) |
|
6.4.2.9 Application Areas |
|
|
278 | (1) |
|
|
278 | (3) |
|
|
281 | (2) |
|
|
281 | (1) |
|
|
282 | (1) |
|
7 Arithmetic and Logic Unit Organisation |
|
|
283 | (44) |
|
7.1 Numerical Representations: Number Systems |
|
|
284 | (2) |
|
|
284 | (1) |
|
|
284 | (1) |
|
7.1.3 Hexadecimal and Octal System |
|
|
284 | (1) |
|
7.1.3.1 Merits of Hex and Octal Systems |
|
|
285 | (1) |
|
7.1.4 BCD (Binary-Coded Decimal) Code |
|
|
285 | (1) |
|
|
286 | (1) |
|
7.2 Number Representations: Binary Systems |
|
|
286 | (3) |
|
7.2.1 Sign-Magnitude Representation |
|
|
286 | (1) |
|
7.2.2 1's (One's) Complement Representation |
|
|
287 | (1) |
|
7.2.3 2's (Two's) Complement Representation |
|
|
287 | (1) |
|
7.2.3.1 Conversion: Decimal to 2's Complement and Vice Versa |
|
|
288 | (1) |
|
7.3 Addition and Subtraction: Signed Numbers |
|
|
289 | (1) |
|
7.4 Overflow: Integer Arithmetic |
|
|
289 | (2) |
|
|
291 | (1) |
|
7.6 Arithmetic and Logic Unit (ALU) |
|
|
292 | (1) |
|
7.7 Fixed-Point Arithmetic |
|
|
292 | (15) |
|
7.7.1 Addition and Subtraction |
|
|
293 | (1) |
|
|
294 | (1) |
|
|
294 | (1) |
|
|
294 | (1) |
|
7.7.2.1 Carry-Lookahead Adder (CLA) |
|
|
295 | (1) |
|
|
295 | (1) |
|
7.7.2.3 Carry-Save Adder (CSA) |
|
|
295 | (2) |
|
|
297 | (1) |
|
7.7.3.1 Unsigned Integers |
|
|
298 | (1) |
|
7.7.3.2 Signed-Magnitude Numbers |
|
|
299 | (1) |
|
7.7.3.3 Signed-Operand Multiplication |
|
|
299 | (3) |
|
7.7.3.4 Fast Multiplication: Carry-Save Addition |
|
|
302 | (1) |
|
|
302 | (1) |
|
7.7.4.1 Unsigned Integers |
|
|
303 | (4) |
|
7.8 Floating-Point Representation |
|
|
307 | (7) |
|
|
309 | (1) |
|
7.8.2 Range and Precision |
|
|
310 | (2) |
|
7.8.3 IEEE Standard: Binary Floating-Point Representation |
|
|
312 | (1) |
|
7.8.4 Exceptions and Special Values |
|
|
313 | (1) |
|
7.8.5 Floating-Point Representation: Merits and Drawbacks |
|
|
314 | (1) |
|
7.9 Floating-Point Arithmetic |
|
|
314 | (4) |
|
7.9.1 Addition and Subtraction |
|
|
315 | (1) |
|
7.9.1.1 Implementation: Floating-Point Unit |
|
|
316 | (1) |
|
7.9.2 Multiplication and Division |
|
|
317 | (1) |
|
7.9.2.1 Implementation: Floating-Point Multiplication |
|
|
317 | (1) |
|
7.10 Precision Considerations: Guard Bits |
|
|
318 | (2) |
|
|
318 | (1) |
|
7.10.2 Rounding: IEEE Standard |
|
|
318 | (1) |
|
7.10.3 Infinity; NaNs; and Denormalized Numbers: IEEE Standards |
|
|
319 | (1) |
|
7.11 Summary of Floating-Point Numbers |
|
|
320 | (1) |
|
|
321 | (6) |
|
|
321 | (3) |
|
Suggested References and Websites |
|
|
324 | (3) |
|
Part II High-End Processor Organisation |
|
|
|
|
327 | (78) |
|
|
327 | (1) |
|
8.2 Pipeline Approach: Instruction-Level Parallelism |
|
|
328 | (1) |
|
|
328 | (2) |
|
8.4 Linear and Nonlinear (Static and Dynamic) Pipelines |
|
|
330 | (7) |
|
8.4.1 Linear Pipeline: Asynchronous and Synchronous Models |
|
|
331 | (1) |
|
8.4.2 Characteristics and Behaviour: Space-Time |
|
|
331 | (1) |
|
8.4.3 Speed-Up, Efficiency and Throughput |
|
|
332 | (2) |
|
8.4.4 Nonlinear (Dynamic) Pipeline |
|
|
334 | (1) |
|
8.4.4.1 Reservation Table |
|
|
335 | (1) |
|
8.4.4.2 Latency and Collision |
|
|
336 | (1) |
|
|
337 | (21) |
|
8.5.1 Instruction Pipeline |
|
|
337 | (1) |
|
|
338 | (2) |
|
8.5.2 Pipeline Hazards and Solution Methodology |
|
|
340 | (1) |
|
8.5.2.1 Structural Hazard and Solution Approaches |
|
|
340 | (1) |
|
8.5.2.2 Data Hazard (Data Dependency) and Solution Approaches |
|
|
341 | (5) |
|
|
346 | (11) |
|
8.5.3 Arithmetic Pipeline |
|
|
357 | (1) |
|
8.5.3.1 Adder Pipeline Design |
|
|
357 | (1) |
|
8.5.3.2 Multiplication Pipeline Design |
|
|
357 | (1) |
|
8.6 Pipeline Control and Collision-Free Scheduling |
|
|
358 | (5) |
|
8.6.1 Control Scheme: Collision Vectors |
|
|
358 | (2) |
|
|
360 | (2) |
|
8.6.3 Greedy Cycles and Minimum Average Latency (MAL) |
|
|
362 | (1) |
|
8.6.4 Dynamic Pipeline Scheduling |
|
|
363 | (1) |
|
|
363 | (1) |
|
8.7 Superpipeline Architecture |
|
|
363 | (2) |
|
8.7.1 Superpipeline Performance |
|
|
365 | (1) |
|
8.8 Superscalar Architecture |
|
|
365 | (6) |
|
8.8.1 Requirements and Essential Components |
|
|
367 | (1) |
|
8.8.2 Multipipeline Scheduling |
|
|
368 | (1) |
|
8.8.3 Superscalar Performance |
|
|
369 | (1) |
|
8.8.4 Superscalar Processors: Key Factors |
|
|
369 | (1) |
|
8.8.5 Implementation: Superscalar Processors |
|
|
370 | (1) |
|
8.9 Superpipelined Superscalar Processors |
|
|
371 | (3) |
|
8.9.1 Superpipelined Superscalar Performance |
|
|
372 | (1) |
|
8.9.2 Implementation: Superpipelined Superscalar Processors |
|
|
372 | (1) |
|
|
372 | (1) |
|
|
373 | (1) |
|
8.10 VLIW and EPIC Architectures |
|
|
374 | (2) |
|
8.10.1 Instruction Bundles: The Intel IA-64 Family |
|
|
376 | (1) |
|
8.11 Thread-Level Parallelism: Multithreading |
|
|
376 | (7) |
|
|
379 | (1) |
|
8.11.2 Superscalar Processor |
|
|
380 | (1) |
|
|
381 | (1) |
|
8.11.4 Simultaneous Hardware Multithreaded Processor (SHMT) |
|
|
382 | (1) |
|
8.11.5 Chip Multiprocessors (Multicore Processors) |
|
|
382 | (1) |
|
8.12 Multicore Architecture |
|
|
383 | (11) |
|
|
383 | (5) |
|
|
388 | (1) |
|
|
388 | (1) |
|
8.12.4 Multicore Organisation |
|
|
389 | (2) |
|
8.12.5 Basic Multicore Implementation: Intel Core Duo |
|
|
391 | (2) |
|
|
393 | (1) |
|
|
394 | (1) |
|
8.13 Multicore with Hardware Multithreading |
|
|
394 | (4) |
|
|
395 | (1) |
|
|
396 | (1) |
|
8.13.2.1 Distinctive Features of Intel Core i7 900-Series Processors |
|
|
396 | (1) |
|
8.13.3 Sun UltraSPARC T2 Processor |
|
|
397 | (1) |
|
|
398 | (7) |
|
|
400 | (3) |
|
Suggested References and Websites |
|
|
403 | (2) |
|
|
405 | (24) |
|
9.1 Background: Evolution of Computer Architecture |
|
|
406 | (1) |
|
9.2 Characteristics of CISC and Its Drawbacks |
|
|
407 | (1) |
|
|
408 | (1) |
|
9.3 RISC: Definition and Features |
|
|
408 | (1) |
|
9.4 Representative RISC Processors |
|
|
409 | (1) |
|
|
410 | (1) |
|
9.6 The RISC Impacts and Drawbacks |
|
|
410 | (1) |
|
|
411 | (1) |
|
9.7 RISC versus CISC Debate |
|
|
411 | (2) |
|
9.7.1 Running Programs in High-Level Languages |
|
|
412 | (1) |
|
9.7.2 Technology of the Components |
|
|
412 | (1) |
|
9.7.3 Role of Large Register File |
|
|
412 | (1) |
|
|
413 | (1) |
|
|
414 | (1) |
|
9.10 RISC Instruction Format |
|
|
415 | (1) |
|
9.11 RISC Addressing Mode |
|
|
415 | (1) |
|
9.12 Register Windows: The Large Register File |
|
|
416 | (1) |
|
9.13 Register File and Cache Memory |
|
|
417 | (1) |
|
9.14 Comparison between RISCs and CISCs |
|
|
418 | (1) |
|
|
419 | (1) |
|
9.16 RISC and CISC Union: Hybrid Architecture |
|
|
419 | (1) |
|
9.17 Types of RISC Processors |
|
|
420 | (5) |
|
9.17.1 PowerPC Processors |
|
|
420 | (1) |
|
9.17.2 SPARC Family of Processors |
|
|
420 | (1) |
|
9.17.2.1 UltraSPARC Processors |
|
|
421 | (1) |
|
|
421 | (1) |
|
9.17.4 PA-RISC Processors |
|
|
422 | (1) |
|
9.17.5 ARM (Advanced RISC Machine) Processors |
|
|
422 | (2) |
|
9.17.6 Motorola Processors (MC 88000) |
|
|
424 | (1) |
|
9.18 Comparison of Four Representative RISC Machines |
|
|
425 | (1) |
|
|
425 | (4) |
|
|
426 | (1) |
|
Suggested References and Websites |
|
|
427 | (2) |
|
10 Parallel Architectures |
|
|
429 | (108) |
|
|
429 | (2) |
|
10.2 Classification of Computer Architectures: Flynn's Proposal |
|
|
431 | (3) |
|
10.3 Parallel Computers: Forms and Issues |
|
|
434 | (1) |
|
10.4 Parallel Computers: Its Classification |
|
|
435 | (4) |
|
10.5 Parallel Computers: Its Environment |
|
|
439 | (1) |
|
10.6 Interconnection Networks |
|
|
440 | (19) |
|
10.6.1 Interconnection Network: Different Types |
|
|
441 | (1) |
|
10.6.1.1 Hierarchical Common (Shared) Bus Systems |
|
|
442 | (2) |
|
10.6.1.2 Crossbar Networks |
|
|
444 | (1) |
|
10.6.1.3 Multiport Memory |
|
|
445 | (1) |
|
10.6.1.4 Multistage Networks |
|
|
446 | (1) |
|
10.6.1.5 Omega Network (Perfect Shuffle) |
|
|
447 | (2) |
|
|
449 | (1) |
|
10.6.1.7 The Hot-Spot Problem |
|
|
449 | (1) |
|
10.6.1.8 Butterfly Network |
|
|
450 | (1) |
|
10.6.2 Implementation of Multistage Networks |
|
|
451 | (1) |
|
10.6.3 Comparison of Dynamic Networks |
|
|
452 | (1) |
|
10.6.4 Static Connection Networks (Message-Passing Approach) |
|
|
452 | (1) |
|
|
452 | (1) |
|
|
453 | (1) |
|
|
454 | (1) |
|
|
455 | (1) |
|
|
456 | (1) |
|
|
457 | (1) |
|
|
457 | (1) |
|
10.6.5 Comparison of Static Networks |
|
|
457 | (1) |
|
10.6.6 Hybrid (Mixed Topology) Networks |
|
|
458 | (1) |
|
10.6.7 Important Characteristics of a Network |
|
|
458 | (1) |
|
10.7 Multiprocessor Architectures |
|
|
459 | (10) |
|
10.7.1 Shared-Memory Multiprocessor |
|
|
459 | (2) |
|
10.7.2 Symmetric Multiprocessors (SMP): UMA Model |
|
|
461 | (2) |
|
10.7.3 Distributed Shared Memory Multiprocessors (DSM): NUMA Model |
|
|
463 | (1) |
|
10.7.4 Cache-Coherent NUMA: CC-NUMA Model |
|
|
464 | (2) |
|
10.7.5 No Remote Memory Access (NORMA) |
|
|
466 | (1) |
|
10.7.6 General-Purpose Multiprocessors |
|
|
467 | (1) |
|
10.7.6.1 Implementation: A Mainframe SMP (IBM z990 Series) |
|
|
468 | (1) |
|
10.7.7 Operating System Considerations |
|
|
468 | (1) |
|
10.8 Multicomputer Architectures |
|
|
469 | (12) |
|
10.8.1 Design Considerations |
|
|
470 | (1) |
|
10.8.2 Multicomputer Generations |
|
|
471 | (1) |
|
10.8.3 Different Models of Multicomputer Systems |
|
|
472 | (2) |
|
10.8.3.1 Multitiered Architecture: Three-tier Client-Server Architecture |
|
|
474 | (1) |
|
|
475 | (2) |
|
10.8.5 Distributed Systems |
|
|
477 | (4) |
|
10.9 Clusters: A Distributed Computer System Design |
|
|
481 | (17) |
|
10.9.1 Distinct Advantage |
|
|
483 | (1) |
|
10.9.2 Classification of Clusters |
|
|
484 | (1) |
|
10.9.3 Different Clustering Methods |
|
|
485 | (2) |
|
10.9.4 General Architectures |
|
|
487 | (3) |
|
10.9.5 Operating System Considerations |
|
|
490 | (1) |
|
|
490 | (3) |
|
|
493 | (3) |
|
|
496 | (2) |
|
|
498 | (18) |
|
10.10.1 SIMD Computer Organisations |
|
|
500 | (2) |
|
10.10.2 Vector Processors and SIMD Vector Computers |
|
|
502 | (2) |
|
|
504 | (1) |
|
|
504 | (1) |
|
10.10.5 Vector Instruction Format |
|
|
504 | (1) |
|
10.10.6 Vector Instruction: Different Types |
|
|
505 | (3) |
|
10.10.7 Vector Processing |
|
|
508 | (1) |
|
10.10.8 Vectorization Inhibitor |
|
|
509 | (1) |
|
10.10.9 Vectorizing Compilers |
|
|
509 | (1) |
|
10.10.10 Different Types of Vector Processor Organisations |
|
|
510 | (2) |
|
10.10.11 Salient Features of Vector Operations |
|
|
512 | (1) |
|
10.10.12 Basic Vector Processor Architecture |
|
|
513 | (2) |
|
10.10.13 Memory-Access Schemes |
|
|
515 | (1) |
|
10.10.14 Implementation: The CRAY 1 Architecture |
|
|
516 | (1) |
|
10.11 Array Processors and SIMD Parallel Computers |
|
|
516 | (5) |
|
|
519 | (1) |
|
10.11.2 Implementation: Connection Machine CM 2 Architecture |
|
|
520 | (1) |
|
10.11.3 Vector Processor Versus Array Processor: A Rough Comparison |
|
|
521 | (1) |
|
10.12 Massively Parallel Processing (MPP) System |
|
|
521 | (6) |
|
10.12.1 Representative MPP System: Connection Machine CM-5 |
|
|
524 | (3) |
|
10.13 Scalable Parallel Computer Architecture |
|
|
527 | (1) |
|
|
528 | (2) |
|
10.14.1 The Contemporary Fastest Supercomputer System |
|
|
529 | (1) |
|
|
530 | (7) |
|
|
531 | (4) |
|
Suggested References and Websites |
|
|
535 | (2) |
Additional Reading |
|
537 | (4) |
Suggested Websites |
|
541 | (2) |
Index |
|
543 | |