Introduction to Parallel Programming [Paperback]

Subodh Kumar (Indian Institute of Technology, Delhi)
  • Format: Paperback / softback, 350 pages, height x width x thickness: 241x185x15 mm, weight: 500 g, worked examples or exercises
  • Publication date: 05-Jan-2023
  • Publisher: Cambridge University Press
  • ISBN-10: 1009069535
  • ISBN-13: 9781009069533
In modern computer science, most programming is parallel programming. This textbook will be invaluable as a first course in parallel programming. It covers different parallel programming styles, describes parallel architecture and parallel programming techniques, presents algorithmic techniques, and discusses parallel design and performance issues.

In modern computer science, there is no truly sequential computing system, and most advanced programming is parallel programming. This is particularly evident in modern application domains such as scientific computation, data science and machine intelligence. This lucid introductory textbook will be invaluable to students of computer science and technology, acting as a self-contained primer to parallel programming. It takes the reader from introduction to expertise, addressing a broad gamut of issues. It covers different parallel programming styles, describes parallel architecture, includes parallel programming frameworks and techniques, presents algorithmic and analysis techniques, and discusses parallel design and performance issues. With its broad coverage, the book can be used in a wide range of courses and can also serve as a ready reference for professionals in the field.

Additional information

This book introduces students to the full gamut of different parallel programming styles and their relationship to hardware architecture.
List of Figures xiii
Introduction xvii
Concurrency and Parallelism xvii
Why Study Parallel Programming xvii
What Is in This Book xix
1 An Introduction to Parallel Computer Architecture 1(30)
1.1 Parallel Organization 1(3)
SISD: Single Instruction, Single Data 2(1)
SIMD: Single Instruction, Multiple Data 2(1)
MIMD: Multiple Instruction, Multiple Data 3(1)
MISD: Multiple Instruction, Single Data 3(1)
1.2 System Architecture 4(1)
1.3 CPU Architecture 5(3)
1.4 Memory and Cache 8(3)
1.5 GPU Architecture 11(2)
1.6 Interconnect Architecture 13(11)
Routing 13(1)
Links 13(1)
Types and Quality of Networks 14(2)
Torus Network 16(2)
Hypercube Network 18(1)
Cross-Bar Network 19(1)
Shuffle-Exchange Network 20(1)
Clos Network 21(1)
Tree Network 22(2)
Network Comparison 24(1)
1.7 Summary 24(7)
2 Parallel Programming Models 31(14)
2.1 Distributed-Memory Programming Model 32(1)
2.2 Shared-Memory Programming Model 33(2)
2.3 Task Graph Model 35(2)
2.4 Variants of Task Parallelism 37(2)
2.5 Summary 39(6)
3 Parallel Performance Analysis 45(30)
3.1 Simple Parallel Model 46(1)
3.2 Bulk-Synchronous Parallel Model 47(5)
BSP Computation Time 48(1)
BSP Example 49(3)
3.3 PRAM Model 52(5)
PRAM Computation Time 55(1)
PRAM Example 55(2)
3.4 Parallel Performance Evaluation 57(5)
Latency and Throughput 57(1)
Speed-up 58(1)
Cost 58(1)
Efficiency 59(1)
Scalability 59(1)
Iso-efficiency 60(2)
3.5 Parallel Work 62(1)
Brent's Work-Time Scheduling Principle 63(1)
3.6 Amdahl's Law 63(2)
3.7 Gustafson's Law 65(1)
3.8 Karp-Flatt Metric 66(1)
3.9 Summary 67(8)
4 Synchronization and Communication Primitives 75(36)
4.1 Threads and Processes 75(2)
4.2 Race Condition and Consistency of State 77(8)
Sequential Consistency 78(4)
Causal Consistency 82(1)
FIFO and Processor Consistency 82(2)
Weak Consistency 84(1)
Linearizability 85(1)
4.3 Synchronization 85(5)
Synchronization Condition 86(1)
Protocol Control 86(1)
Progress 86(2)
Synchronization Hazards 88(2)
4.4 Mutual Exclusion 90(9)
Lock 90(1)
Peterson's Algorithm 91(3)
Bakery Algorithm 94(1)
Compare and Swap 95(1)
Transactional Memory 96(1)
Barrier and Consensus 97(2)
4.5 Communication 99(5)
Point-to-Point Communication 99(3)
RPC 102(1)
Collective Communication 102(2)
4.6 Summary 104(7)
5 Parallel Program Design 111(28)
5.1 Design Steps 112(3)
Granularity 112(1)
Communication 113(1)
Synchronization 114(1)
Load Balance 115(1)
5.2 Task Decomposition 115(9)
Domain Decomposition 116(4)
Functional Decomposition 120(3)
Task Graph Metrics 123(1)
5.3 Task Execution 124(6)
Preliminary Task Mapping 125(1)
Task Scheduling Framework 126(1)
Centralized Push Scheduling Strategy 127(2)
Distributed Push Scheduling 129(1)
Pull Scheduling 129(1)
5.4 Input/Output 130(2)
5.5 Debugging and Profiling 132(1)
5.6 Summary 133(6)
6 Middleware: The Practice of Parallel Programming 139(72)
6.1 OpenMP 139(16)
Preliminaries 140(1)
OpenMP Thread Creation 140(1)
OpenMP Memory Model 141(2)
OpenMP Reduction 143(1)
OpenMP Synchronization 144(3)
Sharing a Loop's Work 147(3)
Other Work-Sharing Pragmas 150(1)
SIMD Pragma 151(2)
Tasks 153(2)
6.2 MPI 155(25)
MPI Send and Receive 156(2)
Message-Passing Synchronization 158(3)
MPI Data Types 161(3)
MPI Collective Communication 164(3)
MPI Barrier 167(1)
MPI Reduction 167(2)
One-Sided Communication 169(4)
MPI File IO 173(3)
MPI Groups and Communicators 176(1)
MPI Dynamic Parallelism 177(1)
MPI Process Topology 178(2)
6.3 Chapel 180(4)
Partitioned Global Address Space 180(1)
Chapel Tasks 181(2)
Chapel Variable Scope 183(1)
6.4 Map-Reduce 184(4)
Parallel Implementation 185(1)
Hadoop 186(2)
6.5 GPU Programming 188(19)
OpenMP GPU Off-Load 188(3)
Data and Function on Device 191(2)
Thread Blocks in OpenMP 193(1)
CUDA 194(1)
CUDA Programming Model 195(2)
CPU-GPU Memory Transfer 197(1)
Concurrent Kernels 198(1)
CUDA Synchronization 199(3)
CUDA Shared Memory 202(1)
CUDA Parallel Memory Access 203(3)
False Sharing 206(1)
6.6 Summary 207(4)
7 Parallel Algorithms and Techniques 211(58)
7.1 Divide and Conquer: Prefix-Sum 212(5)
Parallel Prefix-Sum: Method 1 214(1)
Parallel Prefix-Sum: Method 2 215(1)
Parallel Prefix-Sum: Method 3 215(2)
7.2 Divide and Conquer: Merge Two Sorted Lists 217(10)
Parallel Merge: Method 1 218(1)
Parallel Merge: Method 2 219(3)
Parallel Merge: Method 3 222(4)
Parallel Merge: Method 4 226(1)
7.3 Accelerated Cascading: Find Minima 227(3)
7.4 Recursive Doubling: List Ranking 230(1)
7.5 Recursive Doubling: Euler Tour 231(2)
7.6 Recursive Doubling: Connected Components 233(5)
7.7 Pipelining: Merge-Sort 238(11)
Basic Merge-Sort 238(2)
Pipelined Merges 240(5)
4-Cover Property Analysis 245(3)
Merge Operation per Tick 248(1)
7.8 Application of Prefix-Sum: Radix-Sort 249(1)
7.9 Exploiting Parallelism: Quick-Sort 250(4)
7.10 Fixing Processor Count: Sample-Sort 254(3)
7.11 Exploiting Parallelism: Minimum Spanning Tree 257(7)
Parallel Priority Queue 260(3)
MST with Parallel Priority Queue 263(1)
7.12 Summary 264(5)
Bibliography 269(8)
Index 277
Dr Subodh Kumar is Professor in the Department of Computer Science and Engineering at the Indian Institute of Technology, Delhi, an institution he has been associated with since 2007. During this time, he has headed the institute's High Performance Computing group and taught several courses on computer graphics, data structures and algorithms, design practices in computer science, and parallel programming. Previously, he was Assistant Professor at Johns Hopkins University. His research interests include rendering algorithms, virtual reality, geometry processing, human-machine interfaces, visualization, large-scale parallel computation and HPC.