Series Foreword  xiii
Preface to the Third Edition  xv
Preface to the Second Edition  xix
Preface to the First Edition  xxi

1 Background  1
  1.1 Why Parallel Computing?  1
  1.2 Obstacles to Progress  2
  1.3 Why Message Passing?  3
    1.3.1 Parallel Computational Models  3
    1.3.2 Advantages of the Message-Passing Model  9
  1.4 Evolution of Message-Passing Systems  10
  1.5 The MPI Forum  11

2 Introduction to MPI  13
  2.1 Goal  13
  2.2 What Is MPI?  13
  2.3 Basic MPI Concepts  14
  2.4 Other Interesting Features of MPI  18
  2.5 Is MPI Large or Small?  20
  2.6 Decisions Left to the Implementer  21

3 Using MPI in Simple Programs  23
  3.1 A First MPI Program  23
  3.2 Running Your First MPI Program  28
  3.3 A First MPI Program in C  29
  3.4 Using MPI from Other Languages  29
  3.5 Timing MPI Programs  31
  3.6 A Self-Scheduling Example: Matrix-Vector Multiplication  32
  3.7 Studying Parallel Performance  38
    3.7.1 Elementary Scalability Calculations  39
    3.7.2 Gathering Data on Program Execution  41
    3.7.3 Instrumenting a Parallel Program with MPE Logging  42
    3.7.4 Events and States  43
    3.7.5 Instrumenting the Matrix-Matrix Multiply Program  43
    3.7.6 Notes on Implementation of Logging  47
    3.7.7 Graphical Display of Logfiles  48
  3.8 Using Communicators  49
  3.9 Another Way of Forming New Communicators  55
  3.10 A Handy Graphics Library for Parallel Programs  57
  3.11 Common Errors and Misunderstandings  60
  3.12 Summary of a Simple Subset of MPI  62
  3.13 Application: Computational Fluid Dynamics  62
    3.13.1 Parallel Formulation  63
    3.13.2 Parallel Implementation  65

4 Intermediate MPI  69
  4.1 The Poisson Problem  70
  4.2 Topologies  73
  4.3 A Code for the Poisson Problem  81
  4.4 Using Nonblocking Communications  91
  4.5 Synchronous Sends and "Safe" Programs  94
  4.6 More on Scalability  95
  4.7 Jacobi with a 2-D Decomposition  98
  4.8 An MPI Derived Datatype  100
  4.9 Overlapping Communication and Computation  101
  4.10 More on Timing Programs  105
  4.11 Three Dimensions  106
  4.12 Common Errors and Misunderstandings  107
  4.13 Application: Nek5000/NekCEM  108

5 Fun with Datatypes  113
  5.1 MPI Datatypes  113
    5.1.1 Basic Datatypes and Concepts  113
    5.1.2 Derived Datatypes  116
    5.1.3 Understanding Extents  118
  5.2 The N-Body Problem  119
    5.2.1 Gather  120
    5.2.2 Nonblocking Pipeline  124
    5.2.3 Moving Particles between Processes  127
    5.2.4 Sending Dynamically Allocated Data  132
    5.2.5 User-Controlled Data Packing  134
  5.3 Visualizing the Mandelbrot Set  136
    5.3.1 Sending Arrays of Structures  144
  5.4 Gaps in Datatypes  146
  5.5 More on Datatypes for Structures  148
  5.6 Deprecated and Removed Functions  149
  5.7 Common Errors and Misunderstandings  150
  5.8 Application: Cosmological Large-Scale Structure Formation  152

6 Parallel Libraries  155
  6.1 Motivation  155
    6.1.1 The Need for Parallel Libraries  155
    6.1.2 Common Deficiencies of Early Message-Passing Systems  156
    6.1.3 Review of MPI Features That Support Libraries  158
  6.2 A First MPI Library  161
  6.3 Linear Algebra on Grids  170
    6.3.1 Mappings and Logical Grids  170
    6.3.2 Vectors and Matrices  175
    6.3.3 Components of a Parallel Library  177
  6.4 The LINPACK Benchmark in MPI  179
  6.5 Strategies for Library Building  183
  6.6 Examples of Libraries  184
  6.7 Application: Nuclear Green's Function Monte Carlo  185

7 Other Features of MPI  189
  7.1 Working with Global Data  189
    7.1.1 Shared Memory, Global Data, and Distributed Memory  189
    7.1.2 A Counter Example  190
    7.1.3 The Shared Counter Using Polling Instead of an Extra Process  193
    7.1.4 Fairness in Message Passing  196
    7.1.5 Exploiting Request-Response Message Patterns  198
  7.2 Advanced Collective Operations  201
    7.2.1 Data Movement  201
    7.2.2 Collective Computation  201
    7.2.3 Common Errors and Misunderstandings  206
  7.3 Intercommunicators  208
  7.4 Heterogeneous Computing  216
  7.5 Hybrid Programming with MPI and OpenMP  217
  7.6 The MPI Profiling Interface  218
    7.6.1 Finding Buffering Problems  221
    7.6.2 Finding Load Imbalances  223
    7.6.3 Mechanics of Using the Profiling Interface  223
  7.7 Error Handling  226
    7.7.1 Error Handlers  226
    7.7.2 Example of Error Handling  229
    7.7.3 User-Defined Error Handlers  229
    7.7.4 Terminating MPI Programs  232
    7.7.5 Common Errors and Misunderstandings  232
  7.8 The MPI Environment  234
    7.8.1 Processor Name  236
    7.8.2 Is MPI Initialized?  236
  7.9 Determining the Version of MPI  237
  7.10 Other Functions in MPI  239
  7.11 Application: No-Core Configuration Interaction Calculations in Nuclear Physics  240

8 Understanding How MPI Implementations Work  245
  8.1 Introduction  245
    8.1.1 Sending Data  245
    8.1.2 Receiving Data  246
    8.1.3 Rendezvous Protocol  246
    8.1.4 Matching Protocols to MPI's Send Modes  247
    8.1.5 Performance Implications  248
    8.1.6 Alternative MPI Implementation Strategies  249
    8.1.7 Tuning MPI Implementations  249
  8.2 How Difficult Is MPI to Implement?  249
  8.3 Device Capabilities and the MPI Library Definition  250
  8.4 Reliability of Data Transfer  251

9 Comparing MPI with Sockets  253
  9.1 Process Startup and Shutdown  255
  9.2 Handling Faults  257

10 Wait! There's More!  259
  10.1 Beyond MPI-1  259
  10.2 Using Advanced MPI  260
  10.3 Will There Be an MPI-4?  261
  10.4 Beyond Message Passing Altogether  261
  10.5 Final Words  262

Glossary of Selected Terms  263
A The MPE Multiprocessing Environment  273
  A.1 MPE Logging  273
  A.2 MPE Graphics  275
  A.3 MPE Helpers  276
B MPI Resources Online  279
C Language Details  281
  C.1 Arrays in C and Fortran  281
    C.1.1 Column and Row Major Ordering  281
    C.1.2 Meshes vs. Matrices  281
    C.1.3 Higher Dimensional Arrays  282
  C.2 Aliasing  285
References  287
Subject Index  301
Function and Term Index  305