Preface |
|
xiii | |
PART I PARALLELISM |
|
1 | (71) |
|
|
3 | (12) |
|
Parallel Computing Hardware |
|
|
4 | (4) |
|
What Have We Learned from Applications? |
|
|
8 | (3) |
|
|
11 | (2) |
|
Toward a Science of Parallel Computation |
|
|
13 | (2) |
|
Parallel Computer Architectures |
|
|
15 | (28) |
|
Uniprocessor Architecture |
|
|
16 | (10) |
|
|
26 | (14) |
|
Future Directions for Parallel Architectures |
|
|
40 | (1) |
|
|
41 | (2) |
|
Parallel Programming Considerations |
|
|
43 | (29) |
|
Architectural Considerations |
|
|
45 | (4) |
|
Decomposing Programs for Parallelism |
|
|
49 | (7) |
|
Enhancing Parallel Performance |
|
|
56 | (7) |
|
Memory-Hierarchy Management |
|
|
63 | (3) |
|
|
66 | (1) |
|
Performance Analysis and Tuning |
|
|
67 | (2) |
|
|
69 | (1) |
|
|
70 | (2) |
|
|
70 | (2) |
PART II APPLICATIONS |
|
72 | (219) |
|
General Application Issues |
|
|
75 | (18) |
|
Application Characteristics in a Simple Example |
|
|
75 | (4) |
|
Communication Structure in Jacobi's Method for Poisson's Equation |
|
|
79 | (3) |
|
Communication Overhead for More General Update Stencils |
|
|
82 | (2) |
|
Applications as Basic Complex Systems |
|
|
84 | (3) |
|
Time-Stepped and Event-Driven Simulations |
|
|
87 | (1) |
|
Temporal Structure of Applications |
|
|
88 | (1) |
|
Summary of Parallelization of Basic Complex Systems |
|
|
89 | (1) |
|
|
90 | (1) |
|
|
91 | (2) |
|
Parallel Computing in Computational Fluid Dynamics |
|
|
93 | (52) |
|
Introduction to Computational Fluid Dynamics |
|
|
94 | (4) |
|
|
98 | (34) |
|
|
132 | (12) |
|
|
144 | (1) |
|
Parallel Computing in Environment and Energy |
|
|
145 | (22) |
|
|
146 | (6) |
|
IPARS and Grid Computing by NetSolve |
|
|
152 | (3) |
|
Tracking and Interactive Simulation in IPARS |
|
|
155 | (4) |
|
|
159 | (3) |
|
A Coupled Simulation of Flow and Transport with ADR |
|
|
162 | (3) |
|
|
165 | (2) |
|
Parallel Computational Chemistry: An Overview of NWChem |
|
|
167 | (28) |
|
Molecular Quantum Chemistry |
|
|
168 | (3) |
|
|
171 | (3) |
|
NWChem Parallel Computing Support |
|
|
174 | (4) |
|
|
178 | (8) |
|
NWChem's Place in the Computational Chemistry Community |
|
|
186 | (2) |
|
A Larger Perspective: Common Features of Computational Chemistry Algorithms |
|
|
188 | (4) |
|
|
192 | (3) |
|
|
195 | (96) |
|
Numerical (General) Relativity |
|
|
195 | (4) |
|
Numerical Simulations in Lattice Quantum Chromodynamics |
|
|
199 | (8) |
|
|
207 | (5) |
|
Simulations of Earthquakes |
|
|
212 | (7) |
|
Cosmological Structure Formation |
|
|
219 | (8) |
|
Computational Electromagnetics |
|
|
227 | (5) |
|
Parallel Algorithms in Data Mining |
|
|
232 | (11) |
|
High-Performance Computing in Signal and Image Processing |
|
|
243 | (6) |
|
Deterministic Monte Carlo Methods and Parallelism |
|
|
249 | (9) |
|
Quasi-Real Time Microtomography Experiments at Photon Sources |
|
|
258 | (7) |
|
WebHLA-Based Meta-Computing Environment for Forces Modeling and Simulation |
|
|
265 | (15) |
|
Computational Structure of Applications |
|
|
280 | (10) |
|
|
290 | (1) |
PART III SOFTWARE TECHNOLOGIES |
|
291 | (190) |
|
|
293 | (20) |
|
Selecting a Parallel Program Technology |
|
|
294 | (14) |
|
Achieving Correct and Efficient Execution |
|
|
308 | (2) |
|
|
310 | (3) |
|
Message Passing and Threads |
|
|
313 | (18) |
|
Message-Passing Programming Model |
|
|
314 | (9) |
|
Multithreaded Programming |
|
|
323 | (6) |
|
|
329 | (2) |
|
|
331 | (26) |
|
Parallel I/O Infrastructure |
|
|
333 | (6) |
|
|
339 | (5) |
|
Parallel I/O Optimizations |
|
|
344 | (4) |
|
How Can Users Achieve High I/O Performance? |
|
|
348 | (7) |
|
|
355 | (2) |
|
|
357 | (26) |
|
Automatic Parallelization |
|
|
359 | (2) |
|
Data-Parallel Programming in High Performance Fortran |
|
|
361 | (5) |
|
Shared-Memory Parallel Programming in OpenMP |
|
|
366 | (5) |
|
Single-Program, Multiple-Data Programming in Co-Array Fortran |
|
|
371 | (6) |
|
|
377 | (1) |
|
|
378 | (1) |
|
|
379 | (4) |
|
|
380 | (3) |
|
Parallel Object-Oriented Libraries |
|
|
383 | (26) |
|
Object-Oriented Parallel Libraries |
|
|
384 | (7) |
|
Object-Oriented Parallel Programming in Java |
|
|
391 | (5) |
|
Multithreaded Computation in C++ |
|
|
396 | (5) |
|
Remote Function Calls, Global Pointers, and Java RMI |
|
|
401 | (2) |
|
Component-Based Software Design |
|
|
403 | (3) |
|
|
406 | (3) |
|
Problem-Solving Environments |
|
|
409 | (34) |
|
NetSolve: Network-Enabled Solvers |
|
|
411 | (7) |
|
WebFlow-Object Web Computing |
|
|
418 | (11) |
|
|
429 | (11) |
|
Other Grid-Computing Environments |
|
|
440 | (2) |
|
|
442 | (1) |
|
Tools for Performance Tuning and Debugging |
|
|
443 | (26) |
|
Correctness and Performance Monitoring Basics |
|
|
444 | (7) |
|
Measurement and Debugging Implementation Challenges |
|
|
451 | (2) |
|
Deep Compiler Integration |
|
|
453 | (3) |
|
Software Tool Interfaces and Usability |
|
|
456 | (3) |
|
|
459 | (7) |
|
Challenges and Open Problems |
|
|
466 | (1) |
|
|
466 | (1) |
|
|
467 | (2) |
|
|
469 | (12) |
|
|
469 | (1) |
|
|
470 | (1) |
|
Parallel Solution of Poisson's Equation |
|
|
470 | (7) |
|
|
477 | (4) |
PART IV ENABLING TECHNOLOGIES AND ALGORITHMS |
|
481 | (239) |
|
Reusable Software and Algorithms |
|
|
483 | (8) |
|
Templates: Design Patterns for Parallel Software |
|
|
483 | (1) |
|
Communicators and Data Structure Neutrality |
|
|
484 | (1) |
|
Standard Libraries and Components |
|
|
485 | (1) |
|
Automatic Differentiation |
|
|
486 | (1) |
|
Templates and Numerical Linear Algebra |
|
|
487 | (2) |
|
|
489 | (2) |
|
Graph Partitioning for High-Performance Scientific Simulations |
|
|
491 | (52) |
|
Modeling Mesh-Based Computations as Graphs |
|
|
493 | (2) |
|
Static Graph-Partitioning Techniques |
|
|
495 | (21) |
|
Load Balancing of Adaptive Computations |
|
|
516 | (9) |
|
Parallel Graph Partitioning |
|
|
525 | (1) |
|
Multiconstraint, Multiobjective Graph Partitioning |
|
|
526 | (12) |
|
|
538 | (5) |
|
|
543 | (32) |
|
Mesh-Generation Strategies and Techniques |
|
|
544 | (6) |
|
Mesh-Generation Process and Geometry Preparation |
|
|
550 | (2) |
|
|
552 | (8) |
|
|
560 | (1) |
|
|
561 | (3) |
|
|
564 | (3) |
|
|
567 | (2) |
|
The Pacing Obstacle: Geometry/Mesh Generation |
|
|
569 | (2) |
|
|
571 | (1) |
|
|
572 | (3) |
|
Templates and Numerical Linear Algebra |
|
|
575 | (46) |
|
Dense Linear Algebra Algorithms |
|
|
576 | (4) |
|
The Influence of Computer Architecture on Performance |
|
|
580 | (3) |
|
Dense Linear Algebra Libraries |
|
|
583 | (7) |
|
Sparse Linear Algebra Methods |
|
|
590 | (1) |
|
|
591 | (5) |
|
Iterative Solution Methods |
|
|
596 | (7) |
|
Sparse Eigenvalue Problems |
|
|
603 | (16) |
|
|
619 | (2) |
|
Software for the Scalable Solution of Partial Differential Equations |
|
|
621 | (28) |
|
|
622 | (1) |
|
Challenges in Parallel PDE Computations |
|
|
623 | (4) |
|
Parallel Solution Strategies |
|
|
627 | (1) |
|
PETSc Approach to Parallel Software for PDEs |
|
|
628 | (17) |
|
|
645 | (2) |
|
|
647 | (2) |
|
Parallel Continuous Optimization |
|
|
649 | (22) |
|
|
651 | (2) |
|
|
653 | (6) |
|
|
659 | (4) |
|
Optimization of Linked Subsystems |
|
|
663 | (3) |
|
Variable and Constraint Distribution |
|
|
666 | (3) |
|
|
669 | (2) |
|
Path Following in Scientific Computing and Its Implementation in AUTO |
|
|
671 | (30) |
|
|
673 | (2) |
|
Global Continuation and Degree Theory |
|
|
675 | (2) |
|
|
677 | (2) |
|
|
679 | (4) |
|
Branch Switching at Bifurcations |
|
|
683 | (3) |
|
Computational Examples: AUTO |
|
|
686 | (8) |
|
|
694 | (5) |
|
|
699 | (2) |
|
Automatic Differentiation |
|
|
701 | (19) |
|
Overview of Automatic Differentiation |
|
|
703 | (4) |
|
Automatic-Differentiation Implementation Techniques |
|
|
707 | (2) |
|
Automatic-Differentiation Software |
|
|
709 | (2) |
|
Automatic Differentiation of Message-Passing Parallel Codes |
|
|
711 | (3) |
|
Advanced Use of Automatic Differentiation |
|
|
714 | (5) |
|
|
719 | (1) |
PART V CONCLUSION |
|
720 | (9) |
|
Wrap-Up and Signposts to the Future |
|
|
723 | (6) |
|
|
723 | (1) |
|
|
724 | (1) |
|
|
725 | (2) |
|
Templates, Algorithms, and Technologies |
|
|
727 | (1) |
|
|
727 | (2) |
References |
|
729 | (62) |
Index |
|
791 | (42) |
About the Authors |
|
833 | |