List of Figures |
|
xiii | |
List of Tables |
|
xxi | |
Series in Computational Physics |
|
xxiii | |
Foreword |
|
xxv | |
Preface |
|
xxvii | |
About the Editors |
|
xxxiii | |
Contributors |
|
xxxv | |
1 The Charm++ Programming Model |
|
1 | (16) |
|
|
|
|
2 | (1) |
|
1.2 Object-Based Programming Model |
|
|
3 | (5) |
|
1.3 Capabilities of the Adaptive Runtime System |
|
|
8 | (2) |
|
1.4 Extensions to the Basic Model |
|
|
10 | (3) |
|
|
13 | (1) |
|
1.6 Other Languages in the Charm++ Family |
|
|
14 | (1) |
|
|
15 | (1) |
|
|
16 | (1) |
2 Designing Charm++ Programs |
|
17 | (18) |
|
|
2.1 Simple Stencil: Using Over-Decomposition and Selecting Grain-size |
|
|
17 | (6) |
|
2.1.1 Grainsize Decisions |
|
|
18 | (3) |
|
|
21 | (1) |
|
2.1.3 Migrating Chares, Load Balancing and Fault Tolerance |
|
|
21 | (2) |
|
2.2 Multi-Physics Modules Using Multiple Chare Arrays |
|
|
23 | (4) |
|
|
24 | (3) |
|
2.3 SAMR: Chare Arrays with Dynamic Insertion and Flexible Indices |
|
|
27 | (2) |
|
2.4 Combinatorial Search: Task Parallelism |
|
|
29 | (1) |
|
2.5 Other Features and Design Considerations |
|
|
30 | (1) |
|
2.6 Utility of Charm++ for Future Applications |
|
|
31 | (1) |
|
|
32 | (3) |
3 Tools for Debugging and Performance Analysis |
|
35 | (26) |
|
|
|
|
|
|
|
36 | (1) |
|
3.2 Scalable Debugging with CharmDebug |
|
|
36 | (11) |
|
3.2.1 Accessing User Information |
|
|
37 | (3) |
|
3.2.2 Debugging Problems at Large Scale |
|
|
40 | (6) |
|
|
46 | (1) |
|
3.3 Performance Visualization and Analysis via Projections |
|
|
47 | (13) |
|
3.3.1 A Simple Projections Primer |
|
|
49 | (2) |
|
3.3.2 Features of Projections via Use Cases |
|
|
51 | (7) |
|
3.3.3 Advanced Features for Scalable Performance Analysis |
|
|
58 | (1) |
|
|
59 | (1) |
|
|
60 | (1) |
4 Scalable Molecular Dynamics with NAMD |
|
61 | (18) |
|
|
|
|
|
|
|
|
|
61 | (1) |
|
4.2 Need for Biomolecular Simulations |
|
|
62 | (1) |
|
4.3 Parallel Molecular Dynamics |
|
|
63 | (1) |
|
4.4 NAMD's Parallel Design |
|
|
64 | (3) |
|
|
65 | (1) |
|
|
66 | (1) |
|
4.5 Enabling Large Simulations |
|
|
67 | (5) |
|
4.5.1 Hierarchical Load Balancing |
|
|
67 | (1) |
|
|
68 | (2) |
|
4.5.3 Optimizing Fine-Grained Communication in NAMD |
|
|
70 | (1) |
|
4.5.4 Parallel Input/Output |
|
|
71 | (1) |
|
|
72 | (3) |
|
4.7 Simulations Enabled by NAMD |
|
|
75 | (1) |
|
|
76 | (3) |
5 OpenAtom: Ab initio Molecular Dynamics for Petascale Platforms |
|
79 | (26) |
|
|
|
|
|
|
|
80 | (1) |
|
5.2 Car-Parrinello Molecular Dynamics |
|
|
81 | (6) |
|
5.2.1 Density Functional Theory, KS Density Functional Theory and the Local Density Approximation |
|
|
82 | (2) |
|
5.2.2 DFT Computations within Basis Sets |
|
|
84 | (1) |
|
|
84 | (1) |
|
5.2.4 Ab initio Molecular Dynamics and CPAIMD |
|
|
84 | (1) |
|
|
85 | (1) |
|
|
86 | (1) |
|
5.3 Parallel Application Design |
|
|
87 | (8) |
|
5.3.1 Modular Design and Benefits |
|
|
87 | (2) |
|
|
89 | (4) |
|
5.3.3 Topology Aware Mapping |
|
|
93 | (2) |
|
5.4 Charm++ Feature Development |
|
|
95 | (2) |
|
|
97 | (1) |
|
5.6 Impact on Science and Technology |
|
|
98 | (5) |
|
5.6.1 Carbon Based Materials for Photovoltaic Applications |
|
|
98 | (4) |
|
5.6.2 Metal Insulator Transitions for Novel Devices |
|
|
102 | (1) |
|
|
103 | (2) |
6 N-body Simulations with ChaNGa |
|
105 | (32) |
|
|
|
|
|
|
106 | (1) |
|
|
107 | (8) |
|
6.2.1 Domain Decomposition and Load Balancing |
|
|
108 | (2) |
|
|
110 | (2) |
|
|
112 | (1) |
|
|
113 | (1) |
|
6.2.5 Periodic Boundary Conditions |
|
|
114 | (1) |
|
|
114 | (1) |
|
|
115 | (1) |
|
|
115 | (6) |
|
|
115 | (1) |
|
|
116 | (5) |
|
|
121 | (12) |
|
6.4.1 Domain Decomposition and Tree Build Performance |
|
|
121 | (2) |
|
6.4.2 Single-Stepping Performance |
|
|
123 | (2) |
|
6.4.3 Multi-Stepping Performance |
|
|
125 | (1) |
|
|
125 | (8) |
|
6.5 Conclusions and Future Work |
|
|
133 | (4) |
7 Remote Visualization of Cosmological Data Using Salsa |
|
137 | (12) |
|
|
|
|
137 | (1) |
|
7.2 Salsa Client/Server Rendering Architecture |
|
|
138 | (6) |
|
7.2.1 Client Server Communication Styles |
|
|
139 | (2) |
|
7.2.2 Image Compression in Salsa |
|
|
141 | (1) |
|
7.2.3 GPU Particle Rendering on the Server |
|
|
142 | (2) |
|
7.3 Remote Visualization User Interface |
|
|
144 | (1) |
|
7.4 Example Use: Galaxy Clusters |
|
|
145 | (4) |
8 Improving Scalability of BRAMS: a Regional Weather Forecast Model |
|
149 | (38) |
|
|
|
|
|
150 | (1) |
|
8.2 Load Balancing Strategies for Weather Models |
|
|
151 | (2) |
|
8.3 The BRAMS Weather Model |
|
|
153 | (1) |
|
8.4 Load Balancing Approach |
|
|
154 | (4) |
|
8.4.1 Adaptations to AMPI |
|
|
155 | (2) |
|
8.4.2 Balancing Algorithms Employed |
|
|
157 | (1) |
|
|
158 | (4) |
|
8.6 Fully Distributed Strategies |
|
|
162 | (4) |
|
8.6.1 Hilbert Curve-Based Load Balancer |
|
|
162 | (2) |
|
8.6.2 Diffusion-Based Load Balancer |
|
|
164 | (2) |
|
|
166 | (16) |
|
8.7.1 First Set of Experiments: Privatization Strategy |
|
|
166 | (1) |
|
8.7.2 Second Set of Experiments: Virtualization Effects |
|
|
167 | (4) |
|
8.7.3 Third Set of Experiments: Centralized Load Balancers |
|
|
171 | (3) |
|
8.7.4 Fourth Set of Experiments: Distributed Load Balancers |
|
|
174 | (8) |
|
|
182 | (5) |
9 Crack Propagation Analysis with Automatic Load Balancing |
|
187 | (24) |
|
|
|
|
|
|
187 | (6) |
|
|
188 | (2) |
|
9.1.2 Implementation of the ParFUM Framework |
|
|
190 | (3) |
|
9.2 Load Balancing Finite Element Codes in Charm++ |
|
|
193 | (6) |
|
9.2.1 Runtime Support for Thread Migration |
|
|
193 | (1) |
|
9.2.2 Comparison to Prior Work |
|
|
194 | (1) |
|
9.2.3 Automatic Load Balancing for FEM |
|
|
195 | (1) |
|
9.2.4 Load Balancing Strategies |
|
|
196 | (1) |
|
9.2.5 Agile Load Balancing |
|
|
197 | (2) |
|
9.3 Cohesive and Elasto-plastic Finite Element Model of Fracture |
|
|
199 | (11) |
|
9.3.1 Case Study 1: Elasto-Plastic Wave Propagation |
|
|
202 | (4) |
|
9.3.2 Case Study 2: Dynamic Fracture |
|
|
206 | (4) |
|
|
210 | (1) |
10 Contagion Diffusion with EpiSimdemics |
|
211 | (36) |
|
|
|
|
|
|
|
|
|
212 | (3) |
|
|
215 | (3) |
|
|
215 | (2) |
|
10.2.2 Application to Computational Epidemiology |
|
|
217 | (1) |
|
|
218 | (6) |
|
|
220 | (1) |
|
10.3.2 Modeling Behavior of Individual Agents |
|
|
220 | (1) |
|
10.3.3 Intervention and Behavior Modification |
|
|
221 | (2) |
|
10.3.4 Social Network Representation |
|
|
223 | (1) |
|
10.4 EpiSimdemics Algorithm |
|
|
224 | (3) |
|
10.5 Charm++ Implementation |
|
|
227 | (5) |
|
10.5.1 Designing the Chares |
|
|
228 | (2) |
|
10.5.2 The EpiSimdemics Algorithm |
|
|
230 | (2) |
|
|
232 | (1) |
|
10.6 Performance of EpiSimdemics |
|
|
232 | (9) |
|
10.6.1 Experimental Setup |
|
|
233 | (1) |
|
10.6.2 Performance Characteristics |
|
|
233 | (1) |
|
10.6.3 Effects of Synchronization |
|
|
234 | (1) |
|
10.6.4 Effects of Strong Scaling |
|
|
235 | (1) |
|
10.6.5 Effects of Weak Scaling |
|
|
236 | (1) |
|
10.6.6 Effect of Load Balancing |
|
|
237 | (4) |
|
10.7 Representative Study |
|
|
241 | (6) |
Bibliography |
|
247 | (24) |
Index |
|
271 | |