
E-book: Parallel Scientific Computing [Wiley Online]

  • Format: 384 pages
  • Series: ISTE
  • Publication date: 15-Dec-2015
  • Publisher: ISTE Ltd and John Wiley & Sons Inc
  • ISBN-10: 1118761685
  • ISBN-13: 9781118761687
  • Price: €174.45*
  • * a price that grants access for an unlimited number of simultaneous users for an unlimited period
Other books on this topic:
  • Wiley Online

Scientific computing has become an indispensable tool in numerous fields, such as physics, mechanics, biology,
finance and industry. Thanks to efficient algorithms adapted to current computers, it enables us to simulate, without resorting to physical mock-ups or experiments, the deflection of a beam in bending, the sound level in a theater, or the flow of a fluid around an aircraft wing.
This book presents scientific computing techniques applied to parallel computing for the numerical simulation of large-scale problems, namely problems arising from systems modeled by partial differential equations. Parallel computing concepts are introduced through examples.
Implementation and programming techniques stemming from the finite element method are presented for direct solvers, iterative solvers and domain decomposition methods, together with an introduction to MPI and OpenMP.
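
To give a flavor of the programming style the book introduces, here is a minimal OpenMP sketch of a data-parallel loop with a reduction, the pattern covered in Chapter 2 (sections 2.3.4 and 2.3.6). This is an illustration written for this listing under our own assumptions, not code taken from the book:

    #include <stdio.h>
    #include <omp.h>

    #define N 1000000

    /* Dot product parallelized with an OpenMP reduction clause:
     * each thread accumulates a private partial sum, and OpenMP
     * combines the partial sums when the loop completes. */
    static double dot(const double *x, const double *y, int n)
    {
        double s = 0.0;
    #pragma omp parallel for reduction(+:s)
        for (int i = 0; i < n; i++)
            s += x[i] * y[i];
        return s;
    }

    int main(void)
    {
        static double x[N], y[N];
        for (int i = 0; i < N; i++) {
            x[i] = 1.0;
            y[i] = 0.5;
        }
        /* Expected result: N * 1.0 * 0.5 = 500000 */
        printf("dot = %g, max threads = %d\n",
               dot(x, y, N), omp_get_max_threads());
        return 0;
    }

Compile with, for example, cc -fopenmp dot.c; without the flag the pragma is ignored and the loop runs serially, which is one reason this style is attractive for incremental parallelization.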

Preface xi
Introduction xv
Chapter 1 Computer Architectures 1(16)
1.1 Different types of parallelism 1(6)
1.1.1 Overlap, concurrency and parallelism 1(3)
1.1.2 Temporal and spatial parallelism for arithmetic logic units 4(2)
1.1.3 Parallelism and memory 6(1)
1.2 Memory architecture 7(7)
1.2.1 Interleaved multi-bank memory 7(1)
1.2.2 Memory hierarchy 8(5)
1.2.3 Distributed memory 13(1)
1.3 Hybrid architecture 14(3)
1.3.1 Graphics-type accelerators 14(2)
1.3.2 Hybrid computers 16(1)
Chapter 2 Parallelization and Programming Models 17(36)
2.1 Parallelization 17(2)
2.2 Performance criteria 19(6)
2.2.1 Degree of parallelism 19(2)
2.2.2 Load balancing 21(1)
2.2.3 Granularity 21(1)
2.2.4 Scalability 22(3)
2.3 Data parallelism 25(12)
2.3.1 Loop tasks 25(1)
2.3.2 Dependencies 26(1)
2.3.3 Examples of dependence 27(3)
2.3.4 Reduction operations 30(1)
2.3.5 Nested loops 31(3)
2.3.6 OpenMP 34(3)
2.4 Vectorization: a case study 37(6)
2.4.1 Vector computers and vectorization 37(1)
2.4.2 Dependence 38(1)
2.4.3 Reduction operations 39(2)
2.4.4 Pipeline operations 41(2)
2.5 Message-passing 43(6)
2.5.1 Message-passing programming 43(1)
2.5.2 Parallel environment management 44(1)
2.5.3 Point-to-point communications 45(1)
2.5.4 Collective communications 46(3)
2.6 Performance analysis 49(4)
Chapter 3 Parallel Algorithm Concepts 53(18)
3.1 Parallel algorithms for recurrences 54(4)
3.1.1 The principles of reduction methods 54(1)
3.1.2 Overhead and stability of reduction methods 55(2)
3.1.3 Cyclic reduction 57(1)
3.2 Data locality and distribution: product of matrices 58(13)
3.2.1 Row and column algorithms 58(2)
3.2.2 Block algorithms 60(4)
3.2.3 Distributed algorithms 64(2)
3.2.4 Implementation 66(5)
Chapter 4 Basics of Numerical Matrix Analysis 71(22)
4.1 Review of basic notions of linear algebra 71(8)
4.1.1 Vector spaces, scalar products and orthogonal projection 71(3)
4.1.2 Linear applications and matrices 74(5)
4.2 Properties of matrices 79(14)
4.2.1 Matrices, eigenvalues and eigenvectors 79(1)
4.2.2 Norms of a matrix 80(3)
4.2.3 Basis change 83(2)
4.2.4 Conditioning of a matrix 85(8)
Chapter 5 Sparse Matrices 93(12)
5.1 Origins of sparse matrices 93(5)
5.2 Parallel formation of sparse matrices: shared memory 98(1)
5.3 Parallel formation by block of sparse matrices: distributed memory 99(6)
5.3.1 Parallelization by sets of vertices 99(2)
5.3.2 Parallelization by sets of elements 101(1)
5.3.3 Comparison: sets of vertices and elements 101(4)
Chapter 6 Solving Linear Systems 105(4)
6.1 Direct methods 105(1)
6.2 Iterative methods 106(3)
Chapter 7 LU Methods for Solving Linear Systems 109(16)
7.1 Principle of LU decomposition 109(4)
7.2 Gauss factorization 113(2)
7.3 Gauss-Jordan factorization 115(6)
7.3.1 Row pivoting 118(3)
7.4 Crout and Cholesky factorizations for symmetric matrices 121(4)
Chapter 8 Parallelization of LU Methods for Dense Matrices 125(14)
8.1 Block factorization 125(5)
8.2 Implementation of block factorization in a message-passing environment 130(5)
8.3 Parallelization of forward and backward substitutions 135(4)
Chapter 9 LU Methods for Sparse Matrices 139(22)
9.1 Structure of factorized matrices 139(3)
9.2 Symbolic factorization and renumbering 142(5)
9.3 Elimination trees 147(5)
9.4 Elimination trees and dependencies 152(1)
9.5 Nested dissections 153(6)
9.6 Forward and backward substitutions 159(2)
Chapter 10 Basics of Krylov Subspaces 161(6)
10.1 Krylov subspaces 161(3)
10.2 Construction of the Arnoldi basis 164(3)
Chapter 11 Methods with Complete Orthogonalization for Symmetric Positive Definite Matrices 167(18)
11.1 Construction of the Lanczos basis for symmetric matrices 167(1)
11.2 The Lanczos method 168(5)
11.3 The conjugate gradient method 173(4)
11.4 Comparison with the gradient method 177(3)
11.5 Principle of preconditioning for symmetric positive definite matrices 180(5)
Chapter 12 Exact Orthogonalization Methods for Arbitrary Matrices 185(16)
12.1 The GMRES method 185(8)
12.2 The case of symmetric matrices: the MINRES method 193(3)
12.3 The ORTHODIR method 196(2)
12.4 Principle of preconditioning for non-symmetric matrices 198(3)
Chapter 13 Biorthogonalization Methods for Non-symmetric Matrices 201(24)
13.1 Lanczos biorthogonal basis for non-symmetric matrices 201(5)
13.2 The non-symmetric Lanczos method 206(1)
13.3 The biconjugate gradient method: BiCG 207(4)
13.4 The quasi-minimal residual method: QMR 211(6)
13.5 The BiCGSTAB 217(8)
Chapter 14 Parallelization of Krylov Methods 225(18)
14.1 Parallelization of dense matrix-vector product 225(2)
14.2 Parallelization of sparse matrix-vector product based on node sets 227(2)
14.3 Parallelization of sparse matrix-vector product based on element sets 229(9)
14.3.1 Review of the principles of domain decomposition 229(2)
14.3.2 Matrix-vector product 231(2)
14.3.3 Interface exchanges 233(3)
14.3.4 Asynchronous matrix-vector product with non-blocking communications 236(1)
14.3.5 Comparison: parallelization based on node and element sets 236(2)
14.4 Parallelization of the scalar product 238(3)
14.4.1 By weight 239(1)
14.4.2 By distributivity 239(1)
14.4.3 By ownership 240(1)
14.5 Summary of the parallelization of Krylov methods 241(2)
Chapter 15 Parallel Preconditioning Methods 243(36)
15.1 Diagonal 243(2)
15.2 Incomplete factorization methods 245(5)
15.2.1 Principle 245(3)
15.2.2 Parallelization 248(2)
15.3 Schur complement method 250(7)
15.3.1 Optimal local preconditioning 250(1)
15.3.2 Principle of the Schur complement method 251(3)
15.3.3 Properties of the Schur complement method 254(3)
15.4 Algebraic multigrid 257(6)
15.4.1 Preconditioning using projection 257(1)
15.4.2 Algebraic construction of a coarse grid 258(3)
15.4.3 Algebraic multigrid methods 261(2)
15.5 The Schwarz additive method of preconditioning 263(12)
15.5.1 Principle of the overlap 263(2)
15.5.2 Multiplicative versus additive Schwarz methods 265(3)
15.5.3 Additive Schwarz preconditioning 268(1)
15.5.4 Restricted additive Schwarz: parallel implementation 269(6)
15.6 Preconditioners based on the physics 275(4)
15.6.1 Gauss-Seidel method 275(1)
15.6.2 Linelet method 276(3)
Appendices 279(60)
Appendix 1 281(20)
Appendix 2 301(22)
Appendix 3 323(16)
Bibliography 339(4)
Index 343
Frédéric Magoulès is Professor at LISA / MAS, École Centrale Paris, France.

François-Xavier Roux is Professor at University Pierre & Marie Curie - Paris 6, France.