
Parallel Programming for Modern High Performance Computing Systems [Hardback]

  • Format: Hardback, 304 pages, height x width: 234x156 mm, weight: 614 g, 20 tables, black and white; 64 illustrations, black and white
  • Publication date: 28-Feb-2018
  • Publisher: CRC Press
  • ISBN-10: 1138305952
  • ISBN-13: 9781138305953
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters built from such computing devices, developing efficient parallel applications has become key to exploiting the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems.

It first discusses selected and popular state-of-the-art computing devices and systems available today. These include multicore CPUs, manycore (co)processors such as Intel Xeon Phi, accelerators such as GPUs, and clusters, as well as the programming models supported on these platforms.

It next introduces parallelization through important programming paradigms, such as master-slave, geometric Single Program Multiple Data (SPMD), and divide-and-conquer.

The practical and useful elements of the most popular and important APIs for programming parallel HPC systems are discussed, including MPI, OpenMP, Pthreads, CUDA, OpenCL, and OpenACC. It also demonstrates, through selected code listings, how selected APIs can be used to implement important programming paradigms. Furthermore, it shows how the codes can be compiled and executed in a Linux environment.

The book also presents hybrid codes that integrate selected APIs for potentially multi-level parallelization and utilization of heterogeneous resources, and it shows how to use modern elements of these APIs. Selected optimization techniques are also included, such as overlapping communication and computations implemented using various APIs.

Features:

  • Discusses the popular and currently available computing devices and cluster systems
  • Includes typical paradigms used in parallel programs
  • Explores popular APIs for programming parallel applications
  • Provides code templates that can be used for implementation of paradigms
  • Provides hybrid code examples allowing multi-level parallelization
  • Covers the optimization of parallel programs
List of figures xiii
List of tables xvii
List of listings xix
Preface xxiii
Chapter 1 Understanding the need for parallel computing 1(10)
1.1 Introduction 1(1)
1.2 From Problem to Parallel Solution - Development Steps 2(2)
1.3 Approaches to Parallelization 4(2)
1.4 Selected Use Cases With Popular APIs 6(1)
1.5 Outline of the Book 7(4)
Chapter 2 Overview of selected parallel and distributed systems for high performance computing 11(18)
2.1 Generic Taxonomy of Parallel Computing Systems 11(1)
2.2 Multicore CPUs 12(2)
2.3 GPUs 14(3)
2.4 Manycore CPUs/Coprocessors 17(2)
2.5 Cluster Systems 19(1)
2.6 Growth of High Performance Computing Systems and Relevant Metrics 20(2)
2.7 Volunteer-Based Systems 22(3)
2.8 Grid Systems 25(4)
Chapter 3 Typical paradigms for parallel applications 29(40)
3.1 Aspects of Parallelization 30(5)
3.1.1 Data partitioning and granularity 30(2)
3.1.2 Communication 32(1)
3.1.3 Data allocation 32(1)
3.1.4 Load balancing 33(1)
3.1.5 HPC related metrics 34(1)
3.2 Master-Slave 35(4)
3.3 Geometric SPMD 39(16)
3.4 Pipelining 55(1)
3.5 Divide-and-Conquer 56(13)
Chapter 4 Selected APIs for parallel programming 69(116)
4.1 Message Passing Interface (MPI) 74(28)
4.1.1 Programming model and application structure 74(1)
4.1.2 The world of MPI processes and threads 75(1)
4.1.3 Initializing and finalizing usage of MPI 75(1)
4.1.4 Communication modes 76(1)
4.1.5 Basic point-to-point communication routines 76(2)
4.1.6 Basic MPI collective communication routines 78(5)
4.1.7 Packing buffers and creating custom data types 83(2)
4.1.8 Receiving a message with wildcards 85(1)
4.1.9 Receiving a message with unknown data size 86(1)
4.1.10 Various send modes 87(1)
4.1.11 Non-blocking communication 88(2)
4.1.12 One-sided MPI API 90(5)
4.1.13 A sample MPI application 95(2)
4.1.14 Multithreading in MPI 97(2)
4.1.15 Dynamic creation of processes in MPI 99(2)
4.1.16 Parallel MPI I/O 101(1)
4.2 OpenMP 102(16)
4.2.1 Programming model and application structure 102(2)
4.2.2 Commonly used directives and functions 104(5)
4.2.3 The number of threads in a parallel region 109(1)
4.2.4 Synchronization of threads within a parallel region and single thread execution 109(2)
4.2.5 Important environment variables 111(1)
4.2.6 A sample OpenMP application 112(3)
4.2.7 Selected SIMD directives 115(1)
4.2.8 Device offload instructions 115(2)
4.2.9 Tasking in OpenMP 117(1)
4.3 Pthreads 118(9)
4.3.1 Programming model and application structure 118(3)
4.3.2 Mutual exclusion 121(2)
4.3.3 Using condition variables 123(1)
4.3.4 Barrier 124(1)
4.3.5 Synchronization 125(1)
4.3.6 A sample Pthreads application 125(2)
4.4 CUDA 127(20)
4.4.1 Programming model and application structure 127(4)
4.4.2 Scheduling and synchronization 131(3)
4.4.3 Constraints 134(1)
4.4.4 A sample CUDA application 134(3)
4.4.5 Streams and asynchronous operations 137(4)
4.4.6 Dynamic parallelism 141(2)
4.4.7 Unified Memory in CUDA 143(2)
4.4.8 Management of GPU devices 145(2)
4.5 OpenCL 147(20)
4.5.1 Programming model and application structure 147(8)
4.5.2 Coordinates and Indexing 155(1)
4.5.3 Queuing data reads/writes and kernel execution 156(1)
4.5.4 Synchronization functions 157(1)
4.5.5 A sample OpenCL application 158(9)
4.6 OpenACC 167(5)
4.6.1 Programming model and application structure 167(1)
4.6.2 Common directives 168(1)
4.6.3 Data management 169(2)
4.6.4 A sample OpenACC application 171(1)
4.6.5 Asynchronous processing and synchronization 171(1)
4.6.6 Device management 172(1)
4.7 Selected Hybrid Approaches 172(13)
4.7.1 MPI+Pthreads 173(4)
4.7.2 MPI+OpenMP 177(3)
4.7.3 MPI+CUDA 180(5)
Chapter 5 Programming parallel paradigms using selected APIs 185(66)
5.1 Master-Slave 185(33)
5.1.1 MPI 186(4)
5.1.2 OpenMP 190(7)
5.1.3 MPI+OpenMP 197(2)
5.1.4 MPI+Pthreads 199(8)
5.1.5 CUDA 207(6)
5.1.6 OpenMP+CUDA 213(5)
5.2 Geometric SPMD 218(11)
5.2.1 MPI 218(2)
5.2.2 MPI+OpenMP 220(5)
5.2.3 OpenMP 225(1)
5.2.4 MPI+CUDA 225(4)
5.3 Divide-and-Conquer 229(22)
5.3.1 OpenMP 229(3)
5.3.2 CUDA 232(3)
5.3.3 MPI 235(1)
5.3.3.1 Balanced version 236(4)
5.3.3.2 Version with dynamic process creation 240(11)
Chapter 6 Optimization techniques and best practices for parallel codes 251(22)
6.1 Data Prefetching, Communication and Computations Overlapping and Increasing Computation Efficiency 252(5)
6.1.1 MPI 253(3)
6.1.2 CUDA 256(1)
6.2 Data Granularity 257(1)
6.3 Minimization of Overheads 258(2)
6.3.1 Initialization and synchronization overheads 258(2)
6.3.2 Load balancing vs cost of synchronization 260(1)
6.4 Process/Thread Affinity 260(1)
6.5 Data Types and Accuracy 261(1)
6.6 Data Organization and Arrangement 261(1)
6.7 Checkpointing 262(2)
6.8 Simulation of Parallel Application Execution 264(1)
6.9 Best Practices and Typical Optimizations 265(8)
6.9.1 GPUs/CUDA 265(1)
6.9.2 Intel Xeon Phi 266(3)
6.9.3 Clusters 269(1)
6.9.4 Hybrid systems 270(3)
Appendix A Resources 273(2)
A.1 Software Packages 273(2)
Appendix B Further reading 275(22)
B.1 Context of This Book 275(1)
B.2 Other Resources on Parallel Programming 275(22)
Index 297
Paweł Czarnul