
E-book: Programming in Parallel with CUDA: A Practical Guide

4.00/5 (4 ratings from Goodreads)

Richard Ansorge (University of Cambridge)

Format: PDF+DRM
Publication date: 02-Jun-2022
Publisher: Cambridge University Press
Language: English
ISBN-13: 9781108858885

Price: €61.74*
* The price is final; no further discounts apply.
This e-book is for personal use only. E-books cannot be returned.


DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; not to be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

"CUDA is now the dominant language used for programming GPUs, one of the most exciting hardware developments of recent decades. With CUDA, you can use a desktop PC for work that would have previously required a large cluster of PCs or access to a HPC facility. As a result, CUDA is increasingly important in scientific and technical computing across the whole STEM community, from medical physics and financial modelling to big data applications and beyond. This unique book on CUDA draws on the author's passion for and long experience of developing and using computers to acquire and analyse scientific data. The result is an innovative text featuring a much richer set of examples than found in any other comparable book on GPU computing. Much attention has been paid to the C++ coding style, which is compact, elegant and efficient. A code base of examples and supporting material is available online, which readers can build on for their own projects"--

Other information

A handy guide to speeding up scientific calculations with real-world examples including simulation, image processing and image registration.

List of Figures
List of Tables
List of Examples
Preface

1 Introduction to GPU Kernels and Hardware
1.1 Background
1.2 First CUDA Example
1.3 CPU Architecture
1.4 CPU Compute Power
1.5 CPU Memory Management: Latency Hiding Using Caches
1.6 CPU: Parallel Instruction Set
1.7 GPU Architecture
1.8 Pascal Architecture
1.9 GPU Memory Types
1.10 Warps and Waves
1.11 Blocks and Grids
1.12 Occupancy

2 Thinking and Coding in Parallel
2.1 Flynn's Taxonomy
2.2 Kernel Call Syntax
2.3 3D Kernel Launches
2.4 Latency Hiding and Occupancy
2.5 Parallel Patterns
2.6 Parallel Reduce
2.7 Shared Memory
2.8 Matrix Multiplication
2.9 Tiled Matrix Multiplication
2.10 BLAS

3 Warps and Cooperative Groups
3.1 CUDA Objects in Cooperative Groups
3.2 Tiled Partitions
3.3 Vector Loading
3.4 Warp-Level Intrinsic Functions and Sub-warps
3.5 Thread Divergence and Synchronisation
3.6 Avoiding Deadlock
3.7 Coalesced Groups
3.8 HPC Features

4 Parallel Stencils
4.1 2D Stencils
4.2 Cascaded Calculation of 2D Stencils
4.3 3D Stencils
4.4 Digital Image Processing
4.5 Sobel Filter
4.6 Median Filter

5 Textures
5.1 Image Interpolation
5.2 GPU Textures
5.3 Image Rotation
5.4 The Lerp Function
5.5 Texture Hardware
5.6 Colour Images
5.7 Viewing Images
5.8 Affine Transformations of Volumetric Images
5.9 3D Image Registration
5.10 Image Registration Results

6 Monte Carlo Applications
6.1 Introduction
6.2 The cuRAND Library
6.3 Generating Other Distributions
6.4 Ising Model

7 Concurrency Using CUDA Streams and Events
7.1 Concurrent Kernel Execution
7.2 CUDA Pipeline Example
7.3 Thrust and cudaDeviceReset
7.4 Results from the Pipeline Example
7.5 CUDA Events
7.6 Disk Overheads
7.7 CUDA Graphs

8 Application to PET Scanners
8.1 Introduction to PET
8.2 Data Storage and Definition of Scanner Geometry
8.3 Simulating a PET Scanner
8.4 Building the System Matrix
8.5 PET Reconstruction
8.6 Results
8.7 Implementation of OSEM
8.8 Depth of Interaction (DOI)
8.9 PET Results Using DOI
8.10 Block Detectors
8.11 Richardson-Lucy Image Deblurring

9 Scaling Up
9.1 GPU Selection
9.2 CUDA Unified Virtual Addressing (UVA)
9.3 Peer-to-Peer Access in CUDA
9.4 CUDA Zero-Copy Memory
9.5 Unified Memory (UM)
9.6 A Brief Introduction to MPI

10 Tools for Profiling and Debugging
10.1 The gpulog Example
10.2 Profiling with nvprof
10.3 Profiling with the NVIDIA Visual Profiler (NVVP)
10.4 Nsight Systems
10.5 Nsight Compute
10.6 Nsight Compute Sections
10.7 Debugging with Printf
10.8 Debugging with Microsoft Visual Studio
10.9 Debugging Kernel Code
10.10 Memory Checking

11 Tensor Cores
11.1 Tensor Cores and FP16
11.2 Warp Matrix Functions
11.3 Supported Data Types
11.4 Tensor Core Reduction
11.5 Conclusion

Appendix A A Brief History of CUDA
Appendix B Atomic Operations
Appendix C The NVCC Compiler
Appendix D AVX and the Intel Compiler
Appendix E Number Formats
Appendix F CUDA Documentation and Libraries
Appendix G The CX Header Files
Appendix H AI and Python
Appendix I Topics in C++
Index

Richard Ansorge is Emeritus University Senior Lecturer at the Cavendish Laboratory, University of Cambridge and Emeritus Tutor and Fellow at Fitzwilliam College, Cambridge. He is the author of over 170 peer-reviewed publications and co-author of the book The Physics and Mathematics of MRI (2016).


Permalink: https://www.kriso.ee/db/97811088588852e.html
