E-book: Programming in Parallel with CUDA: A Practical Guide

Richard Ansorge (University of Cambridge)
  • Format: PDF+DRM
  • Publication date: 02-Jun-2022
  • Publisher: Cambridge University Press
  • Language: English
  • ISBN-13: 9781108858885
  • Price: 61,74 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

    Required software
    To read on mobile devices (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

"CUDA is now the dominant language used for programming GPUs, one of the most exciting hardware developments of recent decades. With CUDA, you can use a desktop PC for work that would have previously required a large cluster of PCs or access to a HPC facility. As a result, CUDA is increasingly important in scientific and technical computing across the whole STEM community, from medical physics and financial modelling to big data applications and beyond. This unique book on CUDA draws on the author's passion for and long experience of developing and using computers to acquire and analyse scientific data. The result is an innovative text featuring a much richer set of examples than found in any other comparable book on GPU computing. Much attention has been paid to the C++ coding style, which is compact, elegant and efficient. A code base of examples and supporting material is available online, which readers can build on for their own projects"--

Other information

A handy guide to speeding up scientific calculations with real-world examples including simulation, image processing and image registration.
List of Figures
List of Tables
List of Examples
Preface
1 Introduction to GPU Kernels and Hardware
1.1 Background
1.2 First CUDA Example
1.3 CPU Architecture
1.4 CPU Compute Power
1.5 CPU Memory Management: Latency Hiding Using Caches
1.6 CPU: Parallel Instruction Set
1.7 GPU Architecture
1.8 Pascal Architecture
1.9 GPU Memory Types
1.10 Warps and Waves
1.11 Blocks and Grids
1.12 Occupancy
2 Thinking and Coding in Parallel
2.1 Flynn's Taxonomy
2.2 Kernel Call Syntax
2.3 3D Kernel Launches
2.4 Latency Hiding and Occupancy
2.5 Parallel Patterns
2.6 Parallel Reduce
2.7 Shared Memory
2.8 Matrix Multiplication
2.9 Tiled Matrix Multiplication
2.10 BLAS
3 Warps and Cooperative Groups
3.1 CUDA Objects in Cooperative Groups
3.2 Tiled Partitions
3.3 Vector Loading
3.4 Warp-Level Intrinsic Functions and Sub-warps
3.5 Thread Divergence and Synchronisation
3.6 Avoiding Deadlock
3.7 Coalesced Groups
3.8 HPC Features
4 Parallel Stencils
4.1 2D Stencils
4.2 Cascaded Calculation of 2D Stencils
4.3 3D Stencils
4.4 Digital Image Processing
4.5 Sobel Filter
4.6 Median Filter
5 Textures
5.1 Image Interpolation
5.2 GPU Textures
5.3 Image Rotation
5.4 The Lerp Function
5.5 Texture Hardware
5.6 Colour Images
5.7 Viewing Images
5.8 Affine Transformations of Volumetric Images
5.9 3D Image Registration
5.10 Image Registration Results
6 Monte Carlo Applications
6.1 Introduction
6.2 The cuRAND Library
6.3 Generating Other Distributions
6.4 Ising Model
7 Concurrency Using CUDA Streams and Events
7.1 Concurrent Kernel Execution
7.2 CUDA Pipeline Example
7.3 Thrust and cudaDeviceReset
7.4 Results from the Pipeline Example
7.5 CUDA Events
7.6 Disk Overheads
7.7 CUDA Graphs
8 Application to PET Scanners
8.1 Introduction to PET
8.2 Data Storage and Definition of Scanner Geometry
8.3 Simulating a PET Scanner
8.4 Building the System Matrix
8.5 PET Reconstruction
8.6 Results
8.7 Implementation of OSEM
8.8 Depth of Interaction (DOI)
8.9 PET Results Using DOI
8.10 Block Detectors
8.11 Richardson-Lucy Image Deblurring
9 Scaling Up
9.1 GPU Selection
9.2 CUDA Unified Virtual Addressing (UVA)
9.3 Peer-to-Peer Access in CUDA
9.4 CUDA Zero-Copy Memory
9.5 Unified Memory (UM)
9.6 A Brief Introduction to MPI
10 Tools for Profiling and Debugging
10.1 The gpulog Example
10.2 Profiling with nvprof
10.3 Profiling with the NVIDIA Visual Profiler (NVVP)
10.4 Nsight Systems
10.5 Nsight Compute
10.6 Nsight Compute Sections
10.7 Debugging with Printf
10.8 Debugging with Microsoft Visual Studio
10.9 Debugging Kernel Code
10.10 Memory Checking
11 Tensor Cores
11.1 Tensor Cores and FP16
11.2 Warp Matrix Functions
11.3 Supported Data Types
11.4 Tensor Core Reduction
11.5 Conclusion
Appendix A A Brief History of CUDA
Appendix B Atomic Operations
Appendix C The NVCC Compiler
Appendix D AVX and the Intel Compiler
Appendix E Number Formats
Appendix F CUDA Documentation and Libraries
Appendix G The CX Header Files
Appendix H AI and Python
Appendix I Topics in C++
Index
Richard Ansorge is Emeritus University Senior Lecturer at the Cavendish Laboratory, University of Cambridge and Emeritus Tutor and Fellow at Fitzwilliam College, Cambridge. He is the author of over 170 peer-reviewed publications and co-author of the book The Physics and Mathematics of MRI (2016).