E-book: Programming Massively Parallel Processors: A Hands-on Approach

4.02/5 (212 ratings by Goodreads)

David B. Kirk (NVIDIA Fellow), Wen-mei W. Hwu (CTO, MulticoreWare and professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA)

Format: EPUB+DRM
Pub. Date: 22-Feb-2010
Publisher: Morgan Kaufmann Publishers In
Language: eng
ISBN-13: 9780123814739

Other books in subject:

Computer programming / software development

Format - EPUB+DRM
Price: 58,68 €*
* The price is final, i.e. no additional discount will apply.
This ebook is for personal use only. E-Books are non-refundable.

DRM restrictions

Copying (copy/paste): not allowed

Printing: not allowed
Usage:

Digital Rights Management (DRM)
The publisher has supplied this book in encrypted form, which means that you need to install free software in order to unlock and read it. To read this e-book you have to create an Adobe ID. The ebook can be read and downloaded on up to 6 devices (single user with the same Adobe ID).

Required software
To read this ebook on a mobile device (phone or tablet) you'll need to install this free app: PocketBook Reader (iOS / Android)

To download and read this eBook on a PC or Mac you need Adobe Digital Editions (This is a free app specially developed for eBooks. It's not the same as Adobe Reader, which you probably already have on your computer.)

You can't read this ebook with Amazon Kindle

Programming Massively Parallel Processors discusses the basic concepts of parallel programming and GPU architecture. Various techniques for constructing parallel programs are explored in detail. Case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs.

This book describes computational thinking techniques that will enable students to think about problems in ways that are amenable to high-performance parallel computing. It utilizes CUDA (Compute Unified Device Architecture), NVIDIA's software development tool created specifically for massively parallel environments. Students learn how to achieve both high performance and high reliability using the CUDA programming model as well as OpenCL.

This book is recommended for advanced students, software engineers, programmers, and hardware engineers.

Reviews

"For those interested in the GPU path to parallel enlightenment, this new book from David Kirk and Wen-mei Hwu is a godsend, as it introduces CUDA™, a C-like data parallel language, and Tesla™, the architecture of the current generation of NVIDIA GPUs. In addition to explaining the language and the architecture, they define the nature of data parallel problems that run well on the heterogeneous CPU-GPU hardware ... This book is a valuable addition to the recently reinvigorated parallel computing literature." --David Patterson, Director of The Parallel Computing Research Laboratory and the Pardee Professor of Computer Science, U.C. Berkeley. Co-author of Computer Architecture: A Quantitative Approach

"Written by two teaching pioneers, this book is the definitive practical reference on programming massively parallel processors--a true technological gold mine. The hands-on learning included is cutting-edge, yet very readable. This is a most rewarding read for students, engineers, and scientists interested in supercharging computational resources to solve today's and tomorrow's hardest problems." --Nicolas Pinto, MIT, NVIDIA Fellow, 2009

"I have always admired Wen-mei Hwu's and David Kirk's ability to turn complex problems into easy-to-comprehend concepts. They have done it again in this book. This joint venture of a passionate teacher and a GPU evangelizer tackles the trade-off between the simple explanation of the concepts and the in-depth analysis of the programming techniques. This is a great book to learn both massive parallel programming and CUDA." --Mateo Valero, Director, Barcelona Supercomputing Center

"The use of GPUs is having a big impact in scientific computing. David Kirk and Wen-mei Hwu's new book is an important contribution towards educating our students on the ideas and techniques of programming for massively parallel processors." --Mike Giles, Professor of Scientific Computing, University of Oxford

"This book is the most comprehensive and authoritative introduction to GPU computing yet. David Kirk and Wen-mei Hwu are the pioneers in this increasingly important field, and their insights are invaluable and fascinating. This book will be the standard reference for years to come." --Hanspeter Pfister, Harvard University

"This is a vital and much-needed text. GPU programming is growing by leaps and bounds. This new book will be very welcomed and highly useful across inter-disciplinary fields." --Shannon Steinfadt, Kent State University

"GPUs have hundreds of cores capable of delivering transformative performance increases across a wide range of computational challenges. The rise of these multi-core architectures has raised the need to teach advanced programmers a new and essential skill: how to program massively parallel processors." --CNNMoney.com

"This book is a valuable resource for all students from science and engineering disciplines where parallel programming skills are needed to allow solving compute-intensive problems." --BCS: The British Computer Society's online journal

Preface

Acknowledgments

Dedication

Introduction

GPUs as Parallel Computers

Architecture of a Modern GPU

Why More Speed or Parallelism?

Parallel Programming Languages and Models

Overarching Goals

Organization of the Book

History of GPU Computing

Evolution of Graphics Pipelines

The Era of Fixed-Function Graphics Pipelines

Evolution of Programmable Real-Time Graphics

Unified Graphics and Computing Processors

GPGPU: An Intermediate Step

GPU Computing

Scalable GPUs

Recent Developments

Future Trends

Introduction to CUDA

Data Parallelism

CUDA Program Structure

A Matrix-Matrix Multiplication Example

Device Memories and Data Transfer

Kernel Functions and Threading

Summary

Function declarations

Kernel launch

Predefined variables

Runtime API

CUDA Threads

CUDA Thread Organization

Using blockIdx and threadIdx

Synchronization and Transparent Scalability

Thread Assignment

Thread Scheduling and Latency Tolerance

Summary

Exercises

CUDA™ Memories

Importance of Memory Access Efficiency

CUDA Device Memory Types

A Strategy for Reducing Global Memory Traffic

Memory as a Limiting Factor to Parallelism

Summary

Exercises

Performance Considerations

More on Thread Execution

Global Memory Bandwidth

Dynamic Partitioning of SM Resources

Data Prefetching

Instruction Mix

Thread Granularity

Measured Performance and Summary

Exercises

Floating Point Considerations

Floating-Point Format

Normalized Representation of M

Excess Encoding of E

Representable Numbers

Special Bit Patterns and Precision

Arithmetic Accuracy and Rounding

Algorithm Considerations

Summary

Exercises

Application Case Study: Advanced MRI Reconstruction

Application Background

Iterative Reconstruction

Computing FHd

Determine the Kernel Parallelism Structure

Getting Around the Memory Bandwidth Limitation

Using Hardware Trigonometry Functions

Experimental Performance Tuning

Final Evaluation

Exercises

Application Case Study: Molecular Visualization and Analysis

Application Background

A Simple Kernel Implementation

Instruction Execution Efficiency

Memory Coalescing

Additional Performance Comparisons

Using Multiple GPUs

Exercises

Parallel Programming and Computational Thinking

Goals of Parallel Programming

Problem Decomposition

Algorithm Selection

Computational Thinking

Exercises

A Brief Introduction to OpenCL™

Background

Data Parallelism Model

Device Architecture

Kernel Functions

Device Management and Kernel Launch

Electrostatic Potential Map in OpenCL

Summary

Exercises

Conclusion and Future Outlook

Goals Revisited

Memory Architecture Evolution

Large Virtual and Physical Address Spaces

Unified Device Memory Space

Configurable Caching and Scratch Pad

Enhanced Atomic Operations

Enhanced Global Memory Access

Kernel Execution Control Evolution

Function Calls within Kernel Functions

Exception Handling in Kernel Functions

Simultaneous Execution of Multiple Kernels

Interruptible Kernels

Core Performance

Double-Precision Speed

Better Control Flow Efficiency

Programming Environment

A Bright Outlook

APPENDIX A MATRIX MULTIPLICATION HOST-ONLY VERSION SOURCE CODE

matrixmul.cu

matrixmul_gold.cpp

matrixmul.h

assist.h

Expected Output

APPENDIX B GPU COMPUTE CAPABILITIES

GPU Compute Capability Tables

Memory Coalescing Variations

Index

David B. Kirk is well recognized for his contributions to graphics hardware and algorithm research. By the time he began his studies at Caltech, he had already earned B.S. and M.S. degrees in mechanical engineering from MIT and worked as an engineer for Raster Technologies and Hewlett-Packard's Apollo Systems Division, and after receiving his doctorate, he joined Crystal Dynamics, a video-game manufacturing company, as chief scientist and head of technology. In 1997, he took the position of Chief Scientist at NVIDIA, a leader in visual computing technologies, and he is currently an NVIDIA Fellow. At NVIDIA, Kirk led graphics-technology development for some of today's most popular consumer-entertainment platforms, playing a key role in providing mass-market graphics capabilities previously available only on workstations costing hundreds of thousands of dollars. For his role in bringing high-performance graphics to personal computers, Kirk received the 2002 Computer Graphics Achievement Award from the Association for Computing Machinery and the Special Interest Group on Graphics and Interactive Technology (ACM SIGGRAPH) and, in 2006, was elected to the National Academy of Engineering, one of the highest professional distinctions for engineers. Kirk holds 50 patents and patent applications relating to graphics design and has published more than 50 articles on graphics technology, won several best-paper awards, and edited the book Graphics Gems III. A technological "evangelist" who cares deeply about education, he has supported new curriculum initiatives at Caltech and has been a frequent university lecturer and conference keynote speaker worldwide.

Wen-mei W. Hwu is a Professor and holds the Sanders-AMD Endowed Chair in the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. His research interests are in the area of architecture, implementation, compilation, and algorithms for parallel computing. He is the chief scientist of the Parallel Computing Institute and director of the IMPACT research group (www.impact.crhc.illinois.edu). He is a co-founder and CTO of MulticoreWare. For his contributions in research and teaching, he received the ACM SIGARCH Maurice Wilkes Award, the ACM Grace Murray Hopper Award, the Tau Beta Pi Daniel C. Drucker Eminent Faculty Award, the ISCA Influential Paper Award, the IEEE Computer Society B. R. Rau Award and the Distinguished Alumni Award in Computer Science of the University of California, Berkeley. He is a fellow of IEEE and ACM. He directs the UIUC CUDA Center of Excellence and serves as one of the principal investigators of the NSF Blue Waters Petascale computer project. Dr. Hwu received his Ph.D. degree in Computer Science from the University of California, Berkeley.

Permanent link: https://www.kriso.ee/db/97801238147396e.html
