Foreword  vii
Editors  ix
Contributors  xi

Chapter 1  Introduction  1
        1.4.1  Part I: Basics of Parallel Programming  3
        1.4.2  Part II: Programming Languages for Multicore  3
        1.4.3  Part III: Programming Heterogeneous Processors  4
        1.4.4  Part IV: Emerging Technologies  4

Part I  Basics of Parallel Programming  7
|
Chapter 2  Fundamentals of Multicore Hardware and Parallel Programming  9
    2.2  Potential for Increased Speed  10
    2.3  Types of Parallel Computing Platforms  13
    2.5  Multicore Processor Architectures  18
        2.5.2  Symmetric Multicore Designs  18
        2.5.3  Asymmetric Multicore Designs  20
    2.6  Programming Multicore Systems  21
        2.6.1  Processes and Threads  21
    2.7  Parallel Programming Strategies  25
        2.7.1  Task and Data Parallelism  25
            2.7.1.1  Embarrassingly Parallel Computations  25
            2.7.1.3  Synchronous Computations  27
|
Chapter 3  Parallel Design Patterns  31
    3.1  Parallel Programming Challenge  31
    3.2  Design Patterns: Background and History  32
    3.3  Essential Patterns for Parallel Programming  33
        3.3.1  Parallel Algorithm Strategy Patterns  34
            3.3.1.1  Task Parallelism Pattern  34
            3.3.1.3  Divide and Conquer  38
            3.3.1.5  Geometric Decomposition  40
        3.3.2  Implementation Strategy Patterns  41
            3.3.2.3  Loop-Level Parallelism  44
            3.3.2.5  Master-Worker/Task-Queue  49
    3.4  Conclusions and Next Steps  50

Part II  Programming Languages for Multicore  53
|
Chapter 4  Threads and Shared Variables in C++  55
    4.1  Basic Model and Thread Creation  56
    4.2  Small Detour: C++0x Lambda Expressions  57
    4.5  More Refined Approach  60
    4.9  Other Synchronization Mechanisms  67
        4.9.2  Condition Variables  68
        4.9.3  Other Mutex Variants and Facilities  69
    4.10  Terminating a Multi-Threaded C++ Program  70
    4.12  Relationship to Earlier Standards  73
        4.12.1  Separate Thread Libraries  73
        4.12.3  Adjacent Field Overwrites  75
        4.12.4  Other Compiler-Introduced Races  75
        4.12.5  Program Termination  76
|
Chapter 5  Parallelism in .NET and Java  79
        5.1.1  Types of Parallelism  80
        5.1.2  Overview of the Chapter  81
    5.2  .NET Parallel Landscape  81
    5.3  Task Parallel Library  82
        5.3.1  Basic Methods: For, ForEach, and Invoke  82
        5.3.2  Breaking Out of a Loop  84
        5.6.2  java.util.concurrent  91
    5.8  ParallelArray Package  95

Chapter 6  OpenMP  101
        6.1.2  Overview of Features  102
        6.1.3  Who Developed OpenMP? How Is It Evolving?  104
    6.2  OpenMP 3.0 Specification  105
        6.2.1  Parallel Regions and Worksharing  105
            6.2.1.1  Scheduling Parallel Loops  108
            6.2.2.1  Using Data Attributes  110
            6.2.3.1  Using Explicit Tasks  112
            6.2.4.1  Performing Reductions in OpenMP  115
        6.2.5  OpenMP Library Routines and Environment Variables  116
            6.2.5.1  SPMD Programming Style  117
    6.3  Implementation of OpenMP  118
    6.4  Programming for Performance  121

Part III  Programming Heterogeneous Processors  129
|
Chapter 7  Scalable Manycore Computing with CUDA  131
    7.2  Manycore GPU Machine Model  132
    7.3  Structure of CUDA Programs  134
        7.3.3  Communicating within Blocks  136
        7.3.4  Device Memory Management  137
        7.3.5  Complete CUDA Example  138
    7.4  Execution of Kernels on the GPU  138
        7.4.2  Coordinating Tasks in Kernels  141
    7.5  Writing a CUDA Program  143
        7.5.1  Block-Level Parallel Prefix  143
        7.5.3  Coordinating Whole Grids  147
|
Chapter 8  Programming the Cell Processor  155
    8.2  Cell Processor Architecture Overview  157
        8.2.1  Power Processing Element  157
        8.2.2  Synergistic Processing Element  158
        8.2.3  Element Interconnect Bus  160
        8.2.4  DMA Communication and Memory Access  161
    8.3  Cell Programming with the SDK  165
        8.3.1  PPE/SPE Thread Coordination  165
        8.3.3  DMA Communication and Multi-Buffering  169
        8.3.4  Using SIMD Instructions on SPE  170
        8.3.5  Summary: Cell Programming with the SDK  174
    8.4  Cell SDK Compilers, Libraries, and Tools  174
        8.4.2  Full-System Simulator  175
        8.4.3  Performance Analysis and Visualization  175
        8.4.5  Libraries, Components, and Frameworks  176
    8.5  Higher-Level Programming Environments for Cell  177
        8.5.10  Other High-Level Programming Environments for Cell  186
    8.6  Algorithms and Components for Cell  187
    8.8  Bibliographical Remarks  191
    Disclaimers and Declarations  192

Part IV  Emerging Technologies  199
|
Chapter 9  Automatic Extraction of Parallelism from Sequential Code  201
        9.1.2  Techniques and Tools  202
        9.2.2  Data Dependence Analysis  204
            9.2.2.1  Data Dependence Graph  204
        9.2.3  Control Dependence Analysis  207
            9.2.3.1  Control Dependence Graph  207
        9.2.4  Program Dependence Graph  209
    9.3  DOALL Parallelization  209
        9.3.3  Advanced Topic: Reduction  212
        9.3.4  Advanced Topic: Speculative DOALL  214
        9.3.5  Advanced Topic: Further Techniques and Transformations  215
    9.4  DOACROSS Parallelization  217
        9.4.3  Advanced Topic: Speculation  220
    9.5  Pipeline Parallelization  223
        9.5.4  Advanced Topic: Speculation  226
    9.6  Bringing It All Together  230
|
Chapter 10  Auto-Tuning Parallel Application Performance  239
        10.3.2  Classification of Approaches  243
    10.4  Overview of the Tunable Architectures Approach  244
    10.5  Designing Tunable Applications  245
        10.5.1  Tunable Architectures  246
            10.5.1.1  Atomic Components  246
            10.5.1.3  Runtime System and Backend  247
            10.5.1.4  A Tunable Architecture Example  248
    10.6  Implementation with Tuning Instrumentation Languages  251
    10.7  Performance Optimization  256
        10.7.3  Auto-Tuning Systems  258
            10.7.3.4  Model-Based Systems  260
    10.8  Conclusion and Outlook  260
|
Chapter 11  Transactional Memory  265
    11.2  Transactional Memory Taxonomy  268
        11.2.1  Eager/Lazy Version Management  268
        11.2.2  Eager/Lazy Conflict Detection  268
    11.3  Hardware Transactional Memory  270
        11.3.1  Classical Cache-Based Bounded-Size HTM  270
    11.4  Software Transactional Memory  273
        11.5.1  Semantics of Atomic Blocks  280
        11.5.2  Optimizing Atomic Blocks  282
        11.5.3  Composable Blocking  283
|
Chapter 12  Emerging Applications  291
        12.2.1  Interactive RMS (iRMS)  294
        12.2.2  Growing Significance of Data-Driven Models  296
            12.2.2.1  Massive Data Computing: An Algorithmic Opportunity  297
        12.2.4  Structured Decomposition of RMS Applications  298
        12.3.1  Nature and Source of Underlying Parallelism  300
            12.3.1.1  Approximate, Yet Real Time  300
            12.3.1.2  Curse of Dimensionality and Irregular Access Pattern  301
            12.3.1.3  Parallelism: Both Coarse and Fine Grain  301
            12.3.1.4  Throughput Computing and Manycore  302
            12.3.1.5  Revisiting Amdahl's Law for Throughput Computing  302
        12.3.2  Scalability of RMS Applications  303
            12.3.2.1  Scalability Implications of Dataset Growth  304
        12.3.3  Homogeneous versus Heterogeneous Decomposition  305

Index  309