
Parallel Computing: Accelerating Computational Science and Engineering (CSE) [Hardback]

Edited by Michael Bader, Arndt Bode, Hans-Joachim Bungartz, Michael Gerndt, Gerhard R. Joubert, Frans Peters
  • Format: Hardback, 872 pages
  • Series: Advances in Parallel Computing 25
  • Publication date: 01-Mar-2014
  • Publisher: IOS Press,US
  • ISBN-10: 1614993807
  • ISBN-13: 9781614993803
Parallel computing has been the enabling technology of high-end machines for many years. Now, it has finally become the ubiquitous key to the efficient use of any kind of multi-processor computer architecture, from smart phones, tablets, embedded systems and cloud computing up to exascale computers. This book presents the proceedings of ParCo2013 – the latest edition of the biennial International Conference on Parallel Computing – held from 10 to 13 September 2013, in Garching, Germany. The conference focused on several key parallel computing areas. Themes included parallel programming models for multi- and manycore CPUs, GPUs, FPGAs and heterogeneous platforms, the performance engineering processes that must be adapted to efficiently use these new and innovative platforms, novel numerical algorithms and approaches to large-scale simulations of problems in science and engineering.

The conference programme also included twelve mini-symposia (including an industry session and a special PhD Symposium), which comprehensively represented and intensified the discussion of current hot topics in high performance and parallel computing. These special sessions covered large-scale supercomputing, novel challenges arising from parallel architectures (multi-/manycore, heterogeneous platforms, FPGAs), multi-level algorithms as well as multi-scale, multi-physics and multi-dimensional problems.

It is clear that parallel computing – including the processing of large data sets (“Big Data”) – will remain a persistent driver of research in all fields of innovative computing, which makes this book relevant to all those with an interest in this field.

The invited talks cover extreme data science at the National Energy Research Scientific Computing Center, and performance analysis techniques for the exascale co-design process. About 85 further papers consider such topics as approximate inverse preconditioners for Krylov methods on heterogeneous parallel computers, simulating multiphase flows in the subsurface on supercomputers based on graphics processing units, a fault tolerant implementation of multi-level Monte Carlo methods, global communication schemes for the sparse grid combination technique, potentials and limitations for energy efficiency auto-tuning, and a generic prototype to benchmark algorithms and data structures for hierarchical hybrid grids. Annotation ©2016 Ringgold, Inc., Portland, OR (protoview.com)
Preface v
Michael Bader
Arndt Bode
Hans-Joachim Bungartz
Michael Gerndt
Gerhard R. Joubert
Frans Peters
Conference Organisation vii
Invited Talks
Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center
3(16)
Sudip Dosanjh
Shane Canon
Jack Deslippe
Kjiersten Fagnan
Richard Gerber
Lisa Gerhardt
Jason Hick
Douglas Jacobsen
David Skinner
Nicholas J. Wright
Performance Analysis Techniques for the Exascale Co-Design Process
19(16)
Martin Schulz
Jim Belak
Abhinav Bhatele
Peer-Timo Bremer
Greg Bronevetsky
Marc Casas
Todd Gamblin
Katherine E. Isaacs
Ignacio Laguna
Joshua Levine
Valerio Pascucci
David Richards
Barry Rountree
Parallel Programming Models
XMP-IO Function and Its Application to MapReduce on the K Computer
35(8)
Tomotake Nakamura
Mitsuhisa Sato
POLCA -- A Programming Model for Large Scale, Strongly Heterogeneous Infrastructures
43(10)
Lutz Schubert
Jan Kuper
Jose Gracia
Exploitation of Quality/Throughput Tradeoffs in Image Processing Through Invasive Computing
53(10)
Alexandru Tanase
Vahid Lari
Frank Hannig
Jurgen Teich
An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors
63(9)
Ashkan Tousimojarad
Wim Vanderbauwhede
A Scalable Farm Skeleton for Heterogeneous Parallel Programming
72(10)
Steffen Ernsting
Herbert Kuchen
Towards Truly Boolean Arrays in Data-Parallel Array Processing
82(10)
Clemens Grelck
Hraban Luyat
Deep Packet Inspection on Commodity Hardware Using FastFlow
92(11)
M. Danelutto
L. Deri
D. De Sensi
M. Torquati
Performance Analysis and Tools
Formalizing Bottlenecks in Task-Based OpenMP Applications
103(10)
Shajulin Benedict
Michael Gerndt
Diana-Mihaela Gudu
Characterizing Performance of Applications on Blue Gene/Q
113(10)
Paul F. Baumeister
Hans Boettiger
Thorsten Hater
Michael Knobloch
Thilo Maurer
Andrea Nobile
Dirk Pleiter
Nicolas Vandenbergen
Specification of Periscope Tuning Framework Plugins
123(12)
Robert Mijakovic
Antonio Pimenta Soto
Isaias A. Compres Urena
Michael Gerndt
Anna Sikora
Eduardo Cesar
Parallel Numerical Linear Algebra
On Using Speculative Computations for Parallel Reduction to Tridiagonal Form
135(8)
Sergey V. Kuznetsov
Fast Approximate Solution of the Non-Symmetric Generalized Eigenvalue Problem on Multicore Architectures
143(10)
Peter Benner
Martin Kohler
Jens Saak
Locality Optimization on a NUMA Architecture for Hybrid LU Factorization
153(10)
Adrien Remy
Marc Baboulin
Masha Sosonkina
Brigitte Rozoy
Variable Block Algebraic Recursive Multilevel Solver (VBARMS) for Sparse Linear Systems
163(10)
Bruno Carpentieri
Jia Liao
Masha Sosonkina
A Proposal of a Single-Synchronized Solver Suited to Large Scale Linear Systems on Parallel Computers with Distributed Memory
173(10)
Seiji Fujino
Keiichi Murakami
Kosuke Iwasato
Approximate Inverse Preconditioners for Krylov Methods on Heterogeneous Parallel Computers
183(10)
Daniele Bertaccini
Salvatore Filippone
Cache and Energy Efficiency of Sparse Matrix-Vector Multiplication for Different BLAS Numerical Types with the RSB Format
193(10)
Michele Martone
Heterogeneous Sparse Matrix Computations on Hybrid GPU/CPU Platforms
203(12)
Valeria Cardellini
Alessandro Fanfarillo
Salvatore Filippone
Parallel Algorithms
MapReduce Streaming Algorithms for Laplace Relaxation on the Cloud
215(10)
Atanas Radenski
Boyana Norris
Space Exploration Using Parallel Orbits: A Study in Parallel Symbolic Computing
225(8)
Vladimir Janjic
Christopher Brown
Max Neunhoffer
Kevin Hammond
Steve Linton
Hans-Wolfgang Loidl
SFC-Based Communication Metadata Encoding for Adaptive Mesh Refinement
233(10)
Martin Schreiber
Tobias Weinzierl
Hans-Joachim Bungartz
Graph Repartitioning with Both Dynamic Load and Dynamic Processor Allocation
243(10)
Clement Vuchener
Aurelien Esnard
ForestClaw: Hybrid Forest-of-Octrees AMR for Hyperbolic Conservation Laws
253(10)
Carsten Burstedde
Donna Calhoun
Kyle Mandli
Andy R. Terrel
A Space-Time Parallel Solver for the Three-Dimensional Heat Equation
263(10)
Robert Speck
Daniel Ruprecht
Matthew Emmett
Matthias Bolten
Rolf Krause
An Efficient Pipelined Implementation of Space-Time Parallel Applications
273(12)
Toshiya Takami
Daiki Fukudome
GPU Computing and Applications
Efficient GPU-Based Optimization of Volume Meshes
285(10)
Eric Shaffer
Zuofu Cheng
Raine Yeh
George Zagaris
Luke Olson
Fast Uniform Grid Construction on GPGPUs Using Atomic Operations
295(10)
Davide Barbieri
Valeria Cardellini
Salvatore Filippone
Porting Large HPC Applications to GPU Clusters: The Codes GENE and VERTEX
305(10)
Tilman Dannert
Andreas Marek
Markus Rampp
Numerical Simulation of the Low Compressible Viscous Gas Flows on GPU-Based Hybrid Supercomputers
315(9)
Alexander A. Davydov
Evgeny V. Shilnikov
Simulation of Multiphase Flows in the Subsurface on GPU-Based Supercomputers
324(10)
Marina Trapeznikova
Natalia Churbanova
Anastasiya Lyupa
Dmitry Morozov
Atomic Computing -- A Different Perspective on Massively Parallel Problems
334(13)
Andrew Brown
Rob Mills
Jeff Reeve
Kier Dugan
Steve Furber
Parallelisation and Optimisation of Large-Scale Applications
Accelerating SeisSol by Generating Vectorized Code for Sparse Matrix Operators
347(10)
Alexander Breuer
Alexander Heinecke
Michael Bader
Christian Pelties
Experience with the MPI/StarSs Programming Model on a Large Production Code
357(10)
Dirk Brommel
Paul Gibbon
Marta Garcia
Victor Lopez
Vladimir Marjanovic
Jesus Labarta
Exploiting Data- and Task-Parallelism in the Solution of Riccati Equations on Multicore Servers and GPUs
367(8)
P. Benner
P. Ezzatti
E.S. Quintana-Orti
A. Remon
Testing and Implementing Some New Algorithms Using the FFTW Library on Massively Parallel Supercomputers
375(12)
Massimiliano Guarrasi
Ning Li
Sandro Frigio
Andrew Emerson
Giovanni Erbacci
Performance Measurements of MHD Simulation for Planetary Magnetosphere on Peta-Scale Computer FX10
387(8)
Keiichiro Fukazawa
Takeshi Nanri
Takayuki Umeda
Parallel Simulations of Self-Propelled Microorganisms
395(10)
Kristina Pickl
Matthias Hofmann
Tobias Preclik
Harald Kostler
Ana-Suncana Smith
Ulrich Rude
Improving Communication Performance of Sparse Linear Algebra for an Atomistic Simulation Application
405(10)
Christiane Pousa
Jurg Hutter
Joost Vandevondele
Nemorb's Fourier Filter and Distributed Matrix Transposition on Petaflop Systems
415(12)
Tiago Ribeiro
Matthieu Haefele
Parallel Computing Design for Exact Diagonalization Scheme on Multi-Band Hubbard Cluster Models
427(12)
Susumu Yamada
Toshiyuki Imamura
Masahiko Machida
ParCo PhD Symposium
ParCo 2013 PhD Symposium
439(2)
Josef Weidendorfer
Michael Bader
Numerical Experiments with New Algorithms for Parallel Decomposition of Large Computational Meshes
441(10)
Evdokia Golovchenko
Elizaveta Dorofeeva
Irina Gasilova
Alexey Boldarev
A Distributed Algorithm for the Permutation Flow Shop Problem -- An Empirical Analysis
451(10)
Samia Kouki
Mohamed Jemni
Talel Ladhari
GPI2 for GPUs: A PGAS Framework for Efficient Communication in Hybrid Clusters
461(10)
Lena Oden
A Fault Tolerant Implementation of Multi-Level Monte Carlo Methods
471(10)
Stefan Pauli
Manuel Kohler
Peter Arbenz
High Performance CPU/GPU Multiresolution Poisson Solver
481(12)
Wim M. Van Rees
Diego Rossinelli
Panagiotis Hadjidoukas
Petros Koumoutsakos
Mini-Symposium "Parallel Computing with FPGAs (ParaFPGA2013)"
ParaFPGA 2013: Harnessing Programs, Power and Performance in Parallel FPGA Applications
493(4)
Erik H. D'Hollander
Dirk Stroobandt
Abdellah Touhafi
High-Level Synthesis Revised: Generation of FPGA Accelerators from a Domain-Specific Language Using the Polyhedron Model
497(10)
Moritz Schmid
Frank Hannig
Alexandru Tanase
Jurgen Teich
Compiling a Dataflow-Based Language Abstraction onto an FPGA
507(8)
Eva Burrows
Timing Driven C-Slow Retiming on RTL for MultiCores on FPGAs
515(8)
Tobias Strauch
Performance and Resource Modeling for FPGAs Using High-Level Synthesis Tools
523(9)
Bruno Da Silva
An Braeken
Erik H. D'Hollander
Abdellah Touhafi
Interactive Graph Cuts Using FPGA
532(8)
Daichi Kobori
Tsutomu Maruyama
An Image Filter System Based on Dynamic Partial Reconfiguration on FPGA
540(8)
Hisaaki Kurita
Tsutomu Maruyama
Investigating Energy Consumption of an SRAM-Based FPGA for Duty-Cycle Applications
548(15)
Khurram Shahzad
Bengt Oelmann
Mini-Symposium "High-Dimensional Meets Parallel -- Algorithms and Applications"
High-Dimensional Meets Parallel: Algorithms and Applications
563(1)
Hans-Joachim Bungartz
Dirk Pfluger
Markus Hegland
Global Communication Schemes for the Sparse Grid Combination Technique
564(10)
Philipp Hupp
Riko Jacob
Mario Heene
Dirk Pfluger
Markus Hegland
Load Balancing for Massively Parallel Computations with the Sparse Grid Combination Technique
574(10)
Mario Heene
Christoph Kowitz
Dirk Pfluger
A Parallel Fault Tolerant Combination Technique
584(9)
Brendan Harding
Markus Hegland
Managing Complexity in the Parallel Sparse Grid Combination Technique
593(10)
J. W. Larson
P.E. Strazdins
M. Hegland
B. Harding
S. Roberts
L. Stals
A.P. Rendell
Md.M. Ali
J. Southern
Scalability and Fault Tolerance of the Alternating Direction Method of Multipliers for Sparse Grids
603(12)
Valeriy Khakhutskyy
Dirk Pfluger
Markus Hegland
Mini-Symposium "Application Autotuning for HPC (Architectures)"
Mini-Symposium on Application Autotuning for HPC
615(1)
Siegfried Benkner
Matthias Brehm
Michael Gerndt
Wolfram Hesse
Anna Sikora
Investigating Performance Benefits from OpenACC Kernel Directives
616(10)
Benjamin Eagan
Gilles Civario
Renato Miceli
Application-Independent Autotuning for GPUs
626(10)
Martin Tillmann
Thomas Karcher
Carsten Dachsbacher
Walter F. Tichy
Autotuning of Pattern Runtimes for Accelerated Parallel Systems
636(10)
Enes Bajrovic
Siegfried Benkner
Jiri Dokulil
Martin Sandrieser
Empirical Performance Modeling of GPU Kernels Using Active Learning
646(10)
Prasanna Balaprakash
Karl Rupp
Azamat Mametjanov
Robert B. Gramacy
Paul D. Hovland
Stefan M. Wild
Crowdtuning: Systematizing Auto-Tuning Using Predictive Modeling and Crowdsourcing
656(12)
Abdul Memon
Grigori Fursin
Autotuning the Energy Consumption
668(10)
Carmen B. Navarrete
Carla Guillen
Wolfram Hesse
Matthias Brehm
Potentials and Limitations for Energy Efficiency Auto-Tuning
678(13)
Robert Schone
Andreas Knupfer
Daniel Molka
Mini-Symposium "Extreme Scaling on SuperMUC"
Extreme Scaling Workshop at the LRZ
691(7)
Momme Allalen
Gurvan Bazin
Christoph Bernau
Arndt Bode
David Brayford
Matthias Brehm
Jurg Diemand
Klaus Dolag
Jan Engels
Nicolay Hammer
Herbert Huber
Ferdinand Jamitzky
Anupam Kamakar
Carsten Kutzner
Andreas Marek
Carmen Navarrete
Helmut Satzger
Wolfram Schmidt
Philipp Trisjono
Extreme Scaling of Lattice Quantum Chromodynamics
698(5)
David Brayford
Momme Allalen
Volker Weinberg
End-to-End Parallel Simulations with APES
703(9)
Harald Klimach
Kartik Jain
Sabine Roller
Towards Petaflops Capability of the VERTEX Supernova Code
712(10)
Andreas Marek
Markus Rampp
Florian Hanke
Hans-Thomas Janka
Scaling of the GROMACS 4.6 Molecular Dynamics Code on SuperMUC
722(9)
Carsten Kutzner
Rossen Apostolov
Berk Hess
Helmut Grubmuller
Mini-Symposium "Parallel Programming for Heterogeneous Architectures"
Parallel Programming for Heterogeneous Architectures
731(2)
Bettina Krammer
Hartmut Mix
Markus Geimer
Execution Schemes for the NPB-MZ Benchmarks on Hybrid Architectures: A Comparative Study
733(10)
Jorg Dummler
Gudula Runger
Scilab on a Hybrid Platform
743(10)
Victor Lomuller
Sylvestre Ledru
Henri-Pierre Charles
Divide and Conquer Parallelization of Finite Element Method Assembly
753(10)
Loic Thebault
Eric Petit
Marc Tchiboukdjian
Quang Dinh
William Jalby
Cudagrind: A Valgrind Extension for CUDA
763(10)
Thomas M. Baumann
Jose Gracia
Profiling Hybrid HMPP Applications with Score-P on Heterogeneous Hardware
773(10)
Marc Schlutter
Peter Philippen
Laurent Morin
Markus Geimer
Bernd Mohr
Binary Instrumentation for Scalable Performance Measurement of OpenMP Applications
783(10)
Julien Jaeger
Peter Philippen
Eric Petit
Andres Charif Rubial
Christian Rossel
William Jalby
Bernd Mohr
A Case Study: Holistic Performance Analysis on Heterogeneous Architectures Using the Vampir Toolchain
793(12)
Robert Dietrich
Frank Winkler
Thomas William
Jonas Stolle
Robert Henschel
Donald K. Berry
Further Mini-Symposium Contributions
PRACE DECI (Distributed European Computing Initiative) Minisymposium
805(8)
Chris Johnson
Anastasia V. Bochenkova
Alexander A. Granovsky
Peter J. Bond
Teresa Paramo
Tristan Glatard
William A. Romero R.
Denis Friboulet
Stefan J. Zasada
Peter V. Coveney
A Generic Prototype to Benchmark Algorithms and Data Structures for Hierarchical Hybrid Grids
813(10)
Sebastian Kuckuk
Bjorn Gmeiner
Harald Kostler
Ulrich Rude
Towards a Performance Engineering Workflow for OpenMP 4.0
823(10)
Dirk Schmidl
Christian Iwainsky
Christian Terboven
Christian H. Bischof
Matthias S. Muller
Theoretical Measures of Cache Efficiency for Tetrahedral Adaptive Meshes. A Case Study with a Quasi Space-Filling Curve Order
833(10)
Oliver Kunst
Jorn Behrens
Author Index 843