Klienditugi: 7440010 (E-R 10-18)

Abi | Registreeri | Logi sisse

E-raamat: Automatic Parallelization For A Class Of Regular Computations

G M Megson (Univ Of Reading, Uk), Xian Chen (.)

Formaat: 272 pages
Ilmumisaeg: 04-Jan-1997
Kirjastus: World Scientific Publishing Co Pte Ltd
Keel: eng
ISBN-13: 9789814498418

Teised raamatud teemal:

Parallel processing

Formaat - PDF+DRM
Hind: 32,76 €*
* hind on lõplik, st. muud allahindlused enam ei rakendu
Lisa ostukorvi
Lisa soovinimekirja
See e-raamat on mõeldud ainult isiklikuks kasutamiseks. E-raamatuid ei saa tagastada.
Raamatukogudele

Formaat: 272 pages
Ilmumisaeg: 04-Jan-1997
Kirjastus: World Scientific Publishing Co Pte Ltd
Keel: eng
ISBN-13: 9789814498418

Teised raamatud teemal:

Parallel processing

DRM piirangud

Kopeerimine (copy/paste):

ei ole lubatud
Printimine:

ei ole lubatud
Kasutamine:

Digitaalõiguste kaitse (DRM)
Kirjastus on väljastanud selle e-raamatu krüpteeritud kujul, mis tähendab, et selle lugemiseks peate installeerima spetsiaalse tarkvara. Samuti peate looma endale Adobe ID Rohkem infot siin. E-raamatut saab lugeda 1 kasutaja ning alla laadida kuni 6'de seadmesse (kõik autoriseeritud sama Adobe ID-ga).

Vajalik tarkvara
Mobiilsetes seadmetes (telefon või tahvelarvuti) lugemiseks peate installeerima selle tasuta rakenduse: PocketBook Reader (iOS / Android)

PC või Mac seadmes lugemiseks peate installima Adobe Digital Editionsi (Seeon tasuta rakendus spetsiaalselt e-raamatute lugemiseks. Seda ei tohi segamini ajada Adober Reader'iga, mis tõenäoliselt on juba teie arvutisse installeeritud )

Seda e-raamatut ei saa lugeda Amazon Kindle's.

The automatic generation of parallel code from high level sequential description is of key importance to the wide spread use of high performance machine architectures. This text considers (in detail) the theory and practical realization of automatic mapping of algorithms generated from systems of uniform recurrence equations (do-lccps) onto fixed size architectures with defined communication primitives. Experimental results of the mapping scheme and its implementation are given.

Terminology

vii

1 Introduction

(12)

1.1 Automatic Parallelization

(2)

1.2 Processor Array Structures

(2)

1.3 Nested Loops and Data Dependencies

(3)

1.4 Space-Time Mapping

(2)

1.5 About the Book

(3)

2 Survey and Analysis of Partitioning and Mapping

(14)

2.1 The Problems

(1)

2.2 Independent Partitioning

(2)

2.3 General Partitioning and Projecting For Space-Time Mapping

(8)

2.3.1 Projecting-Partitioning Methods

(5)

2.3.2 Partitioning-Projecting Methods

(1)

2.3.3 Comparison

(1)

2.4 Lower-Dimensional Mapping

(2)

2.5 Summary

(1)

3 Improvements to Some Existing Partitioning and Mapping Methods

(20)

3.1 On the Maximal Independent Partitioning

(6)

3.1.1 Introduction

(1)

3.1.2 Improved Methods of Independent Partitioning

(1)

3.1.3 Producing HSNF

(2)

3.1.4 Algorithm Generation

(2)

3.2 A Synthesis Method using LSGP Partitioning for Given-Shape Regular Arrays

(10)

3.2.1 Introduction

(1)

3.2.2 A New LSGP Synthesis Method

(3)

3.2.3 The Conditions for Valid Q and t

(4)

3.2.4 An Example

(1)

3.2.5 A Particular Area of Application

(2)

3.3 Bouncing LPGS Method

(2)

3.3.1 Introduction

(1)

3.3.2 Bouncing LPGS

(1)

3.4 Summary

(2)

4 A Methodology of Partitioning and Mapping for Given (N-1)-D Regular Arrays

(28)

4.1 A New Methodology

(2)

4.2 Transforming to Canonical Dependencies

(6)

4.2.1 A Cone Including Dependencies

(3)

4.2.2 Forming Canonical Dependencies in Supernode Space

(3)

4.3 Space-Time Mapping

(5)

4.3.1 S-T Transformation and Interconnection Primitives

(1)

4.3.2 Choosing S and t

(3)

4.3.3 Permuting the Space Projection Matrix

(1)

4.4 Further LSGP Partitioning for SBC

(1)

4.5 Scaling and Optimisation

(7)

4.5.1 SUC Cases

(1)

4.5.2 SBC Case

(1)

4.5.3 Optimising

(2)

4.5.4 Integralization of the Quasi-supernode Transformation Matrix

(1)

4.6 Examples and Discussions

(3)

4.6.1 A 3-D Example

(3)

4.6.2 More Results and Discussions

(1)

4.7 Summary

(4)

5 Optimal Mapping onto Lower-Dimensional Regular Arrays

(24)

5.1 A Methodology for Partitioning and Mapping onto Lower Dimensional Arrays

(2)

5.2 The Transformations into K-D Time Domain and M-D Processor Array

(2)

5.2.1 Selecting a Family of T

(2)

5.2.2 Scaling the Supernode Parallelepiped

(1)

5.3 Maximum Local K-D Time Domain

(3)

5.3.1 Intersecting a Polyhedron with a Hyperplane

(1)

5.3.2 Maximum Local Supernode Domain

(1)

5.3.3 Mapping to K-D Time Domain

(1)

5.4 Valid Minimum Projecting Vector

(3)

5.5 A General Methodology

(4)

5.5.1 The First and Last Nodes of Polyhedrons

(1)

5.5.2 Deriving P

(3)

5.6 Optimisation

(4)

5.6.1 Method

(1)

5.6.2 An Example

(3)

5.7 Special Example: Partitioning and Mapping a Knapsack Problem onto a Linear Array

(4)

5.7.1 Description of Computational Structure

(2)

5.7.2 Supernode Polyhedron

(1)

5.7.3 Transformation onto a Time-Processor Domain

(2)

5.8 Summary

(2)

6 The Structure of Parallel Programs

(30)

6.1 Introduction

(2)

6.2 On Supernodes

101

(8)

6.2.1 Boundary of the Supernode Polyhedron

102

(5)

6.2.2 Vertices of the Enlarged Quasi-Supernode Polyhedron

107

(1)

6.2.3 Boundary of a Single Supernode

108

(1)

6.3 Data Flows

109

(2)

6.3.1 Outgoing Data of a Single Supernode

109

(1)

6.3.2 Outgoing Data Packets of a Processor

110

(1)

6.4 Algorithm Generation for a Lower-Dimensional Processor Array

111

(4)

6.4.1 K-D Parallel Algorithms

111

(1)

6.4.2 K-D Time Domain to 1-D Time Domain

112

(3)

6.5 Algorithm Generation For LSGP Case

115

(7)

6.5.1 Rectangular Boundary in Virtual Processor-Time Domain

116

(1)

6.5.2 Algorithm Generation Involving LSGP

117

(2)

6.5.3 Outgoing Data after LSGP

119

(1)

6.5.4 Example

120

(2)

6.5.5 Algorithm Generation for (N-1)-D SUC Case

122

(1)

6.6 From Inequalities To Boundaries

122

(4)

6.6.1 Finding All Possible Lower and Upper Bounds

123

(1)

6.2.2 Deleting Redundant Bounds

124

(2)

6.7 Summary

126

(3)

7 Parallel Code Generation

129

(22)

7.1 Introduction

129

(1)

7.2 Processor Array

129

(2)

7.3 Supernode Storage

131

(4)

7.3.1 From Supernode Domain to Local Supernode Domains

132

(1)

7.3.2 Supernode Storage and in/out Data Storage

133

(2)

7.4 Data Flow and Relay

135

(8)

7.4.1 LSGP Case

136

(5)

7.4.2 Non LSGP Case and Direct Data Flows

141

(2)

7.5 Outline of the Parallel Code

143

(7)

7.5.1 Parallel Codes for Non-LSGP

144

(4)

7.5.2 Parallel Codes with LSGP

148

(2)

7.6 Summary

150

(1)

8 Experimental Results and Discussions

151

(12)

8.1 Experimental Results

151

(6)

8.1.1 Test of the Correctness

151

(1)

8.1.2 Measuring Performance

152

(5)

8.2 Discussions on Factors Affecting Efficiency

157

(5)

8.2.1 Space and Time Fitness of the Mapping

157

(3)

8.2.2 Data Communication

160

(1)

8.2.3 Single Supernode Loops

161

(1)

8.2.4 The Effect of Granularity

161

(1)

8.3 Summary

162

(1)

9 Conclusion

163

(6)

9.1 Summary of the Whole Design Procedure

163

(1)

9.2 Evaluations and Comparisons

164

(3)

9.2.1 Theory

164

(2)

9.2.2 Practice

166

(1)

9.2.3 Results

167

(1)

9.3 Closing Remarks

167

(2)

Bibliography

169

(80)

A Scaling SBC

177

(6)

A.1 Initial Wu'iS and Wl'iS

177

(2)

A.2 SF as a Function of fk

179

(1)

A.2.1 Determining fa with a Known Set of Wui and Wli

179

(1)

A.2.2 Determining the Truning Points

179

(1)

A.2.3 Selecting Wui and wli from Multi-Candidates

180

(1)

A.3 Delimit fk According to Dependencies

181

(2)

B Collection of Algorithms for (N-1)-D Partitioning and Mapping

183

(4)

B.1 Pre-Compilation Work

183

(1)

B.2 Compiling Work

184

(3)

C Collection of Algorithms for Lower-Dimensional Mapping

187

(4)

D Parallel Algorithm for Pure LSGP Method

191

(2)

E Generating Data Flows and Relays for LSGP

193

(4)

F The Collections of Experimental Results

197

(28)

G Examples of Automatically Generated Parallel Codes

225

(24)

G.1 Parallel Codes for Non-LSGP

225

(1)

G.1.1 h.file

225

(5)

G.1.2 Parallel Code

230

(7)

G.2 Parallel Codes for LSGP

237

(1)

G.2.1 h.file

237

(4)

G.2.2 Parallel Codes

241

(8)

Index

249

Lisainfo e-raamatute kohta

Püsilink: https://www.kriso.ee/db/97898144984182e.html

E-raamat: Automatic Parallelization For A Class Of Regular Computations

DRM piirangud

Kopeerimine (copy/paste):

Printimine:

Kasutamine:

Konto & seaded

Otsing

Otsingu andmebaas

Filtreeri tulemusi

Teemad E-raamatute teemad

Vali ostukorv