
Pipelined Processor Farms: Structured Design for Embedded Parallel Systems [Hardcover]

Signal processing, control, robotics, image processing, pattern analysis, and computer vision are some of the applications Fleury and Downton (both electronic systems engineering, U. of Essex) describe for parallel processing in real-time systems. They also explain their methodology, centered on Pipelined Processor Farms, and so introduce parallel processing in general and embedded systems in particular. Annotation c. Book News, Inc., Portland, OR (booknews.com)

This book outlines a methodology for the use of parallel processing in real-time systems. It provides an introduction to parallel processing in general, and to embedded systems in particular. Among the embedded systems are processors in such applications as automobiles, various machinery, FPGAs (field-programmable gate arrays), multimedia embedded systems such as those used in the computer game industry, and more.
* Presents design and simulation tools as well as case studies.
* First presentation of this material in book form.

Reviews

"Signal processing, control, robotics, image processing, pattern analysis, and computer vision are some of the applications..." (SciTech Book News Vol. 25, No. 2, June 2001)

Foreword v
Preface vii
Acknowledgments ix
Acronyms xix
Part I Introduction and Basic Concepts
Introduction 1(16)
  Overview 1(1)
  Origins 2(2)
  Amdahl's Law and Structured Parallel Design 4(1)
  Introduction to PPF Systems 4(4)
  Conclusions 8(9)
  Appendix 10(1)
    Simple Design Example: The H.261 Decoder 10(7)
Basic Concepts 17(20)
  Pipelined Processing 20(4)
  Pipeline Types 24(3)
    Asynchronous PPF 25(1)
    Synchronous PPF 26(1)
  Data Farming and Demand-based Scheduling 27(1)
  Data-farm Performance Criteria 28(2)
  Conclusion 30(7)
  Appendix 31(1)
    Short case studies 31(6)
PPF in Practice 37(20)
  Application Overview 38(1)
  Implementation issues 39(1)
  Parallelization of the Postcode Recognizer 39(8)
    Partitioning the postcode recognizer 40(1)
    Scaling the postcode recognizer 41(2)
    Performance achieved 43(4)
  Parallelization of the address verifier 47(4)
    Partitioning the address verifier 47(2)
    Scaling the address verifier 49(1)
    Address verification farms 50(1)
    Overall performance achieved 50(1)
  Meeting the Specification 51(2)
  Conclusion 53(4)
  Appendix 53(1)
    Other Parallel Postcode Recognition Systems 53(4)
Development of PPF Applications 57(10)
  Analysis Tools 58(1)
  Tool Characteristics 59(1)
  Development Cycle 60(2)
  Conclusion 62(5)
Part II Analysis and Partitioning of Sequential Applications
Initial Development of an Application 67(14)
  Confidence Building 67(2)
  Automatic and Semi-automatic Parallelization 69(2)
  Language Proliferation 71(1)
  Size of Applications 72(1)
  Semi-automatic Partitioning 73(2)
  Porting Code 75(2)
  Checking a Decomposition 77(1)
  Optimizing Compilers 77(2)
  Conclusion 79(2)
Graphical Simulation and Performance Analysis of PPFs 81(14)
  Simulating Asynchronous Pipelines 82(1)
  Simulation Implementation 82(2)
  Graphical Representation 84(4)
  Display Features 88(1)
  Cross-architectural Comparison 89(4)
  Conclusion 93(2)
Template-based Implementation 95(22)
  Template Design Principles 96(3)
  Implementation Choices 99(1)
  Parallel Logic Implementation 100(1)
  Target Machine Implementation 101(3)
    Common implementation issues 102(2)
  'NOW' Implementation for Logic Debugging 104(5)
  Target Machine Implementations for Performance Tuning 109(3)
  Patterns and Templates 112(1)
  Conclusion 113(4)
Part III Case Studies
Application Examples 117(46)
  Case Study 1: H.261 Encoder 118(14)
    Purpose of parallelization 119(1)
    'Per macroblock' quantization without motion estimation 119(4)
    'Per picture' quantization without motion estimation 123(2)
    'Per picture' quantization with motion estimation 125(1)
    Implementation of the parallel encoders 126(2)
    H.261 encoders without motion estimation 128(1)
    H.261 encoder with motion estimation 129(2)
    Edge data exchange 131(1)
  Case Study 2: H.263 Encoder/Decoder 132(7)
    Static analysis of H.263 algorithm 134(1)
    Results from parallelizing H.263 135(4)
  Case Study 3: 'Eigenfaces' - Face Detection 139(6)
    Background 139(1)
    Eigenfaces algorithm 140(1)
    Parallelization steps 141(2)
    Introduction of second and third farms 143(2)
  Case Study 4: Optical Flow 145(16)
    Optical flow 145(2)
    Existing sequential implementation 147(1)
    Gradient-based routine 147(3)
    Multi-resolution routine 150(4)
    Phase-based routine 154(2)
    LK results 156(2)
    Other methods 158(2)
    Evaluation 160(1)
  Conclusion 161(2)
Design Studies 163(26)
  Case Study 1: Karhunen-Loeve Transform (KLT) 164(7)
    Applications of the KLT 164(1)
    Features of the KLT 165(1)
    Parallelization of the KLT 165(3)
    PPF parallelization 168(3)
    Implementation 171(1)
  Case Study 2: 2D-Wavelet Transform 171(8)
    Wavelet Transform 172(1)
    Computational algorithms 173(1)
    Parallel implementation of Discrete Wavelet Transform (DWT) 173(3)
    Parallel implementation of oversampled WT 176(3)
  Case Study 3: Vector Quantization 179(7)
    Parallelization of VQ 180(1)
    PPF schemes for VQ 181(2)
    VQ implementation 183(3)
  Conclusion 186(3)
Counter Examples 189(22)
  Case Study 1: Large Vocabulary Continuous-Speech Recognition 190(6)
    Background 190(1)
    Static analysis of the LVCR system 191(2)
    Parallel design 193(2)
    Implementation on an SMP 195(1)
  Case Study 2: Model-based Coding 196(6)
    Parallelization of the model-based coder 196(2)
    Analysis of results 198(4)
  Case Study 3: Microphone Beam Array 202(4)
    Griffiths-Jim beam-former 202(1)
    Sequential implementation 203(1)
    Parallel implementation of the G-J Algorithm 204(2)
  Conclusion 206(5)
Part IV Underlying Theory and Analysis
Performance of PPFs 211(36)
  Naming Conventions 212(1)
  Performance Metrics 212(8)
    Order statistics 213(3)
    Asymptotic distribution 216(1)
    Characteristic maximum 217(2)
    Sample estimate 219(1)
  Gathering Performance Data 220(1)
  Performance Prediction Equations 221(2)
  Results 223(2)
    Prediction results 224(1)
  Simulation Results 225(2)
  Asynchronous Pipeline Estimate 227(3)
  Ordering Constraints 230(5)
  Task Scheduling 235(3)
    Uniform task size 236(1)
    Decreasing task size 236(1)
    Heuristic scheduling schemes 237(1)
  Validity of Factoring 238(1)
  Scheduling Results 238(3)
    Timings 238(2)
    Simulation results 240(1)
  Conclusion 241(6)
  Appendix 242(1)
    Outline derivation of Kruskal-Weiss prediction equation 242(1)
    Factoring regime derivation 243(4)
Instrumentation of Templates 247(16)
  Global Time 248(1)
  Processor Model 249(1)
  Local Clock Requirements 249(1)
  Steady-state Behavior 250(3)
  Establishing a Refresh Interval 253(3)
  Local Clock Adjustment 256(1)
  Implementation on the Paramid 257(2)
  Conclusion 259(4)
Part V Future Trends
Future Trends 263(6)
  Designing for Differing Embedded Hardware 265(1)
  Adapting to Mobile Networked Computation 265(2)
  Conclusion 267(2)
References 269(30)
Index 299