|
|
xiii | |
|
|
xxi | |
Foreword |
|
xxiii | |
Preface |
|
xxv | |
|
Multi-Core Architectures for Embedded Systems |
|
|
1 | (30) |
|
|
|
2 | (7) |
|
What Makes Multiprocessor Solutions Attractive? |
|
|
3 | (6) |
|
Architectural Considerations |
|
|
9 | (2) |
|
|
11 | (2) |
|
|
13 | (1) |
|
|
14 | (11) |
|
HiBRID-SoC for Multimedia Signal Processing |
|
|
14 | (2) |
|
|
16 | (1) |
|
Defect-Tolerant and Reconfigurable MPSoC |
|
|
17 | (1) |
|
Homogeneous Multiprocessor for Embedded Printer Application |
|
|
18 | (2) |
|
General Purpose Multiprocessor DSP |
|
|
20 | (1) |
|
Multiprocessor DSP for Mobile Applications |
|
|
21 | (2) |
|
|
23 | (2) |
|
|
25 | (1) |
|
|
25 | (2) |
|
|
27 | (4) |
|
Application-Specific Customizable Embedded Systems |
|
|
31 | (40) |
|
|
|
32 | (2) |
|
Challenges and Opportunities |
|
|
34 | (3) |
|
|
35 | (2) |
|
|
37 | (4) |
|
Customized Application-Specific Processor Techniques |
|
|
37 | (3) |
|
Customized Application-Specific On-Chip Interconnect Techniques |
|
|
40 | (1) |
|
Configurable Processors and Instruction Set Synthesis |
|
|
41 | (11) |
|
Design Methodology for Processor Customization |
|
|
43 | (1) |
|
Instruction Set Extension Techniques |
|
|
44 | (4) |
|
Application-Specific Memory-Aware Customization |
|
|
48 | (1) |
|
Customizing On-Chip Communication Interconnect |
|
|
48 | (1) |
|
|
49 | (3) |
|
Reconfigurable Instruction Set Processors |
|
|
52 | (2) |
|
|
53 | (1) |
|
Hardware/Software Codesign |
|
|
54 | (1) |
|
Hardware Architecture Description Languages |
|
|
55 | (3) |
|
|
57 | (1) |
|
|
58 | (2) |
|
Case Study: Realizing Customizable Multi-Core Designs |
|
|
60 | (2) |
|
The Future: System Design with Customizable Architectures, Software, and Tools |
|
|
62 | (1) |
|
|
63 | (1) |
|
|
63 | (8) |
|
Power Optimization in Multi-Core System-on-Chip |
|
|
71 | (40) |
|
|
|
|
|
|
72 | (2) |
|
|
74 | (8) |
|
|
75 | (5) |
|
|
80 | (2) |
|
|
82 | (5) |
|
|
82 | (1) |
|
|
83 | (1) |
|
|
84 | (1) |
|
|
85 | (1) |
|
|
86 | (1) |
|
On-Chip Communication Architectures |
|
|
87 | (3) |
|
|
90 | (5) |
|
|
91 | (4) |
|
DPM and DVS in Multi-Core Systems |
|
|
95 | (5) |
|
|
100 | (1) |
|
|
101 | (1) |
|
|
102 | (9) |
|
Routing Algorithms for Irregular Mesh-Based Network-on-Chip |
|
|
111 | (44) |
|
|
|
|
112 | (1) |
|
An Overview of Irregular Mesh Topology |
|
|
113 | (2) |
|
|
113 | (1) |
|
|
113 | (2) |
|
Fault-Tolerant Routing Algorithms for 2D Meshes |
|
|
115 | (11) |
|
Fault-Tolerant Routing Using Virtual Channels |
|
|
116 | (1) |
|
Fault-Tolerant Routing with Turn Model |
|
|
117 | (9) |
|
Routing Algorithms for Irregular Mesh Topology |
|
|
126 | (10) |
|
Traffic-Balanced OAPR Routing Algorithm |
|
|
127 | (5) |
|
Application-Specific Routing Algorithm |
|
|
132 | (4) |
|
Placement for Irregular Mesh Topology |
|
|
136 | (7) |
|
OIP Placements Based on Chen and Chiu's Algorithm |
|
|
137 | (3) |
|
OIP Placements Based on OAPR |
|
|
140 | (3) |
|
Hardware Efficient Routing Algorithms |
|
|
143 | (8) |
|
|
146 | (1) |
|
XY-Deviation Table Routing (XYDT) |
|
|
147 | (1) |
|
Source Routing for Deviation Points (SRDP) |
|
|
147 | (1) |
|
Degree Priority Routing Algorithm |
|
|
148 | (3) |
|
|
151 | (1) |
|
|
151 | (1) |
|
|
151 | (4) |
|
Debugging Multi-Core Systems-on-Chip |
|
|
155 | (46) |
|
|
|
|
156 | (2) |
|
Why Debugging is Difficult |
|
|
158 | (5) |
|
Limited Internal Observability |
|
|
158 | (1) |
|
Asynchronicity and Consistent Global States |
|
|
159 | (2) |
|
Non-Determinism and Multiple Traces |
|
|
161 | (2) |
|
|
163 | (6) |
|
|
164 | (1) |
|
|
165 | (1) |
|
|
166 | (3) |
|
|
169 | (5) |
|
|
169 | (2) |
|
Comparing Existing Debug Methods |
|
|
171 | (3) |
|
|
174 | (4) |
|
Communication-Centric Debug |
|
|
175 | (1) |
|
|
175 | (1) |
|
|
176 | (1) |
|
|
176 | (2) |
|
On-Chip Debug Infrastructure |
|
|
178 | (6) |
|
|
178 | (1) |
|
|
178 | (2) |
|
Computation-Specific Instrument |
|
|
180 | (1) |
|
Protocol-Specific Instrument |
|
|
181 | (1) |
|
Event Distribution Interconnect |
|
|
182 | (1) |
|
Debug Control Interconnect |
|
|
183 | (1) |
|
|
183 | (1) |
|
Off-Chip Debug Infrastructure |
|
|
184 | (6) |
|
|
184 | (1) |
|
Abstractions Used by Debugger Software |
|
|
184 | (6) |
|
|
190 | (3) |
|
|
193 | (1) |
|
|
194 | (1) |
|
|
194 | (7) |
|
System-Level Tools for NoC-Based Multi-Core Design |
|
|
201 | (42) |
|
|
|
|
|
202 | (4) |
|
|
204 | (2) |
|
|
206 | (1) |
|
Graph Theoretical Analysis |
|
|
207 | (3) |
|
Generating Synthetic Graphs Using TGFF |
|
|
209 | (1) |
|
Task Mapping for SoC Applications |
|
|
210 | (6) |
|
Application Task Embedding and Quality Metrics |
|
|
210 | (4) |
|
|
214 | (2) |
|
OMNeT++ Simulation Framework |
|
|
216 | (1) |
|
|
217 | (14) |
|
|
217 | (1) |
|
Prospective NoC Topology Models |
|
|
218 | (1) |
|
Spidergon Network on Chip |
|
|
219 | (2) |
|
Task Graph Embedding and Analysis |
|
|
221 | (2) |
|
Simulation Models for Proposed NoC Topologies |
|
|
223 | (4) |
|
MPEG4: A Realistic Scenario |
|
|
227 | (4) |
|
Conclusions and Extensions |
|
|
231 | (3) |
|
|
234 | (1) |
|
|
235 | (8) |
|
Compiler Techniques for Application Level Memory Optimization for MPSoC |
|
|
243 | (26) |
|
|
|
|
|
|
|
|
244 | (1) |
|
Loop Transformation for Single and Multiprocessors |
|
|
245 | (1) |
|
Program Transformation Concepts |
|
|
246 | (2) |
|
Memory Optimization Techniques |
|
|
248 | (2) |
|
|
249 | (1) |
|
|
249 | (1) |
|
|
249 | (1) |
|
MPSoC Memory Optimization Techniques |
|
|
250 | (5) |
|
|
251 | (1) |
|
Comparison of Lexicographically Positive and Positive Dependency |
|
|
252 | (1) |
|
|
253 | (1) |
|
|
254 | (1) |
|
|
255 | (1) |
|
|
255 | (1) |
|
|
256 | (1) |
|
Improvement in Optimization Techniques |
|
|
256 | (5) |
|
Parallel Processing Area and Partitioning |
|
|
256 | (3) |
|
Modulo Operator Elimination |
|
|
259 | (1) |
|
Unimodular Transformation |
|
|
260 | (1) |
|
|
261 | (2) |
|
Cache Ratio and Memory Space |
|
|
262 | (1) |
|
Processing Time and Code Size |
|
|
263 | (1) |
|
|
263 | (1) |
|
|
264 | (1) |
|
|
265 | (1) |
|
|
266 | (3) |
|
Programming Models for Multi-Core Embedded Software |
|
|
269 | (40) |
|
|
|
|
|
|
270 | (2) |
|
Thread Libraries for Multi-Threaded Programming |
|
|
272 | (4) |
|
Protections for Data Integrity in a Multi-Threaded Environment |
|
|
276 | (3) |
|
Mutual Exclusion Primitives for Deterministic Output |
|
|
276 | (2) |
|
|
278 | (1) |
|
Programming Models for Shared Memory and Distributed Memory |
|
|
279 | (3) |
|
|
279 | (1) |
|
|
280 | (1) |
|
Message Passing Interface |
|
|
281 | (1) |
|
Parallel Programming on Multiprocessors |
|
|
282 | (1) |
|
Parallel Programming Using Graphic Processors |
|
|
283 | (1) |
|
Model-Driven Code Generation for Multi-Core Systems |
|
|
284 | (2) |
|
|
285 | (1) |
|
Synchronous Programming Languages |
|
|
286 | (2) |
|
Imperative Synchronous Language: Esterel |
|
|
288 | (2) |
|
|
288 | (1) |
|
Multi-Core Implementations and Their Compilation Schemes |
|
|
289 | (1) |
|
Declarative Synchronous Language: LUSTRE |
|
|
290 | (2) |
|
|
291 | (1) |
|
Multi-Core Implementations from LUSTRE Specifications |
|
|
291 | (1) |
|
Multi-Rate Synchronous Language: SIGNAL |
|
|
292 | (7) |
|
|
292 | (1) |
|
Characterization and Compilation of SIGNAL |
|
|
293 | (1) |
|
SIGNAL Implementations on Distributed Systems |
|
|
294 | (2) |
|
Multi-Threaded Programming Models for SIGNAL |
|
|
296 | (3) |
|
Programming Models for Real-Time Software |
|
|
299 | (2) |
|
Real-Time Extensions to Synchronous Languages |
|
|
300 | (1) |
|
Future Directions for Multi-Core Programming |
|
|
301 | (1) |
|
|
302 | (3) |
|
|
305 | (4) |
|
Operating System Support for Multi-Core Systems-on-Chips |
|
|
309 | (28) |
|
|
|
|
310 | (1) |
|
Ideal Software Organization |
|
|
311 | (2) |
|
|
313 | (1) |
|
|
314 | (8) |
|
|
314 | (3) |
|
General Purpose Operating System |
|
|
317 | (5) |
|
Real-Time and Component-Based Operating System Models |
|
|
322 | (7) |
|
Automated Application Code Generation and RTOS Modeling |
|
|
322 | (4) |
|
Component-Based Operating System |
|
|
326 | (3) |
|
|
329 | (1) |
|
|
330 | (2) |
|
|
332 | (1) |
|
|
333 | (4) |
|
Autonomous Power Management in Embedded Multi-Cores |
|
|
337 | (32) |
|
|
|
|
|
|
|
338 | (4) |
|
Why is Autonomous Power Management Necessary? |
|
|
339 | (3) |
|
Survey of Autonomous Power Management Techniques |
|
|
342 | (5) |
|
|
342 | (1) |
|
|
343 | (1) |
|
Dynamic Voltage and Frequency Scaling |
|
|
343 | (1) |
|
|
344 | (1) |
|
|
345 | (1) |
|
Commercial Power Management Tools |
|
|
346 | (1) |
|
Power Management and RTOS |
|
|
347 | (2) |
|
Power-Smart RTOS and Processor Simulators |
|
|
349 | (2) |
|
Chip Multi-Threading (CMT) Architecture Simulator |
|
|
350 | (1) |
|
Autonomous Power Saving in Multi-Core Processors |
|
|
351 | (7) |
|
Opportunities to Save Power |
|
|
353 | (1) |
|
|
354 | (2) |
|
Case Study: Power Saving in Intel Centrino |
|
|
356 | (2) |
|
|
358 | (2) |
|
|
358 | (1) |
|
|
358 | (2) |
|
|
360 | (2) |
|
|
362 | (1) |
|
|
363 | (6) |
|
Multi-Core System-on-Chip in Real World Products |
|
|
369 | (30) |
|
|
|
|
|
|
370 | (1) |
|
Overview of picoArray Architecture |
|
|
371 | (4) |
|
Basic Processor Architecture |
|
|
371 | (2) |
|
Communications Interconnect |
|
|
373 | (1) |
|
Peripherals and Hardware Functional Accelerators |
|
|
373 | (2) |
|
|
375 | (6) |
|
picoVhdl Parser (Analyzer, Elaborator, Assembler) |
|
|
376 | (1) |
|
|
376 | (2) |
|
|
378 | (3) |
|
Design Partitioning for Multiple Devices |
|
|
381 | (1) |
|
|
381 | (1) |
|
|
381 | (1) |
|
picoArray Debug and Analysis |
|
|
381 | (7) |
|
|
382 | (1) |
|
|
383 | (1) |
|
|
383 | (2) |
|
|
385 | (2) |
|
|
387 | (1) |
|
|
387 | (1) |
|
Hardening Process in Practice |
|
|
388 | (4) |
|
Viterbi Decoder Hardening |
|
|
389 | (3) |
|
|
392 | (4) |
|
|
396 | (1) |
|
|
396 | (1) |
|
|
397 | (2) |
|
Embedded Multi-Core Processing for Networking |
|
|
399 | (66) |
|
|
|
|
400 | (3) |
|
Overview of Proposed NPU Architectures |
|
|
403 | (9) |
|
Multi-Core Embedded Systems for Multi-Service Broadband Access and Multimedia Home Networks |
|
|
403 | (2) |
|
SoC Integration of Network Components and Examples of Commercial Access NPUs |
|
|
405 | (2) |
|
NPU Architectures for Core Network Nodes and High-Speed Networking and Switching |
|
|
407 | (5) |
|
Programmable Packet Processing Engines |
|
|
412 | (10) |
|
|
413 | (5) |
|
|
418 | (3) |
|
Specialized Instruction Set Architectures |
|
|
421 | (1) |
|
Address Lookup and Packet Classification Engines |
|
|
422 | (9) |
|
Classification Techniques |
|
|
424 | (2) |
|
|
426 | (5) |
|
Packet Buffering and Queue Management Engines |
|
|
431 | (11) |
|
|
433 | (2) |
|
Design of Specialized Core for Implementation of Queue Management in Hardware |
|
|
435 | (7) |
|
|
442 | (11) |
|
Data Structures in Scheduling Architectures |
|
|
443 | (1) |
|
|
444 | (6) |
|
|
450 | (3) |
|
|
453 | (2) |
|
|
455 | (4) |
|
|
459 | (6) |
Index |
|
465 | |