Preface |
|
xv | |
Acknowledgment |
|
xxi | |
Part I Overview |
|
|
|
3 | (26) |
|
1.1 Embedded Systems Applications |
|
|
6 | (3) |
|
1.1.1 Cyber-Physical Systems |
|
|
6 | (1) |
|
|
6 | (1) |
|
|
7 | (1) |
|
|
8 | (1) |
|
1.2 Characteristics of Embedded Systems Applications |
|
|
9 | (2) |
|
1.2.1 Throughput-Intensive |
|
|
9 | (1) |
|
1.2.2 Thermal-Constrained |
|
|
9 | (1) |
|
1.2.3 Reliability-Constrained |
|
|
10 | (1) |
|
|
10 | (1) |
|
1.2.5 Parallel and Distributed |
|
|
10 | (1) |
|
1.3 Embedded Systems—Hardware and Software |
|
|
11 | (4) |
|
1.3.1 Embedded Systems Hardware |
|
|
11 | (3) |
|
1.3.2 Embedded Systems Software |
|
|
14 | (1) |
|
1.4 Modeling—An Integral Part of the Embedded Systems Design Flow |
|
|
15 | (6) |
|
1.4.1 Modeling Objectives |
|
|
16 | (2) |
|
|
18 | (2) |
|
1.4.3 Strategies for Integration of Modeling Paradigms |
|
|
20 | (1) |
|
1.5 Optimization in Embedded Systems |
|
|
21 | (6) |
|
1.5.1 Optimization of Embedded Systems Design Metrics |
|
|
23 | (3) |
|
1.5.2 Multiobjective Optimization |
|
|
26 | (1) |
|
|
27 | (2) |
|
2 Multicore-Based EWSNs—An Example of Parallel and Distributed Embedded Systems |
|
|
29 | (22) |
|
2.1 Multicore Embedded Wireless Sensor Network Architecture |
|
|
31 | (2) |
|
2.2 Multicore Embedded Sensor Node Architecture |
|
|
33 | (3) |
|
|
34 | (1) |
|
|
34 | (1) |
|
|
34 | (1) |
|
|
35 | (1) |
|
|
35 | (1) |
|
|
35 | (1) |
|
2.2.7 Location Finding Unit |
|
|
36 | (1) |
|
2.3 Compute-Intensive Tasks Motivating the Emergence of MCEWSNs |
|
|
36 | (2) |
|
|
36 | (2) |
|
|
38 | (1) |
|
|
38 | (1) |
|
2.3.4 Software-Defined Radio (SDR) |
|
|
38 | (1) |
|
2.4 MCEWSN Application Domains |
|
|
38 | (5) |
|
2.4.1 Wireless Video Sensor Networks (WVSNs) |
|
|
39 | (1) |
|
2.4.2 Wireless Multimedia Sensor Networks (WMSNs) |
|
|
39 | (1) |
|
2.4.3 Satellite-Based Wireless Sensor Networks (SBWSN) |
|
|
40 | (1) |
|
2.4.4 Space Shuttle Sensor Networks (3SN) |
|
|
41 | (1) |
|
2.4.5 Aerial-Terrestrial Hybrid Sensor Networks (ATHSNs) |
|
|
42 | (1) |
|
2.4.6 Fault-Tolerant (FT) Sensor Networks |
|
|
43 | (1) |
|
2.5 Multicore Embedded Sensor Nodes |
|
|
43 | (2) |
|
|
43 | (1) |
|
2.5.2 Mars Rover Prototype Mote |
|
|
43 | (1) |
|
2.5.3 Satellite-Based Sensor Node (SBSN) |
|
|
44 | (1) |
|
2.5.4 Multi-CPU-Based Sensor Node Prototype |
|
|
44 | (1) |
|
|
44 | (1) |
|
2.6 Research Challenges and Future Research Directions |
|
|
45 | (2) |
|
|
47 | (4) |
Part II Modeling |
|
|
3 An Application Metrics Estimation Model for Embedded Wireless Sensor Networks |
|
|
51 | (12) |
|
3.1 Application Metrics Estimation Model |
|
|
52 | (6) |
|
3.1.1 Lifetime Estimation |
|
|
53 | (3) |
|
3.1.2 Throughput Estimation |
|
|
56 | (1) |
|
3.1.3 Reliability Estimation |
|
|
57 | (1) |
|
|
57 | (1) |
|
|
58 | (3) |
|
|
58 | (1) |
|
|
59 | (2) |
|
|
61 | (2) |
|
4 Modeling and Analysis of Fault Detection and Fault Tolerance in Embedded Wireless Sensor Networks |
|
|
63 | (44) |
|
|
67 | (3) |
|
|
67 | (1) |
|
|
68 | (1) |
|
4.1.3 WSN Reliability Modeling |
|
|
69 | (1) |
|
4.2 Fault Diagnosis in WSNs |
|
|
70 | (4) |
|
|
70 | (2) |
|
4.2.2 Taxonomy for Fault Diagnosis Techniques |
|
|
72 | (2) |
|
4.3 Distributed Fault Detection Algorithms |
|
|
74 | (3) |
|
4.3.1 Fault Detection Algorithm 1: The Chen Algorithm |
|
|
74 | (2) |
|
4.3.2 Fault Detection Algorithm 2: The Ding Algorithm |
|
|
76 | (1) |
|
4.4 Fault-Tolerant Markov Models |
|
|
77 | (8) |
|
4.4.1 Fault-Tolerance Parameters |
|
|
77 | (2) |
|
4.4.2 Fault-Tolerant Sensor Node Model |
|
|
79 | (2) |
|
4.4.3 Fault-Tolerant WSN Cluster Model |
|
|
81 | (2) |
|
4.4.4 Fault-Tolerant WSN Model |
|
|
83 | (2) |
|
4.5 Simulation of Distributed Fault Detection Algorithms |
|
|
85 | (6) |
|
4.5.1 Using ns-2 to Simulate Faulty Sensors |
|
|
85 | (1) |
|
4.5.2 Experimental Setup for Simulated Data |
|
|
86 | (1) |
|
4.5.3 Experiments Using Real-World Data |
|
|
86 | (5) |
|
|
91 | (10) |
|
|
91 | (1) |
|
4.6.2 Reliability and MTTF for an NFT and an FT Sensor Node |
|
|
91 | (4) |
|
4.6.3 Reliability and MTTF for an NFT and an FT WSN Cluster |
|
|
95 | (3) |
|
4.6.4 Reliability and MTTF for an NFT and an FT WSN |
|
|
98 | (3) |
|
4.7 Research Challenges and Future Research Directions |
|
|
101 | (4) |
|
4.7.1 Accurate Fault Detection |
|
|
101 | (1) |
|
4.7.2 Benchmarks for Comparing Fault Detection Algorithms |
|
|
101 | (1) |
|
4.7.3 Energy-Efficient Fault Detection and Tolerance |
|
|
101 | (1) |
|
4.7.4 Machine-Learning-Inspired Fault Detection |
|
|
102 | (1) |
|
4.7.5 FT in Multimedia Sensor Networks |
|
|
102 | (1) |
|
|
102 | (2) |
|
4.7.7 WSN Design and Tuning for Reliability |
|
|
104 | (1) |
|
4.7.8 Novel WSN Architectures |
|
|
104 | (1) |
|
|
105 | (2) |
|
5 A Queueing Theoretic Approach for Performance Evaluation of Low-Power Multicore-Based Parallel Embedded Systems |
|
|
107 | (36) |
|
|
110 | (2) |
|
5.2 Queueing Network Modeling of Multicore Embedded Architectures |
|
|
112 | (8) |
|
5.2.1 Queueing Network Terminology |
|
|
112 | (1) |
|
|
113 | (6) |
|
|
119 | (1) |
|
5.3 Queueing Network Model Validation |
|
|
120 | (5) |
|
5.3.1 Theoretical Validation |
|
|
120 | (1) |
|
5.3.2 Validation with a Multicore Simulator |
|
|
120 | (4) |
|
|
124 | (1) |
|
5.4 Queueing Theoretic Model Insights |
|
|
125 | (14) |
|
|
125 | (4) |
|
5.4.2 The Effects of Cache Miss Rates on Performance |
|
|
129 | (3) |
|
5.4.3 The Effects of Workloads on Performance |
|
|
132 | (3) |
|
5.4.4 Performance per Watt and Performance per Unit Area Computations |
|
|
135 | (4) |
|
|
139 | (4) |
Part III Optimization |
|
|
6 Optimization Approaches in Distributed Embedded Wireless Sensor Networks |
|
|
143 | (16) |
|
6.1 Architecture-Level Optimizations |
|
|
144 | (2) |
|
6.2 Sensor Node Component-Level Optimizations |
|
|
146 | (3) |
|
|
146 | (2) |
|
|
148 | (1) |
|
|
148 | (1) |
|
|
148 | (1) |
|
|
148 | (1) |
|
6.2.6 Location Finding Unit |
|
|
149 | (1) |
|
|
149 | (1) |
|
6.3 Data Link-Level Medium Access Control Optimizations |
|
|
149 | (3) |
|
6.3.1 Load Balancing and Throughput Optimizations |
|
|
149 | (1) |
|
6.3.2 Power/Energy Optimizations |
|
|
150 | (2) |
|
6.4 Network-Level Data Dissemination and Routing Protocol Optimizations |
|
|
152 | (3) |
|
6.4.1 Query Dissemination Optimizations |
|
|
152 | (2) |
|
6.4.2 Real-Time Constrained Optimizations |
|
|
154 | (1) |
|
6.4.3 Network Topology Optimizations |
|
|
154 | (1) |
|
6.4.4 Resource-Adaptive Optimizations |
|
|
154 | (1) |
|
6.5 Operating System-Level Optimizations |
|
|
155 | (1) |
|
6.5.1 Event-Driven Optimizations |
|
|
155 | (1) |
|
6.5.2 Dynamic Power Management |
|
|
155 | (1) |
|
|
155 | (1) |
|
6.6 Dynamic Optimizations |
|
|
156 | (1) |
|
6.6.1 Dynamic Voltage and Frequency Scaling |
|
|
156 | (1) |
|
6.6.2 Software-Based Dynamic Optimizations |
|
|
156 | (1) |
|
6.6.3 Dynamic Network Reprogramming |
|
|
157 | (1) |
|
|
157 | (2) |
|
7 High-Performance Energy-Efficient Multicore-Based Parallel Embedded Computing |
|
|
159 | (32) |
|
7.1 Characteristics of Embedded Systems Applications |
|
|
163 | (3) |
|
7.1.1 Throughput-Intensive |
|
|
163 | (2) |
|
7.1.2 Thermal-Constrained |
|
|
165 | (1) |
|
7.1.3 Reliability-Constrained |
|
|
165 | (1) |
|
|
165 | (1) |
|
7.1.5 Parallel and Distributed |
|
|
165 | (1) |
|
7.2 Architectural Approaches |
|
|
166 | (7) |
|
|
166 | (2) |
|
|
168 | (2) |
|
7.2.3 Interconnection Network |
|
|
170 | (2) |
|
7.2.4 Reduction Techniques |
|
|
172 | (1) |
|
7.3 Hardware-Assisted Middleware Approaches |
|
|
173 | (7) |
|
7.3.1 Dynamic Voltage and Frequency Scaling |
|
|
174 | (1) |
|
7.3.2 Advanced Configuration and Power Interface |
|
|
174 | (1) |
|
|
175 | (1) |
|
7.3.4 Threading Techniques |
|
|
176 | (1) |
|
7.3.5 Energy Monitoring and Management |
|
|
177 | (1) |
|
7.3.6 Dynamic Thermal Management |
|
|
178 | (1) |
|
7.3.7 Dependable Techniques |
|
|
179 | (1) |
|
|
180 | (2) |
|
|
180 | (1) |
|
|
180 | (2) |
|
7.5 High-Performance Energy-Efficient Multicore Processors |
|
|
182 | (4) |
|
|
183 | (1) |
|
7.5.2 ARM Cortex-A9 MPCore |
|
|
184 | (1) |
|
7.5.3 MPC8572E PowerQUICC III |
|
|
184 | (1) |
|
7.5.4 Tilera TILEPro64 and TILE-Gx |
|
|
184 | (1) |
|
7.5.5 AMD Opteron Processor |
|
|
185 | (1) |
|
7.5.6 Intel Xeon Processor |
|
|
185 | (1) |
|
7.5.7 Intel Sandy Bridge Processor |
|
|
185 | (1) |
|
7.5.8 Graphics Processing Units |
|
|
186 | (1) |
|
7.6 Challenges and Future Research Directions |
|
|
186 | (3) |
|
|
189 | (2) |
|
8 An MDP-Based Dynamic Optimization Methodology for Embedded Wireless Sensor Networks |
|
|
191 | (34) |
|
|
193 | (2) |
|
8.2 MDP-Based Tuning Overview |
|
|
195 | (5) |
|
8.2.1 MDP-Based Tuning Methodology for Embedded Wireless Sensor Networks |
|
|
195 | (2) |
|
8.2.2 MDP Overview with Respect to Embedded Wireless Sensor Networks |
|
|
197 | (3) |
|
8.3 Application-Specific Embedded Sensor Node Tuning Formulation as an MDP |
|
|
200 | (5) |
|
|
200 | (1) |
|
8.3.2 Decision Epochs and Actions |
|
|
200 | (1) |
|
|
201 | (1) |
|
8.3.4 Policy and Performance Criterion |
|
|
201 | (1) |
|
|
202 | (2) |
|
8.3.6 Optimality Equation |
|
|
204 | (1) |
|
8.3.7 Policy Iteration Algorithm |
|
|
205 | (1) |
|
8.4 Implementation Guidelines and Complexity |
|
|
205 | (2) |
|
8.4.1 Implementation Guidelines |
|
|
205 | (1) |
|
8.4.2 Computational Complexity |
|
|
206 | (1) |
|
8.4.3 Data Memory Analysis |
|
|
207 | (1) |
|
|
207 | (3) |
|
|
210 | (13) |
|
8.6.1 Fixed Heuristic Policies for Performance Comparisons |
|
|
210 | (1) |
|
|
210 | (3) |
|
8.6.3 Results for a Security/Defense System Application |
|
|
213 | (3) |
|
8.6.4 Results for a Healthcare Application |
|
|
216 | (4) |
|
8.6.5 Results for an Ambient Conditions Monitoring Application |
|
|
220 | (2) |
|
8.6.6 Sensitivity Analysis |
|
|
222 | (1) |
|
8.6.7 Number of Iterations for Convergence |
|
|
223 | (1) |
|
|
223 | (2) |
|
9 Online Algorithms for Dynamic Optimization of Embedded Wireless Sensor Networks |
|
|
225 | (16) |
|
|
227 | (1) |
|
9.2 Dynamic Optimization Methodology |
|
|
228 | (5) |
|
9.2.1 Methodology Overview |
|
|
228 | (1) |
|
|
229 | (1) |
|
|
229 | (1) |
|
9.2.4 Online Optimization Algorithms |
|
|
230 | (3) |
|
|
233 | (6) |
|
|
233 | (2) |
|
|
235 | (4) |
|
|
239 | (2) |
|
10 A Lightweight Dynamic Optimization Methodology for Embedded Wireless Sensor Networks |
|
|
241 | (28) |
|
|
243 | (1) |
|
10.2 Dynamic Optimization Methodology |
|
|
244 | (4) |
|
|
244 | (2) |
|
|
246 | (1) |
|
10.2.3 Optimization Objective Function |
|
|
246 | (2) |
|
10.3 Algorithms for Dynamic Optimization Methodology |
|
|
248 | (4) |
|
10.3.1 Initial Tunable Parameter Value Settings and Exploration Order |
|
|
248 | (1) |
|
10.3.2 Parameter Arrangement |
|
|
249 | (2) |
|
10.3.3 Online Optimization Algorithm |
|
|
251 | (1) |
|
10.3.4 Computational Complexity |
|
|
252 | (1) |
|
10.4 Experimental Results |
|
|
252 | (14) |
|
10.4.1 Experimental Setup |
|
|
253 | (2) |
|
|
255 | (11) |
|
|
266 | (3) |
|
11 Parallelized Benchmark-Driven Performance Evaluation of Symmetric Multiprocessors and Tiled Multicore Architectures for Parallel Embedded Systems |
|
|
269 | (18) |
|
|
271 | (1) |
|
11.2 Multicore Architectures and Benchmarks |
|
|
272 | (3) |
|
11.2.1 Multicore Architectures |
|
|
272 | (1) |
|
11.2.2 Benchmark Applications and Kernels |
|
|
273 | (2) |
|
11.3 Parallel Computing Device Metrics |
|
|
275 | (2) |
|
|
277 | (8) |
|
11.4.1 Quantitative Comparison of SMPs and TMAs |
|
|
277 | (1) |
|
11.4.2 Benchmark-Driven Results for SMPs |
|
|
278 | (2) |
|
11.4.3 Benchmark-Driven Results for TMAs |
|
|
280 | (2) |
|
11.4.4 Comparison of SMPs and TMAs |
|
|
282 | (3) |
|
|
285 | (2) |
|
12 High-Performance Optimizations on Tiled Manycore Embedded Systems: A Matrix Multiplication Case Study |
|
|
287 | (56) |
|
|
290 | (3) |
|
12.1.1 Performance Analysis and Optimization |
|
|
290 | (1) |
|
12.1.2 Parallelized MM Algorithms |
|
|
290 | (1) |
|
|
291 | (1) |
|
12.1.4 Tiled Manycore Architectures |
|
|
292 | (1) |
|
12.2 Tiled Manycore Architecture (TMA) Overview |
|
|
293 | (8) |
|
12.2.1 Intel's TeraFLOPS Research Chip |
|
|
294 | (2) |
|
12.2.2 IBM's Cyclops-64 (C64) |
|
|
296 | (1) |
|
12.2.3 Tilera's TILEPro64 |
|
|
297 | (3) |
|
|
300 | (1) |
|
12.3 Parallel Computing Metrics and Matrix Multiplication (MM) Case Study |
|
|
301 | (2) |
|
12.3.1 Parallel Computing Metrics for TMAs |
|
|
301 | (1) |
|
12.3.2 Matrix Multiplication (MM) Case Study |
|
|
302 | (1) |
|
12.4 Matrix Multiplication Algorithms' Code Snippets for Tilera's TILEPro64 |
|
|
303 | (11) |
|
12.4.1 Serial Non-blocked Matrix Multiplication Algorithm |
|
|
303 | (1) |
|
12.4.2 Serial Blocked Matrix Multiplication Algorithm |
|
|
304 | (3) |
|
12.4.3 Parallel Blocked Matrix Multiplication Algorithm |
|
|
307 | (2) |
|
12.4.4 Parallel Blocked Cannon's Algorithm for Matrix Multiplication |
|
|
309 | (5) |
|
12.5 Performance Optimization on a Manycore Architecture |
|
|
314 | (9) |
|
12.5.1 Performance Optimization on a Single Tile |
|
|
314 | (1) |
|
12.5.2 Parallel Performance Optimizations |
|
|
315 | (4) |
|
12.5.3 Compiler-Based Optimizations |
|
|
319 | (4) |
|
|
323 | (16) |
|
12.6.1 Data Allocation, Data Decomposition, Data Layout, and Communication |
|
|
324 | (3) |
|
12.6.2 Performance Optimizations on a Single Tile |
|
|
327 | (5) |
|
12.6.3 Parallel Performance Optimizations |
|
|
332 | (7) |
|
|
339 | (4) |
|
|
343 | (6) |
References |
|
349 | (20) |
Index |
|
369 | |