1 Basic Concepts, Motivation and Structure |
|
1 | (6) |
|
|
1 | (3) |
|
1.2 Scope and Contribution |
|
|
4 | (1) |
|
|
5 | (2) |
2 Background Concepts and Resilience |
|
7 | (32) |
|
2.1 System Failure Life Cycle |
|
|
7 | (2) |
|
2.2 Attributes and Measures of Resilience |
|
|
9 | (1) |
|
|
10 | (14) |
|
2.3.1 Performance and Reliability |
|
|
10 | (3) |
|
2.3.2 Reliability and Unreliability Functions |
|
|
13 | (1) |
|
2.3.3 Probability Density Function |
|
|
14 | (1) |
|
2.3.4 Failure Rate Function |
|
|
15 | (1) |
|
2.3.5 Cumulative Hazard Function |
|
|
16 | (1) |
|
2.3.6 Bathtub Curve of Failure Rates |
|
|
16 | (2) |
|
2.3.7 Mean Time Between Failures (MTBF) |
|
|
18 | (1) |
|
2.3.8 Mean Time to Failure (MTTF) |
|
|
19 | (1) |
|
2.3.9 Reliability Prediction |
|
|
20 | (4) |
|
|
24 | (1) |
|
|
25 | (8) |
|
|
25 | (1) |
|
|
26 | (3) |
|
|
29 | (4) |
|
|
33 | (1) |
|
|
34 | (2) |
|
|
35 | (1) |
|
2.7.2 Effectiveness of Resilience |
|
|
35 | (1) |
|
|
36 | (3) |
3 Dealing with Faults: Redundancy |
|
39 | (40) |
|
3.1 Handling Faults: Design Strategies |
|
|
39 | (1) |
|
|
40 | (2) |
|
3.3 Fault Tolerance: Using Redundancy |
|
|
42 | (2) |
|
3.3.1 Redundancy Notation |
|
|
43 | (1) |
|
3.4 Structural Redundancy HW(S) |
|
|
44 | (15) |
|
|
46 | (5) |
|
|
51 | (4) |
|
|
55 | (4) |
|
3.5 Information Redundancy |
|
|
59 | (10) |
|
3.5.1 Error Detection Codes: EDC |
|
|
61 | (1) |
|
3.5.2 Error Correction Codes: ECC |
|
|
62 | (7) |
|
|
69 | (7) |
|
3.6.1 Concurrent Error Detection: Basics of Time Redundancy |
|
|
69 | (3) |
|
|
72 | (1) |
|
3.6.3 Recomputing with Shifted Operands (RESO) |
|
|
73 | (2) |
|
3.6.4 Recomputing with Rotated Operands (RERO) |
|
|
75 | (1) |
|
3.6.5 Recomputing with Swapped Operands (RESWO) |
|
|
76 | (1) |
|
3.6.6 Recomputing with Comparison (REDWC) |
|
|
76 | (1) |
|
3.7 Comparison of Main Redundancy Schemes |
|
|
76 | (1) |
|
|
77 | (2) |
4 Impact of Radiation on Electronics |
|
79 | (34) |
|
|
79 | (1) |
|
4.2 Radiation and Its Effect on Electronics |
|
|
80 | (1) |
|
|
81 | (1) |
|
4.4 Radiation Macro-effects |
|
|
82 | (5) |
|
4.5 Single Event Effects (SEE) |
|
|
87 | (24) |
|
4.5.1 Physical Mechanisms Responsible for SEEs |
|
|
87 | (8) |
|
4.5.2 System Level Response |
|
|
95 | (16) |
|
|
111 | (2) |
5 FT Models |
|
113 | (32) |
|
|
113 | (4) |
|
|
117 | (1) |
|
5.3 Classification of Faults by Origin |
|
|
117 | (6) |
|
|
117 | (4) |
|
|
121 | (1) |
|
5.3.3 Phase of Creation and Occurrence of Faults |
|
|
122 | (1) |
|
|
122 | (1) |
|
|
123 | (1) |
|
5.3.6 Phenomenological Cause |
|
|
123 | (1) |
|
5.3.7 Capability/Objective/Intent |
|
|
123 | (1) |
|
5.4 Classification of Faults by Manifestation |
|
|
123 | (11) |
|
5.4.1 Response-Timeliness |
|
|
125 | (1) |
|
|
126 | (2) |
|
5.4.3 Maintainability: Detectability, Diagnosability and Recoverability |
|
|
128 | (6) |
|
5.5 FT and System Modelling |
|
|
134 | (8) |
|
|
135 | (1) |
|
5.5.2 GAFT: Generalised Algorithm of Fault Tolerance |
|
|
136 | (4) |
|
5.5.3 GAFT: System States and Actions to Implement Fault Tolerance |
|
|
140 | (2) |
|
|
142 | (3) |
6 Hardware Support of Resilience |
|
145 | (28) |
|
6.1 ERA Concept, System Design and Hardware Elements |
|
|
145 | (2) |
|
6.2 ERA Hardware Configuration: ERRIC |
|
|
147 | (5) |
|
|
147 | (3) |
|
|
150 | (1) |
|
|
151 | (1) |
|
6.3 ERA Reconfigurability |
|
|
152 | (4) |
|
6.3.1 T-Logic for Memory Management |
|
|
152 | (3) |
|
6.3.2 T-Logic for Configuration in ERA |
|
|
155 | (1) |
|
|
156 | (9) |
|
|
156 | (5) |
|
6.4.2 Location Access and Way of Operation of the Syndrome |
|
|
161 | (2) |
|
6.4.3 Syndrome: Passive Zone Configurations |
|
|
163 | (2) |
|
|
165 | (3) |
|
6.5.1 Graceful Degradation: Markov Analysis |
|
|
166 | (2) |
|
6.6 Implementation Constraints |
|
|
168 | (4) |
|
6.6.1 Graceful Degradation: Markov Analysis |
|
|
169 | (1) |
|
6.6.2 Interfacing Zone: the Syndrome as Memory Controller |
|
|
170 | (2) |
|
6.6.3 Access to the Syndrome |
|
|
172 | (1) |
|
|
172 | (1) |
7 System Software Support |
|
173 | (10) |
|
7.1 System Software Support of Hardware Checking |
|
|
173 | (3) |
|
7.2 System Software Support for Hardware Reconfiguration |
|
|
176 | (2) |
|
7.3 System Software Monitor of Hardware Condition |
|
|
178 | (2) |
|
|
180 | (3) |
8 Implementation: Hardware Prototype, Comparisons, Simulation and Testing |
|
183 | (24) |
|
8.1 Instruction Execution |
|
|
183 | (1) |
|
|
184 | (4) |
|
8.3 ERA Hardware Prototype |
|
|
188 | (1) |
|
8.4 Architectural Comparison |
|
|
189 | (5) |
|
8.5 ERA Testing and Debugging |
|
|
194 | (1) |
|
|
194 | (4) |
|
8.7 ERA's Simulator: Dissimera |
|
|
198 | (7) |
|
8.7.1 Architecture and Description |
|
|
199 | (6) |
|
8.7.2 Dissimera Log Sample |
|
|
205 | (1) |
|
|
205 | (2) |
9 Conclusions |
|
207 | (4) |
|
|
207 | (3) |
|
|
210 | (1) |
10 Vision on Evolving System Future |
|
211 | (28) |
|
|
211 | (2) |
|
10.2 Known Solutions (What We Have...) |
|
|
213 | (1) |
|
|
214 | (4) |
|
10.4 Proposed Approach (What We Need and Why We Need This) |
|
|
218 | (2) |
|
|
220 | (5) |
|
10.5.1 Control-Data-Predicate (CDP) Model |
|
|
220 | (3) |
|
10.5.2 Graph Logic Model (GLM) |
|
|
223 | (2) |
|
10.6 System Software for Evolving Systems |
|
|
225 | (6) |
|
10.6.1 Active Language (AL) |
|
|
225 | (3) |
|
10.6.2 Active Reconfigurable Run-Time System |
|
|
228 | (3) |
|
10.7 Evolving System: Hardware |
|
|
231 | (2) |
|
|
231 | (2) |
|
10.8 Evolving System: Multi-element Configuration |
|
|
233 | (2) |
|
10.9 Evolving System Approach vs. Berkeley View |
|
|
235 | (2) |
|
10.10 Evolving System: Conclusion |
|
|
237 | (2) |
References |
|
239 | (16) |
Index |
|
255 | |