E-book: Fault-Tolerant Systems

4.22/5 (9 ratings from Goodreads)

Israel Koren (Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA), C. Mani Krishna (Department of Electrical and Computer Engineering, University of Massachusetts, Amherst)

Format: PDF+DRM
Publication date: 01-Sep-2020
Publisher: Morgan Kaufmann Publishers In
Language: eng
ISBN-13: 9780128181065

Price: €99.58*
* The price is final; no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

Fault-Tolerant Systems, Second Edition, is the first book on fault tolerance design utilizing a systems approach to both hardware and software. No other text takes this approach or offers the comprehensive and up-to-date treatment that Koren and Krishna provide. The book comprehensively covers the design of fault-tolerant hardware and software, use of fault-tolerance techniques to improve manufacturing yields, and design and analysis of networks. Incorporating case studies that highlight six different computer systems with fault-tolerance techniques implemented in their design, the book includes critical material on methods to protect against threats to encryption subsystems used for security purposes.

The text’s updated content will help students and practitioners in electrical and computer engineering and computer science learn how to design reliable computing systems, and how to analyze fault-tolerant computing systems.

Delivers the first book on fault tolerance design with a systems approach
Offers comprehensive coverage of both hardware and software fault tolerance, as well as information and time redundancy
Features fully updated content plus new chapters on failure mechanisms and fault-tolerance in cyber-physical systems
Provides a complete ancillary package, including an on-line solutions manual for instructors and PowerPoint slides

Preface to the Second Edition
Acknowledgments

Chapter 1 Preliminaries
1.1 Fault Classification
1.2 Types of Redundancy
1.3 Basic Measures of Fault Tolerance
1.3.1 Traditional Measures
1.3.2 Network Measures
1.4 Outline of This Book
1.5 Further Reading
References

Chapter 2 Hardware Fault Tolerance
2.1 The Rate of Hardware Failure
2.2 Failure Rate, Reliability, and Mean Time to Failure
2.3 Hardware Failure Mechanisms
2.3.1 Electromigration
2.3.2 Stress Migration
2.3.3 Negative Bias Temperature Instability
2.3.4 Hot Carrier Injection
2.3.5 Time-Dependent Dielectric Breakdown
2.3.6 Putting It All Together
2.4 Common-Mode Failures
2.5 Canonical and Resilient Structures
2.5.1 Series and Parallel Systems
2.5.2 Nonseries/Parallel Systems
2.5.3 M-of-N Systems
2.5.4 Voters
2.5.5 Variations on N-Modular Redundancy
2.5.6 Duplex Systems
2.6 Other Reliability Evaluation Techniques
2.6.1 Poisson Processes
2.6.2 Markov Models
2.7 Fault-Tolerance Processor-Level Techniques
2.7.1 Watchdog Processor
2.7.2 Simultaneous Multithreading for Fault Tolerance
2.8 Timing Fault Tolerance
2.9 Tolerance of Byzantine Failures
2.9.1 Byzantine Agreement With Message Authentication
2.10 Further Reading
2.11 Exercises
References

Chapter 3 Information Redundancy
3.1 Coding
3.1.1 Parity Codes
3.1.2 Checksum
3.1.3 M-of-N Codes
3.1.4 Berger Code
3.1.5 Cyclic Codes
3.1.6 Arithmetic Codes
3.1.7 Local Hard and Soft Decisions
3.2 Resilient Disk Systems
3.2.1 RAID Level 1
3.2.2 RAID Level 2
3.2.3 RAID Level 3
3.2.4 RAID Level 4
3.2.5 RAID Level 5
3.2.6 Hierarchical RAID
3.2.7 Modeling Correlated Failures
3.2.8 RAID With Solid-State Disks
3.3 Data Replication
3.3.1 Voting: Nonhierarchical Organization
3.3.2 Voting: Hierarchical Organization
3.3.3 Primary-Backup Approach
3.4 Algorithm-Based Fault Tolerance
3.5 Further Reading
3.6 Exercises
References

Chapter 4 Fault-Tolerant Networks
4.1 Measures of Resilience
4.1.1 Graph Theoretical Measures
4.1.2 Computer Networks Measures
4.2 Common Network Topologies and Their Resilience
4.2.1 Multistage and Extra-Stage Networks
4.2.2 Crossbar Networks
4.2.3 Rectangular Mesh and Interstitial Mesh
4.2.4 Hypercube Network
4.2.5 Cube-Connected Cycles Networks
4.2.6 Loop Networks
4.2.7 Tree Networks
4.2.8 Ad Hoc Point-to-Point Networks
4.3 Fault-Tolerant Routing
4.3.1 Hypercube Fault-Tolerant Routing
4.3.2 Origin-Based Routing in the Mesh
4.4 Networks on a Chip
4.4.1 Router Fault Tolerance
4.4.2 Links
4.4.3 Routing in the Presence of Failure
4.5 Wireless Sensor Networks
4.5.1 Basics
4.5.2 Sensor Network Failures
4.5.3 Sensor Network Fault Tolerance
4.6 Further Reading
4.7 Exercises
References

Chapter 5 Software Fault Tolerance
5.1 Acceptance Tests
5.2 Single-Version Fault Tolerance
5.2.1 Wrappers
5.2.2 Software Rejuvenation
5.2.3 Data Diversity
5.2.4 Software-Implemented Hardware Fault Tolerance (SIHFT)
5.3 N-Version Programming
5.3.1 Consistent Comparison Problem
5.3.2 Version Independence
5.3.3 Other Issues in N-Version Programming
5.4 Recovery Block Approach
5.4.1 Basic Principles
5.4.2 Success Probability Calculation
5.4.3 Distributed Recovery Blocks
5.5 Preconditions, Postconditions, and Assertions
5.6 Exception Handling
5.6.1 Requirements From Exception Handlers
5.6.2 Basics of Exceptions and Exception Handling
5.6.3 Language Support
5.7 Software Reliability Models
5.7.1 Jelinski-Moranda Model
5.7.2 Littlewood-Verrall Model
5.7.3 Musa-Okumoto Model
5.7.4 Ostrand-Weyuker-Bell (OWB) Fault Model
5.7.5 Model Selection and Parameter Estimation
5.8 Fault-Tolerant Remote Procedure Calls
5.8.1 Primary-Backup Approach
5.8.2 The Circus Approach
5.9 Further Reading
5.10 Exercises
References

Chapter 6 Checkpointing
6.1 What Is Checkpointing?
6.1.1 Why Is Checkpointing Nontrivial?
6.2 Checkpoint Level
6.3 Optimal Checkpointing: an Analytical Model
6.3.1 Time Between Checkpoints: a First-Order Approximation
6.3.2 Optimal Checkpoint Placement
6.3.3 Time Between Checkpoints: a More Accurate Model
6.3.4 Reducing Overhead
6.3.5 Reducing Latency
6.4 Cache-Aided Rollback Error Recovery (CARER)
6.5 Checkpointing in Distributed Systems
6.5.1 The Domino Effect and Livelock
6.5.2 A Coordinated Checkpointing Algorithm
6.5.3 Time-Based Synchronization
6.5.4 Diskless Checkpointing
6.5.5 Message Logging
6.6 Checkpointing in Shared-Memory Systems
6.6.1 Bus-Based Coherence Protocol
6.6.2 Directory-Based Protocol
6.7 Checkpointing in Real-Time Systems
6.8 Checkpointing While Using Cloud Computing Utilities
6.9 Emerging Challenges: Petascale and Exascale Computing
6.10 Other Uses of Checkpointing
6.11 Further Reading
6.12 Exercises
References

Chapter 7 Cyber-Physical Systems
7.1 Structure of a Cyber-Physical System
7.2 The Controlled Plant State Space
7.3 Sensors
7.3.1 Calibration
7.3.2 Detecting Faulty Sensors
7.3.3 Confidence Measures for Intervals
7.4 The Cyber Platform
7.4.1 Isolation
7.4.2 Load Shedding
7.4.3 Overrun Absorption
7.5 Actuators
7.6 Further Reading
7.7 Exercises
References

Chapter 8 Case Studies
8.1 Aerospace Systems
8.1.1 Protecting Against Radiation
8.1.2 Flight Control System: Boeing 777
8.2 NonStop Systems
8.2.1 Architecture
8.2.2 Maintenance and Repair Aids
8.2.3 Software
8.2.4 Modifications to the NonStop Architecture
8.3 Stratus Systems
8.4 Cassini Command and Data Subsystem
8.5 IBM POWER8
8.6 IBM G5
8.7 IBM Sysplex
8.8 Intel Servers
8.8.1 Itanium
8.8.2 Xeon
8.9 Oracle SPARC M8 Server
8.10 Cloud Computing
8.10.1 Checkpointing in Response to Spot Pricing
8.10.2 Proactive Virtual Machine Migration
8.10.3 Fault Tolerance as a Service
8.11 Further Reading
References

Chapter 9 Simulation Techniques
9.1 Writing a Simulation Program
9.2 Parameter Estimation
9.2.1 Point Versus Interval Estimation
9.2.2 Method of Moments
9.2.3 Method of Maximum Likelihood
9.2.4 The Bayesian Approach to Parameter Estimation
9.2.5 Confidence Intervals
9.3 Variance Reduction Methods
9.3.1 Antithetic Variables
9.3.2 Using Control Variables
9.3.3 Stratified Sampling
9.3.4 Importance Sampling
9.4 Splitting
9.5 Random Number Generation
9.5.1 Uniformly Distributed Random Number Generators
9.5.2 Testing Uniform Random Number Generators
9.5.3 Generating Other Distributions
9.6 Fault Injection
9.6.1 Types of Fault Injection Techniques
9.6.2 Fault Injection Application and Tools
9.7 Further Reading
9.8 Exercises
References

Chapter 10 Defect Tolerance in VLSI Circuits
10.1 Manufacturing Defects and Circuit Faults
10.2 Probability of Failure and Critical Area
10.3 Basic Yield Models
10.3.1 The Poisson and Compound Poisson Yield Models
10.3.2 Variations on the Simple Yield Models
10.4 Yield Enhancement Through Redundancy
10.4.1 Yield Projection for Chips With Redundancy
10.4.2 Memory Arrays With Redundancy
10.4.3 Logic Integrated Circuits With Redundancy
10.4.4 Modifying the Floorplan
10.5 Further Reading
10.6 Exercises
References

Chapter 11 Fault Detection in Cryptographic Systems
11.1 Overview of Ciphers
11.1.1 Symmetric Key Ciphers
11.1.2 Public Key Ciphers
11.2 Security Attacks Through Fault Injection
11.2.1 Fault Attacks on Symmetric Key Ciphers
11.2.2 Fault Attacks on Public (Asymmetric) Key Ciphers
11.3 Countermeasures
11.3.1 Spatial and Temporal Duplication
11.3.2 Error-Detecting Codes
11.3.3 Are These Countermeasures Sufficient?
11.3.4 Final Comment
11.4 Further Reading
11.5 Exercises
References

Index

Israel Koren is Professor Emeritus of Electrical and Computer Engineering at the University of Massachusetts, Amherst. Previously, he held positions with the Technion - Israel Institute of Technology, Haifa; the University of California at Berkeley; the University of Southern California, Los Angeles; and the University of California, Santa Barbara. He has been a consultant to several companies, including Analog Devices, AMD, Digital Equipment Corp., IBM, Intel, and National Semiconductor. His research interests include fault-tolerant computing, cyber-physical systems, computer architecture, computer arithmetic, and secure cryptographic systems. He has over 300 publications in refereed journals and conferences and has served as general chair, program committee chair, and program committee member for numerous conferences.

C. Mani Krishna is Professor of Electrical and Computer Engineering at the University of Massachusetts, Amherst. He received his PhD in Electrical Engineering from the University of Michigan in 1984. He previously received a BTech in Electrical Engineering from the Indian Institute of Technology, Delhi, in 1979, and an MS from the Rensselaer Polytechnic Institute in Troy, NY, in 1980. Dr. Krishna's research interests are in the areas of cyber-physical systems, real-time and fault-tolerant computing, and distributed and networked systems. He has also been an editor on volumes of readings in performance evaluation and real-time systems, and for special issues on real-time systems of IEEE Computer and the Proceedings of the IEEE.

Permalink: https://www.kriso.ee/db/97801281810652e.html
