E-book: Fault-Tolerant Systems

Israel Koren (Department of Electrical and Computer Engineering, University of Massachusetts, Amherst), C. Mani Krishna (Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA)
  • Format: EPUB+DRM
  • Publication date: 19-Jul-2010
  • Publisher: Morgan Kaufmann Publishers In
  • Language: English
  • ISBN-13: 9780080492681
  • Price: 60,50 €*
  • * The price is final; no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, so you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application made specifically for reading e-books; not to be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

There are many applications in which the reliability of the overall system must be far higher than the reliability of its individual components. In such cases, designers devise mechanisms and architectures that allow the system either to mask the effects of a component failure completely or to recover from it so quickly that the application is not seriously affected. This is the work of fault-tolerance designers, and it is increasingly important and complex, not only because the number of "mission-critical" applications is growing, but also because the diminishing reliability of hardware means that even systems for non-critical applications will need to be designed with fault tolerance in mind.
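
As a rough illustration of how redundancy can lift system reliability above component reliability, the sketch below computes the reliability of an M-of-N arrangement, with triple modular redundancy (2-of-3) as a special case. It is not taken from the book: the function names are hypothetical, and it assumes independent module failures, identical module reliability r, and a perfect voter.

```python
from math import comb


def m_of_n_reliability(m: int, n: int, r: float) -> float:
    """Probability that at least m of n modules work.

    Assumes (for illustration only) that modules fail independently,
    each works with probability r, and the voter never fails.
    """
    return sum(comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(m, n + 1))


def tmr_reliability(r: float) -> float:
    """Triple modular redundancy: the system works if at least 2 of 3 modules work."""
    return m_of_n_reliability(2, 3, r)


if __name__ == "__main__":
    for r in (0.99, 0.9, 0.5, 0.4):
        print(f"module r = {r:.2f} -> TMR system = {tmr_reliability(r):.4f}")
```

Under these assumptions, a module reliability of 0.9 gives a system reliability of about 0.972. The gain disappears once module reliability falls to 0.5 or below, where the redundant system does no better than a single module, and worse below that point.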

Reflecting the real-world challenges faced by designers of these systems, this book addresses fault tolerance design with a systems approach to both hardware and software. No other text on the market takes this approach, nor offers the comprehensive and up-to-date treatment Koren and Krishna provide. Students, designers and architects of high performance processors will value this comprehensive overview of the field.

* The first book on fault tolerance design with a systems approach

* Comprehensive coverage of both hardware and software fault tolerance, as well as information and time redundancy

* Incorporated case studies highlight six different computer systems with fault-tolerance techniques implemented in their design

* A complete ancillary package is available to lecturers, including an online solutions manual and PowerPoint slides

Other information

Now combining hardware and software fault tolerance in a single book
Foreword
Preface
Acknowledgements
About the Authors
Preliminaries
  Fault Classification
  Types of Redundancy
  Basic Measures of Fault Tolerance
    Traditional Measures
    Network Measures
  Outline of This Book
  Further Reading
  References
Hardware Fault Tolerance
  The Rate of Hardware Failures
  Failure Rate, Reliability, and Mean Time to Failure
  Canonical and Resilient Structures
    Series and Parallel Systems
    Non-Series/Parallel Systems
    M-of-N Systems
    Voters
    Variations on N-Modular Redundancy
    Duplex Systems
  Other Reliability Evaluation Techniques
    Poisson Processes
    Markov Models
  Fault-Tolerance Processor-Level Techniques
    Watchdog Processor
    Simultaneous Multithreading for Fault Tolerance
  Byzantine Failures
    Byzantine Agreement with Message Authentication
  Further Reading
  Exercises
  References
Information Redundancy
  Coding
    Parity Codes
    Checksum
    M-of-N Codes
    Berger Code
    Cyclic Codes
    Arithmetic Codes
  Resilient Disk Systems
    RAID Level 1
    RAID Level 2
    RAID Level 3
    RAID Level 4
    RAID Level 5
    Modeling Correlated Failures
  Data Replication
    Voting: Non-Hierarchical Organization
    Voting: Hierarchical Organization
    Primary-Backup Approach
  Algorithm-Based Fault Tolerance
  Further Reading
  Exercises
  References
Fault-Tolerant Networks
  Measures of Resilience
    Graph-Theoretical Measures
    Computer Networks Measures
  Common Network Topologies and Their Resilience
    Multistage and Extra-Stage Networks
    Crossbar Networks
    Rectangular Mesh and Interstitial Mesh
    Hypercube Network
    Cube-Connected Cycles Networks
    Loop Networks
    Ad hoc Point-to-Point Networks
  Fault-Tolerant Routing
    Hypercube Fault-Tolerant Routing
    Origin-Based Routing in the Mesh
  Further Reading
  Exercises
  References
Software Fault Tolerance
  Acceptance Tests
  Single-Version Fault Tolerance
    Wrappers
    Software Rejuvenation
    Data Diversity
    Software Implemented Hardware Fault Tolerance (SIHFT)
  N-Version Programming
    Consistent Comparison Problem
    Version Independence
  Recovery Block Approach
    Basic Principles
    Success Probability Calculation
    Distributed Recovery Blocks
  Preconditions, Postconditions, and Assertions
  Exception-Handling
    Requirements from Exception-Handlers
    Basics of Exceptions and Exception-Handling
    Language Support
  Software Reliability Models
    Jelinski-Moranda Model
    Littlewood-Verrall Model
    Musa-Okumoto Model
    Model Selection and Parameter Estimation
  Fault-Tolerant Remote Procedure Calls
    Primary-Backup Approach
    The Circus Approach
  Further Reading
  Exercises
  References
Checkpointing
  What is Checkpointing?
    Why is Checkpointing Nontrivial?
  Checkpoint Level
  Optimal Checkpointing: An Analytical Model
    Time Between Checkpoints: A First-Order Approximation
    Optimal Checkpoint Placement
    Time Between Checkpoints: A More Accurate Model
    Reducing Overhead
    Reducing Latency
  Cache-Aided Rollback Error Recovery (CARER)
  Checkpointing in Distributed Systems
    The Domino Effect and Livelock
    A Coordinated Checkpointing Algorithm
    Time-Based Synchronization
    Diskless Checkpointing
    Message Logging
  Checkpointing in Shared-Memory Systems
    Bus-Based Coherence Protocol
    Directory-Based Protocol
  Checkpointing in Real-Time Systems
  Other Uses of Checkpointing
  Further Reading
  Exercises
  References
Case Studies
  NonStop Systems
    Architecture
    Maintenance and Repair Aids
    Software
    Modifications to the NonStop Architecture
  Stratus Systems
  Cassini Command and Data Subsystem
  IBM G5
  IBM Sysplex
  Itanium
  Further Reading
  References
Defect Tolerance in VLSI Circuits
  Manufacturing Defects and Circuit Faults
  Probability of Failure and Critical Area
  Basic Yield Models
    The Poisson and Compound Poisson Yield Models
    Variations on the Simple Yield Models
  Yield Enhancement Through Redundancy
    Yield Projection for Chips with Redundancy
    Memory Arrays with Redundancy
    Logic Integrated Circuits with Redundancy
    Modifying the Floorplan
  Further Reading
  Exercises
  References
Fault Detection in Cryptographic Systems
  Overview of Ciphers
    Symmetric Key Ciphers
    Public Key Ciphers
  Security Attacks Through Fault Injection
    Fault Attacks on Symmetric Key Ciphers
    Fault Attacks on Public (Asymmetric) Key Ciphers
  Countermeasures
    Spatial and Temporal Duplication
    Error-Detecting Codes
    Are These Countermeasures Sufficient?
    Final Comment
  Further Reading
  Exercises
  References
Simulation Techniques
  Writing a Simulation Program
  Parameter Estimation
    Point Versus Interval Estimation
    Method of Moments
    Method of Maximum Likelihood
    The Bayesian Approach to Parameter Estimation
    Confidence Intervals
  Variance Reduction Methods
    Antithetic Variables
    Using Control Variables
    Stratified Sampling
    Importance Sampling
  Random Number Generation
    Uniformly Distributed Random Number Generators
    Testing Uniform Random Number Generators
    Generating Other Distributions
  Fault Injection
    Types of Fault Injection Techniques
    Fault Injection Application and Tools
  Further Reading
  Exercises
  References
Subject Index


Israel Koren is Professor Emeritus of Electrical and Computer Engineering at the University of Massachusetts, Amherst. Previously, he held positions with the Technion – Israel Institute of Technology, Haifa; the University of California at Berkeley; the University of Southern California, Los Angeles; and the University of California, Santa Barbara. He has been a consultant to several companies, including Analog Devices, AMD, Digital Equipment Corp., IBM, Intel, and National Semiconductor. His research interests include fault-tolerant computing, cyber-physical systems, computer architecture, computer arithmetic, and secure cryptographic systems. He has over 300 publications in refereed journals and conferences and has served as general chair, program committee chair, and program committee member for numerous conferences.

C. Mani Krishna is Professor of Electrical and Computer Engineering at the University of Massachusetts, Amherst. He received a BTech in Electrical Engineering from the Indian Institute of Technology, Delhi, in 1979, an MS from the Rensselaer Polytechnic Institute in Troy, NY, in 1980, and a PhD in Electrical Engineering from the University of Michigan in 1984. Dr. Krishna's research interests are in the areas of cyber-physical systems, real-time and fault-tolerant computing, and distributed and networked systems. He has also been an editor of volumes of readings in performance evaluation and real-time systems, and of special issues on real-time systems of IEEE Computer and the Proceedings of the IEEE.