About the Authors |
|
xix | |
List of Figures |
|
xxi | |
List of Tables |
|
xxv | |
Series Editor's Foreword |
|
xxvii | |
Series Foreword Second Edition |
|
xxix | |
Series Foreword First Edition |
|
xxxi | |
Foreword First Edition |
|
xxxiii | |
Preface Second Edition |
|
xxxv | |
Preface First Edition |
|
xxxvii | |
Acknowledgments |
|
xli | |
Glossary |
|
xliii | |
Part I Reliability and Software Quality - It's a Matter of Survival |
|
1 | (36) |
|
1 The Need for a New Paradigm for Hardware Reliability and Software Quality |
|
|
3 | (12) |
|
1.1 Rapidly Shifting Challenges for Hardware Reliability and Software Quality |
|
|
3 | (2) |
|
1.2 Gaining Competitive Advantage |
|
|
5 | (1) |
|
1.3 Competing in the Next Decade - Winners Will Compete on Reliability |
|
|
5 | (1) |
|
1.4 Concurrent Engineering |
|
|
6 | (2) |
|
1.5 Reducing the Number of Engineering Change Orders at Product Release |
|
|
8 | (1) |
|
1.6 Time-to-Market Advantage |
|
|
9 | (1) |
|
1.7 Accelerating Product Development |
|
|
10 | (1) |
|
1.8 Identifying and Managing Risks |
|
|
11 | (1) |
|
1.9 ICM, a Process to Mitigate Risk |
|
|
11 | (1) |
|
1.10 Software Quality Overview |
|
|
12 | (1) |
|
|
13 | (1) |
|
|
13 | (2) |
|
2 Barriers to Implementing Hardware Reliability and Software Quality |
|
|
15 | (10) |
|
2.1 Lack of Understanding |
|
|
15 | (1) |
|
|
16 | (1) |
|
2.3 Implementing Change and Change Agents |
|
|
17 | (2) |
|
|
19 | (1) |
|
2.5 Perceived External Barriers |
|
|
20 | (1) |
|
2.6 Time to Gain Acceptance |
|
|
21 | (1) |
|
|
22 | (1) |
|
2.8 Barriers to Software Process Improvement |
|
|
23 | (2) |
|
3 Understanding Why Products Fail |
|
|
25 | (8) |
|
|
25 | (3) |
|
3.2 Parts Have Improved, Everyone Can Build Quality Products |
|
|
28 | (1) |
|
3.3 Hardware Reliability and Software Quality - The New Paradigm |
|
|
28 | (1) |
|
3.4 Reliability vs. Quality Escapes |
|
|
29 | (1) |
|
3.5 Why Software Quality Improvement Programs Are Unsuccessful |
|
|
30 | (1) |
|
|
31 | (2) |
|
4 Alternative Approaches to Implementing Reliability |
|
|
33 | (4) |
|
4.1 Hiring Consultants for HALT Testing |
|
|
33 | (1) |
|
4.2 Outsourcing Reliability Testing |
|
|
33 | (1) |
|
4.3 Using Consultants to Develop and Implement a Reliability Program |
|
|
34 | (1) |
|
4.4 Hiring Reliability Engineers |
|
|
34 | (3) |
Part II Unraveling the Mystery |
|
37 | (168) |
|
|
39 | (10) |
|
5.1 Six Phases of the Product Life Cycle |
|
|
39 | (2) |
|
|
41 | (4) |
|
5.2.1 Investigate the Risk |
|
|
41 | (1) |
|
5.2.2 Communicate the Risk |
|
|
41 | (3) |

5.2.3 Mitigate the Risk |


44 | (1) |
|
5.3 The ICM Process for a Small Company |
|
|
45 | (1) |
|
|
46 | (1) |
|
|
46 | (1) |
|
|
47 | (1) |
|
|
47 | (1) |
|
|
48 | (1) |
|
|
49 | (24) |
|
|
50 | (1) |
|
6.2 Mean Time between Failure |
|
|
51 | (2) |
|
6.2.1 Mean Time between Repair |
|
|
52 | (1) |
|
6.2.2 Mean Time between Maintenance (MTBM) |
|
|
52 | (1) |
|
6.2.3 Mean Time between Incidents (MTBI) |
|
|
52 | (1) |
|
6.2.4 Mean Time to Failure (MTTF) |
|
|
52 | (1) |
|
6.2.5 Mean Time to Repair (MTTR) |
|
|
52 | (1) |
|
6.2.6 Mean Time to Restore System (MTTRS) |
|
|
52 | (1) |
|
|
53 | (2) |
|
|
55 | (2) |
|
6.4.1 On-site Manufacturer Service Personnel |
|
|
56 | (1) |
|
6.4.2 Trained Customer Service Personnel |
|
|
56 | (1) |
|
6.4.3 Manufacturer Training for Customer Service Personnel |
|
|
56 | (1) |
|
6.4.4 Easy-to-Use Service Manuals |
|
|
56 | (1) |
|
6.4.5 Rapid Diagnosis Capability |
|
|
56 | (1) |
|
6.4.6 Repair and Spare Parts Availability |
|
|
57 | (1) |
|
6.4.7 Rapid Response to Customer Requests for Service |
|
|
57 | (1) |
|
6.4.8 Failure Data Tracking |
|
|
57 | (1) |
|
|
57 | (2) |
|
6.6 Reliability Demonstration Testing |
|
|
59 | (3) |
|
6.7 Maintenance and Availability |
|
|
62 | (7) |
|
6.7.1 Preventative Maintenance |
|
|
63 | (1) |
|
6.7.2 Predictive Maintenance |
|
|
64 | (1) |
|
6.7.3 Prognostics and Health Management (PHM) |
|
|
64 | (5) |
|
|
69 | (1) |
|
|
70 | (1) |
|
|
71 | (1) |
|
|
72 | (1) |
|
|
72 | (1) |
|
Reliability Demonstration |
|
|
72 | (1) |
|
Prognostics and Health Management |
|
|
72 | (1) |
|
|
73 | (28) |
|
|
73 | (1) |
|
|
74 | (12) |
|
7.2.1 The Functional Block Diagram (FBD) |
|
|
74 | (4) |
|
7.2.1.1 Generating the Functional Block Diagram |
|
|
75 | (1) |
|
7.2.1.2 Filling in the Functional Block Diagram |
|
|
76 | (2) |
|
7.2.2 The Fault Tree Analysis |
|
|
78 | (2) |
|
7.2.2.1 Building the Fault Tree |
|
|
78 | (1) |
|
|
79 | (1) |
|
7.2.3 Failure Modes and Effects Analysis Spreadsheet |
|
|
80 | (6) |
|
7.3 Preparing for the FMEA |
|
|
86 | (3) |
|
7.4 Barriers to the FMEA Process |
|
|
89 | (2) |
|
|
91 | (1) |
|
7.6 Using Macros to Improve FMEA Efficiency and Effectiveness |
|
|
92 | (2) |
|
|
94 | (3) |
|
7.8 Software Fault Tree Analysis (SFTA) |
|
|
97 | (1) |
|
|
97 | (2) |
|
|
99 | (2) |
|
8 The Reliability Toolbox |
|
|
101 | (38) |
|
|
101 | (20) |
|
8.1.1 Types of Stresses Applied in HALT |
|
|
104 | (1) |
|
8.1.2 The Theory behind the HALT Process |
|
|
105 | (4) |
|
8.1.3 HALT Testing Liquid Cooled Products |
|
|
109 | (1) |
|
8.1.4 Planning for HALT Testing |
|
|
110 | (11) |
|
8.2 Highly Accelerated Stress Screening (HASS) |
|
|
121 | (6) |
|
8.2.1 Proof of Screen (POS) |
|
|
122 | (1) |
|
|
123 | (1) |
|
8.2.3 Environmental Stress Screening (ESS) |
|
|
124 | (1) |
|
8.2.4 Economic Impact of HASS |
|
|
125 | (1) |
|
|
126 | (1) |
|
8.3 HALT and HASS Test Chambers |
|
|
127 | (1) |
|
8.4 Accelerated Reliability Growth (ARG) |
|
|
128 | (3) |
|
8.5 Accelerated Early Life Test (ELT) |
|
|
131 | (1) |
|
|
132 | (1) |
|
|
132 | (2) |
|
|
134 | (1) |
|
|
134 | (1) |
|
|
134 | (1) |
|
|
135 | (1) |
|
|
136 | (1) |
|
|
136 | (1) |
|
|
136 | (1) |
|
|
137 | (1) |
|
|
137 | (2) |
|
9 Software Quality Goals and Metrics |
|
|
139 | (12) |
|
9.1 Setting Software Quality Goals |
|
|
139 | (1) |
|
|
140 | (2) |
|
|
142 | (1) |
|
|
142 | (2) |
|
|
144 | (1) |
|
|
145 | (2) |
|
|
147 | (1) |
|
|
148 | (1) |
|
|
149 | (1) |
|
|
150 | (1) |
|
10 Software Quality Analysis Techniques |
|
|
151 | (8) |
|
|
151 | (1) |
|
|
151 | (1) |
|
10.3 Cause and Effect Diagrams |
|
|
152 | (1) |
|
|
153 | (1) |
|
10.5 Defect Prevention, Defect Detection, and Defensive Programming |
|
|
154 | (3) |
|
|
157 | (1) |
|
|
158 | (1) |
|
|
158 | (1) |
|
|
159 | (8) |
|
|
159 | (2) |
|
|
161 | (1) |
|
|
162 | (3) |
|
11.4 How to Choose a Software Life Cycle |
|
|
165 | (1) |
|
|
166 | (1) |
|
|
166 | (1) |
|
12 Software Procedures and Techniques |
|
|
167 | (16) |
|
12.1 Gathering Requirements |
|
|
167 | (2) |
|
12.2 Documenting Requirements |
|
|
169 | (3) |
|
|
172 | (1) |
|
|
173 | (1) |
|
12.5 Reviews and Inspections |
|
|
174 | (5) |
|
|
179 | (1) |
|
|
179 | (1) |
|
12.8 Software and Hardware Integration |
|
|
180 | (2) |
|
|
182 | (1) |
|
|
182 | (1) |
|
13 Why Hardware Reliability and Software Quality Improvement Efforts Fail |
|
|
183 | (16) |
|
13.1 Lack of Commitment to the Reliability Process |
|
|
183 | (2) |
|
13.2 Inability to Embrace and Mitigate Technologies Risk Issues |
|
|
185 | (1) |
|
13.3 Choosing the Wrong People for the Job |
|
|
186 | (1) |
|
|
186 | (5) |
|
13.5 Inadequate Resources |
|
|
191 | (1) |
|
13.6 MIL-HDBK 217 - Why It Is Obsolete |
|
|
192 | (3) |
|
13.7 Finding But Not Fixing Problems |
|
|
195 | (1) |
|
|
196 | (1) |
|
13.9 Vibration Testing Too Difficult to Implement |
|
|
196 | (1) |
|
13.10 The Impact of Late Hardware or Late Software Delivery |
|
|
196 | (1) |
|
13.11 Supplier Reliability |
|
|
196 | (1) |
|
|
197 | (1) |
|
|
197 | (2) |
|
|
199 | (6) |
|
14.1 Purchasing Interface |
|
|
199 | (1) |
|
14.2 Identifying Your Critical Suppliers |
|
|
200 | (1) |
|
14.3 Develop a Thorough Supplier Audit Process |
|
|
200 | (1) |
|
14.4 Develop Rapid Nonconformance Feedback |
|
|
201 | (1) |
|
14.5 Develop a Materials Review Board (MRB) |
|
|
202 | (1) |
|
14.6 Counterfeit Parts and Materials |
|
|
202 | (3) |
Part III Steps to Successful Implementation |
|
205 | (40) |
|
15 Establishing a Reliability Lab |
|
|
207 | (14) |
|
15.1 Staffing for Reliability |
|
|
207 | (1) |
|
|
208 | (2) |
|
15.3 Facility Requirements |
|
|
210 | (1) |
|
15.4 Liquid Nitrogen Requirements |
|
|
210 | (1) |
|
15.5 Air Compressor Requirements |
|
|
211 | (1) |
|
15.6 Selecting a Reliability Lab Location |
|
|
212 | (1) |
|
15.7 Selecting a HALT Test Chamber |
|
|
213 | (7) |
|
|
214 | (1) |
|
15.7.2 Machine Overall Height |
|
|
214 | (2) |
|
15.7.3 Power Required and Consumption |
|
|
216 | (1) |
|
15.7.4 Acceptable Operational Noise Levels |
|
|
216 | (1) |
|
|
216 | (1) |
|
|
217 | (1) |
|
15.7.7 Profile Creation, Editing, and Storage |
|
|
217 | (1) |
|
15.7.8 Temperature Rates of Change |
|
|
217 | (1) |
|
15.7.9 Built-In Test Instrumentation |
|
|
217 | (1) |
|
|
217 | (1) |
|
15.7.11 Time from Order to Delivery |
|
|
217 | (1) |
|
|
218 | (1) |
|
15.7.13 Technical/Service Support |
|
|
218 | (1) |
|
15.7.14 Compressed Air Requirements |
|
|
218 | (1) |
|
|
218 | (1) |
|
|
218 | (2) |
|
|
220 | (1) |
|
16 Hiring and Staffing the Right People |
|
|
221 | (8) |
|
16.1 Staffing for Reliability |
|
|
221 | (4) |
|
16.1.1 A Reliability Engineering Background |
|
|
221 | (1) |
|
|
221 | (2) |
|
16.1.3 Shock and Vibration Testing |
|
|
223 | (1) |
|
16.1.4 Statistical Analysis |
|
|
223 | (1) |
|
16.1.5 Failure Budgeting/Estimating |
|
|
223 | (1) |
|
|
224 | (1) |
|
16.1.7 Conducting Reliability Training |
|
|
224 | (1) |
|
16.1.8 Persuasive in Implementing New Concepts |
|
|
224 | (1) |
|
16.1.9 A Degree in Engineering and/or Physics |
|
|
225 | (1) |
|
16.2 Staffing for Software Engineers |
|
|
225 | (1) |
|
16.3 Choosing the Wrong People for the Job |
|
|
226 | (3) |
|
17 Implementing the Reliability Process |
|
|
229 | (16) |
|
17.1 Reliability Is Everyone's Job |
|
|
229 | (1) |
|
17.2 Formalizing the Reliability Process |
|
|
230 | (1) |
|
17.3 Implementing the Reliability Process |
|
|
231 | (1) |
|
17.4 Rolling Out the Reliability Process |
|
|
231 | (4) |
|
17.5 Developing a Reliability Culture |
|
|
235 | (1) |
|
17.6 Setting Reliability Goals |
|
|
236 | (1) |
|
|
237 | (1) |
|
17.8 Product Life Cycle Defined |
|
|
238 | (3) |
|
|
239 | (1) |
|
|
240 | (1) |
|
|
241 | (1) |
|
17.8.4 End-of-Life and Obsolescence Phase |
|
|
241 | (1) |
|
17.9 Proactive and Reactive Reliability Activities |
|
|
241 | (3) |
|
|
244 | (1) |
|
|
244 | (1) |
Part IV Reliability and Quality Process for Product Development |
|
245 | (124) |
|
|
247 | (10) |
|
18.1 Reliability Activities in the Product Concept Phase |
|
|
247 | (1) |
|
18.2 Establish the Reliability Organization |
|
|
248 | (1) |
|
18.3 Define the Reliability Process |
|
|
249 | (1) |
|
18.4 Define the Product Reliability Requirements |
|
|
249 | (1) |
|
18.5 Capture and Apply Lessons Learned |
|
|
249 | (3) |
|
|
252 | (3) |
|
18.6.1 Filling Out the Risk Mitigation Form |
|
|
253 | (6) |
|
18.6.1.1 Identify and Analyze Risk |
|
|
253 | (1) |
|
|
254 | (1) |
|
18.6.1.3 Date Risk Is Identified |
|
|
254 | (1) |
|
|
254 | (1) |
|
18.6.1.5 High-Level Mitigation Plan |
|
|
254 | (1) |
|
18.6.1.6 Resources Required |
|
|
254 | (1) |
|
|
255 | (1) |
|
|
255 | (1) |
|
18.6.1.9 Investigate Alternative Solutions |
|
|
255 | (1) |
|
18.6.2 Risk Mitigation Meeting |
|
|
255 | (2) |
|
|
257 | (16) |
|
19.1 Reliability Activities in the Design Concept Phase |
|
|
257 | (2) |
|
19.2 Set Reliability Requirements and Budgets |
|
|
259 | (4) |
|
19.2.1 Requirements for Product Use Environment |
|
|
259 | (1) |
|
19.2.2 Product Useful Life Requirements |
|
|
260 | (1) |
|
19.2.3 Subsystem and Printed Circuit Board Assembly (PCBA) Reliability Budgets |
|
|
261 | (2) |
|
19.2.4 Service and Repair Requirements |
|
|
263 | (1) |
|
19.3 Define Reliability Design Guidelines |
|
|
263 | (1) |
|
19.4 Revise Risk Mitigation |
|
|
264 | (4) |
|
19.4.1 Identifying Risk Issues |
|
|
264 | (1) |
|
19.4.2 Reflecting Back (Capturing Internal Lessons Learned) |
|
|
265 | (1) |
|
19.4.3 Looking Forward (Capturing New Risk Issues) |
|
|
265 | (3) |
|
19.5 Schedule Reliability Activities and Capital Budgets |
|
|
268 | (1) |
|
19.6 Decide Risk Mitigation Sign-off Day |
|
|
269 | (2) |
|
19.7 Reflect on What Worked Well |
|
|
271 | (2) |
|
|
273 | (26) |
|
20.1 Product Design Phase |
|
|
273 | (1) |
|
20.2 Reliability Estimates |
|
|
274 | (2) |
|
20.3 Implementing Risk Mitigation Plans |
|
|
276 | (9) |
|
20.3.1 Mitigating Risk Issues Captured Reflecting Back |
|
|
276 | (2) |
|
20.3.1.1 Design Out (or Use an Alternate Part/Supplier) |
|
|
276 | (1) |
|
20.3.1.2 Change Use Conditions |
|
|
277 | (1) |
|
|
278 | (1) |
|
|
278 | (1) |
|
20.3.2 Mitigating Risk Issues Captured Looking Forward |
|
|
278 | (7) |
|
20.3.2.1 Accelerated Life Testing |
|
|
280 | (4) |
|
20.3.2.2 Risk Mitigation Progress |
|
|
284 | (1) |
|
20.4 Design for Reliability Guidelines (DFR) |
|
|
285 | (4) |
|
20.4.1 Derating Guidelines |
|
|
288 | (1) |
|
|
289 | (1) |
|
20.6 Installing a Failure Reporting, Analysis, and Corrective Action System |
|
|
290 | (1) |
|
|
291 | (1) |
|
20.8 HALT Test Development |
|
|
292 | (3) |
|
20.9 Risk Mitigation Meeting |
|
|
295 | (1) |
|
|
296 | (1) |
|
|
296 | (1) |
|
|
296 | (3) |
|
21 Design Validation Phase |
|
|
299 | (22) |
|
|
299 | (2) |
|
21.2 Using HALT to Precipitate Failures |
|
|
301 | (12) |
|
21.2.1 Starting the HALT Test |
|
|
304 | (2) |
|
|
306 | (1) |
|
21.2.3 Tickle Vibration Test |
|
|
306 | (1) |
|
21.2.4 Temperature Step Stress Test and Power Cycling |
|
|
306 | (2) |
|
21.2.5 Vibration Step Stress Test |
|
|
308 | (1) |
|
21.2.6 Combinational Temperature and Vibration Test |
|
|
308 | (1) |
|
21.2.7 Rapid Thermal Cycling Stress Test |
|
|
309 | (1) |
|
21.2.8 Slow Temperature Ramp |
|
|
310 | (1) |
|
21.2.9 Combinational Search Pattern Test |
|
|
311 | (1) |
|
21.2.10 Additional Nonenvironmental Stress Tests |
|
|
312 | (1) |
|
21.2.11 HALT Validation Test |
|
|
312 | (1) |
|
21.3 Proof of Screen (POS) |
|
|
313 | (2) |
|
21.4 Highly Accelerated Stress Screen (HASS) |
|
|
315 | (1) |
|
|
315 | (2) |
|
|
317 | (1) |
|
21.7 Closure of Risk Issues |
|
|
317 | (1) |
|
|
318 | (1) |
|
|
318 | (1) |
|
|
318 | (1) |
|
|
318 | (1) |
|
|
319 | (2) |
|
22 Software Testing and Debugging |
|
|
321 | (14) |
|
|
321 | (2) |
|
|
323 | (1) |
|
|
324 | (1) |
|
|
324 | (2) |
|
|
326 | (1) |
|
22.6 Guidelines for Creating Test Cases |
|
|
327 | (1) |
|
|
328 | (1) |
|
22.8 Defect Isolation Techniques |
|
|
329 | (2) |
|
|
329 | (2) |
|
22.9 Instrumentation and Logging |
|
|
331 | (3) |
|
|
334 | (1) |
|
23 Applying Software Quality Procedures |
|
|
335 | (6) |
|
23.1 Using Defect Model to Create Defect Run Chart |
|
|
336 | (1) |
|
23.2 Using Defect Run Chart to Know When You Have Achieved the Quality Target |
|
|
336 | (2) |
|
23.3 Using Root Cause Analysis on Defects to Improve Organizational Quality Delivery |
|
|
338 | (1) |
|
23.4 Continuous Integration and Test |
|
|
338 | (1) |
|
|
339 | (2) |
|
|
341 | (18) |
|
24.1 Accelerating Design Maturity |
|
|
341 | (5) |
|
24.1.1 Product Improvement Tools |
|
|
343 | (3) |
|
|
344 | (1) |
|
24.1.1.2 Design Issue Tracking |
|
|
345 | (1) |
|
|
346 | (5) |
|
24.2.1 Accelerated Reliability Growth (ARG) |
|
|
349 | (1) |
|
24.2.2 Accelerated Early Life Testing (ELT) |
|
|
350 | (1) |
|
24.3 Design and Process FMEA |
|
|
351 | (4) |
|
24.3.1 Quality Control Tools |
|
|
351 | (4) |
|
|
352 | (2) |
|
|
354 | (1) |
|
|
355 | (1) |
|
|
355 | (1) |
|
|
355 | (1) |
|
|
356 | (1) |
|
|
356 | (1) |
|
|
357 | (1) |
|
|
357 | (2) |
|
|
359 | (4) |
|
25.1 Managing Obsolescence |
|
|
359 | (1) |
|
|
360 | (1) |
|
|
360 | (1) |
|
|
361 | (2) |
|
|
363 | (6) |
|
26.1 Design for Ease of Access |
|
|
363 | (1) |
|
26.2 Identify High Replacement Assemblies (FRUs) |
|
|
363 | (2) |
|
|
365 | (1) |
|
26.4 Preemptive Servicing |
|
|
365 | (1) |
|
|
365 | (1) |
|
|
366 | (1) |
|
26.7 Availability or Repair Time Turnaround |
|
|
367 | (1) |
|
26.8 Avoid System Failure Through Redundancy |
|
|
367 | (1) |
|
26.9 Random versus Wearout Failures |
|
|
367 | (1) |
|
|
368 | (1) |
Appendix A |
|
369 | (18) |
|
A.1 Reliability Consultants |
|
|
369 | (3) |
|
A.2 Graduate Reliability Engineering Programs and Reliability Certification Programs |
|
|
372 | (4) |
|
A.3 Reliability Professional Organizations and Societies |
|
|
376 | (1) |
|
A.4 Reliability Training Classes |
|
|
377 | (2) |
|
A.5 Environmental Testing Services |
|
|
379 | (2) |
|
|
381 | (1) |
|
|
382 | (1) |
|
|
383 | (1) |
|
A.9 Reliability Seminars and Conferences |
|
|
384 | (2) |
|
A.10 Reliability Journals |
|
|
386 | (1) |
Appendix B |
|
387 | (12) |
|
B.1 MTBF, FIT, and PPM Conversions |
|
|
387 | (1) |
|
B.2 Mean Time Between Failure (MTBF) |
|
|
387 | (9) |
|
B.3 Estimating Field Failures |
|
|
396 | (3) |
|
B.3.1 Comparing Repairable to Nonrepairable Systems |
|
|
397 | (2) |
Index |
|
399 | |