E-book: High Availability IT Services

3.00/5 (2 ratings on Goodreads)

Terry Critchley (IT Consultant, Manchester, UK)

Format: 537 pages
Publication date: 17-Dec-2014
Publisher: Apple Academic Press Inc.
Language: English
ISBN-13: 9781040177761

Format: EPUB+DRM
Price: 74.09 €*
* The price is final, i.e. no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has supplied this e-book in encrypted form, which means that you need to install special software to read it. You also need to create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

This book starts with the basic premise that a service comprises the 3Ps: products, processes, and people. Moreover, these entities and their sub-entities interlink to support the services that end users require to run and support a business. This widens the scope of any availability design far beyond hardware and software. It also increases the potential for service failure for reasons beyond just hardware and software: the concept of logical outages.

High Availability IT Services details the considerations for designing and running highly available "services" and not just the systems infrastructure that supports those services. Providing an overview of virtualization and cloud computing, it supplies a detailed look at availability, redundancy, fault tolerance, and security. It also stresses the importance of human factors.

The book starts off by providing an availability primer and detailing the reasons why you need to be concerned with high availability. Next, it outlines the theory of reliability and availability and the elements of actual practices in this high availability (HA) area, including Service Level Agreements (SLAs) and Change Management.

Examining what the major hardware and software vendors have to offer in the HA world, the book considers the ubiquitous world of clouds and virtualization as well as the availability considerations they present.

The book examines high availability concepts and architectures such as reliability, availability, and serviceability (RAS); clusters; grids; and redundant arrays of independent disks (RAID) storage. It also covers the role of security in providing high availability, cluster offerings, emergent Linux clusters, online transaction processing (OLTP), and relational databases.

Foreword

xxv

Preface

xxvii

Acknowledgments

xxxiii

Author

xxxv

SECTION I AN AVAILABILITY PRIMER

1 Preamble: A View from 30,000 Feet

(10)

Do You Know...?

(1)

Availability in Perspective

(6)

Murphy's Law of Availability

(1)

Availability Drivers in Flux: What Percentage of Business Is Critical?

(2)

Historical View of Availability: The First 7 × 24 Requirements?

(2)

Historical Availability Scenarios

(1)

Planar Technology

(1)

Power-On Self-Test

(1)

Other Diagnostics

(1)

Component Repair

(1)

In-Flight Diagnostics

(1)

Summary

(3)

2 Reliability and Availability

(26)

Introduction to Reliability, Availability, and Serviceability

(5)

RAS Moves Beyond Hardware

(1)

Availability: An Overview

(1)

Some Definitions

(1)

Quantitative Availability

(1)

Availability: 7 R's (SNIA)

(2)

Availability and Change

(4)

Change All around Us

(1)

Software: Effect of Change

(1)

Operations: Effect of Change

(1)

Monitoring and Change

(2)

Automation: The Solution?

(2)

Data Center Automation

(1)

Network Change/Configuration Automation

(1)

Automation Vendors

(1)

Types of Availability

(4)

Binary Availability

(1)

Duke of York Availability

(1)

Hierarchy of Failures

(1)

Hierarchy Example

(1)

State Parameters

(1)

Types of Nonavailability (Outages)

(3)

Logical Outage Examples

(2)

Summary

(1)

Planning for Availability and Recovery

(2)

Why Bother?

(1)

What Is a Business Continuity Plan?

(1)

What Is a BIA?

(1)

What Is DR?

(1)

Relationships: BC, BIA, and DR

(1)

Recovery Logistics

(1)

Business Continuity

(1)

Downtime: Who or What Is to Blame?

(1)

Elements of Failure: Interaction of the Wares

(2)

Summary

(1)

DR/BC Source Documents

(2)

3 Reliability: Background and Basics

(14)

Introduction

(1)

IT Structure---Schematic

(1)

IT Structure---Hardware Overview

(2)

Service Level Agreements

(1)

Service Level Agreements: The Dawn of Realism

(1)

What Is an SLA?

(1)

Why Is an SLA Important?

(1)

Service Life Cycle

(2)

Concept of User Service

(1)

Elements of Service Management

(4)

Introduction

(1)

Scope of Service Management

(1)

User Support

(1)

Operations Support

(1)

Systems Management

(1)

Service Management Hierarchy

(1)

The Effective Service

(1)

Services versus Systems

(1)

Availability Concepts

(3)

First Dip in the Water

(1)

Availability Parameters

(2)

Summary

(1)

4 What Is High Availability?

(52)

IDC and Availability

(1)

Availability Classification

(11)

Availability: Outage Analogy

(1)

A Recovery Analogy

(1)

Availability: Redundancy

(1)

Availability: Fault Tolerance

(1)

Sample List of Availability Requirements

(1)

System Architecture

(1)

Availability: Single Node

(1)

Dynamic Reconfiguration/Hot Repair of System Components

(1)

Disaster Backup and Recovery

(1)

System Administration Facilities

(1)

HA Costs Money, So Why Bother?

(1)

Cost Impact Analysis

(1)

HA: Cost versus Benefit

(1)

Penalty for Nonavailability

(1)

Organizations: Attitude toward HA

(1)

Aberdeen Group Study: February 2012

(1)

Outage Loss Factors (Percentage of Loss)

(1)

Software Failure Costs

(2)

Assessing the Cost of HA

(1)

Performance and Availability

(1)

HA Design: Top 10 Mistakes

(1)

The Development of HA

(4)

Servers

(2)

Systems and Subsystems Development

(1)

Production Clusters

(2)

Availability Architectures

(3)

RAS Features

(1)

Hot-Plug Hardware

(1)

Processors

(1)

Memory

(1)

Input/Output

(1)

Storage

(1)

Power/Cooling

(1)

Fault Tolerance

(1)

Outline of Server Domain Architecture

(2)

Introduction

(1)

Domain/LPAR Structure

(1)

Outline of Cluster Architecture

(1)

Cluster Configurations: Commercial Cluster

(1)

Cluster Components

(3)

Hardware

(1)

Software

(1)

Commercial LB

(1)

Commercial Performance

(1)

Commercial HA

(1)

HPC Clusters

(5)

Generic HPC Cluster

(1)

HPC Cluster: Oscar Configuration

(1)

HPC Cluster: Availability

(1)

HPC Cluster: Applications

(1)

HA in Scientific Computing

(1)

Topics in HPC Reliability: Summary

(1)

Errors in Cluster HA Design

(1)

Outline of Grid Computing

(1)

Grid Availability

(1)

Commercial Grid Computing

(1)

Outline of RAID Architecture

(11)

Origins of RAID

(1)

RAID Architecture and Levels

(1)

Hardware

(1)

Software

(1)

Hardware versus Software RAID

(1)

RAID Striping: Fundamental to RAID

(1)

RAID Configurations

(1)

RAID Components

(1)

ECC

(1)

Parity

(1)

RAID Level 0

(1)

RAID Level 1

(1)

RAID Level 3

(1)

RAID Level 5

(1)

RAID Level 6

(1)

RAID Level 10

(1)

RAID 0 + 1 Schematic

(1)

RAID 10 Schematic

(1)

RAID Level 30

(1)

RAID Level 50

(1)

RAID Level 51

(1)

RAID Level 60

(1)

RAID Level 100

(1)

Less Relevant RAIDs

(1)

RAID Level 2

(1)

RAID Level 4

(1)

RAID Level 7

(1)

Standard RAID Storage Efficiency

(1)

SSDs and RAID

(1)

SSD Longevity

(1)

Hybrid RAID: SSD and HDD

(1)

SSD References

(1)

Post-RAID Environment

(3)

Big Data: The Issue

(1)

Data Loss Overview

(1)

Big Data: Solutions?

(1)

Non-RAID RAID

(1)

Erasure Codes

(4)

RAID Successor Qualifications

(1)

EC Overview

(1)

EC Recovery Scope

(1)

Self-Healing Storage

100

(1)

Summary

101

(4)

SECTION II AVAILABILITY THEORY AND PRACTICE

5 High Availability: Theory

105

(36)

Some Math

105

(4)

Guide to Reliability Graphs

105

(1)

Probability Density Function

105

(2)

Cumulative Distribution Function

107

(1)

Availability Probabilities

107

(1)

Lusser's Law

108

(1)

Availability Concepts

109

(2)

Hardware Reliability: The Bathtub Curve

109

(1)

Software Reliability: The Bathtub Curve

110

(1)

Simple Math of Availability

111

(13)

Availability

111

(1)

Nonavailability

112

(1)

Mean Time between Failures

112

(1)

Mean Time to Repair

112

(1)

Online Availability Tool

113

(1)

Availability Equation I: Time Factors in an Outage

114

(2)

Availability Equation II

116

(1)

Effect of Redundant Blocks on Availability

117

(1)

Parallel (Redundant) Components

118

(1)

Two Parallel Blocks: Example

118

(1)

Combinations of Series and Parallel Blocks

119

(1)

Complex Systems

120

(1)

System Failure Combinations

120

(1)

Complex Systems Solution Methods

121

(1)

Real-Life Example: Cisco Network Configuration

121

(1)

Configuration A

121

(1)

Configuration B

122

(1)

Summary of Block Considerations

123

(1)

Sample Availability Calculations versus Costs

124

(1)

Calculation 1 Server Is 99% Available

124

(1)

Calculation 2 Server Is 99.99% Available

124

(1)

Availability: MTBFs and Failure Rate

124

(15)

Availability Factors

125

(1)

Planned versus Unplanned Outages

125

(1)

Planned Downtime: Planned Downtime Breakdown

126

(2)

Unplanned Downtime

128

(1)

Security: The New Downtime

128

(1)

Disasters: Breakdown of Causes

128

(1)

Power: Downtime Causes

129

(1)

Power Issues Addenda

129

(1)

So What?

130

(1)

External Electromagnetic Radiation Addendum

131

(1)

Power: Recovery Timescales for Uninterruptible Power Supply

131

(1)

Causes of Data Loss

132

(1)

Pandemics? Disaster Waiting to Happen?

133

(1)

Disasters: Learning the Hard Way

133

(1)

Other Downtime Gotchas

133

(2)

Downtime Gotchas: Survey Paper

135

(1)

Downtime Reduction Initiatives

135

(1)

Low Impact Outages

135

(1)

Availability: A Lesson in Design

136

(1)

Availability: Humor in an Outage---Part I

137

(1)

Availability: Humor in an Outage---Part II

137

(1)

So What?

137

(1)

Application Nonavailability

137

(1)

Traditional Outage Reasons

138

(1)

Modern Outage Reasons

138

(1)

Summary

139

(2)

6 High Availability: Practice

141

(56)

Central Site

141

(1)

Service Domain Concept

141

(3)

Sample Domain Architecture

143

(1)

Planning for Availability---Starting Point

144

(1)

The HA Design Spectrum

145

(18)

Availability by Systems Design/Modification

145

(1)

Availability by Engineering Design

145

(1)

Self-Healing Hardware and Software

145

(1)

Self-Healing and Other Items

146

(1)

Availability by Application Design: Poor Application Design

147

(1)

Conventional Programs

147

(1)

Web Applications

147

(2)

Availability by Configuration

149

(1)

Hardware

149

(1)

Data

150

(1)

Networks

150

(1)

Operating System

150

(1)

Environment

150

(1)

Availability by Outside Consultancy

151

(1)

Availability by Vendor Support

151

(1)

Availability by Proactive Monitoring

151

(1)

Availability by Technical Support Excellence

152

(1)

Availability by Operations Excellence

152

(1)

First Class Runbook

153

(1)

Software Level Issues

153

(1)

System Time

154

(1)

Performance and Capacity

154

(1)

Data Center Efficiency

154

(1)

Availability by Retrospective Analysis

154

(1)

Availability by Application Monitoring

155

(1)

Availability by Automation

155

(1)

Availability by Reactive Recovery

156

(1)

Availability by Partnerships

157

(1)

Availability by Change Management

158

(1)

Availability by Performance/Capacity Management

158

(1)

Availability by Monitoring

159

(1)

Availability by Cleanliness

159

(1)

Availability by Anticipation

159

(1)

Predictive Maintenance

159

(1)

Availability by Teamwork

160

(1)

Availability by Organization

160

(1)

Availability by Eternal Vigilance

161

(1)

Availability by Location

162

(1)

A Word on Documentation

162

(1)

Network Reliability/Availability

163

(6)

Protocols and Redundancy

163

(1)

Network Types

164

(1)

Network Outages

164

(1)

Network Design for Availability

165

(1)

Network Security

166

(1)

File Transfer Reliability

167

(2)

Network DR

169

(1)

Software Reliability

169

(6)

Software Quality

169

(1)

Software: Output Verification

170

(1)

Example 1

171

(1)

Example 2

171

(1)

Example 3

171

(1)

Software Reliability: Problem Flow

171

(1)

Software Testing Steps

172

(1)

Software Documentation

173

(1)

Software Testing Model

173

(2)

Software Reliability---Models

175

(10)

The Software Scenario

175

(1)

SRE Models

175

(1)

Model Entities

176

(1)

SRE Models: Shape Characterization

177

(1)

SRE Models: Time-Based versus Defect-Based

178

(1)

Software Reliability Growth Model

178

(2)

Software Reliability Model: Defect Count

180

(1)

Software Reliability: IEEE Standard 1633-2008

181

(1)

Software Reliability: Hardening

182

(1)

Software Reliability: Installation

182

(1)

Software Reliability: Version Control

183

(1)

Software: Penetration Testing

183

(1)

Software: Fault Tolerance

184

(1)

Software Error Classification

185

(1)

Heisenbug

185

(1)

Bohrbug

185

(1)

Reliability Properties of Software

186

(1)

ACID Properties

186

(1)

Two-Phase Commit

186

(1)

Software Reliability: Current Status

187

(1)

Software Reliability: Assessment Questions

188

(1)

Software Universe and Summary

188

(1)

Subsystem Reliability

189

(5)

Hardware Outside the Server

189

(1)

Disk Subsystem Reliability

190

(1)

Disk Subsystem RAS

190

(1)

Tape Reliability/RAS

191

(1)

Availability: Other Peripherals

192

(1)

Attention to Detail

193

(1)

Liveware Reliability

193

(1)

Summary

194

(3)

Be Prepared for Big Brother!

195

(2)

7 High Availability: SLAs, Management, and Methods

197

(52)

Introduction

197

(1)

Preliminary Activities

198

(2)

Pre-Production Activities

198

(1)

BC Plan

199

(1)

BC: Best Practice

199

(1)

Management Disciplines

200

(1)

Service Level Agreements

201

(8)

SLA Introduction

201

(1)

SLA: Availability and QoS

201

(1)

Elements of SLAs

201

(2)

Types of SLAs

203

(1)

Potential Business Benefits of SLAs

203

(1)

Potential IT Benefits of SLAs

204

(1)

IT Service Delivery

204

(1)

SLA: Structure and Samples

205

(1)

SLA: How Do We Quantify Availability?

206

(1)

SLA: Reporting of Availability

206

(1)

Reneging on SLAs

207

(2)

HA Management: The Project

209

(14)

Start-Up and Design Phase

209

(1)

The Management Flow

210

(1)

The Management Framework

210

(1)

Project Definition Workshop

210

(2)

Outline of the PDW

212

(1)

PDW Method Overview

212

(1)

Project Initiation Document

213

(1)

PID Structure and Purpose

213

(2)

Multistage PDW

215

(1)

Delphi Techniques and Intensive Planning

215

(1)

Delphi Technique

215

(1)

Delphi: The Steps

216

(1)

Intensive Planning

217

(1)

FMEA Process

217

(1)

FMEA: An Analogy

218

(1)

FMEA: The Steps

218

(1)

FMECA = FMEA + Criticality

219

(1)

Risk Evaluation and Priority: Risk Evaluation Methods

219

(1)

Component Failure Impact Analysis

220

(1)

CFIA Development---A Walkthrough and Risk Analysis

220

(1)

CFIA Table: Schematic

221

(1)

Quantitative CFIA

222

(1)

CFIA: Other Factors

222

(1)

Management of Operations Phase

223

(2)

Failure Reporting and Corrective Action System

223

(1)

Introduction

223

(1)

FRACAS: Steps for Handling Failures

223

(2)

HA Operations: Supporting Disciplines

225

(14)

War Room

225

(1)

War Room Location

225

(1)

Documentation

225

(1)

Change/Configuration Management

226

(1)

Change Management and Control: Best Practice

226

(1)

Change Operations

227

(1)

Patch Management

228

(1)

Performance Management

229

(1)

Introduction

229

(1)

Overview

229

(1)

Security Management

230

(1)

Security: Threats or Posturing?

230

(1)

Security: Best Practice

231

(1)

Problem Determination

231

(1)

Problems: Short Term

232

(1)

Problems: After the Event

232

(1)

Event Management

233

(1)

Fault Management

233

(1)

Faults and What to Do about Them

233

(1)

System Failure: The Response Stages

234

(1)

HA Plan B: What's That?

235

(1)

Plan B: Example I

235

(1)

Plan B: Example II

235

(1)

What? IT Problem Recovery without IT?

235

(1)

Faults and What Not to Do

236

(1)

Outages: Areas for Inaction

236

(1)

Problem Management

237

(1)

Managing Problems

237

(1)

Problems: Best Practice

237

(1)

Help Desk Architecture and Implementation

238

(1)

Escalation Management

238

(1)

Resource Management

238

(1)

Service Monitors

239

(6)

Availability Measurement

239

(1)

Monitor Layers

240

(1)

System Resource Monitors

241

(1)

Synthetic Workload: Generic Requirements

241

(1)

Availability Monitors

242

(1)

General EUE Tools

243

(1)

Availability Benchmarks

243

(1)

Availability: Related Monitors

244

(1)

Disaster Recovery

244

(1)

The Viewpoint Approach to Documentation

245

(1)

Summary

245

(4)

SECTION III VENDORS AND HIGH AVAILABILITY

8 High Availability: Vendor Products

249

(18)

IBM Availability and Reliability

250

(4)

IBM Hardware

250

(1)

Virtualization

251

(1)

IBM PowerVM

251

(1)

IBM Series x

251

(1)

IBM Clusters

251

(1)

Z Series Parallel Sysplex

251

(1)

Sysplex Structure and Purpose

252

(1)

Parallel Sysplex Schematic

252

(1)

IBM: High Availability Services

253

(1)

IBM Future Series/System

253

(1)

Oracle Sun HA

254

(2)

Sun HA

254

(1)

Hardware Range

254

(1)

Super Cluster

255

(1)

Oracle Sun M5-32

255

(1)

Oracle HA Clusters

255

(1)

Oracle RAC 12c

255

(1)

Hewlett-Packard HA

256

(4)

HP Hardware and Software

256

(1)

Servers

256

(1)

Software

256

(1)

Services

256

(1)

Servers: Integrity Servers

257

(1)

HP NonStop Integrity Servers

258

(1)

NonStop Architecture and Stack

258

(1)

NonStop Stack Functions

259

(1)

Stratus Fault Tolerance

260

(1)

Automated Uptime Layer

260

(1)

ActiveService Architecture

261

(1)

Other Clusters

261

(3)

Veritas Clusters (Symantec)

261

(1)

Supported Platforms

261

(1)

Databases, Applications, and Replicators

262

(1)

Linux Clusters

262

(1)

Overview

262

(1)

Oracle Clusterware

263

(1)

SUSE Linux Clustering

263

(1)

Red Hat Linux Clustering

263

(1)

Linux in the Clouds

263

(1)

Linux HPC HA

263

(1)

Linux-HA

263

(1)

Carrier Grade Linux

263

(1)

VMware Clusters

264

(1)

The Web and HA

264

(1)

Service Availability Software

264

(1)

Continuity Software

265

(1)

Continuity Software: Services

265

(1)

Summary

265

(2)

9 High Availability: Transaction Processing and Databases

267

(26)

Transaction Processing Systems

267

(1)

Some TP Systems: OLTP Availability Requirements

268

(1)

TP Systems with Databases

268

(3)

The X/Open Distributed Transaction Processing Model: XA and XA+ Concepts

269

(1)

CICS and RDBMS

270

(1)

Relational Database Systems

271

(1)

Some Database History

271

(1)

Early RDBMS

271

(1)

SQL Server and HA

272

(3)

Microsoft SQL Server 2014 Community Technology Preview 1

273

(1)

SQL Server HA Basics

273

(1)

SQL Server AlwaysOn Solutions

273

(1)

Failover Cluster Instances

273

(1)

Availability Groups

274

(1)

Database Mirroring

274

(1)

Log Shipping

274

(1)

References

274

(1)

Oracle Database and HA

275

(2)

Introduction

275

(1)

Oracle Databases

275

(1)

Oracle 11g (R2.1) HA

275

(1)

Oracle 12c

276

(1)

Oracle MAA

276

(1)

Oracle High Availability Playing Field

276

(1)

MySQL

277

(1)

MySQL: HA Features

278

(1)

MySQL: HA Services and Support

278

(1)

IBM DB2 Database and HA

278

(2)

DB2 for Windows, UNIX, and Linux

279

(1)

DB2 HA Feature

279

(1)

High Availability DR

279

(1)

DB2 Replication: SQL and Q Replication

280

(1)

DB2 for i

280

(1)

DB2 10 for z/OS

280

(1)

DB2 pureScale

280

(1)

InfoSphere Replication Server for z/OS

281

(1)

DB2 Cross Platform Development

281

(1)

IBM Informix Database and HA

281

(3)

Introduction (Informix 11.70)

281

(1)

Availability Features

282

(1)

Fault Tolerance

282

(1)

Informix MACH 11 Clusters

282

(1)

Connection Manager

283

(1)

Informix 12.1

283

(1)

Ingres Database and HA

284

(1)

Ingres RDBMS

284

(1)

Ingres High Availability Option

284

(1)

Sybase Database and HA

285

(3)

Sybase High Availability Option

285

(1)

Terminology

285

(1)

Use of SAP ASE

286

(1)

Vendor Availability

286

(1)

ASE Cluster Requirements

286

(1)

Business Continuity with SAP Sybase

287

(1)

NoSQL

287

(1)

NonStopSQL Database

288

(1)

Summary

289

(4)

SECTION IV CLOUDS AND VIRTUALIZATION

10 High Availability: The Cloud and Virtualization

293

(14)

Introduction

293

(5)

What Is Cloud Computing?

294

(1)

Cloud Characteristics

294

(1)

Functions of the Cloud

294

(1)

Cloud Service Models

295

(1)

Cloud Deployment Models

296

(1)

Resource Management in the Cloud

297

(1)

SLAs and the Cloud

297

(1)

Cloud Availability and Security

298

(2)

Cloud Availability

298

(1)

Cloud Outages: A Review

298

(1)

Aberdeen: Cloud Storage Outages

299

(1)

Cloud Security

299

(1)

Virtualization

300

(3)

What Is Virtualization?

300

(1)

Full Virtualization

301

(1)

Paravirtualization

302

(1)

Security Risks in Virtual Environments

303

(1)

Vendors and Virtualization

303

(3)

IBM PowerVM

303

(1)

IBM z/VM

304

(1)

VMware vSphere, ESX, and ESXi

304

(1)

Microsoft Hyper-V

304

(1)

HP Integrity Virtual Machines

304

(1)

Linux KVM

304

(1)

Solaris Zones

304

(1)

Xen

305

(1)

Virtualization and HA

305

(1)

Virtualization Information Sources

306

(1)

Summary

306

(1)

11 Disaster Recovery Overview

307

(28)

DR Background

307

(4)

A DR Lesson from Space

307

(1)

Disasters Are Rare Aren't They?

308

(1)

Key Message: Be Prepared

308

(1)

DR Invocation Reasons: Forrester Survey

309

(1)

DR Testing: Kaseya Survey

310

(1)

DR: A Point to B Point

310

(1)

Backup/Restore

311

(7)

Overview

311

(1)

Backup Modes

311

(1)

Cold (Offline)

311

(1)

Warm (Online)

311

(1)

Hot (Online)

311

(1)

Backup Types

312

(1)

Full Backup

312

(1)

Incremental Backup

312

(1)

Multilevel Incremental Backup

312

(1)

Differential Backup

312

(1)

Synthetic Backup

312

(1)

Progressive Backup

312

(1)

Data Deduplication

313

(1)

Data Replication

314

(1)

Replication Agents

315

(1)

Asynchronous Replication

315

(1)

Synchronous Replication

316

(1)

Heterogeneous Replication

316

(1)

Other Types of Backup

316

(1)

DR Recovery Time Objective: WAN Optimization

317

(1)

Backup Product Assessments

318

(3)

Virtualization Review

318

(1)

Gartner Quadrant Analysis

318

(1)

Backup/Archive: Tape or Disk?

319

(1)

Bit Rot

319

(1)

Tape Costs

320

(1)

DR Concepts and Considerations

321

(3)

The DR Scenario

321

(1)

Who Is Involved?

321

(1)

DR Objectives

322

(1)

Recovery Factors

322

(1)

Tiers of DR Availability

323

(1)

DR and Data Tiering

323

(1)

A Key Factor

324

(1)

The DR Planning Process

324

(6)

DR: The Steps Involved

324

(1)

In-House DR

324

(3)

DR Requirements in Operations

327

(1)

Hardware

327

(1)

Software

327

(1)

Applications

327

(1)

Data

327

(1)

DR Cost Considerations

328

(1)

The Backup Site

328

(1)

Third-Party DR (Outsourcing)

329

(1)

DR and the Cloud

329

(1)

HA/DR Options Described

329

(1)

Disaster Recovery Templates

330

(1)

Summary

330

(5)

SECTION V APPENDICES AND HARD SUMS

Appendix 1

335

(38)

Reliability and Availability: Terminology

335

(36)

Summary

371

(2)

Appendix 2

373

(14)

Availability: MTBF/MTTF/MTTR Discussion

373

(8)

Interpretation of MTTR

373

(2)

Interpretation of MTTF

375

(1)

Interpretation of MTBF

375

(1)

MTTF and MTBF---The Difference

375

(2)

MTTR: Ramp-Up Time

377

(1)

Serial Blocks and Availability---NB

378

(1)

Typical MTBF Figures

379

(1)

Gathering MTTF/MTBF Figures

380

(1)

Outage Records and MTTx Figures

380

(1)

MTTF and MTTR Interpretation

381

(6)

MTTF versus Lifetime

381

(1)

Some MTxx Theory

381

(1)

MTBF/MTTF Analogy

382

(1)

Final Word on MTxx

382

(1)

Forrester/Zenoss MTxx Definitions

383

(1)

Summary

384

(3)

Appendix 3

387

(18)

Your HA/DR Route Map and Kitbag

387

(16)

Road to HA/DR

387

(1)

The Stages

387

(4)

A Short DR Case Study

391

(1)

HA and DR: Total Cost of Ownership

392

(1)

TCO Factors

392

(1)

Cloud TCO

393

(1)

TCO Summary

394

(1)

Risk Assessment and Management

394

(1)

Who Are the Risk Stakeholders?

395

(1)

Where Are the Risks?

395

(1)

How Is Risk Managed?

395

(1)

Availability: Project Risk Management

396

(4)

Availability: Deliverables Risk Management

400

(2)

Deliverables Risk Management Plan: Specific Risk Areas

402

(1)

The IT Role in All This

403

(1)

Summary

403

(2)

Appendix 4

405

(56)

Availability: Math and Other Topics

405

(53)

Lesson 1 Multiplication, Summation, and Integration Symbols

405

(1)

Mathematical Distributions

405

(1)

Lesson 2 General Theory of Reliability and Availability

406

(1)

Reliability Distributions

406

(4)

Lesson 3 Parallel Components (Blocks)

410

(1)

Availability: m-from-n Components

410

(1)

m-from-n Examples

410

(1)

m-from-n Theory

410

(1)

m-from-n Redundant Blocks

411

(1)

Active and Standby Redundancy

412

(1)

Introduction

412

(1)

Summary of Redundancy Systems

412

(1)

Types of Redundancy

413

(1)

Real m-from-n Example

414

(1)

Math of m-from-n Configurations

415

(1)

Standby Redundancy

415

(1)

An Example of These Equations

415

(1)

Online Tool for Parallel Components: Typical Calculation

416

(1)

NB: Realistic IT Redundancy

417

(1)

Overall Availability Graphs

418

(1)

Try This Availability Test

419

(1)

Lesson 4 Cluster Speedup Formulae

419

(1)

Amdahl's Law

420

(1)

Gunther's Law

421

(2)

Gustafson's Law

423

(1)

Amdahl versus Gunther

424

(1)

Speedup: Sun-Ni Law

425

(1)

Lesson 5 Some RAID and EC Math

426

(1)

RAID Configurations

426

(3)

Erasure Codes

429

(3)

Lesson 6 Math of Monitoring

432

(1)

Ping: Useful Aside

432

(3)

Ping Sequence Sample

435

(1)

Lesson 7 Software Reliability/Availability

435

(1)

Overview

435

(1)

Software Reliability Theory

436

(1)

The Failure/Defect Density Models

437

(7)

Lesson 8 Additional RAS Features

444

(1)

Upmarket RAS Features

444

(1)

Processor

444

(1)

I/O Subsystem

445

(1)

Memory Availability

445

(1)

Fault Detection and Isolation

445

(1)

Clocks and Service Processor

446

(1)

Serviceability

446

(1)

Predictive Failure Analysis

447

(1)

Lesson 9 Triple Modular Redundancy

447

(1)

Lesson 10 Cyber Crime, Security, and Availability

448

(1)

The Issue

448

(1)

The Solution

449

(1)

Security Analytics

449

(1)

Zero Trust Security Model

449

(1)

Security Information Event Management

450

(1)

Security Management Flow

450

(1)

SIEM Best Practices

451

(1)

Security: Denial of Service

452

(1)

Security: Insider Threats

452

(1)

Security: Mobile Devices (BYOD)

453

(1)

BYOD Security Steps

454

(1)

Security: WiFi in the Enterprise

455

(1)

Security: The Database

455

(1)

Distributed DoS

456

(1)

Security: DNS Servers

456

(1)

Cost of Cyber Crime

457

(1)

Cost of Cyber Crime Prevention versus Risk

457

(1)

Security Literature

458

(1)

Summary

458

(3)

Appendix 5

461

(18)

Availability: Organizations and References

461

(18)

Reliability/Availability Organizations

461

(1)

Reliability Information Analysis Center

462

(1)

Uptime Institute

462

(1)

IEEE Reliability Society

462

(1)

Storage Networking Industry Association

463

(1)

Availability Digest

463

(1)

Service Availability Forum

463

(1)

Carnegie Mellon Software Engineering Institute

464

(1)

ROC Project---Software Resilience

465

(1)

Business Continuity Today

465

(1)

Disaster Recovery Institute

465

(1)

Business Continuity Institute

466

(1)

Information Availability Institute

466

(1)

International Working Group on Cloud Computing Resiliency

466

(1)

TMMi Foundation

466

(1)

Center for Software Reliability

467

(1)

CloudTweaks

467

(1)

Security Organizations

467

(1)

Security? I Can't Be Bothered

467

(1)

Cloud Security Alliance

468

(1)

CSO Online

468

(1)

Dark READING

469

(1)

Cyber Security and Information Systems IAC

469

(1)

Center for International Security and Cooperation

469

(1)

Other Reliability/Security Resources

469

(1)

Books, Articles, and Websites

469

(1)

Major Reliability/Availability Information Sources

469

(1)

Other Information Sources

470

(9)

Appendix 6

479

(10)

Service Management: Where Next?

479

(10)

Information Technology Infrastructure Library

479

(1)

ITIL Availability Management

480

(1)

Service Architectures

480

(3)

Architectures

483

(1)

Availability Architectures: HA Documentation

483

(1)

Clouds and Architectures

484

(5)

Appendix 7

489

(2)

Index

491

Dr. Terry Critchley is a retired IT consultant living near Manchester in the United Kingdom. He studied physics at Manchester University (using some of Rutherford's original equipment!), gaining an Honours degree in physics and, 5 years later, a PhD in nuclear physics. He then joined IBM as a Systems Engineer and spent 24 years there in a variety of accounts and specializations, and later worked at Oracle for 3 years. Terry joined his last company, Sun Microsystems, in 1996 and left in 2001, after planning and running the Sun European Y2000 education, and then spent a year at a major UK bank.

In 1993 he initiated and coauthored a book on Open Systems for the British Computer Society (Open Systems: The Reality) and has recently written this book, IT Services High Availability. He is also mining swathes of his old material for his next book, Service Performance and Management.

Permalink: https://www.kriso.ee/db/97810401777616e.html
