Crisis Management and DR

Larry K. Peck, Disaster Recovery Consultant, Office of Information Resources, State of Tennessee.


Presentation Transcript


  1. Crisis Management and DR Larry K. Peck Disaster Recovery Consultant Office of Information Resources State of Tennessee

  2. Downtime and Availability - Factors Contributing to Downtime • The following seven categories were identified as the factors contributing to downtime (Gartner, April 2004 survey of 145 IT organizations); percentages are matched to categories in the order both appear on the slide: • Software System Failure: 26% • Hardware System Failure: 21% • Network or Telecommunications Carrier Failure: 15% • Human Error: 13% • Cause Uncertain or Unknown: 12% • Environmental Factors (Such As Power Outages): 10% • Security Breach or Virus: 3%

  3. Planning and Testing • Strategies • Technologies

  4. Emergency Management Team • The EMT is the first response to a crisis event • First responders identified from various functional and business units • Disaster Assessment Teams (DAT) inspect equipment and facilities and report to the EMT • Brings together Executive Management, Financial Management, Technical Management, Functional Response Teams, and the Press Relations Team • Conducts two exercises per year: the first planned, the second a surprise

  5. Planning-Preparation • Business Impact Analysis (BIA) • Conducted a high-level BIA as part of a recent study • Annual detailed BIA with every agency now in progress • Established an annual BIA review process

  6. Business impact analysis (BIA) and risk assessment approach: The analysis and report are structured around the following systems and critical, dependent business processes. [Diagram labels: Technology View, Enterprise View, Agency/Operations View, Media View; Billing, LAN, WAN, MAN, Internal/External, Application A, Financials, Call Centers, Applications, Government/Agency Communications, 3rd Party Technologies, Customer Service, Data Center/NOCs, High Speed, Telephony Services, Telephony]

  7. Planning-Preparation • New approach to system criticality identification, based on recovery time objective (RTO) and recovery point objective (RPO) • Level 1 – < 5 minute RTO/RPO (0 downtime) • Level 2 – 8 hours or less RTO/RPO • Level 3 – 48 hour RTO/RPO • Level 4 – 72 hour RTO/RPO • Level 5 – NR – No specific disaster recovery requirements
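The five levels above can be read as bins on a system's required RTO/RPO. The following is a minimal sketch of that reading, not the State's actual tooling: the function name `criticality_level` is hypothetical, and two points are assumptions rather than statements from the slide, namely that a requirement of exactly 5 minutes falls into Level 2 and that a requirement longer than 72 hours is treated as Level 5.

```python
from datetime import timedelta
from typing import Optional

# Upper bounds for Levels 2-4, taken from the slide.
# Level 1 is handled separately because its bound is strict (< 5 minutes).
BOUNDED_LEVELS = [
    (2, timedelta(hours=8)),
    (3, timedelta(hours=48)),
    (4, timedelta(hours=72)),
]


def criticality_level(required: Optional[timedelta]) -> int:
    """Map a required RTO/RPO to a criticality level (hypothetical helper).

    `required` is the longest outage or data-loss window the business
    can accept; None means no specific disaster recovery requirement.
    """
    if required is None:
        return 5  # Level 5: NR, no specific requirement
    if required < timedelta(0):
        raise ValueError("required RTO/RPO cannot be negative")
    if required < timedelta(minutes=5):
        return 1  # Level 1: < 5 minutes
    for level, ceiling in BOUNDED_LEVELS:
        if required <= ceiling:
            return level
    # Assumption: anything beyond 72 hours is treated like Level 5.
    return 5


print(criticality_level(timedelta(hours=6)))
```

A system whose business owners can accept at most six hours of outage lands in Level 2 under this reading, because six hours fits the "8 hours or less" bin but not the "< 5 minute" one.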

  8. Planning-Preparation • Implemented a new web-based Disaster Recovery Application and Inventory Planning Application

  9. Strategies • Outside Analysis and Review • Confirmed what we thought we knew – our strengths and weaknesses • DR for Mainframe is mature, stable, and very supportable utilizing 3rd party services • DR for Distributed Systems is very complex and poorly suited for 3rd party services • Some existing technologies are still viable • New approaches are necessary for others • Migration to self-supporting recovery model is necessary, especially for Distributed Systems

  10. Technologies • Construction of Second Data Center • Full Tier III facility* • Self-Recovery Model (just one example) • Each data center runs 50% of production • Each data center runs 50% of total dev/test/training • DR event – utilize dev/test/training hardware to recover the most critical systems • Various data replication schemes and technologies • Server Virtualization/Clustering over WAN/HA technologies
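The self-recovery model can be sketched as a simple allocation problem: when one data center is lost, the surviving site's dev/test/training hardware is handed to the failed site's production systems in order of criticality level (Level 1 first, as defined on the Planning-Preparation slide). Everything specific here is an assumption for illustration: the `System` type, the abstract "capacity units", the greedy ordering, and the example system names do not come from the presentation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class System:
    name: str
    level: int     # criticality level, 1 = most critical
    capacity: int  # hypothetical hardware units needed to run it


def plan_recovery(
    failed_site_systems: List[System], reclaimable_capacity: int
) -> Tuple[List[str], List[str]]:
    """Greedily assign reclaimed dev/test/training capacity by criticality.

    Returns (recovered, deferred) system names. A system that does not
    fit in the remaining capacity is deferred, and smaller, less critical
    systems may still be placed after it.
    """
    recovered: List[str] = []
    deferred: List[str] = []
    remaining = reclaimable_capacity
    # Most critical first; ties broken by name for a stable result.
    for system in sorted(failed_site_systems, key=lambda s: (s.level, s.name)):
        if system.capacity <= remaining:
            recovered.append(system.name)
            remaining -= system.capacity
        else:
            deferred.append(system.name)
    return recovered, deferred


# Hypothetical production systems that ran at the failed data center.
lost = [
    System("billing", level=1, capacity=4),
    System("hr", level=3, capacity=3),
    System("wiki", level=5, capacity=2),
    System("payroll", level=2, capacity=5),
]
recovered, deferred = plan_recovery(lost, reclaimable_capacity=10)
print("recover now:", recovered)
print("deferred:", deferred)
```

A real plan would also have to account for data replication lag, licensing, and network capacity, none of which this sketch models.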

  11. Thoughts • Plan, Plan, Plan • Review, Review, Review • Test, Test, Test • Revise, Revise, Revise
