Disaster Recovery Planning
Spencer Foisy, Debbie Madsen, Amy Quarterman

Goals
• Protect People
• Ensure Business Continuity
• Reduce Insurance Premiums
• Protect Corporate Assets
• Minimize Decision Making
• Reduce Reliance on Key Individuals
• Eliminate Confusion and Errors
• Minimize Liability

Designing a Plan
• Obtain Management Commitment
• Establish Planning Committee
• Perform Risk Assessment
• Establish Priorities for Processing
• Determine Recovery Strategies
• Perform Data Collection
• Document a Written Plan
• Develop Testing Criteria
• Test the Plan
• Approve the Plan

The Written Plan
• Standard format facilitates consistency across multiple authors
• Background information
• Purpose of the procedure
• Reference materials that should be consulted
• Documentation of required paperwork
• Authorizations required

The Written Plan (cont.)
• Instructions
• Developed on preprinted forms
• Written for the lay person
• Short, direct sentences
• Present one idea at a time
• Avoid jargon
• Use position titles rather than people's names
• Scope
• Design the plan for the worst-case scenario

The Internet and Disaster Recovery
• Originally designed as a disaster recovery system for the US government's telecommunications infrastructure
• One of the most redundant networks in the world
• Examples of use:
  • During earthquakes in California and Japan
  • It was the only link victims had with the outside world
  • It was also used to direct aid and rescue crews

Opportunities with the Internet
• Coordinate the response, communicate with employees, share information with the public
• Options range from email to dedicated crisis-management web pages with news and real-time video/audio conferencing
• Email files of building maps

How to Recover a Website
• Easiest option: point the URL at a new IP address, but the change does not take effect quickly
• The switchover needs to be transparent to site visitors
• Use the same IP address, or make the change through the DNS hierarchy
• Cisco Systems' DistributedDirector will transparently redirect HTTP connections away from a non-responding server
• In the future: redundant T3s coming into a remote site, connected to different POPs

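The idea of steering visitors away from a non-responding server can be sketched in a few lines of Python. This is only an illustration of the concept, not a description of how DistributedDirector or DNS actually works internally. The function name `pick_live_server`, the probe function, and both addresses (taken from reserved documentation ranges) are assumptions made for the example.

```python
from typing import Callable, Optional, Sequence


def pick_live_server(
    servers: Sequence[str], is_alive: Callable[[str], bool]
) -> Optional[str]:
    """Return the first server, in priority order, that passes the health probe."""
    for address in servers:
        if is_alive(address):
            return address
    return None  # every candidate failed the probe


# Hypothetical addresses from reserved documentation ranges.
PRIMARY = "192.0.2.10"
STANDBY = "198.51.100.10"


def fake_probe(address: str) -> bool:
    """Stand-in health check: pretend the primary site is down."""
    return address != PRIMARY


if __name__ == "__main__":
    # With the primary "down", traffic should be sent to the standby site.
    print(pick_live_server([PRIMARY, STANDBY], fake_probe))
```

In a real deployment the probe would be an actual HTTP or TCP check, and the chosen address would be published through DNS or a redirector so that visitors never see the change.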
LAN/WAN and Client-Server
• Extra complexity due to the distributed computing environment
• Need to consider the role of each individual system component
• To ease disaster recovery, standardization and documentation are needed
• Many different skill sets are needed to recover the system
• Benefits: decentralized risk, some redundancy

To Lower the Risk of Client-Server
• Remote site mirroring
• Vendor contingency agreements
• End-user documentation and back-up of files

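As a rough illustration of mirroring and file back-up, the sketch below copies a directory tree to a second location and checks each copy against a SHA-256 checksum. It assumes the "remote site" is reachable as an ordinary filesystem path, which is a simplification; the function names `sha256_of` and `mirror_and_verify` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror_and_verify(source: Path, mirror: Path) -> List[str]:
    """Copy every file under source into mirror.

    Returns the relative paths of any files whose copy does not match
    the original checksum (an empty list means the mirror is consistent).
    """
    mismatched: List[str] = []
    # Materialize the file list first so the walk is unaffected by copying.
    files = sorted(p for p in source.rglob("*") if p.is_file())
    for src_file in files:
        relative = src_file.relative_to(source)
        dst_file = mirror / relative
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copies data and file metadata
        if sha256_of(src_file) != sha256_of(dst_file):
            mismatched.append(relative.as_posix())
    return mismatched
```

Verifying the copy matters as much as making it: a back-up that cannot be read at recovery time offers no protection.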
Vendors: Outsourcing Recovery
• Comdisco - 1969, 100+ locations worldwide, $2.8 billion in revenue
• IBM - 10x capabilities of others, 113 sites in 62 countries, IBM facilities
• SunGard - 1978, $862 million in revenue, Mega & Mobile Data Centers
• HP - 600 Support Offices, 35 Response Centers in 110 countries, NT focus
• DRC, Inc. - 1987, only up to 250 workstations, On-Site Emerg. Fac.
• DRS, Inc. - 1991, mobile services, one NC hot-site
• Weyerhaeuser - 1948, forest company, 4 hot-sites nationally

MetLife Business Continuity Planning
• Refer to Red Book for Game Plan on Physical Recovery
• Where to Meet
• Team Recovery - contact call list, percent per week, etc.
• Recovery Site
• Critical and Production Systems Recovery - PCs, phones, faxes, copiers, etc.
• Special Software, Equipment, Contacts

MetLife Disaster Recovery Procedure
• Comdisco Plan of Attack - Recovering Critical Systems and Applications
• Mainframe
  • Separate LPARs created in NJ site
  • Back-up tapes copied and sent to Iron Mountain (in-ground vaulting)
  • Business Impact Analysis to determine which applications recover first
  • Option of electronic vaulting with spinning DASD, but too expensive

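One way a Business Impact Analysis might turn into a recovery order is sketched below. The application names, dollar figures, and tolerable-outage hours are invented for illustration and are not MetLife's data; the ranking rule (shortest tolerable outage first, larger hourly loss breaking ties) is an assumption, not the method described in the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Application:
    name: str
    hourly_loss: float          # hypothetical cost of one hour of downtime
    max_tolerable_hours: float  # hypothetical longest acceptable outage


def recovery_order(apps: List[Application]) -> List[str]:
    """Rank applications: shortest tolerable outage first, then largest hourly loss."""
    ranked = sorted(apps, key=lambda a: (a.max_tolerable_hours, -a.hourly_loss))
    return [a.name for a in ranked]


# Invented sample data for the sketch.
apps = [
    Application("claims-processing", hourly_loss=50_000, max_tolerable_hours=4),
    Application("email", hourly_loss=5_000, max_tolerable_hours=24),
    Application("policy-admin", hourly_loss=80_000, max_tolerable_hours=4),
    Application("reporting", hourly_loss=1_000, max_tolerable_hours=72),
]

if __name__ == "__main__":
    for position, name in enumerate(recovery_order(apps), start=1):
        print(position, name)
```

Writing the ranking rule down in advance supports the goal of minimizing decision making during the disaster itself.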
MetLife Disaster Recovery Procedure
• Comdisco Plan of Attack - Recovering Critical Systems and Applications
• Client-Server
  • Separate LAN/WAN servers created in NJ site
  • Back-up tapes from servers also sent to Iron Mountain
  • Procedure locker at Comdisco holds detailed procedures, so that an off-the-street user can execute them
  • Certain critical client-server applications automatically recovered (Lotus Notes, NT, etc.)