DR and BC Mythconceptions: there is a better way

Gabriel
Presentation Transcript

    1. “DR and BC Mythconceptions” …there is a better way

    2. Copyright 2007 The William Travis Group, Inc. 2

    3. WTG Credentials
        - 25 years of industry-leading experience
        - Built and managed largest commercial hotsites
        - Developed today’s virtualized hotsite standards
        - Developed much of the industry’s methodology
        - Completed the industry’s largest projects
        - Designed first integrated DR/BCP planning tool
        - Pioneered business continuity planning
        - Authors of the only NextGen™ methodology

    4. Today’s Objectives
        - To stimulate your thinking
        - To challenge the standard industry drone
        - To provide insights to alternative approaches
        - To validate what others have done and what you can do
        - To remind you of what you really already know

    5. Obvious similarities, or subtle differences?

    6. Shattering Some Popular and Some Not So Popular…

    7. Mythconception #1: “Successful DR/BC requires senior management’s commitment”
        actually... commitment of senior management comes from successful DR/BC planning
        NextGen Alternative:
        - revisit your costs: if you haven’t zero-based your architecture in the last two years, you are probably paying 30-50% too much
        - shorten contract terms to realize better pricing
        - re-architect your solution to incorporate the improved price and performance of new technology
        - evaluate new vendors, products and services to leverage market pressures
        - reduce the cost and improve the performance of DR/BC and watch senior management commitment blossom

    8. Mythconception #2: “You can’t prevent a disaster”
        with today’s tools you often can!
        NextGen Alternative:
        - maximize use of existing locations and assets
        - decentralize business operations to insulate from regional and targeted risks, infrastructure failures and wide-scale unavailability of staff; certain geographies are simply unacceptable for housing core processes
        - decentralize large, monolithic IT shops to reduce disaster impact
        - leverage production initiatives to produce disaster resilience: convert HA to CA, CA to active/active
        - build disaster resilience into the production organization

    9. Mythconception #3: “Disaster recovery is a business problem”
        it’s not a business problem… it’s an enterprise problem, and the solution always requires a technical foundation
        NextGen Alternative:
        - facilitate coordination ATOD (at time of disaster): departmental ownership creates an unrealistically complex recovery model
        - centralize DR/BC ownership to eliminate “islands of recoverability” but… do not over-commit centralized resources
        - centralize control of all remote Open Systems and data backups to simplify recoverability and cross-application synchronization

    10. Mythconception #4: “DR/BC is a program, not a project”
        10 to 20 year programs and still... no end-to-end test, no comprehensive application recovery, no unrehearsed data synchronization, no surprise tests, no substantial cross-platform recovery, no achieving RTOs, no appropriate backup policies, no critical workforce recovery, no business unit commitment, etc.
        NextGen Alternative:
        - ensure core requirements are recoverable within 12 months
        - make pragmatic assumptions: accept reality and do not sugar-coat a self-fulfilling prophecy
        - use today’s tools to ensure a pragmatic recovery architecture
        - plan vertically, not horizontally: all of something is much better than some of everything

    11. Mythconception #5: “There has never been an unsuccessful recovery”
        - do you count dramatically missing your RTOs? Most RTCs are 3 to 5 times the planned RTOs
        - how about massive data loss? Most companies cannot achieve even a 24 hour RPO
        - or no “business” recovery? Less than 25% of companies have adequate work area recovery
        NextGen Alternative:
        - realize that your business is more resilient than you think
        - accept that the “fail first then recover” model will never meet the RTOs or RPOs of most large shops
        - do not assume that the “business” can make up for the shortfalls of the DR/BC plan; utilize “advanced” technologies to create an inherently disaster-resistant environment

    12. Mythconception #6: “All disaster recovery planning should start with a BIA”
        - only if you have a lot of time, money and patience
        - the fundamental BIA process is critically flawed: impacts are not additive, risks are not manageable
        NextGen Alternative:
        - recast or eliminate the traditional BIA process: change focus from potential loss to certain dependencies
        - focus on applications as enablers of business functionality
        - seriously question the practicality of manual alternatives
        - do not neglect your upstream/downstream and industry responsibilities

    13. Mythconception #7: “There is no such thing as a failed test”
        - actually 90%+ of all tests should be considered a failure
        - traditional testing is an artificial feel-good exercise
        - excluding the first few tests, the extent of preparation efforts is a geometrically inverse indicator of actual recoverability!
        NextGen Alternative:
        - if you cannot prepare for and complete a successful end-to-end test in less than 96 hours, your recovery capability is probably inadequate
        - establish an active-active model whenever possible
        - focus now on production data backup and restoration
        - expand unit testing to maximize integrated testing
        - match vertical DR/BC planning with vertical testing
        - focus on test results, not testing activity

    14. Mythconception #8: “Business proponents must be the ones to determine recovery requirements”
        - business proponents are typically not qualified to determine application requirements
        - the complex interaction of business process, applications and platforms largely invalidates departmental input
        NextGen Alternative:
        - define minimum standardized recovery levels
        - mandate DR/BC participation and hold departments accountable; minimal compliance is not a departmental decision
        - determine criteria for optimal compliance based on up/downstream dependencies, not simple departmental impact

    15. Mythconception #9: “Planning scope should address complete facility loss”
        - 9/11, the NE Power Outage, Katrina and other massive disasters have forever changed the rules of the game
        - total site loss is no longer adequate
        - personal priorities trump business requirements
        NextGen Alternative:
        - deeper planning is required: longer disasters, tertiary site
        - broader planning is required: eliminate dependency on national infrastructure… regional utilities, communications and transportation are unreliable
        - wider planning is required: upstream and downstream dependencies must be addressed

    16. Mythconception #10: “BC, not DR, is the real objective”
        business continuity is the goal and business processes are the drivers, but technical recovery is the solution
        NextGen Alternative:
        - achieve simpler Business Continuity by focusing more on Disaster Recovery
        - eliminate manual bridging, lost data re-entry and business catch-up: the more technology recovered, the less the business users must accommodate ATOD
        - replace unrealistic, high-overhead manual processes with the applications that you use every day; incremental capabilities are “cheap”

    17. Mythconception #11: “Our nightly tape backups will ensure a 24 hour RPO”
        - few large shops will achieve the recovery they anticipate by relying on their “normal” tape backups
        - actual backups are never what they are expected to be
        - technology cannot eliminate the need for understanding interdependencies and synchronization requirements
        NextGen Alternative:
        - conduct a detailed application data analysis immediately
        - assign ownership of disaster data recovery to applications
        - conduct a zero-based analysis of your production data backup policies and modify them to facilitate disaster data restores
        - use advanced disk technology to achieve your real backup requirements
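One way to see why a nightly tape cycle rarely delivers a 24 hour RPO is to add up the delays around each backup. The sketch below is illustrative only: the function name `worst_case_rpo_hours`, its parameters and the sample figures are assumptions, not part of the original presentation. It assumes each backup captures data as of its start time and becomes usable for disaster recovery only once it has finished and reached offsite storage.

```python
def worst_case_rpo_hours(backup_interval_h: float,
                         backup_duration_h: float,
                         offsite_lag_h: float) -> float:
    """Worst-case data loss, in hours, for a periodic backup cycle.

    Assumed model (hypothetical): a backup captures data as of its start
    time and is usable for recovery only after it completes and arrives
    offsite. The worst case is a disaster just before the newest backup
    becomes usable, which forces a restore from the previous one.
    """
    return backup_interval_h + backup_duration_h + offsite_lag_h


# Illustrative figures only: nightly backups, 6 h to write, 12 h to ship.
nightly = worst_case_rpo_hours(24, 6, 12)
print(f"worst-case data loss: {nightly} hours")
```

Under these assumed figures the exposure is well beyond 24 hours, which is the gap a zero-based review of backup policies is meant to surface.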

    18. Mythconception #12: “The best recovery plans consist of team to-do lists”
        - most plans are far too simplistic and are not usable at time of disaster; remember, most employees have never done this before
        - most plans are largely unreadable due to their disorganized mix of everything DR
        - simpler is better, but simplicity comes from clarity, which comes from detail and organization
        NextGen Alternative:
        - 100% action-oriented plans: understand the difference between filler, methodology and recovery tasks
        - explicitly define the recovery timeline from first step to the return home, including escalation and de-escalation
        - one size does not fit all: DR Master Site Plan, BC Master Site Plan, Regional Office Plan, Small Office Plan, Home Office Plan, S.O.A.P.s

    19. Mythconception #13: “Shorter RTOs require “advanced” recovery techniques”
        - reduce RTO through planning, not spending
        - focus on meeting your RTOs before you focus on shortening them
        NextGen Alternative:
        - improve notification and communication processes
        - shorten disaster assessment to achieve faster recovery
        - mobilize and deploy preemptively to speed recovery
        - better define recovery tasks and interdependencies
        - often, days can be shaved off the recovery timeline just by tightening the process

    20. Mythconception #14: “Data synchronization is the user’s responsibility”
        - synchronization problems can completely invalidate a recovery capability
        - realistically consider the impact of lost data
        - few if any business departments can still recover lost data manually
        NextGen Alternative:
        - use technology to solve the RPO problem: advanced data availability technologies solve the unsolvable
        - eliminate decades of futility with a one-time capital expense
        - enjoy new price-performance with new “second tier” technologies
        - shave years off the development effort and man-years off the maintenance effort with data mirroring technologies
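The synchronization problem can be made concrete with a small calculation. The sketch below is a simplified illustration, not a method from the presentation: the dataset names, timestamps and helper functions are hypothetical, and it assumes every dataset can be rolled back to any moment at or before its latest recoverable copy. Under that assumption, a group of interdependent datasets can only be restored consistently to the oldest of those latest copies.

```python
from datetime import datetime


def common_sync_point(latest_copy):
    """Latest point in time to which every dataset in the group can be restored.

    Assumes (hypothetically) that each dataset can be rolled back to any
    moment at or before its latest recoverable copy.
    """
    return min(latest_copy.values())


def rollback_per_dataset(latest_copy):
    """Additional data each dataset gives up to line up with the group."""
    sync = common_sync_point(latest_copy)
    return {name: ts - sync for name, ts in latest_copy.items()}


# Hypothetical datasets and timestamps, for illustration only.
latest_copy = {
    "orders_db": datetime(2007, 3, 1, 23, 0),
    "billing_files": datetime(2007, 3, 1, 18, 30),
    "inventory_db": datetime(2007, 3, 1, 21, 15),
}

sync = common_sync_point(latest_copy)
rollback = rollback_per_dataset(latest_copy)
for name, lost in rollback.items():
    print(f"{name}: roll back a further {lost}")
```

The dataset with the stalest copy sets the effective RPO for the whole group, which is why the slide treats synchronization as a technology problem rather than something each user can fix alone.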

    21. Mythconception #15: “The RTO and RPO are completely different issues”
        - only in the most literal interpretation
        - the largest part of RTO is dependent on the RPO
        - most RPO solutions are needed in order to shorten RTOs
        NextGen Alternative:
        - include core infrastructure in your RTO calculations and pre-stage it whenever possible
        - remember to consider dependency groups
        - eliminate tape recovery for all except stand-alone applications
        - when using tapes, optimize for restoration, not backup
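Including core infrastructure and dependency groups in an RTO calculation amounts to walking a dependency graph. The following is a minimal sketch under stated assumptions: the task names, durations and the function `recovery_finish_times` are hypothetical, tasks are assumed to start as soon as all their prerequisites are recovered, and staffing limits are ignored. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter


def recovery_finish_times(duration_h, depends_on):
    """Earliest finish time (hours after declaration) for each recovery task.

    A task starts only when everything it depends on has been recovered, so
    its finish time is its own duration plus the latest finish among its
    prerequisites.
    """
    finish = {}
    for task in TopologicalSorter(depends_on).static_order():
        start = max((finish[dep] for dep in depends_on.get(task, ())), default=0)
        finish[task] = start + duration_h[task]
    return finish


# Hypothetical dependency group; names and durations are illustrative only.
duration_h = {"network": 4, "directory": 2, "database_restore": 10, "application": 3}
depends_on = {
    "network": [],
    "directory": ["network"],
    "database_restore": ["directory"],
    "application": ["database_restore"],
}

finish = recovery_finish_times(duration_h, depends_on)

# Pre-staging core infrastructure is modelled here as zero recovery time.
prestaged = recovery_finish_times({**duration_h, "network": 0, "directory": 0},
                                  depends_on)
print(finish["application"], prestaged["application"])
```

In this toy example the data restore dominates the application's recovery time, which echoes the slide's point that most of the RTO depends on how the RPO is solved.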

    22. Mythconception #16: “Communicating ATOD is the most important aspect of a successful recovery”
        communications, not communicating, is the most critical factor, and it doesn’t happen naturally!
        NextGen Alternative:
        - differentiate operational communicating from strategic communications
        - pre-define messages, audiences and vehicles for in-bound and out-bound communications
        - frame all communications and don’t forget the “little timeline”
        - pre-develop all communications messages
        - err towards over-communication
        - use an automated system and all possible vehicles; redundancy saves the day

    23. Mythconception #17: “Working from home is the least expensive and most effective work area replacement”
        - casual telecommuting misleads us to believe that working from home ATOD is a fully viable solution; it usually isn’t
        - unless specifically pre-planned, telecommuting is usually a non-starter
        NextGen Alternative:
        - in typical client-server environments, discount home computers as viable workstations
        - existing RAS capabilities often don’t meet required recovery capacities
        - do not underestimate the physical proximity requirements of many business processes
        - maximize shift work and account for the increasing/decreasing needs as the recovery unfolds
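Checking remote access (RAS) capacity against recovery demand, and seeing what shift work buys, is simple arithmetic. The sketch below is a rough illustration only; the function names, seat counts and the even split of staff across shifts are assumptions rather than figures from the presentation.

```python
def ras_shortfall(required_seats_by_shift, ras_capacity):
    """Remote-access seats missing in each shift (0 when capacity suffices)."""
    return {shift: max(0, needed - ras_capacity)
            for shift, needed in required_seats_by_shift.items()}


def seats_needed_with_shifts(total_staff, shifts):
    """Concurrent seats needed if staff are spread evenly over the shifts."""
    return -(-total_staff // shifts)  # ceiling division


# Hypothetical demand per shift and an assumed RAS capacity of 200 seats.
demand = {"day": 400, "evening": 150, "night": 50}
shortfall = ras_shortfall(demand, ras_capacity=200)

# Spreading the same 600 people evenly over three shifts.
even_split = seats_needed_with_shifts(600, 3)
print(shortfall, even_split)
```

With these assumed numbers, the day shift alone overwhelms the remote access capacity, while an even three-shift rotation would fit within it.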

    24. Mythconception #18: “Disaster Recovery is becoming too complicated… it’s really a fairly simple process”
        anyone who really believes that DR/BC is a simple process obviously doesn’t understand the problem
        NextGen Alternative:
        - the only way to simplify the issue is to proactively design and document the complexity out of the process
        - step-by-step recovery plans with explicit instructions
        - choreographed mobilization, deployment and work area usage
        - pre-drafted communications with “multi flavor” messages
        - detailed alternative procedures per scenario, particularly on the business side

    25. Mythconception #19: “An organization needs both a Disaster Recovery Plan and a Business Continuity Plan to meet Best Practice requirements”
        - there is a new sheriff in town: All-Risk Incident Management trumps Business Continuity
        - today’s planning requirements are much broader than simple Business Continuity
        NextGen Alternative:
        - implement a holistic sub-plan approach to deal with any and all risks: pandemic operations, succession planning, supply chain planning, crisis communications, product liability, etc.
        - implement a single common communications structure between sub-plans
        - employ milestone management at the senior management top level

    26. Mythconception #20: “7 out of 10 businesses that experience a disaster without a DR plan are out of business within five years”
        or… “Of the companies experiencing disasters, 43% never reopen, and 29% close within two years”
        or… “Some 40% of companies that experience a devastating loss to their data systems never reopen their doors”
        or… “Of 350 businesses in the World Trade Center before the bombing, 150 were out of business a year later”
        NextGen Alternative:
        - implement a pragmatic recovery capability NOW!
        - realistically address data backup, synchronization and restoration
        - develop a scenario-based, multi-threaded plan that will really work
        - change untenable recovery architectures and get production performance from your DR investment

    27. Why NextGen?
        - Plan for new risks: targeted attacks, wide-scale unavailability of staff
        - Apply new technology and replace tape recovery for critical functions or large environments
        - Build independence from over-allocated shared commercial facilities
        - Implement active site model to ensure continuity through actual use versus testing
        - New breadth of recoverability… city or even metro-wide
        - New depth of recoverability… length of disaster (tertiary site)
        - Address worst-case volumes, not average case
        - Re-evaluate DR/BC ownership model
        - Leverage DR/BC industry confusion and weakness
        - Re-purpose existing assets and resources
        - Simplify and reduce traditional documentation
        - Define pragmatic limits

    29. Ask the Hard Questions
        - Is your recovery solution as “right” as it was just 1 or 2 years ago?
        - Do you understand what won’t be recovered?
        - How many employees will be out of work after a disaster?
        - Are you prepared to permanently lose the amount of data your current backup model risks?
        - How much of your business is dependent on paper records?
        - Can your critical functions wait while systems return to normal?
        - How many skilled technicians will it take to recover 100s of servers?
        - Do you have enough technical staff to cover 3 or more sites ATOD?
        - Can you really synchronize thousands of files to a single point in time?
        - Are you certain that you are not overpaying for your recovery capability?

    30. A New Checklist for Effective DR/BC …NextGen Axioms
        - Implement a more pragmatic recovery architecture
        - Define and mandate minimum standardized recovery levels
        - Establish a CCO position
        - Recast the BIA process
        - Centralize control of Open Systems
        - Decentralize business operations
        - Eliminate the “fail first then recover” model
        - Centralize ownership of DR/BC
        - Eliminate tape-based recovery
        - Implement a new, more effective plan model
        - Achieve Business Continuity by focusing on Disaster Recovery
        - Reduce RTO through planning, not spending
        - Plan vertically, not horizontally
        - Use technology to solve the RPO problem
        - Pursue an Active-Active model and restructure testing
        - Leverage industry weakness while it lasts
        - Implement a holistic approach to DR-BC-CM
        - Leverage production initiatives for recovery purposes
        - Eliminate dependency on national infrastructure

    31. Mythconception #21: “A shared hotsite is the most cost-effective recovery solution”
        - mainframe-centric: inverse benefits of shared cost model
        - the traditional fail-first-then-recover model is too complicated for large, multi-platform shops
        - commercial site “virtualization” and changing risk management limits further complicate recoverability
        NextGen Alternatives:
        - proactively evaluate the new price/performance of in-house recovery and leverage production initiatives for disaster resilience
        - leverage the inherent redundancy of your open system environment
        - leverage server consolidation and repurpose existing assets for simplified and cost-effective recovery
        - maximize hybrid solutions and multi-tiered architectures
        - explore the possibilities of limited risk consortiums, again!

    32. Mythconception #22: “An automated planning package is the best way to develop and maintain your DR/BC plan”
        - less is more when it comes to documentation tools
        - few organizations need the overhead of an automated tool
        - remember the old saw “do things in recovery just as you would in production”
        NextGen Alternative:
        - simplify the recovery plan by improving the recovery architecture
        - automate processes and eliminate documents
        - simplify with built-to-purpose documents: differentiate recovery procedures from operating procedures
        - centralize maintenance to reduce effort and improve results
        - use your common toolset: web sites, version control, collaboration

    33. Mythconception #23: “Quickship is the preferred solution to recover non-critical systems”
        - many Quickship “offerings” are now backed up by manufacturer and/or distributor agreements vs. physical inventories
        - consider the recovery site and connectivity speed
        NextGen Alternative:
        - hotsite hardware is now “mobile”
        - installed hotsite hardware can sometimes be priced as Quickship
        - manufacturer maintenance services can provide inexpensive replacements along with a tech, often in the same timeframe
        - you usually can’t recover 100s of systems in 2-3 days anyway

    34. Mythconception #24: “As the relative cost of hardware decreases, disaster recovery planning becomes much more cost effective”
        - whatever happened to Moore’s law?
        - the most expensive aspect of DR is not hardware
        - upgrade costs vary inversely to hardware costs
        - HA breaks the traditional model
        NextGen Alternative:
        - leverage industry weakness while it lasts
        - shorter term contracts, aggressive T&Cs
        - upgrade concessions: understand pricing units
        - understand physical vs. subscription configs
        - understand the shell game: where is the gear?

    35. Mythconception #25: “Network (LAN) recovery is the easiest part of DR”
        the complexity of the LAN is usually underestimated and its recovery is under-orchestrated
        NextGen Alternative:
        - don’t waste time on rediscovery: back up and store device configurations
        - realize the impact DNS changes have on most organizations
        - subnets, VLANs, AD, Single Sign-ons, Firewalls, Proxy Servers, etc. develop a life of their own over time, which is nearly impossible to re-develop ATOD
        - ad hoc production bandwidth is seldom replicated in recovery mode; understand the impact to operations
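Storing device configurations is only useful if the stored copies stay current. As a minimal sketch, assuming the configurations have already been exported as files into a directory (the export step, the directory layout and both function names are hypothetical, not something the presentation specifies), a checksum manifest can flag which files changed since the last stored copy.

```python
import hashlib
from pathlib import Path


def config_manifest(config_dir):
    """Map each exported config file (relative path) to its SHA-256 digest."""
    root = Path(config_dir)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_configs(previous, current):
    """Files that are new or whose contents differ from the stored manifest."""
    return sorted(name for name, digest in current.items()
                  if previous.get(name) != digest)
```

Comparing the manifest of the recovery copy with a fresh export before each test is one inexpensive way to catch configuration drift before it has to be rediscovered during a recovery.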

    36. Mythconception #26: “A well-documented shop already has much of what it needs for disaster recovery”
        - the differences between production procedures and recovery procedures are subtle but significant; they are not interchangeable
        - inventories are easily re-purposed, procedures are not
        - “To-Do List” plans are not worth the effort to develop
        NextGen Alternative:
        - develop “timeline based” plans
        - focus on single-purpose procedures
        - maximize use of self-documenting resources: config files, third parties, PBX directories, etc.
        - reference documentation from original sources: HR, vendors, etc.

    37. Mythconception #27: “The planning process is essentially the same for open systems as for mainframe systems”
        - less location constrained, less costly environmentals and support, more existing redundancy, shorter RTO and RPO but simpler data availability
        - more unique configurations, greater quantities, more dependent tiers
        NextGen Alternative:
        - differentiate production, test, development, pre-release, etc. environments and maximize the leverage of repurposing
        - separate existing assets and leverage inherent modularity
        - understand the breakage associated with recovering dozens or hundreds of servers and develop intelligent and pragmatic solutions

    38. Mythconception #28: “Functional alignment of DR/BC should fall under security (or operations, or risk management, or…)”
        - none of these departments can effectively cross all necessary business units to meet the need
        - each prevents a holistic approach to DR, BC, CM and Security
        - absent a cross-discipline authorization, board review can only offer lip service
        NextGen Alternative:
        - reframe DR/BC governance and establish a CCO position
        - raise DR/BC to a legitimate board-level concern with legitimate ownership