1 / 39

BUSINESS CONTINUITY AND DISASTER RECOVERY STRATEGIES

BUSINESS CONTINUITY AND DISASTER RECOVERY STRATEGIES. Giulio Brenna Systems Engineers manager Specialists Team. Workshop Objectives. Explain Why Customers need a BC/DR Strategy Explain Capabilities, Complexity, and Choice Understand BC and DR from a technological standpoint

kellsie
Download Presentation

BUSINESS CONTINUITY AND DISASTER RECOVERY STRATEGIES

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. BUSINESS CONTINUITY AND DISASTER RECOVERY STRATEGIES Giulio Brenna Systems Engineers manager Specialists Team

  2. Workshop Objectives • Explain Why Customers need a BC/DR Strategy • Explain Capabilities, Complexity, and Choice • Understand BC and DR from a technological standpoint • Describe the main EMC Solutions for BC/DR

  3. BC & DR Drivers

  4. Why You Should Care A study from research firm Frost & Sullivan estimates that North American Business Continuity and Disaster Recovery spending will reach $23.3 billion by 2012. That is up more than 50 percent from $15.1 billion in 2006. "We are seeing increased concern from small and mid-sized enterprises about how they protect their data,” October 2009

  5. Business Continuity and Disaster Recovery Decision Drivers Technical Considerations Business Considerations Cost Consistency and Recovery Functionality, Availability Capacity Bandwidth Recovery-Time Objectives Performance Recovery-Point Objectives PRIMARY DECISION DRIVERS

  6. The Cost of Downtime Per Hour By Industry Investments Manufacturing Telecom Banking Transportation Retail Insurance $0 $100,000 $200,000 $300,000 $400,000 Source: AMR Research

  7. The Impact of Business Continuity Productivity Impact Revenue Impact • Direct + Indirect losses • Compensatory payments • Lost future revenue • Employees affected • Email ! • Systems Financial Impact Brand Impact • Customers • Suppliers • Financial markets • Banks • Business partners • The Media • Revenue recognition • Cash flow

  8. Business Continuity Considerations • What are your company’s most critical processes and data needs? • How much data can you afford to lose? • How quickly do you need to restore your critical processes? • How vulnerable are your operations to disasters?

  9. Events that Impact Information Availability Events that require a data center move: <1% of occurrences • Examples? Unscheduled events/failures: 15% • Examples? Scheduled events/competing workloads: 85% • Examples?

  10. Events that Impact Information Availability Events that require a data center move: <1% of occurrences • Disaster events • Fire, flood, storms, etc. • Data center move or relocation • Workload relocation Unscheduled events/failures: 15% • Server failure • Application failure • Network / storage failure • Processing or operator error Scheduled events/competing workloads: 85% • Maintenance and migrations • Backup and restore • Batch processing • Reporting • Data warehouse extract, build, and load

  11. Terminology

  12. A Key Differentiation Understand the difference betweenDisaster Recovery (DR) and Business Continuity (BC) • Disaster Recovery: Restoring IT operations following a site failure • Business Continuity: Reducing or eliminating application downtime • Disaster Avoidance: availability to predict and take actions to avoid Disaster Impact

  13. The Evolution of Availability Continuous Availability Application continues with no disruption (Zero Downtime) High Availability In-data-center Application Restart Convergence Advanced Recovery Replication to Second Site Traditional Disaster Recovery Tape Backup and Offsite Rotation

  14. Protecting Information is a Business Decision Recovery-point objective (RPO): How much data can you afford to lose, can you determine a sync point Recovery-time objective (RTO): How long can you afford to idle your business and survive? Fast recovery times enable continuous business operations Slow recovery times—or data loss—translates into Business Recover Plan and procedures Technology Failover Data Lost Notification Plan Activation System Restart Service Availability Recovery Point Objective Recovery Time Objective

  15. Balancing Business Requirements and Cost Auto-mation Backup Replication Replication Backup Cost of Data Availability Cost of System Availability Cost of Data Loss Cost of System Downtime CriticalApplication Business Application $ $ Time= 0 RPO RTO The maximum acceptable length of time that can elapse following an interruption to the operations of a business function before its absence severely impacts the organization The point in time to which critical data must be restored to following an interruption before its loss severely impacts the organization

  16. Basic Replica Tipologies Active – Active access NEW Costs Replica syncronous replication with zero data loss Acces Anywhere Synchronous Mirroring Mirroring replication with different levels of possible data loss Semi-Synchronous Mirroring Remote Replication Transactions Remote replication of data based on time intervals or events Classic Backup with physical tape transportation Stand-By Database Remote Journaling Electronic Vaulting Traditional Backup Recovery Point Objective Zero data loss 12 h ss 48 h 24 h 8 h 4 h 2 h m

  17. Replica protocols

  18. Latency • Latency in dark fiber is ~ 5ns/m or 5us/km (One 10km link can have 50us latency) • Worst ….. A round‐trip time (RTT) can be 100us • Latency over SONET/SDH is higher • Latency over IP networks is generally much higher • Latency directly impacts application performance: • Increased idle‐time while application is waiting for read data • Increased idle‐time while application is waiting for write acknowledgement • Reduces I/Os per second (IOPS)

  19. Protection% Active – Active access NEW Costs Replica syncronous replication with zero data loss Acces Anywhere Synchronous Mirroring Mirroring replication with different levels of possible data loss Semi-Synchronous Mirroring Remote Replication Transactions Remote replication of data based on time intervals or events Classic Backup with physical tape transportation Stand-By Database Remote Journaling Electronic Vaulting Traditional Backup Recovery Point Objective Zero data loss 12 h ss 48 h 24 h 8 h 4 h 2 h m

  20. Protection% Active – Active access NEW Costs Replica syncronous replication with zero data loss Acces Anywhere Synchronous Mirroring Mirroring replication with different levels of possible data loss Semi-Synchronous Mirroring Remote Replication Transactions Remote replication of data based on time intervals or events Classic Backup with physical tape transportation Stand-By Database Remote Journaling Electronic Vaulting Traditional Backup Recovery Point Objective Zero data loss 12 h ss 48 h 24 h 8 h 4 h 2 h m

  21. Tape Backup Software Tape Backup Software VTL VTL/Tape Backup Software Deduplication Storage Backup Evolution Over Time From Tape to Disk to Deduplication Backup Applications Onsite Backup Storage DR Storage Traditional Tape Centric Backup Failures Recovery Time Storage Cost Complexity Decrease Transformational Disk Centric Deduplication Backup Software and System

  22. Deduplication is Accelerating the Transition More Efficient Reduced Storage Less Bandwidth

  23. EMC Backup and Recovery Solutions Data Domain NetWorker Data Protection Advisor Avamar Disk Library for Mainframe

  24. Protection% Active – Active access NEW Costs Replica syncronous replication with zero data loss Acces Anywhere Synchronous Mirroring Mirroring replication with different levels of possible data loss Semi-Synchronous Mirroring Remote Replication Transactions Remote replication of data based on time intervals or events Classic Backup with physical tape transportation Stand-By Database Remote Journaling Electronic Vaulting Traditional Backup Recovery Point Objective Zero data loss 12 h ss 48 h 24 h 8 h 4 h 2 h m

  25. Symmetrix Remote Data Facility (SRDF) Family Industry-leading remote replication SRDF Family SRDF/Star Multi-site replication option • Protects against local and regional disruptions • Increases application availability by reducing downtime • Minimizes/eliminates performance impact on applications and hosts • Independent of hosts and operating systems, applications, and databases • Improves recovery point objectives (RPOs) and recovery time objectives (RTOs) with automated restart solutions • Mission-critical proven with numerous testimonials and references • Tens of thousands of licenses shipped SRDF/S Synchronous for zero data exposure SRDF/CE Cluster Enabler option SRDF/A Asynchronous for extended distances SRDF/AR Automated Replicationoption SRDF/CG Consistency Groups SRDF/DM Efficient Symmetrix-to-Symmetrix data mobility Cascaded SRDF and SRDF/EDP Extended Distance Protection ConcurrentSRDF Concurrent EMC offers choice and flexibility to meet any service level requirement

  26. SRDF Synch Virtual Virtual machines Production Virtual R1 Data R1 Data SRDF links TimeFinder R2 Boot R2 Boot Test and development TimeFinder Primary Secondary Provides disaster restart of remotely replicated devices and can be used for offsite backup operations using EMC TimeFinder

  27. R2 WAN SRDF/A Delta Set Push Operation Source Target 1 3 3 N Capture N–1 Transmit N–1 Receive N–2 Apply WAN 2 4 4 • SRDF/A write I/O cycle number assigned as part of capture cycle (N) • SRDF/A write I/O acknowledged back to host as local write operation • SRDF/A write I/O cycle number is part of transmit/receive cycle (N–1) • SRDF/A write I/O acknowledged from target and removed from transmit cycle (N–1) on source Capture to transmit cycle switch initiated based on cycle switch time interval setting with N–1 and N–2 cycles completed

  28. R11 SRDF Advanced Three-Site SRDF/Star Solution Site A Workload Site Reconfigure dynamic SRDF devices at Site A to Concurrent SRDF mode and start SRDF/A session from Site A to C SRDF/A SRDF/S SRDF/A R2 R2 Extended intersite link outage occurs between Sites B and C Site B Local or Regional Site Site C Out-of-Region Site

  29. EMC RECOVERPOINT FAMILY One way to protect everything better

  30. What Is EMC RecoverPoint? One way to protect everything better RecoverPoint family • Protects any physical or virtual host, application, or storage • Provides affordable data protection • Uses a DVR-like point-in-time recovery • Supports policy-based synchronous and asynchronous replication • Supports Block and NAS CDP: local protection CRR: remote protection CLR: concurrent local and remote protection

  31. Any Point-in-Time Recovery RecoverPoint for continuous protection • Recover to any point in time • Annotate selected recovery points with bookmarks • Continue replication during recovery • Use recovered image for a variety of purposes Daily Backup: Recovery point is once every 24 hours Snapshots: Recovery point is once every 8 hours Disk Mirroring: Recovery point is latest image replicated Continuous Protection: Recovery to any point in time UNLIMITED RECOVERY POINTS, APPLICATION BOOKMARKS Time Check-point Pre-Patch Patch Post-Patch Cache Flush Hot Backup Check-point

  32. Protection% Active – Active access NEW Costs Replica syncronous replication with zero data loss Acces Anywhere Synchronous Mirroring Mirroring replication with different levels of possible data loss Semi-Synchronous Mirroring Remote Replication Transactions Remote replication of data based on time intervals or events Classic Backup with physical tape transportation Stand-By Database Remote Journaling Electronic Vaulting Traditional Backup Recovery Point Objective Zero data loss 12 h ss 48 h 24 h 8 h 4 h 2 h m

  33. Mobility. Availability. Collaboration.

  34. The Unique Value of VPLEX Access Anywhere • Access the SAME information… • in SEPARATE locations … • all at the SAME time…

  35. Active-Passive Data Access Before VPLEX Site A Site B Active-Passive Site Data on disaster recovery site is used on failure Outage to moveapplications SYNCHRONOUS/ASYNCHRONOUS REPLICATION

  36. Federated Data Access With VPLEX Site A Site B Active-Active Site VPLEX Metro or VPLEX Geo DISTRIBUTED VIRTUAL VOLUME TRANSFER PROTOCOL VPLEX enables active use of resources at two sites

  37. The VPLEX Family of Products NEW Geo Metro Local Within a data center AccessAnywhere at synchronous distances AccessAnywhere at asynchronous distances

  38. Application Consistency Which type of consistency will be required for the applications that you’re going to protect ? Crash Consistency This is the equivalent of pulling the power from a server while the applications are running, and then powering up the server again. Replication solutions that have limited knowledge of the applications are easier to put together. During recovery you are reliant on the application’s capability to start up on its own merits, or possibly with some intervention. Following a fail-over, the data will not have transactional consistence, if transactions were in-flight at the time of the failure. In most cases what occurs is that once the application or database is restarted, the incomplete transactions are identified and the updates relating to these transactions are “backed-out” or some extra procedures or tools may be required. Application Consistency There are ways of ensuring that if a copy is taken, or if a system is shut down, all necessary transactions within a database are complete and caches are flushed inorder to maintain consistency. Scripts can be written, following best practice for each application to ensure processes take place in a certain order, or there are applications which can automate these procedures for each application. Some technologies use agents which are application specific. The choice is again down to importance of data, RPOs, RTO’s and the available budgets within the organisation.

More Related