
EMC Next Generation Backup & Data De-Duplication High Level Overview and Strategy

Joe Staiber, EMC Corporation, Data De-Duplication Product Manager, Backup, Recovery and Archive Division.


Presentation Transcript


  1. EMC Next Generation Backup & Data De-Duplication High Level Overview and Strategy Joe Staiber EMC Corporation Data De-Duplication Product Manager Backup, Recovery and Archive Division

  2. Typical Issues with Traditional Backup: Long Backup Windows Affecting Production; Tape Cost; License Cost; Cost to use Disk Technology; Client Licensing; VMWare Resources; VCB Infrastructure; Tape Drive Failure; Tape Read/Write Errors; Tape Drive Maintenance; Intraday Restore needs; Retention; Backup Servers; Server Cost / Licensing; VM Guest Proliferation; Off-site storage; Iron Mountain / Transport; Tape Rotation and Changes; Restore times; Restore complexity; Multiple Solutions; Remote office backup; DR / Business Continuity; GROWTH / TIME

  3. What is Data De-Duplication? – An Analogy • How many times does the word “THE” appear in a sentence, a chapter, an entire book, a library? • Data is not unlike words in print, only instead of words, data uses strings of 1’s and 0’s. • A book may contain 4 million words, but only 200,000 different words; the other 3.8 million words are repeats, some of them hundreds or thousands of times. • The amount of de-duplication possible in your data center is in line with these numbers…. It's staggering. “Would you rather copy 200,000 or 4 million words every day?”
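The word analogy above can be tried directly. The following is a minimal Python sketch, not anything taken from the presentation: the function name `word_dedup_stats` and the sample sentence are made up for illustration, and real de-duplication works on data segments rather than words.

```python
import re


def word_dedup_stats(text: str) -> tuple[int, int]:
    """Return (total_words, unique_words) for a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(words), len(set(words))


# Hypothetical sample sentence, chosen only to show repeats.
sample = "the cat sat on the mat and the dog sat on the rug"
total, unique = word_dedup_stats(sample)
print(f"{total} words, {unique} unique")

# The slide's own numbers: 4 million words, 200,000 distinct ones.
book_total = 4_000_000
book_unique = 200_000
book_repeats = book_total - book_unique   # 3.8 million repeated words
book_ratio = book_total / book_unique     # 20 words stored as 1
print(f"{book_repeats} repeats, ratio {book_ratio:.0f}:1")
```

With the slide's figures, copying only the distinct words means handling one twentieth of the volume, which is the point of the closing question on the slide.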

  4. How it Works: Simple Example of Global, Source Data De-duplication • First Instance: Only unique data segments are backed up • Duplicate Instance: Data already backed up, so only unique IDs stored (20 byte pointers) • Modified Instance: New data segment identified and backed up [Diagram: Remote Site 1, Remote Site 2 and the Data Center send segments A B C D E to the De-duplication Server (stored backup data)]
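The first, duplicate and modified instances on this slide can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not Avamar's implementation: the `DedupStore` class is hypothetical, segments are a fixed 4 bytes for readability (the deck later says Avamar uses variable-length segments), and SHA-1 is assumed as the segment ID only because its digest happens to be 20 bytes, matching the "20 byte pointers" mentioned on the slide.

```python
import hashlib

SEGMENT_SIZE = 4  # bytes; tiny fixed size, for illustration only


class DedupStore:
    """Toy global de-duplication server (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.segments: dict[bytes, bytes] = {}  # segment ID -> segment data

    def backup(self, data: bytes) -> list[bytes]:
        """Store only unseen segments; return the list of 20-byte IDs."""
        recipe = []
        for i in range(0, len(data), SEGMENT_SIZE):
            segment = data[i:i + SEGMENT_SIZE]
            seg_id = hashlib.sha1(segment).digest()  # 20-byte ID (assumed hash)
            if seg_id not in self.segments:
                self.segments[seg_id] = segment      # new data: store it
            recipe.append(seg_id)                    # always record the pointer
        return recipe

    def restore(self, recipe: list[bytes]) -> bytes:
        """Rebuild the original data from its list of segment IDs."""
        return b"".join(self.segments[seg_id] for seg_id in recipe)


store = DedupStore()
first = store.backup(b"AAAABBBBCCCCDDDD")      # first instance: 4 new segments
after_first = len(store.segments)
duplicate = store.backup(b"AAAABBBBCCCCDDDD")  # duplicate: pointers only
after_duplicate = len(store.segments)
modified = store.backup(b"AAAABBBBCCCCEEEE")   # modified: 1 new segment
after_modified = len(store.segments)
print(after_first, after_duplicate, after_modified)
```

In a source-based design the ID calculation would run on the client, so a duplicate segment costs a 20-byte pointer on the wire instead of the segment itself; the sketch keeps everything in one process for brevity.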

  5. Where can De-Duplication Occur? • IT’S NOT JUST IN BACKUP!!!!!! • De-Dup is theoretically possible ANYWHERE • But it comes with a price…. Processing, latency, bandwidth, and most importantly TIME Who does the actual processing? • Storage Array? • Software? • Backup Server? • Tape Device / VTL? De-duplication Device Backup Server

  6. De-Duplication Concepts: Prominent Use Cases. Where is De-Duplication being applied today? • Backup: address significant inefficiency & cost due to redundant data (integrated end-to-end backup software stack; B2D H/W target component for incumbent backup environments) • Archive Applications and Platforms: efficient retention over time (low cost, “acceptable performance” secondary storage for mid-term retention, where regulatory compliance is not required; as an efficiency feature in compliant archive, e.g. Centera) • Primary Storage: “Capacity Optimized” ILM tier (block and file for tier 2 applications; different performance and cost characteristics) • Replication: save bandwidth & time by moving less data (inherent in most storage use case solutions; also found in WAAS/WAFS solutions)

  7. De-Duplication in PRIMARY Storage will Change the GAME !!!! Technologies like Flash drives and NAS subfile de-dup are HERE. Celerra CLARiiON Invista NSX Connectrix EMC Centera NS80 NS40 NS20 Symmetrix CX3 UltraScale Series DL4x00 EMC Disk Library NS40G AX150 NS80G Fiber Channel and iSCSI DMX-4 950 EMC Centera Gen 4 LP Node New DL6000 DL210 DMX-4 and DMX-3 Flash Drives

  8. Different Vendors De-Dup in Different Places EMC, HP NetApp, IBM Etc etc Let's look at the Vendors who play in this equation • What happens when the backup application does the de-dup? (such as Commvault) • Do we need DD or Exagrid to do it again? No, we don't • What happens when the primary SAN does it? (NetApp & EMC Celerra) • Do we need Commvault or DD to do it again? No, we don’t. And if they did, they would have to “un-dedup” (rehydrate) the data to even be able to read it!!! Primary Storage SAN/NAS Backup Application Data De-dupe Symantec Commvault Etc etc Data De-dupe Data Domain Exagrid, Quantum Etc etc Target Device SOFTWARE BASED DE-DUP Commvault / PureDisk TARGET BASED DE-DUP Data Domain / Exagrid

  9. BUYER BEWARE!!!!! • EMC IS THE ONLY COMPANY THAT MANUFACTURES PRODUCTS IN EVERY SECTION OF THE DE-DUPLICATION MARKET • EMC is ready and capable of leveraging de-duplication across the spectrum • What happens to vendors like Data Domain and Commvault, when the data is already de-duplicated??? • Other vendors see De-Dup as a product, not a technology… Primary Storage SAN/NAS Data De-dupe Backup Application Data De-dupe Data De-dupe Target Device

  10. What is Most Impactful to You TODAY? Backup is still the best and most efficient application for De-Dup today • It is proven and available • It is out of the production window • There are several ways to De-Duplicate data in a backup environment • But first, let's define the backup challenge we all are facing… TARGET DE-DUPLICATION SOURCE DE-DUPLICATION B B B B B B De-duplication Device De-Dup Device Backup Server

  11. Backup De-duplication – Media Impact Traditional Backup v. EMC Avamar Cumulative Media Required Traditional Backup Traditional Backup w/Compression (2:1) EMC Avamar 8 weeks 4 weeks Avamar Avamar makes backup to disk more economical

  12. The Avalanche Would you rather stop the avalanche here? The Goal is to De-Duplicate as close to the SOURCE as possible Or here?

  13. The Power of Avamar and De-Duplication: 70 Hour backup down to 4 Hours; 400 servers backed up in 5 hours over T1 or less bandwidth; Eliminated Tape; Eliminated 40 backup servers; Improved Restore times; Centralized all Backup Operations; 300GB of backup stored in 10GB; 99.8% de-dup rate in Windows; 99% de-dup rate in SQL. What has Avamar resulted in for Customers: • 10x Faster Backups • 500:1 reduction in network bandwidth • 50:1 reduction in backup infrastructure • Elimination of off-site tape storage
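The slide mixes de-dup rates (percent eliminated) with N:1 ratios, and the two are related by simple arithmetic. The sketch below shows the conversion; the helper names `dedup_ratio` and `dedup_rate` are invented here, and reading "99.8% de-dup rate" as "99.8% of the data is not stored or sent" is an assumption about how the figures were measured.

```python
def dedup_ratio(rate_percent: float) -> float:
    """Convert a de-dup rate (percent of data eliminated) to an N:1 ratio."""
    return 100.0 / (100.0 - rate_percent)


def dedup_rate(logical_gb: float, stored_gb: float) -> float:
    """Percent of data eliminated when logical_gb is kept in stored_gb."""
    return 100.0 * (1.0 - stored_gb / logical_gb)


windows_ratio = dedup_ratio(99.8)      # about 500:1
sql_ratio = dedup_ratio(99.0)          # about 100:1
example_rate = dedup_rate(300, 10)     # 300GB stored in 10GB
example_ratio = 300 / 10               # 30:1

print(f"Windows: about {windows_ratio:.0f}:1")
print(f"SQL: about {sql_ratio:.0f}:1")
print(f"300GB in 10GB: {example_rate:.1f}% eliminated, {example_ratio:.0f}:1")
```

Under that reading, a 99.8% rate works out to roughly 500:1, the same figure the slide quotes for network bandwidth reduction, while the 300GB-in-10GB example is a more modest 30:1.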

  14. BC & Disaster Recovery: 95+% Less • Primary Data Storage: 50TB • Daily Cumulative: 8TB • Weekly Cumulatives: 48TB • Weekly Full Backups: 50TB • Total: 98TB • Primary Data Storage: 50TB • Axion Daily Snapups: .5TB • Axion Weekly Snapups: 3.5TB • Weekly Full Backups: N/A • Total: 3.5TB • 70 hour staged full backup window reduced to 4 hours • Cost-effective replication to two sites “Avamar has a game changing solution. Through their innovative technology, we have been able to rethink our backup, recovery and replication infrastructure, providing Morgan Stanley with better local and remote recovery at a greatly reduced TCO.” - Guy Chiarello, CTO/CIO, Morgan Stanley
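The weekly volumes on this slide can be checked with a few lines of arithmetic. This sketch uses only the slide's numbers; treating the 48TB of weekly cumulatives as six 8TB days and the 3.5TB of weekly snapups as seven 0.5TB days is an assumption that happens to reproduce the slide's totals.

```python
# Traditional backup, per week (figures from the slide, in TB)
weekly_full_tb = 50.0
daily_cumulative_tb = 8.0
weekly_cumulatives_tb = daily_cumulative_tb * 6   # assumed: six cumulative days
traditional_weekly_tb = weekly_full_tb + weekly_cumulatives_tb

# De-duplicated backup, per week
daily_snapup_tb = 0.5
avamar_weekly_tb = daily_snapup_tb * 7            # assumed: seven daily snapups

reduction_pct = 100.0 * (1.0 - avamar_weekly_tb / traditional_weekly_tb)

print(f"Traditional: {traditional_weekly_tb:.0f} TB per week")
print(f"De-duplicated: {avamar_weekly_tb:.1f} TB per week")
print(f"Reduction: {reduction_pct:.1f}%")
```

The result, a little over 96% less data moved per week, is consistent with the slide's "95+% Less" headline.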

  15. Expected Results for Manufacturing Co. Backup Exec: • Current Full Backups 5 TB • Backup Window 28 hrs • Media Used (1 year) = 106 TB. Avamar: • Current Full Backups 5 TB • Backup Window 1.2 hrs • Media Used (1 year) = 6 TB • 28 hour staged full backup window reduced to 1.2 Hours • Cost savings estimated at $23,851 for 3 Years • New Functionality, Centralized, Faster Backups, Streamlined. Avamar starts at 17k and goes up from there based on Capacity and Retention periods

  16. Avamar Customers (Notable): Verizon Wireless Home Depot The Limited Ann Taylor GE Cardinal Health Nationwide Pepsi AT&T VMWare Nexon Travelers Corporate Express Wellesley College Sterilite CRI Technologies Danvers Bank CISCO Arizona Dept of Education PPG Medco Kelley Drye & Warren Brooks Automation Chrysler Morriston Forester Lexis Nexis Kroger Baker & McKenzie Rob Roy Reckitt Benckiser Steamship Authority La Quinta 21st Century Oklahoma Turnpike • Churchill Downs • New Albany • Bank of New York • Dell • City of Kirkland • Univ of CA • Kiewit • Komatsu • Iowa Dept of Transportation • Duoline • Citizens Bank • Nodaway Bank • Plymouth Rock • Chadwick Martin • Auto Owners • Mile High Banks • Farallon

  17. Avamar Installed Customers (Commercial): Montgomery County Public Schools Howard County Public Schools Country Meadow Associates Arraya Solutions IPR Evolve IP HydroGeneLogic American Healthways NetTelCos Expedient ADLCM Kirklands Retail Restaurant Services Inc SEA Medical Center DCH Informed Medical Orange Lake Resorts Welbro Construction FCCJ Seminole Community College Avocent Debartolo Properties First Bank GPX Leesburg Regional Hospital Northside Hospital Lithonia Lighting Manatee County Palm Beach County Parker Hudson Rainer & Dobbs LLC Miles & Stockbridge Reynolds Smith and Hills Sarasota County Clerk Satilla Regional Medical Southern Bone & Joint Success For All CGI Mecklenburg County Barlowworld ABNB Federal Credit Union Wunderlich Microstrategy Eastern Regional

  18. Where is Avamar MOST common? • Avamar is used in nearly every industry • Every type of infrastructure • Across most platforms. Its biggest Value comes in areas where backup time / bandwidth are limited: • Remote Offices / Branch Offices • Data Centers / Enterprise Backup Management • VMWare & File Systems • NAS

  19. WAN WAN Remote Office Backup Via WAN Without Avamar Clients With Avamar Clients Data De-dupe Central Data Center De-dupe Server Data De-dupe Challenges • WAN blockage • Poor reliability • Decentralized • Untrained IT staff Data De-dupe Advantages • Automated • Encrypted • Centralized • Outstanding ROI • Target approach requires hardware at every site

  20. Real Example from Avamar: MD Public School System (WAN)

  21. Virtualization Creates New Backup Challenges OLD PARADIGM Low overall utilization and plenty of bandwidth for backup NEW PARADIGM High overall server utilization, but low bandwidth for backup

  22. Backup Built for VMware Infrastructure Avamar Efficiently Protects Virtual Machines Traditional moves ~200% weekly • Up to 95% reduction in data moved • Up to 90% reduction in backup times • Up to 50% reduction in disk impact • Up to 95% reduction in NIC usage • Up to 80% reduction in CPU usage • Up to 50% reduction in memory usage • All backups stored as “virtual full backups,” ready for immediate restore • Maintain effective consolidation ratios without over-taxing CPU utilization Avamar moves ~2% weekly

  23. EMC Avamar Solutions for VMware Infrastructure Flexible, Fast, Efficient and Reliable Backup and Recovery AVAMAR CLIENT BACKUP SOLUTIONS AVAMAR SERVER BACKUP SOLUTIONS Guest VCB Avamar Software Avamar Virtual Edition Service Console Avamar Data Store

  24. Lightweight Agents / Reduced CPU Utilization Total CPU Utilization by Event (Time Elapsed) Full Avamar: Efficient Full Backups Incrementals Traditional Incremental + Full Backups Avamar: Efficient Full Backups • Avamar reduces backup times by up to 90% weekly • CPU utilization slightly higher during backup operation (~15%) • Reduced time = weekly CPU utilization reduced by up to 85% • Avamar backups set in “nice mode” or low priority: minimizes CPU contention

  25. EMC Avamar Data Store Gen 2 SUSTAINABLE GRID (RAIN) TECHNOLOGY • Avamar Data Store • Multi-node configuration starts at 4 TB and scales to support up to 32 TB licensable de-duplicated capacity • Equivalent of up to 1.1 PB of cumulative traditional disk or tape backup storage* • Backup media requirement reduced 20–40 times • High availability and reliability with RAIN architecture, RAID, daily integrity checks, and redundant power • Avamar Data Store, Single Node • Supports 1 TB and 2 TB licensable de-duplicated storage capacity configurations • Equivalent of up to 70 TB of cumulative traditional disk or tape backup storage* • Designed for easy deployment at remote offices • Offers fast, local recovery without dependence on a WAN connection *Note: Equivalent traditional backup capacity assumptions: 100 percent MS Office file data, weekly full and daily incremental backups, no compression, 10 percent daily change rate, 90-day retention

  26. Avamar’s Major Competitive Differentiations: Who is REALLY less expensive?

  27. Why an Integrated Solution? • HARDWARE ONLY SOLUTIONS: (Data Domain / Exagrid) • As software is now performing the De-Duplication, the hardware de-duplication is NO LONGER required (This is now the case with Symantec, Commvault v8, and Avamar) • SOFTWARE ONLY SOLUTIONS: (Symantec, Commvault, etc) • As primary storage arrays begin to utilize Data De-Duplication technologies, the Backup Software is not aware and its value diminishes if the data is already in a De-Duplicated state. Re-Hydration would be required. (This is already the case with NetApp and Celerra NAS based De-Dup and there are more to come) • EMC is the ONLY vendor in the De-Duplication space that manufactures Primary Storage, Backup Software and Backup Hardware. • Regardless of where the de-duplication occurs, EMC is ready and capable to leverage and optimize it. • EMC|Avamar is the only vendor to utilize variable length segments when de-duplicating data. It will ALWAYS store less, send less and backup faster! “What vendor do you want to make a strategic investment in?” Ask these vendors what their strategy is, as data is already de-duplicated before it gets to their product…

  28. The Economics of Backup & Recovery. Avamar Example: $90,000 Investment; 3 years all inclusive (HW, SW, Maint); No recurring tape spend; All client software/agents incl.; Offsite replication included; All data retained on disk and all media included for the 3 years of retention; 20% growth rate of data was factored into the system; Backup window reduced by 300%; Restore times improved; Time to first byte of restore within minutes; TOTAL COST = $90,000. Traditional Backup Solution: $35,000 Investment; 1 year included HW and Maint; No software, use existing; $9k per year in maint; $3,500 per year for new clients; $9,000 for VCB SW (vRanger) + a server; $2,700 per year for additional media; $11,500 per year in offsite costs; No data growth factored in; New media / upgrade required year 2 at $18k. New server too?; HW maint of $12k years 2 and 3; No significant backup improvements; TOTAL COST = $166,100
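The traditional-solution total on this slide can be reproduced by adding up its line items over three years. The sketch below is a reconstruction, not the presenter's worksheet: it assumes every "per year" item is charged in all three years and that the $12k hardware maintenance applies once each in years 2 and 3. Those assumptions are labeled in the comments, and they do sum to the slide's $166,100.

```python
YEARS = 3  # comparison period used on the slide

# Line items from the slide; multipliers are assumptions (see lead-in).
traditional_costs = {
    "initial investment (HW + 1 yr maint)": 35_000,
    "maintenance, $9k per year": 9_000 * YEARS,        # assumed all 3 years
    "new clients, $3,500 per year": 3_500 * YEARS,     # assumed all 3 years
    "VCB software (vRanger) + server": 9_000,
    "additional media, $2,700 per year": 2_700 * YEARS,  # assumed all 3 years
    "offsite costs, $11,500 per year": 11_500 * YEARS,   # assumed all 3 years
    "new media / upgrade in year 2": 18_000,
    "HW maintenance, years 2 and 3": 12_000 * 2,
}

traditional_total = sum(traditional_costs.values())
avamar_total = 90_000  # all-inclusive 3-year figure from the slide
savings = traditional_total - avamar_total

for item, cost in traditional_costs.items():
    print(f"{item:40s} ${cost:>8,}")
print(f"{'Traditional total':40s} ${traditional_total:>8,}")
print(f"{'Avamar total':40s} ${avamar_total:>8,}")
print(f"{'Difference over 3 years':40s} ${savings:>8,}")
```

Laid out this way, the recurring items (offsite storage, maintenance, media) account for most of the gap, which is the argument the slide is making against comparing purchase prices alone.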

  29. Avamar SOLVES issues: Long Backup Windows Affecting Production; Tape Cost; License Cost; Cost to use Disk Technology; Client Licensing; VMWare Resources; VCB Infrastructure; Tape Drive Failure; Tape Read/Write Errors; Tape Drive Maintenance; Intraday Restore needs; Backup Servers; Server Cost / Licensing; VM Guest Proliferation; Off-site storage; Iron Mountain / Transport; Tape Rotation and Changes; Restore times; Restore complexity; Multiple Solutions; Remote office backup; DR / Business Continuity; GROWTH / TIME

  30. Intuitive, Policy-Based Management Console

  31. In Summary • The File System and VMWare benefits of Source De-Dup alone justify the investment • You can start SMALL with Avamar (single use) and grow it easily into a fully integrated enterprise solution • Source Based De-Duplication makes the Difference; beware of the values of a Target Based De-Dup • Competitive Solutions around De-Dup have value, but understand the differences. They are Band-Aids, not long-term solutions • EMC has the ONLY broad based De-Dup strategy that will grow and continue to add value as De-Dup stretches into new areas

  32. Next Steps • Live Demos provided every FRIDAY at 11am EST • Performed by an Avamar Engineer • Live via Web • Ask questions, see the product in action • Solution Sizing • How much data is transferred in a full backup today? • What % of data is FS/Exchange/DB/Images/VMWare? • Retention periods on disk • Replication? • Avamar Virtual Demo • Configuration / Pricing / Cost Justifications • Commonality Analysis • Proof of Concept / Evaluations

  33. where YOUR information should live where PRIMARY information lives where TIERED information lives where VIRTUAL information lives where BACKUP information lives where REPLICATED information lives where DE-DUPLICATED data lives where ARCHIVED data lives
