
EMC Symmetrix VMax – VP, FAST VP for Oracle DBs



Presentation Transcript


  1. EMC Symmetrix VMax – VP, FAST VP for Oracle DBs Technical Presentation

  2. Topics to Discuss • Foundational Topics • I/T & Application DBA Challenges (a few of them…) • Quick Overview -- VMax Architecture and I/O Level Set • Flash Drives • Virtual Provisioning • Fully Automated Storage Tiering (FAST) • ASM, VP, FAST VP Testing

  3. I/T & Application DBA Challenges (a few of them…) • DBA & system administrators need to do more with less • Databases continue to grow in size • Old data not being deleted from DBs & sitting on Tier 1 storage • Database availability requirements increasing • Increased CPU, memory, and storage utilization • Same or fewer resources to do the work • Proactively improving performance a second priority to fighting fires? • Active data changes over time • Difficult to control user queries. Source: Gartner, “User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011,” October 2010

  4. Database Growth Management Challenges… Dormant Data
Tablespace IO Stats:
Tablespace           Reads  Av Reads/s  Av Rd(ms)  Av Blks/Rd  Writes  Av Writes/s  Buffer Waits  Av Buf Wt(ms)
DW_TBL_DEBIT_P04     2,772      3          1.3        1.0          0        0             0            0.0
DW_TBL_DEBIT_P03     2,646      3          1.2        1.0          0        0             0            0.0
DW_TBL_CLEARDM_P16   2,637      3          8.6        1.0          2        0             0            0.0
DW_IDX_CLEARDM_P22   2,489      3          5.8        1.0          0        0             0            0.0
DW_IDX_CLEARDM_P21   2,448      3          4.2        1.0          0        0             0            0.0
DW_TBL_CLEARDM_P17   1,829      2          9.0        1.0        375        0             0            0.0
DW_TBL_CLEARDM_P15   2,121      2          9.0        1.0          0        0             0            0.0
DW_IDX_CLEARDM_P19   1,881      2          3.4        1.0          0        0             0            0.0
This I/O activity occurred for 2+ pages of the AWR & over many continuous snap periods.
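
The AWR-style stats above can drive a simple dormancy check. The sketch below is illustrative only (the function, class, and thresholds are assumptions, not an EMC or Oracle tool): it flags tablespaces whose read and write rates are low enough to justify moving them off Tier 1 storage.

```python
# Hypothetical sketch: flag near-dormant tablespaces from AWR-style
# "Tablespace IO Stats" rows. Names and thresholds are illustrative.

from dataclasses import dataclass

@dataclass
class TablespaceIO:
    name: str
    reads_per_s: float
    writes_per_s: float

def dormant_candidates(stats, max_reads_per_s=5.0, max_writes_per_s=0.5):
    """Return tablespaces whose I/O rates are low enough that they are
    candidates for down-tiering off Tier 1 storage."""
    return [t.name for t in stats
            if t.reads_per_s <= max_reads_per_s
            and t.writes_per_s <= max_writes_per_s]

stats = [
    TablespaceIO("DW_TBL_DEBIT_P04", 3, 0),     # from the AWR excerpt
    TablespaceIO("DW_TBL_CLEARDM_P17", 2, 0),   # from the AWR excerpt
    TablespaceIO("ORDERS_HOT", 850, 120),       # hypothetical busy tablespace
]
print(dormant_candidates(stats))  # the two DW_* tablespaces, not ORDERS_HOT
```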

  5. …Database Growth Management Challenges
• Application-driven approaches to address growth are not easy
  • Archiving
    • Requires a dedicated project and the application owners' cooperation
    • Reclaiming space is not easy
    • Software can be expensive relative to the disk space savings
  • Purging
    • Steep resistance from the application teams
    • Government and legal mandates to keep aged or dormant data
• Infrastructure approaches are easier
  • Data reduction strategies (compression)
    • More work for the DBAs
    • CPU / system overhead
  • Storage tiering – it does not reduce the amount of data, but it saves money
• Database approaches using table partitioning and tiering require a good deal of effort from DBAs
• Databases do not recognize dormant data → but storage can

  6. DBA I/O Performance Challenge • Physical disks are getting more data dense • Users demand lower disk cost per GB • Disk vendors respond with larger-capacity drives • Capacity-based purchases result in fewer physical drives and reduced I/O throughput • When I/O throughput is the goal, unused purchased capacity is the result • Very common to see 50% storage capacity utilization with 146GB drives – what happens with 300 or 600GB drives? • Short-stroking drives • Energy company customer quote: “It's clear from our research … that the storage industry is moving in this direction (tiering, FAST, etc.) … and it's clear EMC is the leader.” • Energy company customer quote: “The larger the disk drives continue to get … the only way we will be able to manage the growth … is through automation & tools like EMC's.”

  7. Quick Overview : VMax Architecture and I/O Level Set

  8. Symmetrix VMax Architecture

  9. Symmetrix Cache in Relation to Oracle SGA [Diagram: SGA (fastest) → VMax cache (faster) → disk (fast)]

  10. Cache Usage – Fast Writes (Write Hit)
Write-back cache: 1) the write request is sent to the Symmetrix; 2) the write is instantly acknowledged; 3) the data is de-staged to disk (M1 and M2) later. Typical I/O service time: ~1 ms.
[Diagram: DB server (Oracle instance) → Symmetrix cache → mirrored drives M1/M2]

  11. Cache Usage – Fast Reads (Read Hit)
Data found in cache: 1) the read request is sent to the Symmetrix; 2) the read is instantly acknowledged. No disk access required: I/O < 1 ms.
[Diagram: DB server (Oracle instance) ↔ Symmetrix cache; drives M1/M2 not accessed]

  12. No Cache Usage – (Read Miss)
Data not found in cache: 1) the read request is sent to the Symmetrix; 2) the data is not found in cache; 3) a read request is sent to the drive; 4) the data is staged into cache; 5) the data in cache is sent to the host. Disk access required: I/O ≈ 6–10 ms.
[Diagram: DB server (Oracle instance) → Symmetrix cache → drives M1/M2 → cache → host]
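
The write-hit, read-hit, and read-miss cases can be modeled with a few lines of code. This is a toy sketch, not array behavior: the latencies are the approximate figures from the slides, and the dict-based cache is an assumption for illustration.

```python
# Minimal model of the read-hit / read-miss cases. Latencies are the
# approximate slide figures (read hit < 1 ms, read miss ~6-10 ms),
# not measurements; the dict cache is purely illustrative.

CACHE = {}

READ_HIT_MS = 0.5    # data found in Symmetrix cache
READ_MISS_MS = 8.0   # disk access required

def read(block):
    """Return (data, service_time_ms) for a read request."""
    if block in CACHE:
        return CACHE[block], READ_HIT_MS     # 2) instantly acknowledged
    data = f"<block {block} from disk>"      # 3) read request sent to drive
    CACHE[block] = data                      # 4) data staged into cache
    return data, READ_MISS_MS                # 5) data sent to host

_, t1 = read(42)   # first access: read miss
_, t2 = read(42)   # second access: read hit
print(t1, t2)      # 8.0 0.5
```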

  13. Enterprise Flash Drives • Virtual Provisioning • Fully Automated Storage Tiering (FAST V1) • FAST VP • FAST VP & ASM Testing
Performance Services: • Holistic Performance Analysis • SQL Tuning Workshop • DB Layout
Other Oracle Services: • RAC Planning & Implementation • Oracle EBS Implementations & Upgrades • RMAN/Data Domain Implementations • OBIEE Planning & Implementation • Application Server Implementations • Remote DBA Services

  14. Enterprise Flash Drives

  15. No Cache Usage – (Read Miss)
Data not found in cache: 1) the read request is sent to the Symmetrix; 2) the data is not found in cache; 3) a read request is sent to the drive; 4) the data is staged into cache; 5) the data in cache is sent to the host. Disk access required: I/O ≈ 6–10 ms.
[Diagram: DB server (Oracle instance) → Symmetrix cache → drives M1/M2 → cache → host]

  16. How long will it take to read the whole drive? 8KB random reads: ~22 days vs. ~15 hours
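
The slide's point follows from simple arithmetic: capacity divided by (IOPS × I/O size). A sketch, with assumed capacity and IOPS figures (they are not from the slide, so the results only illustrate the order-of-magnitude gap, not the slide's exact 22-day/15-hour numbers):

```python
# Back-of-the-envelope time to cover a whole drive with small random
# reads. Capacity and IOPS figures are assumptions for illustration.

def days_to_read(capacity_gb, io_size_kb, iops):
    total_ios = capacity_gb * 1024 * 1024 / io_size_kb  # number of reads
    return total_ios / iops / 86_400                    # 86,400 s per day

# A 2 TB SATA drive at ~80 random-read IOPS vs. a 400 GB flash drive
# at ~2,500 random-read IOPS:
print(round(days_to_read(2048, 8, 80), 1))    # tens of days
print(round(days_to_read(400, 8, 2500), 2))   # a fraction of a day
```

Whatever the exact parameters, dividing billions of bytes by a few hundred small random IOPS per second is why capacity-only purchases starve I/O throughput.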

  17. Enterprise Flash Drives (EFD)
• Available for DMX-4 w/Enginuity 5773 & VMax arrays
• 73GB, 146GB, 200GB & 400GB usable capacities
• All versions of Symmetrix RAID protection supported
• Enginuity code optimized & integrated w/Flash drives (e.g., Dynamic Cache Partitioning)
• Flash drives can be protected with TimeFinder & SRDF
• Read-miss I/O comparison:
  • 7200 rpm HDD: ~12 ms response time
  • 10K rpm HDD: ~9 ms response time
  • 15K rpm HDD: ~6 ms response time
  • Flash drive: ~1 ms response time
• RAID-5 rebuilds: with Flash drives configured as RAID 5, rebuilds can be over 40% faster than traditional HDDs and have minimal impact on response time

  18. Virtual Provisioning (VP) & Fully Automated Storage Tiering (FAST) It’s an evolution…

  19. Skewed workload on physical drives → disk contention; spread the load across more disks

  20. Set it & Forget It

  21. Virtual Self-Optimization
• Virtual server self-optimization: balance I/O workloads across the server pool
• Virtual storage self-optimization: workloads are dynamically balanced across virtualized storage to provide high, predictable performance and a lower cost!
[Diagram: 1 TB server pool compressed and deduplicated to ~500 GB in the storage pool]

  22. Virtual Provisioning… Multiple thin devices (“TDEVs”) presented to the host (DB server); storage pool of data devices (“TDATs”). Allocation unit: 768KB Symmetrix “chunks”

  23. …Virtual Provisioning: pool-based approach to storage
• Each pool contains many disks → 100 disks (for example)
• Each pool has one disk drive type / tier → EFD or FC or SATA
• RAID protection is defined for the entire pool → R5 or R1 or R6
• Multiple storage pools created for the enterprise
  • Recommendation is to avoid creating an excessive number of thin pools
• Applications associated to pools → segregate apps that might/will not behave well together
• New devices → thin LUNs (TDEVs) and TDATs
  • TDEVs presented to the host → no space associated with them; they are pointers to the thin pool. TDEV = ASM disk (35 x 60GB LUNs)
  • Still need to allocate a sufficient number of LUNs → they represent the I/O channel to storage
  • DB data is stored in TDATs (data devices, 240GB R5 3+1 RAID group)

  24. …Virtual Provisioning: wide striping across the storage pool
• DB data is broken into 768KB chunks on the Symmetrix
• Chunks are distributed round-robin across the “100” disks → intends to stripe chunks & I/O across the entire pool
• Storage admin time is no longer needed to stripe data across back-end devices
• Add data devices to the pool to meet capacity or IOPS needs → invoke pool rebalance to evenly redistribute chunks
  • No impact to the DB or host → no changes to the LUNs presented to the DB
  • Redistribution is done in the background as a low-priority activity → front-end I/O operations get priority
• Remove data devices from the pool → drain & copy allocated extents to others in the pool
  • Impact & redistribution same as above
• Critical to avoid an out-of-space situation → applications writing to the pool will freeze
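
The round-robin placement and pool rebalance described above can be sketched in a few lines. This is a conceptual model only: the class and method names are invented, and a real array drains and copies extents in the background rather than reshuffling everything at once.

```python
# Conceptual sketch of wide striping: chunks handed out round-robin
# across the pool's data devices; adding a device and rebalancing
# spreads existing chunks evenly again. Names are illustrative.

class ThinPool:
    def __init__(self, n_devices):
        self.devices = [[] for _ in range(n_devices)]
        self._next = 0

    def allocate(self, chunk_id):
        # round-robin chunk placement across the pool
        self.devices[self._next].append(chunk_id)
        self._next = (self._next + 1) % len(self.devices)

    def add_device_and_rebalance(self):
        # real arrays do this incrementally in the background;
        # here we simply redistribute every chunk round-robin
        all_chunks = [c for dev in self.devices for c in dev]
        self.devices = [[] for _ in range(len(self.devices) + 1)]
        self._next = 0
        for c in all_chunks:
            self.allocate(c)

pool = ThinPool(4)
for chunk in range(100):
    pool.allocate(chunk)
print([len(d) for d in pool.devices])   # [25, 25, 25, 25]
pool.add_device_and_rebalance()
print([len(d) for d in pool.devices])   # [20, 20, 20, 20, 20]
```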

  25. FAST Makes It Easy and Automatic
Get the right data to the right place … at the right time. The FAST wizard allows administrators to set up and apply storage tiering in minutes.
Tiers: Flash – Fibre Channel – SATA

  26. Symmetrix VMax FAST in Action: Symmetrix VMax with an active VMware ESX cluster, all Fibre Channel disk drives. Drive resources are ~80% busy

  27. Symmetrix VMax FAST in Action: add Flash drives and apply a FAST policy. Tiered storage: 4% Flash drives, 96% Fibre Channel drives. Result: 68% less drive I/O contention, 2.5-times faster drive response time

  28. FAST V1 – LUN-level Migration
• Online move at LUN granularity inside the storage array, between:
  • Disk types
  • RAID protections
  • Storage tiers
• Migration can be controlled through Symmetrix QoS commands
  • Front-end I/O activity takes priority over back-end I/O activity
• Advantages:
  • Introduce new storage tiers to the DB without host-side changes
    • Change control is always a challenge in production environments (maintenance time, risks)
  • Change application performance without affecting:
    • Filesystem mount points
    • Backup/cloning/DR scripts (no host LUN changes)
  • Maintains TimeFinder and/or SRDF protection and incremental relationships
  • LUN movement uses storage resources → no SAN, host, DB, or application cycles consumed

  29. FAST VP – Sub-LUN-level Migration…
FAST VP main elements: Storage Tiers – Storage Groups – Policies. What are these elements?
• Storage Tiers: a combination of drive technology and RAID protection available in the VMAX array
  • For example: R5/EFD, R1/FC, R6/SATA
• Storage Groups: a collection of Symmetrix host-addressable devices
  • For example: all the devices provisioned to an Oracle DB
• Policies: tie Storage Groups to Storage Tiers and define the configured capacities, as a percentage, that a storage group is allowed to consume on each tier
  • Up-tiers parts of the DB when the “parts” meet the policy
  • Down-tiers parts of the DB when the “parts” meet the policy
  • Will not move parts of the DB if the “parts” already meet the policy
• Data movement is done in 7.6MB units → 10 sequential 768KB chunks
• However, we can pin thin devices (say, the redo logs) to the FC tier
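
The policy element above is essentially a per-tier capacity cap on a storage group. A minimal sketch of that check, assuming nothing about the real SYMCLI interface (the tier names follow the slide's R5/EFD, R1/FC, R6/SATA examples; the percentages and function are illustrative):

```python
# Sketch of a FAST VP-style policy: a cap on the share of a storage
# group's configured capacity allowed on each tier. Percentages and
# the function itself are illustrative, not a real SYMCLI interface.

POLICY = {"R5/EFD": 10, "R1/FC": 100, "R6/SATA": 100}  # max % per tier

def within_policy(allocation_gb, total_gb, policy):
    """allocation_gb: GB of the storage group currently on each tier.
    Returns, per tier, whether the allocation respects the cap."""
    return {tier: allocation_gb.get(tier, 0) <= total_gb * pct / 100
            for tier, pct in policy.items()}

# A 1 TB storage group with 150 GB on EFD, against a 10% EFD cap:
alloc = {"R5/EFD": 150, "R1/FC": 600, "R6/SATA": 250}
print(within_policy(alloc, total_gb=1000, policy=POLICY))
# EFD exceeds its 10% cap; FC and SATA allocations are fine
```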

  30. …FAST VP – Sub-LUN level Migration

  31. Let’s put the pieces together for FAST VP…

  32. FAST VP Setup At a Glance…

  33. …FAST VP Setup At a Glance

  34. …FAST VP Setup At a Glance

  35. FAST VP – Storage Granularity Hierarchy
• Storage Group: the many volumes for an Oracle DB
• Volume/LUN: size is user-defined; example = 50GB
• Extent: size 360MB (139 extents in a 50GB LUN)
• Sub-extent: size 7.6MB (48 in an extent, 6,672 in a 50GB LUN)

  36. Extent Granularity • Extent: the primary focus of FAST VP • Host usage counters are accumulated in 10-minute intervals for each extent, for 3 I/O types • Read misses (RM) • Pre-fetches (PF) • Disk writes (W) • These 3 variables create an interval score = (RM*3) + PF + W • Note: I/Os for creating or maintaining TF Snaps or Clones (as well as other back-end Symmetrix I/O activity) do not count towards the score • Interval scores are added to short-term and long-term scores
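
The interval score formula from the slide is straightforward to express directly; the counter values in the usage example are made up.

```python
# The per-extent interval score from the slide, computed every
# 10 minutes from three host I/O counters. Example values are invented.

def interval_score(read_misses, prefetches, writes):
    # read misses are weighted 3x relative to pre-fetches and writes
    return read_misses * 3 + prefetches + writes

print(interval_score(read_misses=40, prefetches=10, writes=25))  # 155
```

The 3x weight on read misses reflects that they are the I/Os a faster tier actually eliminates: write hits and pre-fetched reads are already absorbed by cache.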

  37. Sub-Extent Granularity • Sub-extent: a very small chunk of capacity & the typical data movement size for FAST VP • Flags are devoted to capturing data accesses on each sub-extent • If there is any read miss, write, or pre-fetch in a 12-hour period, the sub-extent's flags are set appropriately • Each 12-hour period of inactivity causes the flag values to be aged down • After 36 idle hours, they are reset • Never-written or unallocated space is never moved
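
The flag lifecycle above can be sketched as a tiny state machine. The three-level decay is an assumption inferred from the slide (activity sets the flag; 36 idle hours, i.e. three 12-hour periods, fully resets it), not a documented encoding.

```python
# Sketch of a sub-extent activity flag: any read miss, write, or
# pre-fetch in a 12-hour period sets it high; each idle period ages
# it down, and after three idle periods (36 hours) it is reset.
# The three-level decay is an assumption about the mechanism.

FULL = 3  # value right after any activity in the period

def next_flag(flag, had_activity):
    if had_activity:
        return FULL
    return max(flag - 1, 0)   # one step down per idle 12-hour period

flag = 0
for active in [True, False, False, False]:   # 1 busy + 3 idle periods
    flag = next_flag(flag, active)
print(flag)  # 0 -- fully reset after 36 idle hours
```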

  38. FAST VP Process • Every 10-minute interval, each extent's scores under each policy are sorted • Short-term scores are sorted in descending order • The FAST engine determines the cutoff score for the top tier, then goes down and determines the cutoff score for the 2nd tier • Long-term scores are sorted in ascending order • Cutoff scores are determined in each tier for demotion • These cutoff scores are passed to the array µ-code for movement • The array will begin to queue, for promotion, all sub-extents with non-zero flags that exceed the cutoff values • Demotion applies to all sub-extents within the tier
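
The promotion side of that process reduces to sorting scores and cutting where each tier's capacity runs out. A sketch under assumed inputs (the function, the capacities in extents, and the score values are all illustrative):

```python
# Sketch of per-interval promotion cutoffs: sort extent scores in
# descending order and cut where each tier's capacity runs out.
# Function name, capacities, and scores are illustrative.

def promotion_cutoffs(scores, tier_capacity_extents):
    """scores: short-term score per extent. Returns the minimum score
    an extent needs to land in each tier, top tier first."""
    ranked = sorted(scores, reverse=True)
    cutoffs, start = [], 0
    for cap in tier_capacity_extents:
        end = min(start + cap, len(ranked))
        cutoffs.append(ranked[end - 1])   # lowest score still admitted
        start = end
    return cutoffs

# 8 extents competing for 2 EFD extents and 3 FC extents:
scores = [90, 75, 60, 42, 30, 12, 8, 3]
print(promotion_cutoffs(scores, tier_capacity_extents=[2, 3]))  # [75, 30]
```

These cutoff values are then what gets handed to the array, which queues sub-extents exceeding them for movement.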

  39. FAST VP Actions • Short-term score for Promotion provides quick responsiveness by FAST to changing needs • Long-term score for Demotion remembers signs of life in past weeks, days, hours • All FAST moves are performed by the Array, not the FAST engine • If all queued moves are not completed by the end of the 10 minute interval • New cutoff scores for Promotion and Demotion in each tier and policy are calculated and given to the array • The array will finish all moves that were in flight, but after that, the Extents will be evaluated under their newer scores with the newer cutoff values

  40. ASM, VP, FAST VP Testing

  41. ASM DB I/O Analysis – LUN level
• Uniform I/O across the ASM +DATA disks (LUNs); no LUN skewing
• ASM +FRA disk I/O diverges from the +DATA disks, as expected

  42. ASM DB I/O Analysis – Sub-LUN level
• Non-uniform I/O within an ASM +DATA disk
• Hot spots in focused areas of each +DATA disk

  43. Testing Sub-LUN level Movement with ASM DB… • Objectives • Achieve single Oracle ASM DB workload performance optimization by FAST VP • Optimize DB storage allocation across three tiers: SATA, HDD, and EFD • The entire ASM disk group starts on the FC tier, and FAST VP migrates idle portions to SATA and highly active portions to EFD • Approach • Run a transaction-rate baseline & collect the AWR reports • Run the workload with FAST VP enabled, allowing allocations on all three tiers • Review the storage tiering efficiency and performance differences • Note: since an industry-standard benchmark tool was used (Swingbench), the I/O distribution across the database was completely even & random. This reduced sub-LUN skewing (since the whole database was highly active); a second, idle database therefore helped represent a normal environment where some objects are not highly accessed.

  44. …Testing Sub-LUN level Movement with ASM DB

  45. …Testing Sub-LUN level Movement with ASM DB Initial Run FAST VP Run

  46. …Testing Sub-LUN level Movement with ASM DB Database transaction rate with FAST VP Storage tier allocation changes during the FAST VP enabled run

  47. Recent Customer Testing: 11.2.0.2 ASM / RAC / VP / FAST VP Database

  48. …Testing Sub-LUN level Movement with ASM DB Customer Baseline config Customer FAST VP config

  49. …Testing Sub-LUN level Movement with ASM DB

  50. …Testing Sub-LUN level Movement with ASM DB VMax Thermal Graph during testing
