1 / 32

Microsoft SQL Server Data Compression: Experience and Changes

Required Slide. SESSION CODE: DAT309. Microsoft SQL Server Data Compression: Experience and Changes . Sunil Agarwal Senior Program Manager Email: sunila@microsoft.com Microsoft Corporation. Agenda. Customer Experience and Feedback Unicode Compression in SQL2008R2 Future Directions.

gusty
Download Presentation

Microsoft SQL Server Data Compression: Experience and Changes

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Required Slide SESSION CODE: DAT309 Microsoft SQL Server Data Compression: Experience and Changes Sunil Agarwal Senior Program Manager Email: sunila@microsoft.com Microsoft Corporation

  2. Agenda • Customer Experience and Feedback • Unicode Compression in SQL2008R2 • Future Directions

  3. Data Compression Overview • Two Types of compression • ROW • Fixed length columns stored as variable length • Recommendation: DML heavy workload • PAGE • Column Prefix and Page Dictionary compression • Recommendation: Read-mostly workload • Can be enabled on a table, index, and partition • Estimate data compression savings by sp_estimate_data_compression_savings • Can be enabled/disabled ONLINE • No application changes

  4. Data Compression and Space Savings Your mileage will vary.

  5. Data Compression and Workload Performance

  6. Top Customer Question - 1 • Question: Data compression increases the size of my database? File Size = 28 GB File Size = 20 GB Comp TB-2 2GB Compressed TB-1 (8GB) Empty Space 4GB Empty Space (16 GB) TB-2 ( 4 GB) Empty Space (14 GB) TB-1 (16 GB) • Suggestions: • Do nothing if the object needs to grow • Start by compressing the smaller object first • Use shrink. But it fragments the data • Bulk export/import into empty compressed table. Data availability? • Moving object to a new filegroup File Size = 22 GB File Size = 20 GB File Size = 26 GB Comp TB-1 4 GB Comp TB-1 4 GB Comp TB-2 (2 GB) Free Space ( 4 GB) TB-2 ( 4 GB) Empty Space (16 GB) TB-1 (16 GB)

  7. Top Customer Question - 2 • Question: I am not getting any or minimal compression? • ROW Compression: • No fixed length column • Fixed length columns but all bytes are used • Compressed row > 4K • PAGE Compression • No column prefix savings • No common values for page dictionary • Large row size implying 1 to few rows per page • Mostly LOB data

  8. Top Customer Question - 3 • How do I get PAGE compression on a HEAP? PAGE • Problem: Adhoc inserts on a new page will not be PAGE compressed in a HEAP • Suggestions • Rebuild HEAP periodically (ONLINE available) • Use TABLOCK when bulk importing into a HEAP Header Header ROW CI structure R1 R2 R3 R4 R5 BTREE PAGE

  9. Top Customer Question - 4 Index Related • Question: It is taking longer to rebuild index or heap • ROW compression takes approx. 1.5 times the CPU time used for rebuilding an index • PAGE compression takes approx. 2 to 5 times the CPU time used for rebuilding an index • Your mileage may vary • Question: Do I need to take object offline to enable compression? • ONLINE operations supported. Few unique values for the leading column of the index may reduce parallelism. This is similar to regular index • Compressing a heap with ONLINE = ON uses a single CPU for compression (or rebuild)

  10. Top Customer Question - 5 • Question: What is the impact of compression on Bulk Import?

  11. Top Customer Question - 6 • Question: What object(s) should I compress? • Evaluate Compression savings • General: DML heavy (ROW) vs Query heavy (PAGE) both for table/partition • Don’t compress all objects in the database without evaluation • If table is relatively small don’t bother compressing • Consider compression if table/partition accessed rarely • Look at index usage • Used Rarely? • Singleton lookup • Range Access

  12. Example: Enabling Compression • Unpartitioned table Index Index Index Table Table Uncompressed PAGE Compressed

  13. Example: Enabling Compression • Latest partition uncompressed Uncompressed PAGE Compressed ROW Compressed Jan-Mar Apr-June July-Sept Oct-Dec

  14. Customer Example: An SAP Deployment • Inputs: sp_estimate_data_compression_savings, dm_db_physical_index_usage_stats, SAP knowledge Computed: S=% scans; U=% updates ROW ~= PAGE => ROW High Update, Low Scan => ROW High Scan => PAGE Append Only => PAGE Read-only => PAGE

  15. Top Customer Question - 7 • How do I compress Unicode data? • SQL uses UCS-2 encoding scheme. • NCHAR and NVARCHAR data always takes 2 bytes of storage. • Waste of 1 byte/char for commonly deployed locales (e.g. ASCII) • Existing ROW compression ineffective • PAGE compression only helps for exact match. • Sample representation • ‘a’ = 0x61 (ASCII) and 0x0061 (UCS-2)

  16. Agenda • Customer Experience and Feedback • Unicode Compression in SQL2008R2 • Future Directions

  17. Most ISVs are switching their customers to the UNICODE version of applications. Competition Oracle supports UTF-8 encoding for Unicode. Results in 1 byte storage for ASCII and most European DB2 provides UTF8 and Unicode compression as well SQL2008R2: Unicode and Competitive challenge

  18. SQL2008R2: Unicode Solution • Use standard SCSU compression technique http://www.unicode.org/reports/tr6/tr6-4.html • No application change needed • Compression Achieved

  19. SQL2008R2: Enabling Compression • Enterprise Edition only • Types of data compressions • ROW • Stores fixed length values as variable length • Superset of vardecimal storage format • Row metadata optimized • BLOB/LOB is not ROW compressed • Unicode data is compressed. For most locales 50% saving Supported types NVARCHAR and NCHAR but not NTEXT • PAGE (includes ROW) • Column Prefix • Dictionary • Only in-row BLOB/LOB can potentially benefit from PAGE compression

  20. Example: Unicode Compression Create Table SQLCOMP (Name NVARCHAR(20)) Go Insert into SQLCOMP values (‘SERVER’) Insert into SQLCOMP values (‘ENGINE’) Insert into SQLCOMP values (‘LOADERS’) HEADER 0x53514C “SQL” Col-prefix 0x03534552564552 “?SERVER” 0x53514C534552564552 “SQLSERVER” 0x00530051004C005300450052005600450052 “SQLSERVER” 0x53514C454E47494E45 “SQLENGINE” 0x 03454E47494E45 “?ENGINE” 0x00530051004C0045004E00470049004E0045 “SQLENGINE” 0x53514C4C4F414445525310 “SQLLOADERS” 0x034C4F414445525310 “?LOADERS” 0x00530051004C004C004F00410044004500520053 “SQLLOADERS” PAGE COMPRESSION ROW COMPRESSION

  21. Changes to Estimate Compression Stored Procedure • SQL2008 RTM • Estimated compression savings = 0 if compression mode did not change • SQL2008R2 • Estimated compression savings non-zero if space can be further saved. Useful in • De-fragmentation space savings • Unicode Compression space savings

  22. Unicode Compression Sunil AgarwalSenior Program ManagerMicrosoft DEMO

  23. SQL2008R2: Unicode Compression Results

  24. Upgrade to SQL2008R2 • Scenarios • ROW compression enabled in SQL2008 • No database changes when upgraded • Unicode value compressed only if it saves space. It happens when • An existing value is updated • A new row is inserted • Index is rebuilt with ROW or PAGE compression • PAGE compression enabled in SQL2008 • Same as with ROW compression • No changes needed to existing scripts and DDL

  25. Future Directions and ASKs • We are looking into • Unicode Compression for in-row portion for NVARCHAR(MAX) • LOB Compression • XML compression • Make sp_estimate* available on all SKUs

  26. Related Contents • http://sqlcat.com/whitepapers/archive/2009/05/29/data-compression-strategy-capacity-planning-and-best-practices.aspx • www.sqlcat.com • http://blogs.msdn.com/sqlserverstorageengine • http://blogs.msdn.com/sqlcat/ • http://blogs.msdn.com/mssqlisv/ • http://www.unisys.com/eprise/main/admin/corporate/doc/41371394.pdf • http://search.hp.com/redirect.html?type=REG&qt=sql+server+data+compression&url=http%3A//h71028.www7.hp.com/ERC/downloads/4AA1-8766ENW.pdf%3Fjumpid%3Dreg_R1002_USEN&pos=1 • http://www.netapp.com/us/library/technical-reports/tr-3719.html

  27. Required Slide Track PMs will supply the content for this slide, which will be inserted during the final scrub. DAT Track Scratch 2 Win • Find the DAT Track Surface Table in the Yellow Section of the TLC • Try your luck to win a Zune HD • Simply scratch the game pieces on the DAT Track Surface Table and Match 3 Zune HDs to win

  28. Required Slide Resources Learning • Sessions On-Demand & Community • Microsoft Certification & Training Resources www.microsoft.com/teched www.microsoft.com/learning • Resources for IT Professionals • Resources for Developers http://microsoft.com/technet http://microsoft.com/msdn

  29. Required Slide Complete an evaluation on CommNet and enter to win!

  30. Sign up for Tech·Ed 2011 and save $500 starting June 8 – June 31st http://northamerica.msteched.com/registration You can also register at the North America 2011 kiosk located at registrationJoin us in Atlanta next year

  31. © 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

  32. Required Slide

More Related