1 / 33

SQL Server 2012 Parallel Data Warehouse

SQL Server 2012 Parallel Data Warehouse. Meinrad Weiss Principal Consultant Trivadis AG. Trivadis solution portfolio and competences. CUSTOMER. IT SOLUTIONS, SERVICES & PRODUCTS. Business departments. Business Integration Services. Business Intelligence. Infrastructure Engineering.

Download Presentation

SQL Server 2012 Parallel Data Warehouse

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SQL Server 2012 Parallel Data Warehouse Meinrad Weiss Principal Consultant Trivadis AG

  2. Trivadissolutionportfolioandcompetences CUSTOMER IT SOLUTIONS, SERVICES & PRODUCTS Business departments Business Integration Services Business Intelligence InfrastructureEngineering Application Development Managed Services Training IT departments Integration, Application Performance Management, Security TECHNOLOGIESOracle, Microsoft, IBM, Open Source SQL Server 2012 Parallel Data Warehouse

  3. Trivadisfacts & figures • 11 Trivadis locationswithmorethan 600 employees • Financiallyindependentandsustainably profitable • Key figures 2011 • Revenue CHF 104 / EUR 84 Mio. • Services formorethan 800 clients in over 1,900 projects • 200 Service Level Agreements • More than 4,000 trainingparticipants • Research anddevelopmentbudget: CHF 5.0 / EUR 4 Mio. Hamburg ~200 employees Dusseldorf Frankfurt Stuttgart Vienna Munich Freiburg Basel ~30 employees Zurich Bern ~380 employees Lausanne 3 Trivadis– thecompany SQL Server 2012 Parallel Data Warehouse date

  4. Visit us, win a price and be prepared for your next adventure SQL Server 2012 Parallel Data Warehouse

  5. Agenda • Positioning Parallel Data Warehouse (PDW) • Architecture • Working with PDW • PDW and Big Data SQL Server 2012 Parallel Data Warehouse

  6. SQL Server 2012 Parallel Data Warehouse

  7. Data Warehouse – Products Positioning PDW with Distributed Data Architecture Scale Complexity HA by default SW-HW integration Balanced solution for mostly scan-centric workloads. 1 3 Max HW tune-up for most DW scenarios. 2 2 SQL Server 2012PDW Most flexible architecture for handling all DW scenarios. 3 1 SQL Server 2012 Fast Track SQL Server 2012 SQL Server 2012 Parallel Data Warehouse

  8. SQL Server 2012 Parallel Data Warehouse Built For Big Data Next-generation Performance At Scale Scale-Out Architecture on Industry Standard Hardware SQL Server 2012 Parallel Data Warehouse

  9. Agenda • Positioning Parallel Data Warehouse (PDW) • Architecture • Working with PDW • PDW and Big Data SQL Server 2012 Parallel Data Warehouse

  10. Parallel Data Warehouse at a glance • Shared-nothing parallel database system • Massively parallel processing (MPP) • A “Control” server that accepts user queries, generates a plan, and distributes operations in parallel to compute nodes • Multiple “Compute” servers running SQL Server • Delivered as an appliance • Balanced and pre-configured hardware and software • Scales from 2 to 56 Nodes • Fastest Time to Market SQL Server 2012 Parallel Data Warehouse

  11. Base Unit (1/4 Rack) • All hosts and VM’s run Windows Server 2012 Standard • All Fabric and workload activity happens in Hyper-V virtual machines, with Fabric VM’s sharing 1 server • Failover is handled by Hyper-V • PDW Agent runs on all hosts and all VMs, collects appliance health data on fabric and workload • Windows Storage Spaces handles mirroring and spares FAB AD Control VMM Host 0 Host 1 Compute 1 Host 2 Storage Spaces Compute 2 Host 3 IB & Ethernet Direct attached SAS SQL Server 2012 Parallel Data Warehouse

  12. Current Limitations: Performance and Scale Up today • Scale Up (and pay) ? ? $$$ $$$ $$$ Old serverwill be obsolete if bigger system is required SQL Server 2012 Parallel Data Warehouse

  13. SCALING FROM 2 TO 56 NODES Host 0 Appliance can grow with increasing workload • Provisioning consists of 3 phases: • Bare metal provisioning of new nodes • Provisioning of workload VMs and ‘hooking up’ to other workload VMs • Redistribution of data FAB AD Control VMM Host 1 Host 2 Storage Spaces Compute 1 Compute 1 Host 3 Compute 2 Compute 2 Host 2 Storage Spaces Host 3 IB & Ethernet Direct attached SAS SQL Server 2012 Parallel Data Warehouse

  14. Seamless Scalability 30 Nodes System Throughput 30 Nodes 20 Nodes Regular SQL Server( 1 Node) 10 Nodes 5 Nodes 4 8 12 16 20 24 28 32 36 40 SQL Server 2012 Parallel Data Warehouse SQL Server Compute Nodes

  15. High Availability • Storage Spaces manages the physical disks in the disk enclosures • Failover: • One cluster across the whole appliance • VMs are automatically migrated on host failure • Affinity and anti-affinity maps enforce rules FAB AD Control VMM Host 0 Host 1 Compute 1 Host 2 Storage Spaces Compute 2 Host 3 IB & Ethernet Direct attached SAS SQL Server 2012 Parallel Data Warehouse

  16. Agenda • Positioning Parallel Data Warehouse (PDW) • Architecture • Working with PDW • Introduction to Big Data • Microsoft and Big Data SQL Server 2012 Parallel Data Warehouse

  17. Time Dim Product Dim Date Dim ID Calendar Year Calendar Qtr Calendar Mo Calendar Day Prod Dim ID Prod Category Prod Sub Cat Prod Desc Store Dim Store Dim ID Store Name Store Mgr Store Size Distribution and Replication of Data: Replicate Smaller (<5GB ) Dimension Tables are Replicated on Every Compute Node SQL SQL SQL SQL TD PD MD SD TD PD MD SD Sales Facts Date Dim ID Store Dim ID Prod Dim ID Mktg Camp Id Qty Sold Dollars Sold TD PD MD SD TD PD MD SD Mktg Campaign Dim Result: Fact -Dimension Joins can be performed locally Mktg Camp ID Camp Name Camp Mgr Camp Start Camp End SQL Server 2012 Parallel Data Warehouse

  18. Create Replicated Table CREATE TABLE myTable   (     id int NOT NULL, lastNamevarchar(20), zipCodevarchar(6)   ) WITH (CLUSTERED COLUMNSTORE INDEX); • Creates tables on each of the individual compute nodes and assigns them to the REPLICATED file group. • Data Compression is automatically turned on SQL Server 2012 Parallel Data Warehouse

  19. Time Dim Product Dim Date Dim ID Calendar Year Calendar Qtr Calendar Mo Calendar Day Prod Dim ID Prod Category Prod Sub Cat Prod Desc Store Dim Store Dim ID Store Name Store Mgr Store Size Distribution and Replication of Data: Distribute Larger (> 10 GB) Fact Table is Hash Distributed Across All Compute Nodes SQL SQL SQL SQL TD PD SF-1 SF-1 SF-1 SF-1 MD SD SF-1 SF-1 TD PD SF-1 SF-2 Sales Facts MD SD Date Dim ID Store Dim ID Prod Dim ID Mktg Camp Id Qty Sold Dollars Sold SF-1 SF-1 TD PD SF-1 SF-3 MD SD SF-1 TD PD SF-1 SF-1 SF-4 MD SD Mktg Campaign Dim Mktg Camp ID Camp Name Camp Mgr Camp Start Camp End SQL Server 2012 Parallel Data Warehouse

  20. Distribution on a PDW CREATE TABLE myTable (column Defs)WITH(DISTRIBUTION = HASH (id)); PDW Node 1 Create Table <myTable GUID>_a Create Table <myTable GUID>_b … Create Table <myTable GUID>_h 8 Tables per Node PDW Node 2 Create Table <myTable GUID>_a Create Table <myTable GUID>_b … Create Table <myTable GUID>_h PDW Node … PDW Node 8 Create Table <myTable GUID>_a Create Table <myTable GUID>_b … Create Table <myTable GUID>_h Final Result: 64individual tables across a 8node (1 data rack HP) appliance SQL Server 2012 Parallel Data Warehouse

  21. Lightning Fast Data Query Processing xVelocity gives next-gen performance Products • Columnstore Provides Dramatic Performance • Updateable and clustered xVelocitycolumnstore • Stores data in columnar format • Memory-optimized for next-generation performance • Updateable to support bulk and/or trickle loading Country Sales Supplier Customer Save Timeand Costs Up to50X Faster Up to 15x compression Real-TimeDW SQL Server 2012 Parallel Data Warehouse

  22. Web-Based Management Dashboard SQL Server 2012 Parallel Data Warehouse

  23. Web-Based Management Dashboard (2) SQL Server 2012 Parallel Data Warehouse

  24. Web-Based Management Dashboard (3) SQL Server 2012 Parallel Data Warehouse

  25. PDWQuerying 1Petabyte of data in 1 second 294‘000‘000‘000 rows http://www.sqlpass.org/summit/2012/DayOneKeynote.aspx SQL Server 2012 Parallel Data Warehouse

  26. Reference Case: Today’s process flow / Building blocks Baseline : Once data extracted from SAP: Time taken to create end-end Reports and Cubes insights 13+ hours (In production typical 20+ hours with multiple companies) SAP Suspicious words Reports 3hr21min MasterFinance table population 6 hours 21min DB_MasterTables DB_ ReportTables DB_GSAPOP DW_FinanceTransactions FinanceCube 1 hour 6 hours SQL Server 2012 Parallel Data Warehouse

  27. Reference Case: Audit Process with PDW Once data is extracted from SAP: Creating 5 CM Reports & FSCP Finance Cube; Time taken: 30 Minutes SAP MasterFinance table population 8m50sec All 5 Reports within 6min DB_MasterTables DB_ ReportTables DB_GSAPOP (80) (80) (80) (80) FinanceCube DW_FinanceTransactions load from FlatFile 23min 11 min 10m10sec SQL Server 2012 Parallel Data Warehouse

  28. Agenda • Positioning Parallel Data Warehouse (PDW) • Architecture • Working with PDW • PDW and Big Data SQL Server 2012 Parallel Data Warehouse

  29. Seamless integration of two worlds Introducing PolyBase • Single Query; Structured and Unstructured • Query and join Hadoop tables with Relational Tables • Use Standard SQL language • Select, From Where SQL SQL Server 2012 PDW Powered by PolyBase Existing SQLSkillset Save Timeand Costs Analyze AllData Types HDFS (Hadoop) Database SQL Server 2012 Parallel Data Warehouse

  30. Additional Resources SQL Server Parallel Data Warehouse (PDW) Landing Page: www.microsoft.com/PDW Introduction to Polybase: http://www.microsoft.com/en-us/sqlserver/solutions-technologies/data-warehousing/polybase.aspx Price/TB comparison: http://www.valueprism.com/resources/resources/Resources/PDW%20Compete%20Pricing%20FINAL.pdf HP QuickSpecs http://h18000.www1.hp.com/products/quickspecs/13830_div/13830_div.html http://h18000.www1.hp.com/products/quickspecs/13830_div/13830_div.pdf SQL Server 2012 Parallel Data Warehouse

  31. Conclusion • SQL Server 2012 Parallel Data Warehouse is an MPP based appliance optimized for data warehouse workload • Very Similar to a regular SQL Server, but T-SQL not 100 % identical • Support for development via SQL Server Data Tools • Scalable, balanced platform • Not just storage (data + CPU power) • Central part of Microsoft BI stack • Well integrated in Microsoft management tools SQL Server 2012 Parallel Data Warehouse

  32. Visit us, win a price and be prepared for your next adventure SQL Server 2012 Parallel Data Warehouse

  33. Wettbewerb Let‘sgo.

More Related