
VLDB 2005 31st International Conference on Very Large Databases

Large Scale Data Warehouses on Grid: Oracle Database 10g and HP ProLiant Systems. Raghunath K. Othayoth, Hewlett-Packard Company. Meikel Poess, Oracle Corporation.


Presentation Transcript


  1. VLDB 2005, 31st International Conference on Very Large Databases. Large Scale Data Warehouses on Grid: Oracle Database 10g and HP ProLiant Systems. Raghunath K. Othayoth, Hewlett-Packard Company; Meikel Poess, Oracle Corporation.

  2. Agenda
     • Grid Computing
     • Hardware Support
     • Software Support
     • TPC-H Result
     Large Scale Data Warehouses on Grid: Oracle Database 10g and HP ProLiant Servers. VLDB 2005 - 31st International Conference - Trondheim, Norway, 4 January 2020

  3. Grid Computing
     1) Application and user perspective:
        − Just like the power grid: computing power is delivered as requested
     2) Implementation perspective:
        − Data virtualization
        − Resource provisioning
        − High availability

  4. From Research to Industry
     • Research projects using grid technology:
       − SETI@home
       − World Community Grid
     • Traditionally, companies used islands of systems to implement corporate data warehouses:
       − Unable to share resources
       − Too rigid to answer rapidly changing business needs
       − Cannot be scaled indefinitely
     → HP and Oracle are applying the grid concept to industry data warehouses (DW)

  5. Commercial Grid Market
     • IDC calls grid computing the fifth generation of computing
     • Commercial grid computing revenue:
       − 2003: 1 billion USD
       − 2008: 12 billion USD [estimate]
     • Forrester Research:
       − 37% of enterprises are piloting, rolling out or have implemented some form of grid computing
       − 30% of firms are considering grid technology
     (IDC, 2004. www.oracle.com/technology/tech/grid/collateral/idc_oracle10g.pdf)
     (Forrester, 2004. www.forrester.com/go?docid=34449)

  6. N-tier vs. Grid Computing
     [Diagram, N-tier side: Internet, application servers (middle tier), OLTP database servers with direct attach storage, DSS servers with direct attach storage]
     [Diagram, grid side: shared pool of commodity servers (application servers, database servers, DSS servers), resource virtualization and provisioning, Storage Area Network (SAN), Network Attached Storage (NAS)]
     • Traditional multi-tier datacenter infrastructure: Web servers, application servers and database servers are preconfigured and preallocated.
     • Grid computing: infrastructure is dynamically provisioned to applications that have been virtualized.

  7. Commercial Grid Components
     • Commodity hardware (x86-based servers)
     • Linux OS: cost effective
     • SAN: highly scalable
     • High-speed interconnect (Gigabit Ethernet, InfiniBand)
     • Management software (manage as individual servers or as one large virtual server)
     • Database layer (ties the resources together, dynamic resource allocation, parallel processing)

  8. Commercial Grid Benefits
     • High scalability
     • High flexibility
     • Low total cost of ownership
     • High availability
     • Easy manageability

  9. Oracle Features for a Data Warehouse Grid
     • Dynamic parallel processing
     • Data virtualization and dynamic resource provisioning in DW
     • Smart inter-node parallelism

  10. Dynamic Parallel Processing
      • Queries are automatically parallelized to maximize resource utilization
      • The Degree of Parallelism (DOP) is adjusted at parse time according to resource availability and computing demands
      • The DOP is automatically adjusted when:
        − The number of concurrent users changes
        − Nodes are taken down for maintenance
        − Nodes are added due to increased computing demand (scale-out)
        − Nodes are assigned to a different application
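
The DOP adjustment described on this slide can be pictured with a small sketch. This is not Oracle's algorithm, which the slides do not spell out; it only assumes a simple policy in which the processors of the currently active nodes are divided evenly among concurrent users. `GridState`, `choose_dop` and the `max_dop` cap are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class GridState:
    """Hypothetical snapshot of the grid taken when a query is parsed."""
    active_nodes: int      # nodes currently assigned to this application
    cpus_per_node: int     # processors per node
    concurrent_users: int  # sessions currently running parallel queries


def choose_dop(state: GridState, max_dop: int = 64) -> int:
    """Assumed policy: split the available processors evenly across users."""
    total_cpus = state.active_nodes * state.cpus_per_node
    users = max(state.concurrent_users, 1)
    return max(1, min(max_dop, total_cpus // users))


# Each situation from the slide changes one input and therefore the DOP.
baseline = GridState(active_nodes=12, cpus_per_node=4, concurrent_users=4)
more_users = GridState(active_nodes=12, cpus_per_node=4, concurrent_users=8)
maintenance = GridState(active_nodes=10, cpus_per_node=4, concurrent_users=4)
scale_out = GridState(active_nodes=16, cpus_per_node=4, concurrent_users=4)

for name, state in [("baseline", baseline), ("more users", more_users),
                    ("maintenance", maintenance), ("scale-out", scale_out)]:
    print(name, choose_dop(state))
```

Because the value is computed when a query is parsed, a query already running keeps its DOP; only newly parsed queries see the changed grid.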

  11. Data Virtualization and Dynamic Resource Provisioning in DW
      • Oracle’s shared everything architecture provides data virtualization and provisioning in data warehouses
      [Diagram: nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]

  12. Data Virtualization and Dynamic Resource Provisioning in DW
      • Oracle’s shared everything architecture provides data virtualization and provisioning in data warehouses
      [Diagram: workload types OLAP, Reports and ETL mapped onto nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]

  13. Data Virtualization and Dynamic Resource Provisioning in DW
      • Oracle’s shared everything architecture provides data virtualization and provisioning in data warehouses
      • During peak working hours
      [Diagram: workload types OLAP, Reports and ETL mapped onto nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]

  14. Data Virtualization and Dynamic Resource Provisioning in DW
      • Oracle’s shared everything architecture provides data virtualization and provisioning in data warehouses
      • During the night
      [Diagram: workload types OLAP, Reports and ETL mapped onto nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]

  15. Data Virtualization and Dynamic Resource Provisioning in DW
      • Oracle’s shared everything architecture provides data virtualization and provisioning in data warehouses
      • During short intervals when the DW is synchronized with the OLTP system
      [Diagram: workload types OLAP, Reports and ETL mapped onto nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]

  16. Data Virtualization and Dynamic Resource Provisioning in DW
      • Oracle’s shared everything architecture provides data virtualization and provisioning in data warehouses
      • Without response time requirements, all types of workload can run on all nodes
      [Diagram: workload types OLAP, Reports and ETL mapped onto nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]
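
The provisioning idea in the preceding slides (the same eight nodes re-divided among OLAP, Reports and ETL depending on the time of day) can be sketched as a lookup from period to node assignment. The node counts per period below are invented for illustration, since the slide diagrams that show the real split did not survive in the transcript; `SHARES` and `provision` are hypothetical names.

```python
NODES = list(range(1, 9))  # the eight nodes drawn on the slides
WORKLOADS = ("OLAP", "Reports", "ETL")

# Hypothetical node counts per period; the actual split on the slides is unknown.
SHARES = {
    "peak_hours": {"OLAP": 4, "Reports": 3, "ETL": 1},
    "night": {"OLAP": 1, "Reports": 2, "ETL": 5},
    "oltp_sync": {"OLAP": 2, "Reports": 2, "ETL": 4},
}


def provision(period: str) -> dict:
    """Map each workload type to the list of nodes it may use in this period."""
    shares = SHARES.get(period)
    if shares is None:
        # No response time requirements: every workload may run on every node.
        return {workload: list(NODES) for workload in WORKLOADS}
    assignment = {}
    start = 0
    for workload in WORKLOADS:
        count = shares[workload]
        assignment[workload] = NODES[start:start + count]
        start += count
    return assignment


for period in ("peak_hours", "night", "oltp_sync", "unconstrained"):
    print(period, provision(period))
```

Because every node reaches the same disk subsystem, moving a node from one workload to another changes only this mapping; no data has to be redistributed.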

  17. Data Virtualization and Dynamic Resource Provisioning in DW
      • This concept can be extended to different applications
      [Diagram: workload types OLTP, DW and DM mapped onto nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]

  18. Data Virtualization and Dynamic Resource Provisioning in DW
      • This concept can be extended to different applications
      [Diagram: workload types OLTP, DW and DM mapped onto nodes 1 to 8, connected by an interconnect, above a shared disk subsystem]

  19. Smart Inter-Node Parallelism
      • The optimizer avoids inter-node parallelism when possible
        → reduced interconnect traffic
        → faster execution time
      1) Node locality
         − If possible, operations are executed on one node
         − When the DOP of an operation can be satisfied with the resources of one server, it executes locally
      2) Full partition-wise join
         − If two tables are equipartitioned on their join key, the join can be divided into smaller joins between partitions
      3) Partial partition-wise join
         − If only one table is partitioned on the join key, the other table is dynamically repartitioned on the join key to break the large join into smaller joins
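
The two partition-wise joins named on this slide can be illustrated with a toy in-memory version. It is a sketch of the general technique, not of Oracle's executor: tables are lists of dicts, partitioning is a plain hash on an integer key, and every function and column name (`hash_partition`, `okey`, and so on) is made up for the example.

```python
from collections import defaultdict


def hash_partition(rows, key, n_parts):
    """Distribute rows into n_parts buckets by hashing the join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key]) % n_parts].append(row)
    return parts


def join_partition(left, right, key):
    """Plain hash join between two partitions (one unit of parallel work)."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def full_partition_wise_join(left_parts, right_parts, key):
    """Both inputs are equipartitioned: join partition i with partition i only."""
    joined = []
    for lp, rp in zip(left_parts, right_parts):
        joined.extend(join_partition(lp, rp, key))
    return joined


def partial_partition_wise_join(left_parts, right_rows, key):
    """Only the left input is partitioned: repartition the right input to match."""
    right_parts = hash_partition(right_rows, key, len(left_parts))
    return full_partition_wise_join(left_parts, right_parts, key)


orders = [{"okey": k, "cust": f"c{k}"} for k in range(8)]
items = [{"okey": k % 8, "qty": k} for k in range(20)]

order_parts = hash_partition(orders, "okey", 4)
full = full_partition_wise_join(order_parts, hash_partition(items, "okey", 4), "okey")
partial = partial_partition_wise_join(order_parts, items, "okey")
print(len(full), len(partial))
```

Each pair of matching partitions can be joined by a different process. If matching partitions are processed on the same node, no rows of the already partitioned table need to cross the interconnect, which is the traffic reduction the slide refers to.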

  20. TPC-H Benchmark
      • The industry standard benchmark for data warehouse applications
      • Stresses grid-based data warehouses:
        − Complex queries
          • Sequential scans of large amounts of data
          • Aggregations of large amounts of data
          • Multi-table joins
          • Extensive sorting of very large sets of data
        − Single-user test
        − Multi-user test
        − Parallel insert operations
        − Parallel delete operations
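
For readers unfamiliar with how the single-user and multi-user tests become one number: the TPC-H specification defines the composite QphH metric as the geometric mean of the power result (from the single-user test) and the throughput result (from the multi-user test). The sketch below only shows that final step; the two input values are hypothetical and are not taken from the benchmark discussed in this talk.

```python
import math


def composite_qphh(power: float, throughput: float) -> float:
    """Composite metric: geometric mean of the power and throughput results."""
    return math.sqrt(power * throughput)


# Hypothetical inputs, chosen only to show the arithmetic.
example_power = 40_000.0
example_throughput = 30_000.0
print(round(composite_qphh(example_power, example_throughput), 1))
```

A geometric mean rewards balance: a system that is strong in only one of the two tests scores lower than one that is moderately strong in both.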

  21. Benchmarked Configuration: hp ProLiant DL585 Cluster 48P
      • 12 x hp ProLiant DL585
        − 4 x AMD 848 Opteron™ 2.2GHz/1MB
        − 8GB
        − 2 x On-board NICs
        − 6 x hp fca 2214 DC
        − 1 x InfiniCon Systems InfiniServ 7000 HCA
      • InfiniCon Systems InfiIO3016
      • 2 x hp ProCurve Switch 4148gl
      • Storage Area Network
        − 12 x hp SAN Switch 2/16
        − 48 x hp MSA1000
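
The "48P" in the cluster name follows from the component list: 12 servers with 4 processors each. The short calculation below derives the other cluster-wide totals in the same way. It assumes that the processor, memory, NIC, HBA and HCA quantities on the slide are per server, which the flattened transcript suggests but does not state outright; the dictionary keys are illustrative labels.

```python
SERVERS = 12  # hp ProLiant DL585 nodes

# Assumed to be per-server quantities (an interpretation of the slide layout).
PER_SERVER = {
    "cpus": 4,             # AMD Opteron 848
    "memory_gb": 8,
    "nics": 2,             # on-board NICs
    "fc_hbas": 6,          # hp fca 2214 DC
    "infiniband_hcas": 1,  # InfiniCon Systems InfiniServ 7000
}

STORAGE_ARRAYS = 48  # hp MSA1000
SAN_SWITCHES = 12    # hp SAN Switch 2/16

totals = {name: count * SERVERS for name, count in PER_SERVER.items()}
arrays_per_server = STORAGE_ARRAYS / SERVERS
switches_per_server = SAN_SWITCHES / SERVERS

print(totals)
print(arrays_per_server, switches_per_server)
```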

  22. Current Results: 1,000 GB Results

      | Rank | System | QphH | Price/QphH | System Availability | Database | Operating System | Date Submitted | Cluster |
      |------|--------|------|------------|---------------------|----------|------------------|----------------|---------|
      | 1 | HP Integrity Superdome Enterprise Server | 68,100 | 59.00 US $ | 01/18/06 | Oracle Database 10g R2 Enterprise Edt w/Partitioning | HP UX 11.i V2 64 bit | 08/08/05 | N |
      | 2 | IBM eServer xSeries 346 | 53,451 | 32.80 US $ | 02/14/05 | IBM DB2 UDB 8.2 | SUSE LINUX Enterprise Server 9 | 02/14/05 | Y |
      | 3 | HP ProLiant DL585 Cluster 48P | 35,141 | 59.93 US $ | 10/21/04 | Oracle 10g RAC with Partitioning | Red Hat Enterprise Linux AS 3 | 10/22/04 | Y |
      | 4 | Sun PRIMEPOWER 2500 | 34,492 | 155.99 Euros | 03/08/04 | Oracle Database 10g Enterprise Edition | Solaris 9 | 09/08/03 | N |
      | 4 | Sun *** PRIMEPOWER 2500 | 34,492 | 140.96 US $ | 03/08/04 | Oracle Database 10g Enterprise Edition | Solaris 9 | 11/13/03 | N |
      | 5 | IBM eServer p5 570 with DB2 UDB | 26,156 | 53.43 US $ | 12/15/04 | IBM DB2 UDB 8.2 | IBM AIX 5L V5.3 | 09/15/04 | Y |
      | 6 | NEC Express5800/1320Xe (32SMP) | 22,967 | 68.51 US $ | 12/07/05 | Microsoft SQL Server 2005 Enterprise Edition 64bit | Microsoft Windows Server 2003 Datacenter Edition 64-bit | 07/19/05 | N |
      | 7 | Unisys ES7000 Orion 440 Enterprise Server | 21,505 | 41.92 US $ | 12/07/05 | Microsoft SQL Server 2005 Enterprise Edition 64bit | Microsoft Windows Server 2003 Datacenter Edition 64-bit | 06/27/05 | N |
      | 8 | NEC Express5800/1320Xe (32SMP) | 20,231 | 76.06 US $ | 12/07/05 | Microsoft SQL Server 2005 Enterprise Edition 64bit | Microsoft Windows Server 2003 Datacenter Edition 64-bit | 06/07/05 | N |
      | 9 | IBM eServer p655 with DB2 UDB | 20,221 | 69.41 US $ | 06/08/04 | IBM DB2 UDB 8.1 | IBM AIX 5L V5.2 | 12/08/03 | Y |
      | 10 | NovaScale 5160 | 15,069 | 44.32 US $ | 12/20/05 | Oracle Database 10g release2 Enterprise Edt | Microsoft Windows Server 2003 Datacenter Edition | 06/20/05 | N |

  23. Result Analysis
      • Leadership performance
        − Query performance of 35,141 QphH @ 1000GB
        − Price-to-performance ratio of $60/QphH @ 1000GB
      • A database grid of ProLiant systems with multiple Opteron x86 processors delivers performance comparable to large SMP systems
      • The Linux operating system meets the throughput and processing demands necessary to achieve the benchmark result
      • Oracle’s 10g + RAC database delivers consistent, high-performance query execution in large grid environments
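
The price-to-performance figure is the priced system cost divided by the QphH result, so the two published numbers together imply an approximate system price. The sketch below does that arithmetic for the top three entries of the 1,000 GB results table. The implied prices are back-calculated from rounded ratios, so they are estimates and not figures quoted in the talk; `implied_price` is a hypothetical helper.

```python
# (QphH, price per QphH in US $) as listed in the 1,000 GB results table.
RESULTS = {
    "HP Integrity Superdome": (68_100, 59.00),
    "IBM eServer xSeries 346": (53_451, 32.80),
    "HP ProLiant DL585 Cluster 48P": (35_141, 59.93),
}


def implied_price(qphh: int, price_per_qphh: float) -> float:
    """Approximate priced system cost: QphH multiplied by price/QphH."""
    return qphh * price_per_qphh


for system, (qphh, ratio) in RESULTS.items():
    print(f"{system}: about {implied_price(qphh, ratio):,.0f} US $")

# The slide's "$60/QphH" is the table's 59.93 rounded to whole dollars.
proliant_ratio = RESULTS["HP ProLiant DL585 Cluster 48P"][1]
print(round(proliant_ratio))
```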

  24. Future Hardware for Grid – HP BladeSystems

  25. Conclusion
      • Grid is ready for prime time
      • In grid computing, resources are provisioned on demand and virtualized for applications to meet today’s challenging business needs
      • Commodity x86-based servers and blade servers offer reduced total cost of ownership
      • Grid overcomes the natural limitations of SMP systems, such as the number of processors, memory and disk arrays
