Parallel OLAP

Parallel OLAP. Andrew Rau-Chaplin, Faculty of Computer Science, Dalhousie University. Joint work with F. Dehne, T. Eavis, S. Hambrusch. Decision Support Systems. A time-oriented analysis of scientific or organizational data. Data Mining. Online Analytical Processing (OLAP).


Presentation Transcript


  1. Parallel OLAP • Andrew Rau-Chaplin, Faculty of Computer Science, Dalhousie University • Joint work with F. Dehne, T. Eavis, S. Hambrusch

  2. Decision Support Systems • A time-oriented analysis of scientific or organizational data • Data Mining • Online Analytical Processing (OLAP) • Information Processing

  3. Data Warehousing for Decision Support • Operational data collected into DW • DW used to support multi-dimensional views • Views form the basis of OLAP processing • Our focus: the OLAP server

  4. Data Cube Generation • Proposed by Gray et al. in 1995 • Can be generated from a relational DB but… [Figure: a data cube with dimensions A, B, C and the cuboids ABC, AB, AC, BC, A, B, C, ALL; caption: "The cuboid ABC (or CAB)"]
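
As a rough illustration of the cuboids named on this slide, the Python sketch below enumerates every group-by over a set of dimensions. The function name `cuboids` is hypothetical and does not come from the presentation; the sketch only shows that d dimensions give 2^d cuboids, with the empty group-by playing the role of ALL.

```python
from itertools import combinations

def cuboids(dimensions):
    """Return every subset of the dimensions, largest first.

    The empty tuple stands for the ALL cuboid (no grouping at all).
    """
    dims = tuple(dimensions)
    result = []
    for size in range(len(dims), -1, -1):
        result.extend(combinations(dims, size))
    return result

lattice = cuboids("ABC")
for group_by in lattice:
    # Print each cuboid by its dimension letters, or ALL for the empty one.
    print("".join(group_by) or "ALL")
```

For three dimensions this lists ABC, then AB, AC, BC, then A, B, C, and finally ALL. The count doubles with every added dimension, which is why the 10 to 30 dimensions targeted later in the talk are demanding.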

  5. Core OLAP Operations • Five fundamental OLAP operations: roll-up, drill-down, slice, dice, and pivot • Range Queries
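
The operations listed on this slide can be sketched on the small sales relation from slide 9. This is a minimal, sequential illustration in Python; the helper names `roll_up`, `slice_` and `dice` are assumptions made for the example, not functions from the presented system. Drill-down is shown as a roll-up that keeps one more dimension, and pivot (reordering dimensions for display) is omitted.

```python
from collections import defaultdict

# Rows (model, year, colour, sales) copied from the example relation on slide 9.
ROWS = [
    ("Chevy", 1990, "Red", 5),
    ("Chevy", 1990, "Blue", 87),
    ("Ford", 1990, "Green", 64),
    ("Ford", 1990, "Blue", 99),
    ("Ford", 1991, "Red", 8),
    ("Ford", 1991, "Blue", 7),
]
DIMS = ("model", "year", "colour")

def roll_up(rows, keep):
    """Sum sales over every dimension that is not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)

def slice_(rows, dim, value):
    """Fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]

def dice(rows, **allowed):
    """Restrict several dimensions to sets of allowed values."""
    return [r for r in rows
            if all(r[DIMS.index(d)] in vals for d, vals in allowed.items())]

by_model = roll_up(ROWS, ["model"])               # roll-up: sales per model
by_model_year = roll_up(ROWS, ["model", "year"])  # drill-down: add year back
blue_only = slice_(ROWS, "colour", "Blue")        # slice: colour = Blue
ford_1990 = dice(ROWS, model={"Ford"}, year={1990})  # dice: a sub-cube
```

A range query can be expressed the same way as `dice`, by passing a set or range of values for each restricted dimension.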

  6. The Challenge • Design and build a parallel ROLAP system • Full cube generation • Partial cube generation • Indexing and query resolution • For • High dimensionality: 10 – 30 D • Large input data sizes: Gigabytes • Large output data sizes: Terabytes • Implications • Parallel + external memory • Shared disk + Shared nothing

  7. The Architectural Model • Shared Disk • A set of p processors connected via an interconnection fabric • Standard-sized local memory • Concurrent access to a shared disk array • Shared Nothing • A set of p processors connected via an interconnection fabric • Standard-sized local memory • Independent local disk(s) • Algorithm Design • CGM (Coarse Grained Multicomputer) [Figure, shown once per model: processors p1, p2, p3, p4, …, pn attached to a communication fabric]

  8. Coarse Grained Multicomputer • A set of p processors • Arbitrary communication topology or shared memory • m memory per processor, m >> p • A communication round consists of an h-relation in which every processor sends and receives O(m) data [Figure: communication fabric]

  9. MOLAP vs. ROLAP

     Model  Year  Colour  Sales
     Chevy  1990  Red         5
     Chevy  1990  Blue       87
     Ford   1990  Green      64
     Ford   1990  Blue       99
     Ford   1991  Red         8
     Ford   1991  Blue        7

     Model  Year  Colour  Sales
     Chevy  1990  Blue       87
     Chevy  1990  Red         5
     Chevy  1990  ALL        92
     Chevy  ALL   Blue       87
     Chevy  ALL   Red         5
     Chevy  ALL   ALL        92
     Ford   1990  Blue       99
     Ford   1990  Green      64
     Ford   1990  ALL       163
     Ford   1991  Blue        7
     Ford   1991  Red         8
     Ford   1991  ALL        15
     Ford   ALL   Blue      106
     Ford   ALL   Green      64
     Ford   ALL   Red         8
     Ford   ALL   ALL       178
     ALL    1990  Blue      186
     ALL    1990  Green      64
     ALL    1990  Red         5
     ALL    1991  Blue        7
     ALL    1991  Red         8
     ALL    1990  ALL       255
     ALL    1991  ALL        15
     ALL    ALL   Blue      193
     ALL    ALL   Green      64
     ALL    ALL   Red        13
     ALL    ALL   ALL       270
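
To make the relationship between the two tables on this slide concrete, here is a naive, sequential Python sketch that derives the relational (ROLAP-style) cube from the six input rows. It is not the parallel algorithm of the presentation; the function name `data_cube` is hypothetical, and the approach (each row contributes to every combination of kept and ALL-ed dimensions) is just the simplest way to reproduce the totals.

```python
from collections import defaultdict
from itertools import product

# Input relation (model, year, colour, sales) as given on this slide.
ROWS = [
    ("Chevy", 1990, "Red", 5),
    ("Chevy", 1990, "Blue", 87),
    ("Ford", 1990, "Green", 64),
    ("Ford", 1990, "Blue", 99),
    ("Ford", 1991, "Red", 8),
    ("Ford", 1991, "Blue", 7),
]

def data_cube(rows, n_dims=3):
    """Sum the measure for every combination of concrete and ALL dimensions."""
    cube = defaultdict(int)
    for row in rows:
        # Per dimension, either keep the row's value or replace it with ALL,
        # so each row is added to 2**n_dims cells.
        for mask in product((False, True), repeat=n_dims):
            key = tuple("ALL" if hide else row[i]
                        for i, hide in enumerate(mask))
            cube[key] += row[n_dims]
    return dict(cube)

cube = data_cube(ROWS)
# Sort on string forms because keys mix integers (years) with "ALL".
for key in sorted(cube, key=lambda k: tuple(map(str, k))):
    print(*key, cube[key])
```

The work per row grows as 2^d in the number of dimensions d, which is one reason real systems share sorts and partial aggregates between cuboids instead of recomputing each one from the raw data.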

  10. Existing Parallel Results • Goil & Choudhary • MOLAP • Approach • Parallelize the generation of each cuboid • Challenge • More than 2^d communication rounds

  11. Parallelizing the Data Cube • Generating Data Cubes (Shared Disk) • Generating Data Cubes (Shared Nothing) • Generating Partial Data Cubes • Parallel Multi-dimensional Indexing • Conclusions and Future Work
