M.Tech.CS By Research 1st Sem Seminar

Presentation Transcript


    1. M.Tech. (CS) By Research 1st Sem Seminar. Title of the Project: Computing & Querying Data Cube – A Parallel Approach. Supervisors: Dr. A.K. Ramani (Prof. & Head, SCS, DAV Indore), Dr. B.S. Panda (Associate Professor, Maths Department, IIT Delhi)

    2. Overview: Traditional database systems are tuned to many small, simple queries. Some new applications use fewer, more time-consuming, analytic queries. New architectures have been developed to handle analytic queries efficiently.

    3. Background: DSS (Decision Support System): gain competitiveness for business. Data warehouse: maintain historical information; use “Data Cube” to summarize results; identify trends. Performance issue (time and space): need to reuse results (materialization of views).

    4. The Data Warehouse: The most common form of data integration. Copy sources into a single DB (warehouse) and try to keep it up-to-date. Usual method: periodic reconstruction of the warehouse, perhaps overnight. Frequently essential for analytic queries.

    5. OLTP: Most database operations involve On-Line Transaction Processing (OLTP). Short, simple, frequent queries and/or modifications, each involving a small number of tuples. Examples: answering queries from a web interface, sales at cash registers, selling airline tickets.

    6. OLAP: On-Line Analytical Processing (OLAP, or “analytic”) queries are, typically: few, but complex, and may run for hours; and not dependent on having an absolutely up-to-date database.

    7. OLAP Examples: Amazon analyses purchases by its customers to come up with an individual screen with products of likely interest to the customer. Analysts at Wal-Mart look for items with increasing sales in some region.

    8. Common Architecture: Databases at store branches handle OLTP. Local store databases are copied to a central warehouse overnight. Analysts use the warehouse for OLAP.

    9. ROLLUP & CUBE: The ROLLUP operator delivers aggregates and superaggregates within a GROUP BY. Used by report writers to extract statistics and summary information from result sets. The cumulative aggregates can be used in reports, charts and graphs. Without ROLLUP, subtotals can be produced with UNION ALL: for n grouping columns this takes n+1 SELECT statements.
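
    The UNION ALL alternative on this slide can be sketched as follows. This is only an illustration, not code from the presentation: the `sales` table, its columns, its rows, and the helper `rollup_query` are all hypothetical, and SQLite (via Python's standard `sqlite3` module) is used simply because it needs no setup.

```python
import sqlite3

# Hypothetical sales table; table name, columns, and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "pen", 10), ("East", "ink", 5), ("West", "pen", 7), ("West", "pen", 3)],
)

def rollup_query(table, columns, measure):
    """Build the n+1 SELECT statements that emulate ROLLUP over `columns`."""
    selects = []
    for level in range(len(columns), -1, -1):
        kept = columns[:level]
        # Columns rolled away at this level are reported as NULL.
        select_list = [c if c in kept else f"NULL AS {c}" for c in columns]
        sql = f"SELECT {', '.join(select_list)}, SUM({measure}) AS total FROM {table}"
        if kept:
            sql += f" GROUP BY {', '.join(kept)}"
        selects.append(sql)
    return selects

# Two grouping columns give three SELECTs: detail rows, per-region subtotals, grand total.
selects = rollup_query("sales", ["region", "product"], "amount")
rows = conn.execute(" UNION ALL ".join(selects)).fetchall()
for row in rows:
    print(row)
```

    Each extra grouping column adds one more SELECT and one more scan of the table, which is the overhead a built-in ROLLUP operator is meant to avoid.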

    10. Introduction of Datacube: Datacube dimensionality: the number of GROUP BYs. Aggregated data: the values in each cell. Dimension of datacube: detail of summary; higher dimension means higher detail. Common operations: drill down (look in more detail) and roll up (look in less detail).

    11. Definition of Datacube: Users of DSS often see data in the form of data cubes. A cube can be 2-dimensional, 3-dimensional, or higher-dimensional. A data cube is defined to be a data abstraction that allows one to view aggregated data from a number of perspectives or “views”.
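
    To make the definition concrete, a minimal sketch follows that computes every view of a tiny cube by brute force: one group-by per subset of the dimensions, so 2^n views for n dimensions. The function `data_cube`, the dimension names, and the sample rows are assumptions made for this example only.

```python
from collections import defaultdict
from itertools import combinations

def data_cube(rows, dims, measure):
    """Return {view: {dimension values: summed measure}} for all 2^n views."""
    cube = {}
    for k in range(len(dims) + 1):
        for view in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                # Rows that agree on the view's dimensions fall into the same cell.
                key = tuple(row[d] for d in view)
                totals[key] += row[measure]
            cube[view] = dict(totals)
    return cube

# Hypothetical fact rows with three dimensions and one measure.
rows = [
    {"region": "East", "product": "pen", "year": 2001, "sales": 10},
    {"region": "East", "product": "ink", "year": 2001, "sales": 5},
    {"region": "West", "product": "pen", "year": 2002, "sales": 7},
]
cube = data_cube(rows, ("region", "product", "year"), "sales")
print(len(cube))            # number of views
print(cube[("region",)])    # sales summarized by region only
print(cube[()])             # the grand total
```

    Moving from the `("region", "product")` view to the `("region",)` view is a roll up; the reverse direction is a drill down.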

    12. Our Problem: Physically materialize the whole data cube: best query response, but heavy pre-computing and large storage space, i.e. time efficient but space inefficient. Materialize nothing: worst query response, with dynamic query evaluation and less storage space, i.e. space efficient but time inefficient.

    13. Problem on materialized views: Materialize only part of the data cube. Balance the storage space and the response time. What is the best subset to materialize?
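
    One well-known way to pick such a subset is the greedy benefit heuristic described by Harinarayan, Rajaraman, and Ullman (reference 4). The slides do not say which method the project uses, so the sketch below is only an illustration of that heuristic. The view sizes are invented, and the cost of answering a view is assumed to be the row count of the smallest materialized view that contains all of its dimensions.

```python
def greedy_select(sizes, k):
    """Greedily pick k views to materialize in addition to the base cuboid.

    `sizes` maps each view (a frozenset of dimensions) to an estimated row count.
    """
    top = max(sizes, key=len)  # the base cuboid contains every dimension
    chosen = [top]
    # cost[w] = size of the smallest materialized view that can answer w
    cost = {w: sizes[top] for w in sizes}
    for _ in range(k):
        def benefit(v):
            # Total saving over every view w that could be answered from v.
            return sum(max(cost[w] - sizes[v], 0) for w in sizes if w <= v)
        candidates = [v for v in sizes if v not in chosen]
        best = max(candidates, key=benefit)
        chosen.append(best)
        for w in sizes:
            if w <= best:
                cost[w] = min(cost[w], sizes[best])
    return chosen[1:]

# Hypothetical size estimates for a three-dimensional cube over A, B, C.
sizes = {
    frozenset("ABC"): 100, frozenset("AB"): 50, frozenset("AC"): 100,
    frozenset("BC"): 80, frozenset("A"): 20, frozenset("B"): 10,
    frozenset("C"): 40, frozenset(): 1,
}
picked = greedy_select(sizes, 3)
print([sorted(v) for v in picked])
```

    With these numbers the view AC is never chosen: it is as large as the base cuboid, so materializing it saves nothing.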

    14. Data? View? We use the data cube to model aggregate data. So what do we use to model views? A lattice.

    15. Lattice: Multidimensional data can be viewed as a lattice of cuboids.
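
    As a rough illustration of this lattice, the sketch below enumerates the cuboids for a set of dimensions and links each cuboid to the cuboids obtained by dropping one dimension. The names `cuboid_lattice` and `derivable_from` and the dimensions A, B, C are hypothetical.

```python
from itertools import combinations

def cuboid_lattice(dims):
    """Return (cuboids, edges); an edge (parent, child) drops exactly one dimension."""
    cuboids = [frozenset(c)
               for k in range(len(dims), -1, -1)
               for c in combinations(dims, k)]
    edges = [(parent, parent - {d}) for parent in cuboids for d in parent]
    return cuboids, edges

def derivable_from(child, parent):
    """A cuboid can be computed from any cuboid that contains all of its dimensions."""
    return child <= parent

cuboids, edges = cuboid_lattice(("A", "B", "C"))
print(len(cuboids), len(edges))
print(derivable_from(frozenset({"A"}), frozenset({"A", "B"})))
```

    The base cuboid (all dimensions) sits at the top of the lattice and the empty cuboid (the grand total) at the bottom; every path downward is a sequence of roll ups.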

    16. Data Cube Technology: Sequential computation. PipeSort algorithm: undertaken by Sunita Sarawagi and others of the IBM Research Center. Other algorithms: PipeHash and array-based.

    17. Parallel Computation: As organisations' data sizes have grown at an exponential rate, researchers have put increasing effort into parallel solutions for the problem. The algorithm uses a cluster-based approach: multiple processors, each with its own memory and processing capability, are grouped together to perform computations simultaneously for faster results.
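
    The slide does not spell out the algorithm, so the following is only a sketch of one common pattern, data-partitioned aggregation: each worker aggregates its own share of the rows, and the partial results are merged. Threads from Python's `concurrent.futures` stand in for cluster nodes here, which is an assumption of the example; on a real cluster each partition would live in a separate machine's memory. The function names and sample rows are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partial_aggregate(partition, view):
    """Aggregate one partition of the rows for a single group-by view."""
    totals = Counter()
    for row in partition:
        totals[tuple(row[d] for d in view)] += row["sales"]
    return totals

def parallel_group_by(rows, view, workers=2):
    """Split rows round-robin, aggregate each part concurrently, then merge."""
    partitions = [rows[i::workers] for i in range(workers)]
    merged = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(partial_aggregate, partitions, [view] * workers):
            merged.update(partial)  # Counter.update adds the partial sums
    return dict(merged)

# Hypothetical fact rows.
rows = [
    {"region": "East", "product": "pen", "sales": 10},
    {"region": "East", "product": "ink", "sales": 5},
    {"region": "West", "product": "pen", "sales": 7},
]
print(parallel_group_by(rows, ("region",)))
```

    Merging works because SUM is distributive: the sum of the per-partition sums equals the sum over all rows.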

    18. References: 1) S. Chaudhuri and Kyuseok Shim. Including Group-By in Query Optimization. In Proceedings of the Twentieth International Conference on Very Large Data Bases (VLDB). 2) S. Chaudhuri et al. An Overview of Data Warehousing and OLAP Technology. ACM SIGMOD Record.

    19. References: 3) J. Gray, A. Bosworth, A. Layman, and H. Pirahesh. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. In Proceedings of the 12th Intl. Conference on Data Engineering. 4) V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing Data Cubes Efficiently. In Proceedings of the ACM SIGMOD Conference on Management of Data.
