Why multi dimensional databases
This presentation is the property of its rightful owner.
Sponsored Links
1 / 53

Why Multi-Dimensional Databases? PowerPoint PPT Presentation


  • 124 Views
  • Uploaded on
  • Presentation posted in: General

Why Multi-Dimensional Databases?. No single "best" data structure for all applications within an enterprise Select the most appropriate data structure on a case-by-case basis from a palette of standard database structures Multidimensional Databases for OLAP?. Why Multi-Dimensional Databases?.

Download Presentation

Why Multi-Dimensional Databases?

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Why multi dimensional databases

Why Multi-Dimensional Databases?

  • No single "best" data structure for all applications within an enterprise

  • Select the most appropriate data structure on a case-by-case basis from a palette of standard database structures

  • Multidimensional Databases for OLAP?


Why multi dimensional databases1

Why Multi-Dimensional Databases?

  • From econometric research conducted at MIT in the 1960s, the multidimensional database has matured into the database engine of choice for data analysis applications

  • Inherent ability to integrate and analyze large volumes of enterprise data

  • Offers a good conceptual fit with the way end-users visualize business data

    • Most business people already think about their businesses in multidimensional terms

    • Managers tend to ask questions about product sales in different markets over specific time periods


Multidimensional database

Multidimensional Database

  • Spreadsheets – A 2D database?

  • Functionalities

  • What about a stack of similar spreadsheets for different times?

  • Limitations?

We can not relate data in different

sheets easily


Multidimensional database1

Multidimensional Database

An MDDB is a computer software system designed to allow for the efficient and convenient storage and retrieval system of large volumes of data that is

  • Intimately related &

  • Stored, viewed and analyzed form different perspectives

    These perspectives are called Dimensions


A motivating example

A Motivating Example

An automobile manufacturer wants to increase sale volumes by examining sales data collected throughout the organization. The evaluation would require viewing historical sales volume figures from multiple dimensions such as

  • Sales volume by model

  • Sales volume by color

  • Sales volume by dealer

  • Sales volume over time


Why multi dimensional databases

SALES VOLUMES FOR GLEASON DEALERSHIP

  • MODELCOLORSALES VOLUME

  • -----------------------------------------------------------------------------------------------------------------------------------

  • MINI VANBLUE6

  • MINI VANRED5

  • MINI VANWHITE4

  • SPORTS COUPEBLUE3

  • SPORTS COUPERED5

  • SPORTS COUPEWHITE5

  • SEDANBLUE4

  • SEDANRED3

  • SEDANWHITE2


Why multi dimensional databases

Multidimensional Array Structure

6

5

4

M

O

D

3

5

5

E

L

4

3

2

COLOR

Measurement

Sales Volumes

Dimension

MiniVan

Coupe

Sedan

Blue

Red

White

Dimension

Positions


Rdbms vs mdd

RDBMS vs. MDD

  • Multidimensional array structure represents a higher level of organization than the relational table

  • Perspectives are embedded directly into the structure in the multidimensional model

    • All possible combinations of perspectives containing a specific attribute (the color BLUE, for example) line up along the dimension position for that attribute.

  • Perspectives are placed in fields in the relational model - tells us nothing about field contents.


  • Why multi dimensional databases

    RDBMS vs. MDD

    • MDD makes data browsing and manipulation intuitive to the end-user

    • Any data manipulation action possible with a MDD is also possible using relational technology

    • Substantial cognitive advantages in query formulation

    • Substantial computational performance advantages in query processing when using MDD


    Why multi dimensional databases

    SALES VOLUMES FOR ALL DEALERSHIPS

    MODELCOLORDEALERSHIPVOLUME

    ----------------------------------------------------------------------------------------------------------------------------------------------------

    • MINI VANBLUECLYDE6

    • MINI VANBLUEGLEASON6

    • MINI VANBLUECARR2

    • MINI VANREDCLYDE3

    • MINI VANREDGLEASON5

    • MINI VANREDCARR5

    • MINI VANWHITECLYDE2

    • MINI VANWHITEGLEASON4

    • MINI VANWHITECARR3

    • SPORTS COUPEBLUECLYDE2

    • SPORTS COUPEBLUEGLEASON3

    • SPORTS COUPEBLUECARR2

    • SPORTS COUPEREDCLYDE7

    • SPORTS COUPEREDGLEASON5

    • SPORTS COUPEREDCARR2

    • SPORTS COUPEWHITECLYDE4

    • SPORTS COUPEWHITEGLEASON5

    • SPORTS COUPEWHITECARR1

    • SEDANBLUECLYDE6

    • SEDANBLUEGLEASON4

    • SEDANBLUECARR2

    • SEDANREDCLYDE1

    • SEDANREDGLEASON3

    • SEDANREDCARR4

    • SEDANWHITECLYDE2

    • SEDANWHITEGLEASON2

    • SEDANWHITECARR3


    Mutlidimensional representation

    M

    O

    D

    E

    L

    DEALERSHIP

    COLOR

    Mutlidimensional Representation

    Sales Volumes

    Mini Van

    Coupe

    Carr

    Sedan

    Gleason

    Clyde

    Blue

    Red

    White


    Viewing data an example

    M

    O

    D

    E

    L

    DEALERSHIP

    COLOR

    Viewing Data - An Example

    Sales Volumes

    Assume that each dimension has 10 positions, as shown in the cube above


    Viewing data an example1

    Viewing Data - An Example

    • How many records would be there in a relational table?

    • Implications for viewing data from an end-user standpoint?

    SALES VOLUMES FOR ALL DEALERSHIPS

    MODEL COLORDEALERSHIPVOLUME

    MINI VANBLUECLYDE 2

    MINI VAN BLUEGLEASON 2

    MINI VANBLUECARR 2

    MINI VANREDCLYDE 1

    MINI VANWHITEGLEASON 3

    RECORD NUMBER.... 998

    RECORD NUMBER.... 999

    RECORD NUMBER....1000


    Why multi dimensional databases

    Performance Advantages

    • Volume figure when car type = SEDAN, color=BLUE, & dealer=GLEASON?

    • RDBMS – all 1000 records might need to be searched to find the right record

    • MDB has more ‘knowledge’ about where the data lies

    • Max. of 30 position searches!!

    • Average case

      15 vs. 500


    Why multi dimensional databases

    Performance Advantages

    • Total Sales across all colors and dealers when model = SEDAN?

    • RDBMS – all 1000 records must be searched to get the answer

    • MDB – Sum the contents of one 10x10 ‘slice’


    Why multi dimensional databases

    Performance Advantages

    • Data manipulation that requires a minute in RDBMS may require only a few seconds in MDB

    • MDBs are an order of magnitude faster than RDBMSs

    • Performance benefits are more for queries that generate cross-tab views of data

    • The performance advantages offered by multidimensional technology facilitates the development of interactive decision support applications like OLAP that can be impractical in a relational environment.


    Why multi dimensional databases

    RDBMS vs. MDB

    • Any data manipulation action possible with a multidimensional database is also possible using relational technology

    • MDBs however offer several advantages like:

      • Ease of data presentation and navigation

      • Ease of maintenance

      • Performance


    Why multi dimensional databases

    Ease of Data Presentation & Navigation

    • Intuitive spreadsheet like data views are natural output of MDBs

    • Obtaining the same views in a relational environment, requires either a complex SQL or a SQL generator against a RDB to convert the table outputs into a more intuitive format

    • Top N queries are not possible with SQL at all


    Why multi dimensional databases

    Ease of Maintenance

    • Ease of maintenance because data is stored as it is viewed

    • No additional overhead is required to translate user queries into requests for data

    • To provide same intuitiveness, RDBs use indexes and sophisticated joins which require significant maintenance and storage


    Why multi dimensional databases

    Performance

    • Performance of MDBs can be matched by RDBs through database tuning

    • Not possible to tune the database for all possible adhoc queries

    • Tuning requires resources of an expensive DB specialist

    • Aggregate navigators are helping RDBs to catch up with MDBs as far as aggregation queries are concerned


    Adding dimension an example

    M

    O

    D

    E

    L

    DEALERSHIP

    COLOR

    COLOR

    COLOR

    JANUARY

    FEBRUARY

    MARCH

    Adding Dimension - An Example

    Sales Volumes

    Mini Van

    Mini Van

    Mini Van

    Coupe

    Coupe

    Coupe

    Carr

    Carr

    Carr

    Sedan

    Sedan

    Sedan

    Gleason

    Gleason

    Gleason

    Clyde

    Clyde

    Clyde

    Blue

    Red

    White

    Blue

    Red

    White

    Blue

    Red

    White


    When is mdd in appropriate

    When is MDD (In)appropriate?

    First, consider situation 1

    PERSONNEL

    EMPLOYEE#

    LAST NAME

    EMPLOYEE AGE

    SMITH

    01

    21

    REGAN

    12

    19

    FOX

    31

    63

    WELD

    14

    31

    KELLY

    54

    27

    LINK

    03

    56

    KRANZ

    41

    45

    LUCUS

    33

    41

    WEISS

    23

    19


    Why multi dimensional databases

    When is MDD (In)appropriate?

    Now consider situation 2

    1. Set up a MDD structure for situation 1, with LAST NAME

    and Employee# as dimensions, and AGE as the measurement.

    2. Set up a MDD structure for situation 2, with MODEL and

    COLOR as dimensions, and SALES VOLUME as the measurement.


    Why multi dimensional databases

    When is MDD (In)appropriate?

    21

    19

    63

    L

    A

    6

    5

    4

    31

    M

    S

    O

    T

    D

    3

    5

    5

    27

    E

    N

    L

    A

    4

    3

    2

    M

    56

    E

    45

    COLOR

    41

    19

    EMPLOYEE #

    MDD Structures for the Situations

    EmployeeAge

    Smith

    Regan

    Sales Volumes

    Fox

    Mini Van

    Weld

    Coupe

    Kelly

    Link

    Sedan

    Red

    White

    Blue

    Kranz

    Lucas

    Weiss

    31

    41

    23

    01

    14

    54

    03

    12

    33

    Note the sparse between the two MDD representations


    Why multi dimensional databases

    When is MDD (In)appropriate?

    • Our sales volume dataset has a great number of meaningful interrelationships

    • Interrelationships more meaningful than individual data elements themselves.

    • The greater the number of inherent interrelationships between the elements of a dataset, the more likely it is that a study of those interrelationships will yield business information of value to the company.

    • Highly interrelated dataset types be placed in a multidimensional data structure for greatest ease of access and analysis


    Why multi dimensional databases

    When is MDD (In)appropriate?

    • No last name is matching with more than one emp # and no emp # is matching with more than one last name

    • In contrast, there is a sales figure associated with every combination of model and color resulting in a completed filled up 3x3 matrix

    • Performance suffers (RDB 9 vs. MDB 18)


    Why multi dimensional databases

    When is MDD (In)appropriate?

    • The relative performance advantages of storing multidimensional data in a multidimensional array increase as the size of the dataset increases

    • The relative performance disadvantages of storing non-multidimensional data in a multidimensional array increase as the size of the dataset increases.

    • NO inherent value of storing Non-multidimensional data (employee data) in multidimensional arrays


    Why multi dimensional databases

    When is MDD Appropriate?

    • The greater the number of inherent interrelationships between the elements of a dataset, the more likely it is that a study of those interrelationships will yield business information of value to the company.

    • Most companies have limited time and resources to devote to analyzing data

    • It therefore becomes critical that these highly interrelated dataset types be placed in a multidimensional data structure for greatest ease of access and analysis.


    Why multi dimensional databases

    When is MDD Appropriate?

    • Examples of applications that are suited for

    • multidimensional technology:

    • Financial Analysis and Reporting

    • Budgeting

    • Promotion Tracking

    • Quality Assurance and Quality Control

    • Product Profitability


    Why multi dimensional databases

    MDD Features - Rotation

    6

    5

    4

    6

    3

    4

    M

    C

    O

    O

    D

    L

    3

    5

    5

    5

    5

    3

    E

    O

    L

    R

    4

    3

    2

    4

    5

    2

    o

    ( ROTATE 90

    )

    COLOR

    MODEL

    Sales Volumes

    Mini Van

    Blue

    Coupe

    Red

    Sedan

    White

    Mini Van

    Coupe

    Sedan

    Blue

    Red

    White

    View #1: ModelxColor

    View #2: ColorxModel

    • Also referred to as “data slicing.”

    • Each rotation yields a different slice or two dimensional table

    • of data.


    Mdd features rotation

    Sales Volumes

    M

    C

    C

    Mini Van

    Blue

    Blue

    O

    O

    O

    D

    L

    L

    Coupe

    Red

    Red

    E

    O

    O

    L

    R

    R

    Mini Van

    Carr

    Carr

    Sedan

    White

    White

    Gleason

    Gleason

    Coupe

    Clyde

    Clyde

    Sedan

    Sedan

    Coupe

    Mini Van

    Blue

    Red

    White

    Carr

    Gleason

    Clyde

    DEALERSHIP

    COLOR

    MODEL

    o

    o

    o

    ( ROTATE 90

    )

    ( ROTATE 90

    )

    ( ROTATE 90

    )

    DEALERSHIP

    DEALERSHIP

    MODEL

    View #3

    View #1

    View #2

    D

    D

    E

    E

    A

    A

    L

    L

    Mini Van

    Carr

    Carr

    E

    E

    M

    R

    R

    O

    Coupe

    Gleason

    Gleason

    S

    S

    D

    H

    H

    E

    Blue

    Blue

    Mini Van

    Sedan

    I

    I

    L

    Red

    Clyde

    Clyde

    Red

    Coupe

    White

    P

    P

    White

    Sedan

    Coupe

    Sedan

    Mini Van

    White

    Clyde

    Gleason

    Carr

    Red

    Blue

    DEALERSHIP

    COLOR

    MODEL

    o

    o

    ( ROTATE 90

    )

    ( ROTATE 90

    )

    MODEL

    COLOR

    COLOR

    View #6

    View #4

    View #5

    MDD Features - Rotation


    Mdd features rotation1

    MDD Features - Rotation

    • All the six views can be obtained by simple rotation

    • In MDBs rotations are simple as no rearrangement of data is required

    • Rotation is also referred to as “data slicing”

    • No. of views

      • 2D – 2

      • 3D – 6

      • 4D - ?

    24


    Mdd features ranging

    MDD Features - Ranging

    • How sales volume of models painted with new metallic blue compared with the sales of normal blue color models?

    • The user knows that only Sports Coupe and Mini Van models have received the new paint treatment

    • Also the user knows that only 2 dealers viz, Carr and Clyde have unconstrained supply of these models


    Mdd features ranging1

    Sales Volumes

    Mini Van

    M

    Mini Van

    O

    D

    Coupe

    Carr

    E

    Coupe

    Clyde

    L

    Normal

    Metal

    Blue

    Blue

    Carr

    Clyde

    DEALERSHIP

    Normal

    Metal

    Blue

    Blue

    COLOR

    MDD Features - Ranging

    • The end user selects the desired positions along each dimension.

    • Also referred to as "data dicing."

    • The data is scoped down to a subset grouping


    Mdd features ranging2

    MDD Features - Ranging

    • The reduced array can now be rotated and used in computations in the same was as the parent array

    • Referred to as “Data Dicing” as data is scoped down to a subset grouping

    • Complex SQL query is required in RDB

    • Performance is better in MDB as less resource consuming searches are required


    Mdd features roll up drill down

    MDD Features – Roll-Up & Drill-Down

    • Users want different views of the same data

    • For eg., Sales Volume by model vs, sales volume by dealership

    • Many times views are similar

    • Sales volume by dealership vs. volume by district

    • Natural relationship between Sales Volumes at the DEALERSHIP level and Sales Volumes at the DISTRICT level

    • Sales Volumes for all the dealerships in a district sum to the Sales Volumes for that district


    Mdd features roll up drill down1

    MDD Features – Roll-Up & Drill-Down

    • Multidimensional database technology is specially designed to facilitate the handling of these natural relationships

    • Define two related aggregates on the same dimension

    • One aggregation is dealership and the other district

    • District is at a higher level of aggregation than dealership


    Mdd features roll ups drill downs

    MDD Features - Roll-Ups & Drill Downs

    • The figure presents a definition of a hierarchy within

    • the organization dimension.

    • Aggregations perceived as being part of the same dimension.

    • Moving up and moving down levels in a hierarchy is referred to

    • as “roll-up” and “drill-down.”


    Why multi dimensional databases

    MDD Features - Roll-Ups & Drill Downs


    Mdd features drill down through a dimension

    MDD Features:Drill-Down Through a Dimension


    Queries

    Queries

    • High degree of structure in MDB makes the query language very simple and efficient

    • Query language is intuitive

    • Output is immediately useful to end user


    Queries example

    Queries: Example

    • Display sales volume by model for each dealership

      PRINT TOTAL.(SALES_VOLUME KEEP MODEL DEALERSHIP)

      Trends emerge and comparisons are easily made

    DEALERSHIP

    MODELCLYDE GLEASON CARR

    MINI VAN7 56

    SPORTS COUPE4 68

    SEDAN 3 812


    Queries example1

    Queries: Example

    • Corresponding SQL

      SELECT MODEL, DEALERSHIP, SUM(SALES_VOLUME)

      FROM SALES_VOLUME

      GROUP BY MODEL, DEALERSHIP

      ORDER BY MODEL, DEALERSHIP

    MODEL|DEALERSHIP|SUM(SALES_VOLUME)

    MINI VAN|CLYDE|7

    MINI VAN|GLEASON|5

    MINI VAN|CARR|6

    SPORTS COUPE|CLYDE|4

    SPORTS COUPE|GLEASON|6

    SPORTS COUPE|CARR|8

    SEDAN|CLYDE|3

    SEDAN|GLEASON|8

    SEDAN|CARR |12


    Queries example2

    Queries: Example

    Use report writer in addition to SQL and we get

    MINI VAN

    CLYDE7

    GLEASON5

    CARR6

    SPORTS COUPE

    CLYDE4

    GLEASON5

    CARR8

    SEDAN

    CLYDE3

    GLEASON8

    CARR12


    Mdd features multidimensional computations

    MDD Features:Multidimensional Computations

    • Well equipped to handle demanding mathematical functions.

    • Can treat arrays like cells in spreadsheets. For example, in a budget analysis situation, one can divide the ACTUAL array by the BUDGET array to compute the VARIANCE array.

    • Applications based on multidimensional database technology typically have one dimension defined as a "business measurements" dimension.

    • Integrates computational tools very tightly with the database structure.


    Mdd features multidimensional computations1

    Sales Volumes

    10

    0.1

    11

    Mini Van

    Coupe

    12

    0.33

    16

    -

    10

    8

    0.2

    Sedan

    16

    0.0

    16

    Variance

    Budget

    Actual

    BUSINESS MEASUREMENTS

    MDD Features:Multidimensional Computations


    The time dimension

    The Time Dimension

    • TIME as a predefined hierarchy for rolling-up and drilling-down across days, weeks, months, years and special periods, such as fiscal years.

      • Eliminates the effort required to build sophisticated hierarchies every time a database is set up.

      • Extra performance advantages


    Contrasting relational model and md model

    Contrasting Relational Model and MD Model


    Rdbms vs mddb

    RDBMS vs. MDDB

    • DO I still use RDBMS for my DW?

    • MDDBs store data in hypercube, i.e., multidimensional array

    • RDBMS store data as tables with row and columns that do not map directly to multidimensional view that user have of data

    • EDW – RDBMS

    • Data Marts - MDDB


    Rdbms vs mddb trade offs

    RDBMS vs. MDDB: Trade-Offs

    • SIZE

      • MDDBs limited by size

        • Mid – 1990s 10GB caused problems

        • Today – 100GB is OK

      • Large DWs are still better served by relational front-ends running against high performance and scalable RDBMS

    • VOLATILITY

      • Highly volatile data are better handled by RDBMS

      • MDDBs take long to load and update


    Rdbms vs mddb trade offs1

    RDBMS vs. MDDB: Trade-Offs

    • AGGREGATE STRATEGY

      • MDDBs support aggregates better

      • RDBMSs are catching up with the help of Aggregate Navigators

    • INVESTMENT PROTECTION

      • Most organizations already have made significant investments in relational technology and skill sets

      • Continued use for another purpose (DW) provides additional ROI and lowers technical risk of failure

      • MDDBs – need to acquire new software and train staff to use it


    Rdbms vs mddb trade offs2

    RDBMS vs. MDDB: Trade-Offs

    TYPE OF USERS

    • Power users prefer the range of functionalities available in MOLAP tools

    • Users that require broad views of enterprise data require access to DW and therefore better served by a ROLAP tool


    Integrated architecture

    INTEGRATED ARCHITECTURE

    • DB vendors have integrated their multidimensional and relational database products

    • Multidimensional Front-end tools

    • If queries require data that are not available in MDDB, the tools retrieve the data from the larger RDB

    • Known as “DRILL-THROUGH”


  • Login