why multi dimensional databases
Download
Skip this Video
Download Presentation
Why Multi-Dimensional Databases?

Loading in 2 Seconds...

play fullscreen
1 / 53

Why Multi-Dimensional Databases? - PowerPoint PPT Presentation


  • 186 Views
  • Uploaded on

Why Multi-Dimensional Databases?. No single "best" data structure for all applications within an enterprise Select the most appropriate data structure on a case-by-case basis from a palette of standard database structures Multidimensional Databases for OLAP?. Why Multi-Dimensional Databases?.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Why Multi-Dimensional Databases?' - ghada


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
why multi dimensional databases
Why Multi-Dimensional Databases?
  • No single "best" data structure for all applications within an enterprise
  • Select the most appropriate data structure on a case-by-case basis from a palette of standard database structures
  • Multidimensional Databases for OLAP?
why multi dimensional databases1
Why Multi-Dimensional Databases?
  • From econometric research conducted at MIT in the 1960s, the multidimensional database has matured into the database engine of choice for data analysis applications
  • Inherent ability to integrate and analyze large volumes of enterprise data
  • Offers a good conceptual fit with the way end-users visualize business data
    • Most business people already think about their businesses in multidimensional terms
    • Managers tend to ask questions about product sales in different markets over specific time periods
multidimensional database
Multidimensional Database
  • Spreadsheets – A 2D database?
  • Functionalities
  • What about a stack of similar spreadsheets for different times?
  • Limitations?

We can not relate data in different

sheets easily

multidimensional database1
Multidimensional Database

An MDDB is a computer software system designed to allow for the efficient and convenient storage and retrieval system of large volumes of data that is

  • Intimately related &
  • Stored, viewed and analyzed form different perspectives

These perspectives are called Dimensions

a motivating example
A Motivating Example

An automobile manufacturer wants to increase sale volumes by examining sales data collected throughout the organization. The evaluation would require viewing historical sales volume figures from multiple dimensions such as

  • Sales volume by model
  • Sales volume by color
  • Sales volume by dealer
  • Sales volume over time
slide6
SALES VOLUMES FOR GLEASON DEALERSHIP
  • MODELCOLOR SALES VOLUME
  • -----------------------------------------------------------------------------------------------------------------------------------
  • MINI VAN BLUE 6
  • MINI VAN RED 5
  • MINI VAN WHITE 4
  • SPORTS COUPE BLUE 3
  • SPORTS COUPE RED 5
  • SPORTS COUPE WHITE 5
  • SEDAN BLUE 4
  • SEDAN RED 3
  • SEDAN WHITE 2
slide7

Multidimensional Array Structure

6

5

4

M

O

D

3

5

5

E

L

4

3

2

COLOR

Measurement

Sales Volumes

Dimension

MiniVan

Coupe

Sedan

Blue

Red

White

Dimension

Positions

rdbms vs mdd
RDBMS vs. MDD
  • Multidimensional array structure represents a higher level of organization than the relational table
  • Perspectives are embedded directly into the structure in the multidimensional model
      • All possible combinations of perspectives containing a specific attribute (the color BLUE, for example) line up along the dimension position for that attribute.
  • Perspectives are placed in fields in the relational model - tells us nothing about field contents.
slide9

RDBMS vs. MDD

  • MDD makes data browsing and manipulation intuitive to the end-user
  • Any data manipulation action possible with a MDD is also possible using relational technology
  • Substantial cognitive advantages in query formulation
  • Substantial computational performance advantages in query processing when using MDD
slide10

SALES VOLUMES FOR ALL DEALERSHIPS

MODELCOLORDEALERSHIPVOLUME

----------------------------------------------------------------------------------------------------------------------------------------------------

  • MINI VAN BLUE CLYDE 6
  • MINI VAN BLUE GLEASON 6
  • MINI VAN BLUE CARR 2
  • MINI VAN RED CLYDE 3
  • MINI VAN RED GLEASON 5
  • MINI VAN RED CARR 5
  • MINI VAN WHITE CLYDE 2
  • MINI VAN WHITE GLEASON 4
  • MINI VAN WHITE CARR 3
  • SPORTS COUPE BLUE CLYDE 2
  • SPORTS COUPE BLUE GLEASON 3
  • SPORTS COUPE BLUE CARR 2
  • SPORTS COUPE RED CLYDE 7
  • SPORTS COUPE RED GLEASON 5
  • SPORTS COUPE RED CARR 2
  • SPORTS COUPE WHITE CLYDE 4
  • SPORTS COUPE WHITE GLEASON 5
  • SPORTS COUPE WHITE CARR 1
  • SEDAN BLUE CLYDE 6
  • SEDAN BLUE GLEASON 4
  • SEDAN BLUE CARR 2
  • SEDAN RED CLYDE 1
  • SEDAN RED GLEASON 3
  • SEDAN RED CARR 4
  • SEDAN WHITE CLYDE 2
  • SEDAN WHITE GLEASON 2
  • SEDAN WHITE CARR 3
mutlidimensional representation

M

O

D

E

L

DEALERSHIP

COLOR

Mutlidimensional Representation

Sales Volumes

Mini Van

Coupe

Carr

Sedan

Gleason

Clyde

Blue

Red

White

viewing data an example

M

O

D

E

L

DEALERSHIP

COLOR

Viewing Data - An Example

Sales Volumes

Assume that each dimension has 10 positions, as shown in the cube above

viewing data an example1
Viewing Data - An Example
  • How many records would be there in a relational table?
  • Implications for viewing data from an end-user standpoint?

SALES VOLUMES FOR ALL DEALERSHIPS

MODEL COLOR DEALERSHIP VOLUME

MINI VAN BLUE CLYDE 2

MINI VAN BLUE GLEASON 2

MINI VAN BLUE CARR 2

MINI VAN RED CLYDE 1

MINI VAN WHITE GLEASON 3

RECORD NUMBER.... 998

RECORD NUMBER.... 999

RECORD NUMBER.... 1000

slide14

Performance Advantages

  • Volume figure when car type = SEDAN, color=BLUE, & dealer=GLEASON?
  • RDBMS – all 1000 records might need to be searched to find the right record
  • MDB has more ‘knowledge’ about where the data lies
  • Max. of 30 position searches!!
  • Average case

15 vs. 500

slide15

Performance Advantages

  • Total Sales across all colors and dealers when model = SEDAN?
  • RDBMS – all 1000 records must be searched to get the answer
  • MDB – Sum the contents of one 10x10 ‘slice’
slide16

Performance Advantages

  • Data manipulation that requires a minute in RDBMS may require only a few seconds in MDB
  • MDBs are an order of magnitude faster than RDBMSs
  • Performance benefits are more for queries that generate cross-tab views of data
  • The performance advantages offered by multidimensional technology facilitates the development of interactive decision support applications like OLAP that can be impractical in a relational environment.
slide17

RDBMS vs. MDB

  • Any data manipulation action possible with a multidimensional database is also possible using relational technology
  • MDBs however offer several advantages like:
    • Ease of data presentation and navigation
    • Ease of maintenance
    • Performance
slide18

Ease of Data Presentation & Navigation

  • Intuitive spreadsheet like data views are natural output of MDBs
  • Obtaining the same views in a relational environment, requires either a complex SQL or a SQL generator against a RDB to convert the table outputs into a more intuitive format
  • Top N queries are not possible with SQL at all
slide19

Ease of Maintenance

  • Ease of maintenance because data is stored as it is viewed
  • No additional overhead is required to translate user queries into requests for data
  • To provide same intuitiveness, RDBs use indexes and sophisticated joins which require significant maintenance and storage
slide20

Performance

  • Performance of MDBs can be matched by RDBs through database tuning
  • Not possible to tune the database for all possible adhoc queries
  • Tuning requires resources of an expensive DB specialist
  • Aggregate navigators are helping RDBs to catch up with MDBs as far as aggregation queries are concerned
adding dimension an example

M

O

D

E

L

DEALERSHIP

COLOR

COLOR

COLOR

JANUARY

FEBRUARY

MARCH

Adding Dimension - An Example

Sales Volumes

Mini Van

Mini Van

Mini Van

Coupe

Coupe

Coupe

Carr

Carr

Carr

Sedan

Sedan

Sedan

Gleason

Gleason

Gleason

Clyde

Clyde

Clyde

Blue

Red

White

Blue

Red

White

Blue

Red

White

when is mdd in appropriate
When is MDD (In)appropriate?

First, consider situation 1

PERSONNEL

EMPLOYEE#

LAST NAME

EMPLOYEE AGE

SMITH

01

21

REGAN

12

19

FOX

31

63

WELD

14

31

KELLY

54

27

LINK

03

56

KRANZ

41

45

LUCUS

33

41

WEISS

23

19

slide23

When is MDD (In)appropriate?

Now consider situation 2

1. Set up a MDD structure for situation 1, with LAST NAME

and Employee# as dimensions, and AGE as the measurement.

2. Set up a MDD structure for situation 2, with MODEL and

COLOR as dimensions, and SALES VOLUME as the measurement.

slide24

When is MDD (In)appropriate?

21

19

63

L

A

6

5

4

31

M

S

O

T

D

3

5

5

27

E

N

L

A

4

3

2

M

56

E

45

COLOR

41

19

EMPLOYEE #

MDD Structures for the Situations

EmployeeAge

Smith

Regan

Sales Volumes

Fox

Mini Van

Weld

Coupe

Kelly

Link

Sedan

Red

White

Blue

Kranz

Lucas

Weiss

31

41

23

01

14

54

03

12

33

Note the sparse between the two MDD representations

slide25

When is MDD (In)appropriate?

  • Our sales volume dataset has a great number of meaningful interrelationships
  • Interrelationships more meaningful than individual data elements themselves.
  • The greater the number of inherent interrelationships between the elements of a dataset, the more likely it is that a study of those interrelationships will yield business information of value to the company.
  • Highly interrelated dataset types be placed in a multidimensional data structure for greatest ease of access and analysis
slide26

When is MDD (In)appropriate?

  • No last name is matching with more than one emp # and no emp # is matching with more than one last name
  • In contrast, there is a sales figure associated with every combination of model and color resulting in a completed filled up 3x3 matrix
  • Performance suffers (RDB 9 vs. MDB 18)
slide27

When is MDD (In)appropriate?

  • The relative performance advantages of storing multidimensional data in a multidimensional array increase as the size of the dataset increases
  • The relative performance disadvantages of storing non-multidimensional data in a multidimensional array increase as the size of the dataset increases.
  • NO inherent value of storing Non-multidimensional data (employee data) in multidimensional arrays
slide28

When is MDD Appropriate?

  • The greater the number of inherent interrelationships between the elements of a dataset, the more likely it is that a study of those interrelationships will yield business information of value to the company.
  • Most companies have limited time and resources to devote to analyzing data
  • It therefore becomes critical that these highly interrelated dataset types be placed in a multidimensional data structure for greatest ease of access and analysis.
slide29

When is MDD Appropriate?

  • Examples of applications that are suited for
  • multidimensional technology:
  • Financial Analysis and Reporting
  • Budgeting
  • Promotion Tracking
  • Quality Assurance and Quality Control
  • Product Profitability
slide30

MDD Features - Rotation

6

5

4

6

3

4

M

C

O

O

D

L

3

5

5

5

5

3

E

O

L

R

4

3

2

4

5

2

o

( ROTATE 90

)

COLOR

MODEL

Sales Volumes

Mini Van

Blue

Coupe

Red

Sedan

White

Mini Van

Coupe

Sedan

Blue

Red

White

View #1: ModelxColor

View #2: ColorxModel

  • Also referred to as “data slicing.”
  • Each rotation yields a different slice or two dimensional table
  • of data.
mdd features rotation

Sales Volumes

M

C

C

Mini Van

Blue

Blue

O

O

O

D

L

L

Coupe

Red

Red

E

O

O

L

R

R

Mini Van

Carr

Carr

Sedan

White

White

Gleason

Gleason

Coupe

Clyde

Clyde

Sedan

Sedan

Coupe

Mini Van

Blue

Red

White

Carr

Gleason

Clyde

DEALERSHIP

COLOR

MODEL

o

o

o

( ROTATE 90

)

( ROTATE 90

)

( ROTATE 90

)

DEALERSHIP

DEALERSHIP

MODEL

View #3

View #1

View #2

D

D

E

E

A

A

L

L

Mini Van

Carr

Carr

E

E

M

R

R

O

Coupe

Gleason

Gleason

S

S

D

H

H

E

Blue

Blue

Mini Van

Sedan

I

I

L

Red

Clyde

Clyde

Red

Coupe

White

P

P

White

Sedan

Coupe

Sedan

Mini Van

White

Clyde

Gleason

Carr

Red

Blue

DEALERSHIP

COLOR

MODEL

o

o

( ROTATE 90

)

( ROTATE 90

)

MODEL

COLOR

COLOR

View #6

View #4

View #5

MDD Features - Rotation
mdd features rotation1
MDD Features - Rotation
  • All the six views can be obtained by simple rotation
  • In MDBs rotations are simple as no rearrangement of data is required
  • Rotation is also referred to as “data slicing”
  • No. of views
    • 2D – 2
    • 3D – 6
    • 4D - ?

24

mdd features ranging
MDD Features - Ranging
  • How sales volume of models painted with new metallic blue compared with the sales of normal blue color models?
  • The user knows that only Sports Coupe and Mini Van models have received the new paint treatment
  • Also the user knows that only 2 dealers viz, Carr and Clyde have unconstrained supply of these models
mdd features ranging1

Sales Volumes

Mini Van

M

Mini Van

O

D

Coupe

Carr

E

Coupe

Clyde

L

Normal

Metal

Blue

Blue

Carr

Clyde

DEALERSHIP

Normal

Metal

Blue

Blue

COLOR

MDD Features - Ranging
  • The end user selects the desired positions along each dimension.
  • Also referred to as "data dicing."
  • The data is scoped down to a subset grouping
mdd features ranging2
MDD Features - Ranging
  • The reduced array can now be rotated and used in computations in the same was as the parent array
  • Referred to as “Data Dicing” as data is scoped down to a subset grouping
  • Complex SQL query is required in RDB
  • Performance is better in MDB as less resource consuming searches are required
mdd features roll up drill down
MDD Features – Roll-Up & Drill-Down
  • Users want different views of the same data
  • For eg., Sales Volume by model vs, sales volume by dealership
  • Many times views are similar
  • Sales volume by dealership vs. volume by district
  • Natural relationship between Sales Volumes at the DEALERSHIP level and Sales Volumes at the DISTRICT level
  • Sales Volumes for all the dealerships in a district sum to the Sales Volumes for that district
mdd features roll up drill down1
MDD Features – Roll-Up & Drill-Down
  • Multidimensional database technology is specially designed to facilitate the handling of these natural relationships
  • Define two related aggregates on the same dimension
  • One aggregation is dealership and the other district
  • District is at a higher level of aggregation than dealership
mdd features roll ups drill downs
MDD Features - Roll-Ups & Drill Downs
  • The figure presents a definition of a hierarchy within
  • the organization dimension.
  • Aggregations perceived as being part of the same dimension.
  • Moving up and moving down levels in a hierarchy is referred to
  • as “roll-up” and “drill-down.”
queries
Queries
  • High degree of structure in MDB makes the query language very simple and efficient
  • Query language is intuitive
  • Output is immediately useful to end user
queries example
Queries: Example
  • Display sales volume by model for each dealership

PRINT TOTAL.(SALES_VOLUME KEEP MODEL DEALERSHIP)

Trends emerge and comparisons are easily made

DEALERSHIP

MODELCLYDE GLEASON CARR

MINI VAN 7 5 6

SPORTS COUPE 4 6 8

SEDAN 3 8 12

queries example1
Queries: Example
  • Corresponding SQL

SELECT MODEL, DEALERSHIP, SUM(SALES_VOLUME)

FROM SALES_VOLUME

GROUP BY MODEL, DEALERSHIP

ORDER BY MODEL, DEALERSHIP

MODEL | DEALERSHIP | SUM(SALES_VOLUME)

MINI VAN | CLYDE | 7

MINI VAN | GLEASON | 5

MINI VAN | CARR | 6

SPORTS COUPE | CLYDE | 4

SPORTS COUPE | GLEASON | 6

SPORTS COUPE| CARR | 8

SEDAN | CLYDE | 3

SEDAN | GLEASON | 8

SEDAN| CARR | 12

queries example2
Queries: Example

Use report writer in addition to SQL and we get

MINI VAN

CLYDE 7

GLEASON 5

CARR 6

SPORTS COUPE

CLYDE 4

GLEASON 5

CARR 8

SEDAN

CLYDE 3

GLEASON 8

CARR 12

mdd features multidimensional computations
MDD Features:Multidimensional Computations
  • Well equipped to handle demanding mathematical functions.
  • Can treat arrays like cells in spreadsheets. For example, in a budget analysis situation, one can divide the ACTUAL array by the BUDGET array to compute the VARIANCE array.
  • Applications based on multidimensional database technology typically have one dimension defined as a "business measurements" dimension.
  • Integrates computational tools very tightly with the database structure.
mdd features multidimensional computations1

Sales Volumes

10

0.1

11

Mini Van

Coupe

12

0.33

16

-

10

8

0.2

Sedan

16

0.0

16

Variance

Budget

Actual

BUSINESS MEASUREMENTS

MDD Features:Multidimensional Computations
the time dimension
The Time Dimension
  • TIME as a predefined hierarchy for rolling-up and drilling-down across days, weeks, months, years and special periods, such as fiscal years.
    • Eliminates the effort required to build sophisticated hierarchies every time a database is set up.
    • Extra performance advantages
rdbms vs mddb
RDBMS vs. MDDB
  • DO I still use RDBMS for my DW?
  • MDDBs store data in hypercube, i.e., multidimensional array
  • RDBMS store data as tables with row and columns that do not map directly to multidimensional view that user have of data
  • EDW – RDBMS
  • Data Marts - MDDB
rdbms vs mddb trade offs
RDBMS vs. MDDB: Trade-Offs
  • SIZE
    • MDDBs limited by size
      • Mid – 1990s 10GB caused problems
      • Today – 100GB is OK
    • Large DWs are still better served by relational front-ends running against high performance and scalable RDBMS
  • VOLATILITY
    • Highly volatile data are better handled by RDBMS
    • MDDBs take long to load and update
rdbms vs mddb trade offs1
RDBMS vs. MDDB: Trade-Offs
  • AGGREGATE STRATEGY
    • MDDBs support aggregates better
    • RDBMSs are catching up with the help of Aggregate Navigators
  • INVESTMENT PROTECTION
    • Most organizations already have made significant investments in relational technology and skill sets
    • Continued use for another purpose (DW) provides additional ROI and lowers technical risk of failure
    • MDDBs – need to acquire new software and train staff to use it
rdbms vs mddb trade offs2
RDBMS vs. MDDB: Trade-Offs

TYPE OF USERS

  • Power users prefer the range of functionalities available in MOLAP tools
  • Users that require broad views of enterprise data require access to DW and therefore better served by a ROLAP tool
integrated architecture
INTEGRATED ARCHITECTURE
  • DB vendors have integrated their multidimensional and relational database products
  • Multidimensional Front-end tools
  • If queries require data that are not available in MDDB, the tools retrieve the data from the larger RDB
  • Known as “DRILL-THROUGH”
ad