Reducing the response time for data warehouse queries using rough set theory
Download
1 / 16

Reducing the Response Time for Data Warehouse Queries Using Rough Set Theory - PowerPoint PPT Presentation


  • 70 Views
  • Uploaded on

Reducing the Response Time for Data Warehouse Queries Using Rough Set Theory. By Mahmoud Mohamed Al-Bouraie Yasser Fouad Mahmoud Hassan Wesam Fathy Jasser. Outline. Aim Pre-grouping Transformation Hierarchical pre-grouping Attribute Selection A Heuristic Algorithm for Attribute Selection

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Reducing the Response Time for Data Warehouse Queries Using Rough Set Theory' - xena


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Reducing the response time for data warehouse queries using rough set theory

Reducing the Response Time for Data Warehouse Queries Using Rough Set Theory

By

Mahmoud Mohamed Al-Bouraie

Yasser Fouad Mahmoud Hassan

Wesam Fathy Jasser


Outline
Outline Rough Set Theory

  • Aim

  • Pre-grouping Transformation

  • Hierarchical pre-grouping

  • Attribute Selection

  • A Heuristic Algorithm for Attribute Selection

  • Worked Example

  • Applications

  • Conclusion


Aim Rough Set Theory

  • To reach to the optimization case.

  • This case is happened when the response time for processing a query become small as possible using

    • Pre-grouping transformation and

    • The size of any database become small as possible

      (using rough set theory).


Pre grouping transformation 1
Pre-grouping Transformation (1) Rough Set Theory

  • It dependents on some concepts like star schema structure as shown in the figure it contents of facttable surrounded by dimension tables. The relationships between them is 1: N.


Pre grouping transformation 2
Pre-grouping Transformation (2) Rough Set Theory

  • Another concept that that this transformation dependents on is hierarchical clustering of data.

  • It is based on the idea that the hierarchy of one dimension are encoded into hierarchical surrogates used in the fact table.

  • There is a compact representation of the hierarchy path of a dimension member making it possible to use hierarchy on the fact table without requiring residual joins. The figure shows an example


Hierarchical pre grouping
Hierarchical pre-grouping Rough Set Theory

  • We assume that the DBMS has information about hierarchical relationships of the dimension attributes.

  • We group on the highest hierarchy level to reduces the number of resulting groups.

  • The groups of the pre-grouping operation are joined with the dimension tables, in order to get the values for the grouping attributes.


Attribute selection
Attribute Selection Rough Set Theory

  • Depending on rough set theory, a database always contains a lot of attributes that are redundant.

  • To eliminate these redundant attributes we use attribute selection that used to find an optimal subset of attributes in a database according to some criterion, so that a classifier with the highest possible accuracy can be induced by learning algorithm using information about data available only from the subset of attributes.


A heuristic algorithm for attribute selection
A Heuristic Algorithm Rough Set Theoryfor Attribute Selection

  • Let R be a set of the selected attributes, P be the set of unselected condition attributes, U be the set of all objects, X be the set of contradictory objects, Va denotes the attribute a values and EXPECT be the threshold of accuracy.

  • In the initial state, R = CORE(C),

    k = 0.


Attribute selection using rsh 1
Attribute Selection using RSH (1) Rough Set Theory

  • Step 1.If k >= EXPECT, finish, otherwise calculate the dependency degree, k,

  • Step 2. For each p in P, calculate

  • where max_size denotes the cardinality of the maximal subset.


Attribute selection using rsh 2
Attribute Selection using RSH (2) Rough Set Theory

  • Step 3. Choose the best attribute p with the

    largest and let

  • Step 4. Remove all consistent instances u in from X.

  • Step 5. Go back to Step 1.


Worked example of attribute selection
Worked Example of Attribute Selection Rough Set Theory

Condition Attributes:

a: Va = {1, 2}

b: Vb = {0, 1, 2}

c: Vc = {0, 1, 2}

d: Vd = {0, 1}

Decision Attribute:

e: Ve = {0, 1, 2}


R={b} Rough Set Theory

  • After deleting all consistent objects we have:

T

T’

The instances containing b0 will not be considered.


U/{a,b} Rough Set Theory

1. Selecting {a}

R = {a,b}

u5

u3

u6

u4

u7

U/{e}

u3,u5,u6

u4

u7


Also, we select {c} and {d}. Then finally we found: Rough Set Theory

Result: Subset of attributes= {b, d}


Application and result
Application and Result Rough Set Theory

  • We built a simple database depending on Heart diseases dataset using Excel file. The attributes information and their types will be as following:

  • Attribute Information:

  • 1) age 2) sex 3) chest pain type (4 values)

  • 4) resting blood pressure 5) serum cholesterol in mg/dl

  • 6) fasting blood sugar > 120 mg/dl

  • 7) resting electrocardiograph results (values 0, 1, and 2)

  • 8) maximum heart rate achieved 9) exercise induced angina

  • 10) old peak = ST depression induced by exercise relative to rest

  • 11) the slope of the peak exercise ST segment

  • 12) number of major vessels (0-3) colored by fluoroscopy

  • 13) thal: 3 = normal; 6 = fixed defect; 7 = reversable defect

  • Attributes types

  • Real: 1,4,5,8,10,12 Ordered: 11 Binary: 2,6,9 Nominal: 7,3,13

  • Using Rosetta software which is used for analyze the data; Then the database reduced. After that we generate the rules on the best reduct. Finally we filtered the rules using Quality filtering loop. The result for our experiment like this in Table .


Conclusion
Conclusion Rough Set Theory

  • The star query: the most common type of query in data warehouse,

  • One of the most promising techniques for efficiently evaluating such queries is the use of fact table organizations that store data clustered according to the dimension hierarchies.

    • A special hierarchical encoding is imposed on star joins are transformed to multidimensional range queries on the underlying multidimensional structures. The conventional star query evaluation plan changes radically and new processing steps are required.


ad