1 / 17

Fragmentation

Fragmentation. Univ.-Prof. Dr. Peter Brezany Institut für Scientific Computing Universität Wien Tel. 4277 3 94 25 Sprechstunde: Di, 1 3.0 0-1 4 . 0 0 LV-Portal: www.par.univie.ac.at/~brezany/teach/gckfk/300658.html. We already presented the various fragmentation strategies.

aatkinson
Download Presentation

Fragmentation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Fragmentation Univ.-Prof. Dr. Peter Brezany Institut für Scientific Computing Universität Wien Tel. 4277 39425 Sprechstunde: Di, 13.00-14.00 LV-Portal: www.par.univie.ac.at/~brezany/teach/gckfk/300658.html

  2. We already presented the various fragmentation strategies. Fragmentation strategies: horizontal vertical nesting fragments in a hybrid fashion. Introduction

  3. Primary horizontal fragmentation of a relation is performed using predicates that are defined on that relation. Derived horizontal fragmentation is the partitioning of a relation that results from predicates being defined on another relation. Information requirements of horizontal fragmentation Database information Application information Horizontal Fragmentation

  4. Database Example

  5. It concerns the global conceptual schema. In this context it is important to note how the database relations are connected to one another, especially with joins. Example: Database Information Given link L1 of the above figure, the owner and member functions have the the following values: owner(L1) = PAY; member(L1) = EMP The quantitative information required about the database is the cardinality of each relation R, denoted card(R).

  6. It is required: qualitative information, which guides the fragmentation activity quantitative information is incorporated primarily into the allocation models. The fundamental qualitative information consists of the predicates used in user queries.It is not possible to analyze all of the user applications to determine these predicates  one should at least investigate the most “important“ ones. a rule of thumb: the most active 20% of user queries account for 80% of the total data access. Simple predicates: Given a relation R(A1, A2, ..., An), where Ai is an attribute defined over domain Di, a simple predicate pj defined on R has the form pj : Ai Value where   {, , , , , } and Value  Di. We use Pri to denote the set of all simple predicates defined on a relation Ri. The members of Pri are denoted by pij. Example: for the relation instance PROJ: PNAME = “Maintenance“ BUDGET  200000 Application Information

  7. Application Information (cont.) User queries often include more complicated predicates, which are Boolean combinations of simple predicates. One important combination: minterm predicate – conjunction of simple predicates. Given a set of simple predicates for relation Ri, the set of minterm predicates is defined as where or So each simple predicate can occur in a minterm predicate either in its natural form or ist negated form.

  8. Application Information (cont.) Example:

  9. In terms of quantitative information about the user applications, we need to have 2 sets of data: Minterm selectivity: number of tuples of the relation that would be accessed by a user query specified according to a given minterm predicate. E.g., in previous example, sel(m1)=0 since there are no tuples in PAY that satisfy the minterm predicate. sel(m2)=1 Access frequency: frequency with which user applications access data. If Q = {q1, q2, ..., qq} is a set of user queries, acc(qi) indicates the access frequency of query qi in a given period. The minterm access frequencies can be determined from the query frequencies  acc(mi) – the access frequency of a minterm mi. Application Information (cont.)

  10. It is defined by a selection operation on the owner relations of a database schema. Given a relation R, its horizontal fragments are given by Ri = Fi(R), 1  i  w where Fi is the selection formula used to obtain fragment Ri. Example : PROJ  PROJ1 and PROJ2 PROJ1 = BUDGET  200000(PROJ) PROJ2 = BUDGET  200000(PROJ) Primary Horizontal Fragmentation

  11. Primary Horizontal Fragmentation (cont.) Example:

  12. A more formal definition of a horizontal fragment: A horizontal fragment of relation Ri consists of all the tuples of R that satisfy a minterm predicate mj. Hence, given a set of minterm predicates M, there are as many horizontal fragments of R as there are minterm predicates.  minterm fragments. An important aspect of simple predicates is their completeness; another is their minimality. A set od simple predicates Pr is said to be complete if and only if there is an equal probability of access by every application to any tuple belonging to any minterm fragment that is defined according to Pr. Primary Horizontal Fragmentation (cont.)

  13. Example: Consider the fragmentation of PROJ in the last example. If the only application that accesses PROJ wants to access the tuples according to the location, the set is complete since each tuple of each fragment PROJi, has the same probability of being accessed. If there is a second application which accesses only those project tuples where the budegt is less than $200.000, then Pr is not complete. Some of the tuples within each PROJi have a higher probability of being accessed due to this second application. To make the set of predicates complete, we need to add (BUDGET  200000, BUDGET > 20000) to Pr: Pr = {LOC=“Montreal“, LOC=“New York“, LOC=“Paris“, BUDGET200000, BUDGET > 20000} Primary Horizontal Fragmentation (cont.)

  14. The second desirable property of the set of predicates, according to which minterm predicates and turn, fragments are to be defined, is minimality. If a predicate influences how fragmentation is performed (i.e., causes a fragment f to be further fragmented into, say, fi and fj), there should be at least one application that accesses fi and fj differently. In other words, the simple predicate should be relevant in determining a fragmentation. If all the predicates of a set Pr are relevant, Pr is minimal. Primary Horizontal Fragmentation (cont.)

  15. Example: The set Pr defined in the previous example is complete and minimal. If, however, we were to add the predicate PNAME = „Instrumentation“ to Pr, the resulting set would not be minimal since the new predicate is not relevant with respect to Pr. There is no application that would access the resulting fragments any differently. Primary Horizontal Fragmentation (cont.)

  16. Derived Horizontal Fragmentation A derived horizontal fragmentation is defined on a member relation of a link according to a selection operation specified on its owner. Given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as Ri = R  Si, 1  i  w where w is the maximum number of fragments that will be defined on R, and Si = Fi(S), where Fi is the formula according to which the primary horizontal fragment Si is defined.

  17. Derived Horizontal Fragmentation (cont.) • Example: Consider link L1, where • owner(L1) = PAY and • member(L1) = EMP. • Then we can group engineers into • 2 groups according to their salary: • $30.000 and > $30.000. The 2 fragments EMP1 and EMP2 are defined: EMP1 = EMP  PAY1 EMP2 = EMP  PAY2 where PAY1 = SAL30000 (PAY) PAY2 = SAL>30000 (PAY) Derived horizontal fragmentation of EMP

More Related