Cube Tree

Cube Tree. Dimensionality: the number of group-by attributes. Each relation tuple maps to a point in the space. Aggregates are projections of all data points onto all the subspaces. The intersection between a subspace and the orthogonal hyper-plane stores the aggregates. The origin represents the aggregate with no grouping.

Anita

Presentation Transcript


  1. Cube Tree • Dimensionality: the number of group-by attributes • Each relation tuple maps to a point in the space • Aggregates: projections of all data points onto all the subspaces • The intersection between a subspace and the orthogonal hyper-plane stores the aggregates • The origin represents the aggregate with no grouping • A group-by aggregate is queried on the corresponding hyper-plane
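The projection idea on this slide can be illustrated with a small sketch. It assumes a SUM aggregate, positive integer coordinates, and the convention that a coordinate of 0 marks a dimension that has been projected away; the function name `cube_points` and the sample data are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

def cube_points(tuples, n_dims):
    """Project every data point onto every proper subspace and sum the measure.

    Each input tuple is (coords, measure). Coordinates are assumed to be
    positive ints; 0 marks a dimension that has been projected away.
    """
    agg = defaultdict(int)
    for coords, measure in tuples:
        for k in range(n_dims):  # number of dimensions kept: 0 .. n_dims - 1
            for kept in combinations(range(n_dims), k):
                point = tuple(coords[d] if d in kept else 0 for d in range(n_dims))
                agg[point] += measure
    return dict(agg)

# Hypothetical two-attribute relation: ((attr1, attr2), measure)
sales = [((1, 1), 10), ((1, 2), 5), ((2, 1), 7)]
cube = cube_points(sales, 2)
```

In this sketch `cube[(0, 0)]` is the aggregate with no grouping (the origin), while points such as `(1, 0)` and `(0, 2)` lie on the axes and hold the single-attribute group-bys.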

  2. Packed R-Tree • Sort-pack (for multi-dimensional data) • Achieves excellent clustering • Significantly reduces overlap and dead space • A preferred structure for datacube storage • However, this representation of the datacube provides good clustering for only half of the total group-bys • The degradation is due to strong interleaving between the points of these group-bys
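A simplified sketch of the sort-pack step follows. It assumes a plain lexicographic sort and builds only the leaf level; the slides do not say which sort order is used, and `pack_leaves` and `capacity` are illustrative names.

```python
def pack_leaves(points, capacity):
    """Sort the points, then fill each leaf to capacity.

    Returns a list of (mbr, points) pairs, where mbr is (lows, highs),
    the minimum bounding rectangle of the leaf. Lexicographic order is
    an assumption made for this sketch.
    """
    ordered = sorted(points)
    leaves = []
    for i in range(0, len(ordered), capacity):
        chunk = ordered[i:i + capacity]
        n_dims = len(chunk[0])
        lows = tuple(min(p[d] for p in chunk) for d in range(n_dims))
        highs = tuple(max(p[d] for p in chunk) for d in range(n_dims))
        leaves.append(((lows, highs), chunk))
    return leaves

leaves = pack_leaves([(3, 1), (1, 2), (2, 2), (1, 1)], capacity=2)
```

Because every leaf is filled completely before the next one starts, the leaves are fully utilized, and points that are close in the sort order end up in the same bounding rectangle.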

  3. Dataless & Reduced Cubetree • Dataless Cubetree: contains only aggregate values, no data values • Better clustering than a full tree in an R-Tree • Projection points are not interleaved • Reduced Cubetree: each hyper-plane that contains aggregates forms an R-Tree independently • The dimensionality of each R-Tree is reduced by one • Better clustering and query performance

  4. Allocation of group-bys to R-Trees • A set of group-bys is compatible if there exists a sort order that guarantees no dispersion • Allocate each group-by to one of the N R-Trees • such that the set of group-bys for that R-Tree remains compatible • if a group-by cannot find a compatible set • assign it to a set that contains all of its group-by attributes (false allocation) • The sort order chosen for the packed R-Tree is also an important parameter, since it favors some preferred group-bys
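The allocation steps can be sketched as a greedy loop. The compatibility test is passed in as a predicate because the slide defines it only through the existence of a dispersion-free sort order; the `distinct_arity` predicate below is a stand-in chosen for illustration, not the slide's definition, and all names are hypothetical.

```python
def allocate(group_bys, n_trees, compatible):
    """Greedily place each group-by in the first tree that stays compatible.

    `compatible(list_of_group_bys)` is a caller-supplied predicate.
    Returns (trees, false_allocations).
    """
    trees = [[] for _ in range(n_trees)]
    false_allocations = []
    for gb in group_bys:
        for tree in trees:
            if compatible(tree + [gb]):
                tree.append(gb)
                break
        else:
            # Fallback (false allocation): use a tree whose group-bys
            # together contain all attributes of gb.
            for tree in trees:
                covered = set().union(*map(set, tree)) if tree else set()
                if set(gb) <= covered:
                    tree.append(gb)
                    false_allocations.append(gb)
                    break
            else:
                raise ValueError(f"no tree can hold group-by {gb!r}")
    return trees, false_allocations

# Stand-in predicate (an assumption): at most one group-by per arity.
def distinct_arity(gbs):
    return len({len(g) for g in gbs}) == len(gbs)

trees, false_allocs = allocate([("a", "b"), ("a",), ("b",)], 1, distinct_arity)
```

With a single tree, `("b",)` cannot join a compatible set under the stand-in predicate, so it falls back to the tree that already covers attribute `b` and is recorded as a false allocation.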

  5. Bulk Incremental Update

  6. Iceberg Cube • Selectively compute only those partitions that satisfy an aggregate condition • Aggregates with low support carry little meaning and make the cube sparse • Example conditions: • Minimum support of a partition • Required range
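For a single group-by, the minimum-support condition amounts to a filtered count. The sketch below assumes COUNT as the aggregate and rows given as tuples; `iceberg_group_by` is an illustrative name.

```python
from collections import Counter

def iceberg_group_by(rows, dims, minsup):
    """Count rows per group on the given dimension indexes and keep
    only groups whose count reaches minsup."""
    counts = Counter(tuple(row[d] for d in dims) for row in rows)
    return {key: n for key, n in counts.items() if n >= minsup}

rows = [("x", "p"), ("x", "q"), ("y", "p")]
frequent = iceberg_group_by(rows, dims=(0,), minsup=2)
```

Here the group `("y",)` has only one supporting row, so it is left out of the result.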

  7. Bottom-Up Cube • Parent to compute the child

  8. Bottom-Up Cube (2) • Start from a bottom, single-dimension group-by • If the current input can be pruned, return • Partition the data on this group-by • If a partition meets the minimum support (minsup) • recursively call BUC with that partition as input • Loop until all dimensions are done
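The recursion described on this slide can be sketched as follows. This is a minimal in-memory version that assumes COUNT as the aggregate and uses `None` to mean "all values" in a dimension; the function signature is an assumption, not taken from the slides.

```python
from collections import defaultdict

def buc(rows, n_dims, minsup, start=0, cell=None, out=None):
    """Return {cell: count} for every cell whose count reaches minsup.

    A cell is a tuple with one entry per dimension; None means the
    dimension is not grouped on.
    """
    if cell is None:
        cell = (None,) * n_dims
    if out is None:
        out = {}
    if len(rows) < minsup:   # prune: no finer partition can reach minsup
        return out
    out[cell] = len(rows)    # emit the aggregate for the current cell
    for d in range(start, n_dims):
        partitions = defaultdict(list)
        for row in rows:     # partition the input on dimension d
            partitions[row[d]].append(row)
        for value, part in partitions.items():
            if len(part) >= minsup:
                child = cell[:d] + (value,) + cell[d + 1:]
                buc(part, n_dims, minsup, d + 1, child, out)
    return out

rows = [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x")]
result = buc(rows, n_dims=2, minsup=2)
```

A partition that fails the support check is never refined further, which is where the pruning comes from: every finer cell inside it would have even fewer rows.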

  9. Bottom-Up Cube (3) • Similar in idea to Apriori-gen • Apriori generates all the candidates at the same level first (breadth-first) • BUC proceeds depth-first • This reduces the memory requirement • Dimension ordering: a good order provides better pruning • Factors: cardinality, skew and correlation
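As an illustration of dimension ordering, the sketch below ranks dimensions by decreasing cardinality, so that early partitions are small and fail the support check sooner. Using cardinality alone is an assumption made here; the slide also lists skew and correlation, which this sketch ignores.

```python
def order_by_cardinality(rows, n_dims):
    """Return dimension indexes sorted by decreasing number of distinct values."""
    cards = [len({row[d] for row in rows}) for d in range(n_dims)]
    return sorted(range(n_dims), key=lambda d: -cards[d])

rows = [("a", "x", 1), ("a", "y", 2), ("b", "x", 3)]
order = order_by_cardinality(rows, 3)
```

The third dimension has three distinct values against two for the others, so it comes first; ties keep their original relative order because Python's sort is stable.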
