
Aggregate Queries in Peer-to-Peer OLAP

Aggregate Queries in Peer-to-Peer OLAP. Mauricio Minuto Espil, Faculty of Engineering, Universidad Católica Argentina; Alejandro A. Vaisman, Computer Science Department, Universidad de Buenos Aires. 7th International Workshop on Data Warehousing & OLAP.


Presentation Transcript


  1. Aggregate Queries in Peer-to-Peer OLAP • Mauricio Minuto Espil • Faculty of Engineering • Universidad Católica Argentina • Alejandro A. Vaisman • Computer Science Department • Universidad de Buenos Aires 7th International Workshop on Data Warehousing & OLAP

  2. Aggregate Queries in Peer-to-Peer OLAP OUTLINE: • CHARACTERIZATION • PROBLEM AND PROPOSAL • FACT INTEGRATION • DIMENSION INTEGRATION • AGGREGATE QUERIES • CONCLUSIONS

  3. Peer-to-Peer Systems MAIN CHARACTERISTICS: • Involves a network of interconnected peer systems; • The network topology is not relevant; • Each peer maintains full autonomy over its own data resources; • Each peer may assume the role of local. The rest become acquaintances of the local peer; • The roles of local and acquaintance among peers are not static; they are functional and are determined with respect to an operation.

  4. Peer-to-Peer Data Management MAIN CHARACTERISTICS: • No global schema is assumed to exist for data; • Each peer must manage its data according to its own perspective; • A query may be posed on any peer; the responding peer becomes local with respect to the query; • Answers to queries must reflect the best attempt to gather data from all peers; • Answers to queries posed by local peer users must conform to the view those users have of their data; • Peers must cooperate in maintaining the local views of data.

  5. Aggregate Queries in Peer-to-Peer OLAP OUTLINE: • CHARACTERIZATION • PROBLEM AND PROPOSAL • FACT INTEGRATION • DIMENSION INTEGRATION • AGGREGATE QUERIES • CONCLUSIONS

  6. OLAP Data in a Peer-to-Peer System THE PROBLEM: • OLAP data is essentially multidimensional; • Multidimensional data consists of a collection of views of base and derived aggregated data, describing fact indicators by dimensions of analysis; • Concepts for aggregation within dimensions are obtained from finer-grained concepts through hierarchies; • Different peers may have related fact indicators described by different dimension hierarchies; • Integration is needed: any summary concept that appears in a hierarchy of an acquaintance peer must be transformed into a summary concept meaningful to the local peer.

  7. OLAP Data in a Peer-to-Peer System THE PROBLEM (CONTINUED): • The expected integration is not always possible; • Users may pose OLAP queries in a local peer expecting results involving all relevant data stored in all peers; • Local queries must be propagated among the acquaintances; • A rewriting of the propagated queries is needed to conform to the view of the local user; • The rewriting technique must accomplish the data integration on the fly; • Incomplete and uncertain results must be admitted.

  8. Peer-to-Peer OLAP MODEL (DEFINES): • FACT PEERS • DIMENSION PEERS • AGGREGATE P2P OLAP QUERIES • COMPLETE AND CERTAIN QUERY ANSWERS ARCHITECTURE (INVOLVES): • AUTONOMOUS PEER DATA MANAGEMENT • THREE PHASE PEER TO PEER COORDINATION • COOPERATIVE QUERY ANSWERING

  9. Aggregate Queries in Peer-to-Peer OLAP OUTLINE: • CHARACTERIZATION • PROBLEM AND PROPOSAL • FACT INTEGRATION • DIMENSION INTEGRATION • AGGREGATE QUERIES • CONCLUSIONS

  10. Fact Integration TYPES OF FACT: • GENERIC FACT • FACT PEERS IS-A RELATIONSHIP FACT CONCILIATION PHASE: PUBLISHES GENERIC FACT DEFINITION AND DIMENSIONAL STRUCTURE SOURCE PEER LISTENING PEER GENERIC FACT AGREEMENT AND DIMENSION PEERS DEFINITION

  11. Aggregate Queries in Peer-to-Peer OLAP OUTLINE: • CHARACTERIZATION • PROBLEM AND PROPOSAL • FACT INTEGRATION • DIMENSION INTEGRATION • AGGREGATE QUERIES • CONCLUSIONS

  12. Dimension Integration INVOLVES: • A PAIR OF DIMENSION PEERS. CONSISTS OF: • LEVEL HIERARCHY INTEGRATION • MEMBER HIERARCHY INTEGRATION. COMPRISES: • CORRESPONDENCE DEFINITION AMONG DIMENSION LEVELS • REVISION/MAPPING DEFINITION AMONG DIMENSION INSTANCES

  13. Level Hierarchy Integration LEVEL CORRESPONDENCE • APPLIES ON SCHEMAS • ESTABLISHES HOW A PAIR OF LEVELS ON DIFFERENT PEER DIMENSIONS ARE RELATED • IS PRODUCED/UPDATED DURING A SCHEMA CONCILIATION PHASE • IS MATERIALIZED AS METADATA IN CORRESPONDENCE TABLES

  14. Level Hierarchy Integration ORDER PRESERVING LEVEL CORRESPONDENCE [Diagram: two level hierarchies joined by correspondence links. Level labels: All, All, Tax Discharge Category, Benefit Type, Benefit Type, Charity Modality, Funding Class, Loan Type]

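  15. Level Hierarchy Integration A LEVEL CORRESPONDENCE THAT DOES NOT PRESERVE ORDER IS NOT ADMISSIBLE [Diagram: the same two level hierarchies with crossing correspondence links, marked WRONG. Level labels: All, All, Tax Discharge Category, Benefit Type, Benefit Type, Charity Modality, Funding Class, Loan Types]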
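The slides do not give a formal definition of order preservation, so the following is only a minimal sketch under one plausible reading: whenever level a rolls up (directly or transitively) to level b in one hierarchy and both have corresponding levels, the image of a must equal or roll up to the image of b in the other hierarchy. The level names, the dictionary layout, and the helpers `ancestors` and `preserves_order` are hypothetical and do not come from the presentation.

```python
def ancestors(rollup, level):
    """Return every level reachable from `level` by following roll-up edges."""
    seen, stack = set(), [level]
    while stack:
        for parent in rollup.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def preserves_order(rollup_a, rollup_b, correspondence):
    """True when a < b in hierarchy A implies corr(a) <= corr(b) in hierarchy B."""
    for a, image_a in correspondence.items():
        for b in ancestors(rollup_a, a):
            if b not in correspondence:
                continue  # b has no corresponding level, nothing to check
            image_b = correspondence[b]
            if image_b != image_a and image_b not in ancestors(rollup_b, image_a):
                return False
    return True


# Hypothetical level hierarchies (child level -> parent levels).
rollup_a = {"Item": ["Type"], "Type": ["All"]}
rollup_b = {"Product": ["Class"], "Class": ["All"]}

good = {"Item": "Product", "Type": "Class", "All": "All"}
bad = {"Item": "Class", "Type": "Product", "All": "All"}  # links cross

print(preserves_order(rollup_a, rollup_b, good))
print(preserves_order(rollup_a, rollup_b, bad))
```

A check of this kind could run during the schema conciliation phase, before a correspondence is written to the correspondence tables.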

  16. Member Hierarchy Integration INTEGRATION BY MAPPING • APPLIES ON INSTANCES • ESTABLISHES HOW A PAIR OF MEMBERS OF CORRESPONDING LEVELS ARE RELATED • IS PRODUCED/UPDATED DURING A MAPPING ACQUISITION PHASE • MUST BE PRECEDED BY AT LEAST ONE SCHEMA CONCILIATION PHASE • IS MATERIALIZED AS METADATA IN MAPPING TABLES

  17. Member Hierarchy Integration MAPPINGS: HOMOMORPHISM PROPERTY [Diagram: members l:m and l':m' linked by roll-up edges in each peer and by map edges across peers] For each member m of a level l such that map(l:m) is defined, if there exists some member m' of level l' satisfying roll-up(l:m) = l':m', and level l' is in dom(Correspondence), then roll-up(map(l:m)) = map(l':m').
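The homomorphism property can be checked mechanically once roll-ups and mappings are available as tables. The sketch below assumes a simplified encoding that is not in the slides: members are strings of the form "level:member", each member has at most one parent, and `member_map` sends members of one peer to members of the other. The function name `violations` and all sample members are hypothetical.

```python
def violations(src_rollup, dst_rollup, member_map, corresponding_levels):
    """List mapped members m for which roll-up(map(m)) != map(roll-up(m))."""
    bad = []
    for member, mapped in member_map.items():
        parent = src_rollup.get(member)
        if parent is None:
            continue  # top of the hierarchy, nothing to verify
        parent_level = parent.split(":", 1)[0]
        if parent_level not in corresponding_levels:
            continue  # the property only applies to levels in dom(Correspondence)
        expected = member_map.get(parent)
        if expected is None or dst_rollup.get(mapped) != expected:
            bad.append(member)
    return bad


# Hypothetical data: two cities, two regions, and their counterparts in another peer.
src_rollup = {"city:c1": "region:r1", "city:c2": "region:r2"}
dst_rollup = {"town:t1": "zone:z1", "town:t2": "zone:z2"}
member_map = {
    "city:c1": "town:t1",
    "city:c2": "town:t2",
    "region:r1": "zone:z1",
    "region:r2": "zone:z3",  # disagrees with dst_rollup, so c2 violates the property
}

print(violations(src_rollup, dst_rollup, member_map, {"region"}))
```

Treating an unmapped parent as a violation is a design choice of this sketch; a peer could equally record such members for later completion instead of rejecting the mapping.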

  18. Member Hierarchy Integration HOMOMORPHISM MAY NOT ALWAYS BE GUARANTEED [Diagram: members l:m1 and l:m2 roll up to l':m' in one peer, while their mapped images roll up to different members in the other peer] Member m' in level l' is conflicting, so it cannot be mapped. An approach based exclusively on mapping is not always effective.

  19. Member Hierarchy Integration MAPPINGS DO NOT SUFFICE: REVISIONS MAY BE NECESSARY [Diagram labels: Conflicting Member l':m'; LOCAL; l:m2, l:m1; ACQUAINTANCE] REVISIONS AFFECT ONLY THE VIEW A PEER HAS OF THE HIERARCHY OF ITS ACQUAINTANCE

  20. Member Hierarchy Integration EXAMPLE OF A REVISION: CONFLICTING MEMBER SPLIT [Diagram labels: l':m2', l:m1' (Non-Conflicting Members); LOCAL; l:m2, l:m1; ACQUAINTANCE] A REVISION BY SPLITTING MAY BE USED TO REPAIR CONFLICTS, GIVING WAY TO MAPPABLE MEMBERS
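A revision by splitting can be pictured as partitioning the children of the conflicting member according to where their mapped images roll up in the other peer. The sketch below is an illustration under assumptions, not the authors' procedure: it assumes every child of the conflicting member is mapped, that the mapping runs from the acquaintance to the local peer, and that split parts are named by appending "#1", "#2", and so on. The function `split_conflicting` and all member names are hypothetical.

```python
from collections import defaultdict


def split_conflicting(acq_rollup, local_rollup, member_map, conflicting):
    """Split `conflicting` into one part per local parent of its children's images.

    Returns the revised roll-up (a copy) and the mapping for the new parts.
    """
    groups = defaultdict(list)
    for child, parent in acq_rollup.items():
        if parent == conflicting:
            local_parent = local_rollup[member_map[child]]
            groups[local_parent].append(child)

    revised = dict(acq_rollup)  # the stored hierarchy itself is left untouched
    part_map = {}
    for i, (local_parent, children) in enumerate(sorted(groups.items()), start=1):
        part = f"{conflicting}#{i}"
        for child in children:
            revised[child] = part
        part_map[part] = local_parent
    return revised, part_map


# Hypothetical data: m1 and m2 share a parent here, but their images do not.
acq_rollup = {"l:m1": "l2:m", "l:m2": "l2:m"}
local_rollup = {"k:a": "k2:x", "k:b": "k2:y"}
member_map = {"l:m1": "k:a", "l:m2": "k:b"}

revised, part_map = split_conflicting(acq_rollup, local_rollup, member_map, "l2:m")
print(revised)
print(part_map)
```

Because the function returns a copy, the revision changes only the revised view of the hierarchy used for a given peer pair, which is in line with the remark that revisions affect only the view a peer has of its acquaintance's hierarchy.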

  21. Member Hierarchy Integration EXAMPLE OF A REVISION: CONFLICTING MEMBER RECLASSIFICATION [Diagram labels: l':m", l:m' (Non-Conflicting Members); LOCAL; l:m3, l:m2, l:m1; ACQUAINTANCE] A REVISION BY RECLASSIFYING MAY BE AN ALTERNATIVE TO RESTORE HOMOMORPHISM

  22. Member Hierarchy Integration REVISE AND MAP APPROACH: LOCAL PEER: • PRODUCES AND BROADCASTS REVISION AND MAPPING DEFINITIONS TO POTENTIAL ACQUAINTANCES ACQUAINTANCE: • REVISES ITS OWN HIERARCHIES, PRODUCING A REVISED INSTANCE (REVISED ROLL-UPS) WITH RESPECT TO THE LOCAL PEER • STORES INFORMATION ON MAPPINGS IN METADATA MAPPING TABLES

  23. Member Hierarchy Integration BOTTOM-UP COMPLETION APPROACH [Diagram labels: l':m2' (Non-Mapped Member), l':m1'; roll-up edges, one marked Incomplete roll-up; l:m2 and l:m1 with map edges] Whenever some member m2' of a level l' is not mapped, a bottom-up completion approach for query answering is employed: information on non-mapped members and their roll-ups is stored in metadata completion tables.
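The slide does not describe the layout of a completion table, so the following is only a guess at its simplest possible form: one row per roll-up edge that ends in a non-mapped member of a given level. The "level:member" string encoding, the function `completion_table`, and the sample members are all hypothetical.

```python
def completion_table(rollup, member_map, level):
    """Rows (child, parent) for every roll-up into a non-mapped member of `level`."""
    prefix = level + ":"
    return [
        (child, parent)
        for child, parent in rollup.items()
        if parent.startswith(prefix) and parent not in member_map
    ]


# Hypothetical data: region r2 has no mapping, so its incoming roll-up is recorded.
rollup = {"city:c1": "region:r1", "city:c2": "region:r2"}
member_map = {"city:c1": "town:t1", "city:c2": "town:t2", "region:r1": "zone:z1"}

print(completion_table(rollup, member_map, "region"))
```

Rows like these tell the query rewriter which groups cannot be translated into the local view and must be reported as incomplete or uncertain.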

  24. Aggregate Queries in Peer-to-Peer OLAP OUTLINE: • CHARACTERIZATION • PROBLEM AND PROPOSAL • FACT INTEGRATION • DIMENSION INTEGRATION • AGGREGATE QUERIES • CONCLUSIONS

  25. P2P OLAP Queries Syntactical Structure (Datalog Style): query(Z1, ..., Zn, aggr(M), Set of Peers) ← Generic Fact(X1, ..., Xn, M), rollup dimension d1 from bottom level to desired level l1 (X1, Z1), ..., rollup dimension dn from bottom level to desired level ln (Xn, Zn);
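Read as a Datalog-style rule, the query joins each fact with one roll-up per dimension and aggregates the measure per combination of target-level members. A minimal single-peer sketch of that evaluation follows; it ignores the Set of Peers argument, represents each roll-up as a dictionary from bottom-level member to target-level member, and uses invented sample data. The function `evaluate` is hypothetical.

```python
from collections import defaultdict


def evaluate(facts, rollups, aggr=sum):
    """Group measures by rolled-up coordinates and aggregate each group.

    facts:   iterable of tuples (x1, ..., xn, m)
    rollups: one dict per dimension, bottom-level member -> target-level member
    """
    groups = defaultdict(list)
    for *coords, measure in facts:
        key = tuple(rollup[x] for rollup, x in zip(rollups, coords))
        groups[key].append(measure)
    return {key: aggr(values) for key, values in groups.items()}


# Hypothetical fact table with two dimensions and one measure.
facts = [("p1", "d1", 10), ("p2", "d1", 5), ("p3", "d2", 7)]
rollups = [
    {"p1": "A", "p2": "A", "p3": "B"},  # dimension d1: bottom level -> level l1
    {"d1": "2004", "d2": "2004"},       # dimension d2: bottom level -> level l2
]

print(evaluate(facts, rollups))
print(evaluate(facts, rollups, aggr=max))
```

Passing the aggregate as a function mirrors the aggr(M) term in the rule head; any function over a list of measures can be substituted.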

  26. Query Evaluation Process • GENERATES A QUERY FOR EACH RELEVANT PEER (INCLUDING THE LOCAL PEER); • GENERATED QUERIES ARE PROPAGATED TO RELEVANT PEERS; • QUERIES FOR RELEVANT PEERS STEM FROM THE REWRITING OF THE SUBMITTED P2P OLAP QUERY; • THE REWRITING PROCESS INTRODUCES REFERENCES TO FACT PEERS, REVISED ROLL-UPS, AND MAPPING AND COMPLETION TABLES; • RESULTS OF PROPAGATED QUERIES ARE COLLECTED AND AGGREGATED LOCALLY TO PRODUCE THE FINAL QUERY ANSWER; • QUERY ANSWERS MAY BE UNCERTAIN AND INCOMPLETE DUE TO BOTTOM-UP COMPLETION.
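The steps above can be sketched as an in-process simulation: each peer evaluates its rewritten query, translating its groups into the local view through a mapping table, and the local peer merges the partial results. This is a simplified illustration, not the authors' algorithm. It assumes a single dimension, a SUM aggregate (which can be combined from partial sums), and no real networking; the functions `peer_partial` and `answer` and all data are hypothetical.

```python
from collections import defaultdict


def peer_partial(facts, rollup, mapping):
    """Partial SUM per local-view member; non-mapped groups are kept apart."""
    mapped, unmapped = defaultdict(int), defaultdict(int)
    for member, measure in facts:
        group = rollup[member]
        if group in mapping:
            mapped[mapping[group]] += measure
        else:
            unmapped[group] += measure  # cannot be expressed in the local view
    return dict(mapped), dict(unmapped)


def answer(peers):
    """Merge partial results from all peers and report what stayed unmapped."""
    total, leftovers = defaultdict(int), []
    for name, (facts, rollup, mapping) in peers.items():
        mapped, unmapped = peer_partial(facts, rollup, mapping)
        for key, value in mapped.items():
            total[key] += value
        if unmapped:
            leftovers.append((name, unmapped))
    return dict(total), leftovers


# Hypothetical peers: the local peer maps its groups to themselves.
peers = {
    "local": ([("a", 10), ("b", 5)], {"a": "X", "b": "Y"}, {"X": "X", "Y": "Y"}),
    "acquaintance": ([("u", 3), ("v", 4)], {"u": "P", "v": "Q"}, {"P": "X"}),
}

result, incomplete = answer(peers)
print(result)
print(incomplete)
```

The second return value is what makes an answer incomplete in this sketch: group Q of the acquaintance contributes nothing to the totals because it has no counterpart in the local view.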

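  27. Query Processing [Diagram: the QUERY enters the Local Peer, where Rewriting uses Metadata (Mapping tables, Completion tables, Revised Rollups); each Relevant Peer performs Evaluation over its Fact tables and returns a Partial Result; Integration at the Local Peer produces the Answer]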

  28. Aggregate Queries in Peer-to-Peer OLAP CONCLUSIONS: MAIN POINTS DISCUSSED • GENERIC FACTS • FACT CONCILIATION PHASE • HIERARCHY LEVEL CORRESPONDENCE • SCHEMA CONCILIATION PHASE • REVISE AND MAP APPROACH • BOTTOM-UP COMPLETION • MAPPING ACQUISITION PHASE • P2P OLAP QUERIES • QUERY REWRITING AND EVALUATION
