Management of Uncertainty in Publish/Subscribe Systems

Haifeng Liu, Department of Computer Science, University of Toronto.


Presentation Transcript


  1. Management of Uncertainty in Publish/Subscribe Systems
     Haifeng Liu
     Department of Computer Science, University of Toronto

  2. Publish/Subscribe Model
     (figure: publishers, here the stock markets TSX, NYSE and NASDAQ, send publications into a broker network; subscribers register subscriptions with the network and receive notifications)
     • Publications: AMGN=58, IBM=84, ORCL=12, JNJ=58, HON=24, INTC=19, MSFT=27
     • Subscriptions: IBM > 85, ORCL < 10, JNJ > 60

  3. Applications Enabled by Publish/Subscribe
     • Selective information dissemination
     • Information filtering on the Internet
     • Location-based services
     • Workflow management
     • Intra-enterprise process automation
     • Logistics and supply chain management
     • Enterprise application integration
     • Network monitoring and (distributed) system management

  4. Types of Uncertainties
     • Lack of information
       • Buy a cheap car
     • Imprecision
       • Sensor data: temperature 15~20 °C
       • Location: location(x, y) → location_t+1(x’, y’)
     • Semantics
       • Synonyms: vehicle vs. automobile
       • Class taxonomy: CD player vs. electronics
       • Different expression: 5 years experience vs. graduated in 2001
     Problem: manage uncertainties, imprecision and semantics in publish/subscribe systems

  5. Agenda
     • Distributed Publish/Subscribe Model and Content-based Routing
     • Uncertainties in Publish/Subscribe
     • Research Challenges
     • Approximate P/S Model
     • Graph-structured Model
     • Current Status
     • Research Plan

  6. Publish/Subscribe Messages
     • Advertisement (ad)
       • Publication pattern used by publishers to announce the set of publications they are going to publish
       • E.g. { (stock, any), (price, any) }
     • Subscription (sub)
       • User interest specification
       • E.g. (stock = “yahoo”) & (price ≤ $35)
     • Publication (pub)
       • Information, data, event
       • E.g. { (stock, “yahoo”), (price, $32.79) }
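
The three message types on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual data model: the tuple and dictionary encodings, the operator table, and the function names `matches` and `conforms` are assumptions made for the example, and prices are written as plain numbers.

```python
import operator

# Comparison operators a predicate may use (an assumed, minimal set).
OPS = {"=": operator.eq, "<=": operator.le, ">=": operator.ge,
       "<": operator.lt, ">": operator.gt}

def matches(subscription, publication):
    """Return True if every predicate of the subscription holds for the publication."""
    for attr, op, value in subscription:
        if attr not in publication:
            return False  # a missing attribute cannot satisfy a predicate
        if not OPS[op](publication[attr], value):
            return False
    return True

def conforms(publication, advertisement):
    """Return True if the publication uses only advertised attributes and values."""
    return all(attr in advertisement and advertisement[attr] in ("any", value)
               for attr, value in publication.items())

# The slide's examples.
advertisement = {"stock": "any", "price": "any"}
subscription = [("stock", "=", "yahoo"), ("price", "<=", 35)]
publication = {"stock": "yahoo", "price": 32.79}

print(conforms(publication, advertisement))  # True
print(matches(subscription, publication))    # True
```

A subscription is treated here as a conjunction of predicates, which is how the slide's `&` reads.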

  7. Content-based Routing: Advertising
     (figure: an Advertisement in the Distributed Overlay Broker Network)
     *Adopted from SIENA, Gryphon, REBECA and Hermes

  8. Content-based Routing: Subscribing
     (figure: a Subscription in the Distributed Overlay Broker Network)
     *Adopted from SIENA, Gryphon, REBECA and Hermes

  9. Content-based Routing: Publishing
     (figure: a Publication in the Distributed Overlay Broker Network)
     *Adopted from SIENA, Gryphon, REBECA and Hermes

  10. Subscription Forwarding I: Covering optimization
      • S1: (car = Honda) & (price ≤ $30K)
      • S2: (car = Honda) & (price ≤ $25K)
      • S1 covers S2
      • P: { (car, Honda), (price, $20K) }
      (figure: S1 and S2 in the Distributed Overlay Broker Network)
      *Adopted from SIENA, Gryphon, REBECA and Hermes
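
The covering relation in this example can be checked mechanically: S1 covers S2 when every publication that matches S2 also matches S1. The sketch below handles only the two operators that appear on the slide (`=` and `≤`, written `<=`). The dictionary encoding and the names `implies` and `covers` are assumptions for illustration.

```python
def implies(p2, p1):
    """True if any value satisfying predicate p2 also satisfies p1 (same attribute)."""
    op2, v2 = p2
    op1, v1 = p1
    if op1 == "=":
        return op2 == "=" and v2 == v1
    if op1 == "<=":
        return op2 in ("=", "<=") and v2 <= v1
    raise ValueError(f"unsupported operator: {op1}")

def covers(s1, s2):
    """True if s1 covers s2: each constraint of s1 is implied by s2's constraint."""
    return all(attr in s2 and implies(s2[attr], pred)
               for attr, pred in s1.items())

# The slide's subscriptions, with prices as plain numbers.
s1 = {"car": ("=", "Honda"), "price": ("<=", 30_000)}
s2 = {"car": ("=", "Honda"), "price": ("<=", 25_000)}

print(covers(s1, s2))  # True
print(covers(s2, s1))  # False
```

When S1 has already been forwarded, a broker that finds `covers(s1, s2)` to be true does not need to forward S2 upstream, which is the point of the optimization.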

  11. Subscription Forwarding II: Merging optimization
      • S1: (car = Honda) & (price ≤ $30K)
      • S2: (car = Toyota) & (price ≤ $25K)
      • S’: (car = any) & (price ≤ $30K)
      • P: { (car, Honda), (price, $20K) }
      (figure: S1 and S2 merged into S’ in the Distributed Overlay Broker Network)
      *Adopted from SIENA, Gryphon, REBECA and Hermes
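
One way to arrive at the merged subscription S’ is to relax each attribute just enough to admit both inputs. The sketch below is an assumed, simplified merging rule: identical predicates are kept, two `<=` bounds take the larger bound, and anything else becomes `any`. The names `ANY`, `merge_predicates` and `merge` are hypothetical.

```python
ANY = ("any", None)  # assumed encoding of an unconstrained attribute

def merge_predicates(p1, p2):
    """Return a predicate satisfied by every value that satisfies p1 or p2."""
    if p1 == p2:
        return p1
    (op1, v1), (op2, v2) = p1, p2
    if op1 == op2 == "<=":
        return ("<=", max(v1, v2))  # the looser upper bound admits both
    return ANY                      # fall back to no constraint

def merge(s1, s2):
    """Merge two subscriptions; attributes missing from either are left unconstrained."""
    return {attr: merge_predicates(s1[attr], s2[attr])
            for attr in s1.keys() & s2.keys()}

# The slide's subscriptions, with prices as plain numbers.
s1 = {"car": ("=", "Honda"), "price": ("<=", 30_000)}
s2 = {"car": ("=", "Toyota"), "price": ("<=", 25_000)}

merged = merge(s1, s2)
print(merged == {"car": ANY, "price": ("<=", 30_000)})  # True
```

The merger is imperfect: S’ also matches publications that neither S1 nor S2 matches, such as a Toyota at $28K, so the brokers downstream still have to match against the original subscriptions.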

  12. Publish/Subscribe Router
      • Forwarding of advertisements
        • Via flooding
      • Forwarding of subscriptions
        • Forward along reverse ad path
        • Matching of ad and sub (intersecting)
        • Optimizations
          • Covering/merging of subs
      • Forwarding of publications
        • Forward along reverse sub path
        • Matching of sub and pub
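
The three forwarding rules can be sketched as a toy broker. This is an illustration under simplifying assumptions, not the router described in the talk: the overlay is acyclic, an ad is a set of attribute names, a sub is a dictionary of equality constraints, and covering and merging are left out. The class `Broker` and the helper `link` are hypothetical.

```python
class Broker:
    def __init__(self, name):
        self.name = name
        self.neighbors = []   # adjacent brokers (overlay assumed acyclic)
        self.ads = []         # (advertised attributes, hop the ad came from)
        self.subs = []        # (subscription, hop the subscription came from)
        self.delivered = []   # (client, publication) pairs for local clients

    def advertise(self, ad, source):
        """Record the ad and flood it to every neighbor except the sender."""
        self.ads.append((ad, source))
        for n in self.neighbors:
            if n != source:
                n.advertise(ad, self)

    def subscribe(self, sub, source):
        """Record the sub and forward it along the reverse path of intersecting ads."""
        self.subs.append((sub, source))
        hops = {src for ad, src in self.ads
                if isinstance(src, Broker) and src != source and set(sub) <= ad}
        for hop in hops:
            hop.subscribe(sub, self)

    def publish(self, pub, source):
        """Forward the pub along the reverse path of matching subs."""
        hops = {src for sub, src in self.subs
                if src != source and all(pub.get(k) == v for k, v in sub.items())}
        for hop in hops:
            if isinstance(hop, Broker):
                hop.publish(pub, self)
            else:
                self.delivered.append((hop, pub))  # hop is a local client name

def link(a, b):
    """Connect two brokers in both directions."""
    a.neighbors.append(b)
    b.neighbors.append(a)

b1, b2, b3, b4 = (Broker(n) for n in ("B1", "B2", "B3", "B4"))
link(b1, b2); link(b2, b3); link(b2, b4)

b1.advertise(frozenset({"stock", "price"}), "publisher")
b3.subscribe({"stock": "yahoo"}, "subscriber")
b1.publish({"stock": "yahoo", "price": 32.79}, "publisher")
b1.publish({"stock": "ibm", "price": 84}, "publisher")

print(b3.delivered)  # only the yahoo publication reaches the subscriber
print(b4.subs)       # [] because B4 is not on the reverse ad path
```

In this run the ad reaches all four brokers, but the subscription travels only B3, B2, B1, so B4 never sees a subscription or a publication.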

  13. Uncertainties in Distributed Publish/Subscribe System
      • Messages (representation: modeling)
        • Uncertain subscription
        • Uncertain publication
      • Relations (computation: matching, covering, merging)
        • Between sub and pub
        • Between sub and sub
      • Result (aggregation: ranking)
        • Return top K matches

  14. Research Challenges
      • Develop a publish/subscribe model to express uncertainties/semantics in publications and subscriptions
      • Model approximate matching and semantic matching
      • Model approximate covering/merging and semantic covering/merging
      • Scalability to a large number of subscribers and a high publishing rate

  15. Approximate Matching Model
      • Model
        • Sub: fuzzy set
        • Pub: possibility distribution
      • Matching
        • Possibility measure
        • Necessity measure
      • Ranking
        • “min” or “product” for conjunction
        • “max” or “plus” for disjunction
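
The two matching measures can be illustrated with the standard definitions from possibility theory, evaluated over a discretized domain: possibility is the maximum over x of min(μ(x), π(x)), and necessity is the minimum over x of max(μ(x), 1 − π(x)), where μ is the subscription's membership function and π is the publication's possibility distribution. The slide does not spell these formulas out, so treat them as an assumption. The fuzzy set "warm" and its shape are invented for the example, and only "min", "product" and "max" are shown because the slide does not define "plus".

```python
import math

def possibility(mu, pi, domain):
    """Degree to which the publication possibly satisfies the subscription."""
    return max(min(mu(x), pi(x)) for x in domain)

def necessity(mu, pi, domain):
    """Degree to which the publication necessarily satisfies the subscription."""
    return min(max(mu(x), 1.0 - pi(x)) for x in domain)

def warm(t):
    """Hypothetical fuzzy set: 0 below 10 °C, rising linearly to 1 at 20 °C."""
    return min(1.0, max(0.0, (t - 10) / 10))

def reading(t):
    """Imprecise sensor publication: temperature somewhere in 15~20 °C."""
    return 1.0 if 15 <= t <= 20 else 0.0

domain = range(0, 31)  # whole degrees from 0 to 30 °C
print(possibility(warm, reading, domain))  # 1.0
print(necessity(warm, reading, domain))    # 0.5

# Ranking: combine per-predicate degrees of one subscription.
degrees = [0.8, 0.5]
conj_min = min(degrees)         # "min" for conjunction
conj_prod = math.prod(degrees)  # "product" for conjunction
disj_max = max(degrees)         # "max" for disjunction
```

The reading possibly matches "warm" in full, since 20 °C is fully warm, but it matches necessarily only to degree 0.5, since 15 °C is half warm.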

  16. Graph-structured Model
      (figure: an example data graph for PAPER17 with AUTHOR, CONFERENCE, YEAR and LOCATION edges and the values “Arno Jacobsen”, SIGMOD, “2001” and “California”, together with the labels Publication, Academic Publication, Proceedings, Report, WWW, VLDB and Jacobsen’s Publications)
      • Model
        • Pub: directed graph
        • Sub: directed graph pattern
        • Semantic: ontology
      • Matching
        • Pattern graph maps to data graph if the topology (structure) of the two graphs matches and all variable constraints (literal and ontology) are satisfied
      • Ranking
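
The matching rule can be sketched with edges stored as (source, label, target) triples and pattern terms that start with `?` acting as variables. Everything below is an assumption for illustration: the exact edges of the PAPER17 graph and the class taxonomy are a guess at what the figure shows, and the functions `is_a`, `bind` and `match` are hypothetical, not the talk's algorithm.

```python
def is_a(cls, ancestor, taxonomy):
    """True if cls equals ancestor or descends from it in the taxonomy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = taxonomy.get(cls)
    return False

def bind(term, value, binding):
    """Unify one pattern term with one data value, extending the binding."""
    if term.startswith("?"):
        return binding.setdefault(term, value) == value
    return term == value

def match(pattern, graph, constraints, binding=None):
    """Yield bindings that map every pattern edge onto an edge of the data graph."""
    binding = binding or {}
    if not pattern:
        # Topology matched; now check the literal and ontology constraints.
        if all(check(binding[var]) for var, check in constraints.items()):
            yield dict(binding)
        return
    (s, p, o), rest = pattern[0], pattern[1:]
    for ds, dp, do in graph:
        if dp != p:
            continue
        trial = dict(binding)
        if bind(s, ds, trial) and bind(o, do, trial):
            yield from match(rest, graph, constraints, trial)

# Assumed reading of the figure: the publication as a directed graph.
graph = [("PAPER17", "AUTHOR", "Arno Jacobsen"),
         ("PAPER17", "CONFERENCE", "SIGMOD"),
         ("PAPER17", "YEAR", "2001"),
         ("PAPER17", "LOCATION", "California")]

# Assumed taxonomy (child -> parent), also a guess at the figure.
taxonomy = {"SIGMOD": "Proceedings", "VLDB": "Proceedings", "WWW": "Proceedings",
            "Proceedings": "Academic Publication", "Report": "Academic Publication",
            "Academic Publication": "Publication"}

# Subscription: papers by Arno Jacobsen that appeared in some Proceedings.
pattern = [("?paper", "AUTHOR", "?a"), ("?paper", "CONFERENCE", "?c")]
constraints = {"?a": lambda v: v == "Arno Jacobsen",                # literal constraint
               "?c": lambda v: is_a(v, "Proceedings", taxonomy)}    # ontology constraint

results = list(match(pattern, graph, constraints))
print(results)
```

The shared variable `?paper` is what enforces the structural part of the match: both pattern edges must leave the same node of the data graph.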

  17. Current Status
      • Work to date
        • Develop an approximate p/s model to express uncertainties and an efficient algorithm to do approximate matching
        • Develop covering and merging optimizations for approximate content-based routing
        • Develop a graph-based p/s architecture applied to the dissemination of RDF metadata (including RSS)
        • Develop two novel algorithms (covering and merging) for the creation of a distributed content-based routing network for graph-structured data

  18. Comments from Previous Meeting
      • Probability model
      • Qualitative similarity measure
      • Validate our results
      • Real data set
      • Interactive evaluation

  19. Research Plan I
      • Membership Function Mining
        • Get a real data set
        • “Learn” the membership function
          • Clustering: K-means, DBSCAN
          • Regression: neural network
      • Semantic Matching and Routing Computation
        • Matching on ontology
        • Covering on ontology
        • Merging on ontology

  20. Research Plan II
      • Design an experiment to validate the mining results
      • Design a method to combine the possibility measure and the necessity measure for ranking
      • Push thresholds down the matching plan to increase the efficiency of the matching algorithm
      • Use probabilities as an alternative to model uncertainties and imprecision
