CMS-ToPSS: Efficient Dissemination of RSS Documents

Milenko Petrovic, Haifeng Liu, Hans-Arno Jacobsen. University of Toronto.


Presentation Transcript


  1. CMS-ToPSS: Efficient Dissemination of RSS Documents. Milenko Petrovic, Haifeng Liu, Hans-Arno Jacobsen. University of Toronto. VLDB 2005

  2. Information Dissemination
     • Easy-to-use web publishing tools (blogs, wikis) are fueling the increase in the number of web publishers
     • RSS is frequently used to disseminate updates to interested users
       • CNN.com, Yahoo! News, Amazon.com, MSN search (beta)
     Problem: polling-based architecture
     [Figure: RSS readers, RSS publishers, RSS aggregator]

  3. Solution!
     [Figure: current RSS dissemination architecture compared with the G-ToPSS RSS dissemination architecture]

  4. Interaction Model: Publish/Subscribe
     [Figure labels: Publisher, RSS feeds, Broker, Queries over all RSS, Subscriber, Matching RSS feeds]
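The publish/subscribe interaction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Broker` class, its `subscribe` and `publish` methods, and the dictionary representation of an RSS item are hypothetical names chosen here, not the CMS-ToPSS interface.

```python
from typing import Callable, Dict, List

# Hypothetical types for illustration only (not the CMS-ToPSS API).
Item = Dict[str, str]               # simplified RSS item: field name -> value
Query = Callable[[Item], bool]      # a subscription is a predicate over items


class Broker:
    """Stores subscriber queries and pushes matching items to subscribers."""

    def __init__(self) -> None:
        self._queries: Dict[str, List[Query]] = {}
        self.delivered: Dict[str, List[Item]] = {}

    def subscribe(self, subscriber: str, query: Query) -> None:
        # Register a query; the subscriber never polls the publishers.
        self._queries.setdefault(subscriber, []).append(query)
        self.delivered.setdefault(subscriber, [])

    def publish(self, item: Item) -> None:
        # Deliver the item to every subscriber with at least one matching query.
        for subscriber, queries in self._queries.items():
            if any(query(item) for query in queries):
                self.delivered[subscriber].append(item)


broker = Broker()
broker.subscribe("alice", lambda item: "VLDB" in item.get("title", ""))
broker.publish({"title": "VLDB 2005 program announced"})
broker.publish({"title": "Weather update"})
```

After these calls, only the first item is in `broker.delivered["alice"]`. The point of the design is that matching happens once at the broker when an item is published, instead of each reader polling each publisher.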

  5. Research challenges
     • Need a subscription (query) language suitable for filtering RSS documents
     • Need an efficient matching algorithm based on a graph representation
       • Structural matching
       • Constraint matching
     • Scalability to a large number of subscriptions and a high publishing rate

  6. CMS-ToPSS System Architecture

  7. Subscription Scalability

  8. Memory Scalability

  9. Matching Semantics
     [Figure, two graphs. Publication: node PAPER17 with edges AUTHOR → “Arno Jacobsen”, CONFERENCE → SIGMOD, YEAR → “2001”, LOCATION → “California”. Subscription: node ?y(?y <= Publication) with edges AUTHOR → “Arno Jacobsen”, CONFERENCE → SIGMOD, YEAR → ?z(?z > 2000)]

  10. Data Model (RSS Documents)
     • Publications are represented as directed graphs with node and edge labels
     • Node labels are typed
       • Literal value
       • Class
     • Edge labels are typed
       • Class
     • Classes can be related through an ontology with multiple inheritance
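One way to picture this data model is as a list of labeled edges plus a class hierarchy in which a class may have several parents. The sketch below is an assumption-laden illustration: the class names `Paper`, `Document`, and `Thing`, the `PARENTS` table, and the helper functions are invented here and do not come from the slides.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical ontology: class -> direct parent classes (multiple inheritance).
PARENTS: Dict[str, List[str]] = {
    "Paper": ["Publication", "Document"],
    "Publication": ["Thing"],
    "Document": ["Thing"],
    "Thing": [],
}


def ancestors(cls: str) -> Set[str]:
    """Return cls together with every class reachable through parent links."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def is_a(cls: str, other: str) -> bool:
    """True if cls equals other or inherits from it, directly or indirectly."""
    return other in ancestors(cls)


# A publication as a directed graph: (source node, edge label, target node).
Triple = Tuple[str, str, str]
publication: List[Triple] = [
    ("PAPER17", "AUTHOR", "Arno Jacobsen"),
    ("PAPER17", "CONFERENCE", "SIGMOD"),
    ("PAPER17", "YEAR", "2001"),
]

# Assumed typing of the non-literal node (not stated in the slides).
NODE_CLASS: Dict[str, str] = {"PAPER17": "Paper"}
```

With these definitions, `is_a(NODE_CLASS["PAPER17"], "Publication")` and `is_a("Paper", "Document")` both hold, which is the kind of check a class constraint on a query variable would need.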

  11. Query Language (GQL)
     • Queries are represented as directed graph patterns with node and edge labels
     • Node labels are variables
     • Variables can be constrained by
       • Classes
       • Class instances and literal values
     • Edge labels are class instances
     • Mapping (matching) semantics
       • A pattern graph maps to a data graph if the topology (structure) of the two graphs matches and all variable constraints are satisfied
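The mapping semantics can be illustrated with a naive backtracking matcher. This is not the G-ToPSS matching algorithm, which the slides do not describe; it is a minimal sketch that only shows what "structure matches and all variable constraints are satisfied" means. The function names, the `?`-prefix convention for variables, and the `PUBLICATION_INSTANCES` set standing in for a class constraint are all assumptions. As a shorthand, fixed values appear directly in the pattern instead of as variables constrained to a literal.

```python
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]            # (source node, edge label, target node)
Binding = Dict[str, str]                 # variable name -> matched node label
Constraints = Dict[str, Callable[[str], bool]]


def is_var(term: str) -> bool:
    return term.startswith("?")


def bind(term: str, value: str, binding: Binding, constraints: Constraints) -> bool:
    """Try to map one pattern term onto one data node label."""
    if not is_var(term):
        return term == value                 # fixed label must match exactly
    if term in binding:
        return binding[term] == value        # variable must stay consistent
    check = constraints.get(term)
    if check is not None and not check(value):
        return False                         # variable constraint violated
    binding[term] = value
    return True


def match(pattern: List[Triple], data: List[Triple],
          constraints: Constraints,
          binding: Optional[Binding] = None) -> Optional[Binding]:
    """Return one variable binding that maps the pattern onto the data, or None."""
    current = dict(binding or {})
    if not pattern:
        return current
    (src, edge, dst), rest = pattern[0], pattern[1:]
    for d_src, d_edge, d_dst in data:
        if edge != d_edge:
            continue
        trial = dict(current)
        if bind(src, d_src, trial, constraints) and bind(dst, d_dst, trial, constraints):
            result = match(rest, data, constraints, trial)
            if result is not None:
                return result
    return None


# Example loosely following the Matching Semantics slide.
PUBLICATION_INSTANCES = {"PAPER17"}      # assumed stand-in for ?y <= Publication
data = [
    ("PAPER17", "AUTHOR", "Arno Jacobsen"),
    ("PAPER17", "CONFERENCE", "SIGMOD"),
    ("PAPER17", "YEAR", "2001"),
    ("PAPER17", "LOCATION", "California"),
]
pattern = [
    ("?y", "AUTHOR", "Arno Jacobsen"),
    ("?y", "CONFERENCE", "SIGMOD"),
    ("?y", "YEAR", "?z"),
]
constraints: Constraints = {
    "?y": lambda v: v in PUBLICATION_INSTANCES,
    "?z": lambda v: v.isdigit() and int(v) > 2000,
}
result = match(pattern, data, constraints)
```

Here `result` binds `?y` to `PAPER17` and `?z` to `2001`. The extra LOCATION edge in the data does not prevent a match, because the pattern only has to map into the data graph, not cover it.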

  12. Conclusion and Future Work
     • Proposed a prototype for graph-based metadata filtering
     • G-ToPSS supports a high matching rate for an expressive subscription language
     • Extend G-ToPSS with full RDF language features
     • Optimize constraint processing during matching
