

Subscription Partitioning and Routing in Content-based Publish/Subscribe Networks

Yi-Min Wang, Lili Qiu, Dimitris Achlioptas, Gautam Das, Paul Larson, and Helen J. Wang

Microsoft Research

DISC 2002

Toulouse, France


Motivation

  • Phenomenal growth in Web usage

  • Future trends

    • Switch from polling to notifications

    • Examples: stock quotes, sports scores, weather, news, …

    • Yahoo! Alerts, MSN Mobile, AOL anywhere, InfoSpace, …

    • Complements the traditional polling model on the Web

  • Event Distribution Network (EDN)

    • Distributed and scalable event distribution

      • Parallels the idea of a Content Distribution Network (CDN), applied to event distribution

      • Built on top of a self-configuring overlay network of servers

    • Content-based publish/subscribe systems through in-network processing of aggregated subscription filters


Dispatcher-based Model


Model of Content-based Pub/Sub

  • Content-based filtering/routing

    • Event schema with d attributes, supporting equality and range predicates

    • Event: a point in the d-dimensional space

    • Subscription: a rectangle in that space

    • Match: a rectangle contains the point
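
The matching rule above can be sketched in a few lines of Python. This is only an illustration: the class name `Subscription`, the closed interval bounds, and the two-attribute (price, volume) schema are assumptions, not details taken from the slides.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Subscription:
    # One (low, high) interval per attribute. An equality predicate is the
    # degenerate interval low == high. Closed bounds are an assumption.
    ranges: Tuple[Tuple[float, float], ...]

    def matches(self, event: Sequence[float]) -> bool:
        """Return True if the event point lies inside this rectangle."""
        if len(event) != len(self.ranges):
            raise ValueError("event and subscription must share dimension d")
        return all(lo <= x <= hi for (lo, hi), x in zip(self.ranges, event))


# Hypothetical schema with d = 2 attributes: (price, volume).
sub = Subscription(((10.0, 20.0), (0.0, 1000.0)))
inside = sub.matches((15.0, 500.0))   # point inside the rectangle
outside = sub.matches((25.0, 500.0))  # price falls outside [10, 20]
```

An equality-only subscription is then just a rectangle whose intervals all have zero width, which is why it can be treated as a single point in the event space.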


Subscription Partitioning

  • Basic idea: similarity-based clustering for reducing total event traffic

    • Event Space Partitioning (ESP)

    • Filter Set Partitioning (FSP)


Equality Predicates

  • Hash predicates to get uniform distribution

    • Treat the hashed domain as the event space

  • Use Event Space Partitioning

    • Subscription is a point; does not intersect multiple sub-spaces

  • Use over-partitioning for better load balancing

    • Use offline greedy algorithm to assign buckets to servers for load balancing

    • Use indirection table to dynamically map buckets to servers for load re-balancing

  • Use Bloom filters to further reduce traffic

    • Fast detection of true negatives at the expense of (very low) false-positive rate
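
As a rough sketch of the Bloom filter idea, a node could summarize the values that downstream subscriptions ask for and skip forwarding any event whose value is definitely absent. The class below is a generic textbook Bloom filter, not the presenters' implementation; the bit-array size, the number of hash functions, the SHA-256-based hashing, and the `should_forward` helper are all assumptions made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely not subscribed" (a true negative).
        # True means "possibly subscribed" (may be a false positive).
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


subscribed = BloomFilter()
for symbol in ("MSFT", "IBM", "ORCL"):
    subscribed.add(symbol)


def should_forward(event_symbol: str) -> bool:
    """Forward only events that some subscription might match."""
    return subscribed.might_contain(event_symbol)
```

A filter never reports an added key as absent, so no matching event is dropped; the only cost of a false positive is one unnecessary forwarded event.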


Simulation Results

  • Actual Notification Money log

    • 1.48M subscriptions with 0.29M unique filters over 21,741 stock symbols

    • Zipf-like distribution


Simulation Results (Cont.)

  • Simulate 100M new subscriptions from 43,734 symbols

    • Scaled-up Zipf-like distribution

    • Perturbation and permutation

    • Uniform distribution

  • 50 servers with over-partitioning ratio = 10

  • Without load re-balancing

    • Load imbalance (max/min) ranged from 1.41 to 6.66 (Uniform case)

  • With imbalance threshold of 2.0

    • Re-balancing was triggered only 5 times, each time involving re-assignment of up to 3 buckets and migration of up to 0.7% subscriptions.
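
The over-partitioning scheme behind these numbers (for example, 50 servers with ratio 10 gives 500 buckets) can be sketched as follows. The slides do not spell out the greedy algorithm or the re-balancing policy, so this sketch assumes a largest-bucket-first assignment to the least-loaded server and a one-bucket-at-a-time move from the most-loaded to the least-loaded server. All function names and the toy loads are hypothetical.

```python
import hashlib
import heapq
from typing import List

THRESHOLD = 2.0  # imbalance (max/min) that triggers re-balancing


def bucket_of(key: str, num_buckets: int) -> int:
    """Hash an equality-predicate value (e.g. a stock symbol) to a bucket."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def greedy_assign(bucket_loads: List[int], num_servers: int) -> List[int]:
    """Build the indirection table: table[bucket] = server."""
    table = [0] * len(bucket_loads)
    heap = [(0, s) for s in range(num_servers)]  # (current load, server)
    heapq.heapify(heap)
    # Assumed greedy rule: heaviest bucket first, onto the lightest server.
    for b in sorted(range(len(bucket_loads)), key=lambda i: -bucket_loads[i]):
        load, server = heapq.heappop(heap)
        table[b] = server
        heapq.heappush(heap, (load + bucket_loads[b], server))
    return table


def server_loads(bucket_loads: List[int], table: List[int],
                 num_servers: int) -> List[int]:
    loads = [0] * num_servers
    for b, s in enumerate(table):
        loads[s] += bucket_loads[b]
    return loads


def imbalance(loads: List[int]) -> float:
    lo = min(loads)
    return float("inf") if lo == 0 else max(loads) / lo


def rebalance_once(bucket_loads: List[int], table: List[int],
                   num_servers: int) -> bool:
    """Remap one bucket from the busiest to the idlest server, if useful."""
    loads = server_loads(bucket_loads, table, num_servers)
    src, dst = loads.index(max(loads)), loads.index(min(loads))
    gap = loads[src] - loads[dst]
    movable = [b for b, s in enumerate(table)
               if s == src and 0 < bucket_loads[b] < gap]
    if not movable:
        return False
    best = min(movable, key=lambda b: abs(bucket_loads[b] - gap / 2))
    table[best] = dst  # only the indirection table entry changes
    return True


# Toy example: 8 buckets on 2 servers (over-partitioning ratio 4).
bucket_loads = [9, 7, 6, 5, 4, 3, 2, 1]
table = greedy_assign(bucket_loads, num_servers=2)
before = imbalance(server_loads(bucket_loads, table, 2))

bucket_loads[0] = 30  # one bucket becomes hot after new subscriptions arrive
drifted = imbalance(server_loads(bucket_loads, table, 2))
while (imbalance(server_loads(bucket_loads, table, 2)) > THRESHOLD
       and rebalance_once(bucket_loads, table, 2)):
    pass
after = imbalance(server_loads(bucket_loads, table, 2))
```

Because subscriptions are looked up as `table[bucket_of(symbol, len(table))]`, re-balancing only rewrites table entries and migrates the subscriptions of the moved buckets, which is consistent with the small migration percentages reported above.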


Range Predicates

  • Use Filter Set Partitioning

  • K-means clustering

    • Use center point to represent a rectangle

  • R-tree-based clustering

    • R-tree: dynamic index structure for multi-dimensional data rectangles

    • Offline R-tree algorithm

      • Exhaustively and recursively search for partitions that minimize sum of bounding rectangle volumes

    • Online R-tree algorithm

      • Insert from root down the path that greedily minimizes the increase in bounding rectangle volume

  • Simulation results

    • Offline R-tree > Online R-tree > K-means > Random
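
The greedy rule of the online algorithm (choose the branch whose bounding rectangle grows least) can be illustrated with a deliberately simplified, single-level version: each server holds one bounding rectangle, and a new subscription goes to the server whose bounding volume increases least. This is not a full R-tree (no node splitting or multi-level descent), and the names `Partition`, `assign`, and `servers_for_event` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Rect = Tuple[Tuple[float, float], ...]  # one (low, high) pair per attribute


def volume(r: Rect) -> float:
    v = 1.0
    for lo, hi in r:
        v *= hi - lo
    return v


def enclose(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle containing both a and b."""
    return tuple((min(al, bl), max(ah, bh))
                 for (al, ah), (bl, bh) in zip(a, b))


class Partition:
    """Filters held by one server, summarized by a bounding rectangle."""

    def __init__(self) -> None:
        self.bbox: Optional[Rect] = None
        self.filters: List[Rect] = []

    def enlargement(self, r: Rect) -> float:
        if self.bbox is None:
            return volume(r)  # an empty partition must grow to cover r
        return volume(enclose(self.bbox, r)) - volume(self.bbox)

    def insert(self, r: Rect) -> None:
        self.bbox = r if self.bbox is None else enclose(self.bbox, r)
        self.filters.append(r)


def assign(partitions: List[Partition], r: Rect) -> int:
    """Greedy step: pick the partition with the least volume increase."""
    idx = min(range(len(partitions)),
              key=lambda i: partitions[i].enlargement(r))
    partitions[idx].insert(r)
    return idx


def servers_for_event(partitions: List[Partition],
                      event: Sequence[float]) -> List[int]:
    """An event is sent only to partitions whose bounding box contains it."""
    return [i for i, p in enumerate(partitions)
            if p.bbox is not None
            and all(lo <= x <= hi for (lo, hi), x in zip(p.bbox, event))]


parts = [Partition(), Partition()]
placed = [assign(parts, r) for r in (
    ((0.0, 1.0), (0.0, 1.0)),
    ((10.0, 11.0), (10.0, 11.0)),
    ((0.5, 1.5), (0.0, 1.0)),
)]
targets = servers_for_event(parts, (0.7, 0.5))
```

Keeping bounding rectangles small and tight is what limits event traffic here: the fewer bounding boxes an event point falls into, the fewer servers it has to be forwarded to.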


Related Work

  • Pub/Sub systems

    • Echo, Elvin, Gryphon, Herald, Hierarchical Proxy Architecture, Information Bus, JEDI, Keryx, Ready, Scribe, Siena, …

  • Clustering in pub/sub

    • All previous work focuses on reducing the number of multicast groups [OAA+00, RLW+02, WKM00]


Summary

  • Proposed two subscription partitioning and routing approaches

    • Event Space Partitioning

    • Filter Set Partitioning

  • Evaluated performance via simulations

    • Subscription partitioning reduces network traffic

    • Over-partitioning helps to achieve good load balancing dynamically

    • Bloom filter further reduces event traffic

