Subscription Partitioning and Routing in Content-based Publish/Subscribe Networks



Yi-Min Wang, Lili Qiu, Dimitris Achlioptas, Gautam Das, Paul Larson, and Helen J. Wang

Microsoft Research

DISC 2002

Toulouse, France

Motivation
  • Phenomenal growth in Web usage
  • Future trends
    • Switch from polling to notifications
    • Example: stock quotes, sports scores, weather, news, …
    • Yahoo! Alerts, MSN Mobile, AOL Anywhere, InfoSpace, …
    • Complements the traditional polling model on the Web
  • Event Distribution Network (EDN)
    • Distributed and scalable event distribution
      • Parallels the idea of a Content Distribution Network (CDN), applied to event distribution
      • Built on top of a self-configuring overlay network of servers
    • Content-based publish/subscribe systems through in-network processing of aggregated subscription filters
Model of Content-based Pub/Sub
  • Content-based filtering/routing
    • Event schema with d attributes, supporting equality and range predicates
    • Event: a point in the d-dimensional space
    • Subscription: a rectangle in that space
    • Match: the subscription's rectangle contains the event's point
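
The matching model above can be illustrated with a minimal sketch. This is not code from the talk: the `Subscription` class, the `equality` helper, and the two-attribute (symbol id, price) schema are hypothetical names chosen for illustration, and closed intervals are assumed.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # closed range [low, high] on one attribute


@dataclass(frozen=True)
class Subscription:
    """A rectangle: one closed interval per attribute of the event schema."""
    ranges: Tuple[Interval, ...]

    def matches(self, event: Sequence[float]) -> bool:
        """True if the event point lies inside the rectangle on every dimension."""
        if len(event) != len(self.ranges):
            raise ValueError("event and subscription must have the same dimension d")
        return all(low <= x <= high for x, (low, high) in zip(event, self.ranges))


def equality(value: float) -> Interval:
    """An equality predicate is a degenerate range [value, value]."""
    return (value, value)


# Hypothetical schema with d = 2 attributes: (symbol_id, price).
sub = Subscription(ranges=(equality(7.0), (100.0, 150.0)))
hit = sub.matches((7.0, 120.0))    # point inside the rectangle
miss = sub.matches((7.0, 99.0))    # price below the subscribed range
```

Treating an equality predicate as a zero-width interval lets one containment test cover both predicate types named in the schema.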
Subscription Partitioning
  • Basic idea: similarity-based clustering for reducing total event traffic
    • Event Space Partitioning (ESP)
    • Filter Set Partitioning (FSP)
Equality Predicates
  • Hash predicates to get uniform distribution
    • Treat the hashed domain as the event space
  • Use Event Space Partitioning
    • Subscription is a point; does not intersect multiple sub-spaces
  • Use over-partitioning for better load balancing
    • Use offline greedy algorithm to assign buckets to servers for load balancing
    • Use indirection table to dynamically map buckets to servers for load re-balancing
  • Use Bloom filters to further reduce traffic
    • Fast detection of true negatives at the expense of a (very low) false-positive rate
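
As a rough sketch of two of the ideas above, the snippet below hashes an equality-predicate key into a bucket of the hashed event space and keeps a small Bloom filter of subscribed keys. Everything here is an assumption made for illustration: the names `stable_hash`, `bucket_of`, and `BloomFilter`, the use of SHA-256, the filter size, and the reading of "over-partitioning" as 500 buckets for 50 servers. The talk does not specify where the Bloom filter sits; a per-server summary of subscribed keys is one plausible placement.

```python
import hashlib

NUM_BUCKETS = 500  # assumed: 50 servers x over-partitioning ratio 10


def stable_hash(key: str, salt: int = 0) -> int:
    """Deterministic 64-bit hash; the salt yields independent hash functions."""
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def bucket_of(key: str) -> int:
    """Map a key to one bucket. An equality subscription is a single point,
    so it lands in exactly one bucket and never spans several sub-spaces."""
    return stable_hash(key) % NUM_BUCKETS


class BloomFilter:
    """Set summary with no false negatives and a tunable false-positive rate."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        return [stable_hash(key, salt=i + 1) % self.num_bits
                for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


subscribed = BloomFilter()
for symbol in ("MSFT", "IBM"):
    subscribed.add(symbol)
```

Under these assumptions, an event whose key fails `might_contain` can be dropped without being forwarded, since no subscriber can match it; a rare false positive only costs one unnecessary forward.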
Simulation Results
  • Actual Notification Money log
    • 1.48M subscriptions with 0.29M unique filters over 21,741 stock symbols
    • Zipf-like distribution
Simulation Results (Cont.)
  • Simulate 100M new subscriptions from 43,734 symbols
    • Scaled-up Zipf-like distribution
    • Perturbation and permutation
    • Uniform distribution
  • 50 servers with over-partitioning ratio = 10
  • Without load re-balancing
    • Load imbalance (max/min) ranged from 1.41 to 6.66 (Uniform case)
  • With imbalance threshold of 2.0
    • Re-balancing was triggered only 5 times, each time involving re-assignment of up to 3 buckets and migration of up to 0.7% of subscriptions.
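
The bucket-to-server mechanics behind these numbers can be sketched as follows. This is a simplified illustration, not the algorithm evaluated in the talk: the function names, the heaviest-bucket-first greedy rule, the choice of which bucket to move, and the `max_moves` cap are all assumptions. Only the max/min imbalance measure and the 2.0 threshold come from the slides.

```python
from typing import List


def greedy_assign(bucket_load: List[int], num_servers: int) -> List[int]:
    """Build an indirection table (table[bucket] = server): take buckets
    heaviest first and give each to the currently least-loaded server."""
    table = [0] * len(bucket_load)
    load = [0] * num_servers
    for b in sorted(range(len(bucket_load)), key=lambda b: -bucket_load[b]):
        s = min(range(num_servers), key=lambda s: load[s])
        table[b] = s
        load[s] += bucket_load[b]
    return table


def server_loads(bucket_load: List[int], table: List[int], num_servers: int) -> List[int]:
    loads = [0] * num_servers
    for b, s in enumerate(table):
        loads[s] += bucket_load[b]
    return loads


def imbalance(loads: List[int]) -> float:
    """Load imbalance as max/min, the measure used on the slide."""
    low = min(loads)
    return float("inf") if low == 0 else max(loads) / low


def rebalance(bucket_load: List[int], table: List[int], num_servers: int,
              threshold: float = 2.0, max_moves: int = 3) -> List[int]:
    """While imbalance exceeds the threshold, remap one bucket from the most
    loaded to the least loaded server. Mutates table; returns moved buckets."""
    moved: List[int] = []
    for _ in range(max_moves):
        loads = server_loads(bucket_load, table, num_servers)
        if imbalance(loads) <= threshold:
            break
        hot = max(range(num_servers), key=lambda s: loads[s])
        cold = min(range(num_servers), key=lambda s: loads[s])
        gap = loads[hot] - loads[cold]
        # Only a bucket lighter than the gap narrows the hot/cold difference.
        candidates = [b for b, s in enumerate(table)
                      if s == hot and 0 < bucket_load[b] < gap]
        if not candidates:
            break
        best = min(candidates, key=lambda b: abs(bucket_load[b] - gap / 2))
        table[best] = cold
        moved.append(best)
    return moved


loads_at_setup = [10, 10, 10, 10, 10, 10]      # subscriptions per bucket
table = greedy_assign(loads_at_setup, num_servers=2)
loads_after_drift = [40, 10, 30, 10, 20, 10]   # same buckets after growth
moved = rebalance(loads_after_drift, table, num_servers=2)
```

In this toy run the drifted loads give servers 90 and 30 subscriptions (imbalance 3.0); remapping a single bucket in the table restores 60 and 60. Because only table entries change, the subscriptions that migrate are just those in the moved buckets.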
Range Predicates
  • Use Filter Set Partitioning
  • K-means clustering
    • Use center point to represent a rectangle
  • R-tree-based clustering
    • R-tree: dynamic index structure for multi-dimensional data rectangles
    • Offline R-tree algorithm
      • Exhaustively and recursively search for partitions that minimize sum of bounding rectangle volumes
    • Online R-tree algorithm
      • Insert from root down the path that greedily minimizes the increase in bounding rectangle volume
  • Simulation results
    • Offline R-tree > Online R-tree > K-means > Random
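
The online insertion rule can be illustrated with a flat, one-level simplification: each partition keeps a bounding rectangle, and a new subscription goes to the partition whose bounding volume would grow the least. This is a sketch under stated assumptions, not a full R-tree (no node splitting, no recursion), and the names `Partition`, `enclose`, and `insert_online` are hypothetical.

```python
from typing import List, Optional, Tuple

Rect = Tuple[Tuple[float, float], ...]  # one (low, high) interval per dimension


def volume(r: Rect) -> float:
    v = 1.0
    for low, high in r:
        v *= high - low
    return v


def enclose(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle containing both a and b."""
    return tuple((min(al, bl), max(ah, bh)) for (al, ah), (bl, bh) in zip(a, b))


class Partition:
    """One filter set plus the bounding rectangle of its subscriptions."""

    def __init__(self) -> None:
        self.bounds: Optional[Rect] = None
        self.filters: List[Rect] = []

    def growth(self, r: Rect) -> float:
        """Increase in bounding volume if r were added to this partition."""
        if self.bounds is None:
            return volume(r)
        return volume(enclose(self.bounds, r)) - volume(self.bounds)

    def add(self, r: Rect) -> None:
        self.bounds = r if self.bounds is None else enclose(self.bounds, r)
        self.filters.append(r)


def insert_online(partitions: List[Partition], r: Rect) -> int:
    """Greedy step: choose the partition with the smallest volume increase."""
    idx = min(range(len(partitions)), key=lambda i: partitions[i].growth(r))
    partitions[idx].add(r)
    return idx


partitions = [Partition(), Partition()]
a = ((0.0, 2.0), (0.0, 2.0))
b = ((10.0, 12.0), (10.0, 12.0))
c = ((1.0, 3.0), (1.0, 3.0))
assigned = [insert_online(partitions, r) for r in (a, b, c)]
```

Here `a` and `c` overlap and end up together, while the distant `b` gets its own partition. Assuming an event is forwarded to a partition only when it falls inside that partition's bounding rectangle, smaller bounding volumes mean fewer forwarded events.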
Related Work
  • Pub/Sub systems
    • Echo, Elvin, Gryphon, Herald, Hierarchical Proxy Architecture, Information Bus, JEDI, Keryx, Ready, Scribe, Siena, …
  • Clustering in pub/sub
    • All previous work focuses on reducing the number of multicast groups [OAA+00, RLW+02, WKM00]
Summary
  • Proposed two subscription partitioning and routing approaches
    • Event Space Partitioning
    • Filter Set Partitioning
  • Evaluated performance via simulations
    • Subscription partitioning reduces network traffic
    • Over-partitioning helps to achieve good load balancing dynamically
    • Bloom filters further reduce event traffic