Optimal Load Balancing in Publish/Subscribe Broker Networks using Active Workload Management
Hui Zhang, Samrat Ganguly, Sudeept Bhatnagar, Rauf Izmailov (NEC Labs America)
Abhishek Sharma (University of Southern California)
Outline
• Problem statement
• Load balancing in pub/sub broker networks
• Optimal load balancing
  • Half-cascading load distribution on a workload aggregation tree
• Shuffle
  • Architecture
  • Workload balancing schemes
• Analysis & Evaluation
• Conclusions
Publish/Subscribe Overlay Services
[Figure: a broker network with Publisher X, Publisher Y, Subscriber A, and Subscriber B attached; labels "Subscription" and "Event"]
Workload Management in a Pub/Sub Broker Network
• A broker network offers one function: message filtering.
  • The process of selecting messages for reception.
• Four types of workload in a broker network:
  • Message parsing.
  • Message matching.
  • Message delivering.
  • Message forwarding.
  • The last is assumed to be the performance bottleneck.
• One unique factor in the difficulty of workload management:
  • Run-time content matching.
• Our contribution: an active workload management middleware, offering optimal load balancing on all four types of workload.
  • Two main components: message shuffling and half-cascading aggregation trees.
A Simple Optimal Load Balancing Scheme
• A simple push-half-down load balancing scheme is enabled by the workload aggregation tree shown at left.
• [Figure: an aggregation tree with the half-cascading load distribution under a uniform input traffic distribution]
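
The push-half-down idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a node's children contribute 1/2, 1/4, 1/8, ... of its aggregated load, and that delegating filtering to a child removes exactly that child's share from the parent. The function names are hypothetical.

```python
from fractions import Fraction


def half_cascading_shares(num_children):
    """Share of the parent's aggregated load contributed by each child:
    1/2, 1/4, 1/8, ... (the half-cascading distribution)."""
    return [Fraction(1, 2 ** (i + 1)) for i in range(num_children)]


def push_half_down(total_load, num_children, target):
    """Delegate load to children, largest contributor first, until the
    parent's remaining load is at or below `target`.

    Assumption (not from the slides): delegating to a child removes
    exactly that child's share from the parent.
    Returns (remaining_load, children_used)."""
    remaining = Fraction(total_load)
    used = 0
    for share in half_cascading_shares(num_children):
        if remaining <= target:
            break
        remaining -= share * total_load
        used += 1
    return remaining, used
```

Under these assumptions each push halves the load left on the parent, so reaching a target takes a number of pushes that is logarithmic in the overload factor.
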
Message shuffling
• Upon receiving a message m (event or subscription) from outside the broker network, the first task of a Shuffle node x is to redistribute it in the system.
• x picks a random key for m (e.g., by hashing a subscription ID contained in the message) and sends m to the node y responsible for that key in the overlay space.
• Message shuffling achieves two goals:
  • The randomization makes the input traffic distribution for any potential aggregation tree uniform over the node space.
  • By combining message shuffling and Chord [stoica2001] with a new node join/leave scheme, Shuffle can construct half-cascading aggregation trees.
• The cost of parsing subscription messages is distributed evenly throughout the system, so Shuffle eliminates the potential performance bottleneck due to the message parsing workload.
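
The shuffling step can be sketched as a hash followed by a successor lookup on the identifier ring. This is a sketch under stated assumptions: SHA-1 as the hash, a 16-bit identifier space, and a globally known node list are illustrative choices, and `shuffle_key` and `responsible_node` are hypothetical names, not part of Shuffle or Chord.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an illustrative value, not from the slides


def shuffle_key(message_id, m=M):
    """Map a message ID (e.g. a subscription ID) to a key in [0, 2**m)."""
    digest = hashlib.sha1(message_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def responsible_node(key, node_ids):
    """Chord-style successor: the first node ID >= key, wrapping around
    the ring. A real overlay would resolve this by routing, not by
    consulting a global node list."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key)
    return ring[i % len(ring)]
```

For example, `responsible_node(shuffle_key("sub-42"), nodes)` would give the node y to which the receiving node x forwards the message, where `"sub-42"` and `nodes` are placeholder inputs.
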
Shuffle – software architecture
[Figure: the Shuffle node architecture]
Shuffle – an example message filtering process
• An event message e arrives from a publisher at node x.
• Node x forwards e to node y through message shuffling.
• Node y parses e and forwards the parsed message to each of the subscription aggregation trees that e’s attributes correspond to.
• In each aggregation tree, e is forwarded along the path from y to the root node following the Chord routing protocol; the node at each hop either forwards it or performs message matching.
• Once message matching is done, message delivery is performed on the same node.
• Periodically, a load balancing process is scheduled to balance the workload due to two independent inputs: streaming events and stored subscriptions.
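
The path an event takes from y to the root of an aggregation tree can be sketched as below. This is a simplified model, assuming a fully populated ring of 2**m nodes (every identifier is a live node) and greedy Chord finger routing; `next_hop` and `path_to_root` are hypothetical helper names.

```python
def next_hop(node, root, m):
    """Greedy finger routing on a full ring of 2**m nodes: jump by the
    largest power of two that does not overshoot the root."""
    size = 1 << m
    dist = (root - node) % size
    if dist == 0:
        return node  # already at the root
    step = 1 << (dist.bit_length() - 1)
    return (node + step) % size


def path_to_root(node, root, m):
    """List of nodes visited from `node` to `root`, inclusive."""
    path = [node]
    while path[-1] != root:
        path.append(next_hop(path[-1], root, m))
    return path
```

With m = 3 and the root at node 0, an event entering the tree at node 1 visits nodes 1, 5, 7, 0; any node on that path could be the one that performs message matching.
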
Event Overload
X: # of subscriptions for attribute A; Y: # of events with attribute A
[Figure: aggregation tree over nodes 0, 1, 2, 3, with per-node loads labeled (X, Y) and (X, Y/2)]
Subscription Overload
X: # of subscriptions for attribute A; Y: # of events with attribute A
[Figure: aggregation trees over nodes a, b, c, d, with per-node loads labeled (X, Y), (X/2, Y), and (X/4, Y)]
Analysis
• Result 1: When the Shuffle network size is a power of 2, every Shuffle node in any aggregation tree has the half-cascading load distribution over its children in terms of aggregated messages.
• Result 2: When the Shuffle network size is not a power of 2, any non-leaf node x in an aggregation tree has at least one child that contributes no less than 1/4 of the total load aggregated on x.
• Result 3: MIN-NODE-LOAD-FORWARD is NP-hard.
  • MIN-NODE-LOAD-FORWARD: For a network of size N, given k attribute trees, the number of subscriptions X_i at the root of each attribute tree i, and a threshold th, what is the minimum number of nodes in the network to which subscriptions must be transferred such that the number of subscriptions at any node is at most th?
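
Result 1 can be checked numerically on a small example. The sketch below is an illustration rather than a proof, and it rests on assumptions not stated on the slide: a fully populated ring of 2**m nodes, greedy Chord finger routing toward the root, and one unit of traffic injected at every node. All function names are hypothetical.

```python
def parent(node, root, m):
    """Next hop toward the root under greedy finger routing on a full
    ring of 2**m nodes. Must not be called with node == root."""
    size = 1 << m
    dist = (root - node) % size
    step = 1 << (dist.bit_length() - 1)
    return (node + step) % size


def aggregated_loads(m, root=0):
    """Load aggregated at each node when every node injects one unit."""
    size = 1 << m
    load = {n: 1 for n in range(size)}
    # Visit nodes farthest from the root first, so that every child is
    # folded into its parent before the parent is folded further up.
    order = sorted((n for n in range(size) if n != root),
                   key=lambda n: (root - n) % size, reverse=True)
    for n in order:
        load[parent(n, root, m)] += load[n]
    return load


def root_child_loads(m, root=0):
    """Aggregated loads of the root's children, largest first."""
    size = 1 << m
    load = aggregated_loads(m, root)
    return [load[(root - (1 << j)) % size] for j in range(m)]
```

For a 16-node ring the root aggregates 16 units and its children hold 8, 4, 2, and 1, that is 1/2, 1/4, 1/8, and 1/16 of the root's load, which matches the half-cascading pattern.
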
Evaluation
• We consider three load balancing schemes:
  • Shuffle.
  • Random-Half: An overloaded node picks an underloaded node by random probing, then splits half its load with that node. The overloaded node repeats the operation until its load falls below a target level.
  • Random-Min: The same as Random-Half, except that when an overloaded node splits its load with an underloaded node, it delegates only a bare minimum load, equal to the target value, to the chosen node by replicating its subscription set there and forwarding a commensurate fraction of the event traffic there.
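
The two random baselines can be sketched as a small simulation. This is a simplified reading of the descriptions above, not the evaluated implementation: random probing is replaced by a uniform choice among currently underloaded nodes, load is a single number per node, and Random-Min is interpreted as moving `target` units per transfer (or the remaining excess, if smaller). The function name `rebalance` is hypothetical.

```python
import random


def rebalance(loads, node, target, rng, scheme="half"):
    """Shed load from `node` until it is at or below `target`.

    scheme="half": give half of the current load to the chosen peer.
    scheme="min":  give `target` units, or the remaining excess if that
                   is smaller (an assumed reading of Random-Min).
    Returns (new_loads, number_of_transfers)."""
    loads = list(loads)
    transfers = 0
    while loads[node] > target:
        # Stand-in for random probing: choose among underloaded peers.
        peers = [i for i, load in enumerate(loads)
                 if i != node and load < target]
        if not peers:
            break  # nobody left to take load
        peer = rng.choice(peers)
        if scheme == "half":
            moved = loads[node] / 2
        else:
            moved = min(target, loads[node] - target)
        loads[node] -= moved
        loads[peer] += moved
        transfers += 1
    return loads, transfers
```

In this toy model, a node with load 100 and target 20 needs three transfers under the "half" scheme but leaves some peers above the target, while the "min" scheme needs four transfers and keeps every peer at or below it.
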
Single Aggregation Tree Results (1)
• Event load balancing: control messages
Single Aggregation Tree Results (2)
• Event load balancing: message forwarding load
Multiple Aggregation Tree Results
• Subscription load balancing: nodes affected
Conclusions
• In this paper, we present the design of Shuffle, an active workload management middleware that supports a scalable broker network.
• Shuffle offers an integrated solution for managing all types of workload in a pub/sub broker network.
• The load balancing performance is insensitive to the data distribution of input requests.
• The load balancing does not introduce extra maintenance cost on the overlay topology.
Thank you! Questions?