Optimal Load Balancing in Publish/Subscribe Broker Networks using Active Workload Management
Hui Zhang, Samrat Ganguly, Sudeept Bhatnagar, Rauf Izmailov (NEC Labs America)
Abhishek Sharma (University of Southern California)
©NEC Laboratories America
Outline
Problem statement
  Load balancing in pub/sub broker networks
Optimal load balancing
  Half-cascading load distribution on a workload aggregation tree
Shuffle
  Architecture
  Workload balancing schemes
Analysis & Evaluation
Conclusions
Publish/Subscribe Overlay Services
[Figure labels: Publisher X, Publisher Y, Subscriber A, Subscriber B, Broker network, Subscription, Event.]
Workload Management in a Pub/Sub Broker Network
A broker network offers one function: message filtering, the process of selecting messages for reception.
Four types of workload arise in a broker network: message parsing, message matching, message delivering, and message forwarding. The last is assumed to cause the performance bottleneck.
One factor makes workload management uniquely difficult: run-time content matching.
Our contribution: an active workload management middleware that offers optimal load balancing on all four types of workload. Its two main components are message shuffling and half-cascading aggregation trees.
A Simple Optimal Load Balancing Scheme
A simple push-half-down load balancing scheme can be enabled with the workload aggregation tree shown in the figure.
[Figure: an aggregation tree with the half-cascading load distribution under a uniform input traffic distribution.]
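The half-cascading shape can be checked numerically with a minimal sketch. It assumes a ring of n = 2^k nodes with identifiers 0..n-1, one message injected per node, and greedy Chord-style finger routing toward the tree root. The helper names `parent` and `child_loads` are hypothetical and are not taken from the Shuffle implementation.

```python
from collections import defaultdict


def parent(x, root, n):
    """Next hop from node x toward root under greedy finger routing (assumed model)."""
    d = (root - x) % n                  # clockwise distance still to travel
    if d == 0:
        return None                     # x is the root itself
    step = 1 << (d.bit_length() - 1)    # largest power of two not exceeding d
    return (x + step) % n


def child_loads(root, n):
    """Messages reaching root through each of its children, one message per node.

    Returns the per-child counts sorted from largest to smallest.
    """
    loads = defaultdict(int)
    for x in range(n):
        if x == root:
            continue
        hop = x
        # Walk toward the root and stop at the last node before it.
        while parent(hop, root, n) != root:
            hop = parent(hop, root, n)
        loads[hop] += 1
    return sorted(loads.values(), reverse=True)


if __name__ == "__main__":
    print(child_loads(0, 16))   # [8, 4, 2, 1]
```

Under these assumptions each child of the root carries half of what the previous one does, so pushing work down to the heaviest child offloads roughly half of the traffic aggregated at the root.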
Message Shuffling
Upon receiving a message m (event or subscription) from outside the broker network, the first task of a Shuffle node x is to redistribute it in the system. x picks a random key for m (e.g., by hashing a subscription ID contained in the message) and sends it to the node y responsible for that key in the overlay space.
Message shuffling achieves two goals:
1. The randomization makes the input traffic of any potential aggregation tree uniformly distributed over the node space. Combining message shuffling with Chord [stoica2001] and a new node join/leave scheme, Shuffle can construct half-cascading aggregation trees.
2. The cost of parsing subscription messages is distributed evenly throughout the system, so Shuffle eliminates the potential performance bottleneck due to message parsing workload.
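The key selection and lookup step can be sketched as follows. This is an illustration only: the 16-bit ring size, the use of SHA-1, and the helper names `key_for` and `responsible_node` are assumptions, and the lookup is done against a locally known sorted list of node identifiers rather than through real Chord routing.

```python
import bisect
import hashlib

RING_BITS = 16  # assumed size of the identifier space (2**16 keys)


def key_for(message_id: str) -> int:
    """Map a message or subscription ID to a pseudo-random key on the ring."""
    digest = hashlib.sha1(message_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << RING_BITS)


def responsible_node(key: int, node_ids: list) -> int:
    """Return the first node at or after key on the ring (Chord-style successor).

    node_ids must be sorted in ascending order; the lookup wraps around
    to the smallest identifier when key lies past the largest one.
    """
    i = bisect.bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


if __name__ == "__main__":
    nodes = [0, 16384, 32768, 49152]
    k = key_for("subscription-42")
    print(k, responsible_node(k, nodes))
```

Because the key depends only on the hashed ID, the entry node x has no influence on where a message lands, which is what spreads parsing work across the node space.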
Shuffle: Software Architecture
[Figure: the Shuffle node architecture.]
Shuffle: An Example Message Filtering Process
1. An event message e arrives from a publisher at node x.
2. Node x forwards e to node y through message shuffling.
3. Node y parses e and forwards the parsed message to each of the subscription aggregation trees that e's attributes correspond to.
4. In each aggregation tree, e is forwarded along the path from y to the root node following the Chord routing protocol, and the node at each hop either forwards it or performs message matching.
5. Once message matching is done, message delivery is performed on the same node.
6. Periodically, a load balancing process is scheduled to balance the workload due to two independent inputs: streaming events and stored subscriptions.
Event Overload
X: number of subscriptions for attribute A; Y: number of events with attribute A.
[Figure: per-node load labels (X, Y) and (X, Y/2).]
Subscription Overload
X: number of subscriptions for attribute A; Y: number of events with attribute A.
[Figure: nodes a, b, c, d with per-node load labels (X, Y), (X/2, Y), and (X/4, Y).]
Analysis
Result 1: When the Shuffle network size is a power of 2, every Shuffle node in any aggregation tree has the half-cascading load distribution on its children in terms of aggregated messages.
Result 2: When the Shuffle network size is not a power of 2, any non-leaf node x in an aggregation tree has at least one child that contributes no less than 1/4 of the total load aggregated on x.
Result 3: MIN-NODE-LOAD-FORWARD is NP-hard.
MIN-NODE-LOAD-FORWARD: For a network of size N, given k attribute trees, the number of subscriptions X_i at the root of each attribute tree i, and a threshold th, what is the minimum number of nodes in the network to which subscriptions must be transferred so that the number of subscriptions at any node is at most th?
Evaluation
We consider three load balancing schemes:
Shuffle.
Random-Half: An overloaded node picks an underloaded node by random probing and hands half of its load to that node. The overloaded node repeats the operation until its load falls below a target level.
Random-Min: The same as Random-Half, except that when an overloaded node splits its load with an underloaded node, it delegates only a bare minimum load equal to the target value to the chosen node, by replicating its subscription set there and forwarding a commensurate fraction of the event traffic there.
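The Random-Half baseline can be sketched as a small simulation. This is one reading of the description above, not the evaluation code: it assumes a node counts as underloaded when its load is below the target, models load as a single number per node, and adds a hypothetical `max_probes` guard so the loop stops if no underloaded peer exists.

```python
import random


def random_half(loads, hot, target, rng, max_probes=10_000):
    """Shed load from node `hot` until it is at or below `target`.

    loads is a mutable list of per-node loads. Returns the number of
    random probes issued, a rough proxy for control-message cost.
    """
    probes = 0
    while loads[hot] > target and probes < max_probes:
        peer = rng.randrange(len(loads))
        probes += 1
        if peer == hot or loads[peer] >= target:
            continue                 # probe missed: peer is not underloaded
        moved = loads[hot] / 2       # hand over half of the current load
        loads[hot] -= moved
        loads[peer] += moved
    return probes


if __name__ == "__main__":
    rng = random.Random(0)
    loads = [16.0] + [0.0] * 31
    probes = random_half(loads, hot=0, target=1.0, rng=rng)
    print(probes, loads[0], sum(loads))
```

In this model every successful split halves the hot node's load, so at least log2(load / target) probes are needed, and more whenever a probe hits a peer that is already at or above the target.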
Single Aggregation Tree Results (1)
[Figure: event load balancing, control messages.]
Single Aggregation Tree Results (2)
[Figure: event load balancing, message forwarding load.]
Multiple Aggregation Tree Results
[Figure: subscription load balancing, nodes affected.]
Conclusions
We present the design of Shuffle, an active workload management middleware that supports a scalable broker network.
Shuffle offers an integrated solution for managing all types of workload in a pub/sub broker network.
The load balancing performance is insensitive to the data distribution of input requests.
The load balancing introduces no extra maintenance cost on the overlay topology.
Thank you! Questions?