MIDDLEWARE SYSTEMS RESEARCH GROUP
Adaptive Content-based Routing in General Overlay Topologies
Guoli Li, Vinod Muthusamy, Hans-Arno Jacobsen
Middleware Systems Research Group, University of Toronto

Middleware Leuven, Belgium

Distributed Publish/Subscribe
An acyclic overlay is sensitive to:
  Congestion
  Broker failures
Benefits of a general overlay:
  Routing around congestion and failures
  Handling imbalanced workloads
[Figure: broker overlay with a publisher and subscribers exchanging advertisement, subscription, and publication messages]
Applications:
  Business process execution (e.g., BPEL)
  Business activity monitoring
  Service discovery and integration
  …

Challenges With General Overlays
Subscriptions are routed in loops
Brokers receive duplicate subscriptions
Subscription copies exacerbate the problem
Same problem for publications
[Figure: overlay with Adv 1, Adv 2, and multiple copies of subscription S]

Agenda
Content-based routing protocol for general overlays
  Atomic and composite subscriptions
  Optimal publication routing
Evaluation
  Dynamic publication routing
  Adaptive composite subscription routing

TID-based Approach
Each advertisement is assigned a unique tree identifier (TID)
Each subscription has a TID predicate with a variable
[Figure: overlay with Adv 1, Adv 2, and subscription S]

Subscription Routing
S: [class=stock][symbol=*][TID=$Z]
At Broker 1:
  Adv1: [class=stock][symbol=IBM][TID=Adv1]
  Adv2: [class=stock][symbol=HP][TID=Adv2]
  S matching Adv1: [class=stock][symbol=*][TID=Adv1]
  S matching Adv2: [class=stock][symbol=*][TID=Adv2]
[Figure: overlay with Adv 1 and Adv 2, subscription S, and its bound copies S A1 and S A2]

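The binding step on this slide can be sketched in Python. This is a minimal illustration, not the authors' implementation: the dict representation of messages, the "*" wildcard handling, and the helper names `overlaps` and `bind_tid` are assumptions made for the example.

```python
# Hypothetical sketch: messages are dicts of attribute -> value; "*" is a wildcard.
ADVERTISEMENTS = {
    "Adv1": {"class": "stock", "symbol": "IBM"},
    "Adv2": {"class": "stock", "symbol": "HP"},
}


def overlaps(sub, adv):
    """Return True if every non-TID predicate of sub is compatible with adv."""
    for attr, wanted in sub.items():
        if attr == "TID":
            continue  # the TID variable is bound later, not matched here
        if attr not in adv:
            return False
        if wanted != "*" and adv[attr] != "*" and wanted != adv[attr]:
            return False
    return True


def bind_tid(sub, advertisements):
    """Produce one copy of sub per matching advertisement, with its TID bound."""
    copies = []
    for tid, adv in advertisements.items():
        if overlaps(sub, adv):
            bound = dict(sub)
            bound["TID"] = tid  # replace the variable ($Z) with a concrete TID
            copies.append(bound)
    return copies


s = {"class": "stock", "symbol": "*", "TID": "$Z"}
for bound_copy in bind_tid(s, ADVERTISEMENTS):
    print(bound_copy)
```

With the two advertisements from the slide, the wildcard subscription yields one copy bound to Adv1 and one bound to Adv2, each of which can then be forwarded along its own advertisement tree.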
Publication Routing
Each publication is assigned the TID of its matching advertisement
  e.g., p [class,stock][symbol,HP][TID,adv_msg_id]
Publications are routed in one of two ways:
  Fixed TID routing: a publication is routed to subscribers along its advertisement tree.
  Dynamic publication routing: a publication may be routed to subscribers across advertisement trees.

Fixed TID Routing
Property: No broker receives duplicate publication messages.
[Figure: overlay with Adv 1, Adv 2, a subscriber (Sub), and publication P routed along one advertisement tree]

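The no-duplicates property can be illustrated with a small Python sketch. Everything here is an assumption for illustration: the four-broker overlay (which contains the cycle B1-B2-B4-B3-B1), the table layout, and the names `PRT`, `matches`, and `route`. The tables are filled in by hand as they would look after one subscription was routed back along the Adv1 tree through B2.

```python
from collections import Counter

# Hypothetical per-broker publication routing tables (PRT):
# broker -> list of (subscription predicates, TID, next hop toward the subscriber).
PRT = {
    "B1": [({"class": "stock", "symbol": "HP"}, "Adv1", "B2")],
    "B2": [({"class": "stock", "symbol": "HP"}, "Adv1", "B4")],
    "B3": [],  # on the alternative path B1-B3-B4, but holds no entry for Adv1
    "B4": [({"class": "stock", "symbol": "HP"}, "Adv1", "subscriber")],
}


def matches(sub, pub):
    """Equality-only matching of subscription predicates against a publication."""
    return all(pub.get(attr) == value for attr, value in sub.items())


def route(pub, node, received):
    """Deliver pub to node, then forward it along the tree named by its TID."""
    received[node] += 1
    next_hops = {
        hop
        for sub, tid, hop in PRT.get(node, [])
        if tid == pub["TID"] and matches(sub, pub)
    }
    for hop in sorted(next_hops):
        route(pub, hop, received)
    return received


p = {"class": "stock", "symbol": "HP", "TID": "Adv1"}
print(route(p, "B1", Counter()))
```

Because forwarding only follows entries whose TID equals the publication's TID, the message stays on a single tree, so each broker on that tree sees it once and B3 never sees it despite the cycle.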
Dynamic Publication Routing
A publication’s TID is changeable
Routing heuristic: Util = R_output / R_sending
Property: Changing a publication’s TID while in transit will not change the set of notified subscribers.
[Figure: overlay with Adv 1, Adv 2, a subscriber (Sub), and publication P switching between advertisement trees]

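One way to apply the Util heuristic is sketched below in Python. The slide does not define the two rates, so this sketch assumes that R_output is the rate at which messages enter a link's output queue and R_sending is the rate at which the link drains that queue, so that a larger ratio indicates a more congested link. The function names and the choice to pick the minimum are also assumptions, and the candidate TIDs are assumed to be trees on which the same subscription is already installed.

```python
def utilization(r_output, r_sending):
    """Util = R_output / R_sending for one outgoing link (assumed meaning above)."""
    if r_sending <= 0:
        return float("inf")  # a link that sends nothing is treated as unusable
    return r_output / r_sending


def choose_tid(candidates):
    """Pick the TID whose next-hop link currently has the lowest utilization.

    candidates maps TID -> (r_output, r_sending) for the link that tree uses.
    """
    return min(candidates, key=lambda tid: utilization(*candidates[tid]))


# Illustrative rates in messages per second (made up for the example).
links = {"Adv1": (120.0, 100.0), "Adv2": (40.0, 100.0)}
print(choose_tid(links))
```

In this made-up case the link on the Adv1 tree is receiving messages faster than it sends them, so the publication would be re-tagged with Adv2 and continue on that tree.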
Advantages
Retains the publish/subscribe client interface
Speeds up subscription and publication matching
Avoids duplicate subscriptions and publications
Routes publications dynamically across multiple alternatives
Enables routing around failures, congestion, and load imbalances

Composite Subscription
[Figure: operator tree with AND at the root and S1, S2 as leaves]
A composite subscription consists of atomic subscriptions linked by logical operators (e.g., AND, OR).
Composite subscription routing:
  Topology-based routing
  Adaptive routing
e.g., CS = {[class=stock][symbol=YHOO][price>12]} AND {[class=stock][symbol=MSFT][price<20]}

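The operator-tree structure of a composite subscription can be sketched in Python as follows. The class names `Atomic` and `Composite`, the `satisfied` helper, and the simplified semantics (a CS is satisfied by a set of publications, with no time window or ordering) are assumptions for illustration and not the system's actual data model.

```python
import operator
from dataclasses import dataclass
from typing import Union

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


@dataclass(frozen=True)
class Atomic:
    predicates: tuple  # tuple of (attribute, op, value) triples

    def matches(self, pub):
        """True if the publication satisfies every predicate."""
        return all(
            attr in pub and OPS[op](pub[attr], value)
            for attr, op, value in self.predicates
        )


@dataclass(frozen=True)
class Composite:
    op: str  # "AND" or "OR"
    left: Union["Atomic", "Composite"]
    right: Union["Atomic", "Composite"]


def satisfied(sub, pubs):
    """Evaluate the subscription tree against a collection of publications."""
    if isinstance(sub, Atomic):
        return any(sub.matches(p) for p in pubs)
    left, right = satisfied(sub.left, pubs), satisfied(sub.right, pubs)
    if sub.op == "AND":
        return left and right
    if sub.op == "OR":
        return left or right
    raise ValueError(f"unsupported operator: {sub.op}")


s1 = Atomic((("class", "=", "stock"), ("symbol", "=", "YHOO"), ("price", ">", 12)))
s2 = Atomic((("class", "=", "stock"), ("symbol", "=", "MSFT"), ("price", "<", 20)))
cs = Composite("AND", s1, s2)

# Illustrative publications; the prices are made up.
pubs = [
    {"class": "stock", "symbol": "YHOO", "price": 13},
    {"class": "stock", "symbol": "MSFT", "price": 19},
]
print(satisfied(cs, pubs))
```

Since the two atomic subscriptions name different symbols, no single publication can match both; the AND is satisfied only when the broker evaluating it sees publications for each side, which is why the placement of that evaluation point matters for routing.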
Topology-based CS Routing
CS = {{S1 AND S2} AND S3}
CS’ = {S1 AND S2}
Brokers 4 and 8 are the joint point brokers.
[Figure: overlay with three advertisements (A1, A2, A3); CS, CS’, and the atomic subscriptions S1, S2, S3 are routed toward them]

Adaptive CS Routing
A CS’s joint points are determined according to potential publication traffic, bandwidth, latency, etc.
[Figure: two panels, each showing an overlay with Adv 1, Adv 2, and CS = {S1 AND S2}]

Cost Model
Routing cost of CS: RC(CS) = … + … + …
Subscription cardinality |P(S)|: the number of matching publications per unit of time.
  |P(S)| = …
  |P(CS)| = |P(S_l)| + |P(S_r)| if op = or
[Figure: broker internals with an input queue, a matching engine, a routing table (subscription, dest: symbol=IBM to B1, symbol=HP to B2), and output queues for B1 and B2]

Adaptive CS Routing
CS = {{S1 AND S2} AND S3}
CS’ = {S1 AND S2}
[Figure: overlay with three advertisements (A1, A2, A3); CS, CS’, and the atomic subscriptions S1, S2, S3 are routed toward them]

Evaluation
Setup:
  Overlays of 32 brokers with different connection degrees
  Cluster (each node: 1.86 GHz, 4 GB) and PlanetLab
Workloads: Yahoo! Finance stock quote traces
Metrics:
  End-to-end notification delay
  Network traffic

Dense vs. Sparser Topologies
[Figure: plot annotated with 20% and 4%]
Note: The benefit is not proportional to the connection degree.

Higher Publication Rate
[Figure: plot with the annotation "stabilized"]

Publication Burst
[Figure: plot with the annotation "Burst"]

With Broker Failures
[Figure: plot with the annotations "1st failure" and "2nd failure"]

CS Routing Traffic

Conclusions
Enables routing around failures, congestion, and load imbalances
Allows publication routing across alternative paths
  Improves the notification delay by 20%
Enables flexible CS routing
  Reduces publication traffic by 80%
  Improves the notification delay by 55%
Simplifies solutions for failure recovery and load balancing

Questions?

More Publishers

Effect of Subscriber Distance
Distance    Fixed (ms)    Dynamic (ms)    Improvement
6 Hops      …             …               …%
10 Hops     …             …               …%
12 Hops     …             …               …%
Max Diff    57.65%        27.39%

On PlanetLab

CS Delay

Faster Matching with TIDs
Subscriptions are augmented with TIDs only once, at the first broker.
Other brokers can route the subscription based on the TID alone.
A similar argument applies to publication routing.

Advertisement Routing
Each advertisement forms a spanning advertisement tree
Duplicate advertisements are discarded by brokers
Each advertisement is assigned a unique tree identifier (TID)
  e.g., a [class,eq,stock]……[TID,eq,adv_msg_id]
Subscription Routing Table (SRT)
  A set of [advertisement, last hop]

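How flooding with duplicate suppression yields a spanning tree can be sketched in Python. The four-broker overlay, the breadth-first delivery order (a stand-in for real message arrival order), the "publisher" marker for the origin's last hop, and the function name `flood_advertisement` are all assumptions made for this example.

```python
from collections import deque

# Hypothetical overlay with a cycle: B1-B2, B1-B3, B2-B4, B3-B4.
OVERLAY = {
    "B1": ["B2", "B3"],
    "B2": ["B1", "B4"],
    "B3": ["B1", "B4"],
    "B4": ["B2", "B3"],
}


def flood_advertisement(overlay, origin, tid):
    """Flood an advertisement; return broker -> SRT entry (tid, last hop)."""
    srt = {origin: (tid, "publisher")}
    queue = deque([origin])
    while queue:
        broker = queue.popleft()
        for neighbor in overlay[broker]:
            if neighbor in srt:
                continue  # duplicate advertisement: discard
            srt[neighbor] = (tid, broker)  # remember where it first came from
            queue.append(neighbor)
    return srt


for broker, entry in flood_advertisement(OVERLAY, "B1", "Adv1").items():
    print(broker, entry)
```

Each broker keeps only the first copy it receives, so the recorded last hops contain exactly one parent per broker: the links B1-B2, B1-B3, and B2-B4 form the tree for Adv1, and the B3-B4 link is unused for this TID.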
Subscription Routing
Each subscription has a TID predicate with a variable.
  e.g., s [class,eq,stock]……[TID,eq,$X]
The variable is bound to the TID of a matching advertisement.
Publication Routing Table (PRT)
  A set of [subscription, {TID, last hop of subscription}]