Alex Cheung and Hans-Arno Jacobsen
August 14th, 2009
MIDDLEWARE SYSTEMS RESEARCH GROUP
Agenda
- Problem statement and goal
- Related approaches
- How GRAPE and POP work
- Experiment results
- Conclusions and future work
Problem
- Publishers can join anywhere in the network (e.g., at the closest broker)
- Impact:
  - High delivery delay
  - High system utilization: matching, bandwidth, subscription storage
- [Figure: publishers (P), subscribers (S), and brokers acting as pure forwarders]
Goal
- Adaptively move each publisher to the area with the highest-rated subscribers or the highest number of publication deliveries
- Key properties of the solution: dynamic, transparent, scalable, robust
- [Figure: publishers (P) and subscribers (S)]
Existing Approaches
- Filter-based pub/sub:
  - R. Baldoni et al. Efficient publish/subscribe through a self-organizing broker overlay and its application to SIENA. The Computer Journal.
  - Migliavacca et al. Adapting Publish-Subscribe Routing to Traffic Demands. DEBS.
- Multicast-based pub/sub:
  - Examples: Riabov's subscription clustering algorithms (ICDCS '02 and '03), SUB-2-SUB (one subscription per peer), TERA (topic-based)
  - Assign similar subscriptions to one or more clusters of servers
  - One-time match at the dispatcher
  - Suitable for static workloads
  - May deliver false-positive publications
  - Architecture is fundamentally different from that of filter-based approaches
Terminology
[Figure: brokers B1 to B5 with publisher P, marking the reference broker, the upstream and downstream directions, and the publication flow]
GRAPE - Intro
- Greedy Relocation Algorithm for Publishers of Events
- Goal: move publishers to the area with the highest-rated subscribers or the highest number of publication deliveries, depending on GRAPE's configuration
GRAPE's Configuration
The configuration tells GRAPE which aspect of system performance to improve:
1. Prioritize minimizing either the average end-to-end delivery delay or the total system message rate (a.k.a. system load)
2. The weight of the prioritization falls on a scale between 0% (weakest) and 100% (full)
Example: prioritize minimizing load at 100% (load100)
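Minimize Delivery Delay or Load?
[Figure: publisher P with advertisement [class,`STOCK’], [symbol,`GOOG’], [volume, ]; two groups of subscribers, one with subscription [class,=,`STOCK’], [symbol,=,`GOOG’], [volume,>, ] and one with [class,=,`STOCK’], [symbol,=,`GOOG’], [volume,>,0]; delivery rates of 4 msg/s and 1 msg/s; a scale running from 100% load / 0% delay to 0% load / 100% delay]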
GRAPE's 3 Phases
The operation of GRAPE is divided into 3 phases:
- Phase 1: Discover the location of publication deliveries by tracing live publication messages in trace sessions
  - Retrieve trace and broker performance information
- Phase 2: In a centralized manner, pinpoint the broker that minimizes the average delivery delay or system load
- Phase 3: Migrate the publisher to the broker chosen in Phase 2
  - Transparently, with minimal routing table updates and message overhead
Phase 1 – Logging Publication History
- Each broker records, per publisher, the publications delivered to local subscribers
- G_threshold publications are traced per trace session
- Each trace session is identified by the message ID of the first traced publication message of that session
- GRAPE's data structure representing the local delivery pattern is a bit vector that starts at the trace session ID
- Requires each publication to store the trace session ID
- [Figure: trace session ID B34-M212 marking the start of the bit vector; publications received from the start of the trace session: B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, B34-M226]
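A minimal sketch of how a broker might keep this bit vector. It assumes that message IDs end in a sequential counter (as in B34-M213) and that bit i stands for the publication i positions after the trace session ID. The class and helper names are hypothetical and are not taken from GRAPE itself.

```python
from dataclasses import dataclass


def seq(message_id: str) -> int:
    """Return the numeric counter of an ID such as 'B34-M213' (assumed format)."""
    return int(message_id.rsplit("-M", 1)[1])


@dataclass
class TraceSession:
    session_id: str   # message ID of the first traced publication
    g_threshold: int  # number of publications traced per session
    bits: int = 0     # bit i set: publication (start + i) was delivered locally

    def record_delivery(self, message_id: str) -> bool:
        """Mark a locally delivered publication; ignore IDs outside the session."""
        offset = seq(message_id) - seq(self.session_id)
        if not 0 <= offset < self.g_threshold:
            return False
        self.bits |= 1 << offset
        return True

    def local_deliveries(self) -> int:
        """Number of traced publications delivered to local subscribers."""
        return bin(self.bits).count("1")


# Deliveries listed on the slide for the session that starts at B34-M212.
session = TraceSession(session_id="B34-M212", g_threshold=50)
for n in (213, 215, 216, 217, 220, 222, 225, 226):
    session.record_delivery(f"B34-M{n}")
```

Storing only offsets from the session ID keeps the per-publisher state to one integer per broker, which is why each publication has to carry the trace session ID.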
Phase 1 – Trace Data and Broker Performance Retrieval
[Figure: brokers B1, B5, B6, B7, B8 with publisher P and subscribers (1x, 9x, 5x, 1x). At the end of a trace session, replies flow back toward the publisher and accumulate trace data along the way: Reply B8 and Reply B7, then Reply B8, B7, B6, then Reply B8, B7, B6, B5.]
Phase 1 – Contents of Trace Information
- Broker ID
- Neighbor ID(s)
- Bit vector (for estimating total system message rate)
- Total number of local deliveries (for estimating end-to-end delivery delay)
- Input queuing delay
- Average matching delay
- Output queuing delays to neighbor(s) and binding(s)
GRAPE adds 1 reply message per trace session.
Phase 2 – Broker Selection
- Estimate the average end-to-end delivery delay, using:
  - Local delivery counts, and queuing and matching delays
  - Publisher ping times to the downstream brokers
- Estimate the total broker message rate, using:
  - Bit vectors
Phase 2 – Estimating Average End-to-End Delivery Delay
Publisher ping time: 10 ms. Local deliveries: 1 at B1, 2 at B6, 9 at B7, 5 at B8.
Broker delays:
- B1: input Q 30 ms, matching 20 ms, output Q (RMI) 100 ms, output Q (B5) 50 ms
- B6: input Q 20 ms, matching 5 ms, output Q (RMI) 45 ms, output Q (B5) 25 ms, output Q (B7) 40 ms, output Q (B8) 35 ms
- B7: input Q 30 ms, matching 10 ms, output Q (RMI) 70 ms, output Q (B6) 30 ms
- B8: input Q 35 ms, matching 15 ms, output Q (RMI) 75 ms, output Q (B6) 35 ms
Estimated delays, with the ping time charged to every delivery:
- Subscriber at B1: [10 + (30 + 20 + 100)] × 1 = 160 ms
- Subscribers at B6: [10 + (30 + 20 + 50) + (20 + 5 + 45)] × 2 = 360 ms
- Subscribers at B7: [10 + (30 + 20 + 50) + (20 + 5 + 40) + (30 + 10 + 70)] × 9 = 2,565 ms
- Subscribers at B8: [10 + (30 + 20 + 50) + (20 + 5 + 35) + (35 + 15 + 75)] × 5 = 1,475 ms
Average end-to-end delivery delay: (160 + 360 + 2,565 + 1,475) ÷ 17 ≈ 268 ms
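The arithmetic on this slide can be checked with a short sketch. The assignment of the listed delays to individual queues is an assumption: it is the reading under which each per-hop sum works out. Charging the 10 ms ping time to every delivery is likewise an assumption, chosen because it reproduces the 268 ms average. All variable and function names are hypothetical.

```python
PING_MS = 10  # publisher ping time to its broker (B1)

# Per-broker delays in ms, as read off the slide (queue assignment assumed).
delays = {
    "B1": {"input": 30, "match": 20, "out": {"RMI": 100, "B5": 50}},
    "B6": {"input": 20, "match": 5, "out": {"RMI": 45, "B5": 25, "B7": 40, "B8": 35}},
    "B7": {"input": 30, "match": 10, "out": {"RMI": 70, "B6": 30}},
    "B8": {"input": 35, "match": 15, "out": {"RMI": 75, "B6": 35}},
}

# Hops from the publisher's broker to each delivering broker, as
# (broker, output queue taken). "RMI" is the queue toward local subscribers.
paths = {
    "B1": [("B1", "RMI")],
    "B6": [("B1", "B5"), ("B6", "RMI")],
    "B7": [("B1", "B5"), ("B6", "B7"), ("B7", "RMI")],
    "B8": [("B1", "B5"), ("B6", "B8"), ("B8", "RMI")],
}

local_deliveries = {"B1": 1, "B6": 2, "B7": 9, "B8": 5}


def path_delay(broker: str) -> int:
    """Sum input queuing, matching, and output queuing delay over every hop."""
    return sum(
        delays[b]["input"] + delays[b]["match"] + delays[b]["out"][queue]
        for b, queue in paths[broker]
    )


def average_delay() -> float:
    """Delivery-weighted average of ping time plus path delay."""
    total = sum((PING_MS + path_delay(b)) * n for b, n in local_deliveries.items())
    return total / sum(local_deliveries.values())
```

Repeating the same calculation with each downstream broker as the hypothetical publisher location gives the per-candidate delays that Phase 2 compares.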
Phase 2 – Estimating Total Broker Message Rate
- Each broker's bit vector captures the publication deliveries to its local subscribers
- The message rate through a broker is calculated by using the bitwise OR operator to aggregate the bit vectors of all downstream brokers
- [Figure: brokers B1, B6, B7, B8 with publisher P and subscribers]
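A minimal sketch of the OR aggregation, assuming the brokers form a tree rooted at the candidate publisher location and that each broker's bit vector is held as a Python integer. The topology and bit patterns below are illustrative values, not data from the experiments. The sketch also folds in the broker's own vector, which is an assumption.

```python
def aggregate_bits(broker, children, local_bits):
    """OR a broker's own bit vector with those of all brokers downstream of it."""
    bits = local_bits.get(broker, 0)
    for child in children.get(broker, []):
        bits |= aggregate_bits(child, children, local_bits)
    return bits


def publications_through(broker, children, local_bits):
    """Count the traced publications that must pass through the broker."""
    return bin(aggregate_bits(broker, children, local_bits)).count("1")


# Illustrative tree: B1 -> B6 -> {B7, B8}, with 4 traced publications.
children = {"B1": ["B6"], "B6": ["B7", "B8"]}
local_bits = {"B1": 0b0001, "B6": 0b0010, "B7": 0b0110, "B8": 0b1100}
```

A publication wanted by several downstream brokers sets the same bit in each of their vectors, so the OR counts it once, which matches how a single message is forwarded over a shared link.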
Phase 2 – Minimizing Delivery Delay with Weight P%
1. Get ping times from the publisher
2. Calculate the average delivery delay for each downstream broker as if the publisher were positioned there
3. Normalize and sort the delays, then drop candidates whose normalized average delivery delay is greater than 1 − P (0 ≤ P ≤ 1)
4. Calculate the total broker message rate for each remaining candidate broker as if the publisher were positioned there
5. Select the candidate that yields the lowest total system message rate
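Steps 3 to 5 can be sketched as follows. The slide does not say how delays are normalized, so min-max normalization is an assumption here; it has the convenient property that P = 1 keeps only the lowest-delay candidate and P = 0 keeps every candidate. The function name and the candidate figures are hypothetical.

```python
def select_broker(candidates, delay_weight):
    """Pick a broker given {broker: (avg_delay_ms, total_msg_rate)} and P in [0, 1]."""
    delays = {b: d for b, (d, _) in candidates.items()}
    lowest, highest = min(delays.values()), max(delays.values())
    span = highest - lowest
    # Min-max normalization (assumed): best delay maps to 0, worst to 1.
    normalized = {b: (d - lowest) / span if span else 0.0 for b, d in delays.items()}
    threshold = 1.0 - delay_weight
    # Sort by delay and keep the candidates within the threshold.
    kept = [b for b in sorted(normalized, key=normalized.get) if normalized[b] <= threshold]
    # Among the survivors, the lowest total message rate wins.
    return min(kept, key=lambda b: candidates[b][1])


# Illustrative candidates: (average delay in ms, total message rate in msg/s).
candidates = {
    "B1": (268.0, 40.0),
    "B6": (150.0, 30.0),
    "B7": (120.0, 35.0),
    "B8": (200.0, 25.0),
}
```

With these figures, full weight on delay selects B7, zero weight selects the lowest-load broker B8, and a 50% weight drops B1 and B8 on delay and then picks B6 on load.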
Phase 3 – Publisher Migration Protocol
Requirements:
- Transparent to the end-user publisher
- Minimize network and computational overhead
- No additional storage overhead
Phase 3 - Example
[Figure: brokers B1 to B8, subscribers (2x, 4x, 3x, 1x, 9x, 5x, 1x), and publisher P, with routing table updates annotated at three brokers:]
- (1) Update last hop of P to B6-x
- (1) Update last hop of P to B6; (2) Remove all S with B6 as last hop
- (1) Update last hop of P to B6; (2) Remove all S with B5 as last hop; (3) Forward (all) matching S to B5
How to tell when all subscriptions have been processed by B6 so that P can publish again? (DONE)
POP - Intro
- Publisher Optimistic Placement
- Goal: move publishers to the area with the highest number of publication deliveries or the highest concentration of matching subscribers
POP's Methodology Overview
3-phase algorithm:
- Phase 1: Discover the location of publication deliveries by probabilistically tracing live publication messages
  - Ongoing and efficient, with minimal network, computational, and storage overhead
- Phase 2: In a decentralized fashion, pinpoint the broker closest to the set of matching subscribers using trace data from Phase 1
- Phase 3: Migrate the publisher to the broker chosen in Phase 2
  - Same as GRAPE's Phase 3
Phase 1 – Aggregated Replies
- Multiple publication traces are aggregated by: $S_i = S_{new} + (1 - \alpha) S_{i-1}$
- In terms of message overhead, POP introduces 1 reply message per traced publication
- [Figure: brokers B1 to B8 with subscribers (2x, 4x, 3x, 1x, 9x, 5x, 1x) and publisher P; a Publisher Profile Table at each broker on the path; replies carrying the counts 9, 5, and 15 flowing back toward the publisher]
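A minimal sketch of the aggregation step, applying the formula exactly as it appears on the slide to each neighbor's entry. It assumes the Publisher Profile Table maps a neighbor ID to an aggregated delivery count and that a neighbor missing from a reply contributes 0; both the table layout and the function name are hypothetical.

```python
def aggregate(previous, new_counts, alpha):
    """Apply S_i = S_new + (1 - alpha) * S_(i-1) to every neighbor's entry."""
    neighbors = set(previous) | set(new_counts)
    return {
        n: new_counts.get(n, 0) + (1 - alpha) * previous.get(n, 0.0)
        for n in neighbors
    }


# Illustrative replies for two traced publications.
profile = aggregate({}, {"B4": 3, "B6": 15}, alpha=0.5)
profile = aggregate(profile, {"B6": 15}, alpha=0.5)
```

Older traces are discounted by a factor of (1 − α) at each step, so the table gradually forgets delivery patterns that no longer occur.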
Phase 2 – Decentralized Broker Selection Algorithm
- Phase 2 starts when P_threshold publications have been traced
- Goal: pinpoint the broker that is closest to the highest concentration of matching subscribers
  - Using trace information from only a subset of brokers
- The Next Best Broker condition: the next best neighboring broker is the one whose number of downstream subscribers is greater than the sum of all other neighbors' downstream subscribers plus the local broker's subscribers
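The Next Best Broker condition can be sketched as a single test per neighbor. The function name and the subscriber counts below are illustrative assumptions. Returning None stands for the case where no neighbor satisfies the condition, which this sketch takes to mean that the current broker is the chosen location.

```python
def next_best_broker(local_subscribers, neighbor_counts):
    """Return the neighbor that outweighs everything else combined, or None."""
    total = local_subscribers + sum(neighbor_counts.values())
    for neighbor, count in neighbor_counts.items():
        # count > (all other neighbors' counts) + (local subscribers)
        if count > total - count:
            return neighbor
    return None


# Illustrative: one neighbor with 15 downstream subscribers dominates.
hop = next_best_broker(0, {"B4": 3, "B6": 15})

# Illustrative: no single neighbor outweighs the rest, so the search stops.
stay = next_best_broker(1, {"B5": 10, "B7": 9, "B8": 5})
```

At most one neighbor can hold more than half of the total, so the order of iteration does not affect the result.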
Phase 2 – Example
[Figure: brokers B1 to B8 with subscribers (2x, 4x, 3x, 1x, 9x, 5x, 1x), publisher P, and the Publisher Profile Tables at the brokers; the relocation message carries AdvId: P, DestId: null, Broker List: B1, B5, B6]
Experiment Setup
Experiments on both PlanetLab and a cluster testbed
PlanetLab:
- 63 brokers, 1 broker per box
- 20 publishers with publication rates of 10 to 40 msg/min
- 80 subscribers per publisher, 1600 subscribers in total
- P_threshold of 50
- G_threshold of 50
Cluster testbed:
- 127 brokers, up to 7 brokers per box
- 30 publishers with publication rates of 30 to 300 msg/min
- 200 subscribers per publisher, 6000 subscribers in total
- P_threshold of 100
- G_threshold of 100
Average Input Utilization Ratio vs. Subscriber Distribution
[Graph]
Average Delivery Delay vs. Subscriber Distribution
[Graph]
Results Summary
- Under a random workload:
  - No significant performance differences between POP and GRAPE
  - The prioritization metric and weight have almost no impact on GRAPE's performance
- Increasing the number of publication samples in POP:
  - Increases the response time
  - Increases the amount of message overhead
  - Increases the average broker message rate
- GRAPE reduces the input utilization ratio by up to 68%, the average message rate by 84%, the average delivery delay by 68%, and the message overhead relative to POP by 91%
Conclusions and Future Work
- POP and GRAPE move publishers to the highest-rated or the highest number of matching subscribers to:
  - Reduce load in the system (scalability), and/or
  - Reduce the average delivery delay of publication messages (performance)
- Future work: a subscriber relocation algorithm that works in concert with GRAPE
Questions and Notes