Minimal Broker Overlay Design for Content-Based Publish/Subscribe Systems
Naweed Tajuddin, Balasubramaneyam Maniymaran, Hans-Arno Jacobsen
University of Toronto
November 18, 2013, CASCON 2013
MSRG.ORG

MIDDLEWARE SYSTEMS RESEARCH GROUP MSRG.ORG

Introduction to Publish/Subscribe
Messaging platform that decouples information sources and sinks
GooPS (Google P/S)
– AdSense, Docs, YouTube
Yahoo Message Broker (within PNUTS)
– Data replication for Web Apps
– Eventual consistency

Content-Based Publish/Subscribe
1. Advertise
2. Subscribe
3. Publish
[Figure: subscriber (S) and publisher (P)]

Challenges for Content-Based Publish/Subscribe
[Figure: overlay of brokers B1 to B9 with subscriber S1 and publishers P1 and P2; publisher labels: Amazon.ca, Chapters.ca]
Subscription: "Let me know when HP book < $15"
Publications: "Sale! HP: $14.99", "Sale! MW: $15.99", "Sale! OED: $24.99"

Problem Statement
Given a set of publishers and subscribers, how can we design a pub/sub overlay that maximizes performance (low delivery latency) and minimizes cost (number of brokers)?
INPUT
– Brokers available for deployment (processing capacities)
– Publishers (advertisements)
– Subscribers (subscriptions)
– Publication rates per advertisement
OUTPUT
– Set of deployed brokers
– Client-broker allocation
– Overlay topology
CONSTRAINT
– Broker processing capacities not exceeded

Content Space
[Figure: two-dimensional content space with axes price and volume]
sub: [price in (2,10)][volume in (2,7)]
pub: [price = 3][volume = 5]

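The sub/pub example above can be checked with a small Python sketch. The dictionary representation, the `matches` function, and the reading of `(2,10)` as an open interval are assumptions made for illustration; the slide does not specify them.

```python
# Hypothetical representation: a subscription maps each attribute to an
# interval (low, high), read here as open; a publication maps each
# attribute to a single value.
def matches(publication, subscription):
    """Return True if the publication satisfies every subscription predicate."""
    for attribute, (low, high) in subscription.items():
        value = publication.get(attribute)
        if value is None or not (low < value < high):
            return False
    return True

sub = {"price": (2, 10), "volume": (2, 7)}
pub = {"price": 3, "volume": 5}
print(matches(pub, sub))  # True
```

Geometrically, the subscription is a rectangle in the content space and the publication is a point; the publication matches when the point falls inside the rectangle.
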
Similarity Model: Interest
Publisher-subscriber similarity
– Likelihood that a publication will match a subscription
– Geometric intersection between advertisement and subscription over advertisement size
$I = \alpha_{12} / |a_1|$
[Figure: advertisement $a_1$ and subscription $s_2$ overlapping in region $\alpha_{12}$, plotted over attributes att1 and att2]

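A minimal Python sketch of the interest ratio, assuming advertisements and subscriptions are axis-aligned boxes over the same numeric attributes and that the advertisement has non-zero size. The function names and the example values are hypothetical.

```python
def volume(box):
    """Size of a box given as {attribute: (low, high)}."""
    size = 1.0
    for low, high in box.values():
        size *= max(0.0, high - low)
    return size

def intersection(box_a, box_b):
    """Overlap of two boxes over the same attributes (may have zero size)."""
    result = {}
    for attr in box_a:
        low = max(box_a[attr][0], box_b[attr][0])
        high = min(box_a[attr][1], box_b[attr][1])
        result[attr] = (low, max(low, high))  # clamp so empty overlap has length 0
    return result

def interest(advertisement, subscription):
    """I = (size of intersection) / (size of advertisement)."""
    return volume(intersection(advertisement, subscription)) / volume(advertisement)

a1 = {"att1": (0, 10), "att2": (0, 10)}   # size 100
s2 = {"att1": (5, 15), "att2": (0, 4)}    # overlaps a1 in a 5 x 4 region
print(interest(a1, s2))  # 0.2
```

An interest of 0.2 reads as: if publications are spread uniformly over the advertisement, about one in five would match the subscription.
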
Similarity Model: Commonality
Subscriber-subscriber similarity
– Likelihood that publications matching one subscription will match another subscription
– Geometric intersection over subscription size
$C = \alpha_{12}^2 / (|s_1| |s_2|)$
[Figure: subscriptions $s_1$ and $s_2$ overlapping in region $\alpha_{12}$, plotted over attributes att1 and att2]

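The commonality formula can be sketched the same way, again assuming subscriptions are axis-aligned boxes of non-zero size over the same attributes. The helper `overlap_size` and the sample subscriptions are hypothetical.

```python
def overlap_size(box_a, box_b):
    """Size of the intersection of two boxes given as {attribute: (low, high)}."""
    size = 1.0
    for attr in box_a:
        low = max(box_a[attr][0], box_b[attr][0])
        high = min(box_a[attr][1], box_b[attr][1])
        size *= max(0.0, high - low)
    return size

def commonality(s1, s2):
    """C = alpha_12^2 / (|s1| * |s2|); a box overlapped with itself gives its size."""
    alpha = overlap_size(s1, s2)
    return alpha ** 2 / (overlap_size(s1, s1) * overlap_size(s2, s2))

s1 = {"att1": (0, 10), "att2": (0, 10)}   # size 100
s2 = {"att1": (5, 15), "att2": (5, 15)}   # size 100, overlap 5 x 5 = 25
print(commonality(s1, s2))  # 0.0625
```

The measure is symmetric, equals 1 for identical subscriptions, and equals 0 for disjoint ones.
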
Estimating Load Impact
Publishers
– Publication rate
Subscribers
– $\sum_{\text{pub}} (\text{Interest} \times \text{pub rate})$
Brokers
– Sum of load impact of local publishers and subscribers
– Load compensation factor: reserve broker capacity to account for pure forwarding traffic
[Figure: subscriber S1 and publishers P1 and P2, with publication rates of 10 msgs/s and 8 msgs/s]

Solution Overview
Two-phase algorithm
1. Allocate clients across a minimal set of brokers
2. Cluster brokers with high similarity to form a good overlay topology
Both problems are NP-complete; see the paper for the proof

Client-Broker Allocation Algorithm
[Figure: brokers B1 to B5 ranked by capacity and Clients 1 to 5 ranked by the client ranking function; each client is allocated to the most similar broker]

Client Ranking Function
Determines the order in which clients are deployed
– Impacts how broker capacities are consumed
#1 Greatest load impact (GLI)
– Clients ranked by load imposed on broker
#2 Greatest interest (GI)
– Groups consisting of a single publisher-subscriber pair, ranked by interest
#3 Greatest interest per group (GIg)
– Groups consisting of a single publisher and all subscribers with non-zero interest, ranked by greatest interest
#4 Baseline
– Clients ranked and allocated in random order

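A rough Python sketch of how ranking and allocation might fit together under the GLI ranking. The greedy rule used here (reuse the most similar deployed broker that still has room, otherwise deploy the largest unused broker), the `allocate` function, the `similarity` callback, and all numbers are assumptions for illustration; the slides do not give the algorithm at this level of detail.

```python
def allocate(clients, capacities, similarity):
    """clients: {name: load impact}; capacities: {broker: capacity};
    similarity(client, members) -> number (hypothetical, caller-supplied)."""
    ranked_brokers = sorted(capacities, key=capacities.get, reverse=True)
    ranked_clients = sorted(clients, key=clients.get, reverse=True)  # GLI order
    deployed = {}    # broker -> clients allocated to it
    remaining = {}   # broker -> unused capacity
    for client in ranked_clients:
        load = clients[client]
        candidates = [b for b in deployed if remaining[b] >= load]
        if candidates:
            # Prefer an already deployed broker, picking the most similar one.
            target = max(candidates, key=lambda b: similarity(client, deployed[b]))
        else:
            unused = [b for b in ranked_brokers
                      if b not in deployed and capacities[b] >= load]
            if not unused:
                raise ValueError(f"no broker can host {client}")
            target = unused[0]
            deployed[target] = []
            remaining[target] = capacities[target]
        deployed[target].append(client)
        remaining[target] -= load
    return deployed

# Toy similarity: number of co-located clients sharing the same topic.
topic = {"c1": "x", "c2": "y", "c3": "y", "c4": "x"}

def similarity(client, members):
    return sum(topic[m] == topic[client] for m in members)

clients = {"c1": 9, "c2": 8, "c3": 2, "c4": 1}
capacities = {"B1": 12, "B2": 12, "B3": 12}
result = allocate(clients, capacities, similarity)
print(result)  # {'B1': ['c1', 'c4'], 'B2': ['c2', 'c3']}
```

In this toy run only two of the three brokers are deployed, and the two small clients join the broker that already hosts a client with the same topic.
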
Overlay Topology Construction
1. Assign a weight to every link equal to the commonality of its broker pair
2. Compute the maximum spanning tree to obtain the overlay topology
[Figure: brokers A, B, C and D hosting advertisements a1 to a4 and subscriptions s1 to s7]

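The two steps above can be sketched with Kruskal's algorithm run on weights in descending order. This is one standard way to compute a maximum spanning tree, not necessarily the one the authors used; the commonality weights below are made up.

```python
def max_spanning_tree(brokers, weights):
    """weights: {(u, v): commonality}. Returns the chosen links as (u, v) pairs."""
    parent = {b: b for b in brokers}

    def find(b):
        # Union-find lookup with path halving.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    tree = []
    # Consider the heaviest links first; keep a link only if it joins two components.
    for (u, v), _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v))
    return tree

brokers = ["A", "B", "C", "D"]
weights = {  # hypothetical broker-pair commonalities
    ("A", "B"): 0.9, ("A", "C"): 0.2, ("A", "D"): 0.1,
    ("B", "C"): 0.5, ("B", "D"): 0.3, ("C", "D"): 0.8,
}
tree = max_spanning_tree(brokers, weights)
print(tree)  # [('A', 'B'), ('C', 'D'), ('B', 'C')]
```

The result always has one link fewer than the number of brokers, so the overlay stays acyclic while keeping the most similar broker pairs adjacent.
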
Evaluation Overview / Experiment Steps
Algorithms implemented in Java and simulated using the JiST discrete event simulator
1. Execute algorithms to compute the overlay design
2. Configure the pub/sub system simulator overlay according to the overlay design
3. Run the experiment and record statistics

Overlay Performance
Message count: number of messages generated per publication (or number of broker hops a publication must travel to reach all interested subscribers)

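One way to read the message count metric on a tree overlay is as the number of overlay links a publication crosses when it is forwarded only toward brokers with interested subscribers. The Python sketch below follows that reading; the adjacency representation, the function name, and the example topology are assumptions, and delivery from a broker to its local clients is not counted.

```python
def message_count(adjacency, source, targets):
    """Count overlay links used to reach every broker in `targets` from `source`.

    adjacency: {broker: [neighbouring brokers]} describing a tree.
    """
    def visit(node, parent):
        hops = 0
        reached = node in targets
        for neighbour in adjacency[node]:
            if neighbour == parent:
                continue
            sub_hops, sub_reached = visit(neighbour, node)
            if sub_reached:
                # The link to this neighbour is needed, plus whatever lies beyond it.
                hops += sub_hops + 1
                reached = True
        return hops, reached

    return visit(source, None)[0]

# Hypothetical chain overlay A - B - C - D.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(message_count(adjacency, "A", {"C"}))       # 2
print(message_count(adjacency, "A", {"B", "D"}))  # 3
```

Links shared by several subscribers are counted once, which is why placing similar clients on nearby brokers lowers the count.
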
Conclusions
Optimization framework for pub/sub overlay construction
Similarity framework: interest and commonality
– Tools for overlay construction
– Leverage existing semantics of content-based pub/sub
Load modeling framework
– Broker congestion significantly reduced
Client allocation and overlay topology construction algorithms
– Low-latency overlay topologies at reduced cost
Future work
– Support additional constraints and incorporate network congestion
– Account for physical network and broker capacity model

Thank You! Questions?
** Extra Slides **
Similarity Model: Commonality
Subscriber-subscriber similarity
– Likelihood that publications matching one subscription will match another subscription
– Geometric intersection over subscription size
$C = \alpha_{12}^2 / (|s_1| |s_2|)$
[Figure: two panels over attributes att1 and att2, one showing subscriptions $s_1$ and $s_2$ overlapping in $\alpha_{12}$, the other showing subscriptions $s_3$ and $s_4$ overlapping in $\alpha_{34}$; one label reads class = “sale”]

Evaluation Overview
Algorithms implemented in Java
Pub/sub system built using the JiST discrete event simulator
Workload details
– publishers
– subscribers
– pubs
– Pub rates: 1-10 msgs/s
– Broker capacity: 1000 msgs/s
[Table columns: # of Advs, # of Subs, # of Pubs]

Overlay Cost

Maximum Peak Load

Load and Congestion Effects

LCF Cost–Performance Tradeoff

Overlay Design: Related Work
[Figure: two overlay diagrams of brokers B1 to B9 with subscriber S1 and publishers P1 and P2]
Rewire overlay [Baldoni et al., 2007] [Yoon et al., 2013]
Move publishers [Cheung et al., 2010]

Estimating Load Impact of Clients
Publisher
– Publication rate
Subscriber
– $\sum (\text{Interest} \times \text{pub rate})$
[Figure: subscription $s_{S1}$ intersecting advertisements $a_{P1}$ and $a_{P2}$ in regions $i_1$ and $i_2$; publication rates labelled on P1 and P2, one of which reads 8 msgs/s]
$i_{S1-P1} = 0.15$, $i_{S1-P2} = 0.2$
Load impact (S1) = 8(0.15) + 4(0.2) = 2 msgs/s

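The load impact arithmetic for S1 can be reproduced with a few lines of Python. The interests and rates are the terms of the slide's sum, 8(0.15) + 4(0.2); which publisher carries which rate is inferred from the order of those terms and should be treated as an assumption, as should the function name.

```python
def subscriber_load_impact(interests, pub_rates):
    """Sum over publishers of (interest in the publisher) * (its publication rate)."""
    return sum(interests[p] * pub_rates[p] for p in interests)

interests = {"P1": 0.15, "P2": 0.2}   # interest of S1 in each publisher
pub_rates = {"P1": 8, "P2": 4}        # msgs/s, taken from the terms of the slide's sum

load_s1 = subscriber_load_impact(interests, pub_rates)
print(round(load_s1, 6))  # 2.0
```

A publisher's own load impact is simply its publication rate, so no separate function is needed for it.
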
Estimating Broker Load
– Sum of load impact of local publishers and subscribers
– Pure forwarding traffic
– Load compensation factor: reserve broker capacity by a factor to account for pure forwarding traffic
[Figure: subscriber S1 and publishers P1 and P2, with publication rates of 10 msgs/s and 8 msgs/s]

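A small Python sketch of a capacity check with a load compensation factor (LCF). The slide does not say how the factor is applied; this sketch assumes the LCF is the fraction of capacity held back for pure forwarding traffic, so only `(1 - lcf) * capacity` is available to local clients. The function names and the numbers are hypothetical.

```python
def broker_load(local_client_loads):
    """Estimated load from local clients: the sum of their load impacts."""
    return sum(local_client_loads)

def fits(local_client_loads, capacity, lcf):
    """Assumption: a fraction `lcf` of capacity is reserved for forwarding traffic."""
    usable = (1.0 - lcf) * capacity
    return broker_load(local_client_loads) <= usable

capacity = 1000   # msgs/s
lcf = 0.25        # hypothetical reserve, leaving 750 msgs/s for local clients

print(fits([400, 300], capacity, lcf))  # True  (700 <= 750)
print(fits([500, 300], capacity, lcf))  # False (800 > 750)
```

A larger reserve leaves more headroom for forwarding but spreads clients over more brokers, which is the cost-performance tension the LCF controls.
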