Subscription Partitioning and Routing in Content-based Publish/Subscribe Networks
Yi-Min Wang, Lili Qiu, Dimitris Achlioptas, Gautam Das, Paul Larson, and Helen J. Wang
Microsoft Research
DISC 2002, Toulouse, France
Motivation
Network Architecture
Subscription Partitioning
Event Space Partitioning for Equality Predicates
Simulation Study with Stock-quote Subscription Data
Equality Predicates
Hash predicates to get a uniform distribution
Treat the hashed domain as the event space
Use Event Space Partitioning
  A subscription is a point; it does not intersect multiple sub-spaces
Use over-partitioning for better load balancing
  Use an offline greedy algorithm to assign buckets to servers for load balancing
  Use an indirection table to dynamically map buckets to servers for load re-balancing
Use Bloom filters to further reduce traffic
  Fast detection of true negatives at the expense of a (very low) false-positive rate
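The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the server count, the SHA-256 based hash, the heaviest-bucket-first greedy rule, and the Bloom filter parameters are all hypothetical choices, and the function and class names are invented for the example. Using the Bloom filter to skip events that match no subscription is likewise one plausible reading of "reduce traffic".

```python
import hashlib
import heapq

NUM_SERVERS = 4             # hypothetical cluster size
OVER_PARTITION_RATIO = 10   # buckets per server (over-partitioning)
NUM_BUCKETS = NUM_SERVERS * OVER_PARTITION_RATIO


def stable_hash(value, salt=""):
    """Deterministic 64-bit hash (built-in hash() is randomized per process)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def bucket_of(symbol):
    """Hash an equality-predicate value (e.g. a stock symbol) to one bucket."""
    return stable_hash(symbol) % NUM_BUCKETS


def greedy_assign(bucket_loads, num_servers):
    """Offline greedy: give the heaviest remaining bucket to the lightest server.

    Returns an indirection table where table[bucket] is a server id.
    """
    table = [0] * len(bucket_loads)
    heap = [(0, s) for s in range(num_servers)]  # (current load, server id)
    heapq.heapify(heap)
    order = sorted(range(len(bucket_loads)), key=lambda b: -bucket_loads[b])
    for b in order:
        load, server = heapq.heappop(heap)
        table[b] = server
        heapq.heappush(heap, (load + bucket_loads[b], server))
    return table


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        return [stable_hash(item, salt=str(i)) % self.num_bits
                for i in range(self.num_hashes)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item):
        return all(self.bits[p] for p in self._positions(item))


# Subscription counts for a few symbols, taken from the simulation slide.
subscriptions = {"MSFT": 58518, "CSCO": 43073, "INTC": 25903, "AOL": 21836}

bucket_loads = [0] * NUM_BUCKETS
subscribed = BloomFilter()
for symbol, count in subscriptions.items():
    bucket_loads[bucket_of(symbol)] += count
    subscribed.add(symbol)

table = greedy_assign(bucket_loads, NUM_SERVERS)


def route(symbol):
    """Return the server for an event, or None if no subscription can match."""
    if not subscribed.might_contain(symbol):
        return None  # true negative: the event is not forwarded at all
    return table[bucket_of(symbol)]


print("MSFT ->", route("MSFT"))
```

Because each subscription hashes to exactly one bucket, an event is forwarded to at most one server, and re-balancing only needs to rewrite entries of `table` rather than rehash subscriptions.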
Summary of simulation results
Actual MSN Money log: 1.48M subscriptions with 0.29M unique filters over 21,741 stock symbols
Zipf-like distribution: MSFT 58,518, CSCO 43,073, $COMPX 32,485, $INDU 31,111, INTC 25,903, $INX 22,519, AOL 21,836, LU 20,056, ORCL 18,038, DELL 16,211, etc.
Simulate 100M new subscriptions from 43,734 symbols
  Scaled-up Zipf-like distribution
  Perturbation and permutation
  Uniform distribution
50 servers with over-partitioning ratio = 10
Without load re-balancing
  Load imbalance (max/min) ranged from 1.41 to 6.66 (Uniform case)
With imbalance threshold of 2.0
  Re-balancing was triggered only 5 times, each time involving re-assignment of up to 3 buckets and migration of up to 0.7% of subscriptions.
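The threshold-triggered re-balancing described above might look like the following sketch. The slides do not say which buckets were moved or how they were chosen, so the heuristic here (move a bucket from the most loaded server to the least loaded one, picking the bucket closest to half the load gap) is an assumption, as are the function names and the `max_moves` cap. The loads in the example are made up for illustration.

```python
def server_loads(table, bucket_loads, num_servers):
    """Sum bucket loads per server using the indirection table."""
    loads = [0] * num_servers
    for bucket, server in enumerate(table):
        loads[server] += bucket_loads[bucket]
    return loads


def imbalance(loads):
    """Load imbalance as max/min; infinite if some server holds nothing."""
    lowest = min(loads)
    return float("inf") if lowest == 0 else max(loads) / lowest


def rebalance(table, bucket_loads, num_servers, threshold=2.0, max_moves=3):
    """Re-map buckets in place until max/min <= threshold or max_moves is hit.

    Returns the list of (bucket, from_server, to_server) moves performed.
    """
    moves = []
    while len(moves) < max_moves:
        loads = server_loads(table, bucket_loads, num_servers)
        if imbalance(loads) <= threshold:
            break
        src = loads.index(max(loads))
        dst = loads.index(min(loads))
        gap = loads[src] - loads[dst]
        # Only buckets lighter than the gap strictly reduce the src/dst spread.
        candidates = [b for b, s in enumerate(table)
                      if s == src and 0 < bucket_loads[b] < gap]
        if not candidates:
            break
        bucket = min(candidates, key=lambda b: abs(bucket_loads[b] - gap / 2))
        table[bucket] = dst  # only the indirection table entry changes
        moves.append((bucket, src, dst))
    return moves


# Illustrative numbers only: 6 buckets on 3 servers.
demo_table = [0, 0, 0, 1, 1, 2]
demo_loads = [50, 30, 20, 25, 15, 20]
print("before:", imbalance(server_loads(demo_table, demo_loads, 3)))
print("moves: ", rebalance(demo_table, demo_loads, 3))
print("after: ", imbalance(server_loads(demo_table, demo_loads, 3)))
```

Only the subscriptions in the moved buckets migrate, which is consistent with the small migration fractions reported on the slide.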
Filter Set Partitioning for Range Predicates
Related Work
Summary