MIDDLEWARE SYSTEMS RESEARCH GROUP
Modelling Performance Optimizations for Content-based Publish/Subscribe
Alex Wun and Hans-Arno Jacobsen
Department of Electrical and Computer Engineering
Department of Computer Science
University of Toronto
Matching Performance Optimizations
- Often based on exploiting similarities between subscriptions
- Avoid unnecessary subscription and predicate evaluations
- Can we abstract these optimizations?
  - Formalize content-based matching plans (the order of predicate evaluations)
  - Theoretically quantify the performance of matching plans
  - Compare heuristic techniques with optimal matching plans
Commonality Model
- For a subscription set, a commonality expression is either:
  - a Disjunctive Commonality Expression
    - Per-link matching
    - DNF subscriptions
  - or a Conjunctive Commonality Expression
    - Shared predicates
    - Clustering on subscription classes or attributes
    - “Pruning” strategies (e.g., number of attributes)
- A set of commonality expressions is a subscription topology.
Link-Group Topology
- Depth-first algorithm to determine the probabilistically optimal matching plan [Greiner2006] in
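To make "probabilistically optimal matching plan" concrete, here is a minimal sketch for the simplest case only: one conjunctive subscription whose predicates are assumed independent. It is not the depth-first algorithm of [Greiner2006]. It uses the classic rule of ordering predicates by cost divided by the probability of failing, and checks the result against brute force. The `(cost, p_true)` pairs and the function names are hypothetical, chosen for illustration.

```python
from itertools import permutations

def expected_cost(plan):
    """Expected cost of a short-circuit AND over (cost, p_true) pairs."""
    total, reach = 0.0, 1.0          # reach = probability this predicate is evaluated
    for cost, p_true in plan:
        total += reach * cost
        reach *= p_true              # we continue only if the predicate was true
    return total

def ratio_order(predicates):
    """Order by cost / P(false); predicates that never fail go last."""
    def key(pred):
        cost, p_true = pred
        return cost / (1.0 - p_true) if p_true < 1.0 else float("inf")
    return sorted(predicates, key=key)

# Hypothetical predicates: (evaluation cost, probability of being true)
predicates = [(1.0, 0.9), (2.0, 0.1), (1.0, 0.5)]
plan = ratio_order(predicates)
brute_force = min(permutations(predicates), key=expected_cost)

print(plan, expected_cost(plan))
print(list(brute_force), expected_cost(brute_force))
```

On this input the ratio rule and the exhaustive search agree. Brute force is factorial in the number of predicates, so it serves only as a check on small cases.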
Link-Group Topology
[Figure: plot comparing low selectivity and high selectivity; series marked X and o]
Link-Cluster Topology...
Multi-Cluster-Link Topology...
Cluster Topology
Multi-Link Topology...
Dynamic Programming (not very efficient)...
Arbitrary Topologies
Cluster Topology
- Dramatic scalability effects of clustering in CPS
- Observed trend depends on the proportion of commonalities, not the number of predicates...
[Figure: plot with series marked X and o]
Applications – DoS Resilience
[Figure labels: Normal Subscription Migration]
Applications – DoS Resilience
[Figure labels: High Commonality, Low Commonality, High Commonality]
Related Work
- Carzaniga et al. [Carzaniga2001]: formal notation for covering
- Mühl [Mühl2002]: formal syntax for CPS routing
- Li et al. [Li2005] and Campailla et al. [Campailla2001]: BDD-based CPS matching algorithms
Conclusion
- Probabilistically optimal matching plans are known for some subscription topologies
- Scalable CPS matching depends heavily on commonalities
  - Focus on abstracting commonalities
- Future work
  - Express covering, correlation, …
  - Arbitrary subscription topologies
  - Metrics for expressing compression due to the existence of commonalities
References
[Greiner2006] Finding optimal satisficing strategies for And-Or trees, Artificial Intelligence
[Carzaniga2001] Design and Evaluation of a Wide-Area Event Notification Service, ACM Transactions on Computer Systems
[Mühl2002] Large-Scale Content-Based Publish/Subscribe Systems, PhD Thesis
[Li2005] A Unified Approach to Routing, Covering and Merging in Publish/Subscribe Systems based on Modified Binary Decision Diagrams, ICDCS
[Campailla2001] Efficient filtering in Publish-Subscribe Systems using Binary Decision, International Conference on Software Engineering
Extra Slides
Table-based versus Tree-based
[Figure: panels Naive, Table-based, Tree-based]
Disjunctive Commonalities
- “Shortcut” unnecessary subscription/predicate evaluations
- Examples:
  - Per-link matching [Banavar1999, Carzaniga2003]
  - DNF subscriptions
[Figure annotations: "Given some publication P"; "Computed by matching algorithm"]
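The disjunctive shortcut can be illustrated with per-link matching: a link must receive the publication as soon as any one of its subscriptions matches, so the remaining subscriptions on that link need not be evaluated. The sketch below is an illustration under assumptions, not the algorithm of the cited systems. The range-predicate subscription format and the names `matches` and `per_link_match` are hypothetical.

```python
def matches(sub, pub):
    """A subscription is a conjunction of range predicates: attr -> (lo, hi)."""
    return all(attr in pub and lo <= pub[attr] <= hi
               for attr, (lo, hi) in sub.items())

def per_link_match(links, pub):
    """Return the links to forward on and the number of subscriptions evaluated."""
    forward, evaluated = [], 0
    for link, subs in links.items():
        for sub in subs:
            evaluated += 1
            if matches(sub, pub):
                forward.append(link)
                break                # disjunctive shortcut: skip the rest of this link
    return forward, evaluated

# Hypothetical subscription set grouped by outgoing link
links = {
    "A": [{"price": (0, 10)}, {"price": (5, 20)}, {"volume": (100, 200)}],
    "B": [{"price": (50, 60)}],
}
pub = {"price": 7, "volume": 150}
forward, evaluated = per_link_match(links, pub)
print(forward, evaluated)
```

Here link A is resolved after its first subscription, so only two of the four subscriptions are evaluated in total.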
Conjunctive Commonalities
- “Shortcut” unnecessary subscription/predicate evaluations
- Examples:
  - Shared predicates
  - Clustering on subscription classes or attributes
  - “Pruning” strategies (e.g., number of attributes)
[Figure annotations: "Given some publication P"; "Computed by matching algorithm"]
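The conjunctive shortcut can be sketched with shared predicates: if every subscription in a cluster requires the same predicate, that predicate is evaluated once, and a false result prunes the whole cluster. This is a minimal illustration under assumptions. The cluster layout (one shared predicate plus per-subscription residual predicates), the attribute names, and `cluster_match` are hypothetical.

```python
def cluster_match(clusters, pub):
    """clusters: list of (shared_predicate, [(sub_name, residual_predicate), ...])."""
    matched, evaluated = [], 0
    for shared, subs in clusters:
        evaluated += 1
        if not shared(pub):
            continue                 # conjunctive shortcut: the whole cluster fails
        for name, residual in subs:
            evaluated += 1
            if residual(pub):
                matched.append(name)
    return matched, evaluated

# Hypothetical clusters keyed on a shared "symbol" predicate
clusters = [
    (lambda p: p.get("symbol") == "IBM",
     [("s1", lambda p: p["price"] > 100),
      ("s2", lambda p: p["price"] < 50)]),
    (lambda p: p.get("symbol") == "MSFT",
     [("s3", lambda p: p["price"] > 10),
      ("s4", lambda p: p["price"] > 20),
      ("s5", lambda p: p["price"] > 30)]),
]
pub = {"symbol": "IBM", "price": 120}
matched, evaluated = cluster_match(clusters, pub)
print(matched, evaluated)
```

For this publication the second cluster is rejected with a single predicate evaluation, so its three subscriptions are never examined.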