
Talk at the 4th International Workshop on Distributed Event-Based Systems at the Conference ICDCS 2005 On the Benefits of Non-Canonical Filtering in Publish/Subscribe.


1 Talk at the 4th International Workshop on Distributed Event-Based Systems at the Conference ICDCS 2005
On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems
Sven Bittner and Annika Hinze, 10 June 2005

2 2/21 Structure
• Motivation
• Canonical Transformation
• Non-Canonical Filtering
• Experiments
• Summary and Future Work
Annika Hinze – On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems


4 4/21 Motivation: Current Assumptions
• Expressive filtering
  – All subscriptions might be transformed to DNFs (or are purely conjunctive)
• Efficient filtering
  – Utilisation of indexes
  – Filtering on conjunctions (DNFs) is most efficient
  – Main memory solutions are most efficient
• Scalable filtering
  – Filtering is obtained on designated servers
  – Large main memories are available
Motivation Canonical Transformation Non-Canonical Filtering Experiments Summary

5 5/21 Motivation: Our Point of View
• Main memory algorithms are only as scalable as the resources provided
  → Efficiency is only one quality measure
  → Matching algorithms should consider their memory usage (scalability)
• Claim
  – Filtering on arbitrary Boolean subscriptions is
    · More expressive (i.e., richer subscription language)
    · More scalable (i.e., requires less memory)
    · Only slightly less efficient (i.e., slower matching times)

6 6/21 Structure
• Motivation
• Canonical Transformation
• Non-Canonical Filtering
• Experiments
• Summary and Future Work

7 7/21 Transformations: Example
[Figure: example of a transformation; the image is not preserved in this transcript]

8 8/21 Transformations: Implications
• Efficiency
  + Faster filtering algorithms applicable
  – Filtering on more subscriptions, common sub-expressions
• Scalability
  + Storage of Boolean formulae not required
  – More subscriptions to store
→ Which influence outweighs the other?
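The trade-off above can be made concrete with a small sketch of a DNF transformation. This is an illustration only, not the authors' implementation: the nested-tuple representation, the function name `to_dnf`, and the example subscription (an AND over three ORs of two predicates each) are assumptions chosen for the example.

```python
from itertools import product

def to_dnf(expr):
    """Return the DNF of expr as a list of conjunctions.

    expr is either a predicate name (str) or a tuple (op, child, child, ...)
    with op in {"and", "or"}. This representation is an assumption made for
    illustration. Each conjunction is a frozenset of predicate names.
    """
    if isinstance(expr, str):
        # Leaf: a single predicate is a conjunction of size one.
        return [frozenset([expr])]
    op, *children = expr
    parts = [to_dnf(child) for child in children]
    if op == "or":
        # A disjunction concatenates the conjunctions of its children.
        return [conj for part in parts for conj in part]
    if op == "and":
        # A conjunction distributes over every combination of the
        # children's conjunctions; this is where the blow-up happens.
        return [frozenset().union(*combo) for combo in product(*parts)]
    raise ValueError(f"unknown operator: {op}")

# Hypothetical subscription with 6 predicates.
subscription = ("and", ("or", "p1", "p2"), ("or", "p3", "p4"), ("or", "p5", "p6"))
dnf = to_dnf(subscription)
print(len(dnf))                    # number of conjunctive subscriptions to store
print(sum(len(c) for c in dnf))    # predicate references held by those conjunctions
```

For this subscription, one Boolean expression over 6 predicates becomes 8 conjunctions holding 24 predicate references in total, which is the "more subscriptions to store" side of the trade-off.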

9 9/21 Transformations: Origin - DBMS
• Utilised for query execution
  – Transform to canonical expression (e.g. DNF)
  – Simplify each element in disjunction separately
  – Create access plans and execute cheapest
→ Useful, since efficient data access is crucial
→ Several advantages
  · Only few queries at one time → no memory problems
  · Data storage is known in advance → data access might be optimised

10 10/21 Transformations: Why in ENS?
• ENSs show converse problem definition
  – Large subscription numbers (queries)
  – Events not known in advance (data)
  – Subscriptions are not optimised (in current approaches)
→ Memory usage even higher
→ Computations for more subscriptions
→ Is a transformation useful in ENSs?

11 11/21 Structure
• Motivation
• Canonical Transformation
• Non-Canonical Filtering
• Experiments
• Summary and Future Work

12 12/21 Non-Canonical Filtering: Trees
• (almost) as shown: [Figure: subscription tree; the image is not preserved in this transcript]
• Internal representation
  – Predicate identifiers in leaves (indexes for predicates)
  – Space efficient encoding (in future)
  – Currently encoded on byte level, i.e.,
    · 1 byte each: No. of children, operator
    · 2 bytes: width of children
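One possible reading of the byte-level encoding is sketched below. The slide only states the field sizes (1 byte each for the number of children and the operator, 2 bytes for the width of children), so everything else here is an assumption: the field order, little-endian byte order, the operator codes, reading "width" as the encoded size in bytes of each child, and leaves stored as 4-byte predicate identifiers.

```python
import struct

OPS = {"and": 0, "or": 1}   # assumed operator codes
LEAF_WIDTH = 4              # assumed: a leaf is a 4-byte predicate identifier

def encode(node):
    """Encode a tree given as an int (predicate id) or (op, child, ...)."""
    if isinstance(node, int):
        return struct.pack("<I", node)
    op, *children = node
    encoded = [encode(child) for child in children]
    # 1 byte: number of children, 1 byte: operator.
    header = struct.pack("<BB", len(children), OPS[op])
    # 2 bytes per child: width (encoded size in bytes) of that child.
    widths = b"".join(struct.pack("<H", len(e)) for e in encoded)
    return header + widths + b"".join(encoded)

def evaluate(buf, matching, start=0, width=None):
    """Evaluate an encoded tree against the set of matching predicate ids."""
    if width is None:
        width = len(buf)
    if width == LEAF_WIDTH:
        # An inner node with at least one child needs 8 bytes or more,
        # so a width of exactly 4 bytes can only be a leaf.
        (pred_id,) = struct.unpack_from("<I", buf, start)
        return pred_id in matching
    n_children, op = struct.unpack_from("<BB", buf, start)
    widths = struct.unpack_from(f"<{n_children}H", buf, start + 2)
    pos = start + 2 + 2 * n_children
    results = []
    for w in widths:
        results.append(evaluate(buf, matching, pos, w))
        pos += w
    return all(results) if op == OPS["and"] else any(results)

tree = ("and", ("or", 1, 2), ("or", 3, 4), ("or", 5, 6))
encoded = encode(tree)
print(len(encoded))                      # size of the encoded tree in bytes
print(evaluate(encoded, {1, 3, 5}))      # one predicate matches in every OR
print(evaluate(encoded, {1, 2, 3, 4}))   # the third OR has no matching predicate
```

Under these assumptions the six-predicate tree occupies 50 bytes, and storing each child's width lets the evaluator step over subtrees without decoding them first.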

13 13/21 Non-Canonical Filtering: 2 Steps
• Predicate matching
  – Determine matching predicates
• Subscription matching
  – Determine candidate subscriptions (min 1 match)
  – Evaluate their Boolean combinations
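The two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the talk: predicates are simple attribute/value equality tests held in a dictionary index, subscriptions are nested tuples over predicate ids using only AND and OR, and the class and method names (`Matcher`, `predicate`, `subscribe`, `match`) are hypothetical.

```python
from collections import defaultdict

def evaluate(tree, matching):
    """Evaluate a tree (predicate id or (op, child, ...)) against matching ids."""
    if isinstance(tree, int):
        return tree in matching
    op, *children = tree
    results = (evaluate(child, matching) for child in children)
    return all(results) if op == "and" else any(results)

def leaves(tree):
    """Yield every predicate id used in a subscription tree."""
    if isinstance(tree, int):
        yield tree
    else:
        for child in tree[1:]:
            yield from leaves(child)

class Matcher:
    def __init__(self):
        self.predicate_ids = {}               # (attribute, value) -> predicate id
        self.subscriptions = {}               # subscription id -> tree
        self.pred_to_subs = defaultdict(set)  # predicate id -> subscription ids

    def predicate(self, attribute, value):
        """Register an equality predicate (assumed form) and return its id."""
        key = (attribute, value)
        return self.predicate_ids.setdefault(key, len(self.predicate_ids))

    def subscribe(self, sub_id, tree):
        self.subscriptions[sub_id] = tree
        for pred_id in leaves(tree):
            self.pred_to_subs[pred_id].add(sub_id)

    def match(self, event):
        # Step 1: predicate matching via the index.
        matching = {self.predicate_ids[item]
                    for item in event.items() if item in self.predicate_ids}
        # Step 2a: candidates are subscriptions with at least one matching predicate.
        candidates = set().union(*(self.pred_to_subs[p] for p in matching))
        # Step 2b: evaluate the Boolean combination of each candidate.
        return {s for s in candidates
                if evaluate(self.subscriptions[s], matching)}

m = Matcher()
a = m.predicate("symbol", "ACME")
b = m.predicate("exchange", "NYSE")
c = m.predicate("exchange", "LSE")
m.subscribe("s1", ("and", a, ("or", b, c)))
m.subscribe("s2", ("and", a, b))
print(m.match({"symbol": "ACME", "exchange": "LSE"}))
```

Restricting step 2 to candidates is only safe here because the sketch uses AND and OR alone; a subscription built from these operators cannot be fulfilled without at least one matching predicate.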

14 14/21 Structure
• Motivation
• Canonical Transformation
• Non-Canonical Filtering
• Experiments
• Summary and Future Work

15 15/21 Experiments: Initial Evaluation
• Comparison of Step 2 of matching approaches
  – Step 1 utilises same indexes
  – Canonical counting algorithm (count no. of predicates)
    · Original: compare for all subscriptions
    · Variant: compare for candidate subscriptions only
  – Our non-canonical approach
• Subscription characterisation
  – Number of predicates P (=6)
  – DNF consists of disjunctive elements (8)
  – Each element contains predicates (3)
[Figure: example subscription tree combining predicates p1 to p6 with AND and OR]
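The difference between the two counting variants can be sketched as follows. This is a simplified illustration rather than the code used in the experiments: the class name `CountingMatcher`, the data layout, and the example subscription (an AND over three ORs of two predicates, which yields 8 conjunctions of 3 predicates each) are assumptions.

```python
from collections import defaultdict
from itertools import product

class CountingMatcher:
    def __init__(self, conjunctions):
        # conjunctions: list of (owner subscription id, set of predicate ids),
        # i.e. the disjunctive elements produced by a DNF transformation.
        self.owner = [owner for owner, _ in conjunctions]
        self.required = [len(preds) for _, preds in conjunctions]
        self.pred_to_conj = defaultdict(list)
        for idx, (_, preds) in enumerate(conjunctions):
            for pred_id in preds:
                self.pred_to_conj[pred_id].append(idx)

    def match_original(self, matching):
        # Original: keep a hit counter per conjunction and compare
        # the counters of ALL conjunctions with their required counts.
        hits = [0] * len(self.required)
        for pred_id in matching:
            for idx in self.pred_to_conj.get(pred_id, ()):
                hits[idx] += 1
        return {self.owner[i] for i in range(len(hits))
                if hits[i] == self.required[i]}

    def match_variant(self, matching):
        # Variant: only conjunctions that received at least one hit
        # (the candidates) are compared.
        hits = defaultdict(int)
        for pred_id in matching:
            for idx in self.pred_to_conj.get(pred_id, ()):
                hits[idx] += 1
        return {self.owner[i] for i, h in hits.items()
                if h == self.required[i]}

# DNF of the hypothetical subscription (1 OR 2) AND (3 OR 4) AND (5 OR 6).
conjunctions = [("s1", set(combo)) for combo in product((1, 2), (3, 4), (5, 6))]
cm = CountingMatcher(conjunctions)
print(len(conjunctions))
print(cm.match_original({1, 3, 5}), cm.match_variant({1, 3, 5}))
print(cm.match_original({1, 2, 3, 4}), cm.match_variant({1, 2, 3, 4}))
```

Both variants return the same result; they differ in how many counters are compared per event, which matters once the number of stored conjunctions is much larger than the number of candidates.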

16 16/21 Experiments: Setting
Parameter                                             Value
CPU speed                                             1.8 GHz
Total memory                                          512 MB
No. of subscriptions                                  2,000 – 5,000,000
No. of original predicates per subscription (P)       6 to 10
No. of disjunctive elements after transformation      8 to 32
Used Boolean operators                                AND, OR
Matching predicates per event (M)                     5,000 – 10,000

17 17/21 Experiments: Results - Scalability
[Plots: P=6, M=5,000 and P=10, M=5,000; the images are not preserved in this transcript]
• Counting algorithm
  – Sharp bends denote when available main memory resources are exhausted
  – The fewer subscriptions are created, the better the scalability
• Non-canonical approach
  – Available main memory sufficient
  – Scalability independent of transformations

18 18/21 Experiments: Results - Efficiency
[Plots: P=6, M=5,000 and P=10, M=5,000; the images are not preserved in this transcript]
• Counting algorithm
  – Original approach shows linearly increasing matching times
  – Variant becomes more efficient in case of large subscription numbers
• Non-canonical approach
  – Filtering more efficient than variant of counting algorithm
  – Difference becomes more pronounced when DNFs become larger

19 19/21 Experiments: Results - Summary
1. Transformations to DNFs drastically reduce the scalability of filter algorithms
   → Memory requirements for transformed conjunctive subscriptions outweigh storage space for Boolean ones
2. Filtering on several conjunctive subscriptions instead of arbitrary Boolean ones decreases efficiency
   → Impact of more conjunctive (simpler) subscriptions on filtering performance outweighs higher matching costs of Boolean ones

20 20/21 Structure
• Motivation
• Canonical Transformation
• Non-Canonical Filtering
• Experiments
• Future Work

21 21/21 Future Work
• Theoretical evaluation of memory requirements
  – Characterisation of subscriptions
  – Statements like “when to use which approach”
• Further practical experiments
  – Prove correctness of theoretical evaluation
  – Analyse more sophisticated settings

22 Thank you for your attention! Contact: Sven Bittner, Annika Hinze {s.bittner, a.hinze}@cs.waikato.ac.nz

