Supporting Arbitrary Boolean Subscriptions in Distributed Publish/Subscribe Systems
Sven Bittner, 28 November 2006
Department of Computer Science, The University of Waikato, New Zealand
Talk at the 3rd International Middleware Doctoral Symposium (MDS 2006)
This research is partially funded by the NZ Government under the New Zealand International Doctoral Research Scholarships (NZIDRS) programme.
2/32 Structure of Talk
Motivation: Publish/Subscribe
Problem Description
Filtering in Central Components
Routing in the Distributed System
Summary and Outlook
Sven Bittner – Supporting Arbitrary Boolean Subscriptions in Distributed Pub/Sub Systems
3/32 Publish/Subscribe Systems
[Figure: publishers publish event messages to the pub/sub system; subscribers register subscriptions and the system sends them notifications. The system is a network of brokers B1 to B9 with routing tables. Labels: subscription index structures; filtering and routing.]
Motivation Problem Definition Central Filtering Routing Optimizations Summary
4/32 Messages & Subscriptions
Event messages
– Describe a state change/real-world event
– Attribute-value pairs
Subscriptions
– Describe interests
– Arbitrary Boolean combination of predicates
Example event message: title: Harry Potter; endingWithin: 6 hours; condition: new; price: 15.00
[Figure: example subscription tree combining the predicates title like "Harry Potter", endingWithin < 1 day, condition = NEW, condition = USED, price < 10.0, and price < 15.0 with AND and OR operators]
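The two notions on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of the tree, the `matches` helper, and the substring reading of `like` are hypothetical, and the pairing of the condition and price predicates is one plausible reading of the flattened diagram rather than something the transcript confirms.

```python
import operator

# Comparison operators available to predicates. "like" is approximated by a
# case-insensitive substring test (an assumption, not the talk's definition).
OPS = {
    "=": operator.eq,
    "<": operator.lt,
    "like": lambda actual, pattern: pattern.lower() in str(actual).lower(),
}

def matches(node, event):
    """Evaluate a subscription tree against one event message."""
    kind = node[0]
    if kind == "pred":
        _, attr, op, value = node
        # A predicate on a missing attribute counts as unfulfilled.
        return attr in event and OPS[op](event[attr], value)
    results = [matches(child, event) for child in node[1:]]
    return all(results) if kind == "and" else any(results)

# Event message from the slide as attribute-value pairs (endingWithin in hours).
event = {"title": "Harry Potter", "endingWithin": 6, "condition": "NEW", "price": 15.00}

# Assumed tree: NEW is paired with price < 15.0 and USED with price < 10.0.
subscription = (
    "and",
    ("pred", "title", "like", "Harry Potter"),
    ("pred", "endingWithin", "<", 24),
    ("or",
     ("and", ("pred", "condition", "=", "NEW"), ("pred", "price", "<", 15.0)),
     ("and", ("pred", "condition", "=", "USED"), ("pred", "price", "<", 10.0))),
)

print(matches(subscription, event))                      # False: 15.00 is not < 15.0
print(matches(subscription, {**event, "price": 14.99}))  # True
```

Under this reading the slide's own event just misses the subscription, because its price equals the bound of the strict comparison.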
5/32 Context: Filtering
Filtering algorithm
– Determination of all subscriptions matching an incoming event message (messages not stored)
– Indexation of subscriptions and predicates
– Support of required subscription language (Boolean)
6/32 Context: Routing
Routing algorithm
– Determination of all brokers with matching subscriptions
– Distribution of subscriptions to build event routing tables
– Subscriptions as routing entries (where to route messages)
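A minimal sketch of "subscriptions as routing entries", assuming a broker keeps a list of (subscription, next hop) pairs and forwards an event to every neighbour that has at least one matching entry. The class `RoutingTable`, its methods, and the broker names are hypothetical illustrations, not an interface from the talk. Subscriptions are plain Python callables here to keep the example short.

```python
from typing import Any, Callable, Dict, List, Optional, Set, Tuple

Event = Dict[str, Any]
Subscription = Callable[[Event], bool]

class RoutingTable:
    """Routing entries of one broker: each subscription points to a next hop."""

    def __init__(self) -> None:
        self.entries: List[Tuple[Subscription, str]] = []

    def add(self, subscription: Subscription, next_hop: str) -> None:
        self.entries.append((subscription, next_hop))

    def next_hops(self, event: Event, came_from: Optional[str] = None) -> Set[str]:
        # Forward to every neighbour with a matching entry, except the
        # neighbour the event arrived from.
        return {hop for sub, hop in self.entries
                if hop != came_from and sub(event)}

table = RoutingTable()
table.add(lambda e: e.get("condition") == "NEW" and e.get("price", float("inf")) < 15.0, "B2")
table.add(lambda e: e.get("condition") == "USED", "B3")
table.add(lambda e: e.get("price", float("inf")) < 5.0, "B3")

event = {"title": "Harry Potter", "condition": "NEW", "price": 12.0}
print(table.next_hops(event))  # {'B2'}
```

The linear scan over all entries is the naive baseline. The filtering algorithms discussed later in the talk exist precisely to avoid evaluating every entry for every event.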
7/32 Context: Routing Optimization
Optimization goal
– Improvement of routing process, e.g.,
  – Higher throughput
  – Less memory for routing tables
Manipulation of routing entries
[Figure: routing table with subscription entries (S1, S9, …) pointing to brokers (B2, B3, …)]
8/32 Structure of Talk
Motivation: Publish/Subscribe
Problem Description
Filtering in Central Components
Routing in the Distributed System
Summary and Outlook
9/32 Current Approaches (1)
Observations
– Current systems only support conjunctive subscriptions
– Restrictions exploited in
  – Filtering algorithms
  – Routing optimizations
No consideration of other operators in subscriptions
10/32 Current Approaches (2)
Motivation
– Arbitrary Boolean subscriptions can be converted to disjunctive normal form (DNF), which is exponential in size
– Every conjunction is handled as a separate subscription
Approach as in database management systems (conversion of query restrictions)
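The exponential growth can be made concrete with a short sketch. It assumes subscription trees encoded as nested tuples (`("and", ...)`, `("or", ...)`, `("pred", attribute, operator, value)`); this encoding and the helper names `to_dnf`, `count_predicates`, and `disjunction_chain` are illustrative assumptions. The conversion is the textbook distribution of AND over OR, without any simplification of the result.

```python
from itertools import product

def to_dnf(node):
    """Return the DNF of a subscription tree as a list of conjunctions.

    Each conjunction is a frozenset of predicate tuples.
    """
    kind = node[0]
    if kind == "pred":
        return [frozenset([node])]
    child_dnfs = [to_dnf(child) for child in node[1:]]
    if kind == "or":
        # A disjunction simply concatenates the conjunctions of its children.
        return [conj for dnf in child_dnfs for conj in dnf]
    # A conjunction distributes over its children: one result conjunction per
    # combination of child conjunctions. This is where the blow-up happens.
    return [frozenset().union(*combo) for combo in product(*child_dnfs)]

def count_predicates(node):
    """Number of predicate leaves in the original tree."""
    if node[0] == "pred":
        return 1
    return sum(count_predicates(child) for child in node[1:])

def disjunction_chain(n):
    """Build (a0 OR b0) AND (a1 OR b1) AND ... with n disjunctions."""
    return ("and", *(("or", ("pred", f"a{i}", "=", 1), ("pred", f"b{i}", "=", 1))
                     for i in range(n)))

for n in (2, 5, 10):
    tree = disjunction_chain(n)
    dnf = to_dnf(tree)
    # columns: n, predicates in tree, conjunctions in DNF, predicate occurrences in DNF
    print(n, count_predicates(tree), len(dnf), sum(len(conj) for conj in dnf))
```

For ten binary disjunctions the tree holds 20 predicates, while the DNF consists of 1024 conjunctions with 10240 predicate occurrences, each of which would be registered as a separate conjunctive subscription.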
11/32 Publish/Subscribe vs. DBMS (1)
Important differences:
– Number of simultaneous "data requests"
  – DBMSs: relatively small number of queries
  – Pub/sub systems: large number of subscriptions
  Even higher load after conversion
– Query processing
  – DBMSs: query optimization on canonical form based on known data (access plans, cost estimation, etc.)
  – Pub/sub systems: events are unknown, no optimization applied
12/32 Publish/Subscribe vs. DBMS (2)
Considering data storage:
– Subscriptions ≠ queries
– Data (base) ↔ subscription (base)
– Queries ↔ event messages
Messages are in canonical form (attribute-value pairs)
So why convert subscriptions as well?
Questionable whether to take the conversion approach in pub/sub (problem size explosion)
13/32 Hypothesis
The internal support of arbitrary Boolean subscriptions reduces the memory requirements compared to current conjunctive solutions without degrading the system efficiency.
14/32 Steps to Take
Development and analysis of
– Filtering algorithm (central broker components)
– Routing optimization (distributed system)
15/32 Structure of Talk
Motivation: Publish/Subscribe
Problem Description
Filtering in Central Components
Routing in the Distributed System
Summary and Outlook
16/32 Steps Undertaken (1)
1. Application scenario analysis [BH06b]: online auctions
– Analysis of distributions on eBay
– Identification of typical subscription classes
Semi-realistic data set
Used in later analysis
17/32 Steps Undertaken (2)
2. Filtering algorithm for arbitrary Boolean subscriptions [BH05a]
– Generic solution
– Extends general-purpose conjunctive counting algorithm [YGM94, AJL02]
– Filters on conjunctions the same way as counting approach
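For orientation, the general idea behind conjunctive counting algorithms can be sketched as follows: index subscriptions by predicate, count fulfilled predicates per subscription, and report a match when the count equals the subscription's number of predicates. This is a simplified illustration of the conjunctive baseline only, not the Boolean extension of [BH05a] and not the exact algorithms of [YGM94, AJL02]. The class `CountingFilter` and its methods are hypothetical names, and the predicates are evaluated one by one, where a real system would use per-attribute index structures.

```python
import operator
from collections import defaultdict

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

class CountingFilter:
    """Counting-style matcher for purely conjunctive subscriptions."""

    def __init__(self):
        self.pred_to_subs = defaultdict(list)  # predicate -> subscription ids
        self.pred_count = {}                   # subscription id -> number of predicates

    def subscribe(self, sub_id, predicates):
        # Predicates are (attribute, operator, value) tuples, shared across
        # subscriptions so that each distinct predicate is evaluated only once.
        unique = set(predicates)
        self.pred_count[sub_id] = len(unique)
        for pred in unique:
            self.pred_to_subs[pred].append(sub_id)

    def match(self, event):
        hits = defaultdict(int)
        for (attr, op, value), subs in self.pred_to_subs.items():
            if attr in event and OPS[op](event[attr], value):
                for sub_id in subs:
                    hits[sub_id] += 1
        # A conjunction matches when all of its predicates are fulfilled.
        return {s for s, n in hits.items() if n == self.pred_count[s]}

f = CountingFilter()
f.subscribe("s1", [("condition", "=", "NEW"), ("price", "<", 15.0)])
f.subscribe("s2", [("condition", "=", "USED"), ("price", "<", 10.0)])
f.subscribe("s3", [("price", "<", 15.0)])

print(sorted(f.match({"condition": "NEW", "price": 12.0})))  # ['s1', 's3']
```

Note that the shared predicate `price < 15.0` is stored and tested once even though two subscriptions use it.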
18/32 Steps Undertaken (3)
3. Characterization scheme and memory analysis [BH05b]
– Description of subscription patterns
– Analysis of counting, cluster [HCKW90, FJL+01] and Boolean approach
– Determination of point where Boolean approach requires less memory
Already one disjunction might favor Boolean approach
19/32 Steps Undertaken (4)
4. Practical verification/efficiency analysis
– Confirmation of theoretical results
– Efficiency is similar to counting approach
Summary: Boolean solution
– More space-efficient filtering
– Similar time efficiency properties
– Scheme helps with the decision between Boolean and conjunctive algorithm
20/32 Structure of Talk
Motivation: Publish/Subscribe
Problem Description
Filtering in Central Components
Routing in the Distributed System
Summary and Outlook
21/32 Subscription Pruning (1)
Idea of pruning [BH06a]
– Remove parts of subscription trees
Creates more general subscription
[Figure: the example subscription tree (title like "Harry Potter", endingWithin < 1 day, condition = NEW, condition = USED, price < 10.0, price < 15.0, combined with AND and OR) before and after pruning; the pruned tree no longer contains the predicate price < 10.0]
22/32 Subscription Pruning (2)
Consequences of pruning
– Less complex (time and space) subscriptions (+)
– More events forwarded (−)
[Figure: the example subscription tree before and after pruning, as on the previous slide]
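Both consequences can be seen in a small sketch. It assumes the nested-tuple tree encoding used for illustration here, pairs condition = USED with price < 10.0 (one reading of the flattened diagram), and restricts pruning to children of AND nodes, since dropping a conjunct can only widen what a subscription matches. The helper names `matches` and `prune` are hypothetical, and the title predicate is left out for brevity.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt}

def matches(node, event):
    """Evaluate a subscription tree against one event message."""
    if node[0] == "pred":
        _, attr, op, value = node
        return attr in event and OPS[op](event[attr], value)
    results = [matches(child, event) for child in node[1:]]
    return all(results) if node[0] == "and" else any(results)

def prune(node, path):
    """Return a copy of the tree without the subtree at `path`.

    `path` is a non-empty sequence of 0-based child indexes from the root.
    Only children of AND nodes may be removed, so the result is always at
    least as general as the original.
    """
    kind, children = node[0], list(node[1:])
    if kind == "pred":
        raise ValueError("path runs past a predicate leaf")
    index = path[0]
    if len(path) == 1:
        if kind != "and":
            raise ValueError("only children of AND nodes may be pruned")
        del children[index]
    else:
        children[index] = prune(children[index], path[1:])
    # An operator left with a single child is replaced by that child.
    return children[0] if len(children) == 1 else (kind, *children)

subscription = (
    "and",
    ("pred", "endingWithin", "<", 24),
    ("or",
     ("and", ("pred", "condition", "=", "NEW"), ("pred", "price", "<", 15.0)),
     ("and", ("pred", "condition", "=", "USED"), ("pred", "price", "<", 10.0))),
)

# Remove "price < 10.0": root child 1 (OR), its child 1 (AND), its child 1.
pruned = prune(subscription, (1, 1, 1))

used_book = {"endingWithin": 6, "condition": "USED", "price": 12.0}
print(matches(subscription, used_book))  # False
print(matches(pruned, used_book))        # True: a false positive after pruning
```

The pruned tree has one predicate and one operator fewer, and the used book at 12.0 is now forwarded although the original subscription rejects it.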
23/32 Application of Pruning (1)
Pruning of routing entries
Un-optimized routing:
[Figure: broker network with routing tables and a subscriber]
24/32 Application of Pruning (2)
Pruning of routing entries
No pruning in local broker
Ensure correct filtering
[Figure: broker network with routing tables and a subscriber]
25/32 Application of Pruning (3)
Pruning of routing entries
Less complex subscriptions
More time and space efficient routing
[Figure: broker network with routing tables and a subscriber]
26/32 Application of Pruning (4)
Pruning of routing entries
But more general subscriptions
More forwarded event messages (false positives)
More event messages to route/process
[Figure: broker network with routing tables and a subscriber]
27/32 Practical Pruning
Question
Which subscription, and which part of its subscription tree, should be pruned first?
Answer
– Four heuristics [BH06c] based on influence on
  – Memory usage
  – Filter efficiency
  – Network load (selectivity)
  – Network load (selectivity & popularity)
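To give a feel for what a selectivity-oriented ranking could look like, the sketch below estimates, on a sample of events, how much each possible pruning increases the fraction of matched events and picks the pruning with the smallest increase. This is a hypothetical illustration only: the transcript does not define the four heuristics of [BH06c], and the sample-based estimate, the helper names, and the nested-tuple tree encoding are all assumptions.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt}

def matches(node, event):
    """Evaluate a subscription tree against one event message."""
    if node[0] == "pred":
        _, attr, op, value = node
        return attr in event and OPS[op](event[attr], value)
    results = [matches(child, event) for child in node[1:]]
    return all(results) if node[0] == "and" else any(results)

def prune(node, path):
    """Remove the subtree at `path` (0-based child indexes, child of an AND)."""
    kind, children = node[0], list(node[1:])
    if kind == "pred":
        raise ValueError("path runs past a predicate leaf")
    if len(path) == 1:
        if kind != "and":
            raise ValueError("only children of AND nodes may be pruned")
        del children[path[0]]
    else:
        children[path[0]] = prune(children[path[0]], path[1:])
    return children[0] if len(children) == 1 else (kind, *children)

def candidates(node, path=()):
    """Yield the paths of all subtrees that are direct children of an AND."""
    if node[0] == "pred":
        return
    for i, child in enumerate(node[1:]):
        if node[0] == "and":
            yield path + (i,)
        yield from candidates(child, path + (i,))

def selectivity(node, sample):
    """Fraction of sample events matched (the sample must be non-empty)."""
    return sum(matches(node, e) for e in sample) / len(sample)

def best_pruning(node, sample):
    """Path whose removal adds the fewest matches; None if nothing is prunable."""
    base = selectivity(node, sample)
    return min(candidates(node),
               key=lambda p: selectivity(prune(node, p), sample) - base,
               default=None)

subscription = ("and", ("pred", "condition", "=", "NEW"), ("pred", "price", "<", 15.0))
sample = [{"condition": "NEW", "price": p} for p in (10.0, 20.0, 30.0)]

print(best_pruning(subscription, sample))  # (0,)
```

In this sample every event is NEW, so dropping the condition predicate adds no false positives, whereas dropping the price predicate would let all three events through. A popularity-aware variant would additionally weight each candidate by how often the affected events actually occur.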
28/32 Experiments (In Progress)
Evaluation in online auction setting
Different distributions in subscriptions
Applicability to conjunctive subscriptions
Comparison to conjunctive routing optimization
29/32 Structure of Talk
Motivation: Publish/Subscribe
Problem Description
Filtering in Central Components
Routing in the Distributed System
Summary and Outlook
30/32 Summary
Motivation
– Publish/subscribe systems
– Online auction scenario
Need for arbitrary Boolean subscriptions
Problem definition
– Only conjunctions supported
– Conversion adopted from DBMSs
– Does conversion make sense?
– Hypothesis: No, if disjunctions occur!
31/32 Summary
Central broker components
– Boolean filtering algorithm
– Characterization scheme
– Analysis, comparison, and verification
Boolean approach is favorable
Distributed system
– Novel optimization: subscription pruning
– First experiments: valuable optimization
Lower memory requirements, higher throughput
32/32 Future Work
Future work
– Writing up
– Finish experiments
– Heuristic based on multicriteria optimization
Future work (not within PhD)
– Analyze other applications
– Optimize solutions
– Open source prototype
Thank you for your attention!
Sven Bittner, s.bittner@cs.waikato.ac.nz
Talk: Supporting Arbitrary Boolean Subscriptions in Distributed Publish/Subscribe Systems
Selected further reading:
[BH05a] S. Bittner and A. Hinze. On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW '05), Columbus, USA, June 2005.
[BH05b] S. Bittner and A. Hinze. A Detailed Investigation of Memory Requirements for Pub/Sub Filtering Algorithms. In Proceedings of the 13th International Conference on Cooperative Information Systems (CoopIS 2005), Agia Napa, Cyprus, 31 October-4 November 2005.
[BH06a] S. Bittner and A. Hinze. Pruning Subscriptions in Distributed Pub/Sub Systems. In Proceedings of the 29th Australasian Computer Science Conference (ACSC 2006), Hobart, Australia, 16-19 January 2006.
[BH06b] S. Bittner and A. Hinze. Event Distributions in Online Book Auctions. Technical Report 03/2006, Computer Science Department, Waikato University, New Zealand, February 2006.
[BH06c] S. Bittner and A. Hinze. Dimension-Based Subscription Pruning for Publish/Subscribe Systems. In Proceedings of the 26th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW '06), Lisbon, Portugal, July 2006.
[BH06d] S. Bittner and A. Hinze. Optimizing Pub/Sub Systems by Advertisement Pruning. In Proceedings of the 8th International Symposium on Distributed Objects and Applications (DOA 2006), Montpellier, France, 30 October-1 November 2006.
34/32 Selected Other References
[AJL02] G. Ashayer, H.-A. Jacobsen, and H. Leung. Predicate Matching and Subscription Matching in Publish/Subscribe Systems. In Proceedings of the 22nd IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW '02), Vienna, Austria, July 2-5, 2002.
[CRW01] A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Design and Evaluation of a Wide-Area Event Notification Service. ACM Transactions on Computer Systems (TOCS), 19(3):332-383, 2001.
[FJL+01] F. Fabret, A. Jacobsen, F. Llirbat, J. Pereira, K. Ross, and D. Shasha. Filtering Algorithms and Implementation for Very Fast Publish/Subscribe Systems. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data (SIGMOD 2001), USA, May 2001.
[HCKW90] E. N. Hanson, M. Chaabouni, C.-H. Kim, and Y.-W. Wang. A Predicate Matching Algorithm for Database Rule Systems. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data (SIGMOD 1990), Atlantic City, USA, May 23-25, 1990.
[LHJ05] G. Li, S. Hou, and H.-A. Jacobsen. A Unified Approach to Routing, Covering and Merging in Publish/Subscribe Systems based on Modified Binary Decision Diagrams. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS '05), USA, June 2005.
[MF01] G. Muehl and L. Fiege. Supporting Covering and Merging in Content-Based Publish/Subscribe Systems: Beyond Name/Value Pairs. IEEE Distributed Systems Online (DSOnline), 2(7), 2001.
[TE04] P. Triantafillou and A. Economides. Subscription Summarization: A New Paradigm for Efficient Publish/Subscribe Systems. In Proceedings of the 24th IEEE International Conference on Distributed Computing Systems (ICDCS '04), Tokyo, Japan, March 2004.
[WQV+04] Y.-M. Wang, L. Qiu, C. Verbowski, D. Achlioptas, G. Das, and P. Larson. Summary-based Routing for Content-based Event Distribution Networks. ACM SIGCOMM Computer Communication Review, 34(5):59-74, 2004.
[YGM94] T. W. Yan and H. Garcia-Molina. Index Structures for Selective Dissemination of Information Under the Boolean Model. ACM Transactions on Database Systems (TODS), 19(2):332-364, 1994.