Event Processing: An Academic Perspective. Hans-Arno Jacobsen, Bell University Laboratory Chair, Middleware Systems Research Group, University of Toronto.

Presentation transcript:

Event Processing: An Academic Perspective
Hans-Arno Jacobsen
Bell University Laboratory Chair
Middleware Systems Research Group
University of Toronto

MIDDLEWARE SYSTEMS RESEARCH GROUP, EPTS Symposium, Orlando

Querying the Future

Amazon to Chapters to You....
- Monday, October 10th, in cyberspace
- Your book “...” is available at.... $10 off
- Thursday, November 15th, in Toronto

Location Constraint Matching
[Figure not preserved in this transcript; labels: points P1 to P5, A, distances d and dA]

BPM & SLA Monitoring
[Figure: workflow with the steps "Validate request", "Get destination", the decision "Far?" (Y/N), "Find flight", and "Find train"]
SLA annotations on the figure:
- cost < $0.02
- service time < 3s
- Only trusted partners

The Common Denominator
- These applications are driven by asynchronous state transitions
  - Information becomes available
  - Change in state in the environment
  - Something happens
- These state transitions are events
- Events are disseminated and filtered against queries
[Figure labels: events, queries]

Agenda
- What abstractions to use?
- Our approach
- Overview of paradigm
- The PADRES project

What Abstractions are Best Suited to Support this Modus Operandi?
- Databases
  - Great for managing historic data
  - But what about future data?
- Data streams
  - Great for managing structured streams of tuples
  - But what about unstructured, multi-typed, sporadic events from many sources?
- Rule-based expert systems
  - Great for inference and reasoning
  - But what about managing large numbers of fine-grained filters in distributed environments?
Take this cum grano salis

Our Approach: Publish/Subscribe
More specifically:
- Content-based publish/subscribe
  - For fine-grained filtering
- Combined with content-based message routing
  - For selective data dissemination
- Extended with persistence et al.
  - For historic data access

Content-based Publish/Subscribe
[Figure: Publisher, Broker(s), Subscriber; flows labelled Publications, Subscriptions, Notification]
- Publishers (stock markets): NYSE, NASDAQ, TSX
- Publications: IBM=84, MSFT=27, INTC=19, JNJ=58, ORCL=12, HON=24, AMGN=58
- Subscriptions: IBM > 85, ORCL < 10, JNJ > 60

That’s Like Database Querying!!
- Query and subscription are very similar.
- Data tuples and publication are very similar.
- However, the two problem statements are inverse.
[Figure labels: data tuples, query, sets of tuples (about past); subscriptions, publication, sets of tuples (about future)]

The Content-based Model
- Language and data model
  - Conjunctive Boolean functions over predicates
  - Predicates are attribute-operator-value triples: [class,=,trigger]
  - Subscriptions are conjunctions of predicates: [class,=,trigger],[appl,=,payroll],[gid,=,g001]
  - Publications are sets of attribute-value pairs: [class,trigger],[appl,printer],[gid,g007]
- Matching semantics
  - A subscription matches if all its predicates are matched
[Figure labels: events, P/S, notifications]
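The matching semantics on this slide can be illustrated with a minimal Python sketch. The tuple and dictionary encodings, the operator table, and the function names are assumptions made for illustration; they are not the PADRES or ToPSS implementation.

```python
import operator

# Operators assumed for illustration; the slides show "=", ">", "<" and "!=".
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt, "!=": operator.ne}

def predicate_matches(predicate, publication):
    """A predicate (attribute, operator, value) is matched when the
    publication carries the attribute and the comparison holds."""
    attribute, op, value = predicate
    if attribute not in publication:
        return False
    return OPS[op](publication[attribute], value)

def subscription_matches(subscription, publication):
    """A subscription is a conjunction: all of its predicates must be matched."""
    return all(predicate_matches(p, publication) for p in subscription)

# The subscription and publication shown on the slide.
subscription = [("class", "=", "trigger"), ("appl", "=", "payroll"), ("gid", "=", "g001")]
publication = {"class": "trigger", "appl": "printer", "gid": "g007"}

print(subscription_matches(subscription, publication))  # False: appl and gid differ
```

With the slide's own values the subscription does not match, because only the class predicate is satisfied.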

Publish/Subscribe Matching Problem
- Given a set of subscriptions, S, and a publication, e, return all s in S matched by e.
- e is referred to as event or publication
- Splitting hairs
  - An event is a state transition of interest in the environment
  - A publication is the information about e submitted to the publish/subscribe system
- Simple problem statement, widely applicable, and lots of open questions
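The problem statement can be written down as a linear scan over S. This is only a baseline sketch, not one of the ToPSS matching algorithms. The subscription identifiers s1 to s3 and the second event (IBM=90) are hypothetical; the thresholds and IBM=84 come from the stock example earlier in the talk.

```python
import operator

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}

def matches(subscription, event):
    """True when every (attribute, operator, value) predicate holds for the event."""
    return all(attr in event and OPS[op](event[attr], value)
               for attr, op, value in subscription)

def match_all(subscriptions, event):
    """Return the ids of all s in S matched by e, checking each s in turn."""
    return [sid for sid, sub in subscriptions.items() if matches(sub, event)]

# Subscriptions from the stock example; the ids are made up for this sketch.
S = {
    "s1": [("IBM", ">", 85)],
    "s2": [("ORCL", "<", 10)],
    "s3": [("JNJ", ">", 60)],
}

print(match_all(S, {"IBM": 84}))  # []      (84 does not exceed 85)
print(match_all(S, {"IBM": 90}))  # ['s1']  (hypothetical later quote)
```

The scan costs time proportional to the number of subscriptions for each event.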

Content-based Message Routing
[Figure: Publisher and Subscriber connected through the broker network; steps 1. Advertise, 2. Subscribe, 3. Publish]
- A: [class, =, stock], [name, =, HP], [price, >, 50]
- S: [class, =, stock], [name, =, *], [price, >, 50]
- P: [class, stock], [name, HP], [price, 55]
Keywords on the slide: Event-Based, Decoupled, Flexible, Responsive, Content Routing, Declarative

ToPSS - The Toronto Publish/Subscribe System Family [2000 – present]
- Matching algorithms
  - Language expressiveness vs. efficient matching
- Routing protocols
  - Network architectures & scalability
- Higher level abstractions
  - Workflow execution
  - Monitoring
Family members:
- ToPSS (matching)
- S-ToPSS (semantic)
- X-ToPSS (XML matching)
- A-ToPSS (approximate)
- persistent-ToPSS (subject spaces)
- L-ToPSS (location-based)
- M-ToPSS (mobile)
- Ad hoc-ToPSS (ad hoc networking)
- Federated-ToPSS (federation of ToPSS brokers)
- Rb-ToPSS (rule-based)
- P2P-ToPSS (peer-to-peer)
- LB-ToPSS (load balancing)
- FT-ToPSS (fault tolerance)
- Historic-ToPSS (historic data)
- CS-ToPSS (composite subs)
- BPEL-ToPSS (BPEL execution)
- JS-ToPSS (job scheduling)

PADRES Publish/Subscribe System Project [2003-present]
- Peng, Alex, David, aRno, Eli, Serge
- PAdres is Distributed REsource Scheduling
- Publish/subscribe Applied to Distributed Resource Scheduling

The PADRES ESB Stack
[Figure not preserved in this transcript; surviving label: Business Process Execution Layer]

Subscription Language & Publication Data Model
- Subscriptions are conjuncts of predicates
  - [class,=,job_info], [workflow,=, foo_example], [instanceID,=,$x], [job,=,A], [status,=,succ]
  - isPresent operator, $X (variable binding), composite subscriptions, string operators etc.
- Publications are sets of attribute-value pairs
  - [class, trigger], [workflow, foo_example], [instanceID, 54]

PADRES Broker Architecture
[Figure not preserved in this transcript; component labels: BrokerCore, InputQueue, QueueHandler, OutputQueues, Matching Engine (JESS), Publication / Subscription Routing Table, Controller, Lifecycle Manager, Overlay Manager, Broker_Control Message, RMITransport Handler, BrokerRMI, ClientRMI, JMS, DB]

PADRES Features
[Figure: broker overlay with brokers A to F]
- Composite Events
- Historic Access
- Management
- Robustness
- Load Balancing
- Security

Limitations of Acyclic Overlays
[Figure: acyclic overlay of brokers with a publisher (P) and subscribers]
Sensitive to
- Congestion
- Imbalanced workloads
- Broker failures
- Overlay changes

Robustness Through Cycles
[Figure: cyclic overlay with publisher, subscriber, publications (P), and a congested link]
- Robust
- Self-healing
- Adaptive routing
- Flexible overlay

End to End Delay (preliminary results)
- 30 brokers on 15 machines (on LAN)
- 20 publishers (2400 msg/min)
- 30 subscribers (2000 total)
- Topology with average connection degree of 4
- Fixed routing does not exploit cycles
- Dynamic routing does exploit cycles

Load Balancing Framework
[Figure not preserved in this transcript; labels: Sub, Pub, offloading broker, load-accepting broker]
- Local Load Balancing
- Global Load Balancing

Load Balancing Components
[Figure: Load Balancer with the components below; PIE messages flow between them; outputs: subscribers to offload, target broker]
- DETECTOR: triggers load balancing if overload or uneven load distribution is detected by examining 3 performance metrics
- MEDIATOR (mediation and migration protocols):
  - Establish and teardown of load balancing sessions between two brokers/clusters
  - Coordinate transparent subscriber migration
- OFFLOAD ALGORITHMS (Input, Match, Output): calculate the set of subscribers to offload for balancing the performance metric in question, based on load information about the subscriptions and the load-accepting broker
- LOAD ESTIMATION (PRESS): estimates load of subscriptions

Output Utilization Ratio (no LB)
[Plot not preserved in this transcript; annotations: "Overload!", "Edge broker dies"]

End to End Delivery Delay
[Plot not preserved in this transcript; annotation: "Edge broker dies"]
- Increasing delay due to output overload at BOTH edge broker and cluster-head broker

Output Utilization Ratio (LB)
[Plot not preserved in this transcript; annotations: "Overload!", "Load balancing converges"]

Composite Subscription
[Figure: expression tree with AND at the root over (S1 OR S2), (S3 OR S4), and S5]
CS={{S1 OR S2} AND {S3 OR S4} AND S5}
- A composite event is the constellation of events being detected by the composite subscription.
- S are atomic subscriptions, i.e., they are satisfied by a single, multi-attribute event.
- Composite subscriptions (CS) are used for event correlation, in-network filtering, and the detection of complex events.
- Applications: BPM, BAM (see Thursday’s talk)
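The expression CS={{S1 OR S2} AND {S3 OR S4} AND S5} can be evaluated as a small tree. The sketch below assumes a nested-tuple encoding chosen for illustration and only checks Boolean satisfaction over the set of atomic subscriptions matched so far. It does not model time windows, variable bindings, or distributed detection.

```python
# Nested tuples ("AND" / "OR", child, ...) are an illustrative encoding,
# not PADRES syntax. Leaves are atomic subscription ids.
CS = ("AND", ("OR", "S1", "S2"), ("OR", "S3", "S4"), "S5")

def satisfied(node, matched):
    """Evaluate a composite subscription against the set of atomic
    subscriptions that have been matched by events so far."""
    if isinstance(node, str):            # leaf: atomic subscription
        return node in matched
    op, *children = node
    results = [satisfied(child, matched) for child in children]
    return all(results) if op == "AND" else any(results)

print(satisfied(CS, {"S1", "S4", "S5"}))  # True: each AND branch is covered
print(satisfied(CS, {"S1", "S2", "S5"}))  # False: neither S3 nor S4 matched
```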

Topology-based CS Routing
[Figure: distributed overlay broker network with brokers B1 to B6, publishers P1, P2, ..., Pk, a subscriber S, and a DB; legend: P = publishers, S = subscribers; CS is split into CS’ and hS3, and CS’ into S1 and S2]
- CS={{S1 AND S2} AND hS3}
- CS’={S1 AND S2}

Composite Event Detection
[Figure: the same overlay (brokers B1 to B6, publishers P1, P2, ..., Pk, subscriber S, DB); publications P1 and P2 are combined into P12, and P12 and P3 into P123]
- CS={{S1 AND S2} AND hS3}
- CS’={S1 AND S2}

Optimizations
- Dynamic composite subscription routing
- Runtime adjustments of CS joint points

Cost Model I
- Definition 1. Data set size of atomic subscription S [formula not preserved in this transcript]
- R_i is the message rate at the source
- S = p_1 and … and p_n
[Figure labels: p/s; P_1 … P_n; A_i, A_j, A_k; A_pj, S_pj; S]

Cost Model II
- Definition 2. Data set size of composite subscription CS [formula not preserved in this transcript]

CS Routing Optimization
- Objectives
  - Minimize network traffic
  - Minimize notification delay
[Figure: panels (a) and (b), each showing brokers 1, 2, 3 with Adv 1, Adv 2, and CS={{S1 AND S2}]

Estimate CE Detection Costs
- Traffic-based model
  - Traffic Cost (S1) = [formula not preserved in this transcript]
  - Delay Cost (S1) = [formula not preserved in this transcript]
[Figure: brokers 1, 2, 3 with Adv 1, Adv 2, and CS={{S1 AND S2}]

Optimized Subscription Routing
- Determine joint points based on cost model
[Figure: distributed overlay broker network with brokers B1 to B7, publishers P1, P2, ..., Pk, subscriber S, and a DB; CS is split into CS’ and hS3, and CS’ into S1 and S2]
- CS={{S1 AND S2} AND hS3}

Optimized Subscription Routing
- Dynamically maintain joint points
[Figure: distributed overlay broker network with brokers B1 to B7, publishers P1, P2, ..., Pk, subscriber S, and a DB; CS is split into CS’ and hS3, and CS’ into S1 and S2]
- CS={{S1 AND S2} AND hS3}

Summary
- Publish/Subscribe solves a problem inverse to database query processing
- Publish/Subscribe is well suited as abstraction for many event processing based applications
- Publish/Subscribe is a paradigm for data-centric networking & selective information dissemination
- PADRES realizes a distributed publish/subscribe system offering many novel features
- PADRES can serve as ESB and ISB (Internet Service Bus); more on this Thursday at 8AM
- There are plenty of open research questions

Asking EPTS and Industry for
- Data sets
  - Subscription data sets for various domains
  - Workload characteristics
  - For active research we are looking for SLA examples, business process examples
- Interesting and relevant problems
  - Challenges of tomorrow
  - Specific problems

Acknowledgements
- Graduate students, visitors, and PDFs currently working on PADRES: Alex Cheung, Guoli Li, Jian Li, Vinod Muthusamy, Alex Wun, Songlin Hu, Reza Sherafat
- Partner from CA directly involved in project: Serge Mankovskii
- And many alumni visit

“A forum dedicated to the dissemination of original research, the discussion of practical insights, and the reporting on relevant experience relating to event-based computing previously scattered across several communities.”
- In cooperation with (approval pending):
- July 2nd – 4th, 2008, Roma, Italy
- Submission deadline: March 15th, 2008
- Location: Dipartimento di Informatica e Sistemistica “A. Ruberti”, Sapienza Università di Roma

Questions?

Routing in Cyclic Networks

General Overlay
[Figure: cyclic overlay with publisher, subscriber, publications (P), and a congested link]
- Robust
- Self-healing
- Adaptive routing
- Flexible overlay

Routing in General Overlays I
[Figure: Advertisement Tree 1 and Advertisement Tree 2 in a cyclic overlay, with subscribers (S) and duplicate messages]

Routing in General Overlays II
- Scenario 1
  - Subscriptions are routed in loops
  - Brokers receive duplicated subscriptions
[Figure: Adv 1, Adv 2, subscriptions S, point X]

Routing in General Overlays III
- Scenario 2
  - Subscriptions are multicast to several destinations
  - Brokers receive duplicated subscriptions
  - Scenario 1 is exacerbated by duplicated subscriptions
[Figure: Adv 1, Adv 2, subscriptions S, point Y]

Redundant Messages Problem
[Figure not preserved in this transcript]

Routing in General Overlays
- Extends standard content-based routing protocol
- Maintains the same interface to pub/sub clients
- Requires changes to
  - Advertisement routing
  - Subscription routing
    - Atomic subscriptions
    - Composite subscriptions
  - Publication routing

Advertisement Routing
- Each advertisement forms a spanning advertisement tree
- Duplicated advertisements are discarded by brokers
- Each advertisement is assigned a unique tree identifier (TID)
  - e.g. a: [class, eq, stock]……[TID, eq, adv_msg_id]
- SRT
  - A set of [advertisement, last hop]
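Discarding duplicated advertisements is what turns flooding over a cyclic overlay into a spanning tree. The sketch below simulates this on a hypothetical three-broker ring. The breadth-first order stands in for message arrival order, and the dictionary layout of the SRT and the "client" marker are assumptions made for illustration.

```python
from collections import deque

def flood_advertisement(overlay, origin, tid, srt):
    """Flood advertisement `tid` from `origin` through `overlay`.

    srt maps broker -> {tid: last_hop}. A broker that already holds an
    entry for the TID discards the duplicate, so the recorded last hops
    form a spanning tree rooted at the origin broker.
    """
    srt.setdefault(origin, {})[tid] = "client"   # advertisement came from a client
    queue = deque([origin])
    while queue:
        broker = queue.popleft()
        for neighbour in overlay[broker]:
            table = srt.setdefault(neighbour, {})
            if tid in table:                     # duplicate advertisement: discard
                continue
            table[tid] = broker                  # SRT entry [advertisement, last hop]
            queue.append(neighbour)

# Hypothetical cyclic overlay: B1, B2 and B3 are all connected to each other.
overlay = {"B1": ["B2", "B3"], "B2": ["B1", "B3"], "B3": ["B1", "B2"]}
srt = {}
flood_advertisement(overlay, "B1", "Adv1", srt)
print(srt)
```

In this run B2 and B3 both record B1 as the last hop, and the copies exchanged over the B2 to B3 link are dropped.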

Subscription Routing
- Each subscription has a TID predicate with a variable
  - e.g. s: [class,eq,stock]……[TID,eq,$X]
- The variable is bound to the TID of the matching advertisement upon subscribing
- PRT
  - A set of [subscription, {TID, last hop of subscription}]

Subscription Routing Scenario
[Figure: Adv 1, Adv 2, subscription S, point X]
S: [class,eq,stock][name,eq,*][price,>,50][TID,eq,$Z]
At Broker 1:
- Adv1: [class,eq,stock][name,eq,IBM][price,>,60][TID,eq,Adv1]
- Adv2: [class,eq,stock][name,eq,HP][price,>,50][TID,eq,Adv2]
- S matching Adv1: [class,eq,stock][name,eq,*][price,>,50][TID,eq,Adv1]
- S matching Adv2: [class,eq,stock][name,eq,*][price,>,50][TID,eq,Adv2]
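The binding step in this scenario can be sketched as follows. The overlap test is an assumption and deliberately narrow: it handles only the eq (with the * wildcard) and > operators that appear on the slide. The function names are hypothetical, and the TID predicate of each advertisement is represented by its dictionary key.

```python
def overlap(sub_pred, adv_pred):
    """Can some value satisfy both predicates? Covers only the operator
    pairs used on the slide: eq/eq (with '*' wildcard) and >/>."""
    _, s_op, s_val = sub_pred
    _, a_op, a_val = adv_pred
    if (s_op, a_op) == ("eq", "eq"):
        return "*" in (s_val, a_val) or s_val == a_val
    if (s_op, a_op) == (">", ">"):
        return True                      # two lower bounds always share values
    raise NotImplementedError(f"operator pair {s_op!r}/{a_op!r} not covered")

def bind_tid(subscription, advertisements):
    """Return one copy of the subscription per matching advertisement,
    with the TID variable bound to that advertisement's TID."""
    body = [p for p in subscription if p[0] != "TID"]
    bound = []
    for tid, adv in advertisements.items():
        adv_by_attr = {p[0]: p for p in adv}
        if all(p[0] in adv_by_attr and overlap(p, adv_by_attr[p[0]]) for p in body):
            bound.append(body + [("TID", "eq", tid)])
    return bound

S = [("class", "eq", "stock"), ("name", "eq", "*"), ("price", ">", 50), ("TID", "eq", "$Z")]
ADS = {
    "Adv1": [("class", "eq", "stock"), ("name", "eq", "IBM"), ("price", ">", 60)],
    "Adv2": [("class", "eq", "stock"), ("name", "eq", "HP"), ("price", ">", 50)],
}

for bound_subscription in bind_tid(S, ADS):
    print(bound_subscription)
```

As on the slide, S overlaps both advertisements, so two bound copies result, one with TID Adv1 and one with TID Adv2.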

Subscription Routing Scenario 2
[Figure: Adv 1, Adv 2, subscription S, point Y]
SRT(B4):
- Adv1: last hop 2
- Adv2: last hop [value not preserved in this transcript]
PRT(B4):
- S: TID Adv1, last hop Y
- S: TID Adv2, last hop Y

Publication Routing
- Each publication is assigned the TID of its matching advertisement
- Publications are routed by:
  - Fixed TID routing
  - Dynamic publication routing
- Lemma 1: No broker receives duplicated publication messages
- No subscriber receives duplicated publications according to Lemma 1

Fixed TID Routing
[Figure: Adv 1, Adv 2, point X, subscription (Sub), publication (P)]

Dynamic Publication Routing
- Publication’s TID is changeable
- Best path algorithms
- Lemma 2: Changing a publication p’s TID while in transit will not change the set of subscribers, N, notified of p
[Figure: Adv 1, Adv 2, point X, subscription (Sub), publication (P)]

Advantages of TID-based Approach
- Retains the publish/subscribe client interface
- Speeds up subscription and publication propagation
- Generates duplicated messages only at advertisement level
- Builds multiple subscription routing paths for publications
- Routes publications dynamically
- Delivers publications around failed brokers

Preliminary Experimental Results
- Setup
  - Five dual core servers
    - CPU: 2.4GHz ~ 3.4GHz
    - Memory: 1GB ~ 2GB
  - Cyclic broker overlays
    - Number of nodes: 23 brokers
    - Average connection degree: D = 5; D = 3
  - Delay = D_queue + D_transmission + D_propagation
- Metrics
  - Notification delay
  - CPU and memory usage
  - Bandwidth consumption

Notification Delay
[Four consecutive slides carry this title; their plots are not preserved in this transcript]

Overhead in terms of CPU Usage
[Plot not preserved in this transcript; annotation: average over all brokers]

Overhead in terms of Memory Use
[Plot not preserved in this transcript; annotation: average over all brokers]

Benefits of Content-based Publish/Subscribe
- Simplifies IT development and maintenance by decoupling enterprise components
- Supports sophisticated interactions among components using expressive subscription languages, going beyond the limits of topics
- Allows fine-grained queries and event management
- Achieves scalability with in-network filtering and processing

Adaptive CS Routing
- Setup
  - 1 subscriber with composite subscriptions
  - 4 publishers with unbalanced publication rates
  - Focus on behaviours of four particular brokers:
    - The subscriber connects to Broker A
    - Broker B is the joint point broker in topology-based routing
    - Broker C is the new joint point broker in adaptive CS routing
    - The slower publishers connect to Broker D
- Compare
  - Simple routing
  - Topology-based routing
  - QoS-based adaptive routing

Traffic of CS Routing
[Three consecutive slides with plots not preserved in this transcript; panel titles: Simple Routing, Topology-based Routing, QoS-based Adaptive Routing; each shows brokers A, B, C, D]

Notification Delay of CS
[Plot not preserved in this transcript]

Adaptive CS Routing
- CSs may be split according to potential publication traffic, bandwidth, latency, etc.
[Figure: panels (a) and (b), each showing brokers 1, 2, 3 with Adv 1, Adv 2, and CS={{S1 AND S2}]

Estimate Data Set Size
- Definition 1. Data set size of atomic subscription [formula not preserved in this transcript]
- Definition 2. Data set size of composite subscription [formula not preserved in this transcript]

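Estimate CE Detection Costs
- Traffic-based model
  - Traffic Cost (S1) = [formula not preserved in this transcript]
  - Delay Cost (S1) = [formula not preserved in this transcript]
- QoS-based model
  - Delay Cost (S1) = [formula not preserved in this transcript]
[Figure: brokers 1, 2, 3 with Adv 1, Adv 2, and CS={{S1 AND S2}]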

Adaptive CS Routing Algorithm
- Evaluate the cost model at each broker for composite subscriptions
- Choose the broker with minimum detection cost as the joint point
- Joint points are maintained dynamically according to the traffic and network condition
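The selection step of this algorithm amounts to taking a minimum over brokers. The cost formulas of the talk did not survive in this transcript, so the sketch below uses a stand-in traffic cost (message rate times hop count). That cost function, the three-broker line topology B2, B3, B1, and all rates are assumptions made purely for illustration.

```python
def detection_cost(broker, rates, hops_from_source, hops_to_subscriber, output_rate):
    """Stand-in traffic cost of detecting the composite event at `broker`:
    every source stream travels to the broker, and the joined stream
    travels on to the subscriber. NOT the cost model from the talk."""
    inbound = sum(rate * hops_from_source[(src, broker)] for src, rate in rates.items())
    outbound = output_rate * hops_to_subscriber[broker]
    return inbound + outbound

def choose_joint_point(brokers, rates, hops_from_source, hops_to_subscriber, output_rate):
    """Evaluate the cost at each broker and pick the one with minimum cost."""
    return min(brokers, key=lambda b: detection_cost(
        b, rates, hops_from_source, hops_to_subscriber, output_rate))

# Hypothetical line overlay B2 - B3 - B1: the fast publisher P1 sits at B2,
# the slow publisher P2 at B3, and the subscriber at B1.
brokers = ["B1", "B2", "B3"]
rates = {"P1": 100, "P2": 1}                      # messages per second
hops_from_source = {
    ("P1", "B1"): 2, ("P1", "B2"): 0, ("P1", "B3"): 1,
    ("P2", "B1"): 1, ("P2", "B2"): 1, ("P2", "B3"): 0,
}
hops_to_subscriber = {"B1": 0, "B2": 2, "B3": 1}

print(choose_joint_point(brokers, rates, hops_from_source, hops_to_subscriber, output_rate=1))
```

With these numbers the costs are 201 at B1, 3 at B2, and 101 at B3, so B2 is chosen: joining next to the high-rate publisher keeps most traffic off the overlay. Re-running the selection as rates change corresponds to maintaining the joint point dynamically.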

Example: Topology-based CS Routing
[Figure: overlay with Adv 1, Adv 2, Adv 3; CS is split into CS’ and S3, and CS’ into S1 and S2]
- CS={{S1 AND S2} AND S3}
- CS’={S1 AND S2}

Example: Adaptive CS Routing
[Figure: overlay with Adv 1, Adv 2, Adv 3; CS is split into CS’ and S3, and CS’ into S1 and S2]
- CS={{S1 AND S2} AND S3}

Effect of Parallelism
[Plot not preserved in this transcript]

body Constraint Matching
[Figure not preserved in this transcript; annotations:]
- d_1 < d: satisfied
- d_2 > d: unsatisfied
- d_3 < d: satisfied
- d: enclosing circle
- |green set| < d: satisfied
- |red set| < d: unsatisfied

Composite Subscription Routing
[Figure: distributed overlay broker network with brokers B1 to B6, publishers P1, P2, P3, and a subscriber S; legend: P = publishers, S = subscribers; CS is split into CS’ and S3, and CS’ into S1 and S2]
- CS={{S1 AND S2} AND S3}
- CS’={S1 AND S2}

Composite Event Detection
[Figure: the same overlay (brokers B1 to B6, publishers P1, P2, P3, subscriber S); publications P1 and P2 are combined into P12, and P12 and P3 into P123]
- CS={{S1 AND S2} AND S3}
- CS’={S1 AND S2}

Optimized Routing Algorithm
- Evaluate the cost model at each broker for composite subscriptions
- Choose the broker with minimum detection cost as the joint point
- Joint points are maintained dynamically according to the traffic and network condition

Publish/Subscribe in Industry
- Standards
  - CORBA Event Service
  - CORBA Notification Service
  - OMG Data Distribution Service
  - Java Message Service
  - WS-Eventing
  - WS-Notification
  - WS-BrokeredNotification
  - INFO-D (Grid Forum)
  - AMQP
- Emerging technologies (not complete)
  - RSS aggregators: PubSub.com, FeedTree
  - Real-time data dissemination: TIBCO, RTI Inc., Mantara Software
  - Application integration: Softwired
  - Hardware-based brokers: Sarvega (Intel), Solace Systems, DataPower (IBM)

Historic Query with Event Correlation
[Figure: broker overlay (B) connecting publishers (P), subscribers (S), and a medical records database]
Legend: P = publisher / sensor; S = subscriber / doctor
Publications:
- medication = NoNameDrugX
- temperature = 40°C
- illness = diabetes
- heart rate = 150 bpm
- blood pressure = 60 mmHg
- temperature = 38°C
Subscriptions:
- [medication = NoNameDrugX] & [blood pressure > 100] & [heart rate > 130]: monitoring patients who still have high blood pressure after taking NoNameDrugX
- [heart rate > 140 bpm] & [temperature > 39] & [illness != cold]: monitoring patients having a fever but not having a cold
- [temperature > 42] & [medication = Advil]: monitoring patients who still have fever after taking Advil
Parts of the first subscription shown separately in the figure: [medication = NoNameDrugX], [blood pressure > 100], [heart rate > 130]