Event Processing in Operational Information Systems: Two Case Studies and BAM/EDA Implications
Karsten Schwan, Brian Cooper, Greg Eisenhauer
Georgia Institute of Technology
Center for Experimental Research in Computer Systems (CERCS)
NSF Industry University Co-operative Research Center

I. Delta Air Lines Operational Information Systems (OIS) – Internal View
High event rates for simple/mediated events
Complex events processed/produced by business logic

I. Delta Air Lines OIS – External View

I. Continuous Event Processing in Delta’s OIS
–Complex systems and large event volumes
  TPF, DTMI, TIBCO, Tuxedo, Web Services; Mainframes, Clusters, End Systems
–event services across multiple system ‘silos’
  »interoperability APIs
  »event filtering, replication, morphing
  »JIT XML and event conversion – for outsourced services
  »runtime trust management vs. security?
–data tapping – for legacy systems (hardware support?)
–deep packet inspection/event morphing (system/network support?)
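
The event services listed above (filtering, replication, morphing) can be illustrated with a minimal sketch. The event shape, field names, and the `to_partner_format` conversion are hypothetical and chosen only for illustration; they are not Delta's actual formats or APIs.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]  # hypothetical event shape, not Delta's actual format


def filter_events(events: Iterable[Event], keep: Callable[[Event], bool]) -> Iterator[Event]:
    """Filtering: pass through only the events the subscriber cares about."""
    return (e for e in events if keep(e))


def morph_events(events: Iterable[Event], convert: Callable[[Event], Event]) -> Iterator[Event]:
    """Morphing: convert each event into the shape the receiving silo expects."""
    return (convert(e) for e in events)


def replicate(events: Iterable[Event], sinks: List[List[Event]]) -> None:
    """Replication: deliver an independent copy of every event to each sink."""
    for e in events:
        for sink in sinks:
            sink.append(dict(e))


def to_partner_format(e: Event) -> Event:
    # Hypothetical conversion for an external consumer: rename fields, drop the rest.
    return {"flt": e["flight"], "status": e["status"]}


source = [
    {"flight": "DL100", "status": "DELAYED", "gate": "A1"},
    {"flight": "DL200", "status": "ON_TIME", "gate": "B2"},
    {"flight": "DL300", "status": "DELAYED", "gate": "C3"},
]
display_sink: List[Event] = []
partner_sink: List[Event] = []

# Filter first so that only relevant events are converted and copied.
delayed = filter_events(source, lambda e: e["status"] == "DELAYED")
replicate(morph_events(delayed, to_partner_format), [display_sink, partner_sink])
```

Filtering ahead of morphing and replication keeps the per-event conversion cost proportional to the events a consumer actually needs, which matters at high event rates.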

–Complex system interactions and 24/7 operation: high reliability and availability, with stateful operation
–continuous monitoring and repair
  »abnormal behavior (e.g., timeout behavior) detection, with human intervention after thresholds exceeded
–‘poison messages’ and poison message sequences
  »avoid recovery and/or bound recovery time
–online performance management
–utility-based event scheduling/routing
  »ability to distinguish service levels
–link to immediate business needs
  »e.g., revenue management
–performance isolation vs. optimization
  »e.g., isolation from recovery traffic
–NOTES:
  »highly distributed event processing
  »most events carry business data (additional BAM events)
  »BASE, not ACID, for most events
  »multi-model event processing, not SQL
  »STATEful processing

I. Integrated BAM: Continuously Managed Event Flows
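
One way to bound recovery time for the ‘poison messages’ noted for Delta's OIS is to cap the number of delivery attempts and quarantine any message that keeps failing. The following is a minimal sketch under stated assumptions: the retry bound, the `handler` callback, and the string messages are all hypothetical, and this is not a description of how Delta's system does it.

```python
from collections import Counter, deque
from typing import Callable, Iterable, List, Tuple

MAX_ATTEMPTS = 3  # assumed bound on delivery attempts per message


def process_with_quarantine(
    messages: Iterable[str],
    handler: Callable[[str], None],
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[List[str], List[str]]:
    """Process messages; set aside any message that fails max_attempts times."""
    queue = deque(messages)
    attempts: Counter = Counter()
    processed: List[str] = []
    quarantined: List[str] = []
    while queue:
        msg = queue.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            attempts[msg] += 1
            if attempts[msg] >= max_attempts:
                # Stop retrying: recovery work for this message is now bounded.
                quarantined.append(msg)
            else:
                queue.append(msg)  # retry later, behind the healthy messages
    return processed, quarantined


calls: Counter = Counter()


def demo_handler(msg: str) -> None:
    # Hypothetical handler that always fails on messages marked "bad".
    calls[msg] += 1
    if msg.startswith("bad"):
        raise ValueError("cannot process " + msg)


done, poisoned = process_with_quarantine(["a", "bad-1", "b"], demo_handler)
```

Re-queuing a failed message behind the healthy ones keeps a single bad message from stalling the flow, and the quarantine list is a natural place to hand off to human intervention.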

II. Worldspan: Need for QoS in Business Monitoring
SLA-driven operation and online event scheduling: QoS in Business Monitoring for differentiated services
24/7 operation and stateful services: Management must include incremental updates of service state
Huge event volumes

Utility Obtained from Worldspan’s Flight Search Engine
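
Utility-based event scheduling for differentiated service levels can be sketched with a priority queue ordered by the utility of serving each event. The service classes and utility values below are assumptions made for illustration; they are not Worldspan's SLAs or its scheduler.

```python
import heapq
import itertools
from typing import List, Tuple

# Assumed utility per service class; real values would come from SLAs.
CLASS_UTILITY = {"gold": 10.0, "silver": 3.0, "bronze": 1.0}


class UtilityScheduler:
    """Serve the pending event with the highest utility first (FIFO within ties)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, event: str, service_class: str) -> None:
        utility = CLASS_UTILITY[service_class]
        # heapq is a min-heap, so negate the utility to pop the largest first.
        heapq.heappush(self._heap, (-utility, next(self._seq), event))

    def next_event(self) -> str:
        _, _, event = heapq.heappop(self._heap)
        return event

    def __len__(self) -> int:
        return len(self._heap)


scheduler = UtilityScheduler()
scheduler.submit("b1", "bronze")
scheduler.submit("g1", "gold")
scheduler.submit("s1", "silver")
scheduler.submit("g2", "gold")

order = [scheduler.next_event() for _ in range(len(scheduler))]
```

A static utility per class is the simplest possible choice; a scheduler tied to immediate business needs would recompute utility online, for example from deadline slack or expected revenue.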

Summary
Event-based Systems for the Enterprise Domain:
GT Focus: Adaptive/Autonomic Distributed Information Flows
IBM, Tata (iFlow: utility-based, autonomic management of distributed information flows; performance isolation in web-based event flows; online monitoring and management with Eclipse)
HP (automated application deployment; QMon: QoS in business activity monitoring)
Worldspan (‘power updates’: non-intrusive dynamic state updates; utility-based activity monitoring)
Delta, Raytheon (performance isolation/robustness; utility-driven failure management; monitoring web-based infrastructures)
Cisco, Intel (network-level services for event-based systems)
NSF, DARPA, DOE (continual queries; ECho/IQ-ECho: publish/subscribe event system, with resource-aware operation; EV(ent)Path: dynamic overlay creation and management, with runtime event scheduling; event flows and mobility)
Security Systems

EDA/BAM Implications
Multiple event/processing models
–Monitoring events, Business events,...
Interoperability
–Differently structured event data, eventually should include unstructured data
Complex, domain-specific event processing
–Importance of state
  »state recovery/expiration
–Distributed data and processing
Security/performance/reliability implications
–Importance of online management
  »integrated into business event processing
  »driven by end user utility
  »strong QoS/real-time constraints
Overlap/conflicts with AC (ICAC) (many companies involved!)
–Terminology: CBEs (events), touchpoints, symptoms/symptom databases, SLAs, SLOs,...
–Technology: non-intrusive instrumentation,...

Georgia Tech Information Flow Research
Scientific Grid
Enterprise Computing
Embedded Systems
To construct the interactive information grids of the future and to create the intellectual capital that can advance these technologies and fuel future advances.
Information anytime, anywhere
Timeliness! Robustness! Quality! Security and Trust!
Remote access to the Information Grid
Brian Cooper, Ling Liu, Calton Pu, Kishore Ramachandran, Karsten Schwan
Continual Queries, ECho/IQ-ECho, Fusion Channels, IFlow/EVPath

Additional Insights
Enterprise Systems
Utility-based mapping and configuration in:
–shared execution environments
High Performance Computing
Large-data events in:
–simulation monitoring: e.g., remote data visualization
–GT Smartpointer application
Pervasive Systems
Online path management in:
–situation monitoring and assessment
Location-aware operation in:
–mobile end user systems

Research Agenda for Event-based Systems
I. Stateful Event Services:
–Dynamic service and code deployment (DCG, dynamic compilation)
–Runtime code modification and adaptation, dynamic data conversion
–Dynamic state saving and updates (e.g., power updates)
–Dynamic overlays, …
II. Resource- and Needs-Awareness:
–Diverse metrics: bandwidth, power, trust,...
–Changing end user needs, application behaviors
–Performance monitoring/understanding: integrate across user and system levels
III. Runtime Management:
–Utility-driven operation
–New reliability and availability methods
–‘Vertical’ integration: user/system/network levels
–Multi-dimensional optimization vs. performance robustness
IV. Open Infrastructures:
–App-level (e.g., ‘inside’ JMS) or ‘instrumented networks’
–‘Black box’ operating systems vs. dynamic extension and VM technologies
–‘Closed’ networks vs. application-level services ‘in’ network devices
  e.g., Cisco’s AONS, Intel’s IXP network processors

Event Processing in EScience – SmartPointer Example
Dynamic composition of user-specified services.
SmartPointer: Data-intensive scientific collaboration