Model-based Validation of Streaming Data. Cheng Xu, Tore Risch (Dept. of Information Technology, Uppsala University, Sweden); Daniel Wedlund, Martin Helgoson (AB Sandvik Coromant, Sweden).

Presentation transcript:

Model-based Validation of Streaming Data
Cheng Xu, Tore Risch, Dept. of Information Technology, Uppsala University, Sweden
Daniel Wedlund, Martin Helgoson, AB Sandvik Coromant, Sweden

Institutionen för informationsteknologi (Department of Information Technology)

Talk Overview
- Motivation
- Approach and System Architecture
- Demonstrators
- Performance experiments
- Conclusion
- Related work
- Future work

Motivation
- Functional products: integrated provision of hardware, software, and services, not just the traditional hardware. The manufacturer is therefore responsible for the product functioning.
- In modern manufacturing industry, sensors installed on equipment in use generate many high-rate data streams.
- Ensuring the productivity, reliability, and quality of functional products requires monitoring many streams for unexpected behavior.
- When the number of machines increases and data flow rates are high, validation with low latency can be challenging.
- SVALI (Stream VALIdator): a general system that validates correct equipment behavior by analyzing streams on the fly.

SVALI, Stream VALIdator
Two validation approaches:
- Model-and-validate
  - The user defines an analytical mathematical model of expected behavior, based on streams from equipment sensors.
  - The user also defines a validation model that identifies abnormal sensor readings by comparing the result of the analytical model with the measured sensor streams.
  - A simple case is detecting when the difference between expected and measured power consumption exceeds some threshold.
- Learn-and-validate
  - The user provides a (statistical) learning model based on a sampled sub-stream from correctly behaving equipment.
  - As with model-and-validate, the user also provides a validation model.

SVALI Architecture
[Architecture diagram. Labels recovered from the slide:]
- Client: visualizers and alerters; updates (e.g. set threshold = 1.3); continuous queries CQ 1 and CQ 2
- SVALI validation functions: model-n-validate, learn-n-validate
- Stream models: analytical model, statistical model
- Stream wrappers: stream wrapper A, stream wrapper B
- Equipment: equipment A, equipment B
- Other labels: TCP, EPIC DSMS, DB

SVALI Validation functions

Model-and-validate:
  model_n_validate(Bag of Stream s, Function modelfn, Function validatefn) -> Stream of (Number ts, Object me, Object ex)
  modelfn(Object se) -> Object ex
  validatefn(Object se, Object ex) -> (Number ts, Object me)

Learn-and-validate:
  learn_n_validate(Bag of Stream s, Function learnfn, Integer n, Function validatefn) -> Stream of (Number ts, Object me, Object ex)
  learnfn(Vector of Object sa) -> Object ex
  validatefn(Object se, Object ex) -> (Number ts, Object me)

The difference between the two is how the model is defined.
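The two signatures above can be read as stream operators. The following is a minimal Python sketch of that reading, not SVALI's implementation: it assumes that `validatefn` returns `None` when an element is considered normal, and it processes the streams in the bag one after another, whereas a DSMS would interleave them. The names mirror the slide; everything else is hypothetical.

```python
import itertools
from typing import Any, Callable, Iterable, Iterator, Optional, Sequence, Tuple

Alert = Tuple[Any, Any, Any]            # (ts, measured value, expected value)
Check = Optional[Tuple[Any, Any]]       # (ts, measured value), or None if normal


def model_n_validate(streams: Iterable[Iterable[Any]],
                     modelfn: Callable[[Any], Any],
                     validatefn: Callable[[Any, Any], Check]) -> Iterator[Alert]:
    """Compute the expected value per element and emit the deviations."""
    for stream in streams:
        for se in stream:
            ex = modelfn(se)            # expected value from the analytical model
            hit = validatefn(se, ex)    # None means the element looks normal
            if hit is not None:
                ts, me = hit
                yield (ts, me, ex)


def learn_n_validate(streams: Iterable[Iterable[Any]],
                     learnfn: Callable[[Sequence[Any]], Any],
                     n: int,
                     validatefn: Callable[[Any, Any], Check]) -> Iterator[Alert]:
    """Learn the expected value from the first n elements, then validate the rest."""
    for stream in streams:
        it = iter(stream)
        sample = list(itertools.islice(it, n))   # sampled sub-stream
        ex = learnfn(sample)
        for se in it:
            hit = validatefn(se, ex)
            if hit is not None:
                ts, me = hit
                yield (ts, me, ex)


def check_power(se: dict, ex: float) -> Check:
    """Hypothetical validation model: flag deviations larger than 1.3."""
    return (se["ts"], se["power"]) if abs(ex - se["power"]) > 1.3 else None


def mean_power(sample: Sequence[dict]) -> float:
    """Hypothetical learning model: mean power over the sample."""
    return sum(r["power"] for r in sample) / len(sample)
```

In this sketch the only difference between the two operators is where `ex` comes from: `modelfn` computes it per element, while `learnfn` computes it once from the sample.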

Model-n-validate demonstrator: the side milling process
[Figure: side milling process, with parameters a_e [mm], f_z [mm/tooth], h_ex [mm], a_p [mm], v_c [m/min], z_c]

The analytical and validation models are entered into the SVALI system:

  create function validatePower(Record r, Number ex) -> (Number ts, Number me)
    as select ts(r), me
       where me = measuredPower(r)
         and abs(ex - me) > th("mill1");

  select model_n_validate(bagof(input), #'expectedPower', #'validatePower')
    from Stream input
   where input = corenetJsonWrapper("h1", 1337);

Learn-n-validate demonstrator: cyclic behavior
- Cyclic behavior is defined as predicate (dynamic) windows.
- A vector of expected power consumptions is computed from the first n sampled predicate windows.
- The learning model is the normalized average vector over the sampled windows.
- Validation computes the normalized Euclidean distance between the learnt power consumptions and the current window's power consumptions and compares it with a threshold.

The window starts when the trigger is 1:

  create function cycleStart(Record s) -> Boolean
    as s["trigger"] = 1;

The window ends when the trigger is 0 and the window was started:

  create function cycleStop(Record s, Record r) -> Boolean
    as r["trigger"] = 0 and s["trigger"] = 1;

Extracting, learning, and validating the power consumption per window:

  create function extractPowerW(Window w) -> Vector of Number
    as vselect extractPower(r) from Record r where r in w;

  create function learnCycle(Vector of Window f) -> Vector of Number
    as navg(select extractPowerW(w) from Window w where w in f);

  create function validateCycle(Window w, Vector e) -> (Number ts, Vector of Number m)
    as select timestamp(w), m
       where neuclid(e, m) > th("machine2")
         and m = extractPowerW(w);

  select learn_n_validate(bagof(sw), #'learnCycle', 2, #'validateCycle')
    from Stream s, Stream sw
   where s = corenetJsonWrapper("h2", 1338)
     and sw = pwindowize(s, #'cycleStart', #'cycleStop');
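To make the predicate-window idea concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the SVALI code: the slides do not define `navg` or `neuclid`, so the sketch assumes that "normalized" means scaling each vector to unit Euclidean length, that all windows have the same number of records, and that the record which triggers the stop predicate is not part of the window. The Python functions reuse the slide's names only for readability.

```python
import math
from typing import Callable, Dict, Iterable, Iterator, List, Sequence

Record = Dict[str, float]


def pwindowize(stream: Iterable[Record],
               start: Callable[[Record], bool],
               stop: Callable[[Record, Record], bool]) -> Iterator[List[Record]]:
    """Cut a stream into windows delimited by start/stop predicates."""
    window: List[Record] = []
    for r in stream:
        if not window:
            if start(r):
                window.append(r)          # first record of a new cycle
        elif stop(window[0], r):
            yield window                  # assumption: stop record is excluded
            window = []
        else:
            window.append(r)


def unit(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def navg(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise average of the unit-scaled vectors."""
    units = [unit(v) for v in vectors]
    return [sum(col) / len(col) for col in zip(*units)]


def neuclid(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between the unit-scaled vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(unit(a), unit(b))))


def cycle_start(s: Record) -> bool:
    return s["trigger"] == 1


def cycle_stop(s: Record, r: Record) -> bool:
    return r["trigger"] == 0 and s["trigger"] == 1


def extract_power_w(window: Sequence[Record]) -> List[float]:
    return [r["power"] for r in window]
```

With these definitions, a window whose power profile has the same shape as the learnt profile has distance close to zero regardless of its overall scale, which is one plausible reason to normalize before comparing.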

Performance Experiments
Experiment setup:
- Dell NUMA computer PowerEdge R815 featuring 4 CPUs with GHz cores each.
- OS: Scientific Linux release 6.2.
The performance of SVALI is measured by the average response time of two queries:
- Q1: model-and-validate over single stream events
- Q2: model-and-validate moving average over 0.1 second stream windows
To scale up the number of machines, streams are generated from real data streams provided by the industrial partner, with different arrival rates (1 ms – 10 ms). Each stream is tagged with a machine id.
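As a rough illustration of what a Q2-style query computes, the Python sketch below averages power per machine over 0.1 second windows and validates the averages against a fixed expected value. Several details are assumptions, since the slides do not give them: the windows are tumbling (non-overlapping), timestamps are integer milliseconds, events arrive ordered by timestamp, and the expected value and threshold are constants.

```python
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[int, str, float]           # (ts in ms, machine id, power)
WindowAvg = Tuple[int, str, float]       # (window start in ms, machine id, mean power)


def window_averages(events: Iterable[Event], width_ms: int = 100) -> Iterator[WindowAvg]:
    """Mean power per machine over tumbling windows of width_ms."""
    state: Dict[str, Tuple[int, float, int]] = {}   # machine -> (window idx, sum, count)
    for ts, machine, power in events:
        idx = ts // width_ms
        if machine in state and state[machine][0] != idx:
            old_idx, total, count = state.pop(machine)
            yield (old_idx * width_ms, machine, total / count)   # window closed
        _, total, count = state.get(machine, (idx, 0.0, 0))
        state[machine] = (idx, total + power, count + 1)
    for machine, (idx, total, count) in state.items():           # flush open windows
        yield (idx * width_ms, machine, total / count)


def validate_q2(events: Iterable[Event], expected: float, threshold: float,
                width_ms: int = 100) -> Iterator[WindowAvg]:
    """Emit the windows whose mean power deviates from the expected value."""
    for start, machine, avg in window_averages(events, width_ms):
        if abs(expected - avg) > threshold:
            yield (start, machine, avg)
```

Keeping the state keyed by machine id corresponds to grouping a merged stream by machine, which is the extra step a central validation node has to perform.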

Performance Experiments: Central vs Parallel
[Diagram comparing two configurations:]
- Central validation: the streams from machine 0 ... machine i are merged on ts and then validated on one SVALI node.
- Parallel validation: the stream from each machine 0 ... machine i is validated separately (validation 0 ... validation i), and the results are then merged on ts.
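The two configurations differ only in where the merge on ts happens. The Python sketch below shows the two dataflows with a simple, hypothetical threshold validator; it runs sequentially, so it illustrates the plan shapes rather than any actual speed-up, and it assumes each input stream is already ordered by timestamp.

```python
import heapq
from typing import Iterable, Iterator, List, Sequence, Tuple

Event = Tuple[int, int, float]           # (ts, machine id, measured power)


def validate(events: Iterable[Event], expected: float, threshold: float) -> Iterator[Event]:
    """Hypothetical stateless validator: pass through deviating events."""
    for ts, machine, power in events:
        if abs(expected - power) > threshold:
            yield (ts, machine, power)


def central(streams: Sequence[Iterable[Event]], expected: float,
            threshold: float) -> List[Event]:
    """Merge all machine streams on ts, then validate on a single node."""
    merged = heapq.merge(*streams, key=lambda e: e[0])
    return list(validate(merged, expected, threshold))


def parallel(streams: Sequence[Iterable[Event]], expected: float,
             threshold: float) -> List[Event]:
    """Validate each machine stream on its own, then merge the results on ts."""
    validated = [validate(s, expected, threshold) for s in streams]
    return list(heapq.merge(*validated, key=lambda e: e[0]))
```

For a stateless validator both plans yield the same alerts. In the parallel plan the merge only sees the (usually much smaller) streams of deviations, and each validator could run in its own process.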

Experiment Measurement Q1
[Figure] Fig. 1: Average response time for Q1, shown alongside the central and parallel validation diagrams.

Experiment Measurement Q2
[Figure] Fig. 2: Average response time for Q2, shown alongside the central and parallel validation diagrams.
Annotations on the slide: "validation includes a groupby on machine id"; "It is already grouped"; "around 2 ms".

Conclusion
- Two general approaches for validating stream behavior were presented: model-and-validate and learn-and-validate.
- Two demonstrators show how they are used on streams from real industrial applications.
- Parallel execution enables stream validation with limited delays over many machines.

Related work
- Jakubek, S. and Strasser, T.: Fault-diagnosis using neural networks with ellipsoidal basis functions. American Control Conference. Vol. 5. pp , 2002
  - Uses a learning algorithm to reduce the number of measurements needed for fault detection, whereas we use parallel processing to enable low delays.
- Tan, T., Gu, X., and Wang, H.: Adaptive system anomaly prediction for large-scale hosting infrastructures. PODC Conf., 2010
  - Prediction instead of detection.
  - Low arrival rates, e.g. one sample every 2 seconds, so there is no need for parallelization.
- Wang, D., Rundensteiner, E., Ellison, R.: Active Complex Event Processing for Realtime Health Care, VLDB Conf., 3(2): pp , 2010
  - Lower-level rule mechanism triggered by state changes during the continuous query process.
- Zeitler, E. and Risch, T.: Massive scale-out of expensive continuous queries, Proceedings of the VLDB Endowment, ISSN , Vol. 4, No. 11, pp , 2011
  - SVALI's underlying DSMS, EPIC, extends that work with e.g. sliding windows and incremental aggregation.
  - SVALI provides validation functionality on top of EPIC.

Future work
- Other strategies for automatic performance improvements
- Adaptive learning model by re-sampling
- Adaptive parallelization of expensive validation functions
