1
Model-based Validation of Streaming Data
Cheng Xu, Tore Risch
Dept. of Information Technology, Uppsala University, Sweden
Daniel Wedlund, Martin Helgoson
AB Sandvik Coromant, Sweden
2
Informationsteknologi Institutionen för informationsteknologi | www.it.uu.se
Talk Overview
- Motivation
- Approach and System Architecture
- Demonstrators
- Performance experiments
- Conclusion
- Related work
- Future work
3
Motivation
- Functional products: integrated provision of hardware, software and services, not just the traditional hardware => the manufacturer is responsible for functioning.
- In modern manufacturing industry, sensors installed on equipment in use generate many high-rate data streams.
- Providing productivity, reliability, and quality of functional products requires monitoring many streams for unexpected behavior.
- When the number of machines increases and data flows are high, validation with low latency may be challenging.
- SVALI (Stream VALIdator): a general system to validate correct equipment behavior by analyzing streams on the fly.
4
SVALI, Stream VALIdator
Two validation approaches:
- Model-and-validate
  - The user defines an analytical math model of expected behavior based on streams from equipment sensors.
  - The user also defines a validation model that identifies abnormal equipment sensor readings by comparing the result of the analytical model with measured sensor streams.
  - A simple case is detecting when the difference between expected and measured power consumption exceeds some threshold.
- Learn-and-validate
  - The user provides a (statistical) learning model based on a sampled sub-stream of correctly behaving equipment.
  - As for model-and-validate, the user also provides a validation model.
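The simple model-and-validate case above, flagging readings whose measured power deviates from the expected power by more than a threshold, could be sketched roughly as follows. This is a Python illustration only, not SVALI code: the reading layout, the linear `expected_power` model, and the threshold value are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical reading layout: {"ts": ..., "load": ..., "power": ...}
Reading = Dict[str, float]

def expected_power(reading: Reading) -> float:
    """Hypothetical analytical model: power grows linearly with load."""
    return 2.0 * reading["load"] + 1.0

def validate(readings: Iterable[Reading],
             threshold: float) -> Iterator[Tuple[float, float, float]]:
    """Yield (ts, measured, expected) for readings that deviate too much."""
    for r in readings:
        ex = expected_power(r)
        me = r["power"]
        if abs(ex - me) > threshold:
            yield (r["ts"], me, ex)

readings = [
    {"ts": 0.0, "load": 1.0, "power": 3.1},  # expected 3.0, within threshold
    {"ts": 1.0, "load": 2.0, "power": 7.5},  # expected 5.0, deviates by 2.5
    {"ts": 2.0, "load": 3.0, "power": 7.2},  # expected 7.0, within threshold
]
alerts = list(validate(readings, threshold=1.0))
```

Only the second reading ends up in `alerts`; the other two stay within the assumed threshold.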
5
SVALI Architecture
[Figure: layered architecture diagram. Recoverable labels:]
- CLIENT: VISUALIZERS AND ALERTERS, UPDATES (example: set threshold = 1.3), CQ 1, CQ 2
- SVALI VALIDATION FUNCTIONS: model-n-validate, learn-n-validate
- STREAM MODELS: Analytical model, Statistical model
- STREAM WRAPPERS: Stream wrapper A, Stream wrapper B
- equipment A, equipment B, TCP
- EPIC DSMS, DB
6
SVALI Validation functions
Model-and-validate
    model_n_validate(Bag of Stream s, Function modelfn, Function validatefn)
        -> Stream of (Number ts, Object me, Object ex)
    modelfn(Object se) -> Object ex
    validatefn(Object se, Object ex) -> (Number ts, Object me)
Learn-and-validate
    learn_n_validate(Bag of Stream s, Function learnfn, Integer n, Function validatefn)
        -> Stream of (Number ts, Object me, Object ex)
    learnfn(Vector of Object sa) -> Object ex
    validatefn(Object se, Object ex) -> (Number ts, Object me)
The difference is how the model is defined.
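To make the learn_n_validate signature more concrete, here is a rough Python analogue. It is a sketch under several assumptions that the slides do not state: it takes a single stream rather than a bag of streams, the first n elements are used only for learning and are not validated themselves, and the validation function returns None for a valid element. The helper names `mean_model` and `far_from_mean` are made up for the example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

def learn_n_validate(
    stream: Iterable[object],
    learnfn: Callable[[List[object]], object],
    n: int,
    validatefn: Callable[[object, object], Optional[Tuple[float, object]]],
) -> Iterator[Tuple[float, object, object]]:
    it = iter(stream)
    sample = list(islice(it, n))   # first n elements form the training sample
    ex = learnfn(sample)           # learned expected value
    for se in it:                  # validate the remaining elements
        result = validatefn(se, ex)
        if result is not None:     # assumption: None means "element is valid"
            ts, me = result
            yield (ts, me, ex)

# Toy usage: elements are (ts, value) pairs; the model is the sample mean.
def mean_model(sample):
    return sum(v for _, v in sample) / len(sample)

def far_from_mean(se, ex):
    ts, value = se
    return (ts, value) if abs(value - ex) > 2.0 else None

events = [(0, 10.0), (1, 12.0), (2, 11.5), (3, 15.0), (4, 10.5)]
invalid = list(learn_n_validate(events, mean_model, 2, far_from_mean))
```

A model_n_validate analogue would look the same except that `ex` is recomputed from each element by a model function instead of being learned once from a sample.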
7
Model-n-validate demonstrator
The side milling process
    a_e [mm]   f_z [mm/tooth]   h_ex [mm]   a_p [mm]   v_c [m/min]   z_c
    2          0.0756           0.05        20         200           4
    3          0.0641           0.05        20         200           4
The analytical and validation models are entered into the SVALI system:
    create function validatePower(Record r, Number ex)
                               -> (Number ts, Number me)
      as select ts(r), me
         where me = measuredPower(r)
           and abs(ex - me) > th("mill1");

    select model_n_validate(bagof(input), #'expectedPower', #'validatePower')
      from Stream input
     where input = corenetJsonWrapper("h1", 1337);
8
Learn-n-validate demonstrator
Cyclic behavior
- Cyclic behavior is defined as predicate (dynamic) windows.
- A vector of expected power consumptions is computed from the first n sampled predicate windows.
- The learning model is the normalized average vector over the sampled windows.
- Validation is done by comparing the normalized Euclidean distance between the learnt power consumptions and the current window's power consumptions against a threshold.
The window starts when the trigger is 1:
    create function cycleStart(Record s) -> Boolean
      as s["trigger"] = 1;
The window ends when the trigger is 0 and the window was started:
    create function cycleStop(Record s, Record r) -> Boolean
      as r["trigger"] = 0 and s["trigger"] = 1;
Learning and validation functions:
    create function extractPowerW(Window w) -> Vector of Number
      as vselect extractPower(r) from Record r where r in w;

    create function learnCycle(Vector of Window f) -> Vector of Number
      as navg(select extractPowerW(w) from Window w where w in f);

    create function validateCycle(Window w, Vector e)
                               -> (Number ts, Vector of Number m)
      as select timestamp(w), m
         where neuclid(e, m) > th("machine2")
           and m = extractPowerW(w);

    select learn_n_validate(bagof(sw), #'learnCycle', 2, #'validateCycle')
      from Stream s, Stream sw
     where s = corenetJsonWrapper("h2", 1338)
       and sw = pwindowize(s, #'cycleStart', #'cycleStop');
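The predicate-window idea can be illustrated outside SVALI with a small Python sketch. Everything here is an assumption made for illustration: the Python `pwindowize` only mimics the start/stop predicate semantics described above (and assumes the stopping record is not part of the window), and because the slides do not say how navg and neuclid normalize, a plain element-wise mean and a root-mean-square distance are used as stand-ins.

```python
import math

def pwindowize(stream, start, stop):
    """Cut a stream into windows delimited by start/stop predicates."""
    window = None                  # None means no window is currently open
    for r in stream:
        if window is None:
            if start(r):
                window = [r]       # r is the record that opened the window
        elif stop(window[0], r):
            yield window           # assumption: stopping record is excluded
            window = None
        else:
            window.append(r)

def cycle_start(s):
    return s["trigger"] == 1

def cycle_stop(s, r):
    return r["trigger"] == 0 and s["trigger"] == 1

def average_vector(vectors):
    """Element-wise mean of equally long vectors (stand-in for navg)."""
    return [sum(col) / len(col) for col in zip(*vectors)]

def rms_distance(a, b):
    """Euclidean distance scaled by sqrt(length) (stand-in for neuclid)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Three cycles of three readings each, separated by trigger = 0 records.
raw = [(0, 0), (1, 5), (1, 6), (1, 5), (0, 0), (1, 5), (1, 6), (1, 5),
       (0, 0), (1, 9), (1, 6), (1, 5), (0, 0)]
records = [{"trigger": t, "power": p} for t, p in raw]

windows = list(pwindowize(records, cycle_start, cycle_stop))
powers = [[r["power"] for r in w] for w in windows]
model = average_vector(powers[:2])               # learn from the first 2 cycles
distances = [rms_distance(model, p) for p in powers[2:]]
```

The third cycle starts with a power reading of 9 instead of 5, so its distance to the learned vector is clearly non-zero and would exceed a suitably chosen threshold.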
9
Performance Experiments
Experiment setup
- Dell PowerEdge R815 NUMA computer featuring 4 CPUs with 16 2.3 GHz cores each.
- OS: Scientific Linux release 6.2
The performance of SVALI is measured by the average response time of two queries:
- Q1: model-and-validate over single stream events
- Q2: model-and-validate of a moving average over 0.1 second stream windows
To scale up the number of machines, streams are generated from real data streams provided by the industrial partner, with different arrival rates (1 ms to 10 ms). Each stream is tagged with a machine id.
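As a rough illustration of the kind of aggregation Q2 implies, the Python sketch below averages values per machine over 0.1 second windows. The slides do not say whether the windows slide or tumble, so non-overlapping (tumbling) windows are assumed; the event layout `(machine_id, ts_ms, value)` and the function name are also hypothetical. Timestamps are integer milliseconds to avoid floating-point window boundaries.

```python
from collections import defaultdict

def window_averages(events, width_ms=100):
    """Average value per (machine id, window index), tumbling windows."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for machine, ts_ms, value in events:
        key = (machine, ts_ms // width_ms)   # group by machine and window
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

events = [
    ("m0", 0, 4.0), ("m0", 50, 6.0), ("m1", 20, 10.0),
    ("m0", 120, 8.0), ("m1", 130, 12.0), ("m1", 199, 14.0),
]
averages = window_averages(events)
```

Each resulting average could then be compared with an expected value in the same way as a single stream event.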
10
Performance Experiments: Central vs Parallel
[Figure: two dataflow diagrams.]
- Central validation: streams from machine 0 ... machine i, merge on ts, then validation in one SVALI node.
- Parallel validation: streams from machine 0 ... machine i, validation 0 ... validation i, then merge on ts.
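The two dataflows can be contrasted with a small Python sketch. It shows only the order of the merge and validation steps, not real parallel execution: the per-stream calls in `parallel` stand in for separate validation nodes. The constant expected value, the threshold, and the event layout `(ts, machine_id, value)` are hypothetical, and every input stream is assumed to be sorted on ts.

```python
import heapq
from operator import itemgetter

EXPECTED = 5.0    # hypothetical constant model
THRESHOLD = 1.0   # hypothetical threshold

def validate(events):
    """Keep events (ts, machine, value) whose value deviates from EXPECTED."""
    return [e for e in events if abs(e[2] - EXPECTED) > THRESHOLD]

def central(streams):
    merged = heapq.merge(*streams, key=itemgetter(0))   # merge on ts first
    return validate(merged)                             # one validation step

def parallel(streams):
    validated = [validate(s) for s in streams]          # one validation per machine
    return list(heapq.merge(*validated, key=itemgetter(0)))  # merge on ts last

streams = [
    [(1, "m0", 5.2), (4, "m0", 7.5), (7, "m0", 4.9)],
    [(2, "m1", 3.0), (5, "m1", 5.1), (8, "m1", 9.0)],
]
central_result = central(streams)
parallel_result = parallel(streams)
```

Both variants report the same invalid events in timestamp order; the difference is that in the second variant the merge only has to handle events that already failed validation.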
11
Experiment Measurement Q1
[Figure: Fig. 1 Average response time Q1, shown with the central and parallel validation diagrams.]
12
Experiment Measurement Q2
[Figure: Fig. 2 Average response time Q2, shown with the central and parallel validation diagrams.]
Figure annotations: "validation includes a groupby on machine id"; "It is already grouped"; "around 2 ms".
13
Conclusion
- Two general approaches to validating stream behavior were presented: model-and-validate and learn-and-validate.
- Two demonstrators show how they are used on real industrial application streams.
- Parallel execution enables stream validation with limited delays over many machines.
14
Related work
- Jakubek, S. and Strasser, T.: Fault-diagnosis using neural networks with ellipsoidal basis functions. American Control Conference, Vol. 5, pp. 3846-3851, 2002.
  - Learning algorithm to reduce the number of measurements for fault detection, whereas we use parallel processing to enable low delays.
- Tan, T., Gu, X., and Wang, H.: Adaptive system anomaly prediction for large-scale hosting infrastructures. PODC Conf., 2010.
  - Prediction instead of detection.
  - Low arrival rates, e.g. one sample every 2 seconds, do not need parallelization.
- Wang, D., Rundensteiner, E., Ellison, R.: Active Complex Event Processing for Realtime Health Care. VLDB Conf., 3(2), pp. 1545-1548, 2010.
  - Lower-level rule mechanism triggered by state changes during the continuous query process.
- Zeitler, E. and Risch, T.: Massive scale-out of expensive continuous queries. Proceedings of the VLDB Endowment, ISSN 2150-8097, Vol. 4, No. 11, pp. 1181-1188, 2011.
  - SVALI's underlying DSMS EPIC extends that work with e.g. sliding windows and incremental aggregation.
  - SVALI provides validation functionalities on top of EPIC.
15
Future work
- Other strategies for automatic performance improvements
- Adaptive learning model by re-sampling
- Adaptive parallelization of expensive validation functions