University of Jyväskylä, Department of Mathematical Information Technology
ICANNGA 2009
Mining Time Series State Changes with Prototype Based Clustering
Markus Pylvänen, Sami Äyrämö, Tommi Kärkkäinen
University of Jyväskylä

The Problem
Industrial processes produce huge amounts of multivariate time series data
Manual surveillance requires too many resources
A malfunction should be detected before it occurs
  – The malfunction state and the preceding states, or even the sequence of states, must be recognized, characterized and detected for proactive surveillance

Äyrämö, S., Knowledge Mining using Robust Clustering, PhD Thesis, University of Jyväskylä, 2006

About the Domain
Monitoring of wind turbine gears and mechanical drives manufactured for the process industries
Detecting developing faults before a failure occurs makes it possible to plan service breaks in advance and maximize the running time of gear units
No a priori information is available on the operational states
The visualization tool
  – detects and visualizes the state changes in gear units
  – gives industrial process experts a simple and understandable view of the process data

Gear unit
The measured gear unit is a 750 kW industrial planetary gear

The data
The condition of the gear units is monitored by the Moventas Condition Management System (CMaS), which uses several sensors to measure
  – count of oil particles
  – vibration
  – rotation speed
  – oil temperature
  – oil pressure
Size of the test data: 2029 × 215
One hour resolution
The domain specialist detected one malfunction in the test data collected from the gear unit

The method
Time series data can be analyzed with many data mining techniques
  – E.g., clustering and dimension reduction provide information about process states or correlations between measurements
With sequence mining, the order of state changes can also be recognized
Combining these with visualization gives an overall view of the different states in the process and the order in which they occurred
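
The sequence mining step can be illustrated with a small sketch. It is not the authors' implementation: it assumes one cluster label per time step, collapses repeated labels into a sequence of state changes, and counts which short subsequences recur. The function names, the window length `n` and the sample labels are hypothetical.

```python
from collections import Counter
from itertools import groupby


def state_changes(labels):
    """Collapse runs of identical labels so that only state changes remain."""
    return [label for label, _ in groupby(labels)]


def recurrent_subsequences(labels, n=3, min_count=2):
    """Count length-n windows of the state-change sequence seen at least min_count times."""
    seq = state_changes(labels)
    windows = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    return {w: c for w, c in windows.items() if c >= min_count}


# Hypothetical hourly cluster labels, not taken from the gear unit data.
labels = [1, 1, 2, 3, 3, 1, 2, 3, 5, 5]
print(state_changes(labels))
print(recurrent_subsequences(labels))
```

Counting fixed-length windows is only one simple way to look for recurring orderings; a dedicated sequence mining algorithm would also handle gaps and variable lengths.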

Mining the Time Series State Changes

Occurrence of Clusters in Timeline
Colors represent clusters
Each cluster corresponds to a particular state
Any clustering method can be applied
Information about within- and between-cluster similarities is lost
Recurrent sequences are still difficult to recognize
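
A timeline view of this kind needs the label sequence turned into contiguous segments, one colored bar per segment. The sketch below shows one way to do that, assuming a list with one cluster label per time step; `timeline_segments` is a hypothetical helper and not part of the presented tool.

```python
from itertools import groupby


def timeline_segments(labels):
    """Return (cluster, start_index, length) for each run of equal labels."""
    segments = []
    start = 0
    for cluster, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((cluster, start, length))
        start += length
    return segments


# Hypothetical labels; each tuple would be drawn as one colored bar.
for segment in timeline_segments([1, 1, 2, 2, 2, 5]):
    print(segment)
```

The segments keep only the cluster identity and timing, which matches the limitation noted on the slide: distances within and between clusters are not visible in this view.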

Implementation
The MATLAB K-means algorithm was used in the clustering step
  – Prototype-based methods provide natural representatives for clusters (prototypes)
  – Easy to modify for incomplete data sets
  – Based on classical statistics, not robust against gross errors
  – Other methods should be tried later when more data is available
Dimension reduction was realized using the MATLAB PCA method
The graphical user interface was programmed in Java using the JFreeChart library
All the written code is open source and licensed under GPLv3
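
To illustrate why a prototype-based method is easy to adapt to incomplete data, here is a minimal K-means sketch in plain Python. It is not the MATLAB routine used in the work. It assumes that missing values are stored as `None`, that distances are computed over the available components only, and that the first `k` rows serve as initial prototypes; all of these are assumptions made for illustration.

```python
def distance(x, c):
    """Squared Euclidean distance over the components of x that are present."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c) if xi is not None)


def kmeans(data, k, iterations=20):
    """Minimal K-means; the first k rows are the initial prototypes (an assumption)."""
    prototypes = [[0.0 if v is None else v for v in row] for row in data[:k]]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: nearest prototype, using available components only.
        labels = [min(range(k), key=lambda j: distance(x, prototypes[j])) for x in data]
        # Update step: per-dimension mean over the values that are present.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            for d in range(len(prototypes[j])):
                present = [x[d] for x in members if x[d] is not None]
                if present:
                    prototypes[j][d] = sum(present) / len(present)
    return prototypes, labels


# Hypothetical two-dimensional data with one missing value.
data = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, None], [1.0, 0.0], [9.0, 10.0]]
prototypes, labels = kmeans(data, k=2)
print(labels)
print(prototypes)
```

The update step uses the mean, which is what makes the method sensitive to gross errors; replacing it with a per-dimension median would be one way to make the prototypes less sensitive to outliers.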

Transition network
(Figure label: Malfunction state)
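
The slide shows the network only as a figure. One plausible construction, assumed here, treats each cluster as a node and each observed change from one cluster to another as a directed edge weighted by how often it occurs. The function name and sample labels are hypothetical.

```python
from collections import Counter


def transition_counts(labels):
    """Count directed transitions between consecutive, different cluster labels."""
    return Counter((a, b) for a, b in zip(labels, labels[1:]) if a != b)


# Hypothetical labels; each key is an edge (from_cluster, to_cluster).
edges = transition_counts([1, 1, 2, 3, 3, 1, 2, 5])
for (source, target), count in sorted(edges.items()):
    print(source, "->", target, count)
```

In such a graph, the states that lead into a malfunction cluster appear as its incoming edges, which is the information needed for proactive surveillance.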

Window for Comparing Clusters
Clusters are compared using one of the vibration variables
The malfunction in cluster 5 can be easily seen

Conclusions
The prototype software was found to be a promising tool for gear unit monitoring
More data from normal behavior and malfunctions is required
More efficient clustering techniques (including missing data treatment) must be evaluated
The visual design must be improved