
An example of a data mining project

Problem
Detect and explain faults of a continuous pulp digester.
Faults: drops in the output quality of the digester.

Solution
A report consisting of
– a description of the analyzed data,
– analysis methods,
– results,
– conclusions, and
– process improvement recommendations.

Problem understanding
Several sources of information:
– description of process instrumentation,
– documentation of the digester control system,
– ISO 9000 documents,
– interviews with operating personnel, process engineers, researchers, and automation system vendor engineers.

Data acquisition
– About 200 on-line measurements
– Sampling rate: 1 sample per 10 minutes
– Data stored in an SQL database at the mill

Data acquisition
Data acquisition procedure:
– a shell script run on the SQL host twice a month,
– FTP transfer of the data to HUT through the firewall by a mill computer operator,
– appending of the new data files to the existing ones at HUT using shell scripts.

Data acquisition
Data file format:
value1 checkbits1 timelabel1
value2 checkbits2 timelabel2
...
valueN checkbitsN timelabelN
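
A minimal sketch of reading one channel file in the three-column layout above. The slides do not specify the encoding of the columns, so several details here are assumptions: whitespace-separated fields, integer checkbits, and ISO-style time labels without spaces. The names `Sample` and `parse_channel` are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple


class Sample(NamedTuple):
    value: float
    checkbits: int
    time: datetime


def parse_channel(text):
    """Parse 'value checkbits timelabel' lines into Sample records."""
    samples = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        value, checkbits, timelabel = line.split()
        # Assumption: time labels are ISO-formatted, e.g. 1999-03-01T12:00
        samples.append(
            Sample(float(value), int(checkbits), datetime.fromisoformat(timelabel))
        )
    return samples


# Made-up example data, not taken from the mill database.
raw = """\
151.2 0 1999-03-01T12:00
151.4 0 1999-03-01T12:10
0.0 4 1999-03-01T12:20
"""
samples = parse_channel(raw)
```

Keeping the checkbits next to each value lets the validity check be applied later, during data preparation, rather than at read time.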

Basic data preparation
For each measurement channel:
– check that the measurements are valid using the checkbits,
– check using the timelabels whether any samples are missing; if so, fill the gaps with NaNs.
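
The two checks above could be sketched roughly as follows. This is an illustration, not the project's actual scripts. It assumes that a checkbits value of 0 marks a valid sample, that invalid samples are also replaced by NaN, and that time labels fall exactly on the 10-minute grid; the function name `to_regular_grid` is hypothetical.

```python
import math
from datetime import datetime, timedelta

STEP = timedelta(minutes=10)  # sampling interval stated in the slides


def to_regular_grid(samples, step=STEP):
    """samples: list of (value, checkbits, time) tuples sorted by time.

    Returns (times, values) on a regular grid from the first to the last
    time label. Missing or invalid samples become NaN.
    """
    # Assumption: checkbits == 0 means the measurement is valid.
    valid = {t: v for v, bits, t in samples if bits == 0}
    times, values = [], []
    t = samples[0][2]
    end = samples[-1][2]
    while t <= end:
        times.append(t)
        values.append(valid.get(t, math.nan))  # gap or invalid -> NaN
        t += step
    return times, values


# Made-up data: the 12:10 sample is flagged invalid, the 12:20 sample is missing.
samples = [
    (151.2, 0, datetime(1999, 3, 1, 12, 0)),
    (0.0, 4, datetime(1999, 3, 1, 12, 10)),
    (150.9, 0, datetime(1999, 3, 1, 12, 30)),
]
times, values = to_regular_grid(samples)
```

After this step every channel has one entry per 10-minute slot, so channels can be compared index by index.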

Data survey
Visual data inspection (time series plots) revealed some problems:
– some measurements didn’t work at all,
– some measurements worked properly, but not all the time,
– changes in production speed could be seen in most measurements, and
– process tuning altered the behavior of some measurements.

Data survey
– Computing material balances provides a way to roughly estimate the reliability of some sensors.
– The process delay from the input to the output of the digester is about three hours.
→ The delays between measurements in different parts of the process had to be compensated.
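
As a rough illustration of delay compensation: at 1 sample per 10 minutes, a delay of about three hours corresponds to about 18 samples. The sketch below shifts an input-side series by that amount so that it lines up with an output-side series. The helper `delay` is hypothetical, and using a single fixed lag is an assumption; the slides only say that delays between different parts of the process had to be compensated.

```python
import math

SAMPLES_PER_HOUR = 6                   # 1 sample / 10 minutes
DELAY_SAMPLES = 3 * SAMPLES_PER_HOUR   # about three hours -> 18 samples


def delay(series, lag):
    """Shift a series `lag` samples later in time (lag >= 0).

    The first `lag` positions are filled with NaN and the result has the
    same length as the input.
    """
    padded = [math.nan] * lag + list(series)
    return padded[:len(series)]


# Made-up input-side series covering four hours (24 samples).
inlet = [float(i) for i in range(24)]
aligned = delay(inlet, DELAY_SAMPLES)
# aligned[k] now holds the inlet value measured 18 samples (3 h) before index k.
```

Measurements taken at intermediate points of the digester would need their own, shorter lags.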

Data survey
To get reliable results, only periods with constant production speed should be analyzed.
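
One possible way to pick out periods of constant production speed is sketched below. The criterion (the range of the speed signal stays within a tolerance for a minimum number of samples) and the parameters `tol` and `min_len` are assumptions made for illustration; the slides do not say how the periods were selected.

```python
import math


def steady_periods(speed, tol, min_len):
    """Return (start, end) index pairs, end exclusive, of runs in which
    max - min of `speed` stays within `tol` for at least `min_len` samples.
    A NaN sample always ends the current run.
    """
    periods = []
    start = None
    lo = hi = 0.0
    # A trailing NaN acts as a sentinel that closes the last run.
    for i, v in enumerate(list(speed) + [math.nan]):
        if start is not None and not math.isnan(v) and max(hi, v) - min(lo, v) <= tol:
            lo, hi = min(lo, v), max(hi, v)  # sample fits, extend the run
            continue
        # The current run (if any) ends at index i.
        if start is not None and i - start >= min_len:
            periods.append((start, i))
        if math.isnan(v):
            start = None
        else:
            start, lo, hi = i, v, v  # begin a new run at this sample
    return periods


# Made-up production speed signal with two steady levels.
speed = [100, 100.5, 99.8, 100.2, 120, 121, 120.5, 120.8, 121.1, 90]
periods = steady_periods(speed, tol=2.0, min_len=4)
```

In practice `min_len` would have to be well above the 18-sample process delay for a period to be useful.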

Data modeling
– First, only the temperature measurements in the digester sides were used.
– Basic idea: estimate the movement of chips using correlations between neighboring measurements.
→ Failed
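
The basic idea can be illustrated with a lagged correlation between two neighboring sensors: the lag with the highest correlation is taken as the travel time between them. This is only a sketch of the general technique on synthetic data. The slides report that the approach failed on the real digester data and do not describe the computation that was used. The names `pearson` and `best_lag` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan  # constant series: correlation undefined
    return sxy / math.sqrt(sxx * syy)


def best_lag(upstream, downstream, max_lag):
    """Lag in samples (0..max_lag) at which `downstream` correlates best
    with `upstream`."""
    scores = {}
    for lag in range(max_lag + 1):
        x = upstream[:len(upstream) - lag]
        y = downstream[lag:]
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get)


# Synthetic example: the downstream sensor sees the same signal 4 samples later.
upstream = [math.sin(0.3 * i) for i in range(60)]
downstream = [0.0] * 4 + upstream[:-4]
lag = best_lag(upstream, downstream, max_lag=10)
```

On clean synthetic data the lag is recovered easily; real temperature signals need not carry such a distinct pattern from one sensor to the next.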

Data modeling
– Next, all available measurements were used.
– The measurements were reduced to those best depicting the state of the digester.
– The reduction was carried out using
  – process knowledge,
  – data visualization, and
  – correlation analysis.
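
Of the three reduction tools, only correlation analysis lends itself to a short sketch. The greedy rule below drops a channel when it is almost perfectly correlated with one that is already kept. The rule itself, the 0.95 threshold, and the channel names are assumptions; the slides do not say how the correlation analysis was applied, and process knowledge and visualization are not modeled here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan  # constant series: correlation undefined
    return sxy / math.sqrt(sxx * syy)


def drop_redundant(channels, threshold=0.95):
    """channels: dict mapping channel name -> list of values.

    Keep a channel only if its absolute correlation with every channel
    kept so far is below `threshold`.
    """
    kept = []
    for name, values in channels.items():
        if all(abs(pearson(values, channels[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up channels: T2 nearly duplicates T1, flow is unrelated to both.
channels = {
    "T1": [1, 2, 3, 4, 5, 6],
    "T2": [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],
    "flow": [5, 1, 4, 2, 6, 3],
}
kept = drop_redundant(channels)
```

The result depends on the order of the channels, which is one reason such a filter can only support, not replace, the process knowledge mentioned above.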

Data modeling
– During the project, a digester modeling expert was consulted.
– A model depicting the fault sensitivity of the digester was created.