October 2-3, 2015, İSTANBUL Boğaziçi University Prof.Dr. M.Erdal Balaban Istanbul University Faculty of Business Administration Avcılar, Istanbul - TURKEY.


Project Management in Data Mining
Prof. Dr. M. Erdal Balaban
Istanbul University, Faculty of Business Administration, Avcılar, Istanbul, Turkey
October 2-3, 2015, İstanbul, Boğaziçi University

PRESENTATION OUTLINE
- What is Data Mining?
- Data Mining Environment
- Decision Making Process
- CRISP-DM Methodology
- Phases of Data Mining Process
- Flowchart of Data Mining Process (Proposal)
- Conclusions

What is Data Mining?
- "Data mining is the process of discovering useful patterns and trends in large data sets." (Larose, 2014)
- Data mining makes a difference in many areas: health care, banking, finance, insurance, telecommunications, manufacturing, retail, market research, and the public sector.

Data Mining Environment
[Diagram: Data Mining at the centre, surrounded by the contributing fields Database Technology, Statistics, Machine Learning, Information Science, Visualization, and Other Disciplines.]

Decision Making Process
DATA -> INFORMATION -> KNOWLEDGE -> DECISIONS -> ACTION

CRISP-DM Methodology
- CRISP-DM: CRoss-Industry Standard Process for Data Mining (Shearer, 2000).
- CRISP-DM focuses data mining on rapid model development and deployment to optimize decisions.

CRISP-DM
- The Cross-Industry Standard Process for Data Mining (CRISP-DM) is the dominant data-mining process framework. It's an open standard; anyone may use it. The following list describes the various phases of the process.

[Figure: Tasks (bold) and outputs (italic) of the CRISP-DM reference model]

Data Mining Phases (Proposal Flowchart)
[Flowchart. Boxes: Define Project; Data Gathering; Data Sources; Data Understanding & Data Selection; Data Preprocessing; Data Preparation; Dataset; Training Dataset; Test Dataset; Classification Methods; Clustering Methods or Association Rules; Selecting Algorithm & Model Building; Measuring Model Performance; Evaluation of Model Performance; Evaluate Model; Model Implementation; Knowledge Representation & Decision. Decision node: "Supervised Learning?" with branches Yes and No. Other branch labels: High, Low. Annotation: "Crucial Phase!"]

Planning for a data mining project
- Produce the project plan: list the stages in the project, together with their duration, required resources, and relations.
  - Define the project
  - Prepare data for data mining modeling
  - Separate data into training and testing parts for performance evaluation
  - Apply alternative algorithms to build the model and evaluate the models' performance
  - Implement the model to generate knowledge and make a decision before action
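A project plan of this kind can be sketched as a small dependency graph. The example below is only an illustration: the stage names follow the slide, but the durations (in days) and the strictly sequential dependencies are assumptions made for the example, as are the names `PLAN` and `schedule`. It computes the earliest start and finish of each stage with a forward pass over the stages in dependency order.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: stage -> (duration in days, prerequisite stages).
# Stage names follow the slide; durations are invented for illustration.
PLAN = {
    "define_project": (5, []),
    "prepare_data": (15, ["define_project"]),
    "split_data": (1, ["prepare_data"]),
    "build_and_evaluate": (10, ["split_data"]),
    "implement_model": (4, ["build_and_evaluate"]),
}


def schedule(plan):
    """Return {stage: (earliest_start, earliest_finish)} in days from project start."""
    graph = {name: deps for name, (_, deps) in plan.items()}
    times = {}
    # static_order() yields every stage after all of its prerequisites.
    for name in TopologicalSorter(graph).static_order():
        duration, deps = plan[name]
        start = max((times[d][1] for d in deps), default=0)
        times[name] = (start, start + duration)
    return times


if __name__ == "__main__":
    for stage, (start, finish) in schedule(PLAN).items():
        print(f"{stage:20s} day {start:2d} -> day {finish:2d}")
```

With real projects the dependency lists need not be a single chain; stages that share no prerequisites would simply receive overlapping time windows.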

Define project
- Understand the project objectives and requirements in the first phase of data mining.
- List the assumptions made by the project and the constraints on the project.
- Construct a cost-benefit analysis for the project.

Prepare data for data mining
- Collect the data (or datasets)
- Select data
- Explore data
- Clean the data
- Reformat data
- Transform data
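The select, clean, reformat, and transform steps can be sketched on a toy dataset as follows. This is a minimal illustration, not the presenter's procedure: the records, field names, and helper functions are all hypothetical, rows with missing numeric values are simply dropped (one of several possible cleaning policies), and min-max scaling stands in for "transform". Collecting and exploring the data are left out.

```python
# Hypothetical raw records; field names and values are invented for illustration.
RAW = [
    {"id": 1, "age": "34", "income": "52000", "city": "Istanbul "},
    {"id": 2, "age": "", "income": "61000", "city": "ankara"},
    {"id": 3, "age": "45", "income": "n/a", "city": "Izmir"},
    {"id": 4, "age": "29", "income": "38000", "city": "ISTANBUL"},
]


def select(records, fields):
    """Select: keep only the attributes relevant to the project."""
    return [{f: r[f] for f in fields} for r in records]


def to_number(text):
    """Parse a numeric string; return None for missing or unparseable values."""
    try:
        return float(text)
    except ValueError:
        return None


def clean(records, numeric_fields):
    """Clean: parse numbers and drop rows with a missing numeric value."""
    cleaned = []
    for r in records:
        row = dict(r)
        for f in numeric_fields:
            row[f] = to_number(row[f])
        if all(row[f] is not None for f in numeric_fields):
            cleaned.append(row)
    return cleaned


def reformat(records, text_fields):
    """Reformat: normalise whitespace and letter case of text attributes."""
    return [{**r, **{f: r[f].strip().title() for f in text_fields}} for r in records]


def transform(records, numeric_fields):
    """Transform: min-max scale each numeric attribute to the range [0, 1]."""
    out = [dict(r) for r in records]
    if not out:
        return out
    for f in numeric_fields:
        values = [r[f] for r in records]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # avoid division by zero for constant columns
        for r in out:
            r[f] = (r[f] - low) / span
    return out


def prepare(records):
    """Run the four steps in order on the hypothetical schema above."""
    numeric, text = ["age", "income"], ["city"]
    rows = select(records, numeric + text)
    rows = clean(rows, numeric)
    rows = reformat(rows, text)
    return transform(rows, numeric)
```

Calling `prepare(RAW)` keeps the two complete records and returns them with scaled numeric fields and a consistently formatted city name.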

Separate the dataset for performance evaluation
- Select the evaluation method:
  - Hold-out
  - Cross-validation (k-fold CV)
  - Bootstrapping
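The three evaluation methods differ only in how row indices are divided between training and testing. The sketch below shows one simple way to generate those index sets with the Python standard library; the function names, default fractions, and seeds are assumptions for illustration, and stratification by class is deliberately left out.

```python
import random


def hold_out(n, test_fraction=0.3, seed=0):
    """Hold-out: one random split of indices 0..n-1 into (train, test)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(n, k=5, seed=0):
    """k-fold cross-validation: yield k (train, test) pairs; each index is tested once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def bootstrap(n, seed=0):
    """Bootstrap: draw n indices with replacement; indices never drawn form the test set."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


if __name__ == "__main__":
    train, test = hold_out(10)
    print("hold-out:", len(train), "train /", len(test), "test")
    for fold, (train, test) in enumerate(k_fold(10, k=5)):
        print("fold", fold, "test indices:", sorted(test))
    train, test = bootstrap(10)
    print("bootstrap:", len(set(train)), "distinct train rows /", len(test), "test rows")
```

Hold-out is the cheapest option, k-fold uses every row for testing exactly once, and the bootstrap lets the training sample keep the full size of the dataset at the cost of repeated rows.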

Apply alternative algorithms and select the best model
- There are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of the data.
- Classification algorithms
  - k-Nearest Neighbour (kNN)
  - Naive Bayes
  - Logistic Regression
  - Decision Trees
  - Support Vector Machines
  - Artificial Neural Networks (ANNs)
- Clustering algorithms
- Association algorithms
- The generated models that meet the selected criteria become approved models.
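Comparing alternative algorithms and approving those that meet a criterion can be sketched as below. This is an illustrative example under stated assumptions: the toy data, the 0.8 accuracy threshold, and the names `compare` and `CANDIDATES` are invented, and only kNN (with two values of k) and a majority-class baseline are implemented, using the standard library rather than any particular data mining tool.

```python
import math
from collections import Counter

# Hypothetical toy data: (feature vector, class label); values invented for illustration.
TRAIN = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.3), "a"),
         ((3.0, 3.2), "b"), ((3.1, 2.9), "b"), ((2.8, 3.0), "b")]
TEST = [((1.1, 1.0), "a"), ((0.8, 1.1), "a"), ((3.0, 3.0), "b"), ((2.9, 3.3), "b")]


def majority_baseline(train):
    """Baseline model: always predict the most frequent training class."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def knn(train, k):
    """k-nearest-neighbour model: majority vote among the k closest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def accuracy(model, test):
    """Fraction of test examples the model labels correctly."""
    return sum(model(x) == y for x, y in test) / len(test)


def compare(candidates, train, test, min_accuracy=0.8):
    """Score each candidate on the test set; return (scores, approved names, best name)."""
    scores = {name: accuracy(build(train), test) for name, build in candidates.items()}
    approved = [name for name, score in scores.items() if score >= min_accuracy]
    best = max(scores, key=scores.get)
    return scores, approved, best


# Each candidate maps a name to a function that builds a model from training data.
CANDIDATES = {
    "baseline": majority_baseline,
    "1-NN": lambda train: knn(train, 1),
    "3-NN": lambda train: knn(train, 3),
}

if __name__ == "__main__":
    scores, approved, best = compare(CANDIDATES, TRAIN, TEST)
    print(scores, approved, best)
```

Accuracy is only one possible selection criterion, and choosing the best model on the same test set used to report its performance gives an optimistic estimate; a separate validation set or cross-validation is the usual remedy.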

Implement the model to make a decision
- Creation of the model is generally not the end of the project, even if the purpose of the model is to increase knowledge of the data.
- Apply the model within the organization's decision-making process and then act on it.

CONCLUSIONS
1. Data mining techniques are important for discovering knowledge that is more meaningful and valuable for decision making.
2. A project management approach is important for successful data mining.
3. Each phase of the data mining process is important, but the most important phases are data preparation before modeling and evaluation of model performance after modeling. These crucial phases are often disregarded or skipped in practice.
4. All phases and sub-operations should be planned and scheduled using project management methods for successful data mining.

Thank you very much for your attention.
Are there any questions or suggestions?