Data Mining: 5. Data Mining Research. Romi Satria Wahono, WA/SMS: +6281586220090



Romi Satria Wahono
- SD Sompok Semarang (1987)
- SMPN 8 Semarang (1990)
- SMA Taruna Nusantara Magelang (1993)
- B.Eng, M.Eng and Ph.D in Software Engineering from Saitama University Japan ( ) Universiti Teknikal Malaysia Melaka (2014)
- Research Interests: Software Engineering, Machine Learning
- Founder and Coordinator of IlmuKomputer.Com
- Researcher at LIPI ( )
- Founder and CEO of PT Brainmatics Cipta Informatika

Course Outline
1. Introduction to Data Mining
2. The Data Mining Process
3. Evaluation and Validation in Data Mining
4. Data Mining Methods and Algorithms
5. Data Mining Research

1. Standard Research Process in Data Mining
2. Common Problems in Data Mining Research
3. Journal Publications on Data Mining

1. Standard Research Process in Data Mining

Data Mining Standard Process (CRISP-DM)
- A cross-industry standard was clearly required that is industry-neutral, tool-neutral, and application-neutral
- The Cross-Industry Standard Process for Data Mining (CRISP-DM) was developed in 1996 (Chapman, 2000)
- CRISP-DM provides a nonproprietary and freely available standard process for fitting data mining into the general problem-solving strategy of a business or research unit

CRISP-DM

1. Business Understanding Phase
- Enunciate the project objectives and requirements clearly in terms of the business or research unit as a whole
- Translate these goals and restrictions into the formulation of a data mining problem definition
- Prepare a preliminary strategy for achieving these objectives

2. Data Understanding Phase
- Collect the data
- Use exploratory data analysis to familiarize yourself with the data and discover initial insights
- Evaluate the quality of the data
- If desired, select interesting subsets that may contain actionable patterns
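As an illustration of the data quality check in this phase, here is a minimal sketch using only the Python standard library. The records, the column names, and the `profile` helper are hypothetical and not part of the slides; a real project would more likely use a dedicated analysis tool.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
records = [
    {"age": 34, "income": 52000, "approved": "yes"},
    {"age": None, "income": 48000, "approved": "no"},
    {"age": 29, "income": None, "approved": "yes"},
    {"age": 51, "income": 75000, "approved": "yes"},
]

def profile(rows):
    """Summarise each column: missing count, distinct count, mean if numeric."""
    summary = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        present = [v for v in values if v is not None]
        info = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Only numeric columns get a mean.
        if present and all(isinstance(v, (int, float)) for v in present):
            info["mean"] = mean(present)
        summary[col] = info
    return summary

report = profile(records)
for col, info in report.items():
    print(col, info)
```

A report like this gives a first impression of which variables have gaps and which subsets may be worth a closer look.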

3. Data Preparation Phase
- Prepare from the initial raw data the final data set that is to be used for all subsequent phases. This phase is very labor intensive
- Select the cases and variables you want to analyze and that are appropriate for your analysis
- Perform transformations on certain variables, if needed
- Clean the raw data so that it is ready for the modeling tools
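The three preparation steps (select, clean, transform) can be sketched as follows. This is only one possible reading: the `prepare` helper and the toy data are hypothetical, dropping incomplete cases is an assumed cleaning strategy (imputation is another), and min-max scaling is an assumed transformation.

```python
def prepare(rows, keep, numeric):
    """Select variables, drop incomplete cases, min-max scale numeric columns."""
    # Select: keep only the variables chosen for the analysis.
    selected = [{k: r[k] for k in keep} for r in rows]
    # Clean: drop cases with any missing value (assumed strategy).
    complete = [r for r in selected if all(v is not None for v in r.values())]
    if not complete:
        return []
    # Transform: rescale each numeric variable to the range [0, 1].
    for col in numeric:
        lo = min(r[col] for r in complete)
        hi = max(r[col] for r in complete)
        span = hi - lo
        for r in complete:
            r[col] = (r[col] - lo) / span if span else 0.0
    return complete

# Hypothetical raw data; None marks a missing value.
raw = [
    {"id": 1, "age": 25, "income": 30000, "label": "no"},
    {"id": 2, "age": 45, "income": None, "label": "yes"},
    {"id": 3, "age": 35, "income": 50000, "label": "yes"},
    {"id": 4, "age": 55, "income": 70000, "label": "yes"},
]

clean = prepare(raw, keep=["age", "income", "label"], numeric=["age", "income"])
for row in clean:
    print(row)
```

The function builds new records rather than editing `raw`, so the original data stays available if the modeling phase later requires a different preparation.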

4. Modeling Phase
- Select and apply appropriate modeling techniques
- Calibrate model settings to optimize results
- Remember that often, several different techniques may be used for the same data mining problem
- If necessary, loop back to the data preparation phase to bring the form of the data into line with the specific requirements of a particular data mining technique
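To make "calibrate model settings" concrete, the sketch below tries several values of k for a small k-nearest-neighbour classifier and keeps the one that scores best on held-out data. The choice of k-NN, the toy points, and all function names are assumptions made for illustration; the slides do not prescribe a technique.

```python
from collections import Counter
from math import dist

# Hypothetical training data: two clusters plus one noisy point labelled "b".
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.8, 3.1), "b"),
    ((1.1, 0.95), "b"),
]
# Hypothetical validation data held out from training.
valid = [
    ((1.1, 1.0), "a"), ((3.1, 3.0), "b"),
    ((1.0, 0.9), "a"), ((2.9, 3.2), "b"),
]

def knn_predict(train, point, k):
    """Predict by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, valid, k):
    """Fraction of validation points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)

# Calibrate the setting k by comparing candidates on the validation data.
scores = {k: accuracy(train, valid, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

With k = 1 the noisy training point misleads one prediction, while larger k values outvote it, which is the kind of effect calibration is meant to expose.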

5. Evaluation Phase
- Evaluate the one or more models delivered in the modeling phase for quality and effectiveness before deploying them for use in the field
- Determine whether the model in fact achieves the objectives set for it in the first phase
- Establish whether some important facet of the business or research problem has not been accounted for sufficiently
- Come to a decision regarding use of the data mining results
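Checking a model against the objective from the first phase might look like the sketch below. The labels, the `evaluate` helper, and the recall target of 0.8 are hypothetical; the actual metric and threshold depend on what the business understanding phase defined.

```python
def evaluate(actual, predicted, positive):
    """Compute accuracy, precision, and recall for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical true labels and model predictions on a test set.
actual = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no", "yes", "no"]

metrics = evaluate(actual, predicted, positive="yes")

# Assumed objective from the business understanding phase.
TARGET_RECALL = 0.8
meets_objective = metrics["recall"] >= TARGET_RECALL
print(metrics, "meets objective:", meets_objective)
```

Here the model falls short of the assumed target, so the decision would be to return to an earlier phase instead of deploying.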

6. Deployment Phase
- Make use of the models created: model creation does not signify the completion of a project
- Example of a simple deployment: generate a report
- Example of a more complex deployment: implement a parallel data mining process in another department
- For businesses, the customer often carries out the deployment based on your model

Exercise
- Study and understand Case Studies 1-5 in Chapter 1 of Larose (2005)
- Study and understand how CRISP-DM is applied in the thesis by Firmansyah (2011) on applying the C4.5 algorithm to determine credit eligibility

2. Common Problems in Data Mining Research

Major Issues in Data Mining Research
Mining Methodology
- Mining various and new kinds of knowledge
- Mining knowledge in multi-dimensional space
- Data mining: an interdisciplinary effort
- Boosting the power of discovery in a networked environment
- Handling noise, uncertainty, and incompleteness of data
- Pattern evaluation and pattern- or constraint-guided mining
User Interaction
- Interactive mining
- Incorporation of background knowledge
- Presentation and visualization of data mining results

Major Issues in Data Mining Research (continued)
Efficiency and Scalability
- Efficiency and scalability of data mining algorithms
- Parallel, distributed, stream, and incremental mining methods
Diversity of Data Types
- Handling complex types of data
- Mining dynamic, networked, and global data repositories
Data Mining and Society
- Social impacts of data mining
- Privacy-preserving data mining
- Invisible data mining

3. Journal Publications on Data Mining

Transactions and Journals
Review Paper (survey and state-of-the-art):
- ACM Computing Surveys (CSUR)
Research Paper (technical):
- ACM Transactions on Knowledge Discovery from Data (TKDD)
- ACM Transactions on Information Systems (TOIS)
- IEEE Transactions on Knowledge and Data Engineering
- Springer Data Mining and Knowledge Discovery
- International Journal of Business Intelligence and Data Mining (IJBIDM)

Cognitive Assignment III
1. Read the papers available at
2. Summarize each paper as slides with the following structure:
   1. Research Background
   2. Problem Statements
   3. Research Questions
   4. Research Objective
   5. Existing Methods
   6. Proposed Method
   7. Results
   8. Conclusion
3. Present them to the class at the next session

References
1. Ian H. Witten, Eibe Frank, Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Elsevier.
2. Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, John Wiley & Sons.
3. Florin Gorunescu, Data Mining: Concepts, Models and Techniques, Springer.
4. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Third Edition, Elsevier.
5. Oded Maimon and Lior Rokach, Data Mining and Knowledge Discovery Handbook, Second Edition, Springer.
6. Warren Liao and Evangelos Triantaphyllou (eds.), Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications, World Scientific.