These slides are additional material for TIES445. Data Mining, Lecture 1. TIES445 Data mining, Nov-Dec 2007. Sami Äyrämö.

Data Mining lectures (on weeks 44-50)
Mondays 12:15-14:00
Tuesdays 10:15-12:00
NOTE: No lectures on week 47
3 x 2h demonstrations (one week in a computer classroom)
Final exam in January
cr without seminar work
5 cr with seminar work (will be held in January 2008)

About lectures
The lectures are based on:
Han and Kamber (based on Data Mining: Concepts and Techniques)
Tan, Steinbach and Kumar (based on Introduction to Data Mining)
Some slides by the lecturer

Literature
• P-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Addison Wesley,
• J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann,
• D. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, MIT Press,
• D. Pyle, Data Preparation for Data Mining, Morgan Kaufmann,
• M. Berry, Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Wiley,
• T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag,
• U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Advances in Knowledge Discovery and Data Mining, MIT Press,
• M.H. Dunham, Data Mining Introductory and Advanced Topics, Prentice Hall,
• I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann,
• J.P. Bigus, Data Mining with Neural Networks, McGraw-Hill,
• J-M. Adamo, Data Mining for Association Rules and Sequential Patterns: Sequential and Parallel Algorithms, Springer-Verlag,
• H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining, Kluwer, 1998.

Theses, publications etc.
M. Pechenizkiy, Feature Extraction for Supervised Learning in Knowledge Discovery Systems, PhD thesis, University of Jyväskylä,
S. Äyrämö, Knowledge Mining using Robust Clustering, PhD thesis, University of Jyväskylä,
J. Mäkinen, Roskapostin älykäs suodattaminen [Intelligent filtering of spam], Master's thesis, University of Jyväskylä,
M. Nurminen, Tiedonlouhinta rakenteisista dokumenteista [Data mining from structured documents], Master's thesis, University of Jyväskylä,
K. Arkko, Assosiaatioiden ja sekvenssien louhinta suurista tietomassoista [Mining associations and sequences from large data masses], Master's thesis, University of Jyväskylä,
J. Hänninen, Batch- ja online-hermoverkko-opetusalgoritmien ominaisuudet ja eroavaisuudet [Properties and differences of batch and online neural network training algorithms], Master's thesis, University of Jyväskylä,
Kärkkäinen, T., MLP-network in a layer-wise form with applications to weight decay. Neural Computation, 14 (6), ,
Kärkkäinen, T. & Heikkola, E., Robust Formulations for Training Multilayer Perceptrons. Neural Computation, 16 (4), ,
Kärkkäinen, T. and Äyrämö, S., Robust Clustering Methods for Incomplete and Erroneous Data, in Data Mining V: Data Mining, Text Mining and their Business Applications,
Äyrämö, S., Kärkkäinen, T. & Majava, K., Robust refinement of initial prototypes for partitioning-based clustering algorithms. In C. Skiadas (Ed.), Recent Advances in Stochastic Modeling and Data Analysis, pp , World Scientific,
and many more!

Journals, conferences,…
• Journals
  – Data Mining and Knowledge Discovery, Springer
  – The Transactions on Knowledge Discovery from Data (TKDD), ACM
  – IEEE Transactions on Knowledge and Data Engineering, IEEE
  – SIGKDD Explorations
  – Statistical Analysis and Data Mining, Wiley
  – Data & Knowledge Engineering, Elsevier
  – Computational Statistics & Data Analysis, Elsevier
• Conferences, seminars, workshops
  – ACM SIGKDD, PKDD, PAKDD, (IEEE) ICDM, SIAM data mining (SDM), DMIN,...
  – ICTAI, IJCAI, VLDB, ICDE, ICML, CVPR, MSR,...

Sample application
[Figure: diagram with the labels Control data, Process data, Quality, Feedback, Customer, Manager, Operator, Laborant]

Real-world data set

Mining Large Data Sets - Motivation
R. Grossman (2001): “During the next decade, the amount of data will continue to explode, while the number of scientists and engineers available to analyze it will remain essentially constant.”
P.S. Bradley (2003): “The ability of organizations to effectively utilize this information for decision support typically lags behind their ability to collect and store it. But, organizations that can leverage their data for decision support are more likely to have a competitive edge in their sector of the market.”

Knowledge Mining (KM) process

Origins of Data Mining
• Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
[Figure: diagram relating Data Mining to Database systems, Statistics/Numerical optimization, Machine Learning/Pattern Recognition/Artificial Intelligence, and Visualization]

Major Issues and Challenges in DM/KDD
• Mining methodology
  – Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web
  – Algorithmic requirements: performance (efficiency, scalability, robustness, reliability)
  – High dimensionality, complex and heterogeneous data
  – Pattern evaluation: the interestingness problem
  – Incorporation of background knowledge
  – Data quality: handling noise and incomplete data (robustness, reliability)
  – Parallel, distributed and incremental mining methods
  – Integration of the discovered knowledge with existing knowledge: knowledge fusion
  – Data ownership and distribution
• User interaction
  – Expression and visualization of data mining results
  – Interactive mining of knowledge at multiple levels of abstraction
• Applications and social impacts
  – Domain-specific data mining & invisible data mining
  – Protection of data security, integrity, and privacy
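
The data-quality issue listed above (noise and incomplete data) can be illustrated with a minimal sketch. It is not taken from the lectures: the readings are made-up values, the helper name `clean` is hypothetical, and the choice of the median as a robust alternative to the mean is an assumption made only for illustration.

```python
import math
import statistics


def clean(values):
    """Drop missing entries (None or NaN) and return the remaining numbers."""
    return [v for v in values if v is not None and not math.isnan(v)]


# Hypothetical measurements: two missing values and one gross error (250.0).
readings = [10.1, 9.8, None, 10.3, float("nan"), 250.0, 9.9]

# Incomplete data: keep only the values that were actually observed.
observed = clean(readings)

# Noise: the mean is pulled towards the outlier, the median is not.
mean_estimate = statistics.mean(observed)
median_estimate = statistics.median(observed)

print(f"observed values: {observed}")
print(f"mean   = {mean_estimate:.2f}")
print(f"median = {median_estimate:.2f}")
```

With these values the single erroneous reading moves the mean to about 58.02, while the median stays at 10.1, close to the bulk of the data. Simply discarding the missing entries is the crudest way to handle incomplete data; it is used here only to keep the sketch short.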