
Presentation transcript:

Some working definitions
‘Data Mining’ and ‘Knowledge Discovery in Databases’ (KDD) are used interchangeably
Data mining = the discovery of interesting, meaningful and actionable patterns hidden in large amounts of data
Multidisciplinary field originating from artificial intelligence, pattern recognition, statistics, machine learning, econometrics, …

Data mining is a process
Business objectives
Model development
- Model objective
- Data collection & preparation
- Model construction
- Model evaluation
- Combining models with business knowledge into decision logic
Model / decision logic deployment
Model / decision logic monitoring

Data mining is a process: a marketing example
Business objectives
- Cross-sell MMS bundle to lapsed users / non-users
Model development
- Model objective: for consumers with no MMS bundle in the past 6 months, predict MMS bundle ownership (yes/no) in the next three months
- Data collection & preparation: all fields for all active customers as of end APR05; remove all customers with an MMS bundle in NOV04-APR05; left join the MMS Bundle field from MAY05, JUNE05, JULY05
- Model construction: build various models to predict MMS Bundle MAY or JUNE or JULY = ‘N’ on 70% of the data
- Model evaluation: evaluate predictive power on the 70% of the data used for model development and on the 30% test set
- Combining models with business knowledge into decision logic: target the top 30% and randomly test two propositions (50 MMS for 5 euro; 100 MMS for 7.50 euro) across two channels (direct mail and SMS)
Model / decision logic deployment
- Run the campaign
Model / decision logic monitoring
- Compare predictions against actual response to evaluate model quality and robustness
- Determine which propositions / channels work best
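The mechanical parts of the marketing example (the 70% / 30% split, targeting the top 30% by score, and randomly assigning propositions and channels) can be sketched roughly as follows. This is only an illustration: the function names, the customer ids and the placeholder scores are assumptions and do not come from the slides, and a real campaign would use scores produced by the fitted model.

```python
import random

# Propositions and channels as named on the slide.
PROPOSITIONS = ["50 MMS for 5 euro", "100 MMS for 7.50 euro"]
CHANNELS = ["direct mail", "SMS"]


def split_train_test(customers, train_fraction=0.7, seed=0):
    """Shuffle the customers and split them into development and test sets."""
    rng = random.Random(seed)
    shuffled = list(customers)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def select_top_fraction(scored, fraction=0.3):
    """Return the ids of the highest-scoring customers.

    `scored` is a list of (customer_id, score) pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    n_target = round(len(ranked) * fraction)
    return [customer_id for customer_id, _ in ranked[:n_target]]


def assign_test_cells(customer_ids, seed=0):
    """Randomly give each targeted customer one proposition and one channel."""
    rng = random.Random(seed)
    return {
        customer_id: (rng.choice(PROPOSITIONS), rng.choice(CHANNELS))
        for customer_id in customer_ids
    }


# Hypothetical usage: 1000 customer ids and random placeholder scores.
train, test = split_train_test(range(1000))
placeholder_rng = random.Random(1)
scored = [(customer_id, placeholder_rng.random()) for customer_id in test]
targets = select_top_fraction(scored)
cells = assign_test_cells(targets)
print(len(train), len(test), len(targets))
```

Assigning propositions and channels at random is what makes the later comparison of "what works best" fair: each cell receives a comparable mix of customers.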

Data mining tasks
Undirected, explorative, descriptive, ‘unsupervised’ data mining
- Matching & search
- Profile & rule extraction
- Clustering & segmentation; dimension reduction
Directed, predictive, ‘supervised’ data mining
- Predictive modeling

Data mining task example: Clustering & segmentation

Start Looking Glass
Source: Sentient Information Systems

Intermediate result, Looking Glass
Source: Sentient Information Systems

Result, Looking Glass
Source: Sentient Information Systems

Result, Looking Glass
Source: Sentient Information Systems

Data mining task example: predictive modeling
[Figure labels: Past experience; Data; Behaviour (Good / Bad); Model; Score (from worse business to better business); Case A; Case B]

Collected data

Data mining task example: predictive modeling
score = (0 x Income) + (-1 x Age) + (25 x Children)

Data mining techniques for predictive modeling
- Linear and logistic regression
- Decision trees
- Neural networks
- Nearest neighbor
- Genetic algorithms
- …

Linear Regression Models
score = (0 x Income) + (-1 x Age) + (25 x Children)
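As a small worked example, the score formula can be written as a weighted sum. The weights are the ones on the slide; the customer record and the names `WEIGHTS` and `score` are made up for illustration.

```python
# Weights taken from the slide: 0 for income, -1 for age, 25 per child.
WEIGHTS = {"income": 0, "age": -1, "children": 25}


def score(customer):
    """Linear score: the sum of weight * attribute value."""
    return sum(weight * customer[name] for name, weight in WEIGHTS.items())


# Hypothetical customer, not from the slides.
customer = {"income": 40000, "age": 35, "children": 2}
print(score(customer))  # 0*40000 + (-1)*35 + 25*2 = 15
```

With a weight of 0, income does not influence the score at all in this example; only age and number of children do.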

Regression in pattern space
[Figure: pattern space with axes age and income; class ‘circle’ and class ‘square’]
Only a single line is available in pattern space to separate the classes

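Decision Trees
[Tree diagram, partly lost in extraction. Recoverable labels: root node of customers with response 1%; splits "Income > 150000?", "Purchases > 10?" and "balance > 50000?"; a node of 1200 customers; a node of 800 customers with response 1.8%; a node of 400 customers with response 0.1%; branches labelled yes / no]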

Decision Trees in Pattern Space
[Figure: pattern space with axes age and income]
Line pieces perpendicular to the axes
Each line is a split in the tree: two answers to a question

Decision Trees in Pattern Space
[Figure: pattern space with axes labelled age and weight]
The goal of the classifier is to separate the classes (circle, square) on the basis of the attributes age and income
Each line corresponds to a split in the tree
Decision areas are ‘tiles’ in pattern space
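One way to see why the decision areas are tiles: every split tests a single attribute against a threshold, so a tree is a set of nested conditions. The sketch below uses invented thresholds and class assignments purely for illustration; none of the numbers come from the slides.

```python
def classify(age, income):
    """Tiny hand-written decision tree with hypothetical thresholds.

    Each `if` tests one attribute, which corresponds to a line
    perpendicular to that attribute's axis in pattern space.
    """
    if income > 50000:          # split on the income axis
        if age > 40:            # split on the age axis
            return "circle"
        return "square"
    if age > 60:                # second split on the age axis
        return "square"
    return "circle"


print(classify(age=45, income=60000))
print(classify(age=30, income=60000))
```

Each leaf of the tree covers one rectangular region: for example, the first leaf covers every point with income above 50000 and age above 40.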

Nearest Neighbour
The data itself is the classification model, so there is no abstraction such as a tree
For a given instance x, search for the k instances that are most similar to x
Classify x as the most frequently occurring class among the k most similar instances
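The procedure above can be sketched in a few lines. This minimal version assumes Euclidean distance as the similarity measure and uses a made-up training set of (age, weight) points; the slides do not prescribe a distance function, and in practice attributes are usually rescaled first so that no single attribute dominates the distance.

```python
import math
from collections import Counter


def knn_classify(x, training, k=3):
    """Classify x by majority vote among its k nearest training instances.

    `training` is a list of (attributes, label) pairs, where attributes
    is a tuple of numbers with the same length as x.
    """
    neighbours = sorted(training, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (age, weight) -> class.
training = [
    ((25, 60), "circle"), ((30, 65), "circle"), ((28, 70), "circle"),
    ((55, 90), "square"), ((60, 85), "square"), ((58, 95), "square"),
]

print(knn_classify((27, 66), training))  # circle
print(knn_classify((57, 88), training))  # square
```

There is no training step: all the work happens at classification time, which is why enough data must be kept available.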

Nearest Neighbor in Pattern Space: classification
[Figure: pattern space with axes e.g. age and e.g. weight; the legend marks the new instance]
Any decision area is possible
Condition: enough data is available

Nearest Neighbor in Pattern Space: prediction
[Figure: pattern space with axes e.g. age and e.g. weight]
Any decision area is possible
Condition: enough data is available

Example classification algorithm 3: Neural Networks
Inspired by neuronal computation in the brain (McCulloch & Pitts 1943 (!))
Input (attributes) is coded as activation on the input-layer neurons; activation feeds forward through a network of weighted links between neurons and causes activations on the output neurons (for instance, diabetic yes/no)
The algorithm learns to find optimal weights using the training instances and a general learning rule

Neural Networks
Example simple network (2 layers)
Probability of being diabetic = f(age * weight_age + body_mass_index * weight_body_mass_index)
[Figure: input neurons age and body_mass_index, connected through weight_age and weight_body_mass_index to the output neuron ‘Probability of being diabetic’]
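The two-layer formula can be sketched as below. The slide does not say which function f is; a logistic (sigmoid) function is assumed here because it maps any weighted sum to a value between 0 and 1. The weight values and the `bias` term are hypothetical additions for illustration and would normally be learned from training instances.

```python
import math


def logistic(z):
    """Assumed choice for f: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def p_diabetic(age, body_mass_index,
               weight_age=0.03, weight_bmi=0.10, bias=-5.0):
    """Output activation of a network with two inputs and one output neuron.

    All default weights and the bias are made-up values, not from the slides.
    """
    weighted_sum = age * weight_age + body_mass_index * weight_bmi + bias
    return logistic(weighted_sum)


print(p_diabetic(age=50, body_mass_index=32))
print(p_diabetic(age=25, body_mass_index=21))
```

Because the assumed f only increases with its input, the output crosses 0.5 exactly where the weighted sum equals zero, and that condition describes a straight line in the (age, body mass index) plane.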

Neural Networks in Pattern Space: classification
[Figure: pattern space with axes e.g. age and e.g. weight]
Simple network: only a line is available (why?) to separate the classes
Multilayer network: any classification boundary is possible

Dilbert’s Perspective on Data Mining