Classifiers, Part 3 Week 1, Video 5

Classification  There is something you want to predict (“the label”)  The thing you want to predict is categorical  The answer is one of a set of categories, not a number

In a Previous Class  Step Regression  Logistic Regression  J48/C4.5 Decision Trees

Today  More Classifiers

Decision Rules  Sets of if-then rules which you check in order

Decision Rules Example  IF time < … and knowledge > 0.55 then CORRECT  ELSE IF time < … and knowledge > 0.82 then CORRECT  ELSE IF numattempts > 4 and knowledge < 0.33 then INCORRECT  OTHERWISE CORRECT
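An ordered rule set like the one above can be sketched as a chain of if/elif checks where the first matching rule wins. The thresholds below are illustrative placeholders, not the course's actual values (two of the slide's thresholds did not survive transcription):

```python
def classify(time, numattempts, knowledge):
    """Apply an ordered decision list: rules are checked top to bottom,
    and the first one that matches determines the prediction.
    All thresholds here are made up for illustration."""
    if time < 5 and knowledge > 0.55:
        return "CORRECT"
    elif time < 20 and knowledge > 0.82:
        return "CORRECT"
    elif numattempts > 4 and knowledge < 0.33:
        return "INCORRECT"
    else:
        return "CORRECT"  # the default "otherwise" rule
```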

Many Algorithms  Differences are in terms of how rules are generated and selected  The most popular subcategory (including JRip and PART) repeatedly creates decision trees and distills the best rules from them

Generating Rules from Decision Tree
1. Create a decision tree
2. If there is at least one path that is worth keeping, go to 3; else go to 6
3. Take the “best” single path from root to leaf and make that path a rule
4. Remove all data points classified by that rule from the data set
5. Go to step 1
6. Take all remaining data points
7. Find the most common value for those data points
8. Make an “otherwise” rule using that value
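The steps above amount to sequential covering: find a good rule, strip the points it covers, repeat, then add a default. A rough runnable sketch, with one loud simplification: single-feature threshold rules stand in for the full decision tree that JRip/PART would grow at each round:

```python
from collections import Counter

def learn_decision_list(X, y, min_accuracy=0.9):
    """Sequential covering sketch: pick the best single rule, remove the
    data points it covers, repeat; finish with an 'otherwise' default.
    Rules have the simplified form "IF x[f] <= t THEN label"."""
    rules, rows = [], list(zip(X, y))
    while True:
        best = None  # (accuracy, coverage, feature, threshold, label)
        for f in range(len(rows[0][0])):
            for xi, _ in rows:
                t = xi[f]
                covered = [lab for x, lab in rows if x[f] <= t]
                label, n_right = Counter(covered).most_common(1)[0]
                cand = (n_right / len(covered), len(covered), f, t, label)
                if best is None or cand > best:
                    best = cand
        acc, n, f, t, label = best
        if acc < min_accuracy or n == len(rows):
            break  # no path worth keeping: stop and build the default rule
        rules.append((f, t, label))
        rows = [(x, lab) for x, lab in rows if x[f] > t]
    default = Counter(lab for _, lab in rows).most_common(1)[0][0]
    return rules, default

def predict(rules, default, x):
    for f, t, label in rules:  # rules are checked in order; first match wins
        if x[f] <= t:
            return label
    return default
```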

Relatively conservative  Leads to simpler models than most decision trees

Very interpretable models  Unlike most other approaches

Good when multi-level interactions are common  Just like decision trees

K*  Predicts a data point from neighboring data points  Weights points more strongly if they are nearby

Good when data is very divergent  Lots of different processes can lead to the same result  Intractable to find general rules  But data points that are similar tend to be from the same group
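The "similar points tend to be from the same group" idea can be sketched with a simple instance-based predictor. K* proper uses an entropy-based distance measure; the inverse-distance weighting below is a common, simpler stand-in for "weight nearby points more strongly":

```python
import math

def weighted_neighbor_predict(train, query, k=3):
    """Instance-based prediction in the spirit of K*: keep the whole
    training set (hence the drawback noted below), and let the k nearest
    points vote, each with a weight that shrinks as distance grows.
    Simplification: Euclidean distance with inverse-distance weights,
    rather than K*'s entropy-based distance."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    neighbors = sorted(train, key=lambda pt: dist(pt[0], query))[:k]
    votes = {}
    for x, label in neighbors:
        votes[label] = votes.get(label, 0.0) + 1.0 / (dist(x, query) + 1e-9)
    return max(votes, key=votes.get)
```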

Big Advantage  Sometimes works when nothing else works  Has been useful for my group in detecting emotion from log files (Baker et al., 2012)

Big Drawback  To use the model, you need to have the whole data set

Bagged Stumps  Related to decision trees  Lots of trees, each cut off after its first split (a “stump”), voting together  Relatively conservative  A close variant is Random Forest
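A minimal sketch of the bagged-stumps idea, assuming plain Python and toy (features, label) data: each stump is a single best threshold split trained on a bootstrap resample, and the ensemble predicts by majority vote:

```python
import random
from collections import Counter

def train_stump(data):
    """Fit a one-split 'stump': the feature/threshold pair whose two
    majority-vote leaves classify the most points correctly."""
    best = None
    for f in range(len(data[0][0])):
        for x, _ in data:
            t = x[f]
            left = [lab for xi, lab in data if xi[f] <= t]
            right = [lab for xi, lab in data if xi[f] > t]
            if not right:  # the left side always contains the anchor point
                continue
            l_lab, l_n = Counter(left).most_common(1)[0]
            r_lab, r_n = Counter(right).most_common(1)[0]
            score = l_n + r_n  # points classified correctly by this split
            if best is None or score > best[0]:
                best = (score, f, t, l_lab, r_lab)
    if best is None:  # degenerate resample: fall back to the majority label
        majority = Counter(lab for _, lab in data).most_common(1)[0][0]
        return lambda x: majority
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab

def bagged_stumps(data, n_stumps=25, seed=0):
    """Bagging: train each stump on a bootstrap resample of the data,
    then predict by majority vote across all stumps."""
    rng = random.Random(seed)
    stumps = [train_stump([rng.choice(data) for _ in data])
              for _ in range(n_stumps)]
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]
```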

Common Thread  So far, all the classifiers I’ve discussed are conservative  Find simple models  Don’t over-fit  These algorithms appear to do better for most educational data mining than less conservative algorithms  In brief, educational data has lots of systematic noise

Some less conservative algorithms

Support Vector Machines  Fits the hyperplane that separates the classes with the widest margin, typically after mapping the data into a higher-dimensional feature space (the “kernel trick”)  Creates very sophisticated models  Great for text mining  Great for sensor data  Not optimal for most other educational data  Logs, grades, interactions with software
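Why can a single hyperplane split classes that are tangled in the original space? A toy illustration of the feature-mapping idea behind kernels; the map, weights, and threshold here are all hand-picked for the example (a real SVM learns the maximum-margin hyperplane from data):

```python
def feature_map(x):
    """Map a 1-D point into 2-D: x -> (x, x**2). Points labeled by
    |x| > 2 are not linearly separable on the line, but they are
    separable by one hyperplane in this mapped space."""
    return (x, x * x)

def separates(z):
    """A single linear decision rule w.z + b > 0 in the mapped space,
    with hand-picked w = (0, 1) and b = -4, i.e. x**2 > 4."""
    w, b = (0.0, 1.0), -4.0
    return w[0] * z[0] + w[1] * z[1] + b > 0
```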

Genetic Algorithms  Uses mutation, combination, and natural selection to search the space of possible models  Because the search is stochastic, repeated runs can produce inconsistent answers
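A toy sketch of the mutation-plus-selection loop, where each "individual" is nothing more than a single decision threshold (real genetic algorithms evolve far richer model structures, and this whole example is invented for illustration):

```python
import random

def evolve_threshold(data, generations=40, pop_size=20, seed=1):
    """Toy genetic search for the rule "predict True when x > t".
    Each generation, the fittest half of the population survives
    (selection) and produces perturbed copies (mutation). Because the
    search is random, different seeds can give different answers."""
    rng = random.Random(seed)
    def fitness(t):
        # number of data points the threshold rule classifies correctly
        return sum((x > t) == label for x, label in data)
    pop = [rng.uniform(0, 1) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]                         # selection
        children = [t + rng.gauss(0, 0.05) for t in survivors]   # mutation
        pop = survivors + children
    return max(pop, key=fitness)
```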

Neural Networks  Composes extremely complex relationships through combining “perceptrons”  Finds very complicated models
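The building block being combined is simple. A minimal single perceptron, trained with the classic error-correction rule (networks compose many of these into layers, which is where the very complicated models come from):

```python
def train_perceptron(data, epochs=20, lr=0.1):
    """One perceptron: a weighted sum of inputs pushed through a step
    function. Whenever a prediction is wrong, nudge the weights toward
    the correct answer. Targets are 0 or 1."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in data:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = target - pred  # 0 when correct, +/-1 when wrong
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
```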

Soller & Stevens (2007)

Note  Support Vector Machines, Genetic Algorithms, and Neural Networks are great for some problems  For most types of educational data, they have not historically produced the best solutions

Later Lectures  Goodness metrics for comparing classifiers  Validating classifiers  Classifier conservatism and over-fitting

Next Lecture  A case study in classification