CS145 Discussion Week 3
Junheng, Shengming, Yunsheng
10/19/2018

Roadmap
- Announcements
  - HW1 due Oct 19, 2018 (Friday, tonight)
  - Package your report, code, and README together and submit through CCLE
- Review: Decision Tree
  - Information Gain
  - Gain Ratio
  - Gini Index
- Review: SVM
  - Linear SVM
  - Soft Margin SVM
  - Non-linear SVM

Decision Tree
Decision tree classification. Example: Play or Not? (the weather dataset)

Decision Tree: Choosing the Splitting Attribute
At each node, the available attributes are evaluated on how well they separate the classes of the training examples. A goodness function is used for this purpose:
- Information Gain
- Gain Ratio
- Gini Index
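Of the three, only the first two are worked through below; for completeness, here is a minimal Python sketch (ours, not from the slides) of how the Gini index would score the same “Outlook” split:

```python
def gini(counts):
    """Gini impurity of a class distribution given as raw counts: 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def gini_split(branches):
    """Weighted Gini impurity after splitting into the given branches."""
    n = sum(sum(b) for b in branches)
    return sum(sum(b) / n * gini(b) for b in branches)

# "Outlook" split of the weather data: sunny [2,3], overcast [4,0], rainy [3,2]
print(gini_split([[2, 3], [4, 0], [3, 2]]))  # ~0.343; lower is purer
```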

A criterion for attribute selection
- Which is the best attribute? The one which will result in the smallest tree.
- Heuristic: choose the attribute that produces the “purest” nodes.
- Popular impurity criterion: information gain.
  - Information gain increases with the average purity of the subsets that an attribute produces.
- Strategy: choose the attribute that results in the greatest information gain.

Entropy of a split
Information in a split with x items of one class and y items of the second class:
info(x, y) = entropy(x/(x+y), y/(x+y)) = -(x/(x+y)) log2(x/(x+y)) - (y/(x+y)) log2(y/(x+y))

Example: attribute “Outlook”
“Outlook” = “Sunny”: 2 and 3 split (2 yes, 3 no)
info([2,3]) = entropy(2/5, 3/5) = -(2/5) log2(2/5) - (3/5) log2(3/5) = 0.971 bits

Outlook = Overcast
“Outlook” = “Overcast”: 4/0 split
info([4,0]) = entropy(1, 0) = -1 log2(1) - 0 log2(0) = 0 bits
Note: log(0) is not defined, but we evaluate 0*log(0) as zero

Outlook = Rainy
“Outlook” = “Rainy”: 3/2 split
info([3,2]) = entropy(3/5, 2/5) = -(3/5) log2(3/5) - (2/5) log2(2/5) = 0.971 bits

Expected Information
Expected information for an attribute: the entropy of each branch, weighted by the fraction of instances reaching it. For “Outlook”:
info([2,3],[4,0],[3,2]) = (5/14)*0.971 + (4/14)*0 + (5/14)*0.971 = 0.693 bits

Computing the information gain
gain = (information before split) - (information after split)
gain(“Outlook”) = info([9,5]) - info([2,3],[4,0],[3,2]) = 0.940 - 0.693 = 0.247 bits
Information gain for the attributes from the weather data:
gain(“Outlook”) = 0.247 bits
gain(“Temperature”) = 0.029 bits
gain(“Humidity”) = 0.152 bits
gain(“Windy”) = 0.048 bits
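These numbers are straightforward to reproduce; the following minimal Python sketch (ours, not from the slides) recomputes the entropies and gains above from the standard [yes, no] counts of the weather data:

```python
from math import log2

def entropy(counts):
    """Entropy (in bits) of a class distribution given as raw counts.
    Zero counts are skipped, i.e. 0 * log2(0) is evaluated as 0."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def expected_info(branches):
    """Weighted average entropy after splitting into the given branches."""
    n = sum(sum(b) for b in branches)
    return sum(sum(b) / n * entropy(b) for b in branches)

# [yes, no] counts per branch for each attribute (weather data: 9 yes / 5 no)
attributes = {
    "Outlook":     [[2, 3], [4, 0], [3, 2]],  # sunny, overcast, rainy
    "Temperature": [[2, 2], [4, 2], [3, 1]],  # hot, mild, cool
    "Humidity":    [[3, 4], [6, 1]],          # high, normal
    "Windy":       [[6, 2], [3, 3]],          # false, true
}

before = entropy([9, 5])  # info([9,5]) = 0.940 bits
for name, branches in attributes.items():
    print(f"gain({name}) = {before - expected_info(branches):.3f} bits")
# gain(Outlook) = 0.247, gain(Temperature) = 0.029,
# gain(Humidity) = 0.152, gain(Windy) = 0.048
```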

Continuing to split
At the “Outlook” = “Sunny” node, the gains for the remaining attributes are:
gain(“Temperature”) = 0.571 bits
gain(“Humidity”) = 0.971 bits
gain(“Windy”) = 0.020 bits
⇒ split on “Humidity”

The final decision tree
Note: not all leaves need to be pure; sometimes identical instances have different classes.
⇒ Splitting stops when data can’t be split any further

Decision Tree: Gain Ratio
GainRatio_A(D) = Gain_A(D) / SplitInfo_A(D)
where SplitInfo_A(D) = -Σ_j (|D_j|/|D|) log2(|D_j|/|D|) over the branches D_j of the split.
Why Gain Ratio? Information gain is biased toward attributes with many distinct values; dividing by the split information corrects for this bias.
See: https://stats.stackexchange.com/questions/306456/how-is-information-gain-biased
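As a quick illustration (our own sketch, not from the slides), the split information depends only on the branch sizes, so the gain ratio of “Outlook”, and the penalty an ID-like attribute would incur, can be computed directly:

```python
from math import log2

def split_info(sizes):
    """Split information: entropy of the branch-size distribution."""
    n = sum(sizes)
    return -sum(s / n * log2(s / n) for s in sizes if s > 0)

# "Outlook" splits the 14 weather instances into branches of size 5, 4, 5
si = split_info([5, 4, 5])
print("SplitInfo(Outlook) =", round(si, 3))          # ~1.577
print("GainRatio(Outlook) =", round(0.247 / si, 3))  # ~0.157
# An ID-like attribute with 14 singleton branches has SplitInfo = log2(14)
# ~ 3.807, which heavily penalizes its (spuriously high) information gain.
print("SplitInfo(ID) =", round(split_info([1] * 14), 3))
```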

Decision Tree
Is the decision boundary of a decision tree linear? No. Each internal node tests a single attribute, so every split is axis-parallel; the overall boundary is a piecewise axis-aligned partition of the space, not a single line or hyperplane.
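A quick way to see this is XOR-labeled data: no single line separates the classes, but two axis-parallel splits do. A minimal sketch (our own illustration on hypothetical data, using scikit-learn):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# XOR: opposite corners share a class, so no line separates the classes
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

tree = DecisionTreeClassifier().fit(X, y)
linear = LogisticRegression().fit(X, y)
print("tree accuracy:  ", tree.score(X, y))    # 1.0 via two axis-parallel splits
print("linear accuracy:", linear.score(X, y))  # 0.5: a line can't separate XOR
```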

Visual Tutorials of Decision Trees https://algobeans.com/2016/07/27/decision-trees-tutorial/

Support Vector Machine
Find a hyperplane w·x + b = 0 that separates the data points.
Among all separating hyperplanes, maximize the margin.
Solution: minimize (1/2)||w||^2 subject to y_i (w·x_i + b) ≥ 1 for all i (a quadratic program).
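The formulas on this slide didn't survive the transcript; for reference, these are the standard primal forms of the hard-margin SVM and of the soft-margin SVM listed on the roadmap:

```latex
% Hard-margin SVM (separable data)
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
\quad\text{s.t.}\quad y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b)\ \ge\ 1 \quad \forall i

% Soft-margin SVM (slack variables \xi_i, trade-off parameter C)
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2} + C\sum_i \xi_i
\quad\text{s.t.}\quad y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b)\ \ge\ 1-\xi_i,\quad \xi_i \ge 0
```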

Margin Formula
Margin lines: w·x + b = 1 and w·x + b = -1 (the hyperplanes through the support vectors).
Distance between parallel lines w·x + b = c1 and w·x + b = c2: |c1 - c2| / ||w||.
Margin = 2 / ||w||.

Linear SVM Example

Plot
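The transcript doesn't preserve the slide's numbers, so here is a minimal Python sketch on made-up toy points (hypothetical data, not the slide's example) showing how a hard-margin linear SVM recovers w, b, the support vectors, and the margin 2/||w||:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical linearly separable toy points (two classes in the plane)
X = np.array([[1, 1], [2, 2], [2, 0], [4, 4], [5, 5], [5, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin linear SVM
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w, " b =", b)
print("support vectors:", clf.support_vectors_)
print("margin = 2/||w|| =", 2 / np.linalg.norm(w))
```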

Non-linear SVM Example

Plot
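Again the slide's data isn't in the transcript; as a stand-in, a minimal Python sketch on concentric circles (hypothetical data) shows an RBF-kernel SVM separating classes that no linear SVM could:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: not linearly separable in the input space
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# The RBF kernel implicitly maps to a space where a separating hyperplane exists
clf = SVC(kernel="rbf", gamma=2.0, C=1.0).fit(X, y)
print("training accuracy:", clf.score(X, y))  # close to 1.0
print("number of support vectors per class:", clf.n_support_)
```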

Visual Tutorials of Decision Trees http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ http://explained.ai/decision-tree-viz/

Visual Tutorials of SVM https://cs.stanford.edu/people/karpathy/svmjs/demo/

Thank you! Q & A