Today’s Topics
HW1 Due 11:55pm Today (no later than next Tuesday)
HW2 Out, Due in Two Weeks
Next Week We’ll Discuss the Make-Up Midterm – Be Sure to Check! Forward to your Work?
When is 100 < 99? (Unrelated to AI)
Unstable Algorithms (mentioned on a slide last week)
D-Tree Wrapup
What ‘Space’ does ID3 Search? (Transition to new AI topic: SEARCH)
9/29/15, CS 540 Fall 2015 (Shavlik©), Lecture 8, Week 4

Unstable Algorithms
An idea from the statistics community
An ML algorithm is unstable if small changes to the training set can lead to large changes in the learned model
D-trees are unstable, since one different example can change the root
k-NN is stable, since the impact of each example is local
Ensembles work best with unstable algorithms (see the sketch below)
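To make the instability point concrete, here is a minimal sketch (assuming scikit-learn is available; the synthetic dataset and the single flipped label are made up for illustration) that changes one training example and measures how many test predictions change for a decision tree versus k-NN:

```python
# Minimal sketch (made-up data): flip ONE training label and compare how much
# a decision tree vs. k-NN changes its predictions on a fixed test set.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.RandomState(0)
X = rng.rand(200, 5)                       # 200 examples, 5 features
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # simple synthetic concept

X_perturbed, y_perturbed = X.copy(), y.copy()
y_perturbed[0] ^= 1                        # flip the label of one training example

X_test = rng.rand(1000, 5)

models = {"d-tree": lambda: DecisionTreeClassifier(random_state=0),
          "k-NN":   lambda: KNeighborsClassifier()}
for name, make in models.items():
    before = make().fit(X, y).predict(X_test)
    after  = make().fit(X_perturbed, y_perturbed).predict(X_test)
    print(name, "fraction of test predictions that changed:",
          np.mean(before != after))
```

Typically the tree's predictions shift noticeably more than k-NN's, which is why bagging and random forests help trees so much.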

ID3 Recap: Questions Addressed
How closely should we fit the training data?
–Fit completely, then prune
–Use tuning sets to score candidates
–Learn forests, and there is no need to prune! Why?
How do we judge features?
–Use info theory (Shannon); see the entropy/gain sketch after this slide
What if a feature has many values?
–Convert it to Boolean-valued features
D-trees can also handle missing feature values (but we won’t cover this for d-trees)
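As a refresher on "judge features with info theory," here is a minimal sketch of entropy and information gain for a discrete feature (the feature values and labels below are a made-up illustration, not data from the lecture):

```python
# Minimal sketch: entropy H(S) and information gain Gain(S, F).
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum over classes c of p_c * log2(p_c)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    """Gain(S, F) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [y for f, y in zip(feature_values, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Made-up example: an 'Outlook'-style feature vs. +/- labels.
feature = ["sunny", "sunny", "rain", "rain", "overcast", "overcast"]
labels  = ["+",     "-",     "+",    "-",    "+",        "+"]
print(info_gain(feature, labels))
```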

ID3 Recap (cont.)
What if some features cost more to evaluate (eg, CAT scan vs. temperature)?
–Use an ad-hoc correction factor
Best way to use d-trees in an ensemble?
–Random forests often perform quite well (see the sketch after this slide)
Batch vs. incremental (aka online) learning?
–ID3 is basically a ‘batch’ approach
–Incremental variants exist, but since ID3 is so fast, why not simply rerun ‘from scratch’ whenever a mistake is made? (Looks like a d-tree!)
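For the "random forests often perform quite well" point, here is a minimal usage sketch (assuming scikit-learn; the dataset is made up) comparing a single tree to a forest with 10-fold cross validation:

```python
# Minimal sketch (made-up data): one decision tree vs. a random forest.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(300, 10)
y = (X[:, 0] > X[:, 1]).astype(int)   # synthetic concept with a diagonal boundary

tree   = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

print("single tree  :", cross_val_score(tree,   X, y, cv=10).mean())
print("random forest:", cross_val_score(forest, X, y, cv=10).mean())
```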

ID3 Recap (cont.)
What about real-valued outputs?
–Could learn a linear approximation for each of various regions of the feature space, eg, as in the figure below (see also the sketch after this slide)
How rich is our language for describing examples?
–Limited to fixed-length feature vectors (but they are surprisingly effective)
[Figure: Venn-style sketch of the feature space partitioned into regions, each region labeled with its own linear function of the features (f1, f2, f4, ...).]
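A minimal sketch of the "linear approximation per region" idea (a crude model tree on a made-up 1-D dataset, not the exact method from the lecture): use a shallow regression tree to carve the feature space into regions, then fit a separate linear model inside each region.

```python
# Minimal sketch (made-up data): piecewise-linear regression =
# shallow tree to define regions + one linear model per region (leaf).
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.where(X[:, 0] < 0, 2 * X[:, 0] + 1, -X[:, 0] + 4) + rng.normal(0, 0.1, 200)

regions = DecisionTreeRegressor(max_depth=2).fit(X, y)   # up to 4 regions
leaf_ids = regions.apply(X)                              # leaf index per example

# Fit one linear model per region (leaf).
models = {leaf: LinearRegression().fit(X[leaf_ids == leaf], y[leaf_ids == leaf])
          for leaf in np.unique(leaf_ids)}

# Predict: route each new example to its leaf, then use that leaf's linear model.
X_new = np.array([[-1.5], [2.0]])
preds = [models[leaf].predict(x.reshape(1, -1))[0]
         for x, leaf in zip(X_new, regions.apply(X_new))]
print(preds)
```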

Summary of ID3
Strengths
–Good technique for learning models from ‘real world’ (eg, noisy) data
–Fast, simple, and robust
–Potentially considers the complete hypothesis space
–Successfully applied to many real-world tasks
–Results (trees or rules) are human-comprehensible
–One of the most widely used techniques in data mining

Summary of ID3 (cont.)
Weaknesses
–Requires fixed-length feature vectors
–Only makes axis-parallel (univariate) splits
–Not designed to make probabilistic predictions
–Non-incremental
–Hill-climbing algorithm (poor early decisions can be disastrous)
However, extensions exist for each of these.

A Sample Search Tree
Motivation: so we can use another search method besides hill climbing (the ‘greedy’ algorithm)
Nodes are PARTIALLY COMPLETE D-TREES
Expand the ‘left-most’ (in yellow) question mark (?) of the current node
All possible trees can be generated (given the thresholds ‘implied’ by the real values in the train set)
[Figure: search tree whose root is a single ‘?’ leaf; operators such as “Add F1”, “Add F2”, ..., “Add FN”, or “Create leaf node (+ or -)” replace the left-most ‘?’, producing children such as a tree rooted at F2 with two ‘?’ branches; assume F2 scores best.]

Viewing ID3 as a Search Algorithm (filled in on the next slide)
–Search Space
–Operators
–Search Strategy
–Heuristic Function
–Start Node
–Goal Node

Viewing ID3 as a Search Algorithm
–Search Space: the space of all decision trees constructible using the current feature set
–Operators: add a node (ie, grow the tree)
–Search Strategy: hill climbing
–Heuristic Function: information gain (other d-tree algorithms use similar ‘purity measures’)
–Start Node: an isolated leaf node marked ‘?’
–Goal Node: a tree that separates all the training data (‘post-pruning’ may be done later to reduce overfitting)
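To tie the table together, here is a minimal, greedy ID3-style skeleton (a sketch, not Mitchell's exact pseudocode; it assumes discrete features and represents each example as a dict of feature values plus a label, both of which are choices made for this illustration):

```python
# Minimal greedy ID3-style skeleton: hill climbing on information gain.
# Returns a nested dict {feature: {value: subtree_or_label}} or a leaf label.
from collections import Counter
from math import log2

def entropy(examples):
    n = len(examples)
    return -sum((c / n) * log2(c / n)
                for c in Counter(label for _, label in examples).values())

def split(examples, feature):
    groups = {}
    for ex in examples:
        groups.setdefault(ex[0][feature], []).append(ex)
    return groups

def info_gain(examples, feature):
    n = len(examples)
    remainder = sum(len(sub) / n * entropy(sub)
                    for sub in split(examples, feature).values())
    return entropy(examples) - remainder

def id3(examples, features):
    labels = [label for _, label in examples]
    if len(set(labels)) == 1 or not features:        # pure node or no features left
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    best = max(features, key=lambda f: info_gain(examples, f))  # greedy step
    return {best: {value: id3(subset, features - {best})
                   for value, subset in split(examples, best).items()}}

# Made-up toy data.
data = [({"outlook": "sunny", "windy": False}, "yes"),
        ({"outlook": "sunny", "windy": True},  "no"),
        ({"outlook": "rain",  "windy": False}, "yes"),
        ({"outlook": "rain",  "windy": True},  "no")]
print(id3(data, {"outlook", "windy"}))
```

Note how the pieces line up with the table: the start node is the implicit empty tree, the operator is the recursive call that grows one node, the heuristic is info_gain, and the search strategy is the greedy max over features.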

What We’ve Covered So Far
Algorithms
–Supervised ML algorithms: instance-based (kNN); logic-based (ID3, decision stumps); ensembles (random forests, bagging, boosting)
Methodology Issues
–Train/tune/test sets, N-fold cross validation
–Feature space, (greedily) searching hypothesis spaces
–Parameter tuning (‘model selection’), feature selection (info gain)
–Dealing with real-valued and hierarchical features
–Overfitting reduction, Occam’s razor
–Fixed-length feature vectors, graph/logic-based representations of examples
–Understandability of learned models, “generalizing, not memorizing”
–Briefly: missing feature values, stability (to small changes in training sets)