1 The Restaurant Domain: Will they wait, or not?

2 Decision Trees

A decision tree for the restaurant domain (each internal node tests an attribute; leaves are Yes/No decisions):

Patrons?
  none  -> No
  some  -> Yes
  full  -> WaitEst?
    >60   -> No
    30-60 -> Alternate?
      no  -> Reservation?
        no  -> Bar?
          no  -> No
          yes -> Yes
        yes -> Yes
      yes -> Fri/Sat?
        no  -> No
        yes -> Yes
    10-30 -> Hungry?
      no  -> Yes
      yes -> Alternate?
        no  -> Yes
        yes -> Raining?
          no  -> No
          yes -> Yes
    0-10  -> Yes

3 Inducing Decision Trees
- Start at the root with all examples.
- If there are both positive and negative examples, choose an attribute to split them.
- If all remaining examples are positive (or negative), label the node Yes (or No).
- If no examples are left, determine the label according to the majority in the parent.
- If no attributes are left but you still have both positive and negative examples, you have a problem...
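A minimal sketch of this induction procedure in Python (the names induce_tree, majority, and choose_attribute are illustrative, not from the slides; the attribute-selection heuristic is passed in as a function):

```python
from collections import Counter

def majority(examples):
    # Most common label among the examples (used when a branch is empty
    # or the attributes run out).
    return Counter(label for _, label in examples).most_common(1)[0][0]

def induce_tree(examples, attributes, parent_examples, choose_attribute):
    # examples: list of (attribute_dict, label) pairs with label "Yes"/"No".
    if not examples:                      # no examples: use the parent's majority
        return majority(parent_examples)
    labels = {label for _, label in examples}
    if len(labels) == 1:                  # all positive or all negative
        return labels.pop()
    if not attributes:                    # attributes exhausted, labels still mixed
        return majority(examples)
    best = choose_attribute(examples, attributes)  # e.g. highest information gain
    tree = {best: {}}
    for value in {ex[best] for ex, _ in examples}:
        subset = [(ex, lab) for ex, lab in examples if ex[best] == value]
        rest = [a for a in attributes if a != best]
        tree[best][value] = induce_tree(subset, rest, examples, choose_attribute)
    return tree
```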

4 Inducing decision trees

All 12 examples: positive {X1, X3, X4, X6, X8, X12}, negative {X2, X5, X7, X9, X10, X11}.

Splitting on Patrons?:
  none -> +: {}                -: {X7, X11}
  some -> +: {X1, X3, X6, X8}  -: {}
  full -> +: {X4, X12}         -: {X2, X5, X9, X10}

Splitting on Type?:
  French  -> +: {X1}       -: {X5}
  Italian -> +: {X6}       -: {X10}
  Thai    -> +: {X4, X8}   -: {X2, X11}
  Burger  -> +: {X3, X12}  -: {X7, X9}

5 Continuing Induction

After splitting on Patrons?:
  none -> -: {X7, X11}                        (leaf: No)
  some -> +: {X1, X3, X6, X8}                 (leaf: Yes)
  full -> +: {X4, X12}  -: {X2, X5, X9, X10}  (split again)

Splitting the full branch on Hungry?:
  yes -> +: {X4, X12}  -: {X2, X10}
  no  -> +: {}         -: {X5, X9}

6 Final Decision Tree

Patrons?
  none -> No
  some -> Yes
  full -> Hungry?
    no  -> No
    yes -> Type?
      French  -> Yes
      Italian -> No
      Thai    -> Fri/Sat?
        no  -> No
        yes -> Yes
      Burger  -> Yes
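As a small illustration (not part of the original slides), the tree above can be written as nested Python dicts and applied to a new example with a short classifier:

```python
# The learned tree from slide 6 as nested dicts: an internal node maps an
# attribute name to {value: subtree}; a leaf is just "Yes" or "No".
learned_tree = {
    "Patrons": {
        "None": "No",
        "Some": "Yes",
        "Full": {"Hungry": {
            "No": "No",
            "Yes": {"Type": {
                "French": "Yes",
                "Italian": "No",
                "Thai": {"Fri/Sat": {"No": "No", "Yes": "Yes"}},
                "Burger": "Yes",
            }},
        }},
    }
}

def classify(tree, example):
    # Walk the tree, following the branch chosen by each tested attribute,
    # until a leaf label is reached.
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[example[attribute]]
    return tree

# A hungry party at a full Thai restaurant on a Friday -> the tree says "Yes".
print(classify(learned_tree, {"Patrons": "Full", "Hungry": "Yes",
                              "Type": "Thai", "Fri/Sat": "Yes"}))
```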

7 Decision Trees: Summary
- Finding the optimal decision tree is computationally intractable.
- We use heuristics: choosing the right attribute is the key, and the choice is based on the information content the attribute provides.
- Decision trees represent DNF Boolean formulas.
- They work well in practice.
- What to do with noise? Continuous attributes? Attributes with large domains?

8 Choosing an Attribute: Disorder vs. Homogeneity

[Figure: two candidate splits, one labeled "Bad" (children still mix positive and negative examples) and one labeled "Good" (children are nearly homogeneous).]

9 The Value of Information
- "If you control the mail, you control information." - Seinfeld
- Information theory lets us quantify the discriminating value of an attribute.
- "It will rain in Seattle tomorrow." (boring)
- "We'll have an earthquake tomorrow." (OK, I'm listening!)
- The value of a piece of information is inversely proportional to its probability.

10 Information Theory
- We quantify the value of knowing that event E occurred as -log2 Prob(E).
- If E1, ..., En are the possible outcomes of an event, then the value of knowing the outcome is
  I(P(E1), ..., P(En)) = sum_i -P(Ei) log2 P(Ei).
- Examples:
  - I(1/2, 1/2) = -1/2 log2(1/2) - 1/2 log2(1/2) = 1
  - I(0.99, 0.01) = 0.08
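A tiny sketch of this quantity in Python (the function name information is just illustrative):

```python
import math

def information(*probs):
    # I(P1, ..., Pn) = sum of -Pi * log2(Pi); terms with Pi == 0 contribute 0.
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(information(0.5, 0.5))    # 1.0 bit: a fair coin flip is maximally informative
print(information(0.99, 0.01))  # ~0.08 bits: an almost-certain outcome tells us little
```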

11 Why Should We Care?
- Suppose we have p positive examples and n negative ones.
- If I classify an example for you as positive or negative, then I'm giving you information: I(p/(p+n), n/(p+n)) bits.
- Now let's calculate the information you would still need after I give you the value of the attribute A.

12 The Value of an Attribute
- Suppose the attribute A can take on n values.
- For A = val_i, there would still be p_i positive examples and n_i negative examples.
- The probability that A = val_i is (p_i + n_i)/(p + n).
- Hence, after I tell you the value of A, you need the following amount of information to classify an example:
  Remainder(A) = sum_i (p_i + n_i)/(p + n) * I(p_i/(p_i + n_i), n_i/(p_i + n_i))

13 The Value of an Attribute (cont.)
- The value (information gain) of an attribute is the difference between the amount of information needed to classify an example before and after the split, i.e., Initial - Remainder.
- Patrons? splits the 12 examples as: none -> -: {X7, X11}; some -> +: {X1, X3, X6, X8}; full -> +: {X4, X12}, -: {X2, X5, X9, X10}.
- Remainder(Patrons) = (2/12) I(0/2, 2/2) + (4/12) I(4/4, 0/4) + (6/12) I(2/6, 4/6) ≈ 0.459, so Gain(Patrons) = 1 - 0.459 ≈ 0.541.
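Putting slides 11-13 together, here is a short sketch that computes Remainder and the resulting gain for the Patrons? and Type? splits shown earlier; it reuses the illustrative information helper from the slide 10 sketch, and the per-value counts are read off slide 4:

```python
def remainder(splits, p, n):
    # splits: one (p_i, n_i) pair per attribute value, e.g. Patrons? -> none/some/full.
    return sum((pi + ni) / (p + n) * information(pi / (pi + ni), ni / (pi + ni))
               for pi, ni in splits)

def gain(splits, p, n):
    # Initial information minus what is still needed after the split.
    return information(p / (p + n), n / (p + n)) - remainder(splits, p, n)

patrons_splits = [(0, 2), (4, 0), (2, 4)]           # none, some, full
type_splits    = [(1, 1), (1, 1), (2, 2), (2, 2)]   # French, Italian, Thai, Burger

print(gain(patrons_splits, 6, 6))  # ~0.541 bits
print(gain(type_splits, 6, 6))     # 0.0 bits: Patrons? is the better first split
```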