COMP61011 : Machine Learning Decision Trees

1 COMP61011 : Machine Learning Decision Trees

2 Recap: Decision Stumps
Q. Where is a good threshold?

3 From Decision Stumps, to Decision Trees
New type of non-linear model
Copes naturally with continuous and categorical data
Fast to both train and test (highly parallelizable)
Generates a set of interpretable rules

4 Recap: Decision Stumps
[Figure: the data points 10.5, 14.1, 17.0, 21.0, 23.2, 27.1, 30.1, 42.0, 47.0, 57.3, 59.9 on a number line, each labelled yes or no, with the stump’s threshold marked.]
The stump “splits” the dataset. Here we have 4 classification errors.

5 A modified stump
[Figure: the same data points, with the threshold moved.]
Here we have 3 classification errors.
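The threshold itself can be found by brute force: sort the values, try a cut between every adjacent pair, and keep the cut with the fewest errors. A minimal sketch of that idea (illustrative only, not the course's reference code; labels are assumed to be 0/1):

def best_stump_threshold(xs, ys):
    # Sort the data by x, then try a threshold between every adjacent
    # pair of values, keeping the one with the fewest errors.
    pairs = sorted(zip(xs, ys))
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    best_t, best_errors = None, len(xs) + 1
    for i in range(len(xs) - 1):
        t = (xs[i] + xs[i + 1]) / 2.0
        predictions = [1 if x > t else 0 for x in xs]
        mistakes = sum(p != y for p, y in zip(predictions, ys))
        # the stump could equally predict the other way round
        errors = min(mistakes, len(xs) - mistakes)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors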

6 Recursion… Just another dataset! Build a stump!
[Figure: after the first split, each side of the data is treated as “just another dataset”, and a stump is built on it; points labelled yes/no.]
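A sketch of that recursion, reusing the stump search above: each side of the split is treated as just another dataset, and we stop on a size or depth limit (the default limits here are illustrative, not values from the slides):

def build_tree(xs, ys, max_depth=3, min_examples=2):
    majority = max(set(ys), key=ys.count)   # label to predict if we stop here
    if max_depth == 0 or len(xs) < min_examples or len(set(ys)) == 1:
        return {"leaf": majority}
    t, _ = best_stump_threshold(xs, ys)
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    if not left or not right:
        return {"leaf": majority}
    return {"threshold": t,
            "left": build_tree([x for x, _ in left], [y for _, y in left],
                               max_depth - 1, min_examples),
            "right": build_tree([x for x, _ in right], [y for _, y in right],
                                max_depth - 1, min_examples)}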

7 Decision Trees = nested rules
[Figure: the corresponding tree, branches labelled yes/no.]
if x > 25 then
    if x > 50 then y = 0 else y = 1 endif
else
    if x > 16 then y = 0 else y = 1 endif
endif
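The same nested rule, written as an ordinary Python function:

def predict(x):
    # The rule from slide 7: one test at the root, one on each branch.
    if x > 25:
        return 0 if x > 50 else 1
    else:
        return 0 if x > 16 else 1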

8 Here’s one I made earlier…

9 ‘pop’ is the number of training points that arrived at that node.
‘err’ is the fraction of those examples incorrectly classified.

10 Increasing the maximum depth (10)
Decreasing the minimum number of examples required to make a split (5)
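In scikit-learn terms (whether the lab uses sklearn is an assumption; the slides don't say), these two controls look like this, with the values mirroring the slide:

from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(
    max_depth=10,          # maximum depth of the tree
    min_samples_split=5,   # minimum number of examples required to split a node
)
# tree.fit(X_train, y_train)   # X_train, y_train: your training data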

11

12 Overfitting…
[Figure: testing error plotted against tree depth; the error falls up to an optimal depth, then rises again as the tree starts to overfit.]
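A quick way to reproduce a curve like this is to sweep the maximum depth and record the test error; the synthetic dataset here is purely illustrative:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative data; substitute your own training/test split.
X, y = make_classification(n_samples=400, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in range(1, 16):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    print(depth, 1.0 - model.score(X_test, y_test))   # test error at this depth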

13

14 A little break. Talk to your neighbours.

15 We’ve been assuming continuous variables!

16 The Tennis Problem

17 [Figure: a decision tree for the tennis problem — Outlook at the root with branches Sunny, Overcast and Rain; a Humidity test (High/Normal) under one branch and a Wind test (Weak/Strong) under another, with YES/NO leaves.]

18

19 The Tennis Problem Note: 9 examples say “YES”, while 5 say “NO”.

20 Partitioning the data…
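Partitioning on a categorical feature just means grouping the examples by the value they take. A minimal sketch; the two example rows are hypothetical stand-ins for the full 14-row tennis table:

from collections import defaultdict

# Two hypothetical rows standing in for the full tennis dataset.
examples = [
    {"Outlook": "Sunny", "Wind": "Weak", "Play": "No"},
    {"Outlook": "Rain", "Wind": "Weak", "Play": "Yes"},
]

def partition(examples, feature):
    # Group the examples by the value they take for `feature`.
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[feature]].append(ex)
    return groups

for value, subset in partition(examples, "Outlook").items():
    print(value, [ex["Play"] for ex in subset])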

21 Thinking in Probabilities…

22 The “Information” in a feature
More uncertainty = less information. H(X) = 1

23 The “Information” in a feature
Less uncertainty = more information. H(X) =

24 Entropy

25 Calculating Entropy

26 Information Gain, also known as “Mutual Information”

27 Why don’t we just measure the number of errors?!
Errors = 0.25 … for both! Mutual information: I(X1;Y) = , I(X2;Y) =
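Information gain is the entropy of the labels minus the expected entropy after splitting on the feature: I(X;Y) = H(Y) - Σ_x p(x) H(Y | X = x). A sketch building on the entropy() function above:

def information_gain(xs, ys):
    # H(Y) minus the expected entropy of Y after splitting on X.
    n = len(ys)
    gain = entropy(ys)
    for value in set(xs):
        subset = [y for x, y in zip(xs, ys) if x == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain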

28

29 maximum information gain
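Choosing the split is then just a matter of taking the feature with the largest gain. A sketch using the helpers above (the list-of-dicts format is an assumption carried over from the partition sketch):

def best_feature(examples, features, label_key="Play"):
    # Evaluate the information gain of each candidate feature and take the best.
    ys = [ex[label_key] for ex in examples]
    return max(features,
               key=lambda f: information_gain([ex[f] for ex in examples], ys))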

30 [Figure: a larger tree grown for the tennis data, with tests on Outlook (Sunny/Overcast/Rain), Temperature (Hot/Mild/Cool), Humidity (High/Normal) and Wind (Weak/Strong), and YES/NO leaves.]

31 Decision Trees, done. Now, grab some lunch. Then, get to the labs.
Start to think about your projects…

32 Reminder: this is the halfway point!
Project suggestions are in the handbook.
Feedback opportunity next week.
Final project deadline… 4pm Friday of week six (that’s the end of reading week – submit to SSO).

33 Team Projects: a deep study of any topic in ML that you want!
6-page report (maximum), written as a research paper. Minimum font size 11, margins 2cm. NO EXCEPTIONS.
Does not have to be “academic”. You could build something – but you must thoroughly analyze and evaluate it according to the same principles.
50 marks, based on your 6-page report. Grades divided equally unless a case is made.
Graded according to:
Technical understanding/achievements
Experimental methodologies
Depth of critical analysis
Writing style and exposition

