Slide 1
Learning: Identification Trees
Larry M. Manevitz
All rights reserved
Slide 2
Given data, find rules that characterize it.
Problems:
– the data are multi-dimensional
– we want rules that generalize
Slide 3
Sample Data (Winston)
Name    Hair    Height   Weight   Lotion  Burnt?
Sarah   blonde  average  light    no      yes
Dana    blonde  tall     average  yes     no
Alex    brown   short    average  yes     no
Annie   blonde  short    average  no      yes
Emily   red     average  heavy    no      yes
Pete    brown   tall     heavy    no      no
John    brown   average  heavy    no      no
Katie   blonde  short    light    yes     no
Slide 4
How to analyze the data?
– Make an identical match (unlikely)
– Find the best match (other techniques)
– Build a decision tree that
  – gives correct decisions for all the data
  – is as simple as possible (for generalizability)
Slide 5
Decision Trees (figure: a tree that classifies the sample data)
Hair color?
– blonde → Lotion?
  – no  → Sarah, Annie (all yes)
  – yes → Dana, Katie (all no)
– red   → Emily (yes)
– brown → Alex, Pete, John (all no)
Slide 6
Another Decision Tree (figure: an alternative, larger tree whose node labels include Height, Hair Color, Weight, and Hair Color again)
Slide 7
We want to make homogeneous sets.
Idea: minimize the disorder introduced by each test.
Disorder can be measured by "average entropy", a formula from information theory (spelled out below).
Slide 8
Disorder or Entropy
– If the two classes are perfectly balanced, entropy = 1 (its highest value).
– If all members are in one class, entropy = 0 (using the convention 0 log 0 = 0).
Slide 9
So on our tabular example, we compute the average entropy that results from each test and then choose the test with the lowest value.
Test     Average entropy
Hair     0.50
Height   0.69
Lotion   0.61
Weight   0.94
Slide 10
Summary
– Compute the disorder (average entropy) for each possible test.
– Choose the test with the smallest average entropy.
– Then continue recursively on each sub-branch (see the sketch below).