Learning from Data
Focus on Supervised Learning first… Given previous data, how can we “learn” to classify new data?
[Figure: labeled training fruits — APPLE, APPLE, BANANA, BANANA, APPLE; given a new fruit, APPLE or BANANA? Answer: APPLE]
Training: Training Set → Extract features/labels → Train (Decision Trees, Bayesian Learning, Neural Nets, ...) → Learned model/Classifier
Classifying: Instance/Example → Extract features → Learned model/Classifier → Label
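The training/classifying pipeline above can be sketched in Python. The fruit data, the feature names, and the trivial "memorize the majority label" model are illustrative assumptions standing in for a real learner (decision tree, Bayesian, neural net):

```python
from collections import Counter, defaultdict

def extract_features(fruit):
    # Assumed features: shape (length-to-width ratio) and color.
    return (round(fruit["length"] / fruit["width"], 1), fruit["color"])

def train(training_set):
    # Toy stand-in for a real learner: remember the majority label
    # seen for each feature vector during training.
    votes = defaultdict(Counter)
    for raw_example, label in training_set:
        votes[extract_features(raw_example)][label] += 1
    return {fv: counts.most_common(1)[0][0] for fv, counts in votes.items()}

def classify(model, raw_instance):
    # Apply the learned model to a new instance to get a label.
    return model.get(extract_features(raw_instance), "unknown")
```

The same two-phase shape holds for any supervised learner: only the `train` step changes.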
Inductive Learning
Supervised Learning: training data is a set of (x, y) pairs
  x: input example/instance
  y: output/label
Learn an unknown function f(x) = y
x is represented by a D-dimensional feature vector x = <x1, x2, x3, …, xD>
Each dimension is a feature or attribute
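Concretely, one (x, y) pair might look like this in Python; the attribute names are assumptions modeled on the restaurant example that follows:

```python
# One (x, y) training pair: x is a D-dimensional feature vector
# (here a dict of named attributes), y is the output/label.
# Attribute names are illustrative, not the full restaurant schema.
x = {"Patrons": "Full", "Price": "$", "Type": "Italian", "Hungry": True}
y = True    # the label: wait for a table
D = len(x)  # number of features/attributes
```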
Wait for a Table?
Wait for a Table? T: positive/Yes examples (better to wait for a table); F: negative/No examples (better not to wait)
All examples with Patrons = None were No
All examples with Patrons = Some were Yes
Examples with Patrons = Full depended on other features
Decision Trees
How to classify a new example?
All examples with Patrons = None were No
All examples with Patrons = Some were Yes
Examples with Patrons = Full depended on other features
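Classifying a new example walks the tree from the root, following the branch matching each tested feature until a leaf label is reached. A minimal sketch, assuming a Patrons-rooted tree like the one on the slides (the Hungry subtree under Full is an assumed detail):

```python
# A decision tree as nested tuples/dicts:
#   internal node = (feature, {value: subtree}), leaf = a label.
tree = ("Patrons", {
    "None": False,
    "Some": True,
    "Full": ("Hungry", {True: True, False: False}),  # assumed subtree
})

def classify(node, example):
    # Descend until we hit a leaf label.
    while isinstance(node, tuple):
        feature, branches = node
        node = branches[example[feature]]
    return node
```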
Classifying a New Example
Which one is better?
All examples with Patrons = None were No
All examples with Patrons = Some were Yes
Examples with Patrons = Full depended on other features
Better because it is smaller (Occam's razor: prefer the simplest hypothesis consistent with the data)
Decision Trees How to find the smallest decision tree?
Constructing the "best" decision tree: it is NP-hard to find the smallest tree, so we just try to find a "smallish" one. First, how do we construct any decision tree?
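The standard answer is a greedy, recursive construction (ID3-style), not an exact search: pick a feature, split the examples on its values, and recurse until each subset is pure. A sketch under assumed Boolean labels and discrete features; the misclassification-count impurity here is a simple stand-in for the feature-selection heuristic on the later slides:

```python
from collections import Counter

def impurity_after_split(examples, feature):
    # Count minority labels across the subsets a split would create
    # (0 means every subset is all-positive or all-negative).
    total = 0
    for value in {x[feature] for x, _ in examples}:
        labels = [y for x, y in examples if x[feature] == value]
        total += len(labels) - Counter(labels).most_common(1)[0][1]
    return total

def build_tree(examples, features):
    # examples: list of (feature_dict, label) pairs.
    labels = [y for _, y in examples]
    if len(set(labels)) == 1:        # pure node -> leaf
        return labels[0]
    if not features:                 # no features left -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = min(features, key=lambda f: impurity_after_split(examples, f))
    branches = {}
    for value in {x[best] for x, _ in examples}:
        subset = [(x, y) for x, y in examples if x[best] == value]
        branches[value] = build_tree(subset, [f for f in features if f != best])
    return (best, branches)
```

On restaurant-style data this picks Patrons at the root, since its split leaves only one impure subset.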
Construct Tree Example (new instance: F, Full, $, Italian, 30-60)
Patrons = None (all False)
Patrons = Some (all True)
Patrons = Full (some True, some False)
Patrons = Full (some True, some False) AND Hungry = False
Patrons = Full (some True, some False) AND Hungry = True
Choosing the Best Feature
Compare Type and Patrons — which one seems better?
At each node, select the feature that divides the examples into sets that are almost all positive or almost all negative.
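One standard way to make "almost all positive or all negative" precise is entropy: choose the feature with the lowest expected entropy (remainder) after splitting. A sketch assuming discrete feature values; the usage data is a toy illustration, not the full restaurant table:

```python
from collections import Counter
from math import log2

def entropy(labels):
    # 0 for a pure set, 1 bit for a 50/50 Boolean split.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def remainder(examples, feature):
    # Expected entropy after splitting on `feature` (lower is better).
    n = len(examples)
    rem = 0.0
    for value in {x[feature] for x, _ in examples}:
        subset = [y for x, y in examples if x[feature] == value]
        rem += len(subset) / n * entropy(subset)
    return rem
```

On data like the slides' (Patrons yields mostly pure subsets, Type yields mixed ones), Patrons comes out with the lower remainder, matching the intuition above.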