1 CpSc 810: Machine Learning
Concept Learning and General to Specific Ordering
2 Copyright Notice
Most slides in this presentation are adapted from the textbook slides and various other sources. The copyright belongs to the original authors. Thanks!
3 Review
- Choose the training experience
- Choose the target function
- Choose a representation
- Choose the parameter-fitting algorithm
- Evaluate the entire system
4 What is a Concept?
A concept is a subset of objects or events defined over a larger set. Example: the concept of a bird is the subset of all objects (i.e., of the set of all things, or of all animals) that belong to the category bird.
Alternatively, a concept is a Boolean-valued function defined over this larger set. Example: a function defined over all animals whose value is true for birds and false for every other animal.
[Diagram: the set Things contains the sets Animals and Cars; Animals contains Birds.]
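To make the two views concrete, here is a tiny Python illustration (our own addition, not from the original slides; the animal names are made up):

```python
# A concept as a Boolean-valued function over a larger set ...
animals = ["sparrow", "eagle", "dog", "penguin", "cat"]

def bird(x):
    """The concept 'bird' as a Boolean-valued function over animals."""
    return x in {"sparrow", "eagle", "penguin"}

# ... and, equivalently, as the subset it carves out of that set.
birds = [a for a in animals if bird(a)]
print(birds)  # ['sparrow', 'eagle', 'penguin']
```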
5 What is Concept Learning?
Given a set of examples labeled as members or non-members of a concept, concept learning consists of automatically inferring the general definition of this concept. In other words, concept learning is inferring a Boolean-valued function from training examples of its inputs and outputs.
6 Example of a Concept Learning Task
Concept: Good Days for Water Sports (values: Yes, No)
Attributes/Features:
- Sky (values: Sunny, Cloudy, Rainy)
- AirTemp (values: Warm, Cold)
- Humidity (values: Normal, High)
- Wind (values: Strong, Weak)
- Water (values: Warm, Cool)
- Forecast (values: Same, Change)
A training example: <Sunny, Warm, Normal, Strong, Warm, Same>, class: Yes
7 Example of a Concept Learning Task
Training examples:
Day  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1    Sunny  Warm     Normal    Strong  Warm   Same      Yes
2    Sunny  Warm     High      Strong  Warm   Same      Yes
3    Rainy  Cold     High      Strong  Warm   Change    No
4    Sunny  Warm     High      Strong  Cool   Change    Yes
The task is to learn to predict the value of EnjoySport for an arbitrary day, i.e., to learn the general concept EnjoySport.
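As a running illustration (our own addition, not part of the slides), this table can be encoded in Python as a list of (instance, label) pairs; the later sketches reuse this `train` list:

```python
# EnjoySport training examples: instances are tuples of attribute values
# (Sky, AirTemp, Humidity, Wind, Water, Forecast); labels are booleans.
train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
```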
8 Terminology and Notation
- Instances X: the set of items over which the concept is defined is called the set of instances.
- The concept to be learned is called the target concept, denoted by c: X -> {0,1}.
- The set of training examples D is a set of instances x, each paired with its target concept value c(x).
- Members of the concept (instances for which c(x) = 1) are called positive examples.
- Nonmembers of the concept (instances for which c(x) = 0) are called negative examples.
9 Terminology and Notation
- Given a set of training examples of the target concept c, the problem faced by the learner is to hypothesize, or estimate, c.
- The hypothesis space H is the set of all possible hypotheses. H is determined by the human designer's choice of a hypothesis representation.
- Each hypothesis h in H is described by a conjunction of constraints on the attributes.
- The goal of concept learning is to find a hypothesis h: X -> {0,1} such that h(x) = c(x) for all x in X.
10 Representing Hypotheses
For each attribute, the constraint can be:
- a specific value (e.g. Water = Warm);
- "?", meaning "any value is acceptable";
- "0", meaning "no value is acceptable".
Each hypothesis consists of a conjunction of such constraints on the attributes. Example of a hypothesis: <?, Cold, High, ?, ?, ?> (if the air temperature is cold and the humidity high, then it is a good day for water sports).
Selecting a hypothesis representation is an important step, since it restricts (or biases) the space that can be searched. For example, the hypothesis "if the air temperature is cold or the humidity high, then it is a good day for water sports" cannot be expressed in our chosen representation.
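A minimal sketch of how such a hypothesis classifies an instance (our own helper, not from the slides, assuming the tuple encoding introduced above):

```python
def matches(h, x):
    """True iff instance x satisfies every constraint of the conjunctive
    hypothesis h, where a constraint is a specific value, "?" (any value
    is acceptable), or "0" (no value is acceptable)."""
    return all(c != "0" and (c == "?" or c == v) for c, v in zip(h, x))

h = ("?", "Cold", "High", "?", "?", "?")  # cold air and high humidity
print(matches(h, ("Rainy", "Cold", "High", "Strong", "Warm", "Change")))  # True
print(matches(h, ("Sunny", "Warm", "High", "Strong", "Warm", "Same")))    # False
```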
11 Prototypical Concept Learning Task
Given:
- Instances X: possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
- Target concept c: EnjoySport: X -> {0,1}
- Hypotheses H: conjunctions of constraints on the attributes, e.g. <?, Cold, High, ?, ?, ?>
- Training examples D: positive and negative examples of the target function
Determine: a hypothesis h in H such that h(x) = c(x) for all x in X.
12 The Inductive Learning Hypothesis
The inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other, unobserved examples.
13 Concept Learning as Search
Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation. The goal of the search is to find the hypothesis that best fits the training examples.
How many possible hypotheses are there for the EnjoySport learning task?
- Syntactically distinct hypotheses: 5 * 4 * 4 * 4 * 4 * 4 = 5120
- Semantically distinct hypotheses (every hypothesis containing one or more "0"s represents the same empty set of instances): 1 + (4 * 3 * 3 * 3 * 3 * 3) = 973
The arithmetic is checked in the sketch below.
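A couple of lines verify the counts (Sky has 3 values, every other attribute has 2):

```python
# Syntactically distinct: each attribute allows its values plus "?" and "0".
syntactic = (3 + 2) * (2 + 2) ** 5     # 5 * 4^5 = 5120
# Semantically distinct: any "0" yields the empty concept, so count the
# "0"-free hypotheses and add one for the empty concept itself.
semantic = 1 + (3 + 1) * (2 + 1) ** 5  # 1 + 4 * 3^5 = 973
print(syntactic, semantic)             # 5120 973
```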
14 General to Specific Ordering of Hypotheses
Most general hypothesis: <?, ?, ?, ?, ?, ?> (every day is a good day for water sports).
Most specific hypothesis: <0, 0, 0, 0, 0, 0> (no day is a good day for water sports).
Definition: let h_j and h_k be Boolean-valued functions defined over X. Then h_j is more-general-than-or-equal-to h_k iff for all x in X, (h_k(x) = 1) -> (h_j(x) = 1).
Example: h1 = <Sunny, ?, ?, Strong, ?, ?>, h2 = <Sunny, ?, ?, ?, ?, ?>. Every instance that is classified as positive by h1 will also be classified as positive by h2. Therefore h2 is more general than h1.
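For conjunctive hypotheses the ordering can be tested attribute by attribute; a sketch (our own helper; this is a syntactic test, which is adequate as long as the less general hypothesis is not the empty concept):

```python
def more_general_or_equal(hj, hk):
    """True iff hj is more-general-than-or-equal-to hk, checked attribute by
    attribute: "?" covers anything, any constraint covers "0", and otherwise
    the two constraints must be identical."""
    return all(cj == "?" or ck == "0" or cj == ck for cj, ck in zip(hj, hk))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))  # False
```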
15 Instances, Hypotheses, and more_general_than
[Figure: the instance space X and the hypothesis space H, with arrows indicating the more_general_than ordering between hypotheses and the sets of instances they cover.]
16 Find-S: A Maximally Specific Hypothesis Learning Algorithm
1. Initialize h to the most specific hypothesis in H.
2. For each positive training instance x:
   For each attribute constraint a_i in h:
     If the constraint a_i is satisfied by x, do nothing;
     else replace a_i in h by the next more general constraint that is satisfied by x.
3. Output hypothesis h.
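A runnable sketch of Find-S for our conjunctive representation (our own code, not from the slides; it reuses the `train` list from the slide-7 sketch):

```python
def find_s(examples):
    """Return the maximally specific conjunctive hypothesis consistent
    with the positive training examples (negative examples are ignored)."""
    h = ["0"] * len(examples[0][0])  # start from the most specific hypothesis
    for x, positive in examples:
        if not positive:
            continue
        for i, v in enumerate(x):
            if h[i] == "0":
                h[i] = v             # first positive example fixes the value
            elif h[i] != v:
                h[i] = "?"           # conflicting values: generalize to "?"
    return tuple(h)

print(find_s(train))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```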
17 Hypothesis Space Search by Find-S
[Figure: the Find-S trace on the EnjoySport training examples.
h0 = <0, 0, 0, 0, 0, 0>
h1 = <Sunny, Warm, Normal, Strong, Warm, Same> (after example 1)
h2 = <Sunny, Warm, ?, Strong, Warm, Same> (after example 2)
h3 = h2 (example 3 is negative and is ignored)
h4 = <Sunny, Warm, ?, Strong, ?, ?> (after example 4)]
18 Shortcomings of Find-S
- Although Find-S finds a hypothesis consistent with the training data, it cannot tell whether that hypothesis is the only one, or merely one of many.
- Is it a good strategy to prefer the most specific hypothesis?
- What if the training set is inconsistent (noisy)?
- What if there are several maximally specific consistent hypotheses? Find-S cannot backtrack!
19 Version Spaces
A hypothesis h is consistent with a set of training examples D of target concept c iff h(x) = c(x) for each example <x, c(x)> in D.
The version space, denoted VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H that are consistent with the training examples in D.
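The consistency test is one line in Python (our own helper, reusing `matches` from the earlier sketch):

```python
def consistent(h, examples):
    """True iff h agrees with the label of every training example in D."""
    return all(matches(h, x) == label for x, label in examples)
```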
20 The List-Then-Eliminate Algorithm
1. VersionSpace <- a list containing every hypothesis in H
2. For each training example <x, c(x)>, remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
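Since H is finite here (5120 hypotheses), the algorithm can be run directly; a sketch reusing `consistent` and `train` from the sketches above (the `values` list of attribute domains is our own naming):

```python
from itertools import product

values = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
          ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change")]

def list_then_eliminate(examples, values):
    """Enumerate every conjunctive hypothesis and keep the consistent ones."""
    constraints = [vals + ("?", "0") for vals in values]
    return [h for h in product(*constraints) if consistent(h, examples)]

for h in list_then_eliminate(train, values):
    print(h)  # prints the version space for the EnjoySport examples
```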
21 Example Version Space
The version space for the four EnjoySport training examples from slide 7:
S: { <Sunny, Warm, ?, Strong, ?, ?> }
In between: <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, <?, Warm, ?, Strong, ?, ?>
G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
22 Discussion of List-Then-Eliminate
- It works whenever the hypothesis space H is finite.
- It is guaranteed to output all hypotheses consistent with the training data.
- Shortcoming: exhaustively enumerating all hypotheses in H is unrealistic for all but the smallest hypothesis spaces.
23 A Compact Representation for Version Spaces
Instead of enumerating all the hypotheses consistent with a training set, we can represent only its most specific and most general boundaries; the hypotheses in between these two boundaries can be generated as needed.
Definition: the general boundary G, with respect to hypothesis space H and training data D, is the set of maximally general members of H consistent with D.
Definition: the specific boundary S, with respect to hypothesis space H and training data D, is the set of minimally general (i.e., maximally specific) members of H consistent with D.
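Given an explicit version space, the two boundaries are just its minimal and maximal elements under the more-general-than ordering; a sketch reusing `more_general_or_equal` from the slide-14 sketch:

```python
def strictly_more_general(a, b):
    return more_general_or_equal(a, b) and not more_general_or_equal(b, a)

def boundaries(version_space):
    """Return (S, G): the minimally and maximally general members."""
    S = [h for h in version_space
         if not any(strictly_more_general(h, h2) for h2 in version_space)]
    G = [h for h in version_space
         if not any(strictly_more_general(h2, h) for h2 in version_space)]
    return S, G
```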
24 Version Space Representation Theorem
Every member of the version space lies between the S and G boundaries:
VS_{H,D} = { h ∈ H | (∃ s ∈ S)(∃ g ∈ G) g ≥ h ≥ s }
where x ≥ y means x is more general than or equal to y.
Proof sketch: (1) every h satisfying the right-hand side of the above expression is in VS_{H,D}; (2) every member of VS_{H,D} satisfies the right-hand side of the expression.
25 Candidate-Elimination Learning Algorithm
[Figure: the Candidate-Elimination algorithm, which processes the training examples one by one while maintaining the specific boundary S and the general boundary G.]
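Below is a hedged, runnable Python sketch of Candidate-Elimination for our conjunctive representation (our own code, not the slide's figure; it reuses `matches`, `more_general_or_equal`, `train`, and `values` from the earlier sketches, and simplifies Mitchell's pruning steps to what the conjunctive case needs):

```python
def min_generalization(s, x):
    """The unique minimal generalization of s that covers the positive x."""
    return tuple(v if c == "0" else (c if c == v else "?") for c, v in zip(s, x))

def min_specializations(g, x, values):
    """All minimal specializations of g that exclude the negative x."""
    out = []
    for i, c in enumerate(g):
        if c == "?":
            out += [g[:i] + (v,) + g[i + 1:] for v in values[i] if v != x[i]]
    return out

def candidate_elimination(examples, values):
    S = {("0",) * len(values)}   # most specific boundary
    G = {("?",) * len(values)}   # most general boundary
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}
            S = {s if matches(s, x) else min_generalization(s, x) for s in S}
            # keep only members that some g in G still covers
            S = {s for s in S if any(more_general_or_equal(g, s) for g in G)}
        else:
            S = {s for s in S if not matches(s, x)}
            newG = set()
            for g in G:
                if not matches(g, x):
                    newG.add(g)
                    continue
                for h in min_specializations(g, x, values):
                    if any(more_general_or_equal(h, s) for s in S):
                        newG.add(h)
            # keep only maximally general members
            G = {g for g in newG
                 if not any(g2 != g and more_general_or_equal(g2, g) for g2 in newG)}
    return S, G

S, G = candidate_elimination(train, values)
print(S)  # {('Sunny', 'Warm', '?', 'Strong', '?', '?')}
print(G)  # {('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')}
```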
26-30 Candidate-Elimination Trace
[Figures: step-by-step trace on the four EnjoySport training examples.
S0 = {<0, 0, 0, 0, 0, 0>}, G0 = {<?, ?, ?, ?, ?, ?>}
After example 1 (+): S1 = {<Sunny, Warm, Normal, Strong, Warm, Same>}, G1 = G0
After example 2 (+): S2 = {<Sunny, Warm, ?, Strong, Warm, Same>}, G2 = G0
After example 3 (-): S3 = S2, G3 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}
After example 4 (+): S4 = {<Sunny, Warm, ?, Strong, ?, ?>}, G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}]
31 Remarks on Version Spaces and Candidate-Elimination
- The Candidate-Elimination algorithm computes the version space containing all (and only those) hypotheses from H that are consistent with an observed sequence of training examples.
- The version space learned by the Candidate-Elimination algorithm will converge toward the hypothesis that correctly describes the target concept, provided that (1) there are no errors in the training examples and (2) some hypothesis in H correctly describes the target concept.
- Convergence can be sped up by presenting the data in a strategic order. The best examples are those that satisfy exactly half of the hypotheses in the current version space.
- Version spaces can be used to assign certainty scores to the classification of new examples.
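The certainty score mentioned above can be sketched in a few lines (our own helper, reusing `matches`): an instance matched by all hypotheses in the version space is confidently positive, one matched by none is confidently negative, and one that splits the space in half is the most informative query.

```python
def vote(version_space, x):
    """Fraction of version-space hypotheses that classify x as positive."""
    return sum(matches(h, x) for h in version_space) / len(version_space)
```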
32 How Should These Be Classified?
[Figure: several new instances, to be classified by checking how many of the version-space hypotheses label each one positive.]
33 A Biased Hypothesis Space
Given our previous choice of the hypothesis representation, no hypothesis is consistent with a training set containing, for example, positive examples with Sky = Sunny and Sky = Cloudy together with an otherwise identical negative example with Sky = Rainy: we have BIASED the learner to consider only conjunctive hypotheses. In this case, we need a more expressive hypothesis space.
34 An Un-Biased Learner
In order to solve the problem caused by the bias of the hypothesis space, we can remove this bias and allow the hypotheses to represent every possible subset of instances, by allowing arbitrary disjunctions, conjunctions, and negations of our previous hypotheses. The previous examples could then be expressed as:
<Sunny, ?, ?, ?, ?, ?> ∨ <Cloudy, ?, ?, ?, ?, ?>
However, such an unbiased learner is unable to generalize beyond the observed examples! Every non-observed example will be classified correctly by half of the hypotheses in the version space and misclassified by the other half.
35 An Example of an Un-Biased Learner
Rote-Learner: learning corresponds simply to storing each observed training example in memory. Subsequent instances are classified by looking them up in memory: if the instance is found in memory, the stored classification is returned; otherwise, the system refuses to classify the new instance.
36 The Futility of Bias-Free Learning
Fundamental property of inductive learning: a learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances.
In EnjoySport, there is an implicit assumption that the target concept can be represented by a conjunction of attribute values.
We constantly have recourse to inductive biases. Example: we all know that the sun will rise tomorrow. Although we cannot deduce that it will do so from the fact that it rose today, yesterday, the day before, and so on, we take this leap of faith, i.e., use this inductive bias, naturally!
37 The Futility of Bias-Free Learning
Notation: A ⊢ B indicates that B follows deductively from A; A ≻ B indicates that B follows inductively from A.
The inductive inference step: (D_c ∧ x_i) ≻ L(x_i, D_c), i.e., the classification of a new instance x_i follows inductively from the training data D_c.
38 Inductive Bias
Consider: a concept learning algorithm L; instances X and target concept c; training examples D_c = {<x, c(x)>}. Let L(x_i, D_c) denote the classification assigned to the instance x_i by L after training on the data D_c.
Definition: the inductive bias of L is any minimal set of additional assumptions B sufficient to justify its inductive inferences as deductive inferences, i.e., such that for any target concept c and corresponding training examples D_c,
(∀ x_i ∈ X) [(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)]
Inductive bias of the Candidate-Elimination algorithm: the target concept c is contained in the given hypothesis space H.
39 Inductive Systems and Equivalent Deductive Systems
[Figure: an inductive system (training examples and a new instance fed to the Candidate-Elimination algorithm using H) produces the same classifications as an equivalent deductive system (the same inputs plus the explicit assertion "H contains the target concept" fed to a theorem prover).]
40 Ranking Inductive Learners According to Their Biases
- Rote-Learner (weakest bias): this system simply memorizes the training data and their classifications; no generalization is involved.
- Candidate-Elimination: new instances are classified only if all the hypotheses in the version space agree on the classification.
- Find-S (strongest bias): new instances are classified using the most specific hypothesis consistent with the training data.
41 Summary Points
- Concept learning as search through H
- General to specific ordering over H
- Version spaces and the Candidate-Elimination algorithm
- The S and G boundaries characterize the learner's uncertainty
- The learner can generate useful queries
- Inductive leaps are possible only if the learner is biased
- Inductive learners can be modeled by equivalent deductive systems