1
Inductive Learning (2/2) Version Space and PAC Learning
Russell and Norvig: Chapter 18, Sections 18.5 through 18.7; Chapter 19, Sections 19.1 through 19.3. CS121 – Winter 2003
2
Contents
Introduction to inductive learning
Logic-based inductive learning: decision tree method, version space method
Function-based inductive learning: neural nets
PAC learning
3
Inductive Learning Scheme
Example set X: {[A, B, …, CONCEPT]} → training set D (positive and negative examples) → inductive hypothesis h, drawn from hypothesis space H: {[CONCEPT(x) ⇔ S(A, B, …)]}
4
Predicate-Learning Methods
Two predicate-learning methods: decision tree and version space. Both need to provide H with some “structure”; the version space method works with an explicit representation of the hypothesis space H.
5
Version Space Method: V is the version space, initialized to the whole hypothesis space H (V ← H)
For every example x in training set D do:
  Eliminate from V every hypothesis that does not agree with x
  If V is empty then return failure
Return V
But the size of V is enormous!
Idea: define a partial ordering on the hypotheses in H and represent only the upper and lower bounds of V for this ordering.
Compared to the decision tree method, this algorithm is: incremental, least-commitment.
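The naive filtering loop above can be sketched in Python. This is an illustrative sketch, not code from the course: the encoding of hypotheses as (rank-predicate, suit-predicate) pairs and all names (`holds`, `version_space`, the spelled-out suit strings) are assumptions; the hypothesis space is the 16 × 7 = 112-sentence space of the card example introduced on the next slide.

```python
from itertools import product

RANKS = list(range(1, 11)) + ["j", "q", "k"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
RANK_PREDS = ["a", "n", "f"] + [str(r) for r in RANKS]   # ANY-RANK, NUM, FACE, fixed ranks
SUIT_PREDS = ["a", "b", "r"] + [s[0] for s in SUITS]     # ANY-SUIT, BLACK, RED, fixed suits

def holds(h, card):
    """Does hypothesis h = (rank_pred, suit_pred) predict REWARD for the card?"""
    (rp, sp), (r, s) = h, card
    rank_ok = {"a": True, "n": r in RANKS[:10],
               "f": r in ("j", "q", "k")}.get(rp, rp == str(r))
    suit_ok = {"a": True, "b": s in ("spades", "clubs"),
               "r": s in ("hearts", "diamonds")}.get(sp, sp == s[0])
    return rank_ok and suit_ok

def version_space(examples):
    """Naive version-space method: V <- H, then drop every hypothesis
    that disagrees with some example; fail if V empties out."""
    V = set(product(RANK_PREDS, SUIT_PREDS))   # V <- H, |H| = 16 * 7 = 112
    for card, label in examples:
        V = {h for h in V if holds(h, card) == label}
        if not V:
            raise ValueError("no hypothesis in H agrees with all examples")
    return V

# The example sequence used later in the deck: + 4♣, 7♣, 2♠ and - 5♥, j♠.
examples = [((4, "clubs"), True), ((7, "clubs"), True), ((2, "spades"), True),
            ((5, "hearts"), False), (("j", "spades"), False)]
V_final = version_space(examples)
```

On this example sequence the loop isolates the single hypothesis ("n", "b"), i.e. NUM ∧ BLACK, matching the result obtained later with the boundary representation.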
6
Version Space and PAC Learning
Rewarded Card Example
(r=1) v … v (r=10) v (r=J) v (r=Q) v (r=K) ⇔ ANY-RANK(r)
(r=1) v … v (r=10) ⇔ NUM(r)
(r=J) v (r=Q) v (r=K) ⇔ FACE(r)
(s=♠) v (s=♥) v (s=♦) v (s=♣) ⇔ ANY-SUIT(s)
(s=♠) v (s=♣) ⇔ BLACK(s)
(s=♥) v (s=♦) ⇔ RED(s)
An hypothesis is any sentence of the form: R(r) ∧ S(s) ⇒ REWARD([r,s]), where R(r) is ANY-RANK(r), NUM(r), FACE(r), or (r=j), and S(s) is ANY-SUIT(s), BLACK(s), RED(s), or (s=k)
7
Simplified Representation
For simplicity, we represent a concept by rs, with: r = a, n, f, 1, …, 10, j, q, k and s = a, b, r, ♠, ♥, ♦, ♣. For example: n♣ represents NUM(r) ∧ (s=♣) ⇒ REWARD([r,s]), and aa represents ANY-RANK(r) ∧ ANY-SUIT(s) ⇒ REWARD([r,s])
8
Extension of an Hypothesis
The extension of an hypothesis h is the set of objects that verify h. Examples: the extension of f♣ is {j♣, q♣, k♣}; the extension of aa is the set of all cards.
9
More General/Specific Relation
Let h1 and h2 be two hypotheses in H. h1 is more general than h2 iff the extension of h1 is a proper superset of the extension of h2. Examples: aa is more general than f♣; f♣ is more general than q♣; fr and nr are not comparable.
10
More General/Specific Relation
Let h1 and h2 be two hypotheses in H. h1 is more general than h2 iff the extension of h1 is a proper superset of the extension of h2. The inverse of the “more general” relation is the “more specific” relation. The “more general” relation defines a partial ordering on the hypotheses in H.
11
Example: Subset of Partial Order
Subset shown: aa, na, ab, nb, a♣, n♣, 4a, 4b, 4♣, ordered by the “more general” relation, with aa at the top and 4♣ at the bottom.
12
Construction of Ordering Relation
Ranks: 1, …, 10 are below n; j, …, k are below f; n and f are below a. Suits: ♠ and ♣ are below b; ♥ and ♦ are below r; b and r are below a.
13
G-Boundary / S-Boundary of V
An hypothesis in V is most general iff no hypothesis in V is more general. G-boundary G of V: set of most general hypotheses in V.
14
G-Boundary / S-Boundary of V
An hypothesis in V is most general iff no hypothesis in V is more general. G-boundary G of V: set of most general hypotheses in V. An hypothesis in V is most specific iff no hypothesis in V is more specific. S-boundary S of V: set of most specific hypotheses in V.
15
Example: G-/S-Boundaries of V
Initially, G = {aa} and S contains all the most specific hypotheses: 1♠, …, k♦. Now suppose that 4♣ is given as a positive example. We replace every hypothesis in S whose extension does not contain 4♣ by its generalization set.
16
Example: G-/S-Boundaries of V
After this update, G = {aa} and S = {4♣}. Here, both G and S have size 1. This is not the case in general! (The hypotheses between them include na, ab, nb, a♣, n♣, 4a, 4b.)
17
Example: G-/S-Boundaries of V
The generalization set of an hypothesis h is the set of the hypotheses that are immediately more general than h. For example, the generalization set of 4♣ is {n♣, 4b}. Let 7♣ be the next (positive) example.
18
Example: G-/S-Boundaries of V
With the positive example 7♣, 4♣ is minimally generalized: neither 4b nor 4a contains 7♣, so S becomes {n♣}.
19
Example: G-/S-Boundaries of V
Let 5♥ be the next (negative) example. G = {aa} must now be specialized: the specialization set of aa contains the hypotheses immediately more specific than aa. The minimal specialization that excludes 5♥ and is still more general than S = {n♣} is ab, so G becomes {ab} and V = {ab, nb, a♣, n♣}.
20
Example: G-/S-Boundaries of V
G and S, and all hypotheses in between, form exactly the version space (here V = {ab, nb, a♣, n♣}): 1. If an hypothesis between G and S disagreed with an example x, then an hypothesis in G or S would also disagree with x, and hence would have been removed.
21
Example: G-/S-Boundaries of V
G and S, and all hypotheses in between, form exactly the version space: 2. If there were an hypothesis not in this set which agreed with all examples, then it would have to be either no more specific than any member of G (but then it would be in G) or no more general than some member of S (but then it would be in S).
22
Example: G-/S-Boundaries of V
At this stage, V = {ab, nb, a♣, n♣}. Do 8♥, 6♣, j♠ satisfy CONCEPT? 8♥: No (every hypothesis in V excludes it). 6♣: Yes (every hypothesis in V includes it). j♠: Maybe (ab includes it, the others do not).
23
Example: G-/S-Boundaries of V
V is still {ab, nb, a♣, n♣}. Let 2♠ be the next (positive) example.
24
Example: G-/S-Boundaries of V
After 2♠, a♣ and n♣ are eliminated and S becomes {nb}, so V = {ab, nb}. Let j♠ be the next (negative) example.
25
Example: G-/S-Boundaries of V
Positive examples: 4♣, 7♣, 2♠. Negative examples: 5♥, j♠. The version space converges to the single hypothesis nb: NUM(r) ∧ BLACK(s) ⇒ REWARD([r,s])
26
Example: G-/S-Boundaries of V
Let us return to the version space {ab, nb, a♣, n♣} … and let 8♣ be the next (negative) example. The only most specific hypothesis (n♣) disagrees with this example, hence no hypothesis in H agrees with all examples.
27
Example: G-/S-Boundaries of V
Let us return to the version space {ab, nb, a♣, n♣} … and let j♥ be the next (positive) example. The only most general hypothesis (ab) disagrees with this example, hence no hypothesis in H agrees with all examples.
28
Version Space Update
x ← new example
If x is positive then (G,S) ← POSITIVE-UPDATE(G,S,x)
Else (G,S) ← NEGATIVE-UPDATE(G,S,x)
If G or S is empty then return failure
29
POSITIVE-UPDATE(G,S,x)
Eliminate all hypotheses in G that do not agree with x
30
POSITIVE-UPDATE(G,S,x)
Eliminate all hypotheses in G that do not agree with x. Minimally generalize all hypotheses in S until they are consistent with x, using the generalization sets of the hypotheses.
31
POSITIVE-UPDATE(G,S,x)
Eliminate all hypotheses in G that do not agree with x. Minimally generalize all hypotheses in S until they are consistent with x. Remove from S every hypothesis that is neither more specific than nor equal to a hypothesis in G. (This step was not needed in the card example.)
32
POSITIVE-UPDATE(G,S,x)
Eliminate all hypotheses in G that do not agree with x. Minimally generalize all hypotheses in S until they are consistent with x. Remove from S every hypothesis that is neither more specific than nor equal to a hypothesis in G. Remove from S every hypothesis that is more general than another hypothesis in S. Return (G,S).
33
NEGATIVE-UPDATE(G,S,x)
Eliminate all hypotheses in S that do not agree with x. Minimally specialize all hypotheses in G until they are consistent with x. Remove from G every hypothesis that is neither more general than nor equal to a hypothesis in S. Remove from G every hypothesis that is more specific than another hypothesis in G. Return (G,S).
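The update cycle above can be simulated end to end on the card example. This sketch deliberately sidesteps the symbolic generalization/specialization sets: because the toy hypothesis space is small enough to enumerate, each update simply filters the enumerated V and re-derives G and S from it, which yields the same boundaries. All names and the (rank-predicate, suit-predicate) encoding are illustrative assumptions.

```python
from functools import lru_cache
from itertools import product

RANKS = list(range(1, 11)) + ["j", "q", "k"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
RANK_PREDS = ["a", "n", "f"] + [str(r) for r in RANKS]   # ANY-RANK, NUM, FACE, fixed ranks
SUIT_PREDS = ["a", "b", "r"] + [s[0] for s in SUITS]     # ANY-SUIT, BLACK, RED, fixed suits

def holds(h, card):
    (rp, sp), (r, s) = h, card
    rank_ok = {"a": True, "n": r in RANKS[:10],
               "f": r in ("j", "q", "k")}.get(rp, rp == str(r))
    suit_ok = {"a": True, "b": s in ("spades", "clubs"),
               "r": s in ("hearts", "diamonds")}.get(sp, sp == s[0])
    return rank_ok and suit_ok

@lru_cache(maxsize=None)
def extension(h):
    return frozenset((r, s) for r in RANKS for s in SUITS if holds(h, (r, s)))

def update(V, card, label):
    """One positive/negative update, done by filtering the enumerated V and
    re-deriving the boundaries G (most general) and S (most specific)."""
    V = {h for h in V if holds(h, card) == label}
    if not V:
        raise ValueError("failure: G or S would become empty")
    G = {h for h in V if not any(extension(g) > extension(h) for g in V)}
    S = {h for h in V if not any(extension(h) > extension(s2) for s2 in V)}
    return V, G, S

V = set(product(RANK_PREDS, SUIT_PREDS))   # V <- H, |H| = 112
history = []
for card, label in [((4, "clubs"), True), ((7, "clubs"), True),
                    ((5, "hearts"), False), ((2, "spades"), True),
                    (("j", "spades"), False)]:
    V, G, S = update(V, card, label)
    history.append((G, S))
```

Running it reproduces the boundaries traced in the slides: after 4♣, G = {aa} and S = {4♣}; after 5♥, G = {ab} and S = {n♣}; after all five examples, G = S = {nb}.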
34
Example-Selection Strategy
Suppose that at each step the learning procedure can select the object (card) of the next example. Let it pick the object such that, whether the example turns out positive or negative, it will eliminate one-half of the remaining hypotheses. Then a single hypothesis will be isolated in O(log |H|) steps.
35
Example: with V = {aa, na, ab, nb, a♣, n♣}, a well-chosen query such as “j♣?” splits V exactly in half whichever way it is answered.
36
Example-Selection Strategy
Suppose that at each step the learning procedure can select the object (card) of the next example. Let it pick the object such that, whether the example turns out positive or negative, it will eliminate one-half of the remaining hypotheses. Then a single hypothesis will be isolated in O(log |H|) steps. But picking the object that eliminates half the version space may be expensive.
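The greedy form of this strategy (pick the card whose worst-case elimination is largest) is easy to sketch by brute force, which also makes the cost concern concrete: every candidate card is scored against every hypothesis in V. Names and the hypothesis encoding are illustrative assumptions, as in the earlier sketches.

```python
from itertools import product

RANKS = list(range(1, 11)) + ["j", "q", "k"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]

def holds(h, card):
    (rp, sp), (r, s) = h, card
    rank_ok = {"a": True, "n": r in RANKS[:10],
               "f": r in ("j", "q", "k")}.get(rp, rp == str(r))
    suit_ok = {"a": True, "b": s in ("spades", "clubs"),
               "r": s in ("hearts", "diamonds")}.get(sp, sp == s[0])
    return rank_ok and suit_ok

def worst_case_eliminated(V, card):
    """Hypotheses guaranteed to be eliminated by querying this card,
    whichever way the example turns out."""
    pos = sum(holds(h, card) for h in V)
    return min(pos, len(V) - pos)

def best_query(V):
    """Greedy halving: scan all 52 cards for the best worst-case split."""
    return max(product(RANKS, SUITS), key=lambda c: worst_case_eliminated(V, c))
```

For V = {ab, a♣, nb, n♣}, the 9♠ query splits V into two halves (ab and nb predict reward; a♣ and n♣ do not), while 5♥ eliminates nothing in the worst case.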
37
Noise: if some examples are misclassified, the version space may collapse. Possible solution: maintain several G- and S-boundaries, e.g., consistent with all examples, with all examples but one, etc. (Exercise: develop this idea!)
38
Current-Best-Hypothesis Search
Keep one hypothesis at each step. Generalize or specialize the hypothesis at each new example. Details left as an exercise…
39
VSL vs DTL: Decision tree learning (DTL) is more efficient if all examples are given in advance; used incrementally, it may produce successive hypotheses, each poorly related to the previous one. Version space learning (VSL) is incremental. DTL can produce simplified hypotheses that do not agree with all examples. DTL has been more widely used in practice.
40
Can Inductive Learning Work?
Example set X, with p(x) = probability that example x is picked from X; f: correct hypothesis; training set D of size m (positive and negative examples); hypothesis space H of size |H|, from which the inductive hypothesis h is drawn
41
Approximately Correct Hypothesis
h ∈ H is approximately correct (AC) with accuracy ε iff: Pr[h(x) ≠ f(x)] ≤ ε, where x is an example picked with probability distribution p from X
42
PAC Learning Procedure
L is Provably Approximately Correct (PAC) with confidence δ iff: Pr[ Pr[h(x) ≠ f(x)] > ε ] ≤ δ. Can L be PAC? If yes, how big should the size m of the training set D be?
43
Can L Be PAC? Let g be an arbitrary element of H that is not approximately correct. Since g is not AC, we have: Pr[g(x) ≠ f(x)] > ε. So, the probability that g is consistent with all the examples in D is at most (1−ε)^m … … and the probability that there exists a non-AC hypothesis matching all the examples in D is at most |H|(1−ε)^m.
44
Can L Be PAC? Let g be an arbitrary element of H that is not approximately correct. Since g is not AC, we have: Pr[g(x) ≠ f(x)] > ε. So, the probability that g is consistent with all the examples in D is at most (1−ε)^m … … and the probability that there exists a non-AC hypothesis matching all the examples in D is at most |H|(1−ε)^m. Therefore, L is PAC if the size m of the training set verifies: |H|(1−ε)^m ≤ δ.
45
Size of Training Set: from |H|(1−ε)^m ≤ δ we derive: m ≥ ln(δ/|H|) / ln(1−ε). Since ε < −ln(1−ε) for 0 < ε < 1, it suffices that: m ≥ ln(δ/|H|) / (−ε) = ln(|H|/δ) / ε. So, m increases logarithmically with the size of the hypothesis space. But how big is |H|?
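The sufficient bound m ≥ ln(|H|/δ)/ε is easy to evaluate numerically. A small sketch (the function name and the choice of ε = 0.1, δ = 0.05 are illustrative assumptions; |H| = 112 is the size of the card hypothesis space from slide 6):

```python
from math import ceil, log

def pac_sample_bound(h_size, eps, delta):
    """Sufficient training-set size from the bound m >= ln(|H|/delta) / eps."""
    return ceil(log(h_size / delta) / eps)

# Card example: |H| = 16 rank predicates * 7 suit predicates = 112 hypotheses,
# accuracy eps = 0.1, confidence delta = 0.05.
m = pac_sample_bound(112, 0.1, 0.05)
```

With these numbers, m = 78 examples suffice, and one can check directly that the original requirement |H|(1−ε)^m ≤ δ is then met.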
46
Importance of KIS Bias: if H is the set of all logical sentences with n base predicates, then |H| = 2^(2^n), and m is exponential in n. If H is the set of all conjunctions of k << n base predicates picked among the n predicates, then |H| = O(n^k) and m is logarithmic in n. Hence the importance of choosing a “good” KIS (keep it simple) bias.
47
Explanation-Based Learning
KB: background knowledge. D: observed knowledge, such that KB ⊭ D. Inductive learning: find h such that KB and h are consistent and KB,h ⊨ D. Explanation-based learning: find h such that KB = KB1,KB2 with KB1 ⊨ h and KB2,h ⊨ D. Example: derivatives of functions. KB1 is the general theory; D consists of examples; h defines the derivatives of usual functions; KB2 gives simplification rules. Nothing really new is learnt!
48
Version Space and PAC Learning
Summary
Version space method
Structure of hypothesis space
Generalization/specialization of hypotheses
PAC learning
Explanation-based learning