Kansas State University, Department of Computing and Information Sciences
CIS 830: Advanced Topics in Artificial Intelligence
Lecture 3: Data Mining Basics
Monday, January 22, 2001
William H. Hsu
http://www.cis.ksu.edu/~bhsu
Readings: Chapters 1-2, Witten and Frank; Sections 2.7-2.8, Mitchell
Lecture Outline
Read Chapters 1-2, Witten and Frank; Sections 2.7-2.8, Mitchell
Homework 1: due Friday, February 2, 2001 (before 12 AM CST)
Paper Commentary 1: due this Friday (in class)
–U. Fayyad, "From Data Mining to Knowledge Discovery"
–See guidelines in course notes
Supervised Learning (continued)
–Version spaces
–Candidate elimination algorithm
–Derivation examples
The Need for Inductive Bias
–Representations (hypothesis languages): a worst-case scenario
–Change of representation
Computational Learning Theory
Representing Version Spaces
Hypothesis Space
–A finite meet semilattice (partial ordering Less-Specific-Than; top element: the all-? hypothesis)
–Every pair of hypotheses has a greatest lower bound (GLB)
–VS_H,D: the consistent poset (partially-ordered subset of H)
Definition: General Boundary
–General boundary G of version space VS_H,D: set of most general members
–Most general = minimal elements of VS_H,D, the "set of necessary conditions"
Definition: Specific Boundary
–Specific boundary S of version space VS_H,D: set of most specific members
–Most specific = maximal elements of VS_H,D, the "set of sufficient conditions"
Version Space
–Every member of the version space lies between S and G
–VS_H,D ≡ { h ∈ H | ∃ s ∈ S, ∃ g ∈ G: g ≤_P h ≤_P s }, where ≤_P denotes Less-Specific-Than
Candidate Elimination Algorithm [1]
1. Initialization
G ← (singleton) set containing the most general hypothesis in H, denoted {<?, ?, ?, ?, ?, ?>}
S ← set of most specific hypotheses in H, denoted {<Ø, Ø, Ø, Ø, Ø, Ø>}
2. For each training example d
If d is a positive example (Update-S)
Remove from G any hypotheses inconsistent with d
For each hypothesis s in S that is not consistent with d
–Remove s from S
–Add to S all minimal generalizations h of s such that
1. h is consistent with d
2. Some member of G is more general than h
(these are the greatest lower bounds, or meets, s ∧ d, in VS_H,D)
–Remove from S any hypothesis that is more general than another hypothesis in S (remove any dominated elements)
Candidate Elimination Algorithm [2]
(continued)
If d is a negative example (Update-G)
Remove from S any hypotheses inconsistent with d
For each hypothesis g in G that is not consistent with d
–Remove g from G
–Add to G all minimal specializations h of g such that
1. h is consistent with d
2. Some member of S is more specific than h
(these are the least upper bounds, or joins, g ∨ d, in VS_H,D)
–Remove from G any hypothesis that is less general than another hypothesis in G (remove any dominated elements)
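The Update-S/Update-G steps above can be sketched in Python for conjunctive hypotheses over discrete attributes ('?' = don't care, 'Ø' = match nothing). The weather-style attribute values and training examples below are illustrative (an EnjoySport-like setup assumed for the demo), not necessarily the course's exact dataset.

```python
def more_general(h1, h2):
    """True iff h1 is at least as general as h2 (covers everything h2 covers)."""
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def covers(h, x):
    """A conjunctive hypothesis covers an instance iff every non-'?' entry matches."""
    return all(a == '?' or a == b for a, b in zip(h, x))

def candidate_elimination(examples, values):
    n = len(values)
    G = [tuple(['?'] * n)]   # most general boundary
    S = [tuple(['Ø'] * n)]   # most specific boundary (empty hypothesis)
    for x, positive in examples:
        if positive:                                   # Update-S
            G = [g for g in G if covers(g, x)]         # prune inconsistent G
            new_S = []
            for s in S:
                if covers(s, x):
                    new_S.append(s)
                    continue
                # minimal generalization of s that covers x
                h = tuple(xv if sv == 'Ø' else (sv if sv == xv else '?')
                          for sv, xv in zip(s, x))
                if any(more_general(g, h) for g in G):
                    new_S.append(h)
            # keep only the maximally specific elements
            S = [s for s in new_S
                 if not any(s != t and more_general(s, t) for t in new_S)]
        else:                                          # Update-G
            S = [s for s in S if not covers(s, x)]     # prune inconsistent S
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)
                    continue
                # minimal specializations: fix one '?' to any value not in x
                for i, vals in enumerate(values):
                    if g[i] != '?':
                        continue
                    for v in vals:
                        if v != x[i]:
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(more_general(h, s) for s in S):
                                new_G.append(h)
            # keep only the maximally general elements
            G = [g for g in new_G
                 if not any(g != h and more_general(h, g) for h in new_G)]
    return S, G

# Illustrative run (attribute values assumed for the demo)
values = [('Sunny', 'Rainy'), ('Warm', 'Cold'), ('Normal', 'High'),
          ('Strong', 'Light'), ('Warm', 'Cool'), ('Same', 'Change')]
examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), True),
]
S, G = candidate_elimination(examples, values)
print(S)  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)
```

Note how the two updates mirror each other: a positive example generalizes S and prunes G, a negative example specializes G and prunes S.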
Example Trace
[Figure: candidate elimination trace showing the boundary sets S0/G0 through S4/G4 as training examples d1-d4 are processed, with S1 = G1 and S2 = G2 collapsing in the diagram]
What Next Training Example?
S: … G: … [boundary sets shown in the original slide figure]
What Query Should The Learner Make Next?
How Should These Be Classified? [candidate instances shown in the original slide figure]
What Justifies This Inductive Leap?
Example: Inductive Generalization
–Positive example: …
–Induced S: …
Why Believe We Can Classify The Unseen?
–e.g., …
–When is there enough information (in a new case) to make a prediction?
An Unbiased Learner
Example of a Biased H
–Conjunctive concepts with don't-cares
–What concepts can H not express? (Hint: what are its syntactic limitations?)
Idea
–Choose H' that expresses every teachable concept
–i.e., H' is the power set of X
–Recall: |A → B| = |B|^|A| (A = X; B = {labels}; H' = A → B)
–X = {Rainy, Sunny} × {Warm, Cold} × {Normal, High} × {None, Mild, Strong} × {Cool, Warm} × {Same, Change}; labels = {0, 1}
An Exhaustive Hypothesis Language
–Consider: H' = disjunctions (∨), conjunctions (∧), negations (¬) over the previous H
–|H'| = 2^(2·2·2·3·2·2) = 2^96; |H| = 1 + (3·3·3·4·3·3) = 973
What Are S, G For The Hypothesis Language H'?
–S ← disjunction of all positive examples
–G ← conjunction of all negated negative examples
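The counting argument above can be checked directly, using the slide's attribute value counts (2, 2, 2, 3, 2, 2) for the six attributes:

```python
from math import prod

value_counts = [2, 2, 2, 3, 2, 2]

n_instances = prod(value_counts)                       # |X| = 2·2·2·3·2·2 = 96
n_unbiased = 2 ** n_instances                          # |H'| = |power set of X| = 2^96
n_conjunctive = 1 + prod(v + 1 for v in value_counts)  # one extra "?" value per attribute, plus Ø

print(n_instances, n_conjunctive)  # 96 973
```

The gap between 973 and 2^96 is the point of the slide: a conjunctive H covers only a vanishing fraction of the concepts over X.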
Inductive Bias
Components of An Inductive Bias Definition
–Concept learning algorithm L
–Instances X, target concept c
–Training examples D_c = {<x, c(x)>}
–L(x_i, D_c) = classification assigned to instance x_i by L after training on D_c
Definition
–The inductive bias of L is any minimal set of assertions B such that, for any target concept c and corresponding training examples D_c,
 ∀ x_i ∈ X . [(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)]
 where A ⊢ B means A logically entails B
–Informal idea: preference for (i.e., restriction to) certain hypotheses by structural (syntactic) means
Rationale
–Prior assumptions regarding the target concept
–Basis for inductive generalization
Inductive Systems and Equivalent Deductive Systems
[Diagram]
Inductive system: candidate elimination algorithm using hypothesis space H; inputs: training examples, new instance; output: classification of new instance (or "don't know")
Equivalent deductive system: theorem prover; inputs: training examples, new instance, and the assertion c ∈ H (the inductive bias made explicit); output: classification of new instance (or "don't know")
Three Learners with Different Biases
Rote Learner
–Weakest bias: no bias (nothing beyond what was seen before)
–Store examples
–Classify x if and only if it matches a previously observed example
Version Space Candidate Elimination Algorithm
–Stronger bias: concepts belonging to conjunctive H
–Store extremal generalizations and specializations
–Classify x if and only if it "falls within" the S and G boundaries (all members agree)
Find-S
–Even stronger bias: most specific hypothesis
–Prior assumption: any instance not observed to be positive is negative
–Classify x based on the S set
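Find-S, the most strongly biased of the three, reduces to a few lines: keep the single most specific conjunctive hypothesis and generalize it just enough to cover each positive example. A minimal sketch ('?' = don't care, 'Ø' = match nothing; the weather-style data is illustrative, not necessarily the course's exact example):

```python
def find_s(examples, n_attrs):
    h = ['Ø'] * n_attrs              # start with the most specific hypothesis
    for x, positive in examples:
        if not positive:
            continue                 # Find-S ignores negative examples
        for i, xv in enumerate(x):
            if h[i] == 'Ø':
                h[i] = xv            # first positive example: copy its values
            elif h[i] != xv:
                h[i] = '?'           # conflicting value: generalize to don't-care
    return tuple(h)

examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), True),
]
h = find_s(examples, 6)
print(h)  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Note that Find-S never consults the negative examples; it relies entirely on the prior assumption that unobserved instances are negative.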
Recall: 4-Variable Concept Learning Problem
[Diagram: unknown function y = f(x1, x2, x3, x4)]
Bias: Simple Conjunctive Rules
–Only 16 simple conjunctive rules of the form y = x_i ∧ x_j ∧ x_k
–y = Ø, x1, …, x4, x1 ∧ x2, …, x3 ∧ x4, x1 ∧ x2 ∧ x3, …, x2 ∧ x3 ∧ x4, x1 ∧ x2 ∧ x3 ∧ x4
–Example above: no simple rule explains the data (counterexamples?)
–Similarly for simple clauses (conjunction and disjunction allowed)
Hypothesis Space: A Syntactic Restriction
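The 16 rules can be enumerated as subsets of {x1, …, x4} (the empty subset read as y = Ø, i.e., always predict 0, matching the slide's count 1 + 4 + 6 + 4 + 1 = 16), and the consistency check is a one-liner. The small dataset below is hypothetical, not the slide's original table; it only illustrates a case where no simple conjunctive rule fits:

```python
from itertools import combinations

def conjunctive_rules(n=4):
    """All 16 simple conjunctive rules over x1..x4, one per variable subset."""
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            yield subset

def predict(subset, x):
    # empty subset = the rule y = Ø, which always predicts 0
    return 0 if not subset else int(all(x[i] for i in subset))

def consistent(subset, data):
    return all(predict(subset, x) == y for x, y in data)

rules = list(conjunctive_rules())
print(len(rules))  # 16

# Hypothetical data: any nonempty conjunction fires on (1,1,1,1), yet y = 0 there
data = [((1, 1, 0, 0), 1), ((0, 0, 1, 1), 1), ((1, 1, 1, 1), 0)]
print(any(consistent(r, rules and data) for r in rules))
```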
Hypothesis Space: m-of-n Rules
m-of-n Rules
–32 possible rules of the form "y = 1 iff at least m of the following n variables are 1"
Found a Consistent Hypothesis!
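The count of 32 comes from choosing a nonempty variable subset of size k and a threshold m in 1..k: 4·1 + 6·2 + 4·3 + 1·4 = 32. A short sketch enumerating and evaluating these rules (the example rule at the end is illustrative):

```python
from itertools import combinations

def m_of_n_rules(n=4):
    """Enumerate all (threshold, variable-subset) pairs over x1..x4."""
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            for m in range(1, k + 1):
                yield (m, subset)

def predict(rule, x):
    m, subset = rule
    return int(sum(x[i] for i in subset) >= m)

rules = list(m_of_n_rules())
print(len(rules))  # 32

# e.g., the rule "y = 1 iff at least 2 of (x1, x3, x4) are 1"
rule = (2, (0, 2, 3))
print(predict(rule, (1, 0, 1, 0)), predict(rule, (1, 0, 0, 0)))  # 1 0
```

Because an m-of-n rule counts satisfied variables rather than requiring all of them, this slightly richer language can fit data that defeats every simple conjunction.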
Views of Learning
Removal of (Remaining) Uncertainty
–Suppose the unknown function was known to be an m-of-n Boolean function
–Could use training data to infer the function
Learning and Hypothesis Languages
–Possible approach for guessing a good, small hypothesis language:
 Start with a very small language
 Enlarge until it contains a hypothesis that fits the data
–Inductive bias
 Preference for certain languages
 Analogous to data compression (removal of redundancy)
 Later: coding the "model" versus coding the "uncertainty" (error)
We Could Be Wrong!
–Prior knowledge could be wrong (e.g., y = x4 ∧ one-of(x1, x3) is also consistent)
–If the guessed language was wrong, errors will occur on new cases
Two Strategies for Machine Learning
Develop Ways to Express Prior Knowledge
–Role of prior knowledge: guides search for hypotheses / hypothesis languages
–Expression languages for prior knowledge
 Rule grammars; stochastic models; etc.
 Restrictions on computational models; other (formal) specification methods
Develop Flexible Hypothesis Spaces
–Structured collections of hypotheses
 Agglomeration: nested collections (hierarchies)
 Partitioning: decision trees, lists, rules
 Neural networks; cases; etc.
–Hypothesis spaces of adaptive size
Either Case: Develop Algorithms for Finding a Hypothesis That Fits Well
–Ideally, one that will also generalize well
Later: Bias Optimization (Meta-Learning, Wrappers)
Terminology
The Version Space Algorithm
–Version space: constructive definition
 S and G boundaries characterize the learner's uncertainty
 The version space can be used to make predictions over unseen cases
–Algorithms: Find-S, List-Then-Eliminate, candidate elimination
–Consistent hypothesis: one that correctly predicts the observed examples
–Version space: the space of all currently consistent (or satisfiable) hypotheses
Inductive Bias
–Strength of inductive bias: how few hypotheses?
–Specific biases: based on specific languages
Hypothesis Language
–"Searchable subset" of the space of possible descriptors
–m-of-n, conjunctive, disjunctive, clauses
–Ability to represent a concept
Summary Points
Introduction to Supervised Concept Learning
Inductive Leaps Possible Only if Learner Is Biased
–Futility of learning without bias
–Strength of inductive bias: proportional to restrictions on hypotheses
Modeling Inductive Learners with Equivalent Deductive Systems
–Representing inductive learning as theorem proving
–Equivalent learning and inference problems
Syntactic Restrictions
–Example: m-of-n concept
–Other examples?
Views of Learning and Strategies
–Removing uncertainty ("data compression")
–Role of knowledge
Next Lecture: More on Knowledge Discovery in Databases (KDD)