Kansas State University, Department of Computing and Information Sciences
CIS 830: Advanced Topics in Artificial Intelligence

Lecture 3: Data Mining Basics
Monday, January 22, 2001
William H. Hsu, Department of Computing and Information Sciences, KSU
Readings: Chapters 1-2, Witten and Frank; Sections […], Mitchell

Lecture Outline
– Read Chapters 1-2, Witten and Frank; Sections […], Mitchell
– Homework 1: due Friday, February 2, 2001 (before 12 AM CST)
– Paper Commentary 1: due this Friday (in class)
  – U. Fayyad, "From Data Mining to Knowledge Discovery"
  – See guidelines in course notes
– Supervised Learning (continued)
  – Version spaces
  – Candidate elimination algorithm: derivation, examples
– The Need for Inductive Bias
  – Representations (hypothesis languages): a worst-case scenario
  – Change of representation
– Computational Learning Theory

Representing Version Spaces

Hypothesis Space
– A finite meet semilattice under the partial ordering Less-Specific-Than (least element ≡ the all-"?" hypothesis)
– Every pair of hypotheses has a greatest lower bound (GLB)
– VS_{H,D} ≡ the consistent poset (partially-ordered subset of H)

Definition: General Boundary
– General boundary G of version space VS_{H,D}: set of most general members
– Most general ≡ minimal elements of VS_{H,D} ≡ "set of necessary conditions"

Definition: Specific Boundary
– Specific boundary S of version space VS_{H,D}: set of most specific members
– Most specific ≡ maximal elements of VS_{H,D} ≡ "set of sufficient conditions"

Version Space
– Every member of the version space lies between S and G
– VS_{H,D} ≡ { h ∈ H | ∃s ∈ S . ∃g ∈ G . g ≤_P h ≤_P s }, where ≤_P denotes Less-Specific-Than
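
For the conjunctive attribute-vector hypotheses used in this lecture (Mitchell, Ch. 2), the ordering above can be made concrete. Below is a minimal Python sketch, under the assumption that a hypothesis is a tuple of per-attribute constraints where "?" matches any value and "0" (Ø) matches none; the helper names are illustrative, not from the slides.

```python
ANY, NONE = "?", "0"   # "?": don't care; "0": the empty (match-nothing) constraint

def matches(h, x):
    """True iff hypothesis h covers instance x."""
    return all(c == ANY or c == v for c, v in zip(h, x))

def more_general_or_equal(h1, h2):
    """True iff h1 covers every instance that h2 covers (h1 at least as general)."""
    return all(c1 == ANY or c2 == NONE or c1 == c2 for c1, c2 in zip(h1, h2))

# Example: <Sunny, ?, ?> is at least as general as <Sunny, Warm, ?>
assert more_general_or_equal(("Sunny", ANY, ANY), ("Sunny", "Warm", ANY))
```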

Candidate Elimination Algorithm [1]

1. Initialization
   G ← (singleton) set containing the most general hypothesis in H, denoted {⟨?, …, ?⟩}
   S ← set of most specific hypotheses in H, denoted {⟨Ø, …, Ø⟩}

2. For each training example d

If d is a positive example (Update-S)
– Remove from G any hypotheses inconsistent with d
– For each hypothesis s in S that is not consistent with d:
  – Remove s from S
  – Add to S all minimal generalizations h of s such that
    1. h is consistent with d
    2. some member of G is more general than h
    (these are the greatest lower bounds, or meets, s ∧ d, in VS_{H,D})
  – Remove from S any hypothesis that is more general than another hypothesis in S (remove any dominated elements)

Candidate Elimination Algorithm [2]

(continued)

If d is a negative example (Update-G)
– Remove from S any hypotheses inconsistent with d
– For each hypothesis g in G that is not consistent with d:
  – Remove g from G
  – Add to G all minimal specializations h of g such that
    1. h is consistent with d
    2. some member of S is more specific than h
    (these are the least upper bounds, or joins, g ∨ d, in VS_{H,D})
  – Remove from G any hypothesis that is less general than another hypothesis in G (remove any dominating elements)
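
Putting Update-S and Update-G together, here is a hedged Python sketch of the full algorithm for the conjunctive representation above; the structure and names are illustrative, not the lecture's code. The single-hypothesis generalization step exploits the fact that S remains a singleton in this language.

```python
ANY, NONE = "?", "0"   # "?": don't care; "0": the empty constraint (Ø)

def matches(h, x):
    """True iff hypothesis h covers instance x."""
    return all(c == ANY or c == v for c, v in zip(h, x))

def more_general(h1, h2):
    """True iff h1 is at least as general as h2."""
    return all(c1 == ANY or c2 == NONE or c1 == c2 for c1, c2 in zip(h1, h2))

def min_generalize(s, x):
    """Minimal generalization of s that covers positive example x."""
    return tuple(v if c == NONE else (c if c == v else ANY) for c, v in zip(s, x))

def min_specializations(g, x, domains):
    """Minimal specializations of g that exclude negative example x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == ANY
            for v in domains[i] if v != x[i]]

def candidate_elimination(examples, domains):
    """examples: iterable of (instance tuple, bool); domains: list of value lists."""
    n = len(domains)
    G = {(ANY,) * n}    # general boundary: the all-"?" hypothesis
    S = {(NONE,) * n}   # specific boundary: the empty hypothesis
    for x, positive in examples:
        if positive:    # Update-S
            G = {g for g in G if matches(g, x)}
            S = {min_generalize(s, x) if not matches(s, x) else s for s in S}
            S = {s for s in S if any(more_general(g, s) for g in G)}
            # (S stays a singleton for conjunctive H, so dominance pruning is skipped)
        else:           # Update-G
            S = {s for s in S if not matches(s, x)}
            newG = set()
            for g in G:
                if not matches(g, x):
                    newG.add(g)
                else:   # specialize g just enough to exclude x
                    newG |= {h for h in min_specializations(g, x, domains)
                             if any(more_general(h, s) for s in S)}
            # remove members strictly less general than another member of G
            G = {g for g in newG
                 if not any(g2 != g and more_general(g2, g) for g2 in newG)}
    return S, G
```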

Example Trace

[Slide figure: a worked candidate elimination trace over four training examples d1-d4, showing the evolution of the boundary sets S0-S4 and G0-G4; the example instances themselves are not preserved in this transcript.]

What Next Training Example?

[Slide figure: current boundary sets S and G with several candidate instances; the instances are not preserved in this transcript.]

– What query should the learner make next?
– How should these instances be classified?
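
One way to answer "how should these be classified?" is by unanimous vote of the version space: predict a label only when the boundary sets force every consistent hypothesis to agree, and answer "don't know" otherwise (such undecided instances are exactly the informative next queries). A minimal sketch, reusing the conjunctive matches helper from the earlier sketch:

```python
ANY = "?"

def matches(h, x):
    return all(c == ANY or c == v for c, v in zip(h, x))

def classify(x, S, G):
    """Unanimous-vote classification by the version space boundaries.
    Returns True/False when all consistent hypotheses must agree, else None."""
    if all(matches(s, x) for s in S):
        return True    # covered by every maximally specific h, hence by all of VS
    if not any(matches(g, x) for g in G):
        return False   # excluded even by every maximally general h
    return None        # the version space disagrees: a good instance to query
```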

What Justifies This Inductive Leap?

Example: Inductive Generalization
– Positive example: […]
– Induced S: […]

Why Believe We Can Classify the Unseen?
– e.g., […]
– When is there enough information (in a new case) to make a prediction?

An Unbiased Learner

Example of a Biased H
– Conjunctive concepts with don't-cares ("?")
– What concepts can H not express? (Hint: what are its syntactic limitations?)

Idea
– Choose H' that expresses every teachable concept, i.e., H' is the power set of X
– Recall: |A → B| = |B|^|A|  (A = X; B = {labels}; H' = (A → B))
– ({Rainy, Sunny} × {Warm, Cold} × {Normal, High} × {None, Mild, Strong} × {Cool, Warm} × {Same, Change}) → {0, 1}

An Exhaustive Hypothesis Language
– Consider: H' = disjunctions (∨), conjunctions (∧), negations (¬) over the previous H
– |H'| = 2^|X| = 2^96;  |H| = 1 + (3 · 3 · 3 · 4 · 3 · 3) = 973

What Are S, G for the Hypothesis Language H'?
– S ≡ disjunction of all positive examples
– G ≡ conjunction of all negated negative examples
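
A quick sanity check of these counts: |X| = 2 · 2 · 2 · 3 · 2 · 2 = 96 instances, so the unbiased language (the power set of X) contains 2^96 concepts, while the conjunctive language has 1 + 3 · 3 · 3 · 4 · 3 · 3 = 973 semantically distinct hypotheses (each attribute offers its values plus "?", plus one globally empty hypothesis). The attribute names below are assumed from Mitchell's EnjoySport example; only the value counts come from the slide.

```python
from math import prod

domains = {"Sky": 2, "AirTemp": 2, "Humidity": 2,   # value counts from the slide
           "Wind": 3, "Water": 2, "Forecast": 2}

n_instances = prod(domains.values())                        # |X|  = 96
n_unbiased = 2 ** n_instances                               # |H'| = 2**96
n_conjunctive = 1 + prod(k + 1 for k in domains.values())   # |H|  = 973

print(n_instances, n_conjunctive, n_unbiased)
# 96 973 79228162514264337593543950336
```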

Inductive Bias

Components of an Inductive Bias Definition
– Concept learning algorithm L
– Instances X, target concept c
– Training examples D_c = {⟨x, c(x)⟩}
– L(x_i, D_c) = classification assigned to instance x_i by L after training on D_c

Definition
– The inductive bias of L is any minimal set of assertions B such that, for any target concept c and corresponding training examples D_c,
  ∀x_i ∈ X . [(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)], where A ⊢ B means A logically entails B
– Informal idea: preference for (i.e., restriction to) certain hypotheses by structural (syntactic) means

Rationale
– Prior assumptions regarding the target concept
– Basis for inductive generalization

Inductive Systems and Equivalent Deductive Systems

[Slide diagram, summarized:]
– Inductive system: training examples and a new instance are fed to the candidate elimination algorithm (using hypothesis space H), which outputs a classification of the new instance (or "don't know")
– Equivalent deductive system: the same training examples and new instance, plus the explicit assertion { c ∈ H }, are fed to a theorem prover, which outputs the same classification (or "don't know")
– The assertion { c ∈ H } makes the inductive bias explicit

Three Learners with Different Biases

Rote Learner
– Weakest bias: anything seen before, i.e., no bias
– Store examples
– Classify x if and only if it matches a previously observed example

Version Space Candidate Elimination Algorithm
– Stronger bias: concepts belonging to the conjunctive H
– Store extremal generalizations and specializations
– Classify x if and only if it "falls within" the S and G boundaries (all members agree)

Find-S
– Even stronger bias: most specific hypothesis
– Prior assumption: any instance not observed to be positive is negative
– Classify x based on the S set (a sketch follows below)
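
For contrast with candidate elimination, Find-S maintains only the specific boundary and ignores negative examples outright. A minimal sketch under the same conjunctive representation as the earlier sketches (illustrative, not the lecture's code):

```python
NONE, ANY = "0", "?"   # "0": empty constraint (Ø); "?": don't care

def find_s(examples, n_attrs):
    """Find-S: maintain the maximally specific hypothesis consistent with
    the positive examples; negatives are ignored by prior assumption."""
    h = (NONE,) * n_attrs                      # start with the most specific h
    for x, positive in examples:
        if positive:                           # minimally generalize h to cover x
            h = tuple(v if c == NONE else (c if c == v else ANY)
                      for c, v in zip(h, x))
    return h
```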

Recall: 4-Variable Concept Learning Problem

Bias: Simple Conjunctive Rules
– Only 16 simple conjunctive rules of the form y = x_i ∧ x_j ∧ x_k
– y = Ø, x_1, …, x_4, x_1 ∧ x_2, …, x_3 ∧ x_4, x_1 ∧ x_2 ∧ x_3, …, x_2 ∧ x_3 ∧ x_4, x_1 ∧ x_2 ∧ x_3 ∧ x_4
– Example above: no simple rule explains the data (counterexamples?)
– Similarly for simple clauses (conjunction and disjunction allowed)

Hypothesis Space: A Syntactic Restriction

[Slide figure: an unknown function y = f(x_1, x_2, x_3, x_4) over four Boolean inputs]
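
The count of 16 is Σ_{k=0..4} C(4,k) = 16, including the empty conjunction. A sketch that enumerates the rules and tests consistency against a truth table; since the lecture's data table is not preserved in this transcript, the example data below is hypothetical, chosen to illustrate the slide's point that no simple conjunctive rule fits.

```python
from itertools import combinations

def conjunctive_rules(n=4):
    """All 16 monotone conjunctions over n=4 variables; the empty
    conjunction is read here as the always-true rule (an assumption)."""
    for k in range(n + 1):
        for idx in combinations(range(n), k):
            yield idx

def consistent(rule, data):
    """data: list of ((x1, ..., xn), y) with 0/1 values."""
    return all(all(x[i] for i in rule) == bool(y) for x, y in data)

# Hypothetical data: no monotone conjunction is consistent with it.
data = [((1, 0, 0, 0), 1), ((0, 1, 0, 0), 1), ((1, 1, 0, 0), 0)]
print([r for r in conjunctive_rules() if consistent(r, data)])  # []
```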

Hypothesis Space: m-of-n Rules

m-of-n Rules
– 32 possible rules of the form "y = 1 iff at least m of the following n variables are 1"
– Found a consistent hypothesis!
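
The count of 32 follows from choosing a nonempty subset of the four variables and a threshold m between 1 and the subset size: Σ_{k=1..4} C(4,k) · k = 4 + 12 + 12 + 4 = 32. A sketch enumerating and evaluating these rules (names illustrative):

```python
from itertools import combinations

def m_of_n_rules(n=4):
    """All 32 m-of-n rules over 4 variables: a nonempty variable subset
    of size k paired with a threshold m in 1..k."""
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            for m in range(1, k + 1):
                yield m, subset

def predict(rule, x):
    m, subset = rule
    return sum(x[i] for i in subset) >= m   # y = 1 iff at least m of them are 1

print(sum(1 for _ in m_of_n_rules()))       # 32
```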

Views of Learning

Removal of (Remaining) Uncertainty
– Suppose the unknown function was known to be an m-of-n Boolean function
– Could use training data to infer the function

Learning and Hypothesis Languages
– Possible approach to guessing a good, small hypothesis language:
  – Start with a very small language
  – Enlarge it until it contains a hypothesis that fits the data
– Inductive bias
  – Preference for certain languages
  – Analogous to data compression (removal of redundancy)
  – Later: coding the "model" versus coding the "uncertainty" (error)

We Could Be Wrong!
– Prior knowledge could be wrong (e.g., y = x_4 ∧ one-of(x_1, x_3) is also consistent)
– If the guessed language was wrong, errors will occur on new cases

Two Strategies for Machine Learning

Develop Ways to Express Prior Knowledge
– Role of prior knowledge: guides the search for hypotheses / hypothesis languages
– Expression languages for prior knowledge
  – Rule grammars; stochastic models; etc.
  – Restrictions on computational models; other (formal) specification methods

Develop Flexible Hypothesis Spaces
– Structured collections of hypotheses
  – Agglomeration: nested collections (hierarchies)
  – Partitioning: decision trees, lists, rules
  – Neural networks; cases; etc.
– Hypothesis spaces of adaptive size

In Either Case: Develop Algorithms for Finding a Hypothesis That Fits Well
– Ideally, one that will also generalize well
– Later: bias optimization (meta-learning, wrappers)

Terminology

The Version Space Algorithm
– Version space: constructive definition
  – S and G boundaries characterize the learner's uncertainty
  – The version space can be used to make predictions over unseen cases
– Algorithms: Find-S, List-Then-Eliminate, candidate elimination
– Consistent hypothesis: one that correctly predicts the observed examples
– Version space: the space of all currently consistent (or satisfiable) hypotheses

Inductive Bias
– Strength of inductive bias: how few hypotheses?
– Specific biases: based on specific languages

Hypothesis Language
– "Searchable subset" of the space of possible descriptors
– m-of-n, conjunctive, disjunctive, clauses
– Ability to represent a concept

Summary Points

Introduction to Supervised Concept Learning

Inductive Leaps Possible Only If the Learner Is Biased
– Futility of learning without bias
– Strength of inductive bias: proportional to the restrictions on hypotheses

Modeling Inductive Learners with Equivalent Deductive Systems
– Representing inductive learning as theorem proving
– Equivalent learning and inference problems

Syntactic Restrictions
– Example: m-of-n concepts
– Other examples?

Views of Learning and Strategies
– Removing uncertainty ("data compression")
– Role of knowledge

Next Lecture: More on Knowledge Discovery in Databases (KDD)