Concept Learning and Version Spaces


Based on Ch. 2 of Tom Mitchell’s Machine Learning and on lecture slides by Uffe Kjaerulff.

Presentation Overview
- Concept learning as boolean function approximation
- Ordering of hypotheses
- Version spaces and the candidate-elimination algorithm
- The role of bias

A Concept Learning Task
Concept learning: inferring a boolean-valued function from training examples (a form of inductive learning).
Example. Given:
- Instances X: possible days, described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast;
- Target concept c: EnjoySport: X → {Yes, No};
- Hypotheses H: each hypothesis is a conjunction of constraints on the attributes, e.g. Water=Warm ∧ Sky=Sunny;
- Training examples D: positive and negative examples of the target function, <x1, c(x1)>, …, <xm, c(xm)>.
Determine: a hypothesis h in H such that h(x) = c(x) for all x in X.

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes
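A minimal sketch (not part of the original slides) of how this task could be encoded in Python. The names D, ATTRIBUTES, MOST_SPECIFIC, MOST_GENERAL, and covers are assumptions introduced here and reused in the later sketches.

```python
# Hypothetical encoding of the EnjoySport training set from the table above.
# Each instance is a tuple of attribute values; the label is True for positive examples.
ATTRIBUTES = ("Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast")

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]

# A hypothesis is a tuple of constraints: a specific value, "?" (any value), or "0" (no value).
MOST_SPECIFIC = ("0",) * len(ATTRIBUTES)
MOST_GENERAL = ("?",) * len(ATTRIBUTES)

def covers(h, x):
    """Return True if hypothesis h classifies instance x as positive."""
    return all(c == "?" or c == v for c, v in zip(h, x))
```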

The Inductive Learning Hypothesis
Note: the only information available about c is its value c(x) for each training example <x, c(x)> in D.
The inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other, unobserved examples.

Concept Learning as Search
Some notation for hypothesis representation:
- "?" means that any value is acceptable for this attribute;
- "0" means that no value is acceptable.
In our example:
- Sky ∈ {Sunny, Cloudy, Rainy}; AirTemp ∈ {Warm, Cold}; Humidity ∈ {Normal, High}; Wind ∈ {Strong, Weak}; Water ∈ {Warm, Cold}; Forecast ∈ {Same, Change}.
- The instance space contains 3*2*2*2*2*2 = 96 distinct instances.
- The hypothesis space contains 5*4*4*4*4*4 = 5120 syntactically distinct hypotheses.
More realistic learning tasks have much larger hypothesis spaces, so efficient search strategies are crucial.
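As a quick check on these counts, a short sketch assuming the attribute domain sizes listed above:

```python
from math import prod

# Attribute domain sizes for the EnjoySport task (as listed above).
DOMAIN_SIZES = {"Sky": 3, "AirTemp": 2, "Humidity": 2,
                "Wind": 2, "Water": 2, "Forecast": 2}

instances = prod(DOMAIN_SIZES.values())                 # 3*2*2*2*2*2 = 96
syntactic = prod(n + 2 for n in DOMAIN_SIZES.values())  # add "?" and "0": 5*4*4*4*4*4 = 5120
# Any hypothesis containing "0" classifies every instance as negative, so there are
# only 1 + 4*3*3*3*3*3 = 973 semantically distinct hypotheses (Mitchell, Ch. 2).
semantic = 1 + prod(n + 1 for n in DOMAIN_SIZES.values())
print(instances, syntactic, semantic)                   # 96 5120 973
```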

More-General-Than
Let hj and hk be boolean-valued functions defined over X. Then hj is more general than or equal to hk (written hj ≥g hk) if and only if
  (∀x ∈ X) [hk(x) = 1 → hj(x) = 1].
This relation establishes a partial order on the hypothesis space.
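A possible implementation of the ≥g test for the conjunctive representation, reusing the hypothetical constraint encoding introduced earlier:

```python
def more_general_or_equal(hj, hk):
    """hj >=g hk: every instance that satisfies hk also satisfies hj.
    For conjunctive hypotheses this reduces to a per-attribute constraint check."""
    def constraint_geq(cj, ck):
        return cj == "?" or cj == ck or ck == "0"
    return all(constraint_geq(cj, ck) for cj, ck in zip(hj, hk))

# ("Sunny", "?", "?", "?", "?", "?") is (strictly) more general than
# ("Sunny", "Warm", "?", "?", "?", "?"):
assert more_general_or_equal(("Sunny", "?", "?", "?", "?", "?"),
                             ("Sunny", "Warm", "?", "?", "?", "?"))
```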

Find-S Algorithm
1. Initialize h to the most specific hypothesis in H.
2. For each positive training instance x:
   For each attribute constraint ai in h:
     if the constraint ai is satisfied by x, do nothing;
     otherwise replace ai in h by the next more general constraint that is satisfied by x.
3. Output hypothesis h.
Note: this assumes that H contains c and that D contains no errors; otherwise the technique does not work.
Limitations:
- Cannot tell whether it has learned the concept: there may be other consistent hypotheses;
- Fails if the training data is inconsistent;
- Picks a maximally specific h; depending on H there may be several.
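A sketch of Find-S under the same assumed encoding (D, MOST_SPECIFIC, and covers as defined earlier):

```python
def find_s(examples):
    """Find-S: maximally specific conjunctive hypothesis covering the positive examples."""
    h = list(MOST_SPECIFIC)
    for x, label in examples:
        if not label:                 # Find-S ignores negative examples
            continue
        for i, value in enumerate(x):
            if h[i] == "0":           # first positive example: adopt its value
                h[i] = value
            elif h[i] != value:       # constraint violated: generalize to "?"
                h[i] = "?"
    return tuple(h)

print(find_s(D))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```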

Version Spaces
A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example <x, c(x)> in D:
  Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) [h(x) = c(x)]
The version space VS_{H,D} with respect to H and D is the subset of hypotheses from H consistent with all training examples in D:
  VS_{H,D} ≡ { h ∈ H | Consistent(h, D) }
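The consistency test translates directly (covers as in the earlier sketch):

```python
def consistent(h, examples):
    """h is consistent with D iff h(x) = c(x) for every <x, c(x)> in D."""
    return all(covers(h, x) == label for x, label in examples)
```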

The List-Then-Eliminate Algorithm
1. VersionSpace ← a list containing every hypothesis in H.
2. For each training example <x, c(x)> in D: remove from VersionSpace any hypothesis h for which h(x) ≠ c(x).
3. Output the list of hypotheses in VersionSpace.
It maintains an explicit list of all hypotheses in VS_{H,D}, which is unrealistic for most H; a more compact representation of VS_{H,D} is needed.
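A brute-force sketch of List-Then-Eliminate; the enumerate_hypotheses helper is a hypothetical naive enumeration of the whole conjunctive H, which is exactly what makes the approach impractical for large hypothesis spaces:

```python
from itertools import product

def enumerate_hypotheses():
    """Naively enumerate every syntactically distinct conjunctive hypothesis."""
    domains = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
               ("Strong", "Weak"), ("Warm", "Cold"), ("Same", "Change")]
    return product(*[("?", "0") + d for d in domains])

def list_then_eliminate(examples):
    version_space = list(enumerate_hypotheses())     # all 5120 hypotheses
    for x, label in examples:
        version_space = [h for h in version_space if covers(h, x) == label]
    return version_space

print(len(list_then_eliminate(D)))  # 6 hypotheses survive the EnjoySport examples
```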

Example Version Space
Idea: VS_{H,D} can be represented by its sets of most general and most specific consistent hypotheses.

Representing Version Spaces
The general boundary G of version space VS_{H,D} is the set of its most general members. The specific boundary S of version space VS_{H,D} is the set of its most specific members.
Version Space Representation Theorem. Let X be an arbitrary set of instances and let H be a set of boolean-valued hypotheses defined over X. Let c: X → {0,1} be an arbitrary target concept defined over X, and let D be an arbitrary set of training examples {<x, c(x)>}. For all X, H, c, and D such that S and G are well defined,
  VS_{H,D} = { h ∈ H | (∃s ∈ S)(∃g ∈ G) [g ≥g h ≥g s] }

Candidate-Elimination Algorithm
G ← the set of maximally general hypotheses in H
S ← the set of maximally specific hypotheses in H
For each training example d:
  If d is a positive example:
    - Remove from G any hypothesis that does not cover d.
    - For each hypothesis s in S that does not cover d:
        remove s from S;
        add to S all minimal generalizations h of s such that h covers d and some member of G is more general than h.
    - Remove from S any hypothesis that is more general than another hypothesis in S.
  If d is a negative example:
    - Remove from S any hypothesis that covers d.
    - For each hypothesis g in G that covers d:
        remove g from G;
        add to G all minimal specializations h of g such that h does not cover d and some member of S is more specific than h.
    - Remove from G any hypothesis that is more specific than another hypothesis in G.
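A compact sketch of the algorithm for the conjunctive representation (minimal generalizations and specializations are simple in this case; covers, more_general_or_equal, MOST_SPECIFIC, MOST_GENERAL, and D are the hypothetical helpers from the earlier sketches):

```python
def min_generalization(s, x):
    """The (unique) minimal generalization of conjunctive hypothesis s that covers x."""
    return tuple(v if c in ("0", v) else "?" for c, v in zip(s, x))

def min_specializations(g, domains, x):
    """All minimal specializations of g that exclude instance x."""
    result = []
    for i, c in enumerate(g):
        if c == "?":
            for value in domains[i]:
                if value != x[i]:
                    result.append(g[:i] + (value,) + g[i + 1:])
    return result

def candidate_elimination(examples, domains):
    S, G = {MOST_SPECIFIC}, {MOST_GENERAL}
    for x, label in examples:
        if label:                                   # positive example
            G = {g for g in G if covers(g, x)}
            S = ({min_generalization(s, x) for s in S if not covers(s, x)}
                 | {s for s in S if covers(s, x)})
            S = {s for s in S if any(more_general_or_equal(g, s) for g in G)}
            S = {s for s in S
                 if not any(s != t and more_general_or_equal(s, t) for t in S)}
        else:                                       # negative example
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)
                else:
                    new_G |= {h for h in min_specializations(g, domains, x)
                              if any(more_general_or_equal(h, s) for s in S)}
            G = {g for g in new_G
                 if not any(g != h and more_general_or_equal(h, g) for h in new_G)}
    return S, G

DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cold"), ("Same", "Change")]
S, G = candidate_elimination(D, DOMAINS)
# S == {('Sunny', 'Warm', '?', 'Strong', '?', '?')}
# G == {('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')}
```

On the four EnjoySport examples this reproduces the boundary sets from Mitchell's Ch. 2 trace, shown in the comments above.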

Some Notes on the Candidate-Elimination Algorithm
- Positive examples make S become increasingly general; negative examples make G become increasingly specific.
- The Candidate-Elimination algorithm converges toward the hypothesis that correctly describes the target concept, provided that (1) there are no errors in the training examples and (2) some hypothesis in H correctly describes the target concept.
- The target concept is exactly learned when the S and G boundary sets converge to a single, identical hypothesis.
- Under the above assumptions, new training examples can be used to resolve the remaining ambiguity.
- The algorithm breaks down if the data is noisy (inconsistent) or if the target concept is not representable in H, e.g. when it is a disjunction of attribute constraints. Given sufficient training data, such inconsistency is eventually detected: S and G converge to an empty version space.

A Biased Hypothesis Space
Bias: each h ∈ H is a conjunction of attribute constraints.
H is therefore unable to represent disjunctive concepts such as Sky=Sunny ∨ Sky=Cloudy.
The most specific hypothesis consistent with examples 1 and 2 and representable in H is (?, Warm, Normal, Strong, Cool, Change). But it is too general: it also covers the negative example 3.

Example  Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny   Warm     Normal    Strong  Cool   Change    Yes
2        Cloudy  Warm     Normal    Strong  Cool   Change    Yes
3        Rainy   Warm     Normal    Strong  Cool   Change    No
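The failure can be reproduced with the Find-S sketch from earlier on an assumed encoding of these three examples:

```python
# The three examples above, in the encoding assumed earlier.
D2 = [
    (("Sunny",  "Warm", "Normal", "Strong", "Cool", "Change"), True),
    (("Cloudy", "Warm", "Normal", "Strong", "Cool", "Change"), True),
    (("Rainy",  "Warm", "Normal", "Strong", "Cool", "Change"), False),
]
h = find_s(D2)               # ('?', 'Warm', 'Normal', 'Strong', 'Cool', 'Change')
print(covers(h, D2[2][0]))   # True: h wrongly covers the negative example 3
```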

Unbiased Learner
Idea: choose an H that can express every teachable concept, i.e. H is the power set of X (allow disjunction, conjunction, and negation).
For our example this gives 2^96 possible hypotheses.
What do G and S look like?
- S becomes the disjunction of the positive examples;
- G becomes the negated disjunction of the negative examples.
Then only the training examples themselves are classified unambiguously: the algorithm cannot generalize at all.

Inductive Bias
Let
- L be a concept learning algorithm;
- X be a set of instances;
- c be the target concept;
- Dc = {<x, c(x)>} be a set of training examples;
- L(xi, Dc) denote the classification assigned to instance xi by L after training on Dc.
The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples Dc:
  (∀xi ∈ X) [(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]
Inductive bias of the Candidate-Elimination algorithm: the target concept c is contained in the given hypothesis space H.

Summary Points
- Concept learning as search through H
- Partial ordering of H
- Version space candidate-elimination algorithm
- S and G characterize the learner’s uncertainty
- Inductive leaps are possible only if the learner is biased