1 Universidad de Buenos Aires, Maestría en Data Mining y Knowledge Discovery. Aprendizaje Automático (Machine Learning), 2-Concept Learning (1/3). Eduardo Poggi, Ernesto Mislej. Autumn 2008.

2 Agenda
Definitions
Search space and general-to-specific ordering
Concept learning as search
FIND-S

3 Definition
The problem is to learn a function mapping examples into two classes: positive and negative. We are given a database of examples already classified as positive or negative.
Concept learning: the process of inducing a function mapping input examples into a Boolean output.
Examples:
Classifying objects in astronomical images as stars or galaxies
Classifying animals as vertebrates or invertebrates

4 Working Example: Mushrooms
Class of tasks: predicting poisonous mushrooms
Performance: accuracy of classification
Experience: database describing mushrooms with their class
Knowledge to learn: function mapping mushrooms to {+,-}, where - means not-poisonous and + means poisonous
Representation of target knowledge: conjunction of attribute values
Learning mechanism: candidate elimination

5 Notation
Set of instances: X
Target concept: c : X → {+,-}
Training examples: E = {(x, c(x))}
Data set: D ⊆ X
Set of possible hypotheses: H, where each h ∈ H is a function h : X → {+,-}
Goal: find h such that h(x) = c(x) for all x.
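As an illustrative aside, this notation maps naturally onto Python type aliases. The following is a sketch under our own naming, not something from the slides:

```python
from typing import Callable, Tuple

Example = Tuple[str, ...]                # an instance x in X
Label = str                              # '+' or '-'
Hypothesis = Tuple[str, ...]             # some h in H
Concept = Callable[[Example], Label]     # the target concept c : X -> {+,-}
TrainingExample = Tuple[Example, Label]  # a pair (x, c(x)) in E
```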

6 Representation of Examples
Features:
color {red, brown, gray}
size {small, large}
shape {round, elongated}
land {humid, dry}
air humidity {low, high}
texture {smooth, rough}

7 The Input and Output Space
X: the space of all possible examples (input space). X is the cross product of all feature values; only a small subset of X is contained in our database.
Y: the space of classes (output space). Y = {+,-}.
An example in X is a feature vector. For instance: x = (red, small, elongated, humid, low, rough).

8 The Training Examples
D: the set of training examples. D is a set of pairs {(x, c(x))}, where c is the target concept.
Example of D:
((red, small, round, humid, low, smooth), +)
((red, small, elongated, humid, low, smooth), +)
((gray, large, elongated, humid, low, rough), -)
((red, small, elongated, humid, high, rough), +)
The first element of each pair is an instance from the input space; the second is its class from the output space.

9 Hypothesis Representation
Any hypothesis h is a function from X to Y, h : X → Y. We will explore the space of conjunctions of attribute constraints.
Special symbols:
* any value is acceptable
0 no value is acceptable
Consider the following hypotheses:
(*,*,*,*,*,*): all mushrooms are poisonous
(0,0,0,0,0,0): no mushroom is poisonous
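A minimal Python sketch of this encoding, assuming a hypothesis is a tuple of per-feature constraints with the special symbols written as the strings '*' and '0'; the helper name covers is ours, chosen to echo the "cover" terminology of the next slide:

```python
# A hypothesis is a tuple with one constraint per feature:
#   '*'   -> any value is acceptable
#   '0'   -> no value is acceptable
#   other -> the feature must equal that exact value

def covers(h, x):
    """True if hypothesis h classifies example x as positive (+)."""
    return all(c == '*' or c == v for c, v in zip(h, x))

h_all = ('*',) * 6   # (*,*,*,*,*,*): every mushroom is poisonous
h_none = ('0',) * 6  # (0,0,0,0,0,0): no mushroom is poisonous

x = ('red', 'small', 'elongated', 'humid', 'low', 'rough')
print(covers(h_all, x), covers(h_none, x))  # True False
```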

10 Hypothesis Space
The space of all hypotheses is represented by H. Let h be a hypothesis in H and let x be an example of a mushroom: if h(x) = + then x is classified as poisonous, otherwise as not-poisonous.
Our goal is to find the hypothesis h* that is very "close" to the target concept c.
A hypothesis is said to "cover" those examples it classifies as positive.

11 Assumption 1
We will explore the space of all conjunctions, and we assume the target concept c falls within this space H.

12 Assumption 2
A hypothesis close to the target concept c, obtained after seeing many training examples, will result in high accuracy on the set of unobserved examples. (Slide diagram: a hypothesis h* that is good on the training set D is also good on its complement D′.)

13 Concept Learning as Search
There is a general-to-specific ordering inherent to any hypothesis space. Consider these two hypotheses:
h1 = (red,*,*,humid,*,*)
h2 = (red,*,*,*,*,*)
We say h2 is more general than h1 because h2 covers more instances than h1: every instance covered by h1 is also covered by h2.

14 General-to-Specific
For example, consider three hypotheses h1, h2, and h3 (drawn on the slide as regions of covered instances): h1 is more general than h2 and h3, while h2 and h3 are neither more specific nor more general than each other.

15 Definition
Let hj and hk be two hypotheses mapping examples into {+,-}. We say hj is more general than (or equal to) hk iff for all examples x, hk(x) = + ⇒ hj(x) = +. We represent this fact as hj ≥ hk. The ≥ relation imposes a partial ordering over the hypothesis space H (it is reflexive, antisymmetric, and transitive).
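For the conjunctive representation this relation can be checked constraint by constraint, without enumerating all of X. A sketch, again under illustrative names of our own:

```python
def subsumes(cj, ck):
    """True if constraint cj accepts every value that constraint ck accepts."""
    return cj == '*' or ck == '0' or cj == ck

def more_general_or_equal(hj, hk):
    """hj >= hk iff every example covered by hk is also covered by hj."""
    return all(subsumes(cj, ck) for cj, ck in zip(hj, hk))

# The pair from slide 13:
h1 = ('red', '*', '*', 'humid', '*', '*')
h2 = ('red', '*', '*', '*', '*', '*')
print(more_general_or_equal(h2, h1))  # True:  h2 >= h1
print(more_general_or_equal(h1, h2))  # False: h1 covers fewer instances
```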

16 Lattice
Any input space X then defines a lattice of hypotheses ordered according to the general-to-specific relation. (The original slide shows such a lattice over hypotheses h1 through h8.)

17 Finding a Maximally Specific Hypothesis
Algorithm to search the space of conjunctions:
Start with the most specific hypothesis.
Generalize the hypothesis when it fails to cover a positive example.
The algorithm (FIND-S):
1. Initialize h to the most specific hypothesis.
2. For each positive training example x: for each attribute constraint a in h, if example x and h agree on a, do nothing; else generalize a by the next more general constraint.
3. Output hypothesis h.
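A sketch of this algorithm in Python, assuming the tuple-of-strings encoding used earlier ('0' for "no value acceptable", '*' for "any value acceptable"); the function name and parameters are illustrative:

```python
def find_s(examples, n_features):
    h = ['0'] * n_features              # 1. start with the most specific hypothesis
    for x, label in examples:
        if label != '+':
            continue                    # negative examples never change h
        for i, value in enumerate(x):   # 2. minimally generalize on each positive
            if h[i] == '0':
                h[i] = value            # first positive example: copy its value
            elif h[i] != value:
                h[i] = '*'              # disagreement: relax to "any value"
    return tuple(h)                     # 3. output the hypothesis
```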

18 Example
Let's run the learning algorithm above with the following examples:
((red,small,round,humid,low,smooth), +)
((red,small,elongated,humid,low,smooth), +)
((gray,large,elongated,humid,low,rough), -)
((red,small,elongated,humid,high,rough), +)
We start with the most specific hypothesis: h = (0,0,0,0,0,0).
The first example arrives; since it is positive and h fails to cover it, we simply generalize h to cover exactly this example: h = (red,small,round,humid,low,smooth).

19 Example
Hypothesis h now says that the first example is the only positive example; all other examples are negative. Then comes example 2: ((red,small,elongated,humid,low,smooth), +). This example is positive. All attributes match hypothesis h except for attribute shape: it has the value elongated, not round. We generalize this attribute using the symbol *, yielding: h = (red,small,*,humid,low,smooth).
The third example is negative, so we simply ignore it. Why don't we need to be concerned with negative examples?

20 Example
Upon observing the 4th example, hypothesis h is generalized to the following: h = (red,small,*,humid,*,*). This h is interpreted as: any mushroom that is red, small, and found on humid land should be classified as poisonous.
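As a quick usage check, running the find_s sketch from slide 17 on the four training examples of this walkthrough reproduces the hypothesis above:

```python
# The training set D of slide 18, as (instance, label) pairs.
D = [
    (('red', 'small', 'round', 'humid', 'low', 'smooth'), '+'),
    (('red', 'small', 'elongated', 'humid', 'low', 'smooth'), '+'),
    (('gray', 'large', 'elongated', 'humid', 'low', 'rough'), '-'),
    (('red', 'small', 'elongated', 'humid', 'high', 'rough'), '+'),
]

print(find_s(D, n_features=6))  # ('red', 'small', '*', 'humid', '*', '*')
```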

21 Analyzing the Algorithm
The algorithm is guaranteed to find the hypothesis that is most specific and consistent with the set of training examples.
It takes advantage of the general-to-specific ordering to move over the corresponding lattice, searching for the next most specific hypothesis.

22 X-H Relation

23 X-H Relation

24 Points to Consider
There are many hypotheses consistent with the training data D: why should we prefer the most specific one?
What would happen if the examples are not consistent? What if they contain errors or noise?
What if there is a hypothesis space H in which one can find more than one maximally specific hypothesis h? The search over the lattice must then be different to allow for this possibility.

25 Summary
The input space is the space of all examples; the output space is the space of all classes.
A hypothesis maps examples into classes.
We want a hypothesis close to the target concept c.
The input space establishes a partial ordering over the hypothesis space.
One can exploit this ordering to move along the corresponding lattice.

26 Homework
Read Chapter 2 of Mitchell (-2.5).