
General-to-Specific Ordering

Slide 2

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      Yes
Sunny  Warm     High      Strong  Warm   Same      Yes
Rainy  Cold     High      Strong  Warm   Change    No
Sunny  Warm     High      Strong  Cool   Change    Yes

- Tree of questions: Sky? Sunny, ok; Wind? Strong, ok; yes, enjoy sport
- Like a decision tree

Slide 3

- With candidate elimination the object is to predict the class through the use of expressions such as ⟨Sunny, ?, ?, Strong, ?, ?⟩
- The ?'s are like wild cards; expressions represent conjunctions of attribute constraints
- The expression ⟨Sunny, ?, ?, Strong, ?, ?⟩ means: will enjoy sport only when sky is sunny and wind is strong; don't care about the other attributes
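To make the representation concrete, here is a minimal Python sketch, assuming the tuple encoding of hypotheses used on these slides; the function name matches is illustrative, not from the slides.

```python
# Minimal sketch: a hypothesis is a tuple of attribute constraints,
# where '?' is a wild card that matches any value.
def matches(hypothesis, instance):
    """True if every constraint is '?' or equals the instance's value."""
    return all(c == '?' or c == v for c, v in zip(hypothesis, instance))

h = ('Sunny', '?', '?', 'Strong', '?', '?')
x = ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same')
print(matches(h, x))  # True: sky is sunny and wind is strong
```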

Slide 4

- Finding a maximally specific hypothesis
- Start with the most restrictive (specific) hypothesis one can get and relax it to satisfy each positive training sample
- Most general: ⟨?, ?, ?, ?, ?, ?⟩ (all dimensions can be any value)
- Most restrictive: ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩ (no dimension can be anything; Ø means nothing will match it)

Slide 5

- What if an expression has a single Ø? (Remember, the expression is a conjunction.) The whole expression matches nothing: it denotes the empty set.

Slide 6

- Initialize h to the most specific hypothesis in H: ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩
- For each positive training instance x
  - For each attribute constraint a_i in h
    - If the constraint a_i is satisfied by x, then do nothing
    - Else replace a_i in h by the next more general constraint that is satisfied by x
- Return h
- Order of generality: ? is more general than a specific attribute value, which in turn is more general than Ø
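A minimal runnable sketch of the algorithm above (Find-S), assuming 'Ø' marks the most specific constraint and '?' the most general; the names find_s and examples are illustrative.

```python
def find_s(examples):
    """examples: list of (instance_tuple, label) pairs; label is True/False."""
    n = len(examples[0][0])
    h = ['Ø'] * n                        # most specific hypothesis
    for x, label in examples:
        if not label:
            continue                     # Find-S ignores negative examples
        for i, (a, v) in enumerate(zip(h, x)):
            if a == 'Ø':
                h[i] = v                 # first positive: take the value itself
            elif a != '?' and a != v:
                h[i] = '?'               # relax to the next more general constraint
    return tuple(h)
```

On the four EnjoySport records this returns ('Sunny', 'Warm', '?', 'Strong', '?', '?'), matching the trace on the next slides.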

Slide 7

- Set h to ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩
- First positive x: ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩
- Which attribute constraints of h are satisfied by x? None
- Replace each a_i with a relaxed form from x: h = ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩

Slide 8

- h is now ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩
- Next positive: ⟨Sunny, Warm, High, Strong, Warm, Same⟩
- Which attribute constraints of h are satisfied by x? All but humidity
- Replace h with ⟨Sunny, Warm, ?, Strong, Warm, Same⟩

Slide 9

- h is now ⟨Sunny, Warm, ?, Strong, Warm, Same⟩
- Next positive: ⟨Sunny, Warm, High, Strong, Cool, Change⟩ (the negative Rainy record is skipped; Find-S uses only positives)
- Which attribute constraints of h are satisfied by x? All but water and forecast
- Replace h with ⟨Sunny, Warm, ?, Strong, ?, ?⟩
- Return ⟨Sunny, Warm, ?, Strong, ?, ?⟩. Can one use this to "test" a new instance?

Slide 10

- What if we want all hypotheses that are consistent with a training set? (This set is called a version space.)
- A hypothesis h is consistent with a set of training examples if and only if h(x) = c(x) for each training example
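In code (reusing the illustrative matches function from the earlier sketch): consistency checks the label on every example, not merely whether the instance matches.

```python
def is_consistent(h, examples):
    """h(x) must equal c(x): cover every positive and exclude every negative."""
    return all(matches(h, x) == label for x, label in examples)
```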

Slide 11

- Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4)
- But a single Ø makes a hypothesis represent the empty set
- So there are only 973 semantically distinct hypotheses


Slide 15

- All yes's are sunny, warm, and strong
- But "strong" isn't enough to identify a yes
- S: { ⟨Sunny, Warm, ?, Strong, ?, ?⟩ }  (3 ?'s)
- G: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩ }  (5 ?'s each)
- Between them lie the partially general hypotheses (4 ?'s)


Slide 17

- Initialize G to the set of maximally general hypotheses in H
- Initialize S to the set of maximally specific hypotheses in H
- For each training example d, do
  - If d is a positive example
    - Remove from G any hypothesis inconsistent with d
    - For each hypothesis s in S that is not consistent with d
      - Remove s from S
      - Add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h
      - Remove from S any hypothesis that is more general than another hypothesis in S
  - If d is a negative example
    - Remove from S any hypothesis inconsistent with d
    - For each hypothesis g in G that is not consistent with d
      - Remove g from G
      - Add to G all minimal specializations h of g such that h is consistent with d, and some member of S is more specific than h
      - Remove from G any hypothesis that is less general than another hypothesis in G
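A condensed, runnable Python sketch of the algorithm above, assuming the EnjoySport attribute domains (the third Sky value, Cloudy, is an assumption; Wind's second value Light is taken from slide 27). It omits the final duplicate-pruning steps, which this example's singleton sets never trigger; all names are illustrative.

```python
DOMAINS = [('Sunny', 'Cloudy', 'Rainy'), ('Warm', 'Cold'), ('Normal', 'High'),
           ('Strong', 'Light'), ('Warm', 'Cool'), ('Same', 'Change')]

def covers(h, x):
    """h matches instance x ('?' is a wild card; 'Ø' matches nothing)."""
    return all(a == '?' or a == v for a, v in zip(h, x))

def more_general(h1, h2):
    """h1 is at least as general as h2 in the general-to-specific ordering."""
    return all(b == 'Ø' or a == '?' or a == b for a, b in zip(h1, h2))

def min_generalization(s, x):
    """The unique minimal generalization of s that covers positive x."""
    return tuple(v if a == 'Ø' else (a if a == v else '?') for a, v in zip(s, x))

def min_specializations(g, x):
    """All minimal specializations of g that exclude negative x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, a in enumerate(g) if a == '?'
            for v in DOMAINS[i] if v != x[i]]

def candidate_elimination(examples):
    S = [('Ø',) * len(DOMAINS)]          # maximally specific boundary
    G = [('?',) * len(DOMAINS)]          # maximally general boundary
    for x, positive in examples:
        if positive:
            G = [g for g in G if covers(g, x)]
            S = [s if covers(s, x) else min_generalization(s, x) for s in S]
            S = [s for s in S if any(more_general(g, s) for g in G)]
        else:
            S = [s for s in S if not covers(s, x)]
            newG = []
            for g in G:
                if not covers(g, x):
                    newG.append(g)       # g already excludes the negative
                else:
                    newG += [h for h in min_specializations(g, x)
                             if any(more_general(h, s) for s in S)]
            G = newG
    return S, G
```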

Slide 18

- Initialize:
  S0: { ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩ }
  G0: { ⟨?, ?, ?, ?, ?, ?⟩ }

Slide 19

- First record (positive):
  S1: { ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩ }
  G1 = G0: { ⟨?, ?, ?, ?, ?, ?⟩ }

Slide 20

- Second (positive):
  S2: { ⟨Sunny, Warm, ?, Strong, Warm, Same⟩ }
  G2 = G1: { ⟨?, ?, ?, ?, ?, ?⟩ }
- Modify the previous S minimally to keep it consistent with d

Slide 21

- Third (negative):
  S3 = S2: { ⟨Sunny, Warm, ?, Strong, Warm, Same⟩ }
  G3: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩, ⟨?, ?, ?, ?, ?, Same⟩ }
- Replace { ⟨?, ?, ?, ?, ?, ?⟩ } with all minimally specialized expressions that constrain a single attribute (and remain more general than some member of S)

Slide 22

- Fourth (positive):
  S4: { ⟨Sunny, Warm, ?, Strong, ?, ?⟩ }
  G4: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩ }
- Back to a positive: replace Warm and Same with "?" in S, and remove ⟨?, ?, ?, ?, ?, Same⟩ from the general boundary
- Then one can calculate the interior expressions, as in the sketch below
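Continuing the sketch above, the boundary sets and the interior of the version space can be recovered by brute force over the semantically distinct hypotheses; EXAMPLES is just the table's four records.

```python
from itertools import product

EXAMPLES = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), True),
]

S, G = candidate_elimination(EXAMPLES)
print(S)  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)  # [('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')]

# Every non-empty hypothesis: each attribute is a domain value or '?'.
version_space = [h for h in product(*[d + ('?',) for d in DOMAINS])
                 if all(covers(h, x) == label for x, label in EXAMPLES)]
print(len(version_space))  # 6: S4, G4's two members, and three in between
```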

Slide 23

- What if we have two identical records but different classes?

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      Yes
Sunny  Warm     Normal    Strong  Warm   Same      No

- Established by the first (positive) record:
  S1: { ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩ }
  G0 = G1: { ⟨?, ?, ?, ?, ?, ?⟩ }
- If the positive shows up first, the first step in evaluating a negative states "Remove from S any hypothesis that is not consistent with d" (S is now empty)
- For each hypothesis g in G that is not consistent with d, remove g from G (all ?'s is inconsistent with No, so G is empty)
- Add to G all minimal specializations h of g such that h is consistent with d and some member of S is more specific than h
- No matter what is added to G, it will violate either d or S (G remains empty)
- Both are empty: broken. Known as converging to an empty version space

Slide 24

- What if we have two identical records but different classes, and the negative shows up first?

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      No
Sunny  Warm     Normal    Strong  Warm   Same      Yes

- Established by the first (negative) record:
  S0: { ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩ }
  G1: { ⟨Cloudy, ?, ?, ?, ?, ?⟩, ⟨Rainy, ?, ?, ?, ?, ?⟩, ⟨?, Cold, ?, ?, ?, ?⟩, ⟨?, ?, High, ?, ?, ?⟩, ⟨?, ?, ?, Light, ?, ?⟩, ⟨?, ?, ?, ?, Cool, ?⟩, ⟨?, ?, ?, ?, ?, Change⟩ }
- The first step in evaluating the positive states "Remove from G any hypothesis that is not consistent with d": this is all of them, leaving an empty set
- For each hypothesis s in S that is not consistent with d, remove s from S
- Add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h
- No minimal generalization qualifies, because G is empty: again an empty version space

Slide 25

- Bad with noisy data
- Similar effect with false positives or negatives


Slide 27

- Never-before-seen data: ⟨Sunny, Warm, Normal, Light, Warm, Same⟩ → ?
- All training samples had strong wind
- S4: { ⟨Sunny, Warm, ?, Strong, ?, ?⟩ }
  G4: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩ }
- Let the version-space hypotheses vote; the proportion can be a confidence metric, as sketched below
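Continuing the sketch, a vote over the six version-space members; with wind = Light, half say yes, so there is no confident prediction here.

```python
new_x = ('Sunny', 'Warm', 'Normal', 'Light', 'Warm', 'Same')
yes_votes = sum(covers(h, new_x) for h in version_space)
print(yes_votes, '/', len(version_space))   # 3 / 6: the vote is split
```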

Slide 28

- If every hypothesis in the version space classifies a test case the same way, we have the same confidence as if we had already converged to the single correct target concept
- Regardless of which hypothesis in the version space is eventually found to be correct, it classifies the test case the same way as all the others: the vote is unanimous

Slide 29

- Discrete data
- Binary classes

Slide 30

- Have seen 4 classifiers: Naïve Bayesian, KNN, Decision Tree, Candidate Elimination
- Now for some theory

Slide 31

- Curse of dimensionality
- Overfitting
- Lazy/Eager
- Radial basis
- Normalization
- Gradient descent
- Entropy/Information gain
- Occam's razor

Slide 32

- Another way of measuring whether a hypothesis captures the learning concept
- Candidate Elimination: a conjunction of constraints on the attributes

Slide 33

- Regression: biased toward linear solutions
- Naïve Bayes: biased toward a given distribution or bin selection
- KNN: biased toward solutions that assume cohabitation of similarly classed instances
- Decision Tree: biased toward short trees

Slide 34

- An unbiased learner must be able to accommodate every distinct subset as a class definition
- 96 distinct instances (3*2*2*2*2*2): Sky has three possible values, the rest two
- Number of distinct subsets: 2^96
- Think binary: a 1 indicates membership

Slide 35

- Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4)
- But a single Ø makes a hypothesis represent the empty set
- So there are 973 semantically distinct hypotheses: 1 + (4*3*3*3*3*3)
- Each hypothesis represents a subset of the instance space (due to the wild cards)
- S0: { ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩ }  G0: { ⟨?, ?, ?, ?, ?, ?⟩ }
- Candidate elimination can represent 973 different subsets, but 2^96 is the number of distinct subsets: very biased
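The slide's counts, checked as quick arithmetic (a minimal sketch; variable names are illustrative):

```python
syntactic = 5 * 4 ** 5     # 5,120: each attribute's values, plus '?' and 'Ø'
semantic = 1 + 4 * 3 ** 5  # 973: every hypothesis with a Ø collapses to one empty set
labelings = 2 ** 96        # distinct subsets (labelings) of the 96 possible instances
print(syntactic, semantic, labelings)
```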

Slide 36

- I think of bias as inflexibility in expressing hypotheses
- Or, alternatively: what are the implicit assumptions of the approach?

Slide 37

- Next term: inductive inference
- The process by which a conclusion is inferred from multiple observations

Slide 38

- Inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other, unobserved examples

Slide 39

- Concept learning: automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of the concept
- Roughly equate "concept" to "class"


Slide 41

- Regression: the various "y" values of the training instances; function approximation
- Naïve Bayes, KNN, and Decision Tree: the class

Slide 42

- Regression: the line, i.e. the coefficients (or other equation members such as exponents)
- Naïve Bayes: the class of an instance is predicted by determining the most probable class given the training data; that is, by finding the probability for each class for each dimension, multiplying these probabilities across the dimensions for each class, and taking the class with the maximum probability as the predicted class
- KNN: the class of an instance is predicted by examining the instance's neighborhood
- Decision Tree: the tree itself
- Candidate Elimination: a conjunction of constraints on the attributes

Slide 43

- Supervised learning: supervision from an oracle that knows the classes of the training data
- Is there unsupervised learning? Yes, covered in pattern rec
- It seeks to determine how the data are organized: clustering, PCA, edge detection

Slide 44

- Machine learning addresses the question of how to build computer programs that improve their performance at some task through experience.


Slide 46

- Look at every legal move
- Determine the goodness (score) of the resultant board state
- Return the move with the highest score (argmax)

Slide 47

- Score function: we will keep it simple and work with a polynomial with just a few variables
- x1: the number of black pieces on the board
- x2: the number of red pieces on the board
- x3: the number of black kings on the board
- x4: the number of red kings on the board
- x5: the number of black pieces threatened by red
- x6: the number of red pieces threatened by black
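A minimal sketch of the scoring polynomial as code; the name v_hat and the weight layout (w0 as a constant term) are assumptions, not from the slides.

```python
def v_hat(weights, features):
    """Score a board: w0 + w1*x1 + ... + w6*x6, where features = (x1, ..., x6)."""
    w0, *w = weights
    return w0 + sum(wi * xi for wi, xi in zip(w, features))
```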

Slide 48

- Gotta learn them weights
- But how? (using the six features x1..x6 above)

Slide 49

- A bunch of board states (a series of games)
- Use them to jiggle the weights
- Must know the current real "score" vs. the "predicted score" from the polynomial
- Train the scoring function

Slide 50

- If my predictor is good then it will be self-consistent
- That is, the score of my best move should lead to a good-scoring board state
- If it doesn't, maybe we should adjust our predictor

Slide 51

- Successor returns the board state of the best move (returned by chooseNextMove(b)); the predicted score of Successor(b) serves as the training score for board b
- It has been found to be surprisingly successful

Slide 52

- For each training sample (board states from a series of games), nudge the weights toward the training score
- If win (zero opponent pieces on the board), one could give some fixed score (100 if win, -100 if lose)
- Look familiar? It is the LMS (least mean squares) weight update rule, sketched below
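A minimal sketch of the LMS update applied to the scoring polynomial above; eta (the learning rate) and the choice of v_train (the successor's predicted score, or the fixed end-of-game score) are assumptions in line with the slides.

```python
def lms_update(weights, features, v_train, eta=0.1):
    """w_i <- w_i + eta * (v_train - v_hat(b)) * x_i, with x0 = 1 for the constant term."""
    error = v_train - v_hat(weights, features)
    x = (1.0,) + tuple(features)        # constant term first
    return [w + eta * error * xi for w, xi in zip(weights, x)]
```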

Slide 53

- Is this a classifier?
- Is it machine learning?


Slide 55

- At the beginning of candidate elimination (pg 29): the difference between "satisfies" and "consistent with"
- x satisfies h when h(x) = 1, regardless of whether x is a positive or negative example
- Consistency with h depends on the target concept: whether h(x) = c(x)