Slide 1: Announcements
- Project 1 is due Tuesday, October 16; send me the name of your konane bot
- Midterm is Thursday, October 18; bring one 8.5x11 piece of paper with anything you want written on it
- Book Review is due Thursday, October 25
- Andrew's current event
- Volunteer for Tuesday?

Slide 2: Introduction to Machine Learning (Lecture 14)

Slide 3: Effects of Programs that Learn
Application areas:
- Learning from medical records which treatments are most effective for new diseases
- Houses learning from experience to optimize energy costs based on the usage patterns of their occupants
- Personal software assistants learning the evolving interests of users in order to highlight especially relevant stories from the online morning newspaper

Slide 4: Effective Applications of Learning
- Speech recognition: learned systems outperform all other approaches that have been attempted to date
- Data mining: learning algorithms are used to discover valuable knowledge in large commercial databases, e.g. to detect fraudulent use of credit cards
- Game playing: programs learn to play backgammon at levels approaching the performance of human world champions

Slide 5: Learning Programs
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
Examples:
- A checkers learning problem:
  - Task T: playing checkers
  - Performance measure P: percent of games won against opponents
  - Training experience E: playing practice games against itself
- A handwriting recognition learning problem:
  - Task T: recognizing and classifying handwritten words within images
  - Performance measure P: percent of words correctly classified
  - Training experience E: a database of handwritten words with given classifications

Slide 6: Designing a Learning System
Consider designing a program to learn to play checkers, with the goal of entering it in the world checkers tournament. This requires the following choices:
- Choosing the training experience
- Choosing the target function
- Choosing a representation for the target function
- Choosing a function approximation algorithm

Slide 7: Choosing the Training Experience (1)
- Will the training experience provide direct or indirect feedback?
  - Direct feedback: the system learns from examples of individual checkers board states and the correct move for each
  - Indirect feedback: move sequences and final outcomes of various games played
    - Credit assignment problem: the value of early states must be inferred from the outcome
- To what degree does the learner control the sequence of training examples?
  - The teacher selects informative board states and gives the correct move for each
  - The learner proposes board states that it finds particularly confusing; the teacher provides the correct moves
  - The learner controls both the board states and the (indirect) training classifications

Slide 8: Choosing the Training Experience (2)
- How well does the training experience represent the distribution of examples over which the final system performance P will be measured?
  - If the checkers program trains only on games played against itself, it may never encounter crucial board states that are likely to be played by the human checkers champion
  - Most machine learning theory rests on the assumption that the distribution of training examples is identical to the distribution of test examples

Slide 9: Partial Design of Checkers Learning Program
A checkers learning problem:
- Task T: playing checkers
- Performance measure P: percent of games won in the world tournament
- Training experience E: games played against itself
Remaining choices:
- The exact type of knowledge to be learned
- A representation for this target knowledge
- A learning mechanism

Slide 10: Choosing the Target Function (1)
- Assume the program can already determine the legal moves; it then needs to learn to choose the best move from among them
- This defines a large search space that is known a priori
- Obvious target function: ChooseMove : B → M
  - ChooseMove is difficult to learn given only indirect training
- Alternative target function: an evaluation function that assigns a numerical score to any given board state
  - V : B → ℝ (where ℝ is the set of real numbers)
  - V(b) for an arbitrary board state b in B:
    - if b is a final board state that is won, then V(b) = 100
    - if b is a final board state that is lost, then V(b) = -100
    - if b is a final board state that is drawn, then V(b) = 0
    - if b is not a final state, then V(b) = V(b'), where b' is the best final board state that can be reached starting from b and playing optimally until the end of the game

Slide 11: Choosing the Target Function (2)
- This gives a recursive definition of V(b) for any board state b
- It is not usable because it cannot be computed efficiently except in the first three trivial cases: it is a nonoperational definition
- The goal of learning is to discover an operational description of V
- Learning the target function is often called function approximation; the learned approximation of V is referred to as V̂

Slide 12: Choosing a Representation for the Target Function
- The choice of representation involves trade-offs
  - A very expressive representation allows a close approximation to the ideal target function V
  - But the more expressive the representation, the more training data is required to choose among the alternative hypotheses it can express
- Here, use a linear combination of the following board features:
  - x1: the number of black pieces on the board
  - x2: the number of red pieces on the board
  - x3: the number of black kings on the board
  - x4: the number of red kings on the board
  - x5: the number of black pieces threatened by red (i.e. which can be captured on red's next turn)
  - x6: the number of red pieces threatened by black
- That is, V̂(b) = w0 + w1·x1 + w2·x2 + w3·x3 + w4·x4 + w5·x5 + w6·x6, where the weights w0 through w6 are to be learned

Slide 13: Partial Design of Checkers Learning Program
A checkers learning problem:
- Task T: playing checkers
- Performance measure P: percent of games won in the world tournament
- Training experience E: games played against itself
- Target function: V : Board → ℝ
- Target function representation: V̂(b) = w0 + w1·x1 + w2·x2 + w3·x3 + w4·x4 + w5·x5 + w6·x6

Slide 14: Choosing a Function Approximation Algorithm
- To learn V̂ we require a set of training examples, each describing a board state b together with its training value Vtrain(b)
- Each training example is an ordered pair ⟨b, Vtrain(b)⟩
  - For example, ⟨⟨x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0⟩, +100⟩ describes a board state on which black has won (no red pieces remain)

Slide 15: Estimating Training Values
- We need to assign specific scores to intermediate board states
- Estimate the training value of an intermediate board state b using the learner's current approximation applied to the next board state following b:
  Vtrain(b) ← V̂(Successor(b))
- This approach is simple and surprisingly successful
- The estimates are more accurate for states closer to the end of the game

Slide 16: Adjusting the Weights
- Choose the weights wi to best fit the set of training examples
- Minimize the squared error E between the training values and the values predicted by the hypothesis:
  E = Σ (Vtrain(b) − V̂(b))², summed over the training examples ⟨b, Vtrain(b)⟩
- We require an algorithm that
  - incrementally refines the weights as new training examples become available, and
  - is robust to errors in these estimated training values
- Least Mean Squares (LMS) is one such algorithm

Slide 17: LMS Weight Update Rule
For each training example ⟨b, Vtrain(b)⟩:
- Use the current weights to calculate V̂(b)
- For each weight wi, update it as
  wi ← wi + η (Vtrain(b) − V̂(b)) xi
  where η is a small constant (e.g. 0.1) that moderates the size of the update
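
A minimal Python sketch of one LMS step for the linear evaluation function (the helper names and the example feature values are illustrative, not from the slides):

```python
# Minimal LMS sketch for the linear checkers evaluation function.
# A board state is summarized here directly by the six features
# x1..x6 from slide 12 (an illustrative simplification).

def v_hat(weights, features):
    """Linear evaluation: V_hat(b) = w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, eta=0.1):
    """One LMS step: w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i."""
    error = v_train - v_hat(weights, features)
    updated = [weights[0] + eta * error]  # x0 is implicitly 1 for the bias w0
    updated += [w + eta * error * x for w, x in zip(weights[1:], features)]
    return updated

# One training pair <b, V_train(b)>: a won board (no red pieces), value +100.
weights = [0.0] * 7
features = (3, 0, 1, 0, 0, 0)
weights = lms_update(weights, features, v_train=100)
print(weights)  # the weights move toward predicting +100 for this board
```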

Slide 18: Final Design
[Figure: the four modules of the final design and the data passed between them]
- Experiment Generator: takes the current hypothesis and outputs a new problem (an initial game board)
- Performance System: solves the problem and outputs a solution trace (the game history)
- Critic: turns the solution trace into training examples
- Generalizer: produces a new hypothesis from the training examples, which flows back to the Experiment Generator

Slide 19: Summary of Design Choices
- Determine the type of training experience: games against itself, a table of correct moves, games against experts, …
- Determine the target function: Board → value, Board → move, …
- Determine the representation of the learned function: linear function of six features, polynomial, artificial neural network, …
- Determine the learning algorithm: gradient descent, linear programming, …
- Complete design

Slide 20: Training Classification Problems
- Many learning problems involve classifying inputs into a discrete set of possible categories
- Learning is only possible if there is a relationship between the data and the classifications
- Training involves providing the system with data that has been manually classified
- Learning systems use the training data to learn to classify unseen data

Slide 21: Rote Learning
- A very simple learning method: simply memorize the classifications of the training data
- A rote learner can classify only previously seen data; unseen data cannot be classified
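
Since a rote learner is essentially a lookup table, a minimal Python sketch (class and method names are illustrative) makes its limitation obvious:

```python
# Minimal rote learner: memorize training pairs, classify only exact repeats.
class RoteLearner:
    def __init__(self):
        self.memory = {}

    def train(self, example, label):
        self.memory[example] = label

    def classify(self, example):
        # Unseen data cannot be classified; None signals "unknown".
        return self.memory.get(example)

learner = RoteLearner()
learner.train(("Sunny", "Warm"), "Yes")
print(learner.classify(("Sunny", "Warm")))  # Yes
print(learner.classify(("Rainy", "Cold")))  # None: never seen before
```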

Slide 22: Concept Learning
- Concept learning involves determining a mapping from a set of input variables to a Boolean value
- Such methods are known as inductive learning methods
- If a function can be found that maps the training data to correct classifications, then, hopefully, it will also work well for unseen data
- This process is known as generalization

Slide 23: Example Learning Task
Learn the "days on which my friend Aldo enjoys his favorite water sport":

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes

Slide 24: Hypotheses
- A hypothesis is a vector of constraints, one per attribute; each constraint may
  - indicate by a "?" that any value is acceptable for this attribute,
  - specify a single required value for the attribute, or
  - indicate by a "Ø" that no value is acceptable
- If some instance x satisfies all the constraints of hypothesis h, then h classifies x as a positive example (h(x) = 1)
- Example hypothesis for EnjoySport: ⟨?, Cold, High, ?, ?, ?⟩
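
A minimal Python sketch of this matching rule, representing a hypothesis as a tuple of constraint strings (the function name is illustrative):

```python
# "?" accepts any value; a specific value requires an exact match;
# "Ø" equals no attribute value, so it rejects every instance.
def satisfies(instance, hypothesis):
    """Return True iff hypothesis h classifies instance x as positive."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

h = ("?", "Cold", "High", "?", "?", "?")
x = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
print(satisfies(x, h))  # False: x has AirTemp = Warm, but h requires Cold
```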

Slide 25: EnjoySport Concept Learning Task
Given:
- Instances X: possible days, each described by the attributes
  - Sky (with possible values Sunny, Cloudy, and Rainy)
  - AirTemp (with values Warm and Cold)
  - Humidity (with values Normal and High)
  - Wind (with values Strong and Weak)
  - Water (with values Warm and Cool), and
  - Forecast (with values Same and Change)
- Hypotheses H: each hypothesis is described by a conjunction of constraints on the attributes; each constraint may be "?", "Ø", or a specific value
- Target concept c: EnjoySport : X → {0, 1}
- Training examples D: positive and negative examples of the target function
Determine:
- A hypothesis h in H such that h(x) = c(x) for all x in X

Slide 26: Inductive Learning Hypothesis
- The goal is a hypothesis h identical to the target concept c over the entire set of instances X, but the only information available about c is its value on the training examples
- Inductive learning therefore at best guarantees that the output hypothesis fits the target concept over the training data
- The fundamental assumption of inductive learning is the Inductive Learning Hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples

Slide 27: Concept Learning as Search
- Concept learning is a search through a large space of hypotheses implicitly defined by the hypothesis representation
- The goal is to find the hypothesis that best fits the training examples
- How big is the hypothesis space? EnjoySport has six attributes: Sky has 3 values and the rest have 2
  - How many distinct instances are there? How many distinct hypotheses? (worked out below)
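
Working the counts out from the attribute values:
- Distinct instances: 3 · 2 · 2 · 2 · 2 · 2 = 96
- Syntactically distinct hypotheses (each attribute may also take "?" or "Ø"): 5 · 4 · 4 · 4 · 4 · 4 = 5120
- Semantically distinct hypotheses: every hypothesis containing a "Ø" classifies all instances as negative, so all such hypotheses collapse into one, giving 1 + 4 · 3 · 3 · 3 · 3 · 3 = 973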

Slide 28: General-to-Specific Ordering
- The most general hypothesis represents the idea that every day is a positive example:
  hg = ⟨?, ?, ?, ?, ?, ?⟩
- The most specific hypothesis says that no day is a positive example:
  hs = ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩
- We can define a partial order over the set of hypotheses: h1 >g h2 states that h1 is more general than h2
  - For example, h1 = ⟨Sunny, ?, ?, ?, ?, ?⟩ is more general than h2 = ⟨Sunny, ?, ?, Strong, ?, ?⟩
- Given hypotheses hj and hk, hj is more_general_than_or_equal_to hk if and only if any instance that satisfies hk also satisfies hj
- One learning method is to determine the most specific hypothesis that matches all the training data
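
A small Python sketch of the more_general_than_or_equal_to test for these conjunctive hypotheses; the attribute-wise comparison follows from the definition above (function names are illustrative):

```python
# h_j >=_g h_k iff every instance that satisfies h_k also satisfies h_j.
# For conjunctive hypotheses this reduces to an attribute-wise check.
def more_general_or_equal(hj, hk):
    # A hypothesis containing "Ø" matches no instance at all, so every
    # hypothesis is (vacuously) at least as general as it.
    if "Ø" in hk:
        return True
    return all(cj == "?" or cj == ck for cj, ck in zip(hj, hk))

h1 = ("Sunny", "?", "?", "?", "?", "?")
h2 = ("Sunny", "?", "?", "Strong", "?", "?")
print(more_general_or_equal(h1, h2))  # True: h1 is more general than h2
print(more_general_or_equal(h2, h1))  # False
```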

Slide 29: Partial Ordering
[Figure: instances X (x1, x2) on the left and hypotheses H (h1, h2, h3) on the right, arranged along a specific-to-general axis, with arrows from each more specific hypothesis to the more general hypotheses above it]

Slide 30: Find-S: Finding a Maximally Specific Hypothesis
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x:
   - For each attribute constraint ai in h:
     - If the constraint ai is satisfied by x, then do nothing
     - Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

Begin with h ← ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩ and the training data:

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes
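
A runnable Python sketch of Find-S applied to the four training examples above; the printed result matches a hand trace of the algorithm:

```python
# Find-S: start from the most specific hypothesis and minimally
# generalize it to cover each positive training example in turn.
def find_s(examples):
    h = ["Ø"] * len(examples[0][0])     # most specific hypothesis in H
    for x, label in examples:
        if label != "Yes":              # Find-S ignores negative examples
            continue
        for i, (c, v) in enumerate(zip(h, x)):
            if c == "Ø":
                h[i] = v                # first positive example: copy its values
            elif c != v:
                h[i] = "?"              # generalize the conflicting constraint
    return tuple(h)

data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), "Yes"),
]
print(find_s(data))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```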