Presentation transcript:

Introduction Machine Learning: Chapter 1

Contents
– Types of learning
– Applications of machine learning
– Disciplines related to machine learning
– Well-posed learning problems
– Designing a learning system: choosing the training experience, choosing the target function, choosing a representation for the target function, choosing a function approximation algorithm, the final design
– Issues in machine learning
– Summary

Types of Learning
– Supervised learning: the training data comprise example input vectors together with their corresponding target vectors. The goal is either (1) to assign each input vector to one of a finite number of discrete categories (classification) or (2) to predict one or more continuous target variables for each input vector (regression).
– Unsupervised learning: the training data consist of a set of input vectors without any corresponding target values. The goal is (1) to discover groups of similar examples within the data (clustering), (2) to determine the distribution of the data within the input space (density estimation), or (3) to project the data from a high-dimensional space down to two or three dimensions for visualization.
– Reinforcement learning: an agent has a set of sensors to observe the state of its environment and a set of actions it can perform to alter that state. Each action that leads to a new state yields a reward, and the goal is to find suitable actions to take in a given situation so as to maximize the accumulated reward (balancing exploration and exploitation).
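As a concrete illustration (not part of the original slides), here is a minimal Python sketch contrasting the supervised and unsupervised settings using scikit-learn; the synthetic data and the particular model choices (LogisticRegression, KMeans) are assumptions made only for this example, and the reinforcement learning setting is omitted because it additionally requires an environment and a reward loop.

```python
# Illustrative sketch only: supervised vs. unsupervised learning with scikit-learn.
# The synthetic data and model choices are assumptions, not part of the slides.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])  # input vectors
y = np.array([0] * 50 + [1] * 50)                                      # target labels

# Supervised learning (classification): learn a mapping from inputs to known labels.
clf = LogisticRegression().fit(X, y)
print("predicted class:", clf.predict([[2.5, 2.5]]))

# Unsupervised learning (clustering): discover groups without any target values.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster assignments:", km.labels_[:10])
```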

Applications of machine learning
– Recognizing spoken words: learning to recognize primitive sounds (phonemes) and words from the observed speech signal; neural networks and hidden Markov models are used to customize the recognizer to individual speakers; representative of signal-interpretation problems.
– Driving an autonomous vehicle: learning to steer while driving on a variety of road types; representative of sensor-based control problems.
– Classifying new astronomical structures: decision tree algorithms automatically classify objects in sky survey data; representative of mining very large databases to learn general regularities.
– Playing world-class backgammon: a playing strategy learned from over a million practice games played against itself, becoming competitive with the human world champion; representative of problems with very large search spaces.

Backgammon game rules
– Bar: when one of your checkers is on the middle bar, it must be re-entered first, into the opponent's home board.
– Capture: if you move to a point occupied by exactly one opponent checker, you capture it (you may not move to a point occupied by two or more opponent checkers).
– Removal (bearing off): once all of your checkers are in your own home board, they may be removed from the board.

Checkers game rules
– A player who cannot move loses the game.
– Moving: a regular piece moves forward; a king moves forward or backward.
– Jumping: you capture an opponent's piece by jumping over it; a regular piece jumps forward, a king jumps forward or backward.
– Kinging: when a piece reaches the last row it becomes a king, and a captured piece is stacked on top of it to mark it. (A piece that has just been kinged cannot continue jumping until the next move.)

Disciplines related to machine learning
– Artificial intelligence: learning symbolic representations of concepts; machine learning as an approach to improving problem solving.
– Bayesian methods: Bayes' theorem as the basis for calculating probabilities of hypotheses.
– Computational complexity theory: theoretical bounds on the inherent complexity of different learning tasks, measured by the number of training examples, the computational effort, and the number of mistakes required in order to learn.
– Control theory: learning to control processes so as to optimize predefined objectives and to predict the next state of the process being controlled.

Disciplines related to machine learning (cont.)
– Information theory: measures of entropy and information content; minimum description length approaches to learning.
– Philosophy: Occam's razor, the heuristic that the simplest hypothesis consistent with the data is the best; justification for generalizing beyond the observed data.
– Psychology and neurobiology: the power law of practice for people's response time in learning problems; neurobiological studies motivating artificial neural networks (ANNs).
– Statistics: characterization of errors (e.g. bias and variance) when estimating the accuracy of a hypothesis from a limited sample of data; statistical tests.

Well-Posed Learning Problems: Definition
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Three features identify a learning problem:
– The class of tasks (T)
– The measure of performance to be improved (P)
– The source of experience (E)

Well-Posed Learning Problems: Examples
A checkers learning problem
– Task T: playing checkers
– Performance measure P: percent of games won against opponents
– Training experience E: playing practice games against itself
A handwriting recognition learning problem
– Task T: recognizing and classifying handwritten words within images
– Performance measure P: percent of words correctly classified
– Training experience E: a database of handwritten words with given classifications

Well-Posed Learning Problems: Examples
A robot driving learning problem
– Task T: driving on public four-lane highways using vision sensors
– Performance measure P: average distance traveled before an error (as judged by human overseer)
– Training experience E: a sequence of images and steering commands recorded while observing a human driver
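As an aside that is not in the original slides, the ⟨T, P, E⟩ triple of a well-posed learning problem can be written down explicitly; the following minimal sketch uses an arbitrary, hypothetical data structure to record the checkers example.

```python
# Hypothetical sketch: a well-posed learning problem as a (T, P, E) triple.
from dataclasses import dataclass

@dataclass
class LearningProblem:
    task: str                  # T: the class of tasks
    performance_measure: str   # P: how improvement is measured
    training_experience: str   # E: the source of experience

checkers = LearningProblem(
    task="playing checkers",
    performance_measure="percent of games won in the world tournament",
    training_experience="games played against itself",
)
print(checkers)
```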

Designing a Learning System
– Choosing the Training Experience
– Choosing the Target Function
– Choosing a Representation for the Target Function
– Choosing a Function Approximation Algorithm
– The Final Design

Choosing the Training Experience
Whether the training experience provides direct or indirect feedback regarding the choices made by the performance system. Example:
– Direct training examples in learning to play checkers consist of individual checkers board states and the correct move for each.
– Indirect training examples in the same game consist of the move sequences and final outcomes of various games played; here the correctness of specific moves early in the game must be inferred indirectly from whether the game was eventually won or lost (the credit assignment problem).

Choosing the Training Experience
The degree to which the learner controls the sequence of training examples. Example:
– The learner might rely on the teacher to select informative board states and to provide the correct move for each.
– The learner might itself propose board states that it finds particularly confusing and ask the teacher for the correct move.
– The learner may have complete control over the board states and (indirect) classifications, as it does when it learns by playing against itself with no teacher present.

Choosing the Training Experience
How well the training experience represents the distribution of examples over which the final system performance P must be measured. In general, learning is most reliable when the training examples follow a distribution similar to that of future test examples. Example:
– If the training experience in playing checkers consists only of games the program plays against itself, the learner might never encounter certain crucial board states that are very likely to be played by the human checkers champion. (Note, however, that most current machine learning theory rests on the crucial assumption that the distribution of training examples is identical to the distribution of test examples.)

A checkers learning problem
– Task T: playing checkers
– Performance measure P: percent of games won in the world tournament
– Training experience E: games played against itself
We must now choose:
– The exact type of knowledge to be learned (the target function)
– A representation for this target knowledge
– A learning mechanism (a function approximation algorithm)

Choosing the Target Function
We must determine what type of knowledge will be learned and how it will be used by the performance program. Example:
– In playing checkers, the program needs to learn to choose the best move among the legal moves: ChooseMove: B → M, which accepts as input any board from the set of legal board states B and produces as output some move from the set of legal moves M.

Choosing the Target Function
Since a target function such as ChooseMove turns out to be very difficult to learn given the kind of indirect training experience available to the system, an alternative target function is an evaluation function that assigns a numerical score to any given board state: V: B → ℝ.
Definition of the target function (non-operational):
– If b is a final board state that is won, then V(b) = 100
– If b is a final board state that is lost, then V(b) = -100
– If b is a final board state that is drawn, then V(b) = 0
– If b is not a final state, then V(b) = V(b′), where b′ is the best final board state that can be reached from b by playing optimally until the end of the game
An operational description of V requires function approximation.
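To make the structure of this definition concrete, here is a minimal sketch (not code from the slides or the book); is_final, black_won, drawn, and successors are hypothetical helpers standing in for a complete checkers engine, and the exhaustive game-tree search below is exactly what makes the definition non-operational.

```python
# Hypothetical sketch of the ideal (non-operational) target function V.
# is_final, black_won, drawn, and successors are assumed helpers from a
# checkers engine; they are not defined here.
def V(b, black_to_move=True):
    if is_final(b):
        if black_won(b):
            return 100
        if drawn(b):
            return 0
        return -100
    # Optimal play to the end of the game: the learner (black) maximizes,
    # the opponent minimizes. Searching the whole game tree like this is
    # infeasible in practice, hence "non-operational".
    values = [V(s, not black_to_move) for s in successors(b)]
    return max(values) if black_to_move else min(values)
```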

Choosing a Representation for the Target Function
Given the ideal target function V, we choose a representation that the learning program will use to describe the approximation V′ that it will learn.
Options for describing the function:
– Tables
– Rules
– Polynomial functions
– Neural networks
Trade-off in the choice:
– Expressive power
– Amount of training data required

Choosing a Representation for the Target Function
Target function: V: B → ℝ
Target function representation (a linear combination of board features):
– V′(b) = w₀ + w₁x₁ + w₂x₂ + w₃x₃ + w₄x₄ + w₅x₅ + w₆x₆
– where x₁, …, x₆ are the numbers of black pieces, red pieces, black kings, and red kings on the board, and the numbers of black and red pieces threatened (i.e., capturable on the next turn)
The effect of these design choices is to reduce the problem of learning a checkers strategy to the problem of learning values for the coefficients w₀ through w₆ in the target function representation.
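A brief sketch of this linear representation, assuming the six feature values have already been extracted from a board; the numeric weights and features below are made up purely for illustration.

```python
# Illustrative sketch: V'(b) = w0 + w1*x1 + ... + w6*x6 over six board features.
import numpy as np

def v_hat(features, weights):
    """features: the values x1..x6 for a board; weights: w0..w6 (w0 is the bias)."""
    return weights[0] + float(np.dot(weights[1:], features))

x = np.array([12.0, 12.0, 0.0, 0.0, 1.0, 2.0])        # made-up feature values
w = np.array([0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5])  # made-up weights
print(v_hat(x, w))
```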

Choosing a Function Approximation Algorithm
Each training example is an ordered pair ⟨b, V_train(b)⟩, where V_train(b) is the training value for the board b. Example: ⟨⟨x₁=3, x₂=0, x₃=1, x₄=0, x₅=0, x₆=0⟩, +100⟩, a board on which black has won the game (x₂ = 0 means no red pieces remain).
Estimating training values:
– Ambiguity in estimating training values: with only final outcomes available, we still need to assign specific scores to specific intermediate board states.
– Effective approach: use the current approximation of V and the next board state: V_train(b) ← V′(Successor(b))
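A short sketch of this estimation rule, reusing the v_hat function from the previous sketch; successor, is_final, black_won, drawn, and board_features are hypothetical helpers for a checkers engine, not code from the slides.

```python
# Hypothetical sketch of V_train(b) <- V'(Successor(b)).
# successor(b) is assumed to return the next board state at which it is again
# the program's turn to move; board_features(b) extracts x1..x6.
def estimate_training_value(b, weights):
    b_next = successor(b)
    if is_final(b_next):                      # final outcomes get their exact values
        if black_won(b_next):
            return 100.0
        return 0.0 if drawn(b_next) else -100.0
    return v_hat(board_features(b_next), weights)
```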

Choosing a Function Approximation Algorithm: Adjusting the Weights
– We must specify a learning algorithm for choosing the weights wᵢ that best fit the set of training examples {⟨b, V_train(b)⟩}.
– A common approach is to minimize the squared error E between the training values and the values predicted by the hypothesis V′: E = Σ over observed training examples ⟨b, V_train(b)⟩ of (V_train(b) − V′(b))²
LMS weight update rule (for each training example ⟨b, V_train(b)⟩):
– Use the current weights to calculate V′(b)
– For each weight wᵢ, update it as wᵢ ← wᵢ + η (V_train(b) − V′(b)) xᵢ, where η is a small constant (the learning rate)
– When the error (V_train(b) − V′(b)) is positive, V′(b) is too low, and each weight is increased in proportion to its feature value so as to raise V′(b)
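The LMS update itself is only a few lines; the following sketch is an illustration rather than the slides' code, with a learning rate of 0.01 and made-up numbers chosen only as assumptions for the example.

```python
# Sketch of the LMS weight update rule:
#   w_i <- w_i + eta * (V_train(b) - V'(b)) * x_i, with x_0 = 1 for the bias w_0.
import numpy as np

def lms_update(weights, features, v_train, eta=0.01):
    x = np.concatenate(([1.0], features))   # prepend x0 = 1 for the bias term
    error = v_train - float(weights @ x)    # V_train(b) - V'(b)
    return weights + eta * error * x

w = np.zeros(7)                                  # w0..w6
x = np.array([12.0, 12.0, 0.0, 0.0, 1.0, 2.0])   # made-up feature values x1..x6
w = lms_update(w, x, v_train=100.0)
print(w)
```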

The Final Design
Four program modules: Performance System, Critic, Generalizer, Experiment Generator.
Performance System: To solve the given performance task (playing checkers) by using the learned target function(s). It takes an instance of a new problem (a new game) as input and produces a trace of its solution (the game history) as output.
Critic: To take as input the history or trace of the game and produce as output a set of training examples of the target function.

The Final Design Generalizer: To take as input the training examples and produce an output hypothesis that is its estimate of the target function. It generalizes from the specific training examples, hypothesizing a general function that covers these examples and other cases beyond the training examples.

The Final Design Experiment Generator: To take as input the current hypothesis (the currently learned function) and output a new problem (i.e., an initial board state) for the Performance System to explore. Its role is to pick new practice problems that will maximize the learning rate of the overall system.

The Final Design
Figure 1.1 Final design of the checkers learning program: the Experiment Generator passes a new problem (an initial game board) to the Performance System; the Performance System produces a solution trace (the game history) for the Critic; the Critic produces a set of training examples {⟨b, V_train(b)⟩, …} for the Generalizer; and the Generalizer produces an updated hypothesis (V′) for the Experiment Generator.
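The interaction of the four modules can be sketched as a simple training loop; the class and method names below are placeholders invented for illustration, not an implementation from the slides or the book.

```python
# Hypothetical sketch of Figure 1.1 as a training loop; every class and method
# name here is a placeholder standing in for the corresponding module.
def train(performance_system, critic, generalizer, experiment_generator, n_games=1000):
    hypothesis = generalizer.initial_hypothesis()
    for _ in range(n_games):
        board = experiment_generator.new_problem(hypothesis)   # new problem (initial game board)
        trace = performance_system.play(board, hypothesis)     # solution trace (game history)
        examples = critic.training_examples(trace)             # {<b, V_train(b)>, ...}
        hypothesis = generalizer.fit(examples)                 # updated hypothesis V'
    return hypothesis
```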

Choices in designing the checkers learning program
– Determine type of training experience: games against experts, games against self, a table of correct moves, …
– Determine target function: Board → move, Board → value, …
– Determine representation of the learned function: polynomial, linear function of six features, artificial neural network, …
– Determine learning algorithm: gradient descent, linear programming, …
– Completed design

Issues in Machine Learning
What algorithms exist for learning general target functions from specific training examples?
– Under what conditions do particular algorithms converge, given sufficient training data?
– Which algorithms perform best for which types of problems and representations?
How much training data is sufficient?
– What general bounds relate the confidence in learned hypotheses to the amount of training experience and to the character of the learner's hypothesis space?
When and how can prior knowledge held by the learner guide the process of generalizing from examples?
– Can prior knowledge be helpful even when it is only approximately correct?

Issues in Machine Learning
What is the best strategy for choosing a useful next training experience?
– How does the choice of this strategy alter the complexity of the learning problem?
What is the best way to reduce the learning task to one or more function approximation problems?
– What specific functions should the system attempt to learn?
– Can this process itself be automated?
How can the learner automatically alter its representation to improve its ability to represent and learn the target function?

Summary
Machine learning algorithms are especially useful in:
– Data mining of large databases
– Poorly understood domains (e.g. face recognition)
– Domains with dynamically changing conditions (e.g. manufacturing processes)
Machine learning draws on ideas from diverse disciplines:
– Artificial intelligence
– Probability and statistics
– Computational complexity theory
– Information theory
– Psychology and neurobiology
– Control theory
– Philosophy
A well-defined learning problem requires a well-specified task, performance measure, and source of training experience.

Summary
Designing a machine learning system involves choosing:
– the type of training experience
– the target function to be learned
– a representation for this target function
– an algorithm for learning this target function
Learning involves searching through a space of possible hypotheses to find the hypothesis that best fits the available training examples and any other prior knowledge or constraints.