Supervised Learning Introduction


Supervised Learning Introduction

What is a Learning Problem? Learning = improving with experience at some task:
– improve over task T,
– with respect to performance measure P,
– based on experience E.
Example: learning to play checkers
– T: play checkers
– P: % of games won in the world tournament
– E: opportunity to play games against itself
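The T/P/E framing can be made concrete with a small sketch. This is illustrative only; the class and field names below are our own, not part of the original definition:

```python
from dataclasses import dataclass

# A learning problem in Mitchell's T/P/E framing (hypothetical names,
# introduced here just to make the three components explicit).
@dataclass
class LearningProblem:
    task: str                 # T: the task to improve at
    performance_measure: str  # P: how improvement is measured
    experience: str           # E: the source of training experience

checkers = LearningProblem(
    task="play checkers",
    performance_measure="% of games won in the world tournament",
    experience="opportunity to play games against itself",
)
print(checkers.task)
```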

Learning to Play Checkers
– T: play checkers
– P: percent of games won in the world tournament
Design questions:
– What experience?
– What exactly should be learned?
– How shall it be represented?
– Is the training distribution the same as the testing distribution?
– What specific algorithm should be used to learn it?

Choose the Target Function
ChooseMove: Board → Move ??
– Input: the set of legal board states
– Output: some move from the set of legal moves
This reduces the problem of improving performance P at task T to the problem of learning some target function such as ChooseMove.

Choose the Target Function
ChooseMove
– straightforward way to cast the problem as learning a function
– difficult to learn given only indirect training experience
Function V: Board → ℝ
– assigns a numeric score to each board state
– makes it easy to select the best move: pick the move leading to the highest-scoring state

Possible Definition for Target Function V
If b is a final board state that is won, then V(b) = 100.
If b is a final board state that is lost, then V(b) = -100.
If b is a final board state that is drawn, then V(b) = 0.
If b is not a final state in the game, then V(b) = V(b′), where b′ is the best final board state that can be achieved starting from b and playing optimally until the end of the game.
This gives correct values, but is not operational.
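Transcribing this definition directly shows why it is nonoperational: evaluating V for a non-final board means searching every line of play to the end of the game. A minimal sketch over a toy game tree (the tree and its values are placeholders standing in for checkers, not real positions):

```python
# Toy game tree standing in for checkers: interior boards map to their
# successors; leaves carry final-state values (+100 win, -100 loss, 0 draw).
TREE = {"b0": ["b1", "b2"], "b1": ["b3", "b4"], "b2": ["b5"]}
FINAL = {"b3": 100, "b4": -100, "b5": 0}

def V(b, maximizing=True):
    """The ideal V, transcribed literally: final boards score +100/-100/0;
    otherwise V(b) is the value of the best final board reachable from b
    under optimal play by both sides (i.e., minimax to the end of the game)."""
    if b in FINAL:
        return FINAL[b]
    values = [V(s, not maximizing) for s in TREE[b]]
    return max(values) if maximizing else min(values)

print(V("b0"))  # full game-tree search, even for this tiny tree
```

Even here, every call walks the entire subtree below b; for checkers that search is hopelessly expensive, which is why an approximation is needed.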

Choose the Target Function
The definition above is recursive and not efficiently computable: it is a nonoperational definition. Because the ideal target function is difficult to obtain, we settle for an approximation.

Choose the Target Function
Use a linear function of board features to represent the function:
– x1: the number of black pieces on the board
– x2: the number of red pieces on the board
– x3: the number of black kings on the board
– x4: the number of red kings on the board
– x5: the number of black pieces threatened by red
– x6: the number of red pieces threatened by black

Choose the Target Function
Target function representation: V̂(b) = w0 + w1x1 + w2x2 + … + w6x6.
This reduces the problem of learning a checkers strategy to the problem of learning values for the coefficients w0 through w6.
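The linear representation is a one-liner in code. A minimal sketch (the weight values and features below are arbitrary placeholders, not learned ones):

```python
def v_hat(x, w):
    """Linear approximation V̂(b) = w0 + w1*x1 + ... + w6*x6,
    where x = (x1, ..., x6) are the six board features."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

# Arbitrary example weights, and features for a board with 3 black pieces,
# 0 red pieces, 1 black king, and nothing threatened.
w = [0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]
x = [3, 0, 1, 0, 0, 0]
print(v_hat(x, w))  # 1.0*3 + 3.0*1 = 6.0
```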

An Approximated Function
Each training example is an ordered pair of the form ⟨b, Vtrain(b)⟩, where b is a board state and Vtrain(b) is its training value, e.g. ⟨b, +100⟩ for a board b in a winning final position.
Training process:
– assign specific scores to specific board states
– find the weights wi that best fit the training examples

An Approximated Function
Rule for estimating training values:
– Vtrain(b) ← V̂(Successor(b))
Finding the best weights:
– a learning algorithm chooses the weights wi to best fit the set of training examples
– here: LMS (Least Mean Squares)

LMS Weight Update Rule
For each training example ⟨b, Vtrain(b)⟩:
– use the current weights to calculate V̂(b)
– for each weight wi, update it as wi ← wi + η (Vtrain(b) − V̂(b)) xi, where η is a small constant (the learning rate).
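Putting the pieces together, the update rule above can be sketched as follows. The learning rate η and the single training example are arbitrary placeholders, chosen only to show the weights converging:

```python
def lms_update(w, x, v_train, eta=0.1):
    """One LMS step: wi <- wi + eta * (Vtrain(b) - V_hat(b)) * xi,
    using x0 = 1 so that the bias weight w0 is updated the same way."""
    x_full = [1] + list(x)
    v_hat = sum(wi * xi for wi, xi in zip(w, x_full))
    error = v_train - v_hat
    return [wi + eta * error * xi for wi, xi in zip(w, x_full)]

w = [0.0] * 7
# A single hypothetical training example: features of a won final board,
# so its training value is +100.
x = [3, 0, 1, 0, 0, 0]
for _ in range(200):
    w = lms_update(w, x, 100.0)

v_hat = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
print(round(v_hat, 2))  # the prediction approaches the training value 100
```

Each step moves every weight in proportion to its feature's contribution to the error, which is exactly the gradient-descent step for the squared error on this example.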