Machine Learning: Lecture 1 - Overview of Machine Learning (based on Chapter 1 of Mitchell, T., Machine Learning, 1997)

1 Some rules
- No make-up exams! If you miss an exam with an official excuse, you receive the average of your scores on the other exams (at most once).
- WP only if you get at least 40% in the exams taken before you withdraw.
- Grades (roughly): D: 53-60, D+: 61-65, C: 66-70, C+: 71-75, B: 76-80, B+: 81-85, A: > 85
- Attendance: more than 9 absences -> DN
- Bonus of up to 2 marks (to push up a grade) if you have fewer than 4 absences and are well-behaved.

2 Some rules
- Never believe in anything but "I can!" It always leads to "I will", "I did", and "I'm glad!"

3 Ch 1: Introduction to ML - Outline
- What is machine learning?
- Why machine learning?
- Well-defined learning problems
- Example: checkers
- Questions that should be asked about ML

4 What is Machine Learning?
Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Machine learning is the study of how to make computers learn; the goal is to make computers improve their performance through experience.

5 Successful Applications of ML
- Learning to recognize spoken words: SPHINX (Lee, 1989)
- Learning to drive an autonomous vehicle: ALVINN (Pomerleau, 1989)
- Learning to classify celestial objects (Fayyad et al., 1995)
- Learning to play world-class backgammon: TD-GAMMON (Tesauro, 1992)
- Designing the morphology and control structure of electro-mechanical artefacts: GOLEM (Lipson & Pollack, 2000)

6 Why Machine Learning?
- Some tasks cannot be defined well except by examples (e.g., recognizing people).
- Relationships and correlations can be hidden within large amounts of data; machine learning / data mining may be able to find them.
- The amount of knowledge available about certain tasks may be too large for explicit encoding by humans (e.g., medical diagnosis).

7 Why Machine Learning?
- Environments change over time.
- New knowledge about tasks is constantly being discovered by humans; it may be difficult to continuously re-design systems "by hand".
- The time is right:
  - progress in algorithms and theory
  - a growing flood of online data
  - available computational power
  - a budding industry

8 Why ML - Three Niches
- Data mining: medical records -> medical knowledge
- Self-customizing programs: a learning newsreader
- Applications we can't program by hand: autonomous driving, speech recognition

9 A Multidisciplinary Field
Machine learning draws on: probability & statistics, computational complexity theory, information theory, philosophy, neurobiology, and artificial intelligence.

10 Learning Problem
- Improving with experience at some task:
  - task T
  - performance measure P
  - experience E
- Example: handwriting recognition
  - T: recognize and classify handwritten words in images
  - P: % of words correctly classified
  - E: a database of words with their classifications
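As an illustrative sketch (not from the slides), the performance measure P for the handwriting example can be computed as simple classification accuracy; `predict` is a hypothetical classifier and the data below is a toy stand-in for the experience E:

```python
def accuracy(predict, labeled_words):
    """P: percentage of words correctly classified.

    predict: hypothetical classifier mapping an image to a word.
    labeled_words: list of (image, correct_word) pairs (the experience E).
    """
    correct = sum(1 for image, word in labeled_words if predict(image) == word)
    return 100.0 * correct / len(labeled_words)

# Toy usage with a trivial "classifier" that always answers "the".
data = [("img1", "the"), ("img2", "cat"), ("img3", "the")]
p = accuracy(lambda img: "the", data)  # 2 of 3 correct
```

Improving with experience then means: as E grows, P on the task T goes up.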

11 More examples...
- Checkers
  - T: play checkers
  - P: % of games won
  - E: playing against itself
- Robot driving
  - T: drive on a public highway using vision sensors
  - P: average distance traveled before making an error
  - E: a sequence of images and steering commands (observed from a human driver)

12 Designing a Learning System: Problem Description (e.g., Checkers)
- What experience?
- What exactly should be learned?
- How should it be represented?
- What algorithm should learn it?
1. Choosing the Training Experience
2. Choosing the Target Function
3. Choosing a Representation for the Target Function
4. Choosing a Function Approximation Algorithm
5. Final Design

13 Type of Training Experience
- Direct or indirect?
  - Direct: board state -> correct move
  - Indirect: the outcome of a complete game
    - move sequences and final outcomes
    - credit assignment problem
    - thus, more difficult to learn from
- What degree of control over the examples? (next slide)
- Is the training experience representative of the performance goal? (next slide)

14 Training experience - control
- Degree of control over the examples:
  - rely on a teacher (who selects informative board states and gives correct moves)
  - ask a teacher (the learner proposes difficult board states and asks for the move)
  - complete control (play games against itself and check the outcome)
    - variations: experiment with new states, or play minor variations of a promising move sequence

15 Training experience - training data
- How well does it represent the problem?
- Is the training experience representative of the task the system will actually have to solve?
  - It is best if it is, but such a situation cannot always be achieved.
- Distribution of examples: is it the same as for future test examples? Most theory makes this assumption.
- Checkers: training by playing against itself; performance evaluated by playing against the world champion.

16 Choosing the Target function
- Determine the type of knowledge to be learned and how it will be used.
- Checkers: legal moves vs. the best move
  - generating legal moves is easy; choosing the best move is hard
  - a large class of tasks is like this

17 Target function
- A program choosing the best move: ChooseMove: Board -> Move
  - "improve P in T" reduces to finding such a function
  - this choice is a key design decision
  - difficult to learn given (only) indirect examples
- Alternative: assign a numerical score to each board
  - V: Board -> R
  - select the best move by evaluating all successor states of legal moves
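A minimal sketch of the alternative above: instead of learning ChooseMove directly, evaluate V on every successor board and pick the move whose resulting state scores highest. `legal_moves` and `apply_move` are hypothetical helpers, and the V used below is an arbitrary stand-in, not a learned function:

```python
def choose_move(board, V, legal_moves, apply_move):
    """Select the legal move whose successor board maximizes V."""
    return max(legal_moves(board), key=lambda m: V(apply_move(board, m)))

# Toy usage: boards are integers, moves add an offset,
# and the stand-in V prefers boards near the value 13.
best = choose_move(
    10,
    V=lambda b: -abs(b - 13),
    legal_moves=lambda b: [1, 2, 3],
    apply_move=lambda b, m: b + m,
)
```

This is why a numeric evaluation function suffices: move selection becomes a one-line argmax over legal successors.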

18 Definitions for V
- Final board states: V(b) = 100 if won, -100 if lost, and 0 if drawn
- Intermediate states? V(b) = V(b'), where b' is the best final state reachable from b by playing an optimal game
  - correct, but not efficiently computable

19 The real target function
- Operational: V can be used and computed
  - goal: an operational description of the ideal target function
  - the ideal target function often cannot be learned exactly and must be approximated
- Notation
  - ^V: the function actually learned
  - V: the ideal target function

20 Choosing a representation for V
- Many possibilities: a collection of rules, a neural network, an arithmetic function of board features, etc.
- Usual tradeoff: the more expressive the representation, the more training examples are necessary to choose among the large number of representable hypotheses.

21 Simple representation
- A linear function of board features:
  - x1: number of black pieces, x2: number of red pieces
  - x3: number of black kings, x4: number of red kings
  - x5: number of black pieces threatened by red, x6: number of red pieces threatened by black
- ^V(b) = w0 + w1*x1 + w2*x2 + ... + w6*x6
  - wi: weights to be learned
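A sketch of this linear form, assuming the six board features x1..x6 have already been computed by some (hypothetical) feature extractor:

```python
def v_hat(weights, features):
    """Linear evaluation ^V(b) = w0 + w1*x1 + ... + w6*x6.

    weights:  [w0, w1, ..., w6]
    features: [x1, ..., x6] for the board b
    """
    w0, ws = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(ws, features))

# Toy usage: all weights 1; a board with 3 black pieces and 1 red king.
features = [3, 0, 0, 1, 0, 0]            # x1..x6
score = v_hat([1, 1, 1, 1, 1, 1, 1], features)
```

Learning then amounts to finding good values for the seven numbers w0, ..., w6.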

22 Note
- T, P, and E are part of the problem specification.
- V and ^V are design choices.
- Here the effect of these choices is to reduce the learning problem to finding the numbers w0, ..., w6.

23 Approximation Algorithm
- Obtaining training examples
  - Vt(b): the training value for board b
  - examples are pairs <b, Vt(b)>
- The procedure:
  - derive training values from the indirect experience
  - adjust the weights to fit ^V to the examples

24 Estimating Vt(b)
- That a game was won or lost does not mean every state in it was good or bad
  - early play may be good, and a late disaster still leads to a loss
- Simple approach: Vt(b) = ^V(b'), where b' is the next state in which the player is allowed to move
  - surprisingly successful
  - intuition: ^V is more accurate at states closer to the end of the game

25 Adjusting weights
- What is the best fit to the training data?
- One common approach: minimize the squared error E
  - E = sum over training examples of (Vt(b) - ^V(b))^2
  - several algorithms are known
- Properties we want:
  - incremental: refines the weights as examples arrive
  - robust to errors in Vt(b)

26 LMS update rule
- "Least mean squares"
- REPEAT:
  - select a random training example b
  - compute error(b) = Vt(b) - ^V(b)
  - for each board feature fi, update the weight: wi <- wi + eta * fi * error(b)
- eta: learning rate constant, approximately 0.1
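The loop above can be sketched as follows. In the full system the training value Vt(b) would come from the critic (e.g., ^V of the successor state, as on the previous slide); here it is supplied directly, and the single-feature data is a toy stand-in, not from the slides:

```python
def lms_update(weights, features, v_train, eta=0.1):
    """One LMS step: w_i <- w_i + eta * f_i * error(b).

    weights:  [w0, ..., wn]; features: [f1, ..., fn].
    f0 is implicitly 1, so w0 is updated by the full error.
    """
    fs = [1.0] + list(features)                      # constant feature for w0
    v_hat = sum(w * f for w, f in zip(weights, fs))  # current estimate ^V(b)
    error = v_train - v_hat
    return [w + eta * f * error for w, f in zip(weights, fs)]

# Toy usage: repeatedly fit one example with feature f1 = 1 and Vt = 4.0;
# the estimate ^V(b) = w0 + w1 should approach 4.0.
w = [0.0, 0.0]
for _ in range(200):
    w = lms_update(w, [1.0], 4.0)
```

Note how the rule embodies the properties on the previous slide: each example nudges the weights only slightly (incremental), so occasional bad training values are averaged out (robustness).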

27 Notes about LMS
- LMS actually performs stochastic gradient descent in weight space to minimize E (see Ch. 4).
- Why it works:
  - no error: no adjustment
  - positive/negative error: the weight is increased/decreased
  - if a feature fi does not occur (fi = 0), no adjustment is made to its weight wi

28 Final Design
- Four distinct modules:
  1. Performance system: produces a game trace for a given board state (using ^V)
  2. Critic: produces training examples <b, Vt(b)> from the trace
  3. Generalizer: produces ^V from the training data
  4. Experiment generator: generates new problems (initial board states) for the performance system
(Diagram: Experiment Generator -> Performance System -> Critic -> Generalizer)

29 Sequence of Design Choices
- Determine the type of training experience: games against experts / games against self / a table of correct moves
- Determine the target function: Board -> Move, or Board -> Value
- Determine the representation of the learned function: polynomial / linear function of six features / artificial neural network
- Determine the learning algorithm: gradient descent / linear programming

30 Useful perspective of ML
- Learning as search in a space of hypotheses
  - usually a large space (all tuples of weights w0, ..., w6 for checkers!)
- Find the hypothesis that best fits the examples and any prior knowledge.
- Different spaces arise depending on the target function and its representation.
- The search-space view gives a basis for formal analysis: size of the space, number of examples, confidence in the learned hypothesis, ...

31 Issues in ML
- Algorithms
  - What generalization methods exist?
  - When (if ever) will they converge?
  - Which are best for which types of problems and representations?
- Amount of training data
  - How much is sufficient? (confidence, data, and the size of the hypothesis space)
- Reducing problems: learning task -> function approximation

32 Issues in ML
- Prior knowledge
  - When and how can it help?
  - Is it helpful even when only approximate?
- Choosing experiments
  - Are there good strategies?
  - How do these choices affect complexity?
- Flexible representations
  - Can the learner automatically modify its representation to improve performance?