Machine Learning CSE 473. © Daniel S. Weld 2 473 Topics Agency Problem Spaces Search Knowledge Representation Reinforcement Learning InferencePlanning.

Slides:



Advertisements
Similar presentations
Machine learning Overview
Advertisements

1 Machine Learning: Lecture 1 Overview of Machine Learning (Based on Chapter 1 of Mitchell T.., Machine Learning, 1997)
Machine Learning: Intro and Supervised Classification
1 Machine Learning: Lecture 3 Decision Tree Learning (Based on Chapter 3 of Mitchell T.., Machine Learning, 1997)
Machine Learning III Decision Tree Induction
CPSC 502, Lecture 15Slide 1 Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 15 Nov, 1, 2011 Slide credit: C. Conati, S.
CS 4700: Foundations of Artificial Intelligence
Decision Tree Learning 主講人:虞台文 大同大學資工所 智慧型多媒體研究室.
CS 484 – Artificial Intelligence1 Announcements Project 1 is due Tuesday, October 16 Send me the name of your konane bot Midterm is Thursday, October 18.
1er. Escuela Red ProTIC - Tandil, de Abril, Introduction How to program computers to learn? Learning: Improving automatically with experience.
Bayesian Learning Rong Jin. Outline MAP learning vs. ML learning Minimum description length principle Bayes optimal classifier Bagging.
Machine Learning II Decision Tree Induction CSE 473.
LEARNING FROM OBSERVATIONS Yılmaz KILIÇASLAN. Definition Learning takes place as the agent observes its interactions with the world and its own decision-making.
Learning from Observations Copyright, 1996 © Dale Carnegie & Associates, Inc. Chapter 18 Fall 2004.
CSE 5731 Lecture 26 Introduction to Machine Learning CSE 573 Artificial Intelligence I Henry Kautz Fall 2001.
Data Mining with Decision Trees Lutz Hamel Dept. of Computer Science and Statistics University of Rhode Island.
LEARNING FROM OBSERVATIONS Yılmaz KILIÇASLAN. Definition Learning takes place as the agent observes its interactions with the world and its own decision-making.
1 Some rules  No make-up exams ! If you miss with an official excuse, you get average of your scores in the other exams – at most once.  WP only-if you.
A Brief Survey of Machine Learning
Adversarial Search: Game Playing Reading: Chess paper.
Learning Programs Danielle and Joseph Bennett (and Lorelei) 4 December 2007.
1 Inductive Learning of Rules MushroomEdible? SporesSpots Color YN BrownN YY GreyY NY BlackY NN BrownN YN WhiteN YY BrownY YN Brown NN Red Don’t try this.
MACHINE LEARNING. What is learning? A computer program learns if it improves its performance at some task through experience (T. Mitchell, 1997) A computer.
Part I: Classification and Bayesian Learning
CS Machine Learning. What is Machine Learning? Adapt to / learn from data  To optimize a performance function Can be used to:  Extract knowledge.
1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang.
Machine Learning Chapter 3. Decision Tree Learning
INTRODUCTION TO MACHINE LEARNING. $1,000,000 Machine Learning  Learn models from data  Three main types of learning :  Supervised learning  Unsupervised.
COMP3503 Intro to Inductive Modeling
For Friday Read chapter 18, sections 3-4 Homework: –Chapter 14, exercise 12 a, b, d.
CpSc 810: Machine Learning Design a learning system.
1 Machine Learning What is learning?. 2 Machine Learning What is learning? “That is what learning is. You suddenly understand something you've understood.
Machine Learning Chapter 11.
Machine Learning CSE 681 CH2 - Supervised Learning.
General-to-Specific Ordering. 8/29/03Logic Based Classification2 SkyAirTempHumidityWindWaterForecastEnjoySport SunnyWarmNormalStrongWarmSameYes SunnyWarmHighStrongWarmSameYes.
Learning from Observations Chapter 18 Through
Well Posed Learning Problems Must identify the following 3 features –Learning Task: the thing you want to learn. –Performance measure: must know when you.
Learning from observations
Machine Learning, Decision Trees, Overfitting Machine Learning Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 14,
Carla P. Gomes CS4700 CS 4700: Foundations of Artificial Intelligence Prof. Carla P. Gomes Module: Intro Learning (Reading: Chapter.
CS 8751 ML & KDDDecision Trees1 Decision tree representation ID3 learning algorithm Entropy, Information gain Overfitting.
Kansas State University Department of Computing and Information Sciences CIS 730: Introduction to Artificial Intelligence Lecture 9 of 42 Wednesday, 14.
Chapter 1: Introduction. 2 목 차목 차 t Definition and Applications of Machine t Designing a Learning System  Choosing the Training Experience  Choosing.
Machine Learning II Decision Tree Induction CSE 573.
CS 5751 Machine Learning Chapter 3 Decision Tree Learning1 Decision Trees Decision tree representation ID3 learning algorithm Entropy, Information gain.
Data Mining and Decision Support
1 Text Categorization CSE Categorization Given: –A description of an instance, x  X, where X is the instance language or instance space. –A fixed.
1 Introduction to Machine Learning Chapter 1. cont.
CS 8751 ML & KDDComputational Learning Theory1 Notions of interest: efficiency, accuracy, complexity Probably, Approximately Correct (PAC) Learning Agnostic.
Introduction Machine Learning: Chapter 1. Contents Types of learning Applications of machine learning Disciplines related with machine learning Well-posed.
Well Posed Learning Problems Must identify the following 3 features –Learning Task: the thing you want to learn. –Performance measure: must know when you.
Chapter 18 Section 1 – 3 Learning from Observations.
Machine Learning Lecture 1: Intro + Decision Trees Moshe Koppel Slides adapted from Tom Mitchell and from Dan Roth.
Machine Learning Chapter 7. Computational Learning Theory Tom M. Mitchell.
Learning From Observations Inductive Learning Decision Trees Ensembles.
Machine Learning & Datamining CSE 454. © Daniel S. Weld 2 Project Part 1 Feedback Serialization Java Supplied vs. Manual.
1 Machine Learning Patricia J Riddle Computer Science 367 6/26/2016Machine Learning.
Supervise Learning Introduction. What is Learning Problem Learning = Improving with experience at some task – Improve over task T, – With respect to performance.
Data Mining Lecture 3.
Spring 2003 Dr. Susan Bridges
Chapter 11: Learning Introduction
Knowledge Representation
Computational Learning Theory
Overview of Machine Learning
Computational Learning Theory
Machine Learning CSE 454.
Why Machine Learning Flood of data
CS639: Data Management for Data Science
Lecture 14 Learning Inductive inference
Learning from Observations
Presentation transcript:

Machine Learning CSE 473

© Daniel S. Weld Topics Agency Problem Spaces Search Knowledge Representation Reinforcement Learning InferencePlanning Supervised Learning LogicProbability

© Daniel S. Weld 3 Machine Learning Outline Machine learning: What & why? Bias Supervised learning Classifiers A supervised learning technique in depth Induction of Decision Trees Ensembles of classifiers Overfitting

© Daniel S. Weld 4 Why Machine Learning Flood of data WalMart – 25 Terabytes WWW – 1,000 Terabytes Speed of computer vs. of programming Highly complex systems (telephone switching systems) Productivity = 1 line programmer Desire for customization A browser that browses by itself? Hallmark of Intelligence How do children learn language?

© Daniel S. Weld 5 Applications of ML Credit card fraud Product placement / consumer behavior Recommender systems Speech recognition Most mature & successful area of AI

© Daniel S. Weld 6 Examples of Learning Baby touches stove, gets burned, next time… Medical student is shown cases of people with disease X, learns which symptoms… How many groups of dots?

© Daniel S. Weld 7 What is Machine Learning??

© Daniel S. Weld 8 Defining a Learning Problem Task T: Playing checkers Performance Measure P: Percent of games won against opponents Experience E: Playing practice games against itself A program is said to learn from experience E with respect to task T and performance measure P, if it’s performance at tasks in T, as measured by P, improves with experience E.

© Daniel S. Weld 9 Last Class? Task T: Programming by demonstration Performance Measure P: Experience E: A program is said to learn from experience E with respect to task T and performance measure P, if it’s performance at tasks in T, as measured by P, improves with experience E.

© Daniel S. Weld 10 Issues What feedback (experience) is available? How should these features be represented? What kind of knowledge is being increased? How is that knowledge represented? What prior information is available? What is the right learning algorithm? How avoid overfitting?

© Daniel S. Weld 11 Choosing the Training Experience Credit assignment problem: Direct training examples: E.g. individual checker boards + correct move for each Supervised learning Indirect training examples : E.g. complete sequence of moves and final result Reinforcement learning Which examples: Random, teacher chooses, learner chooses

© Daniel S. Weld 12 Choosing the Target Function What type of knowledge will be learned? How will the knowledge be used by the performance program? E.g. checkers program Assume it knows legal moves Needs to choose best move So learn function: F: Boards -> Moves hard to learn Alternative: F: Boards -> R Note similarity to choice of problem space

© Daniel S. Weld 13 The Ideal Evaluation Function V(b) = 100 if b is a final, won board V(b) = -100 if b is a final, lost board V(b) = 0 if b is a final, drawn board Otherwise, if b is not final V(b) = V(s) where s is best, reachable final board Nonoperational… Want operational approximation of V: V

© Daniel S. Weld 14 How Represent Target Function x 1 = number of black pieces on the board x 2 = number of red pieces on the board x 3 = number of black kings on the board x 4 = number of red kings on the board x 5 = num of black pieces threatened by red x 6 = num of red pieces threatened by black V(b) = a + bx 1 + cx 2 + dx 3 + ex 4 + fx 5 + gx 6 Now just need to learn 7 numbers!

© Daniel S. Weld 15 Example: Checkers Task T: Playing checkers Performance Measure P: Percent of games won against opponents Experience E: Playing practice games against itself Target Function V: board -> R Representation of approx. of target function V(b) = a + bx1 + cx2 + dx3 + ex4 + fx5 + gx6

© Daniel S. Weld 16 Target Function Profound Formulation: Can express any type of inductive learning as approximating a function E.g., Checkers V: boards -> evaluation E.g., Handwriting recognition V: image -> word E.g., Mushrooms V: mushroom-attributes -> {E, P}

© Daniel S. Weld 17 More Examples

© Daniel S. Weld 18 More Examples Collaborative Filtering Eg, when you look at book B in Amazon It says “Buy B and also book C together & save!” Automatic Steering

© Daniel S. Weld 19 Supervised Learning Inductive learning or “Prediction”: Given examples of a function (X, F(X)) Predict function F(X) for new examples X Classification F(X) = Discrete Regression F(X) = Continuous Probability estimation F(X) = Probability(X): Task Performance Measure Experience

© Daniel S. Weld 20 Why is Learning Possible? Experience alone never justifies any conclusion about any unseen instance. Learning occurs when PREJUDICE meets DATA! Learning a “FOO”

© Daniel S. Weld 21 Bias The nice word for prejudice is “bias”. What kind of hypotheses will you consider? What is allowable range of functions you use when approximating? What kind of hypotheses do you prefer?

© Daniel S. Weld 22 Some Typical Bias The world is simple Occam’s razor “It is needless to do more when less will suffice” – William of Occam, died 1349 of the Black plague MDL – Minimum description length Concepts can be approximated by... conjunctions of predicates... by linear functions... by short decision trees

© Daniel S. Weld 23

© Daniel S. Weld 24

© Daniel S. Weld 25

© Daniel S. Weld 26

© Daniel S. Weld 27 Two Strategies for ML Restriction bias: use prior knowledge to specify a restricted hypothesis space. Version space algorithm over conjunctions. Preference bias: use a broad hypothesis space, but impose an ordering on the hypotheses. Decision trees.

© Daniel S. Weld 28

© Daniel S. Weld 29