CIS 732/830 Lecture 0: A Brief Survey of Machine Learning
Kansas State University, Department of Computing and Information Sciences
CIS 732: Machine Learning and Pattern Recognition
Friday, 12 January 2007
William H. Hsu, Department of Computing and Information Sciences, KSU
Readings: Chapter 1, Mitchell

Lecture Outline
–Course information: format, exams, homework, programming, grading
–Class resources: course notes, web/news/mail, reserve texts
–Overview: topics covered; applications; why machine learning?
–Brief tour of machine learning: a case study; a taxonomy of learning; intelligent systems engineering: specification of learning problems
–Issues in machine learning: design choices; the performance element: intelligent systems
–Some applications of learning: database mining; reasoning (inference/decision support); acting; industrial usage of intelligent systems

Course Information and Administrivia
–Instructor: William H. Hsu
–Office phone: (785) x29; home phone: (785) ; ICQ
–Office hours: Wed/Thu/Fri (213 Nichols); by appointment, Tue
–Grading: assignments (8): 32%; paper reviews (13): 10%; midterm: 20%; final: 25%; class participation: 5%; lowest homework score and lowest 3 paper summary scores dropped
–Homework: eight assignments, written (5) and programming (3), drop lowest; late policy: due on Fridays; cheating: don't do it (see class web page for policy)
–Project option: CIS 690/890 project option for graduate students; term paper or semester research project; sign up by 09 Feb 2007

Class Resources
–Web page (required): lecture notes (MS PowerPoint XP, PDF); homeworks (MS PowerPoint XP, PDF); exam and homework solutions (MS PowerPoint XP, PDF); class announcements (students' responsibility to follow) and grade postings
–Course notes at copy center (required)
–Course web group: mirror of announcements from the class web page; discussions (instructor and other students); dated research announcements (seminars, conferences, calls for papers)
–Mailing list (automatic): sign-up sheet; reminders and related research announcements

Course Overview
–Learning algorithms and models
 –Models: decision trees, linear threshold units (winnow, weighted majority), neural networks, Bayesian networks (polytrees, belief networks, influence diagrams, HMMs), genetic algorithms, instance-based (nearest-neighbor)
 –Algorithms (e.g., for decision trees): ID3, C4.5, CART, OC1
 –Methodologies: supervised, unsupervised, reinforcement; knowledge-guided
–Theory of learning: computational learning theory (COLT): complexity and limitations of learning; Probably Approximately Correct (PAC) learning; probabilistic, statistical, and information-theoretic results
–Multistrategy learning: combining techniques and knowledge sources
–Data: time series, very large databases (VLDB), text corpora
–Applications: performance element (classification, decision support, planning, control); database mining and knowledge discovery in databases (KDD); computer inference: learning to reason

Why Machine Learning?
–New computational capability: database mining (converting technical records into knowledge); self-customizing programs (learning news filters, adaptive monitors); learning to act (robot planning, control optimization, decision support); applications that are hard to program (automated driving, speech recognition)
–Better understanding of human learning and teaching: cognitive science (theories of knowledge acquisition, e.g., through practice); performance elements: reasoning (inference) and recommender systems
–The time is right: recent progress in algorithms and theory; rapidly growing volume of online data from various sources; available computational power; growth of and interest from learning-based industries (e.g., data mining/KDD)

Rule and Decision Tree Learning
Example: rule acquisition from historical data
–Data:
 –Patient 103 (time = 1): Age 23, First-Pregnancy: no, Anemia: no, Diabetes: no, Previous-Premature-Birth: no, Ultrasound: unknown, Elective C-Section: unknown, Emergency C-Section: unknown
 –Patient 103 (time = 2): Age 23, First-Pregnancy: no, Anemia: no, Diabetes: yes, Previous-Premature-Birth: no, Ultrasound: abnormal, Elective C-Section: no, Emergency C-Section: unknown
 –Patient 103 (time = n): Age 23, First-Pregnancy: no, Anemia: no, Diabetes: no, Previous-Premature-Birth: no, Ultrasound: unknown, Elective C-Section: no, Emergency C-Section: YES
–Learned rule: IF no previous vaginal delivery, AND abnormal 2nd-trimester ultrasound, AND malpresentation at admission, AND no elective C-Section, THEN probability of emergency C-Section is 0.6
–Accuracy: training set 26/41 = 0.634; test set 12/20 = 0.600
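A rule like the one above can be applied and scored programmatically. The following is a minimal Python sketch, not code from the actual study: the record field names and the helper functions (`rule_fires`, `rule_accuracy`) are hypothetical, chosen only to mirror the rule's antecedents and the accuracy figures quoted on the slide.

```python
# Hypothetical sketch of applying the learned emergency-C-Section rule.
# Field names below are illustrative; the real patient records differ.

def rule_fires(patient):
    """True iff every antecedent of the learned rule holds for this record."""
    return (not patient["previous_vaginal_delivery"]
            and patient["ultrasound_2nd_trimester"] == "abnormal"
            and patient["malpresentation_at_admission"]
            and not patient["elective_c_section"])

def rule_accuracy(patients):
    """Fraction of rule-matching patients who actually had an emergency C-Section."""
    matched = [p for p in patients if rule_fires(p)]
    if not matched:
        return 0.0
    hits = sum(1 for p in matched if p["emergency_c_section"])
    return hits / len(matched)
```

Scoring the rule separately on a held-out test set, as the slide does (0.634 training vs. 0.600 test), is what reveals how well it generalizes beyond the records it was induced from.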

Neural Network Learning
–Autonomous Land Vehicle In a Neural Network (ALVINN): Pomerleau et al.
–Drives at 70 mph on highways

Relevant Disciplines (figure: machine learning at the hub, with contributions from each surrounding field)
–Fields: artificial intelligence, Bayesian methods, cognitive science, computational complexity theory, control theory, information theory, neuroscience, philosophy, psychology, statistics
–Contributed concepts: symbolic representation, planning/problem solving, knowledge-guided learning, Bayes's theorem, missing-data estimators, PAC formalism, mistake bounds, language learning, learning to reason, optimization, learning predictors, meta-learning, entropy measures, MDL approaches, optimal codes, ANN models, modular learning, Occam's razor, inductive generalization, power law of practice, heuristic learning, bias/variance formalism, confidence intervals, hypothesis testing

Specifying a Learning Problem
–Learning = improving with experience at some task: improve over task T, with respect to performance measure P, based on experience E
–Example: learning to play checkers. T: play games of checkers; P: percent of games won in a world tournament; E: opportunity to play against self
–Refining the problem specification, issues: What experience? What exactly should be learned? How shall it be represented? What specific algorithm should learn it?
–Defining the problem milieu: performance element: how shall the results of learning be applied? How shall the performance element be evaluated? The learning system?

Example: Learning to Play Checkers
–Type of training experience: direct or indirect? teacher or not? knowledge about the game (e.g., openings/endgames)?
–Problem: is the training experience representative (of the performance goal)?
–Software design: assumptions of the learning system (a legal move generator exists); software requirements (generator, evaluator(s), parametric target function)
–Choosing a target function: ChooseMove: Board → Move (action selection function, or policy); V: Board → R (board evaluation function); ideal target V; approximated target V̂
–Goal of the learning process: an operational description (approximation) of V

A Target Function for Learning to Play Checkers
–Possible definition:
 –If b is a final board state that is won, then V(b) = 100
 –If b is a final board state that is lost, then V(b) = -100
 –If b is a final board state that is drawn, then V(b) = 0
 –If b is not a final board state, then V(b) = V(b'), where b' is the best final board state reachable from b when playing optimally until the end of the game
 –These values are correct, but not operational
–Choosing a representation for the target function: a collection of rules? a neural network? a polynomial function (e.g., linear or quadratic combination) of board features? other?
–A representation for the learned function: V̂(b) = w0 + w1·bp(b) + w2·rp(b) + w3·bk(b) + w4·rk(b) + w5·bt(b) + w6·rt(b)
 –bp/rp = number of black/red pieces; bk/rk = number of black/red kings; bt/rt = number of black/red pieces threatened (can be taken on next turn)
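The linear representation above is just a weighted sum of the six board features plus a bias term. A minimal Python sketch (function name `v_hat` and the argument layout are my own; feature extraction from an actual board is elided):

```python
# Linear board-evaluation function from the slide:
#   V_hat(b) = w0 + w1*bp(b) + w2*rp(b) + w3*bk(b) + w4*rk(b) + w5*bt(b) + w6*rt(b)
# Here features = (bp, rp, bk, rk, bt, rt), already extracted from the board.

def v_hat(weights, features):
    """Evaluate the linear target-function approximation; weights[0] is the bias w0."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))
```

For example, with weights (1, 2, 0, 0, 0, 0, 0) and a board with three black pieces and nothing else, the evaluation is 1 + 2·3 = 7.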

A Training Procedure for Learning to Play Checkers
–Obtaining training examples: V(b), the target function; V̂(b), the learned function; Vtrain(b), the training value
–One rule for estimating training values: Vtrain(b) ← V̂(Successor(b))
–Choosing a weight-tuning rule: the least mean square (LMS) weight update rule:
 REPEAT
  Select a training example b at random
  Compute error(b) = Vtrain(b) - V̂(b) for this training example
  For each board feature fi, update weight wi as follows: wi ← wi + c · fi · error(b)
 where c is a small, constant factor that adjusts the learning rate
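The LMS loop above can be sketched directly in Python. This is an illustrative implementation under my own conventions (the function names `lms_update`/`lms_train`, the bias handled as a weight with constant feature 1, and the example format `(features, v_train)` are assumptions, not from the slides):

```python
import random

def lms_update(weights, features, v_train, c=0.1):
    """One LMS step: error(b) = V_train(b) - V_hat(b); then w_i += c * f_i * error(b)."""
    prediction = weights[0] + sum(w * f for w, f in zip(weights[1:], features))
    error = v_train - prediction
    weights[0] += c * error                      # bias term uses implicit feature f0 = 1
    for i, f in enumerate(features, start=1):
        weights[i] += c * f * error
    return weights

def lms_train(examples, n_features, c=0.1, epochs=200, seed=0):
    """Repeatedly pick a random training example and apply the LMS update rule."""
    rng = random.Random(seed)
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs * len(examples)):
        features, v_train = rng.choice(examples)
        lms_update(weights, features, v_train, c)
    return weights
```

On noise-free data generated by a linear function (say v = 2 + 3f), the weights converge toward the generating coefficients, which is exactly the behavior the checkers learner relies on.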

Design Choices for Learning to Play Checkers
–Determine type of training experience: games against experts; games against self; table of correct moves
–Determine target function: Board → value; Board → move
–Determine representation of the learned function: polynomial; linear function of six features; artificial neural network
–Determine learning algorithm: gradient descent; linear programming
–Result: completed design

Some Issues in Machine Learning
–What algorithms can approximate functions well? When?
–How do learning-system design factors influence accuracy? Number of training examples; complexity of the hypothesis representation
–How do learning-problem characteristics influence accuracy? Noisy data; multiple data sources
–What are the theoretical limits of learnability?
–How can prior knowledge help the learner?
–What clues can we get from biological learning systems?
–How can systems alter their own representation?

Interesting Applications
–Reasoning (inference, decision support): Cartia ThemeScapes, news stories from the WWW in 1997
–Planning, control: DC-ARM fire control (states: Normal, Ignited, Engulfed, Destroyed, Extinguished; events: Fire Alarm, Flooding)
–Database mining: NCSA D2K

Lecture Outline
–Read: Chapter 2, Mitchell; Section 5.1.2, Buchanan and Wilkins
–Suggested exercises: 2.2, 2.3, 2.4, 2.6
–Taxonomy of learning systems
–Learning from examples: the (supervised) concept learning framework; a simple approach that assumes no noise and illustrates key concepts
–General-to-specific ordering over hypotheses: version space (partially-ordered set, or poset, formalism); candidate elimination algorithm; inductive learning
–Choosing new examples
–Next week: the need for inductive bias (2.7, Mitchell; Shavlik and Dietterich); computational learning theory, COLT (Chapter 7, Mitchell); the PAC learning formalism (Mitchell; 2.4.2, Shavlik and Dietterich)

What to Learn?
–Classification functions: learning hidden functions, i.e., estimating ("fitting") parameters; concept learning (e.g., chair, face, game); diagnosis and prognosis (medical, risk assessment, fraud, mechanical systems)
–Models: a map (for navigation); a distribution (query answering, aka QA); a language model (e.g., automaton/grammar)
–Skills: playing games; planning; reasoning (acquiring a representation to use in reasoning)
–Cluster definitions for pattern recognition: shapes of objects; functional or taxonomic definitions
–Many problems can be reduced to classification

How to Learn It?
–Supervised: what is learned? a classification function or other model; how is it learned? presentation of examples to the learner (by a teacher)
–Unsupervised: what is learned? cluster definitions, or a vector quantization function (codebook); how is it learned? formation, segmentation, and labeling of clusters based on observations and a metric
–Reinforcement: what is learned? a control policy (a function from states of the world to actions); how is it learned? (delayed) feedback of reward values to the agent based on the actions selected; the model is updated based on reward and the (partially) observable state

(Supervised) Concept Learning
–Given: training examples of some unknown function f. Find: a good approximation to f.
–Examples (besides concept learning):
 –Disease diagnosis: x = properties of the patient (medical history, symptoms, lab tests); f = disease (or recommended therapy)
 –Risk assessment: x = properties of the consumer or policyholder (demographics, accident history); f = risk level (expected cost)
 –Automatic steering: x = bitmap picture of the road surface in front of the vehicle; f = degrees to turn the steering wheel
 –Part-of-speech tagging; fraud/intrusion detection; web log analysis; multisensor integration and prediction

A Learning Problem
–Unknown function: y = f(x1, x2, x3, x4)
–Types: xi : ti, y : t, so f : (t1 × t2 × t3 × t4) → t
–Our learning function maps a vector of training examples to a hypothesis: Vector(t1 × t2 × t3 × t4 × t) → ((t1 × t2 × t3 × t4) → t)
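The type signature above says a learner takes a vector of labeled examples and returns a function. The weakest such learner, useful as a baseline for the counting argument on the next slide, simply memorizes the training pairs. A minimal Python sketch (the name `rote_learner` and the "return None off the training set" convention are my own):

```python
# The learner receives (x, y) pairs from an unknown function f(x1, x2, x3, x4)
# and must output a hypothesis h. The simplest learner memorizes the examples.

def rote_learner(examples):
    """Return a hypothesis that replays memorized pairs and abstains elsewhere."""
    table = {tuple(x): y for x, y in examples}
    def h(x):
        return table.get(tuple(x))   # None means "don't know": x was never seen
    return h
```

Rote learning matches the training data perfectly but says nothing about unseen inputs, which is exactly the problem the next slide quantifies.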

Hypothesis Space: Unrestricted Case
–|A → B| = |B|^|A| (the number of functions from A to B)
–|H| = |{0,1} × {0,1} × {0,1} × {0,1} → {0,1}| = 2^(2^4) = 2^16 = 65536 function values
–Complete ignorance: is learning possible? We would need to see every possible input/output pair; after 7 examples, we still have 2^9 = 512 possibilities (out of 65536) for f
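The counting argument on this slide is a short computation, sketched here in Python (the variable names are mine):

```python
# With 4 Boolean inputs there are 2**4 = 16 possible input vectors, so an
# unrestricted Boolean target function is one of 2**16 = 65536 candidates.
# Each labeled example pins down f on exactly one input vector, so after 7
# examples, 16 - 7 = 9 inputs remain unconstrained: 2**9 = 512 candidates left.

n_inputs = 2 ** 4                    # 16 rows in the truth table of f
n_functions = 2 ** n_inputs          # 65536 candidate target functions
seen = 7
remaining = 2 ** (n_inputs - seen)   # 512 functions still consistent
```

This is why learning is hopeless without an inductive bias: every unseen input is still labeled both ways by some consistent hypothesis.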

Training Examples for Concept EnjoySport
–Specification for examples: similar to a data type definition
–6 attributes: Sky, Temp, Humidity, Wind, Water, Forecast
–Nominal-valued (symbolic) attributes, i.e., an enumerative data type
–A binary (Boolean-valued) concept
–Supervised learning problem: describe the general concept

Representing Hypotheses
–Many possible representations
–Hypothesis h: a conjunction of constraints on attributes
–Constraint values: a specific value (e.g., Water = Warm); don't care (e.g., "Water = ?"); no value allowed (e.g., "Water = Ø")
–Example hypothesis for EnjoySport: a constraint for each of Sky, AirTemp, Humidity, Wind, Water, Forecast
–Is this consistent with the training examples? What are some hypotheses that are consistent with the examples?
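Checking whether an example satisfies such a conjunctive hypothesis is a per-attribute test. A minimal Python sketch, under the assumption that a hypothesis is a tuple of constraints in the attribute order Sky, AirTemp, Humidity, Wind, Water, Forecast, with "?" for don't-care and "0" standing in for the empty-set constraint Ø (the function name `satisfies` and these encodings are my own):

```python
# A hypothesis is a conjunction of per-attribute constraints:
#   "?"  accepts any value, "0" (i.e., the empty set) accepts no value,
#   and any other string requires that exact attribute value.

def satisfies(example, hypothesis):
    """True iff every attribute constraint in the hypothesis accepts the example."""
    for value, constraint in zip(example, hypothesis):
        if constraint == "0":                       # Ø rejects every example
            return False
        if constraint != "?" and constraint != value:
            return False
    return True
```

This predicate is the building block for the candidate elimination algorithm previewed in the outline: a hypothesis is consistent with the training data exactly when it satisfies every positive example and no negative one.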