INTRODUCTION TO Machine Learning. Adapted from: ETHEM ALPAYDIN, Lecture Slides for Introduction to Machine Learning (The MIT Press).


CHAPTER 1: Introduction

3 Why “Learn”? Machine learning is programming computers to optimize a performance criterion using example data or past experience. There is no need to “learn” to calculate payroll. Learning is used when:  Human expertise does not exist (navigating on Mars),  Humans are unable to explain their expertise (speech recognition),  The solution changes over time (routing on a computer network),  The solution needs to be adapted to particular cases (user biometrics).

4 What We Talk About When We Talk About “Learning” Learning general models from data of particular examples. Data is cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce. Example in retail, from customer transactions to consumer behavior: people who bought “Introduction to Machine Learning (Adaptive Computation and Machine Learning)” also bought “Data Mining: Practical Machine Learning Tools and Techniques”. Build a model that is a good and useful approximation to the data.

5 Data Mining Retail: Market basket analysis, Customer relationship management (CRM) Finance: Credit scoring, fraud detection Manufacturing: Optimization, troubleshooting Medicine: Medical diagnosis Telecommunications: Quality of service optimization Bioinformatics: Motifs, alignment Web mining: Search engines...

6 What is Machine Learning? Optimize a performance criterion using example data or past experience. Role of statistics: inference from a sample. Role of computer science: efficient algorithms to  Solve the optimization problem  Represent and evaluate the model for inference

7 What is Machine Learning? An Abstract Learning Machine (Algorithm) [Diagram: training data is fed to the learning machine; the trained machine takes a query and returns an answer.]

8 Some Learning Machines Linear models Kernel methods Neural networks Markov models Decision trees …

9 Some Areas of Application Bioinformatics, ecology, OCR, HWR (handwriting recognition), market analysis, text categorization, machine vision, system diagnosis.

10 Biomedical / Biometrics Medicine:  Screening  Diagnosis and prognosis  Drug discovery Security:  Face recognition  Signature / fingerprint / iris verification  DNA fingerprinting

11 Computer / Internet Computer interfaces:  Troubleshooting wizards  Handwriting and speech  Brain waves Internet:  Hit ranking  Spam filtering  Text categorization  Text translation  Recommendation

Application Types

13 Application Types Learning Associations Supervised Learning  Classification  Regression Unsupervised Learning Reinforcement Learning

14 Learning Associations Basket analysis: P (Y | X): the probability that somebody who buys X also buys Y, where X and Y are products/services. Example: P (pen | notebook) = 0.7
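The conditional probability P (Y | X) above can be estimated directly from transaction counts: take the baskets that contain X and measure what fraction also contain Y. A minimal Python sketch (the basket data is invented for illustration):

```python
# Estimate P(Y | X): fraction of baskets containing X that also contain Y.
def conditional_prob(baskets, x, y):
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0  # X never bought: no evidence
    return sum(1 for b in with_x if y in b) / len(with_x)

# Invented transactions: each basket is a set of purchased items.
baskets = [
    {"notebook", "pen"},
    {"notebook", "pen", "eraser"},
    {"notebook"},
    {"pen"},
]
# 3 baskets contain "notebook"; 2 of those also contain "pen".
print(conditional_prob(baskets, "notebook", "pen"))  # → 0.666...
```

With real transaction logs the same count-and-divide estimate scales to any product pair.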

15 Application Types Learning Associations Supervised Learning  Classification  Regression Unsupervised Learning Reinforcement Learning

16 Classification Example: credit scoring. Differentiating between low-risk and high-risk customers from their income and savings. Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
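The two-threshold discriminant on the slide translates directly into code. A minimal sketch; the threshold values θ1 and θ2 below are hypothetical, chosen only for illustration:

```python
# Discriminant: IF income > theta1 AND savings > theta2
#               THEN low-risk ELSE high-risk
THETA1 = 30_000  # income threshold (hypothetical)
THETA2 = 10_000  # savings threshold (hypothetical)

def credit_risk(income, savings):
    if income > THETA1 and savings > THETA2:
        return "low-risk"
    return "high-risk"

print(credit_risk(45_000, 20_000))  # → low-risk
print(credit_risk(45_000, 5_000))   # → high-risk
```

In practice the thresholds would be learned from labeled customer data rather than set by hand.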

17 Classification: Applications Face recognition: pose, lighting, occlusion (glasses, beard), make-up, hair style. Character recognition: different handwriting styles. Speech recognition: temporal dependency.  Use of a dictionary or the syntax of the language.  Sensor fusion: combine multiple modalities, e.g., visual (lip image) and acoustic, for speech. Medical diagnosis: from symptoms to illnesses...

18 Face Recognition [Figure: training examples of a person and test images. AT&T Laboratories, Cambridge, UK]

19 Example 1 “Sorting incoming fish on a conveyor according to species using optical sensing.” The two species to separate are sea bass and salmon.

20 Problem Analysis  Set up a camera and take some sample images to extract features: Length Lightness Width Number and shape of fins Position of the mouth, etc… This is the set of all suggested features to explore for use in our classifier!

21 Preprocessing  Use a segmentation operation to isolate the fish from one another and from the background. Information from a single fish is sent to a feature extractor, whose purpose is to reduce the data by measuring certain features. The features are passed to a classifier.

23 Classification  Select the length of the fish as a possible feature for discrimination. The length alone is a poor feature!  Select the lightness as a possible feature.

24 Classification  Select the lightness of the fish as a possible feature for discrimination.  Move the decision boundary toward smaller values of lightness in order to minimize the cost (reduce the number of sea bass that are classified as salmon!) This is a task of decision theory.
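The cost-driven threshold shift described above can be sketched as a search over candidate lightness thresholds, where misclassifying sea bass as salmon is penalized more heavily than the reverse. The lightness samples and the cost values below are invented for illustration:

```python
# Classify as salmon if lightness < theta, else sea bass.
# Find the theta minimizing an asymmetric misclassification cost.
salmon   = [1.0, 2.0, 3.0, 4.0]   # invented lightness values
sea_bass = [3.5, 4.5, 5.0, 6.0]
COST_BASS_AS_SALMON = 2.0          # selling sea bass as salmon: costly
COST_SALMON_AS_BASS = 1.0

def total_cost(theta):
    bass_as_salmon = sum(1 for x in sea_bass if x < theta)
    salmon_as_bass = sum(1 for x in salmon if x >= theta)
    return (COST_BASS_AS_SALMON * bass_as_salmon
            + COST_SALMON_AS_BASS * salmon_as_bass)

candidates = sorted(salmon + sea_bass)
best = min(candidates, key=total_cost)
print(best, total_cost(best))  # → 3.5 1.0
```

Raising the sea-bass cost pulls the chosen threshold toward smaller lightness, exactly the effect the slide describes.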

25 Adopt the lightness and add the width of the fish: xT = [x1, x2], where x1 = lightness and x2 = width. We might add other features that are not correlated with the ones we already have.

26 Ideally, the best decision boundary should be the one that provides optimal performance, such as in the following figure:

27 However, our satisfaction is premature, because the central aim of designing a classifier is to correctly classify novel input: the issue of generalization!

28 Example: Particle Classification [Figure: particles on an air filter, from three classes P1, P2, and P3.]

29 Sample data set [Table: measured feature values and the class label for each particle.]

30 Histogram Analysis [Figure: histograms of the area feature (count vs. area) for classes P1, P2, and P3.]

31 Histogram Analysis [Figure: histograms of the perimeter feature, and a plot of area vs. perimeter, for classes P1, P2, and P3.]

32 Input Space Partitioning [Figure: the (area, perimeter) feature space partitioned into decision regions for classes P1, P2, and P3.]
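Partitioning the (area, perimeter) feature space amounts to assigning each point to a region; a nearest-class-mean rule is one simple way to produce such a partition. A sketch with invented feature values for the three particle classes:

```python
# Nearest-class-mean classifier in a 2-D feature space (area, perimeter).
# The training feature values are invented for illustration.
def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest_class(x, class_means):
    def dist2(a, b):  # squared Euclidean distance
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(class_means, key=lambda c: dist2(x, class_means[c]))

training = {
    "P1": [(1.0, 4.0), (1.2, 4.2)],
    "P2": [(3.0, 7.0), (3.2, 7.5)],
    "P3": [(6.0, 12.0), (6.5, 11.5)],
}
means = {c: mean(pts) for c, pts in training.items()}
print(nearest_class((3.1, 7.2), means))  # → P2
```

The decision regions induced by this rule are exactly the cells of the partition: every point is labeled by whichever class mean lies closest.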

33 Classification Systems Sensing  Use of a transducer (camera or microphone).  The PR system depends on the bandwidth, resolution, sensitivity, and distortion of the transducer. Segmentation and grouping  Patterns should be well separated and should not overlap.

35 Feature extraction  Discriminative features.  Features invariant with respect to translation, rotation, and scale. Classification  Use the feature vector provided by the feature extractor to assign the object to a category. Post-processing  Exploit context (input-dependent information other than the target pattern itself) to improve performance.
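Scale invariance can be illustrated with a simple derived feature: the ratio perimeter²/area is unchanged under uniform scaling, since scaling a shape by s multiplies its perimeter by s and its area by s². A sketch (the shape measurements are invented):

```python
# perimeter^2 / area is invariant to uniform scaling:
# scaling by s sends perimeter -> s * perimeter and area -> s^2 * area,
# so the ratio (s*p)^2 / (s^2 * a) = p^2 / a is unchanged.
def compactness(perimeter, area):
    return perimeter ** 2 / area

# A 1x1 square and the same square scaled by 3.
print(compactness(4.0, 1.0))   # → 16.0
print(compactness(12.0, 9.0))  # → 16.0
```

Such ratio features let a classifier recognize a shape regardless of how large it appears in the image.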

36 The Design Cycle Data collection Feature Choice Model Choice Training Evaluation Computational Complexity

38 Data Collection  How do we know when we have collected an adequately large and representative set of examples for training and testing the system? Feature Choice  Depends on the characteristics of the problem domain: simple to extract, invariant to irrelevant transformations, insensitive to noise. Model Choice  If we are unsatisfied with the performance of our fish classifier, we may want to jump to another class of model.

39 Training  Use data to determine the classifier. There are many different procedures for training classifiers and choosing models. Evaluation  Measure the error rate (or performance) and switch from one set of features to another.
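Measuring the error rate mentioned in the evaluation step is just counting disagreements between predictions and true labels on held-out data. A sketch (the labels and predictions are invented):

```python
# Error rate: fraction of test examples the classifier gets wrong.
def error_rate(predictions, labels):
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)

labels      = ["salmon", "sea bass", "salmon", "sea bass", "salmon"]
predictions = ["salmon", "salmon",   "salmon", "sea bass", "sea bass"]
print(error_rate(predictions, labels))  # → 0.4
```

Comparing this number across feature sets or model classes is what drives the switch described on the slide.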

40 Application Types Learning Associations Supervised Learning  Classification  Regression Unsupervised Learning Reinforcement Learning

41 Regression [start Matlab demo lecture2.m] Given examples (xi, yi), i = 1, …, N, predict y for a new point x.

42 Linear regression Prediction: y = wx + w0

43 Regression - Example Price of a used car. x: car attributes (car type, model, options, …), y: price. y = g(x | θ), where g(·) is the model and θ are the parameters. Linear model: y = wx + w0
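The model y = wx + w0 can be fitted by ordinary least squares; in one dimension the closed-form solution is w = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², w0 = ȳ − w x̄. A sketch with invented car-price data:

```python
# Ordinary least squares for y = w*x + w0 (one feature).
def fit_line(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    w0 = my - w * mx
    return w, w0

# x: car age in years, y: price in $1000s (invented, on an exact line).
xs = [1, 2, 3, 4]
ys = [18, 16, 14, 12]   # price drops $2000 per year
w, w0 = fit_line(xs, ys)
print(w, w0)  # → -2.0 20.0
```

With noisy real data the fitted line would not pass through every point; least squares then gives the line minimizing the summed squared errors.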

44 Some Regression Applications Navigating a car: angle of the steering wheel. Kinematics of a robot arm: joint angles α1 = g1(x, y), α2 = g2(x, y) to place the hand at position (x, y). Response surface design.

45 Response Surface Design Response surface design is useful for the modeling and analysis of problems in which a response of interest is influenced by several variables and the objective is to optimize this response. For example: find the levels of temperature (x1) and pressure (x2) that maximize the yield (y) of a process.
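Once a response model is fitted, the optimal setting is found where its derivative vanishes. A one-variable simplification of the slide's example (yield as a function of temperature only; the fitted coefficients are hypothetical):

```python
# Quadratic response model in one variable: y = a*x^2 + b*x + c, a < 0.
# dy/dx = 2*a*x + b = 0 gives the maximizing level x* = -b / (2*a).
a, b, c = -0.5, 20.0, 100.0   # hypothetical fitted coefficients

x_star = -b / (2 * a)          # temperature level that maximizes yield
y_star = a * x_star ** 2 + b * x_star + c
print(x_star, y_star)  # → 20.0 300.0
```

With both temperature and pressure, the same idea applies to a quadratic surface in two variables: set both partial derivatives to zero and solve the resulting linear system.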

46 Application Types Learning Associations Supervised Learning  Classification  Regression Unsupervised Learning Reinforcement Learning

47 Unsupervised Learning Learning “what normally happens.” No output. Clustering: grouping similar instances. Example applications:  Customer segmentation in CRM  Image compression: color quantization  Bioinformatics: learning motifs. A motif is a pattern common to a set of nucleic or amino acid subsequences that share some biological property of interest (such as being DNA binding sites for a regulatory protein).
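Grouping similar instances, as above, can be illustrated with a tiny k-means loop: alternately assign each point to its nearest center, then move each center to the mean of its group. A 1-D sketch with k = 2 (the data and initial centers are invented, and chosen so neither group becomes empty):

```python
# Minimal k-means with k = 2 on 1-D data (invented for illustration).
def kmeans_1d(data, c1, c2, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        # Update step: each center moves to its group's mean.
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return c1, c2

data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
print(kmeans_1d(data, 0.0, 10.0))  # centers converge near 1.0 and 8.0
```

No labels are used anywhere: the two groups emerge from the data alone, which is the defining property of unsupervised learning.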

48 Application Types Learning Associations Supervised Learning  Classification  Regression Unsupervised Learning Reinforcement Learning

49 Reinforcement Learning Learning a policy: a sequence of outputs. No supervised output, but delayed reward. Credit assignment problem. Game playing Robot in a maze Multiple agents, partial observability,...

50 Reinforcement Learning (RL) In RL problems, an agent interacts with an environment and iteratively learns a policy, i.e., a mapping from states to actions, so as to maximize a reward signal. [Diagram: the agent takes action at; the environment returns state st+1 and reward rt+1; the agent updates V(st) or Q(st, at).]
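The update of Q(st, at) mentioned above is, in the standard Q-learning rule, Q(s, a) ← Q(s, a) + α [r + γ maxₐ′ Q(s′, a′) − Q(s, a)]. A minimal sketch of one such update on a toy two-state problem (the states, reward, initial Q values, and parameters are all invented for illustration):

```python
# One Q-learning update on a toy two-state problem.
alpha, gamma = 0.5, 0.9   # learning rate and discount (hypothetical)

Q = {("s0", "right"): 0.0, ("s0", "left"): 0.0,
     ("s1", "right"): 2.0, ("s1", "left"): 0.0}

def q_update(Q, s, a, r, s_next, actions=("left", "right")):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# The agent takes "right" in s0, receives reward 0, and lands in s1.
q_update(Q, "s0", "right", 0.0, "s1")
print(Q[("s0", "right")])  # → 0.9
```

Even with zero immediate reward, value flows backward from the promising state s1, which is how delayed reward eventually shapes behavior many steps earlier.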

51 Learning Autonomous Agent [Diagram: the agent acts on the environment and receives perceptions.]

52 Learning Autonomous Agent [Diagram: the agent acts on the environment and receives perceptions, now with a delayed reward added.]

53 Learning Autonomous Agent [Diagram: the agent acts, perceives, and learns new behavior from a delayed reward.] The reward won't be given immediately after the agent's action; usually, it is given only after achieving the goal. This delayed reward is the only clue for the agent's learning.

54 Resources: Datasets UCI Repository UCI KDD Archive Statlib Delve

55 Resources: Journals Journal of Machine Learning Research Machine Learning Neural Computation Neural Networks IEEE Transactions on Neural Networks IEEE Transactions on Pattern Analysis and Machine Intelligence Annals of Statistics Journal of the American Statistical Association...

56 Resources: Conferences International Conference on Machine Learning (ICML) European Conference on Machine Learning (ECML) Neural Information Processing Systems (NIPS) Uncertainty in Artificial Intelligence (UAI) Computational Learning Theory (COLT) International Joint Conference on Artificial Intelligence (IJCAI) International Conference on Artificial Neural Networks (ICANN)