
1 Pattern Classification X

2 Content: General Method, K Nearest Neighbors, Decision Trees, Neural Networks

3 General Method. Training: learning the knowledge or parameters from the data. Testing: applying what was learned to new instances.

4 KNN in Digit Recognition

5 K Nearest Neighbors. Advantages: nonparametric architecture; simple; powerful; requires no training time. Disadvantages: memory intensive; classification/estimation is slow.

6 K Nearest Neighbors. The key issues involved in training this model include setting the variable K (using validation techniques, e.g. cross validation) and choosing the type of distance metric (e.g. the Euclidean measure).

7 Figure: K Nearest Neighbors example. The stored training-set patterns are plotted together with an input pattern X to be classified; dashed lines mark the Euclidean distance to the nearest three patterns.

8 KNN procedure: (1) Store all input data in the training set. (2) For each pattern in the test set, search for the K nearest patterns to the input pattern using the Euclidean distance measure. (3) For classification, compute the confidence for each class as C_i / K, where C_i is the number of patterns among the K nearest patterns belonging to class i. (4) The classification for the input pattern is the class with the highest confidence.
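A minimal sketch of this procedure in Python (the function names and the toy data are illustrative, not from the slides; it assumes numeric feature vectors of equal length):

```python
import math
from collections import Counter

def euclidean(a, b):
    # Euclidean distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train_x, train_y, query, k=3):
    # find the indices of the K nearest stored patterns to the query
    nearest = sorted(range(len(train_x)), key=lambda i: euclidean(train_x[i], query))[:k]
    # confidence for class i is C_i / K, where C_i counts class i among the K nearest
    counts = Counter(train_y[i] for i in nearest)
    confidences = {label: c / k for label, c in counts.items()}
    # the predicted class is the one with the highest confidence
    return max(confidences, key=confidences.get), confidences

# toy usage: two well-separated classes
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [6, 5], [5, 6]]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(train_x, train_y, [4.5, 5.2], k=3))  # ('B', {'B': 1.0})
```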

9 Training parameters and typical settings: number of nearest neighbors. The number of nearest neighbors (K) should be chosen by cross validation over a range of K settings. K = 1 is a good baseline model to benchmark against. A good rule of thumb is that K should be less than the square root of the total number of training patterns.

10 Training parameters and typical settings: input compression. Since KNN is very storage intensive, we may want to compress the data patterns as a preprocessing step before classification. Using input compression will usually result in slightly worse performance. Sometimes, however, compression improves performance because it performs an automatic normalization of the data, which can equalize the effect of each input in the Euclidean distance measure.
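As a hedged illustration of such a preprocessing step, the sketch below compresses the patterns with scikit-learn's PCA before a nearest-neighbour classifier; scikit-learn itself, the component count, and the random placeholder data are assumptions, not something the slides use:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 64))        # e.g. flattened 8x8 digit patterns
y_train = rng.integers(0, 10, size=200)
X_test = rng.normal(size=(20, 64))

# compress 64 inputs down to 10 principal components, then run KNN in that space
pca = PCA(n_components=10).fit(X_train)
knn = KNeighborsClassifier(n_neighbors=3).fit(pca.transform(X_train), y_train)
print(knn.predict(pca.transform(X_test)))
```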

11 Where the Euclidean distance metric fails. (Figure: a pattern to be classified shown next to Prototype A and Prototype B.) Prototype B seems more similar than Prototype A according to Euclidean distance, so the digit "9" is misclassified as "4". A possible solution is to use a distance metric that is invariant to irrelevant transformations.

12 Decision trees. Decision trees are popular for pattern recognition because the models they produce are easier to understand. (Figure: a tree starting at the root node, where A marks the nodes of the tree, B the leaves (terminal nodes), and C the branches (decision points).)

13 Decision trees: binary decision trees. Classification of an input vector is done by traversing the tree, beginning at the root node and ending at a leaf. Each node of the tree computes an inequality (e.g. BMI < 24, yes or no) based on a single input variable. Each leaf is assigned to a particular class.
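A minimal sketch of this traversal in Python; the node/leaf classes, the BMI-and-age tree, and the class labels are all made up for illustration:

```python
class Node:
    """Internal node: tests one input variable against a threshold."""
    def __init__(self, feature, threshold, if_true, if_false):
        self.feature = feature        # index of the input variable
        self.threshold = threshold    # the inequality is x[feature] < threshold
        self.if_true = if_true        # branch taken when the inequality holds
        self.if_false = if_false      # branch taken otherwise

class Leaf:
    """Terminal node, assigned to a particular class."""
    def __init__(self, label):
        self.label = label

def classify(tree, x):
    # traverse from the root node until a leaf is reached
    node = tree
    while isinstance(node, Node):
        node = node.if_true if x[node.feature] < node.threshold else node.if_false
    return node.label

# toy tree: "BMI < 24?" at the root, then "age < 50?" on the no-branch
tree = Node(0, 24, Leaf("normal"),
            Node(1, 50, Leaf("at risk"), Leaf("high risk")))
print(classify(tree, [22.5, 40]))   # normal
print(classify(tree, [27.0, 60]))   # high risk
```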

14 Decision trees: binary decision trees. Since each inequality used to split the input space is based on only one input variable, each node draws a boundary that can be geometrically interpreted as a hyperplane perpendicular to that variable's axis.

15 Decision trees: linear decision trees. Linear decision trees are similar to binary decision trees, except that the inequality computed at each node takes an arbitrary linear form that may depend on multiple variables (e.g. a test on a*x1 + b*x2).

Biological Neural Systems. Neuron switching time: ~10^-3 secs. Number of neurons in the human brain: ~10^10. Connections (synapses) per neuron: ~10^4 to 10^5. Face recognition: ~0.1 secs. The brain shows a high degree of distributed and parallel computation: it is highly fault tolerant, highly efficient, and learning is key.

Excerpt from Russell and Norvig

A Neuron. Computation: input signals → input function (linear) → activation function (nonlinear) → output signal. (Figure: input links carry the activations a_k through weights W_kj into unit j, which forms the weighted-sum input in_j; the output a_j, obtained by applying the activation function to in_j, is sent along the output links.)

Part 1. Perceptrons: a Simple NN. Inputs x_1, ..., x_n (each x_i in the range [0, 1]) enter through weights w_1, ..., w_n. Activation: a = Σ_{i=1..n} w_i x_i. Output: y = 1 if a ≥ θ, and y = 0 if a < θ.

Decision Surface of a Perceptron. (Figure: in the (x_1, x_2) plane the decision line is w_1 x_1 + w_2 x_2 = θ; the weight vector w points perpendicular to this line.)

Linear Separability. Logical AND (y = 1 only for x_1 = x_2 = 1) is linearly separable: a perceptron with w_1 = 1, w_2 = 1, θ = 1.5 computes it. Logical XOR (y = 1 exactly when the inputs differ) is not linearly separable: no setting w_1 = ?, w_2 = ?, θ = ? of a single perceptron works.
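A small check of this slide's claim in Python: the stated weights reproduce AND, while a brute-force search over a coarse grid of weights and thresholds (only an illustration, not a proof) finds no single perceptron that reproduces XOR:

```python
def perceptron(x, w, theta):
    # single threshold unit: fires iff the weighted sum reaches theta
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND = {x: x[0] & x[1] for x in inputs}
XOR = {x: x[0] ^ x[1] for x in inputs}

# w1 = w2 = 1, theta = 1.5 computes logical AND
print(all(perceptron(x, (1, 1), 1.5) == AND[x] for x in inputs))   # True

# no weights/threshold on the grid compute logical XOR
found = any(
    all(perceptron(x, (w1, w2), th) == XOR[x] for x in inputs)
    for w1 in range(-3, 4)
    for w2 in range(-3, 4)
    for th in [t / 2 for t in range(-6, 7)]
)
print(found)   # False
```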

Threshold as a Weight: w_0. Add an extra input x_0 = -1 with weight w_0 = θ. Then a = Σ_{i=0..n} w_i x_i, and y = 1 if a ≥ 0, y = 0 if a < 0. Thus y = sgn(a) = 0 or 1.

Perceptron Learning Rule: w' = w + η (t - y) x, i.e. w_i := w_i + Δw_i = w_i + η (t - y) x_i for i = 1..n. The parameter η is called the learning rate (in Han's book it is a lower-case l); it determines the magnitude of the weight updates Δw_i. If the output is correct (t = y) the weights are not changed (Δw_i = 0). If the output is incorrect (t ≠ y) the weights w_i are changed such that the output of the perceptron for the new weights w'_i moves closer to the target t for the input x_i.

Perceptron Training Algorithm:
Repeat
  for each training vector pair (x, t):
    evaluate the output y when x is the input
    if y ≠ t then form a new weight vector w' according to w' = w + η (t - y) x
    else do nothing
  end for
Until y = t for all training vector pairs, or the number of iterations exceeds k.
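A minimal runnable sketch of this training loop in Python, assuming binary targets t in {0, 1}, the threshold-as-weight trick from the earlier slide (x_0 = -1), and an arbitrary learning rate η = 0.1; the function name and the toy AND data are illustrative:

```python
def train_perceptron(data, eta=0.1, max_iters=100):
    """data: list of (x, t) pairs, x a tuple of inputs, t in {0, 1}."""
    n = len(data[0][0])
    w = [0.0] * (n + 1)                  # w[0] plays the role of the threshold
    for _ in range(max_iters):
        mistakes = 0
        for x, t in data:
            xa = (-1,) + tuple(x)                         # augmented input, x_0 = -1
            a = sum(wi * xi for wi, xi in zip(w, xa))     # activation
            y = 1 if a >= 0 else 0                        # output
            if y != t:                                    # w' = w + eta*(t - y)*x
                w = [wi + eta * (t - y) * xi for wi, xi in zip(w, xa)]
                mistakes += 1
        if mistakes == 0:                # y = t for all training pairs
            return w
    return w

# toy usage: learn logical AND
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(and_data))
```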

Perceptron Learning Example (targets t = 1 or t = -1; augmented input x_0 = 1). Initial weights w = [0.25, -0.1, 0.5], corresponding to the decision line x_2 = 0.2 x_1 - 0.5 (o = 1 on one side, o = -1 on the other). Processing three training pairs in turn:
(x, t) = ([-1, -1], 1): o = sgn(-0.15) = -1, so Δw = [0.2, -0.2, -0.2].
(x, t) = ([2, 1], -1): o = sgn(0.15) = 1, so Δw = [-0.2, -0.4, -0.2].
(x, t) = ([1, 1], 1): o = sgn(-0.35) = -1, so Δw = [0.2, 0.2, 0.2].
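The updates above can be reproduced with a few lines of Python; this assumes the x_0 = 1 augmentation, outputs o = sgn(w·x) in {-1, 1}, the rule Δw = η(t - o)x, and a learning rate η = 0.1 inferred from the Δw values shown on the slide (the slide itself does not state η):

```python
def sgn(a):
    return 1 if a >= 0 else -1

eta = 0.1
w = [0.25, -0.1, 0.5]                         # initial weights: line x2 = 0.2*x1 - 0.5
examples = [([-1, -1], 1), ([2, 1], -1), ([1, 1], 1)]

for (x1, x2), t in examples:
    x = [1, x1, x2]                           # augmented input with x_0 = 1
    o = sgn(sum(wi * xi for wi, xi in zip(w, x)))
    dw = [eta * (t - o) * xi for xi in x]     # delta-w for this example
    w = [wi + dwi for wi, dwi in zip(w, dw)]
    print(f"t={t:+d}  o={o:+d}  dw={dw}")
# prints dw = [0.2, -0.2, -0.2], then [-0.2, -0.4, -0.2], then [0.2, 0.2, 0.2]
```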

Part 2. Multi-Layer Networks. (Figure: the input vector feeds the input nodes, which connect to a layer of hidden nodes, which connect to the output nodes producing the output vector.)

A multi-layer network can be used to learn nonlinear functions such as logical XOR, which a single perceptron cannot represent (w_1 = ?, w_2 = ?, θ = ?). The question is how to set the weights (e.g. w_23, w_35 between the layers in the figure).
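For intuition, here is a sketch of a two-layer network of the same threshold units computing XOR with hand-picked weights (this particular OR/NAND/AND decomposition is only an illustration; it is not taken from the slides, whose question is how such weights can be learned automatically):

```python
def unit(x, w, theta):
    # single threshold unit: fires iff the weighted sum reaches theta
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

def xor_net(x1, x2):
    h1 = unit((x1, x2), (1, 1), 0.5)      # hidden unit 1: logical OR
    h2 = unit((x1, x2), (-1, -1), -1.5)   # hidden unit 2: logical NAND
    return unit((h1, h2), (1, 1), 1.5)    # output unit: AND of the two = XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))  # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```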

28 End