1
Machine Learning Perceptron: Linearly Separable
Supervised Learning
Classification and Regression
K-Nearest Neighbor Classification
Fisher’s Criteria & Linear Discriminant Analysis
Perceptron: Linearly Separable
Multilayer Perceptron & EBP & Deep Learning, RBF Network
Support Vector Machine
Ensemble Learning: Voting, Boosting (AdaBoost)
Unsupervised Learning
Principal Component Analysis
Independent Component Analysis
Clustering: K-means
Semi-supervised Learning & Reinforcement Learning
2
Classification
Credit scoring example:
Inputs are income (x1) and savings (x2); output is low-risk vs. high-risk.
Formally speaking: x = [x1, x2]^T, C ∈ {0, 1}.
Decision rule: if we know P(C | x1, x2), choose C = 1 if P(C = 1 | x1, x2) > 0.5, and C = 0 otherwise.
3
Bayes’ Rule
Bayes rule for one concept
Bayes rule for K > 1 concepts
Decision rule using Bayes rule (Bayes optimal classifier)
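The standard forms of these rules, assuming the usual notation with feature vector x and classes C (one concept) or C_1, ..., C_K:

Bayes rule for one concept:
  P(C \mid x) = \frac{p(x \mid C)\, P(C)}{p(x)}

Bayes rule for K > 1 concepts:
  P(C_i \mid x) = \frac{p(x \mid C_i)\, P(C_i)}{\sum_{k=1}^{K} p(x \mid C_k)\, P(C_k)}

Bayes optimal classifier: choose C_i if P(C_i \mid x) = \max_k P(C_k \mid x).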
4
Losses and Risks
Back to the credit scoring example:
An accepted low-risk applicant increases profit; a rejected high-risk applicant decreases loss.
In general, the loss from an accepted high-risk applicant ≠ the potential gain from a rejected low-risk applicant.
Errors are not symmetric!
Define the expected risk.
Decision rule (minimum risk classifier).
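In the usual notation (assumed here), let \lambda_{ik} be the loss incurred for taking action \alpha_i when the true class is C_k. Then:

Expected risk:
  R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid x)

Minimum risk classifier: choose \alpha_i if R(\alpha_i \mid x) = \min_k R(\alpha_k \mid x).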
5
More on Losses and Risks
6
Discriminant Functions
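A sketch of the standard discriminant-function view (the particular choices of g_i listed here are assumptions, not taken from this slide): define one discriminant g_i(x) per class and choose C_i if g_i(x) = \max_k g_k(x). Common choices are g_i(x) = -R(\alpha_i \mid x), g_i(x) = P(C_i \mid x), or g_i(x) = p(x \mid C_i)\, P(C_i); the discriminants partition the input space into K decision regions \mathcal{R}_i = \{x : g_i(x) = \max_k g_k(x)\}.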
7
Likelihood-based vs. Discriminant-based
Likelihood-based classification
Discriminant-based classification
Estimating the boundaries is enough; there is no need to accurately estimate the densities inside the boundaries!
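In the usual formulation (notation assumed), the contrast is:

Likelihood-based: first estimate the densities p(x \mid C_i) and priors P(C_i), then classify with g_i(x) = \log p(x \mid C_i) + \log P(C_i).

Discriminant-based: assume a parametric model g_i(x \mid \Phi_i) for the class boundary itself and estimate \Phi_i directly from the labeled sample.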
8
Linear Discriminant
Linear discriminant (standard form sketched below)
Advantages:
Simple: O(d) space/computation
Knowledge extraction: weighted sum of attributes; positive/negative weights and their magnitudes are interpretable (credit scoring)
Optimal when the p(x | C_i) are Gaussian with a shared covariance matrix; useful when classes are (almost) linearly separable
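The linear discriminant referred to above, in its standard form:

g_i(x \mid w_i, w_{i0}) = w_i^{T} x + w_{i0} = \sum_{j=1}^{d} w_{ij} x_j + w_{i0}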
9
Two Class Case
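For K = 2 classes a single discriminant suffices; a sketch, assuming the linear form above:

g(x) = g_1(x) - g_2(x) = (w_1 - w_2)^{T} x + (w_{10} - w_{20}) = w^{T} x + w_0

choose C_1 if g(x) > 0, and C_2 otherwise.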
10
Geometric View
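The usual geometric reading of this two-class linear discriminant (a sketch): the decision boundary g(x) = w^{T} x + w_0 = 0 is a hyperplane whose normal vector is w; its distance from the origin is |w_0| / \lVert w \rVert, and the distance of any point x from it is |g(x)| / \lVert w \rVert, with the sign of g(x) indicating on which side of the hyperplane x lies.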
11
Multiple Classes (One-vs-All)
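A sketch of the one-vs-all scheme, assuming the linear discriminants defined earlier: train K discriminants g_i(x \mid w_i, w_{i0}) = w_i^{T} x + w_{i0}, each separating C_i from all other classes, and classify by choosing C_i if g_i(x) = \max_k g_k(x).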
12
Pairwise Separation (One-vs-One)
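A sketch of pairwise separation: train one discriminant g_{ij}(x) for every pair of classes (C_i, C_j), using only the instances of those two classes, giving K(K-1)/2 discriminants in total; assign x to C_i if g_{ij}(x) > 0 for all j \neq i.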
13
Single-Layer Perceptron
Classification
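A minimal sketch (not from the slides) of a single-layer perceptron output used for two-class classification; the sigmoid output unit and the weight layout (w[0..NUMIN-1] for the inputs, w[NUMIN] for the bias) are assumptions chosen to match the homework code at the end.

#include <stdio.h>
#include <math.h>

#define NUMIN 2                      /* number of inputs */

/* single-layer perceptron with sigmoid output: y = 1 / (1 + exp(-(w^T x + w0))) */
double perceptron_out(const double w[NUMIN+1], const double x[NUMIN])
{
    double o = w[NUMIN];             /* bias weight times the constant input 1 */
    for (int i = 0; i < NUMIN; i++)
        o += w[i] * x[i];            /* weighted sum of the inputs */
    return 1.0 / (1.0 + exp(-o));
}

int main(void)
{
    double w[NUMIN+1] = { 1.0, 1.0, -0.5 };   /* illustrative weights only */
    double x[NUMIN]   = { 1.0, 0.0 };
    double y = perceptron_out(w, x);
    printf("y = %f -> %s\n", y, y > 0.5 ? "C1" : "C2");   /* choose C1 if y > 0.5 */
    return 0;
}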
14
Single-Layer Perceptron with K Outputs
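With K > 2 classes, the usual single-layer formulation (assumed here) has one output unit per class and a softmax at the output:

o_i = w_i^{T} x = \sum_{j} w_{ij} x_j, \qquad y_i = \frac{\exp(o_i)}{\sum_{k=1}^{K} \exp(o_k)}

choose C_i if y_i = \max_k y_k.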
15
Gradient Descent
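The generic gradient descent update, in its standard form, for an error function E(w \mid X) and learning rate \eta:

\Delta w_i = -\eta \frac{\partial E}{\partial w_i}, \qquad w_i \leftarrow w_i + \Delta w_i

repeated until convergence; \eta controls the step size taken along the negative gradient.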
16
Training Perceptron
Regression (linear output)
Classification:
Single sigmoid output
K > 2 softmax outputs
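In the standard derivation (notation assumed: targets r^t, outputs y^t, inputs x^t, learning rate \eta), all three cases lead to an update rule of the same form:

Regression, linear output, squared error E = \frac{1}{2} \sum_t (r^t - y^t)^2:
  \Delta w_j = \eta \sum_t (r^t - y^t)\, x_j^t

Single sigmoid output with cross-entropy error: the same update, with y^t = \mathrm{sigmoid}(w^{T} x^t).

K > 2 softmax outputs with cross-entropy error:
  \Delta w_{ij} = \eta \sum_t (r_i^t - y_i^t)\, x_j^t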
17
Training Perceptron
Online learning (instances seen one by one) vs. batch learning (whole sample):
No need to store the whole sample
Problem may change in time
Wear and degradation in system components
Stochastic gradient descent: update after a single pattern
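For the linear-output case above, the difference shows up only in when the update is applied (a sketch):

Batch: \Delta w_j = \eta \sum_t (r^t - y^t)\, x_j^t, one update per pass over the whole sample.

Online / stochastic gradient descent: \Delta w_j = \eta (r^t - y^t)\, x_j^t, one update after each pattern x^t, as in the homework code at the end of this deck.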
18
Expressiveness of Perceptrons
Consider a perceptron with activation a = step function.
Can represent AND, OR, NOT, majority, etc., but not XOR.
Represents a linear separator in input space: w^{T} x + w_0 > 0 (see the sketch below).
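A minimal sketch (the weights are illustrative, not from the slides): a step-function perceptron realizes AND and OR with hand-chosen weights, while no setting of (w1, w2, w0) realizes XOR, since XOR is not linearly separable.

#include <stdio.h>

/* step-activation perceptron: output 1 if w1*x1 + w2*x2 + w0 > 0, else 0 */
static int step_perceptron(double w1, double w2, double w0, int x1, int x2)
{
    return (w1*x1 + w2*x2 + w0 > 0.0) ? 1 : 0;
}

int main(void)
{
    for (int x1 = 0; x1 <= 1; x1++)
        for (int x2 = 0; x2 <= 1; x2++)
            printf("x=(%d,%d)  AND=%d  OR=%d\n", x1, x2,
                   step_perceptron(1.0, 1.0, -1.5, x1, x2),   /* fires only when both inputs are 1 */
                   step_perceptron(1.0, 1.0, -0.5, x1, x2));  /* fires when at least one input is 1 */
    return 0;
}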
19
Homework: Perceptron Learning for OR Problem - Sigmoid Output
/* Perceptron Learning for OR problem with Linear Output */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* parameters */
#define NUMEPOCH 100   /* training epochs */
#define NUMIN    2     /* no. of inputs */
#define NUMP     4     /* no. of training samples */
#define RWMAX    0.1   /* max magnitude of initial weights */
#define wseed    100   /* random seed for weights */
#define eta      0.1   /* learning rate */

double rw[NUMIN+1];    /* weights: rw[0..1] for the inputs, rw[2] for the bias */

int main(void)
{
    /* OR truth table; the third component of each x is the constant bias input 1 */
    double x[NUMP][NUMIN+1] = { {0.,0.,1.}, {0.,1.,1.}, {1.,0.,1.}, {1.,1.,1.} };
    double t[NUMP] = { 0., 1., 1., 1. };
    double y[NUMP], mse;
    int i, p, epoch;

    /* initialize weights uniformly in [-RWMAX, RWMAX] */
    srand(wseed);
    for (i = 0; i < NUMIN+1; i++)
        rw[i] = RWMAX * ((double)rand()*2/RAND_MAX - 1.);

    /* Begin Training */
    for (epoch = 0; epoch < NUMEPOCH; epoch++) {
        mse = 0.;
        for (p = 0; p < NUMP; p++) {
            /* forward pass: linear output y = w^T x */
            y[p] = 0.;
            for (i = 0; i < NUMIN+1; i++) y[p] += rw[i]*x[p][i];
            mse += pow(t[p]-y[p], 2.0)/NUMP;
            /* stochastic gradient (delta-rule) update after this pattern */
            for (i = 0; i < NUMIN+1; i++) rw[i] += eta*(t[p]-y[p])*x[p][i];
        } /* end of p */
        printf("%d %e\n", epoch, mse);
    } /* end of epoch */
    return 0;
} /* end of main */