Supervised and Unsupervised Learning and Application to Neuroscience, Course CA6b-4.



A Generic System
Input variables: x
Hidden variables: h
Output variables: y
Training examples: pairs (x, t) of inputs and target outputs
Parameters: θ, adjusted during learning

Different types of learning
Supervised learning: 1. classification (discrete y); 2. regression (continuous y).
Unsupervised learning (no target y): 1. clustering (h = different groups or types of data); 2. density estimation (h = parameters of a probability distribution); 3. dimensionality reduction (h = a few latent variables describing high-dimensional data).
Reinforcement learning (y = actions).

Handwritten digit recognition (supervised). x: pixelized or pre-processed image. t: class of pre-classified digits (training example). y: digit class (computed by the ML algorithm). h: contours, left/right handedness…

Regression (supervised): learn parameters w so that the output y(x; w) matches the target output t.

Linear classifier: given labeled training examples from two classes, where should the decision boundary lie?

Linear classifier: y = H(w·x + b), where H is the Heaviside step function (H(a) = 0 for a < 0, 1 for a ≥ 0). The decision boundary is the hyperplane w·x + b = 0.
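
A minimal sketch of this classifier in Python/NumPy; the weights and toy inputs below are invented for illustration (learning them comes next):

```python
import numpy as np

def heaviside(a):
    """Heaviside step function: 0 for a < 0, 1 for a >= 0."""
    return (a >= 0).astype(float)

def linear_classifier(X, w, b):
    """Class label y = H(w.x + b); the boundary w.x + b = 0 is a hyperplane."""
    return heaviside(X @ w + b)

# Toy example with hand-picked parameters (illustrative values only).
w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 1.0], [0.5, 3.0]])
print(linear_classifier(X, w, b))  # [1. 0.]
```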

Assumptions: the two classes are multivariate Gaussians with the same covariance, and the two classes are equiprobable.

How do we compute the output? Compute a = w·x + b: positive a means class 1, negative a means class 0. The weight vector w is orthogonal to the decision boundary.

How do we learn the parameters? Option 1: linear discriminant analysis (LDA), i.e. direct parameter estimation from the class means and the common covariance.

How do we learn the parameters? Option 2: minimize the mean-squared error E(w) = Σ_i (t_i − y(x_i))², either by gradient descent on the full training set, w ← w − η ∂E/∂w, or by stochastic gradient descent, updating after each example: Δw = η (t_i − y_i) x_i.

Problem: the Heaviside function H is not differentiable, so the stochastic gradient cannot be computed through the output.

Solution: change y to the expected class, y = σ(w·x + b), where σ(a) = 1/(1 + e^(−a)) is the logistic function. The output is now the expected class rather than a hard 0/1 decision.

Stochastic gradient descent now applies: Δw = η (t_i − y_i) σ′(a_i) x_i. Since σ′(a) is always positive, it can be absorbed into the learning rate η, giving the perceptron learning rule (delta rule): Δw = η (t_i − y_i) x_i.
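
A sketch of this rule on synthetic data; the two Gaussian clouds, the learning rate eta and the epoch count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_delta_rule(X, t, eta=0.1, epochs=100):
    """Stochastic gradient descent with the delta rule: dw = eta * (t - y) * x."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):       # one example at a time
            y = sigmoid(X[i] @ w + b)      # expected class in (0, 1)
            w += eta * (t[i] - y) * X[i]
            b += eta * (t[i] - y)
    return w, b

# Two Gaussian clouds, one per class (synthetic data for illustration).
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
t = np.array([0.0] * 50 + [1.0] * 50)
w, b = train_delta_rule(X, t)
acc = np.mean((sigmoid(X @ w + b) > 0.5) == t)
print(f"training accuracy: {acc:.2f}")
```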

Application 1: Neural population decoding

How do we find the weights w? [Figure: a population of neurons feeding a linear readout with weights w.]

Linear Discriminant Analysis (LDA): w = C⁻¹ (μ_right − μ_left), where C is the covariance matrix of the neural responses, and μ_right and μ_left are the average neural responses when motion is right or left.
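
A sketch of LDA decoding on simulated population responses; all response statistics below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated responses of 20 neurons over 200 trials per motion direction.
n_neurons, n_trials = 20, 200
mu_left = rng.uniform(5, 15, n_neurons)            # mean rates, motion left
mu_right = mu_left + rng.normal(0, 2, n_neurons)   # mean rates, motion right
R_left = rng.normal(mu_left, 3, (n_trials, n_neurons))
R_right = rng.normal(mu_right, 3, (n_trials, n_neurons))

# LDA weights: w = C^{-1} (mu_right - mu_left), C = response covariance.
C = np.cov(np.vstack([R_left - R_left.mean(0), R_right - R_right.mean(0)]),
           rowvar=False)
w = np.linalg.solve(C, R_right.mean(0) - R_left.mean(0))
b = -w @ (R_right.mean(0) + R_left.mean(0)) / 2

# Decode: project each trial onto w; positive -> "right", negative -> "left".
accuracy = 0.5 * np.mean(R_right @ w + b > 0) + 0.5 * np.mean(R_left @ w + b < 0)
print(f"decoding accuracy: {accuracy:.2f}")
```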

Neural network interpretation: each neuron is a classifier, and the connections are learned with the delta rule.

Limitation of the 1-layer perceptron: it can only solve linearly separable problems. AND is linearly separable; XOR is not.

Extension: the multilayer perceptron, towards a universal computer.

Learning a multi-layer neural network with backprop: towards a universal computer.

Backpropagation: compute the initial error at the output, δ_out = t − y; backpropagate the errors through the network, δ_j = σ′(a_j) Σ_k w_jk δ_k; then apply the delta rule to each connection, Δw_ij = η δ_j x_i.
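
A minimal backprop sketch on XOR, the problem the 1-layer perceptron cannot solve; the hidden-layer size, learning rate and iteration count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# XOR: not linearly separable, but solvable with one hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)
eta = 0.5

for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                  # hidden layer
    y = sigmoid(h @ W2 + b2)                  # output layer
    delta2 = (t - y) * y * (1 - y)            # initial error at the output
    delta1 = (delta2 @ W2.T) * h * (1 - h)    # backpropagated error
    W2 += eta * h.T @ delta2; b2 += eta * delta2.sum(axis=0)  # delta rule
    W1 += eta * X.T @ delta1; b1 += eta * delta1.sum(axis=0)

print(np.round(y.ravel(), 2))  # approaches [0, 1, 1, 0]
```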

Big problem: overfitting. Backprop was abandoned in the late eighties…

Compensate with very large datasets: a 9th-order polynomial that overfits a handful of points fits well when data is plentiful. Hence the resurgence of backprop with big data.
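
A sketch of this effect, assuming the classic setup of noisy samples from a sine curve: the same 9th-order polynomial that overfits 10 points generalizes well from 1000.

```python
import numpy as np

rng = np.random.default_rng(0)

def test_error(n_points, degree=9):
    """Least-squares fit of a degree-9 polynomial to noisy sin(2*pi*x) samples."""
    x = rng.uniform(0, 1, n_points)
    t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n_points)
    coeffs = np.polyfit(x, t, degree)
    x_test = np.linspace(0, 1, 200)
    return np.sqrt(np.mean((np.polyval(coeffs, x_test)
                            - np.sin(2 * np.pi * x_test)) ** 2))

print(f"test RMSE, 10 points:   {test_error(10):.2f}")    # large: overfitting
print(f"test RMSE, 1000 points: {test_error(1000):.2f}")  # small: noise averaged out
```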

Deep convolutional networks. Google: image recognition, speech recognition. Trained on billions of examples…

Single neurons as 2-layer perceptrons (Poirazi and Mel, 2001, 2003).

Regression (supervised): learn parameters w so that the output y(x; w) matches the target output t.

Regression in general: y(x; w) = Σ_j w_j φ_j(x), a weighted sum of basis functions φ_j fit to the target output t.

Gaussian noise assumption: t = y(x; w) + ε with ε ~ N(0, σ²), so maximizing the likelihood of the data is equivalent to minimizing the mean-squared error.

How to learn the parameters? Gradient descent: Δw_j = η Σ_i (t_i − y(x_i)) φ_j(x_i).
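
A sketch with Gaussian (radial basis function) features; the centers, width, learning rate and target function are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

centers = np.linspace(0, 1, 10)   # basis function centers (assumed)
width = 0.1

def phi(x):
    """Gaussian basis functions evaluated at inputs x: shape (len(x), 10)."""
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))

# Noisy samples of a target function.
x = rng.uniform(0, 1, 100)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 100)

# Gradient descent on the mean-squared error: dw = eta * Phi^T (t - Phi w).
Phi = phi(x)
w = np.zeros(len(centers))
eta = 0.01
for _ in range(2000):
    w += eta * Phi.T @ (t - Phi @ w)

print(f"training RMSE: {np.sqrt(np.mean((t - Phi @ w) ** 2)):.3f}")
```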

But: overfitting...

How to learn the parameters? Gradient descent: Δw_j = η Σ_i (t_i − y(x_i)) φ_j(x_i).

Application 3: Neural coding: function approximation with tuning curves

“Classical view”: multiple spatial maps

Application 3: function approximation in sensorimotor areas. In parietal cortex, retinotopic cells are gain-modulated by eye position, and also by head position, arm position… (Snyder and Pouget, 2000).

Multisensory integration = multidirectional coordinate transform. Model prediction: partially shifting tuning curves (Pouget, Duhamel and Deneve, 2004). Experimental validation: Avillac et al., 2005.

Unsupervised learning: a first example of many.

Principal component analysis (unsupervised learning): represent the data in an orthogonal basis whose components are uncorrelated. Note: uncorrelated is not the same as independent.

Principal component analysis and dimensionality reduction: keep only K ≪ N components, x ≈ Σ_{k=1..K} c_k u_k + “noise”.

Example with N = 2 and K = 1: two-dimensional data projected onto its first principal axis.

One solution: eigenvalue decomposition of the covariance matrix, C = U D Uᵀ, where the columns of U are the (orthogonal) principal axes and the diagonal matrix D holds the variance along each axis.
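
A sketch of PCA by eigendecomposition on synthetic 2-D data; the mixing matrix below is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2-D data (synthetic, for illustration): N = 2 dimensions.
X = rng.normal(0, 1, (500, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])
X = X - X.mean(axis=0)                 # center the data

C = np.cov(X, rowvar=False)            # covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)   # eigendecomposition (ascending order)
order = np.argsort(eigvals)[::-1]      # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep K = 1 component: project onto the first principal axis.
K = 1
Z = X @ eigvecs[:, :K]                 # low-dimensional representation
X_hat = Z @ eigvecs[:, :K].T           # reconstruction ("signal" part)
print("variance explained:", eigvals[:K].sum() / eigvals.sum())
```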

How do we “learn” the parameters when K ≪ N? Standard iterative method: find the first component as the direction of maximal variance, then remove it and repeat for the other components.

PCA by gradient descent: alternating “expectation” and “maximization” steps; in neural network form this gives the generalized Oja rule.
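
A sketch of the single-component Oja rule extracting the first principal component online (the generalized rule on the slide extracts several components; only the first is shown here, and the data and learning rate are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated, centered 2-D data, as in the previous sketch.
X = rng.normal(0, 1, (2000, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])
X = X - X.mean(axis=0)

# Oja rule: dw = eta * y * (x - y * w), with y = w.x.
# The -y^2 w term keeps |w| near 1, so w converges to the first principal axis.
w = rng.normal(0, 0.1, 2)
eta = 0.001
for x in X:
    y = w @ x
    w += eta * y * (x - y * w)

# Compare with the leading eigenvector of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
pc1 = eigvecs[:, -1]
print("alignment |cos angle|:", abs(w @ pc1) / np.linalg.norm(w))
```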

Natural images: Weights learnt by PCA

Application of PCA: analysis of large neural datasets (Machens, Brody and Romo, 2010). [Figure: the principal components separate into time-related and frequency-related parts.]