Neural networks for data mining Eric Postma MICC-IKAT Universiteit Maastricht

Overview
- Introduction: the biology of neural networks (the biological computer, brain-inspired models, basic notions)
- Interactive neural-network demonstrations: the perceptron, the multilayer perceptron, Kohonen’s self-organising feature map
- Examples of applications

A typical AI agent

Two types of learning
- Supervised learning: curve fitting, surface fitting, ...
- Unsupervised learning: clustering, visualisation, ...

An input-output function

Fitting a surface to four points

Regression

Classification

The history of neural networks
- A powerful metaphor
- Several decades of theoretical analysis led to a formalisation in terms of statistics (the Bayesian framework)
- We discuss neural networks from the original, metaphorical perspective

(Artificial) neural networks The digital computer versus the neural computer

The Von Neumann architecture

The biological architecture

Digital versus biological computers: five distinguishing properties
- speed
- robustness
- flexibility
- adaptivity
- context-sensitivity

Speed: the “hundred time steps” argument. “The critical resource that is most obvious is time. Neurons whose basic computational speed is a few milliseconds must be made to account for complex behaviors which are carried out in a few hundred milliseconds (Posner, 1978). This means that entire complex behaviors are carried out in less than a hundred time steps.” (Feldman and Ballard, 1982)

Graceful degradation (diagram: performance as a function of damage)

Flexibility: the Necker cube

vision = constraint satisfaction

And sometimes plain search…

Adaptivity: processing implies learning in biological computers, whereas processing does not imply learning in digital computers.

Context-sensitivity: patterns as emergent properties.

Robustness and context-sensitivity: coping with noise.

The neural computer
- Is it possible to develop a model after the natural example?
- Brain-inspired models: models based on a restricted set of structural and functional properties of the (human) brain

The Neural Computer (structure)

Neurons, the building blocks of the brain

Neural activity (in → out)

Synapses, the basis of learning and memory

Learning: Hebb’s rule (neuron 1 → synapse → neuron 2)

Forgetting in neural networks

Towards neural networks

Connectivity. An example: the visual system is a feedforward hierarchy of neural modules; every module is (to a certain extent) responsible for a certain function.

(Artificial) Neural Networks
- Neurons: activity, nonlinear input-output function
- Connections: weight
- Learning: supervised, unsupervised

Artificial neurons: input (vectors), summation (excitation), output (activation).

Input-output function: a nonlinear (sigmoid) function f(x) = 1 / (1 + e^(-x/a)). The parameter a sets the steepness: as a → 0 the function approaches a step (threshold) function, and as a → ∞ it flattens out.
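A minimal sketch of the artificial neuron of the last two slides (weighted summation followed by the sigmoid squashing function). The function and variable names, and the example values, are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(x, a=1.0):
    """Sigmoid squashing function f(x) = 1 / (1 + exp(-x / a)).
    Small a gives a step-like threshold; large a gives a nearly flat response."""
    return 1.0 / (1.0 + np.exp(-x / a))

def neuron_output(inputs, weights, a=1.0):
    """Weighted summation (excitation) followed by the nonlinear activation."""
    excitation = np.dot(weights, inputs)
    return sigmoid(excitation, a)

# Example: a neuron with three inputs
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.8, 0.2, -0.5])
print(neuron_output(x, w))
```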

Artificial connections (synapses): w_AB is the weight of the connection from neuron A to neuron B.

The Perceptron

Learning in the Perceptron
- Delta learning rule: based on the difference between the desired output t and the actual output o, given input x
- Global error E: a function of the differences between the desired and actual outputs

Gradient Descent
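The delta rule of the previous slide is gradient descent on that global error. Below is a small illustrative sketch (not from the slides) for a single linear output unit, the setting in which the delta rule is usually derived; the learning rate, number of epochs, and toy data are assumptions.

```python
import numpy as np

def train_delta_rule(X, t, eta=0.1, epochs=50):
    """Gradient descent on the squared error E = 1/2 * sum((t - o)^2)
    for a linear unit with output o = w . x + b."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            o = np.dot(w, x) + b          # actual output
            delta = target - o            # difference t - o
            w += eta * delta * x          # delta learning rule
            b += eta * delta
    return w, b

# Toy example: a linearly separable (AND-like) target
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)
w, b = train_delta_rule(X, t)
print(X @ w + b)   # approaches the least-squares fit to the targets
```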

Linear decision boundaries

Minsky and Papert’s connectedness argument

The history of the Perceptron: Rosenblatt (1959), Minsky & Papert (1961), Rumelhart & McClelland (1986).

The multilayer perceptron: an input layer, one or more hidden layers, and an output layer.

Training the MLP
- Supervised learning: each training pattern consists of an input plus the desired output
- In each epoch: present all patterns
- At each presentation: adapt the weights
- After many epochs: convergence to a local minimum
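As a hedged, concrete illustration of this training scheme, the sketch below uses scikit-learn’s MLPClassifier (an assumed stand-in; the lecture used its own demonstrations). The dataset and hyperparameters are purely illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Illustrative two-class problem with a non-linear decision boundary
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer; training repeatedly presents the patterns (epochs)
# and adapts the weights, converging to a local minimum of the error.
mlp = MLPClassifier(hidden_layer_sizes=(10,), activation='logistic',
                    max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)
print("test accuracy:", mlp.score(X_test, y_test))
```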

Phoneme recognition with an MLP. Input: frequencies; output: pronunciation.

Non-linear decision boundaries

Compression with an MLP the autoencoder
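A rough sketch of the autoencoder idea: an MLP trained to reproduce its own input through a narrow hidden layer, which then forms a compressed representation. The use of scikit-learn’s MLPRegressor and the synthetic low-dimensional data are assumptions made for illustration only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# 8-dimensional data that actually lies on a 2-dimensional subspace
X = rng.normal(size=(1000, 2)) @ rng.normal(size=(2, 8))

# Train the network to reproduce its input through a 2-unit bottleneck;
# the hidden layer becomes a compressed (2D) code for the data.
autoencoder = MLPRegressor(hidden_layer_sizes=(2,), activation='tanh',
                           max_iter=5000, random_state=0)
autoencoder.fit(X, X)

reconstruction = autoencoder.predict(X)
print("reconstruction error:", np.mean((X - reconstruction) ** 2))
```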

hidden representation

Restricted Boltzmann machines (RBMs)

Learning in the MLP

Preventing overfitting
- GENERALISATION = performance on the test set
- Early stopping
- Training, test, and validation set
- k-fold cross-validation, including the leave-one-out procedure
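An illustrative sketch (assuming scikit-learn) of two of the ideas listed above: early stopping on a held-out validation fraction, and k-fold cross-validation to estimate generalisation. Settings and data are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# Early stopping: part of the training data is held out as a validation set,
# and training stops when validation performance no longer improves.
mlp = MLPClassifier(hidden_layer_sizes=(30,), early_stopping=True,
                    validation_fraction=0.2, max_iter=2000, random_state=0)

# k-fold cross-validation (k = 5): generalisation estimated on held-out folds.
scores = cross_val_score(mlp, X, y, cv=5)
print("mean test-fold accuracy:", scores.mean())
```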

Image Recognition with the MLP

Hidden Representations

Other applications
- Practical: OCR, financial time series, fraud detection, process control, marketing, speech recognition
- Theoretical: cognitive modelling, biological modelling

Some mathematics…

Perceptron

Derivation of the delta learning rule. Target output t, actual output o, weighted sum h = Σ_i w_i x_i.
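The formulas on this slide were lost in the transcript; under the notation above (with learning rate η), the standard derivation runs roughly as follows.

```latex
E = \tfrac{1}{2}(t - o)^2, \qquad o = h = \sum_i w_i x_i

\frac{\partial E}{\partial w_i} = -(t - o)\,x_i

\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i} = \eta\,(t - o)\,x_i
```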

MLP

Sigmoid function. May also be the tanh function (with outputs in (-1, 1) instead of (0, 1)). Derivative: f’(x) = f(x) [1 – f(x)].
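A quick, purely illustrative numerical check that the logistic sigmoid satisfies the derivative identity above:

```python
import numpy as np

def f(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5, 5, 11)
numerical = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6   # finite-difference derivative
analytical = f(x) * (1.0 - f(x))                  # f'(x) = f(x)[1 - f(x)]
print(np.allclose(numerical, analytical, atol=1e-6))
```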

Derivation generalized delta rule

Error function (LMS)

Adaptation hidden-output weights

Adaptation input-hidden weights
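The equations on the preceding four slides (the LMS error function and the weight adaptations of the generalized delta rule) did not survive the transcript; in the usual notation, with output units k, hidden units j with activations y_j, inputs x_i, sigmoid f, and learning rate η, they read:

```latex
E = \tfrac{1}{2}\sum_k (t_k - o_k)^2

\text{hidden-to-output: } \delta_k = (t_k - o_k)\,f'(\mathrm{net}_k), \qquad \Delta w_{jk} = \eta\,\delta_k\,y_j

\text{input-to-hidden: } \delta_j = f'(\mathrm{net}_j)\sum_k \delta_k\,w_{jk}, \qquad \Delta w_{ij} = \eta\,\delta_j\,x_i
```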

Forward and Backward Propagation

Decision boundaries of Perceptrons: straight lines (surfaces), linearly separable.

Decision boundaries of MLPs Convex areas (open or closed)

Decision boundaries of MLPs Combinations of convex areas

Learning and representing similarity

Alternative conception of neurons
- Neurons do not take the weighted sum of their inputs (as in the perceptron), but measure the similarity of the weight vector to the input vector
- The activation of the neuron is a measure of similarity: the more similar the weight vector is to the input, the higher the activation
- Neurons represent “prototypes”
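A minimal sketch of such a prototype neuron. The Euclidean distance and the Gaussian similarity (as in radial-basis-function units) are illustrative choices, not prescribed by the slides.

```python
import numpy as np

def prototype_activation(x, w, width=1.0):
    """Activation grows as the input x gets closer to the weight vector w
    (the neuron's 'prototype'); here a Gaussian of the Euclidean distance."""
    distance = np.linalg.norm(x - w)
    return np.exp(-(distance ** 2) / (2 * width ** 2))

prototype = np.array([1.0, 0.0])
print(prototype_activation(np.array([0.9, 0.1]), prototype))   # close -> high
print(prototype_activation(np.array([-1.0, 2.0]), prototype))  # far   -> low
```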

Coarse coding

2nd order isomorphism

Prototypes for preprocessing

Kohonen’s SOFM (Self-Organising Feature Map)
- Unsupervised learning
- Competitive learning
(Diagram: n-dimensional input, output map, winning neuron)

Competitive learning
- Determine the winner: the neuron whose weight vector has the smallest distance to the input vector
- Move the weight vector w of the winning neuron towards the input i
(Diagram: the weight vector w before and after learning, relative to the input i)
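A compact sketch of a single competitive-learning step (winner selection followed by moving the winner’s weight vector towards the input); the learning rate and toy values are assumptions.

```python
import numpy as np

def competitive_step(weights, x, eta=0.5):
    """weights: one row per neuron. Pick the winner (smallest distance to x)
    and move its weight vector towards the input."""
    distances = np.linalg.norm(weights - x, axis=1)
    winner = np.argmin(distances)
    weights[winner] += eta * (x - weights[winner])
    return winner

rng = np.random.default_rng(0)
W = rng.random((6, 2))                       # 6 competitive neurons, 2D inputs
winner = competitive_step(W, np.array([0.9, 0.1]))
print("winner:", winner)
```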

Kohonen’s idea
- Impose a topological order onto the competitive neurons (e.g., a rectangular map)
- Let neighbours of the winner share the “prize” (the “postcode lottery” principle)
- After learning, neurons with similar weights tend to cluster on the map

Biological inspiration

Topological order: neighbourhoods
- Square grid: winner (red) and its nearest neighbours
- Hexagonal grid: winner (red) and its nearest neighbours

(Diagram: inputs and the output map)

A simple example: a topological map of 2 × 3 neurons and two inputs. (Diagram: 2D input, input weights, and their visualisation.)

Weights before training

Input patterns (note the 2D distribution)

Weights after training

Another example
- Input: uniformly randomly distributed points
- Output: a map of 20 × 20 neurons
- Training: starting with a large learning rate and neighbourhood size, both are gradually decreased to facilitate convergence
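A condensed, illustrative sketch of this training procedure: a rectangular map with a Gaussian neighbourhood, where the learning rate and neighbourhood radius decay over training. Grid size, decay schedules, and the data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = 20                                        # 20 x 20 map
W = rng.random((grid, grid, 2))                  # weights for 2D inputs
rows, cols = np.meshgrid(np.arange(grid), np.arange(grid), indexing='ij')

def train_som(X, epochs=20, eta0=0.5, radius0=10.0):
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for x in X:
            frac = step / n_steps
            eta = eta0 * (1 - frac)                   # decaying learning rate
            radius = max(radius0 * (1 - frac), 1.0)   # decaying neighbourhood
            # Winner: the unit whose weight vector is closest to the input
            d = np.linalg.norm(W - x, axis=2)
            wi, wj = np.unravel_index(np.argmin(d), d.shape)
            # Gaussian neighbourhood on the grid around the winner
            grid_dist2 = (rows - wi) ** 2 + (cols - wj) ** 2
            h = np.exp(-grid_dist2 / (2 * radius ** 2))
            W[:] = W + eta * h[..., None] * (x - W)
            step += 1

X = rng.random((500, 2))                         # uniformly distributed 2D points
train_som(X)
```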

Weights visualisation

Dimension reduction: 3D input, 2D output.

Adaptive resolution: 2D input, 2D output.

Output map representation

Application of SOFM: examples (input) and the SOFM after training (output).

Visual features (biologically plausible)

Face Classification

Colour classification

Car classification

Relation with statistical methods 1: Principal Components Analysis (PCA). (Diagram: projections of the data onto the first two principal components, pca1 and pca2.)
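For comparison, a brief hedged sketch of PCA with scikit-learn, projecting synthetic data onto its first two principal components; the data and settings are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))   # 5D data on a 2D subspace

pca = PCA(n_components=2)
projections = pca.fit_transform(X)            # coordinates on pca1 and pca2
print(pca.explained_variance_ratio_)
```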

Relation with statistical methods 2
- Multi-Dimensional Scaling (MDS)
- Sammon mapping
- Both are based on the distances between points in the high-dimensional space
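A brief illustrative sketch using scikit-learn’s metric MDS on precomputed distances. Sammon mapping, a closely related variant that gives more weight to small distances, is not part of scikit-learn, so plain MDS serves as a stand-in here; all settings are assumptions.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                    # high-dimensional points

D = pairwise_distances(X)                         # distances in the original space
mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
embedding = mds.fit_transform(D)                  # 2D layout preserving distances
print(embedding.shape)
```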