Practical session on neural network modelling


Practical session on neural network modelling
Antonello Pasini, Rocco Langone
CNR - Institute of Atmospheric Pollution, Rome, Italy

Attribution studies
Attribution investigations aim at identifying the influence of certain forcings on a particular meteo-climatic variable, i.e. at explaining the different weights that several causes have on a single effect. In this field, neural networks have already been used successfully in global studies, as we showed in the morning lesson. Today we will deal with data used in a master's thesis, in which our target was twofold: i) to understand which global atmospheric circulation patterns most influence the temperatures observed in the SW Alpine region over the last 50 years; ii) to reconstruct the observed temperatures from these patterns.

The neural model
The models generally used in attribution studies are layered, feed-forward neural networks, trained by the "error back-propagation" method, which uses an optimization algorithm to minimize an error function. The transfer function is the "logsig":

    g(a) = [1 + exp(-2ka)]^(-1),

where k is the steepness.
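
For concreteness, here is a minimal sketch of the logsig transfer function; the steepness parameter k follows the slide, while the NumPy implementation and the sample values are just illustrative.

```python
import numpy as np

def logsig(a, k=1.0):
    """Logistic ("logsig") transfer function with steepness k: g(a) = 1 / (1 + exp(-2*k*a))."""
    return 1.0 / (1.0 + np.exp(-2.0 * k * a))

# The output is squashed into (0, 1), which is why targets are usually rescaled to that range.
print(logsig(np.array([-2.0, 0.0, 2.0])))  # -> values close to 0, 0.5 and close to 1
```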

Let F be a function to minimize. Gradient descent updates the weights along the negative gradient of F:

    Δw(t) = -η ∇F(w(t)),

while gradient descent with momentum adds a fraction of the previous step:

    Δw(t) = -η ∇F(w(t)) + α Δw(t-1),

where η is the learning rate and α the momentum. For example, in the case of our neural network, the weight-updating equations take this form (excluding the momentum term and the thresholds of the neurons), with the gradient of the error function computed by back-propagating the output error through the layers.
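
As a minimal sketch of these update rules (not the NEWNET code), the function below implements gradient descent with momentum for a generic differentiable function; grad_F, the toy quadratic example and the default values of eta and mom are illustrative assumptions.

```python
import numpy as np

def gradient_descent_momentum(grad_F, w0, eta=0.1, mom=0.9, epochs=100):
    """Iterate w <- w + dw, with dw = -eta * grad_F(w) + mom * dw_previous.
    Setting mom=0 recovers plain gradient descent."""
    w = np.asarray(w0, dtype=float)
    dw = np.zeros_like(w)
    for _ in range(epochs):
        dw = -eta * grad_F(w) + mom * dw
        w = w + dw
    return w

# Toy example: minimise F(w) = ||w||^2, whose gradient is 2w; the iterates converge toward the origin.
print(gradient_descent_momentum(lambda w: 2.0 * w, [3.0, -1.5]))
```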

The all-frame procedure
Our model is endowed with a particular kind of training procedure called "all-frame", a leave-one-out cross-validation. It is a cycle in which, at every step, all the data patterns but one (the "hole") constitute the training set and the remaining pattern is the validation/test set. Training stops in two ways:
- by fixing the number of epochs (i.e. the output cycles);
- by early stopping (when the error function on the training set falls below a threshold).
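
The "all-frame" loop can be sketched as follows; train_model and predict are hypothetical placeholders standing in for NEWNET's training and forward pass, and the toy mean-predictor usage at the bottom is only there to make the sketch runnable.

```python
import numpy as np

def all_frame(X, y, train_model, predict):
    """Leave-one-out ("all-frame") cross-validation: at every step one pattern is the
    'hole' used for validation/test and all the others form the training set."""
    predictions = np.empty_like(y, dtype=float)
    for hole in range(len(y)):
        mask = np.ones(len(y), dtype=bool)
        mask[hole] = False                           # remove the hole from the training set
        model = train_model(X[mask], y[mask])        # train on all remaining patterns
        predictions[hole] = predict(model, X[hole])  # predict the held-out pattern
    return predictions  # one out-of-sample prediction per pattern

# Toy usage with a trivial "model" that always predicts the training-set mean.
X = np.arange(10).reshape(-1, 1).astype(float)
y = X.ravel() * 2.0
preds = all_frame(X, y,
                  train_model=lambda Xtr, ytr: ytr.mean(),
                  predict=lambda m, x: m)
print(preds)
```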

Ensemble runs
The back-propagation learning method is local, and the performance depends in part on the initial random weights, because convergence to the absolute minimum of the error function is not assured. In this situation we have to assess the variability of the performance by choosing several random values for the weights; this is done automatically by our model at every run. It allows us to perform several runs (ensemble runs) in order to explore the landscape of the error function widely. This is particularly useful when the error function presents several local minima in which we can be "trapped".
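
The effect of the random initialization can be illustrated with a toy, deliberately non-convex one-dimensional "error function"; this is only an analogy, not the network's actual error surface, and all the numbers below are assumptions.

```python
import numpy as np

# Illustrative non-convex "error function" with several local minima.
def F(w):
    return np.sin(3.0 * w) + 0.1 * w ** 2

def dF(w):
    return 3.0 * np.cos(3.0 * w) + 0.2 * w

def run(seed, eta=0.01, epochs=2000):
    rng = np.random.default_rng(seed)
    w = rng.uniform(-4.0, 4.0)      # random initial "weight", different at every run
    for _ in range(epochs):
        w -= eta * dF(w)            # plain gradient descent
    return w, F(w)

# Ensemble of runs: different random starts can get trapped in different local minima.
for seed in range(10):
    w, f = run(seed)
    print(f"run {seed}: final w = {w:+.3f}, F(w) = {f:+.3f}")
```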

Some words about NEWNET
With the program NEWNET1.exe you can create and train a neural network in two ways:
1) all the parameters are set in a file (with extension ".par");
2) step by step, scrolling a menu.

We will use, for convenience, the first way. The parameter file and the data file must have the same filename (with different extensions, ".par" and ".txt" respectively) and they have to be in the same folder as NEWNET. Once you have all the files in your folder, you are ready to launch the executable:
- set "l" (load files) and write the filename without extension;
- set "a" to enter the training menu;
- once you are in the training menu, set "f" (all-frame method);
- set the number of patterns of the whole dataset (in our case it is 49);
- push a key to continue;
- the file with the results will appear in your folder (with the same name as the other files and extension ".taf"). If the file already exists, you can overwrite it by choosing "s" (yes) or not (setting "n"); in the latter case you have to write a new filename without extension.

The parameter file *.par
NL: number of layers
LP: total number of data patterns
TP: number of test patterns, a subset of LP (used when you are not training with the all-frame procedure)
RR: variability range of the starting random weights
T: initial threshold of all neurons
ETA: learning rate
MOM: momentum
MSS: threshold for the early stopping (training stops when the MSS on the training set becomes smaller than this threshold)
CY: maximum number of epochs (output cycles)
UPL: number of units per layer
AF: kind of activation (transfer) function (1 = linear, 2 = tansig, 3 = logsig)
ST: steepness
DMS: minimum variation of the MSS between two consecutive epochs, below which the training stops, because of a probable "flat plateau"
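
The slides do not show the actual syntax of the .par file, so the snippet below only gathers the parameters into a Python dictionary as a reference; all the numerical values are assumptions (except the 4-4-1 architecture and the 49 patterns mentioned elsewhere in this session).

```python
# Illustrative parameter set (NOT the real .par file format, whose syntax is not shown here).
params = {
    "NL": 3,            # number of layers (input, hidden, output)
    "LP": 49,           # total number of data patterns (as in this session)
    "TP": 0,            # test patterns, unused when training with the all-frame procedure
    "RR": 0.5,          # variability range of the starting random weights (assumed value)
    "T": 0.0,           # initial threshold of all neurons (assumed value)
    "ETA": 0.1,         # learning rate (assumed value)
    "MOM": 0.9,         # momentum (assumed value)
    "MSS": 1e-3,        # early-stopping threshold on the training-set error (assumed value)
    "CY": 1000,         # maximum number of epochs (assumed value)
    "UPL": [4, 4, 1],   # units per layer: the 4-4-1 net of the example
    "AF": 3,            # activation function: 3 = logsig
    "ST": 1.0,          # steepness (assumed value)
    "DMS": 1e-6,        # minimum MSS variation between epochs before stopping (assumed value)
}
```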

Task 1
We give you 3 data sets, built on 3 different combinations of 4 circulation patterns (indices). We set all the parameters (in the .par file) in a way that allows an optimized reconstruction of the temperatures (no overfitting, no premature stopping, a good number of hidden neurons, etc.). Question: which choice of inputs gives the best results in terms of the Pearson coefficient R between observed and reconstructed temperatures? For each data set, please perform 10 ensemble runs with random initial weights, just to explore the variability of the results. Build an error bar for each file and display the results in a graph.
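
A possible way to build the requested graph, assuming you have collected the 10 R values per data set from the .taf result files; the pearson_r helper, the data-set names and the randomly generated R values are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

def pearson_r(obs, rec):
    """Pearson correlation between observed and reconstructed temperatures."""
    return np.corrcoef(obs, rec)[0, 1]

# Placeholder: R values of 10 ensemble runs for each of the 3 data sets
# (in practice, compute them from the reconstructions in the .taf result files).
rng = np.random.default_rng(0)
r_values = {name: rng.normal(0.7, 0.05, size=10) for name in ["set1", "set2", "set3"]}

names = list(r_values)
means = [np.mean(r_values[n]) for n in names]
stds = [np.std(r_values[n]) for n in names]

plt.errorbar(range(len(names)), means, yerr=stds, fmt="o", capsize=4)
plt.xticks(range(len(names)), names)
plt.ylabel("Pearson R (mean ± std over 10 runs)")
plt.title("Ensemble variability per input combination")
plt.show()
```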

Example
Input = 4 atmospheric teleconnection pattern indices (AO, EA, EBI, SCAN). Target = mean temperature of the extended winter. In this case you can get an interesting result by creating a 4-4-1 net trained with the parameters shown on the slide. This is the best result we obtained. Have you done something better?
[Table on the slide: Pearson coefficient R for the extended winter, for three input combinations: AO, EAWR, ABI, ENSO; AO, EA, EBI, SCAN; NAO, EA, AO, ABI.]

The correlation coefficient between output and target, in the best single run, is about 0.75…

Task 2
Once you have determined the best combination of inputs for correctly reconstructing the extended-winter temperatures, try to evaluate convergence or overfitting problems with this combination by changing the number of epochs in the .par file. You can assess which choice is the optimal one and clearly see the non-convergence and overfitting problems. Please make a graph of the performance (in terms of R) for several values of the number of epochs. Then you can change the number of hidden neurons and see an example of overfitting due to too many neurons.
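
A sketch of how the requested graph could be organized; the grid of epoch values and the R values below are hypothetical placeholders to be replaced with the results you obtain from NEWNET.

```python
import matplotlib.pyplot as plt

# Illustrative grid of CY (epoch) values to try in the .par file.
epoch_values = [50, 100, 200, 500, 1000, 2000]

# Fill in the Pearson R obtained for each run; the numbers below are hypothetical placeholders.
r_values = [0.40, 0.55, 0.68, 0.74, 0.72, 0.65]

plt.plot(epoch_values, r_values, "o-")
plt.xlabel("number of epochs (CY in the .par file)")
plt.ylabel("Pearson R (observed vs reconstructed)")
plt.title("Performance vs training length")
plt.show()
```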

Result 1

Result 2