Virgile TAVER 1, 2 Anne JOHANNET 2 Valérie BORRELL ESTUPINA 1

Presentation transcript:

How to deal with non-stationary conditions in hydrology using neural networks
Virgile TAVER (1, 2), Anne JOHANNET (2), Valérie BORRELL ESTUPINA (1), Séverin PISTRE (1)
(1) HSM, HydroSciences Montpellier, UMR5569, Université de Montpellier 2, France (2) Ecole des Mines d’Alès, France
Session: Testing simulation and forecasting models in non-stationary conditions. IAHS General Assembly in Göteborg, July 2013.

Hello, my name is Valérie Borrell and I will be presenting research on the application of neural networks to the database supplied in this session. This study is entitled “Non-Recurrent vs Recurrent Neural Network Models for non-stationary modelling”, or “How to deal with non-stationary conditions in hydrology using neural networks”. This work was made possible by a collaboration between HydroSciences Montpellier and EMA (Ecole des Mines d’Alès). The studied catchments are Durance and Fernow.

Neural Networks (NN)
Neural networks are increasingly used in hydrology: for prediction; for forecasting floods; for modelling unknown relations.
Neural networks learn their behaviour by training; nevertheless: they are sensitive to overfitting (bias-variance dilemma); model complexity must be kept as low as possible; regularization methods must be used.

Neural networks are increasingly used in hydrology: for predicting water resources, for forecasting floods, and for modelling unknown relations between rain and flow when the processes involved in this relation are not identified. Neural networks learn their behaviour by training (a step more traditionally called calibration). Nevertheless, they are sensitive to overfitting (what we call the bias-variance dilemma). The model must therefore be kept as simple as possible, and regularization methods must be used to avoid overfitting.
Neural Networks Methods Results Conclusion

Neural Networks
Neuron definition: weighted sum, then non-linear function (f).
Neural network architecture: multilayer perceptron; universal approximation; parsimony (for statistical models).

The principle is that one neuron calculates a weighted sum of its input variables, then applies a non-linear function (f), often a sigmoid, to this sum to produce an output signal. Once the input variables are identified, the neural network architecture has to be specified. We chose a multilayer perceptron, which has the universal approximation property and is parsimonious (when considered as a statistical model). Between the input variables and the output signal, the output neuron makes its calculation from the outputs of a hidden layer of neurons. The complexity of these hidden layers has to be defined.
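As an illustration of the neuron and multilayer perceptron just described, here is a minimal sketch in plain Python. The function names, the bias terms and the linear output neuron are assumptions made for the example; the slides do not specify them.

```python
import math

def sigmoid(x):
    """Sigmoid non-linearity f, as mentioned in the notes."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, f=sigmoid):
    """One neuron: weighted sum of the inputs, then the non-linear function f."""
    weighted_sum = bias + sum(w * x for w, x in zip(weights, inputs))
    return f(weighted_sum)

def mlp_forward(inputs, hidden_layer, output_weights, output_bias):
    """Multilayer perceptron with one hidden layer.

    hidden_layer is a list of (weights, bias) pairs, one per hidden neuron.
    The output neuron is taken to be linear here (an assumption).
    """
    hidden = [neuron(inputs, w, b) for w, b in hidden_layer]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))
```

With all weights set to zero each hidden neuron outputs 0.5, which gives a quick sanity check of the forward pass.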

Neural Network: design methodology
Minimization of the quadratic error during training by the Levenberg-Marquardt rule.
Database utilization: one subset for training (Pi, i=1,5); one subset for stopping (early stopping with records ≠ (Pi, i=1,5)); one subset for test (in level 3), different from the training and stop sets.
Complexity selection: definition of the architecture using cross-validation (included inside the training period): input variables (ui); number of hidden layers and hidden neurons; selection using the Nash criterion.

Once the neural network architecture has been defined, the calibration methodology has 3 components.
The method for minimizing the distance between observations and simulations: here, the quadratic error is minimized during the training period with the Levenberg-Marquardt rule.
The method for using the database: one subset of the available period is used for training (Pi, i=1,5); one subset is used for stopping (early stopping with records ≠ (Pi, i=1,5)); one subset is used for testing (in level 3: Pi?), and this set has to be different from the training and stop sets.
The method for selecting the complexity of the NN architecture: cross-validation allows the definition of the input variables (ui) to use and of the number of hidden layers and hidden neurons, with selection using the Nash criterion.
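The two scores named above can be sketched as follows. The Nash criterion is taken here to be the usual Nash-Sutcliffe efficiency, which is an assumption since the slides do not give the formula; the function names and the candidate-selection helper are hypothetical.

```python
def quadratic_error(observed, simulated):
    """Sum of squared differences between observations and simulations."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))

def nash_criterion(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    variance_term = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - quadratic_error(observed, simulated) / variance_term

def select_by_nash(observed, candidates):
    """Return the name of the candidate simulation with the highest Nash criterion.

    candidates maps a (hypothetical) architecture name to its simulated series
    on the cross-validation subset.
    """
    return max(candidates, key=lambda name: nash_criterion(observed, candidates[name]))
```

A simulation equal to the observations scores 1, and a simulation equal to the observed mean scores 0, which is why values such as 0.4 or 0.7 are read as partial skill.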

3 ways of modelling
3 models can be investigated regarding the postulated model. As an analogy, consider calculating the price of a “baguette” (French bread). 3 methods can be used to estimate such a price:
1- Take into account the price of the primary ingredients (flour, water ...) and of energy, and compute the price for a specific recipe.
2- If the state measurement is good: take into account the price measured yesterday, and anticipate a one-day evolution (one-step predictor).
3- If the state measurement isn't good: take into account the price estimated yesterday, and anticipate the evolution (multistep predictor).

3 ways of modelling
The 3 models that can be investigated regarding the postulated model are:
Computing discharge from rainfall and physics: static system modelling => static NN model.
Computing discharge from the state measurement: dynamic system modelling => directed NN model.
Computing discharge from the state estimation: also dynamic system modelling => non-directed NN model.
Each NN is named after its postulated model: static, directed (by the measurements of a state variable), or non-directed (no measurement of a state variable).

1 NN model for each postulated model
yp: observed output of the physical process; u(k): observed input of the physical process (rain); b(k): noise; q-1: one-step delay; g: output of the NN model φRN.
Postulated model 1: yp(k+1) = φ(u(k)). Static NN model: g(k+1) = φRN(u(k)).
Postulated model 2 (noise on the state): yp(k+1) = φ(yp(k), u(k)) + b(k+1). Directed NN model: g(k+1) = φRN(yp(k), u(k)).
Postulated model 3 (noise on the measurement): xp(k+1) = φ(xp(k), u(k)), yp(k+1) = xp(k+1) + b(k+1). Non-directed NN model: g(k+1) = φRN(g(k), u(k)).

For each postulated model, we build 1 NN model. For the static NN model (the first postulated model), the physical assumption is that the observed inputs u(k) explain the observed output yp(k+1) at the next time step.

1 NN model for each postulated model
Postulated model 2 (noise on the state): yp(k+1) = φ(yp(k), u(k)) + b(k+1). Directed NN model: g(k+1) = φRN(yp(k), u(k)).

For the directed NN model, we suppose that the observed output at time k explains part of the simulated output at time k+1. Here q-1 is the one-step delay that feeds yp(k) back as an input.

1 NN model for each postulated model
Postulated model 3 (noise on the measurement): xp(k+1) = φ(xp(k), u(k)), yp(k+1) = xp(k+1) + b(k+1). Non-directed NN model: g(k+1) = φRN(g(k), u(k)).

For the last postulated model, we suppose that the direct output is not the observed one, because the measurement is disturbed by noise. So for the non-directed NN model, we suppose that the simulated output at time k explains part of the simulated output at time k+1.
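The difference between the directed and the non-directed model can be sketched as two simulation loops. This is only an illustration: `toy_phi` is a hypothetical stand-in for the trained network φRN, and the function names are not from the slides.

```python
def simulate_directed(phi, rain, observed):
    """Directed (non-recurrent) model: g(k+1) = phi(yp(k), u(k)).

    observed[k] is the measured output yp(k) and rain[k] is the input u(k).
    Each prediction restarts from a measurement (one-step predictor).
    """
    return [phi(y, u) for y, u in zip(observed, rain)]

def simulate_non_directed(phi, rain, g0):
    """Non-directed (recurrent) model: g(k+1) = phi(g(k), u(k)).

    The model feeds back its own previous output (multistep predictor).
    """
    g = g0
    outputs = []
    for u in rain:
        g = phi(g, u)
        outputs.append(g)
    return outputs

def toy_phi(previous_output, u):
    """Hypothetical stand-in for the trained network (a simple linear recursion)."""
    return 0.5 * previous_output + 0.5 * u
```

With the same inputs the two loops diverge as soon as the measurements differ from the model's own trajectory, which is what lets the directed model follow a changing system.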


3 ways to deal with non-stationarity
How to adapt the model to the changing environment and process?
Changing process: the observed data are used to adapt parameter values at each time step => adaptivity (possible on the 3 models).
Changing environment: the observed data are used as input data at each time step => directed model (only for the directed model).
Changing environment: the observed data are used to modify inaccurate inputs at each time step => data assimilation (possible on the 3 models). A variational approach is used in this work to modify rainfall, temperature and snow at each time step.

The model is fully differentiable at each time step, so the Jacobian is computed for each time step, and the observed-minus-simulated error at the output can be traced back to the variable, parameter or input that we want to correct. The second solution is only possible with the directed model. Solutions 1 and 3 are possible whatever model is chosen, so these are the solutions (adaptivity or assimilation) we will work on.
I am now bothered by the changing process / changing environment classification. Why? In the last case, if we find a non-stationarity in snow, for example, our model will behave badly in the future with respect to this snow input, so we modify the input to simulate better? We should modify only the parameters of this process. In that case it is the process using the snow that has changed, not just the snow? In short, I do not understand the distinction between changing process and changing environment.
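The idea of correcting inaccurate inputs from the output error can be sketched as a gradient descent on the inputs over a window. This is not the authors' algorithm: the finite-difference gradient replaces the Jacobian they describe, and the toy model, function names, step size and iteration count are all assumptions chosen for illustration.

```python
def simulate(model, inputs, g0):
    """Run a recurrent model g(k+1) = model(g(k), u(k)) over a window of inputs."""
    g, outputs = g0, []
    for u in inputs:
        g = model(g, u)
        outputs.append(g)
    return outputs

def cost(model, inputs, observed, g0):
    """Quadratic error between observed and simulated outputs on the window."""
    simulated = simulate(model, inputs, g0)
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))

def assimilate_inputs(model, inputs, observed, g0, step=0.1, n_iter=200, eps=1e-6):
    """Correct the inputs by descending the gradient of the quadratic error.

    The gradient is approximated by finite differences (an assumption made
    to keep the sketch short; an analytic Jacobian would be used in practice).
    """
    corrected = list(inputs)
    for _ in range(n_iter):
        base = cost(model, corrected, observed, g0)
        gradient = []
        for i in range(len(corrected)):
            bumped = corrected[:]
            bumped[i] += eps
            gradient.append((cost(model, bumped, observed, g0) - base) / eps)
        corrected = [u - step * d for u, d in zip(corrected, gradient)]
    return corrected

def toy_model(previous_output, u):
    """Hypothetical differentiable model used only to exercise the sketch."""
    return 0.5 * previous_output + 0.5 * u
```

A usage example would start from deliberately wrong inputs, call `assimilate_inputs` with the observed outputs, and check that `cost` is lower for the corrected inputs than for the original ones.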

Application: Fernow watershed, Durance watershed
Only models able to represent dynamic systems were developed:
Directed (non-recurrent) NN model: g(k+1) = φRN(yp(k), u(k)).
Non-directed (recurrent) NN model: g(k+1) = φRN(g(k), u(k)).
2 ways of dealing with non-stationarity (adaptivity, assimilation), plus the case with no option.

We know that the static model, which computes discharge from rainfall alone, performs poorly, so we do not try it. Static model: the output is constant if the input is constant. Dynamic model: the output varies even if the input is constant. Here we observe a dynamic system, so it is normal that the static model does not work.

Fernow watershed, USA (0.2 km²)
Complete period: 01/01/1959 - 31/12/2009
Snowmelt, and a sampling interval that is too coarse (daily data for a very small basin).
Calibration periods:
P1: 01/01/1959 - 31/12/1968: forest cut of the lower part of the basin (Mar - Oct 1964); forest cut of the upper part of the basin (Oct 1967 - Feb 1968)
P2: 01/01/1969 - 31/12/1978: plantation of fir trees (Mar - Apr 1973)
P3: 01/01/1979 - 31/12/1988
P4: 01/01/1989 - 31/12/1998
P5: 01/01/1999 - 31/12/2008

How to adapt the model to the changing environment and process? If we have a changing process, then we adapt the parameters continuously; this is what we call “adaptivity”. (How are they adapted in practice? At what point?) If we are in a changing environment, then we have to take measured data into account continuously: either straightforwardly with the directed model (OK, is this native to the directed model?), or by modifying inaccurate measured data (data assimilation). (We modify the data, but something still has to be assimilated, so why is this possible on all 3 models?) So a variational approach is used in this work to modify rainfall, temperature and snow.
Details are needed on the assimilation algorithm! My other presentation is purely about assimilation, so I may get questions comparing the algorithms used!

Fernow model
NN architecture: the rainfall over a calibrated time window is the input to 2 hidden layers. The temperature is the input to 1 hidden layer. The 2 processes are linked on the last hidden layer for the rain. In the directed model, the observed flows over a calibrated time window are inputs. In the non-directed model, the simulated flows over a calibrated time window are inputs. It was impossible to get satisfactory results with the static model.

Fernow model: illustration
R stands for rainfall and T for temperature; Qobs is the observed discharge and Q the estimated discharge. Depending on whether Qobs or Q is used as input, we get the directed or the non-directed model.
Best results for the directed model (Nash about 0.7), whatever the way of handling non-stationarity. Nash about 0.4 for the non-directed model. Adaptivity is the best way to deal with non-stationarity for the directed model, whereas assimilation is the best way for the non-directed model.
To check: what the Pi tests were. Why is there sometimes no P3 test? Why does the P5 test always lag behind? What is the difference between Test P3 and Train P3?

Durance watershed, France (2170 km²)
Observed non-stationarity: higher temperatures, implying a decrease of the glaciers; discharge during spring due to snowmelt.
Complete period: 01/01/1904 - 30/12/2010
Calibration periods:
P1: 01/01/1904 - 31/12/1924
P2: 01/01/1925 - 31/12/1945
P3: 01/01/1946 - 31/12/1966
P4: 01/01/1967 - 31/12/1987
P5: 01/01/1988 - 31/12/2008

Durance model
Architecture defined on P1. Inputs: rain (7 days), temperature (10 days), PET (4 days), Qcalc. Hidden layers: 3.
Classic multilayer perceptron. Values of the w. Discharges at k-1 to k-o are used to predict the discharge at k.

Durance model: illustration
During the spring period, the discharge of the Durance is due to snowmelt. To take this process into account, the positive temperatures of winter and spring are preserved (from 1 January to 30 June). All other temperatures are set to zero.
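The temperature treatment described above is simple enough to write down directly. The function name is hypothetical, and treating exactly 0 degrees as "not positive" is an assumption.

```python
from datetime import date

def snowmelt_temperature(day, temperature):
    """Keep positive temperatures from 1 January to 30 June; set all others to zero."""
    in_season = 1 <= day.month <= 6
    if in_season and temperature > 0.0:
        return temperature
    return 0.0
```

Applied to a daily series, this leaves a signal that is non-zero only on days when snowmelt is plausible, which is the input the notes refer to as the snowmelt temperature.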

Durance model: illustration
[Table flattened in extraction. Column headers: Input temperature; Assimilation on. Entries in reading order: Directed; The supplied ones; Rainfall; Non-Directed; Rainfall, Temperature, PET; Non-directed; Snowmelt.]

Directed, no option (Fernow, Durance)
With the directed model with adaptivity or assimilation: on the Fernow catchment, the Nash criterion improves but performance on low flows decreases in some periods; on the Durance catchment, there is neither a gain nor a deterioration.

Non-Directed, no option (Fernow, Durance)
Best results on the Durance catchment. Poor Nash on Fernow. Very bad low-flow simulations. We can observe a deterioration of the results with the non-directed models compared with the directed ones.

Non-Directed, Adaptation; Non-Directed, Assimilation (Fernow, Durance)
The adaptation and assimilation options can strongly improve the Nash criterion (in particular for the Durance catchment), but have no effect on low flows.

Fernow: Non-Directed, Assimilation
On the Fernow catchment, data assimilation improves low flows while deteriorating the Nash criterion in some periods. The result was the opposite for the Durance catchment: the Nash criterion improved while low flows deteriorated (previous slide).

Durance: Non-Directed, no option; Non-Directed, no option, T°; Non-Directed, Assimilation, T°
The treatment of temperature (snowmelt) improves the Nash criterion. Bad simulations of low flows.

Conclusions
The best way (reliable, simple, easy) to adapt the model to the changing environment is to use the directed model (feedforward model).
With the directed model, adaptivity and assimilation brought no appreciable progress.
With the non-directed model, the improvement provided by adaptivity and data assimilation can be high.
Neural network modelling is more efficient for the larger of the studied catchments.
Work in progress: data assimilation must be studied more deeply (some parameters remain to be adjusted), and the criteria used for optimization have to be made more complex, so that an improvement on high flows does not come at the cost of performance on low flows, and vice versa.

Assimilation: 3 numerical parameters have to be set in the current algorithm, because not everything is automatic yet (it is under development): the gradient step, the training window, and the number of training computations. Here they were defined on P3 (the training period), then used as they are on P1, P2, P4 and P5, which may explain the bad results with assimilation (not normal!). If the test results are good, then adaptivity or assimilation does not change much. If the test results are bad, then adaptivity or assimilation can greatly improve them.
Thank you for your time.