Virgile TAVER 1, 2 Anne JOHANNET 2 Valérie BORRELL ESTUPINA 1


1 How to deal with non-stationary conditions in hydrology using neural networks
Virgile TAVER 1,2, Anne JOHANNET 2, Valérie BORRELL ESTUPINA 1, Séverin PISTRE 1. (1) HSM, HydroSciences Montpellier, UMR 5569, Université de Montpellier 2, France; (2) Ecole des Mines d'Alès, France. Hello, my name is Valérie Borrell and I will be presenting research on the application of neural networks to the database supplied in this session. This study is titled "Non-Recurrent vs Recurrent Neural Network Models for Non-Stationary Modelling", or "How to deal with non-stationary conditions in hydrology using neural networks". This work was made possible by a collaboration between HydroSciences Montpellier and EMA. The studied catchments are the Durance and the Fernow. Session: Testing simulation and forecasting models in non-stationary conditions, IAHS General Assembly in Göteborg, July 2013.

2 Neural Networks (NN)
Neural networks are increasingly used in hydrology: for prediction of water resources, for forecasting floods, and for modelling unknown relations between rain and flow, when the processes involved in this relation are not identified. Neural networks learn their behaviour by training (a step that would more traditionally be called calibration); nevertheless: they are sensitive to overfitting (this is what we call the bias-variance dilemma), so the model complexity must be chosen as simple as possible and regularization methods must be used to avoid this overfitting. Neural Networks Methods Results Conclusion
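The overfitting control mentioned above (early stopping on a separate stop set) can be sketched as follows. This is a minimal illustration with synthetic data and a deliberately over-complex polynomial model standing in for the network; all names and values are illustrative, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rain->flow-like data: a noisy quadratic relation.
x = np.linspace(0, 1, 60)
y = 1.5 * x**2 + 0.1 * rng.standard_normal(60)

# Split: one subset for training, one separate "stop" set for early stopping.
x_tr, y_tr = x[::2], y[::2]
x_st, y_st = x[1::2], y[1::2]

def design(x, degree=8):
    # Deliberately over-complex model (degree 8) so overfitting is possible.
    return np.vander(x, degree + 1, increasing=True)

A_tr, A_st = design(x_tr), design(x_st)
w = np.zeros(A_tr.shape[1])

best_w, best_err, patience = w.copy(), np.inf, 0
for epoch in range(5000):
    # Gradient step on the training quadratic error.
    grad = A_tr.T @ (A_tr @ w - y_tr) / len(y_tr)
    w -= 0.5 * grad
    # Early stopping: keep the weights minimizing the stop-set error.
    err = np.mean((A_st @ w - y_st) ** 2)
    if err < best_err:
        best_err, best_w, patience = err, w.copy(), 0
    else:
        patience += 1
        if patience > 200:   # stop-set error no longer improving
            break

print(round(float(best_err), 4))
```

The stop set plays the role of the early-stopping records described later in the design methodology: training halts once the error on data unseen during training stops improving.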

3 Neural Networks
Neuron definition: one neuron computes a weighted sum of its input variables, then applies a non-linear function (f), often a sigmoid, to this sum to produce an output signal.
Neural network architecture: once the input variables are identified, the architecture has to be specified. We chose the multilayer perceptron, which is a universal approximator and is parsimonious (when considering statistical models). Between the input variables and the output signal, the output neuron makes its calculation from the outputs of a hidden layer of neurons. The complexity of this hidden layer has to be defined.
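A minimal sketch of the computation just described, assuming a sigmoid non-linearity, one hidden layer, and a linear output neuron; the weight values are arbitrary illustrative numbers, not a trained model:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def neuron(inputs, weights, bias):
    # A neuron computes a weighted sum of its inputs, then applies
    # a non-linear function (here a sigmoid) to produce its output.
    return sigmoid(np.dot(weights, inputs) + bias)

def mlp(u, W_hidden, b_hidden, w_out, b_out):
    # Multilayer perceptron: one hidden layer of sigmoid neurons and
    # one linear output neuron (a usual choice when regressing discharge).
    hidden = sigmoid(W_hidden @ u + b_hidden)
    return float(w_out @ hidden + b_out)

# Toy example: 3 input variables, 2 hidden neurons.
u = np.array([0.2, 0.5, 0.1])
W_h = np.array([[0.4, -0.3, 0.8],
                [0.1, 0.9, -0.5]])
b_h = np.array([0.0, 0.1])
w_o = np.array([1.2, -0.7])
y = mlp(u, W_h, b_h, w_o, b_out=0.05)
print(y)
```

The number of hidden neurons is exactly the complexity that the design methodology on the next slide selects by cross-validation.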

4 Neural Network: design methodology
Once the network architecture has been defined, the calibration methodology is composed of three steps:
1. Minimizing the distance between observations and simulations: here, minimization of the quadratic error over the training period with the Levenberg-Marquardt rule.
2. Using the database: one (sub)set of the available period is used for training (Pi, i = 1..5); one (sub)set is used for stopping (early stopping, with records different from Pi, i = 1..5); one (sub)set is used for testing (in level 3), and this set has to be different from the training and stop sets.
3. Selecting the complexity of the NN architecture: cross-validation (included inside the training period) defines the input variables (ui) to use and the number of hidden layers and hidden neurons; selection is made using the Nash criterion.
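The Nash criterion used for selection in step 3 can be written as a short function. This is the standard Nash-Sutcliffe efficiency formula, shown here as an illustrative sketch:

```python
import numpy as np

def nash_sutcliffe(q_obs, q_sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the model
    does no better than always predicting the mean observed discharge."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_sim = np.asarray(q_sim, dtype=float)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

obs = [1.0, 3.0, 2.0, 5.0, 4.0]
print(nash_sutcliffe(obs, obs))         # perfect simulation -> 1.0
print(nash_sutcliffe(obs, [3.0] * 5))   # constant mean prediction -> 0.0
```

Because the criterion is normalized by the variance of the observations, it can be compared across the different calibration periods Pi.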

5 3 ways of modelling
Three models can be investigated, depending on the postulated model. As an analogy, consider calculating the price of a "baguette" (French bread); three methods can be used to estimate such a price:
1. Take into account the price of the primary ingredients (flour, water, ...) and the energy, and compute the price for a specific recipe.
2. If the state measurement is good: take into account the measured price yesterday, and anticipate a one-day evolution (one-step predictor).
3. If the state measurement isn't good: take into account the estimated price yesterday, and anticipate the evolution (multistep predictor).

6 3 ways of modelling
So the three models that can be investigated, depending on the postulated model, are:
- Computing discharge from rainfall and physics: static system modelling, giving the static NN model.
- Computing discharge from the state measurement: dynamic system modelling, giving the directed NN model (directed by the measurements of a state variable).
- Computing discharge from the state estimation: also dynamic system modelling, giving the non-directed NN model (no measurement of a state variable).
The name of each NN is thus associated with its postulated model: static, directed, non-directed.
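The difference between the directed (one-step) and non-directed (multistep) feedback schemes can be illustrated with a toy stand-in for the trained network; the linear `model_step` below and its coefficients are purely illustrative assumptions, not the actual NN:

```python
import numpy as np

def model_step(u, y_prev, a=0.8, b=0.5):
    # Stand-in for the trained network: next output computed from the
    # current input u and the previous output (illustrative linear form).
    return a * y_prev + b * u

rain = np.array([0.0, 1.0, 0.5, 0.0, 0.0, 2.0])
q_obs = np.array([0.1, 0.6, 0.9, 0.7, 0.5, 1.4])

# Directed model (one-step predictor): the measured output at time k is
# fed back, so each prediction restarts from the observation.
q_directed = [model_step(u, y) for u, y in zip(rain[:-1], q_obs[:-1])]

# Non-directed model (multistep predictor): the model's own previous
# estimate is fed back; errors can accumulate over the horizon.
q_nd, g = [], q_obs[0]
for u in rain[:-1]:
    g = model_step(u, g)
    q_nd.append(g)

print(np.round(q_directed, 3))
print(np.round(q_nd, 3))
```

The recurrence on `g` is what makes the non-directed model a recurrent network, while the directed model stays feedforward.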

7 1 NN model for each postulated model
Notation: yp is the observed output of the physical process; u(k) is the observed input of the physical process (rain); b(k) is a noise; q-1 denotes a one-time-step delay. Postulated model 1 leads to the static NN model; postulated model 2 (noise on the state) leads to the directed NN model; postulated model 3 (noise on the measurement) leads to the non-directed NN model. [Block diagrams of the three postulated models and their NN counterparts.]
For each postulated model we build one NN model. For the static NN model (first postulated model), the physical postulated model is that the observed inputs u(k) explain the observed output yp(k+1) at the next time step.

8 1 NN model for each postulated model
For the directed NN model (second postulated model), we suppose that the observed output yp(k) explains part of the simulated output g(k+1) at time k+1; q-1 is the one-time-step delay operator applied to the fed-back signal. [Same block diagrams as slide 7.]

9 1 NN model for each postulated model
For the last postulated model, we suppose that the direct output is not the observed one, because the system is disturbed by a noise on the measurement. So for the non-directed NN model, we suppose that the simulated output g(k) at time k explains part of the simulated output g(k+1) at time k+1. [Same block diagrams as slide 7.]


11 3 ways to deal with non-stationarity
How to adapt the model to the changing environment and process?
- The observed data are used to adapt the parameter values at different time steps: adaptivity (possible for the 3 models).
- The observed data are used as input data at different time steps: directed model (only possible for the directed model).
- The observed data are used to modify inaccurate inputs at different time steps: data assimilation (possible for the 3 models). A variational approach is used in this work to modify rainfall, temperature and snow at each time step.
The model is fully differentiable at each time step, so we compute the Jacobian at each dt and can propagate the obs-sim error at the output back to the variable, parameter or input on which we want to act by correction. The second solution is only possible for the directed model; solutions 1 and 3 are possible whatever the model, so it is on these solutions (adaptivity or assimilation) that we will work. I am now bothered by the classification "changing process" versus "changing environment": why? If, in the last case, we observe non-stationarity on snow for example, our model will behave badly in the future with respect to this snow input, so we modify it to simulate better? We should modify just the parameters of that process; so it is the snow-use process that has changed, not only the snow? In short, I do not get "changing process" versus "changing environment".
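The variational correction described above (propagating the obs-sim output error back through the Jacobian to the inputs) can be sketched with a toy differentiable model. `model`, its weights, and all values below are illustrative stand-ins, not the actual network or the study's algorithm:

```python
import numpy as np

def model(u, w):
    # Stand-in for the trained, fully differentiable network:
    # simulated discharge as a smooth function of the inputs u.
    return np.tanh(u @ w)

def assimilate(u, w, q_obs, step=0.5, n_iter=100):
    """Variational correction of the inputs: since the model is
    differentiable at each time step, the obs-sim output error is
    propagated back (via the Jacobian) to the inputs to correct."""
    u = u.copy()
    for _ in range(n_iter):
        err = model(u, w) - q_obs
        # Jacobian of the output w.r.t. the inputs (chain rule on tanh).
        jac = (1.0 - model(u, w) ** 2) * w
        u -= step * err * jac        # gradient step on the squared error
    return u

w = np.array([0.6, -0.2, 0.3])
u_raw = np.array([1.0, 0.5, 0.2])    # e.g. rainfall, temperature, snow
q_obs = 0.45                         # observed discharge at this step
u_corr = assimilate(u_raw, w, q_obs)
print(round(float(model(u_corr, w)), 3))
```

After the gradient iterations, the corrected inputs reproduce the observed discharge; in the study the same idea is applied to rainfall, temperature and snow at each time step.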

12 Application: Fernow watershed and Durance watershed
Only the models able to represent dynamic systems were developed: directed (non-recurrent model) and non-directed (recurrent model). We know that the static model, which computes discharge from rainfall alone, is not very effective, so we do not try it. Static model: the output is constant if the input is constant. Dynamic model: the output varies even if the input is constant. Here we observe a dynamic system, so it is normal that the static model does not work. Two ways of dealing with non-stationarity are compared with the no-option case: adaptivity and assimilation.

13 Fernow watershed, USA (0.2 km2)
Complete period: 01/01/ /12/2009. Snowmelt and sampling are too distant in time (daily sampling for a very small basin). Calibration periods:
P1: 01/01/ /12/1968 - forest cut of the lower part of the basin (Mar - Oct 1964); forest cut of the upper part of the basin (Oct - Feb 1968)
P2: 01/01/ /12/1978 - plantation of fir trees (Mar - Apr 1973)
P3: 01/01/ /12/1988
P4: 01/01/ /12/1998
P5: 01/01/ /12/2008
How to adapt the model to the changing environment and process? If we have a changing process, then we adapt the parameters continuously; this is what we call "adaptivity" (how are the parameters adapted in practice, and at what moment?). If we are in a changing environment, then we have to take the measured data into account continuously: straightforwardly with the directed model (is this native to the directed mode?), or by modifying inaccurate measured data (data assimilation) - but if we modify the data, something still has to be assimilated, so why is this possible for the 3 models? So a variational approach is used in this work to modify rainfall, temperature and snow. More detail is needed on the assimilation algorithm: my other presentation is pure data assimilation, so I may get questions comparing the algorithms used.

14 Fernow model
NN architecture: the rainfall over a calibrated time window feeds 2 hidden layers; the temperature feeds 1 hidden layer; the two processes are linked on the last hidden layer of the rain branch. In the directed model, the observed flows over a calibrated time window are also inputs; in the non-directed model, the simulated flows over a calibrated time window are inputs instead. It was impossible to get satisfactory results with the static model.
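The calibrated time windows that feed the network can be built as sliding windows over each signal. A minimal sketch, where the helper name and values are illustrative assumptions:

```python
import numpy as np

def window_inputs(series, width):
    """Stack a sliding time window of a signal so that each row holds
    the `width` most recent values - the input format used when a
    calibrated window of rainfall (or temperature, or past discharge)
    feeds the network."""
    series = np.asarray(series, dtype=float)
    n = len(series) - width + 1
    return np.stack([series[k:k + width] for k in range(n)])

rain = [0.0, 1.0, 0.5, 0.0, 2.0, 0.3]
X = window_inputs(rain, width=3)
print(X.shape)   # (4, 3): 4 time steps, each with 3 past rainfall values
```

In the directed model the same windowing would be applied to the observed flow Qobs; in the non-directed model, to the simulated flow produced at earlier steps.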

15 Fernow model: illustration
R stands for rainfall and T for temperature; Qobs is the observed discharge and Q the estimated discharge. Depending on whether Qobs or Q is fed back as input, we get the directed or the non-directed model. Best results are obtained with the directed model (Nash of about 0.7), whatever the way of modelling non-stationarity; the Nash is about 0.4 for the non-directed model. Adaptivity is the best way to deal with non-stationarity for the directed model, whereas assimilation is the best way for the non-directed model. (Check what the Pi tests were: why is there sometimes no P3 test? Why does the P5 test always lag behind? What is the difference between Test P3 and Train P3?)

16 Durance watershed, France (2170 km2)
Observed non-stationarity: higher temperatures, implying a decrease of the glaciers; discharge during spring due to snowmelt. Complete period: 01/01/ /12/2010. Calibration periods:
P1: 01/01/ /12/1924
P2: 01/01/ /12/1945
P3: 01/01/ /12/1966
P4: 01/01/ /12/1987
P5: 01/01/ /12/2008

17 Durance model
Architecture defined on P1: rain over 7 days, temperature over 10 days and PET over 4 days as inputs; 3 hidden layers; output Qcalc. A classic multilayer perceptron with weight values w; the discharges at k-1 to k-o are used to forecast the discharge at k.

18 Durance model: illustration
During the spring period, the discharge of the Durance is due to snowmelt. To take this process into account, the positive temperatures of winter and spring are preserved (from the 1st of January to the 30th of June); all the other temperatures are set to zero.

19 Durance model: illustration
Configurations compared (as listed on the slide): input temperature - the supplied ones, or the snowmelt treatment; assimilation on - rainfall (directed model), or rainfall, temperature and PET (non-directed model).

20 Fernow and Durance: Directed, no option
With the directed model, with adaptivity or assimilation, on the Fernow catchment: improvement of the Nash criterion, but a decrease of the performance on low flows over some periods. Neither a gain nor a deterioration on the Durance catchment.

21 Fernow and Durance: Non-Directed, no option
Best results on the Durance catchment; poor Nash on the Fernow, with very bad low-flow simulations. We can observe a deterioration of the results with the non-directed models compared with the directed ones.

22 Fernow and Durance: Non-Directed, Adaptation; Non-Directed, Assimilation
The adaptation and assimilation options can strongly improve the Nash criterion (in particular for the Durance catchment), but they have no effect on the low flows.

23 Fernow: Non-Directed, Assimilation
Data assimilation improves the low flows while deteriorating the Nash criterion on the Fernow catchment over some periods. This is the opposite of the result for the Durance catchment (previous slide): improvement of the Nash while deteriorating the low flows.

24 Durance: Non-Directed, no option; Non-Directed, no option, T°; Non-Directed, Assimilation, T°
The treatment of the temperature (snowmelt) improves the Nash criterion, but the simulations remain bad on low flows.

25 Conclusions
The best way (reliable, simple, easy) to adapt the model to the changing environment consists in using the directed model (a feedforward model). When using the directed model, there is no appreciable progress with adaptivity or assimilation. When using the non-directed model, the improvement provided by adaptivity and data assimilation can be high. Neural network modelling is more efficient for the larger studied catchment. Work in progress: data assimilation must be studied more deeply (some parameters remain to be adjusted), and the criteria used for optimization have to be enriched (to avoid improvements on high flows coming at the cost of performance on low flows, and vice versa). Assimilation: three numerical parameters have to be set in the current algorithm, since not everything is automatic yet (this is under development): the gradient step, the training window, and the number of training iterations. Here they were defined on P3 (the training period), then used as such on P1, P2, P4 and P5, which may explain the bad results in assimilation (not normal!). If the test results are good, then adaptation or assimilation does not change much; if the test results are bad, then adaptation or assimilation can greatly improve them. Thank you for your time.

