CEE 6410 Water Resources Systems Analysis

1 CEE 6410 Water Resources Systems Analysis
Data-Driven Modeling and Machine Learning Regression Approach in Water Resource Systems

2 Data-driven Models

3 Data-driven Models
Find relationships between the system state variables without explicit knowledge of the physical behavior of the system. Examples: the unit hydrograph method, statistical models (ARMA, ARIMA), and machine learning (ML) models.

4 Why data-driven modeling and machine learning in water resource systems?

5 Why data-driven modeling and machine learning in water resource systems?
Some highly complex processes in water resource systems are difficult to understand and simulate using physically based approaches. Example: the Lower Sevier River Basin system, Utah.

6 Why data-driven modeling and machine learning in water resource systems?
Physically based modeling is also limited by the lack of required data and the expense of data acquisition. Data-driven (machine learning) models offer an alternative: they replicate the expected response of a system.

7 Example of ML Uses

8 Supervised vs. Unsupervised learning
Supervised learning: relate attributes to a target by discovering patterns in the data; these patterns are then used to predict values of the target in future data.
Unsupervised learning: the data have no target attribute; the data are explored to find some intrinsic structure in them.

9 Supervised vs. Unsupervised learning

10 Procedure
Objective
Data Retrieval & Analysis
Input–Output Selection
Learning Machine Calibration
Comparison & Robustness Analysis

11 Analysis – Supervised Learning
Machine Learning Approach:
Input inclusion (Curse of Dimensionality)
Generalization (Overfitting)
Impact of unseen data (Robustness)
Performance comparison (vs. another similar algorithm)

12 Analysis - Regression
Nash coefficient of efficiency (η or E): similar to the coefficient of determination (r²); range −∞ to 1; dimensionless.
E = 1 − Σ(t − t*)² / Σ(t − tav)²
Root mean square error (RMSE): same units as the model response.
RMSE = √( Σ(t* − t)² / N )
where: t = observed output, t* = predicted output, tav = observed average output, N = number of observations.
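Both metrics can be sketched in a few lines of NumPy (the slide itself shows no code; variable names follow the slide's notation):

```python
import numpy as np

def nash_sutcliffe(t_obs, t_pred):
    """Nash coefficient of efficiency E: 1 is a perfect fit; -inf is the lower bound."""
    t_obs, t_pred = np.asarray(t_obs, float), np.asarray(t_pred, float)
    return 1.0 - np.sum((t_obs - t_pred) ** 2) / np.sum((t_obs - t_obs.mean()) ** 2)

def rmse(t_obs, t_pred):
    """Root mean square error, in the same units as the model response."""
    t_obs, t_pred = np.asarray(t_obs, float), np.asarray(t_pred, float)
    return float(np.sqrt(np.mean((t_pred - t_obs) ** 2)))

obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # observed outputs t
sim = obs + 1.0                            # uniformly biased predictions t*
print(nash_sutcliffe(obs, sim), rmse(obs, sim))  # 0.5 1.0
```

Note that a constant bias of one unit halves E here while RMSE reports the bias directly in the response units.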

13 Analysis - Classification
Confusion matrix: helps evaluate classifier performance on a class-by-class basis.
Kappa coefficient: a robust measurement of classification accuracy.
κ = (N·Σ xii − Σ xi+·x+i) / (N² − Σ xi+·x+i), with sums over i = 1…n
where: n = number of classes, xii = number of observations on the diagonal of the confusion matrix corresponding to row i and column i, xi+ and x+i = marginal totals of row i and column i respectively, N = number of instances.
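The kappa formula translates directly to NumPy (a minimal sketch, not course code; rows are taken as actual classes and columns as predicted):

```python
import numpy as np

def kappa(cm):
    """Cohen's kappa from a square confusion matrix (rows = actual, cols = predicted)."""
    cm = np.asarray(cm, float)
    N = cm.sum()                                       # total number of instances
    diag = np.trace(cm)                                # sum of x_ii (correct classifications)
    chance = (cm.sum(axis=1) * cm.sum(axis=0)).sum()   # sum of x_i+ * x_+i
    return (N * diag - chance) / (N ** 2 - chance)

cm = [[30, 10],
      [10, 50]]
print(kappa(cm))  # 0.58333...
```

Unlike raw accuracy (80% here), kappa discounts the agreement expected by chance from the marginal totals.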

14 A Neural Network Model:
Bayesian Multilayer Perceptron for Regression & Classification

15 Bayesian Multilayer Perceptron (MLP)
An ANN algorithm that uses the Bayesian inference method (b): [y1, y2, …, yn] = f([x]; WI, WII, bI, bII)
where: y1, y2, …, yn = simultaneous results from the algorithm, WI, WII, bI, bII = model weights and biases, [x] = inputs.
The MLP has also been used with success in simulation and forecasting of soil moisture, reservoir management, groundwater conditions, etc. It offers a probabilistic approach, noise-effect minimization, error prediction bars, etc.
(b) Implemented by Nabney (2005)

16 Bayesian Multilayer Perceptron (BMLP)
Using a dataset D = [x(n), t(n)] with n = 1…N, the training of the parameters [Wa, Wb, b(n), bh] is performed by minimizing the overall error function E (Bishop, 2007):
E = β·ED + α·EW
where: ED = data error function, EW = penalization term over the W weights and biases in the BMLP, and α and β = Bayesian hyper-parameters.

17 Bayesian Multilayer Perceptron (BMLP)
For regression tasks, Bayesian inference allows the prediction y(n) and the variance of the predictions σy², once the distribution of W has been estimated by maximizing the likelihood for α and β (Bishop, 2007). The output variance has two sources: the first arises from the intrinsic noise in the output values; the second comes from the posterior distribution of the BMLP weights. The output standard deviation vector σy can be interpreted as the error bar for confidence interval estimation (Bishop, 2007).
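For reference, the standard form of this two-source variance in Bishop (2007) can be written as follows (the symbols g and A are my notation, since the slide omits the equation):

```latex
\sigma_y^2(\mathbf{x}) =
  \underbrace{\beta^{-1}}_{\text{intrinsic output noise}}
  + \underbrace{\mathbf{g}^{\mathsf{T}} \mathbf{A}^{-1} \mathbf{g}}_{\text{weight-posterior uncertainty}},
\qquad
\mathbf{g} = \left.\nabla_{\mathbf{w}}\, y(\mathbf{x};\mathbf{w})\right|_{\mathbf{w}_{\mathrm{MP}}}
```

where A is the Hessian of the regularized error function evaluated at the most probable weights w_MP.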

18 Bayesian Multilayer Perceptron (BMLP)
For classification tasks, the Bayesian inference method allows estimating the probability that an input belongs to a given class using a logistic sigmoid function (Nabney, 2002).

19 BMLP classification - example

20 BMLP regression - example

21 Relevance Vector Machine for Regression
“Entities should not be multiplied unnecessarily” William of Ockham “Models should be no more complex than is sufficient to explain the data” Michael E. Tipping

22 Relevance Vector Machine for Regression
Developed by Tipping [2001]. Given a training set of input–target pairs {x(n), t(n)}, n = 1…N, where N is the number of observations, the target vector can be written as:
t = Φw + ε
where: w is a weight vector, Φ is a "design" matrix whose columns are built from a kernel K(x, xM), and the error ε is assumed to be zero-mean Gaussian with variance σ².

23 A likelihood distribution of the complete data set:
p(t | w, σ²) = (2πσ²)^(−N/2) exp( −‖t − Φw‖² / (2σ²) )
There is a danger that maximum likelihood estimation of w and σ² will suffer from severe over-fitting. Hence the imposition of an additional penalty term on the likelihood, a zero-mean Gaussian prior with a separate precision αi for each weight:
p(w | α) = ∏i N(wi | 0, αi⁻¹)
This prior is ultimately responsible for the sparsity properties of the RVM.

24 The posterior parameter distribution conditioned on the data:
p(w | t, α, σ²) ∝ p(t | w, σ²) p(w | α)
Posterior probability is assigned to values which are both probable under the prior and "which explain the data" (Tipping 2004).

25 The optimal values of many of the hyper-parameters αi are infinite; therefore the posterior distributions of the associated weights peak at zero and the corresponding inputs are "irrelevant". The inputs associated with the remaining non-zero weights are "The Relevance Vectors".

26 RVM approximations to "sinc" function
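The "sinc" demonstration can be sketched with a compact evidence-maximization loop in NumPy. This is a simplified re-implementation of the standard RVM update rules, not the course's own code; the kernel width, pruning threshold, and iteration count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-10, 10, 50)
t = np.sinc(x / np.pi) + rng.normal(0.0, 0.1, x.size)  # noisy sinc = sin(x)/x
N = x.size

# Gaussian-kernel "design" matrix: one basis function centered on each training input
Phi_full = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 2.0 ** 2))

alpha = np.ones(N)       # per-weight precisions (the sparsity-inducing prior)
beta = 1.0 / t.var()     # noise precision 1/sigma^2
for _ in range(100):
    keep = alpha < 1e6                        # prune "irrelevant" basis functions
    Phi = Phi_full[:, keep]
    Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha[keep]))
    mu = beta * Sigma @ Phi.T @ t             # posterior mean weights
    gamma = 1.0 - alpha[keep] * np.diag(Sigma)
    alpha[keep] = gamma / (mu ** 2 + 1e-12)   # re-estimate weight precisions
    beta = (N - gamma.sum()) / max(np.sum((t - Phi @ mu) ** 2), 1e-12)
    rv_mask, weights = keep, mu               # last consistent mask/weight pair

y = Phi_full[:, rv_mask] @ weights            # RVM approximation to sinc
print("relevance vectors:", int(rv_mask.sum()), "of", N)
```

Most of the αi diverge during training, so only a small subset of the training inputs survives as relevance vectors, which is the sparsity property the previous slides describe.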

27 Generalization and Robustness Analysis
Overfitting: calibrate the ML models with one training data set and evaluate their performance with a different, unseen test data set.
ML applications are often ill-posed problems: with t = f(x), a small variation in x may cause large changes in t.
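The overfitting point can be illustrated with a hypothetical polynomial example (not from the course): an over-parameterized model fits the training set better, but its error on unseen test data reveals the gap.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: 0.5 * x ** 3 - x                 # the underlying "system" response
x_train = np.linspace(-2, 2, 20)
t_train = f(x_train) + rng.normal(0, 0.3, x_train.size)
x_test = np.linspace(-1.9, 1.9, 20)            # unseen data over the same range
t_test = f(x_test) + rng.normal(0, 0.3, x_test.size)

def fit_rmse(deg):
    """Calibrate a degree-`deg` polynomial and return (train RMSE, test RMSE)."""
    coef = np.polyfit(x_train, t_train, deg)
    tr = np.sqrt(np.mean((np.polyval(coef, x_train) - t_train) ** 2))
    te = np.sqrt(np.mean((np.polyval(coef, x_test) - t_test) ** 2))
    return tr, te

tr3, te3 = fit_rmse(3)      # parsimonious model
tr15, te15 = fit_rmse(15)   # over-parameterized model: lower training error,
print(tr3, te3, tr15, te15) # larger train-test gap
```

The degree-15 model necessarily matches the training data at least as well as the cubic (its basis contains the cubic's), yet part of what it fits is noise, which the unseen test set exposes.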

28 Model Robustness – Bootstrap Method
Each bootstrap sample is created by randomly sampling, with replacement, from the whole training data set; the model is recalibrated on each sample and its performance metric recorded. A robust model is one that shows narrow confidence bounds in the resulting bootstrap histogram: if Model A's histogram is wider than Model B's, Model B is more robust.
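The bootstrap loop can be sketched as follows (a generic example with a linear fit standing in for the learning machine; the metric, sample size, and replicate count are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 40)
t = 2.0 * x + 1.0 + rng.normal(0, 0.5, x.size)   # training data

def model_rmse(xs, ts):
    """Calibrate the stand-in learning machine (a line) and return its RMSE."""
    coef = np.polyfit(xs, ts, 1)
    return np.sqrt(np.mean((np.polyval(coef, xs) - ts) ** 2))

# Bootstrap: resample the training set with replacement, recalibrate, record the metric
scores = []
for _ in range(500):
    idx = rng.integers(0, x.size, x.size)        # sample indices with replacement
    scores.append(model_rmse(x[idx], t[idx]))
scores = np.array(scores)

# Narrow 95% confidence bounds in this histogram indicate a robust model
lo, hi = np.percentile(scores, [2.5, 97.5])
print(f"RMSE 95% bounds: [{lo:.3f}, {hi:.3f}], width {hi - lo:.3f}")
```

Comparing the widths of these bounds for two candidate models is the Model A vs. Model B comparison on the slide.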

