Uncertainty Analysis Using GEM-SA
Tony O’Hagan

Outline
• Setting up the project
• Running a simple analysis
• Exercise
• More complex analyses

Setting up the project

Number of inputs
• Select Project -> New, or click toolbar icon
• Project dialog appears
• Select the number of inputs in the dialog

Our example
• We’ll use the example “model1” in the GEM-SA DEMO DATA directory
• This example is based on a vegetation model with 7 inputs
  – RESAEREO, DEFLECT, FACTOR, MO, COVER, TREEHT, LAI
• The model has 16 outputs, but for the present we will consider output 4
  – June monthly GPP

Define input names
• Click on “Names …”
• The “Input parameter names” dialog opens
• Enter parameter names
• Click “OK”

Files
• Click on Files tab
• The “Inputs” file contains one column for each parameter and one row for each model training run (the design)
• The “Outputs” file contains the outputs of those runs (one column)
• Using “Browse” buttons, select input and output files
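
The file layout described above can be checked before the files are loaded into GEM-SA. The following is a minimal sketch, not part of GEM-SA: the function names `check_design` and `select_output` are hypothetical, and the toy arrays stand in for real training data.

```python
import numpy as np

def check_design(inputs, outputs, n_inputs):
    """Check that inputs has one column per parameter and one row per run,
    and that outputs holds exactly one value per run."""
    inputs = np.atleast_2d(np.asarray(inputs, dtype=float))
    outputs = np.asarray(outputs, dtype=float).reshape(-1)
    if inputs.shape[1] != n_inputs:
        raise ValueError(f"expected {n_inputs} input columns, got {inputs.shape[1]}")
    if inputs.shape[0] != outputs.shape[0]:
        raise ValueError("inputs and outputs have different numbers of runs")
    return inputs.shape[0]  # number of training runs

def select_output(all_outputs, k):
    """Return output k (1-based) from a table with one column per model output."""
    return np.asarray(all_outputs, dtype=float)[:, k - 1]

# Hypothetical toy design: 5 training runs of a 7-input model with 16 outputs.
rng = np.random.default_rng(0)
toy_inputs = rng.uniform(size=(5, 7))
toy_all_outputs = rng.uniform(size=(5, 16))
toy_output4 = select_output(toy_all_outputs, 4)
n_runs = check_design(toy_inputs, toy_output4, n_inputs=7)
print(n_runs)
```

With real files, `np.loadtxt` could supply the arrays, assuming the files are whitespace-delimited numeric text; that format is an assumption here, not something the slides state.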

Close project and save
• We will leave all other settings at their default values for now
• Click “OK”
• Select Project -> Save
  – Or click toolbar icon
• Choose a name and click “Save”

Running a simple analysis

Build the emulator
• Click to build the emulator
• A lot of things now start to happen!
  – The log window at the bottom starts to record various bits of information
  – A little window appears showing progress of minimisation of the roughness parameter estimation criterion
  – A new window “Main Effects Realisations” appears and several graphs appear
• Progress bar at the bottom

Focus on the log window
• Close the “Main Effects Realisations” window when it’s finished
  – We don’t need it in this session!
  – In the main window we now have a table
  – Which we will also ignore for now
• Focus on the log window
• This reports two key things
  – Diagnostics of the emulator build
  – The basic uncertainty analysis results

Emulation diagnostics
• Note where the log window reports …

    Estimating emulator parameters by maximising probability distribution...
    maximised posterior for emulator parameters: precision = 12.1881, roughness = 0.227332 0.0256299 0.00388643 74.0941 0.963724 1.22783 2.42148

• The first line says roughness parameters have been estimated by the simplest method
• The values of these indicate how non-linear the effect of each input parameter is
  – Note the high value for input 4 (MO)
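
To read the roughness values more easily, they can be paired with the input names and sorted. This is only an illustrative sketch; it assumes the roughness values are listed in the same order as the inputs, which is consistent with the slide calling the fourth value input 4 (MO).

```python
# Input names in the order given for the model1 example.
names = ["RESAEREO", "DEFLECT", "FACTOR", "MO", "COVER", "TREEHT", "LAI"]

# Roughness values copied from the log excerpt.
roughness = [0.227332, 0.0256299, 0.00388643, 74.0941, 0.963724, 1.22783, 2.42148]

# Sort from most to least non-linear effect.
ranked = sorted(zip(names, roughness), key=lambda pair: pair[1], reverse=True)

for name, value in ranked:
    print(f"{name:10s} {value:10.4f}")
```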

Uncertainty analysis – mean
• Below this, the log reports

    Estimate of mean output is 24.3088, with variance 0.00422996

• So the best estimate of the output (June GPP) is 24.3 (mol C/m²)
  – This is averaged over the uncertainty in the 7 inputs
      • Better than just fixing inputs at best estimates
  – There is an emulation standard error of 0.065 in this figure

Uncertainty analysis – variance
• The final line of the log is

    Estimate of total output variance = 72.9002

• This shows the uncertainty in the model output that is induced by input uncertainties
  – The variance is 72.9
  – Equal to a standard deviation of 8.5
  – So although the best estimate of the output is 24.3, the uncertainty in inputs means it could easily be as low as 16 or as high as 33
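
The figures quoted on the last two slides follow from the log values by taking square roots. The short sketch below reproduces them; the variable names are illustrative, and the observation that 16 and 33 sit one standard deviation either side of the mean is simply what the arithmetic gives.

```python
import math

# Values copied from the log excerpts.
mean_output = 24.3088
emulation_variance = 0.00422996   # variance of the estimate of the mean
total_output_variance = 72.9002   # variance induced by input uncertainty

emulation_se = math.sqrt(emulation_variance)   # emulation standard error
output_sd = math.sqrt(total_output_variance)   # output standard deviation

# One standard deviation either side of the mean.
low = mean_output - output_sd
high = mean_output + output_sd

print(f"emulation standard error: {emulation_se:.3f}")
print(f"output standard deviation: {output_sd:.1f}")
print(f"mean -/+ 1 sd: {low:.0f} to {high:.0f}")
```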

Exercise

A small change
• Run the same model with Output 11 instead of Output 4
• Calculate the coefficient of variation (CV) for this output
  – NB: the CV is defined as the standard deviation divided by the mean
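
As a worked illustration of the definition, the sketch below computes the coefficient of variation for Output 4 from the log values already shown, leaving Output 11 for the exercise. The function name `coefficient_of_variation` is hypothetical.

```python
import math

def coefficient_of_variation(mean, variance):
    """Standard deviation divided by the mean."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return math.sqrt(variance) / mean

# Output 4 values from the log: mean 24.3088, total output variance 72.9002.
cv_output4 = coefficient_of_variation(24.3088, 72.9002)
print(f"{cv_output4:.2f}")
```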

More complex analyses

Input distributions
• Default is to assume the uncertainty in each input is represented by a uniform distribution
  – Range determined by the range of values found in the input file
• A normal (Gaussian) distribution is generally a more realistic representation of uncertainty
  – Range unbounded
  – More probability in the middle

Changing input distributions
• In Project dialog, Options tab, click the button for “All unknown, product normal”
• Then OK
• A new dialog opens to specify means and variances

Model 1 example
• Uniform distributions from input ranges
• Normal distributions to match
  – Range is 4 std devs
• Except for MO
  – Narrower distribution

               Uniform           Normal
Parameter      Lower    Upper    Mean    Variance
RESAEREO       80       200      140     900
DEFLECT        0.6      1        0.8     0.01
FACTOR         0.1      0.5      0.3     0.01
MO             30       100      60      100
COVER          0.6      0.99     0.8     0.01
TREEHT         10       40       25      100
LAI            3.75     9        6.5     1
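
The “range is 4 std devs” rule can be sketched as taking the midpoint of the uniform range as the mean and a quarter of its width as the standard deviation. The function `normal_from_range` is a hypothetical helper, not a GEM-SA feature. The rule reproduces the slide values exactly for RESAEREO, DEFLECT and FACTOR; for the other inputs the slide values appear to be rounded or chosen differently (MO deliberately narrower), so the printout shows both side by side.

```python
def normal_from_range(lower, upper, sds_in_range=4.0):
    """Mean and variance of a normal whose range spans sds_in_range std devs."""
    mean = (lower + upper) / 2
    sd = (upper - lower) / sds_in_range
    return mean, sd ** 2

# (lower, upper, slide mean, slide variance) copied from the table.
table = {
    "RESAEREO": (80, 200, 140, 900),
    "DEFLECT": (0.6, 1, 0.8, 0.01),
    "FACTOR": (0.1, 0.5, 0.3, 0.01),
    "MO": (30, 100, 60, 100),
    "COVER": (0.6, 0.99, 0.8, 0.01),
    "TREEHT": (10, 40, 25, 100),
    "LAI": (3.75, 9, 6.5, 1),
}

for name, (lower, upper, slide_mean, slide_var) in table.items():
    rule_mean, rule_var = normal_from_range(lower, upper)
    print(f"{name:9s} rule: mean={rule_mean:.4g} var={rule_var:.4g}   "
          f"slide: mean={slide_mean} var={slide_var}")
```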

Effect on UA
• After running the revised model, we see:
  – It runs faster, with no need to rebuild the emulator
  – The mean is changed a little and variance is halved
• The emulator fit is unchanged

    Estimate of mean output is 26.4649, with variance 0.0108452
    Estimate of total output variance = 36.8522

Reducing the MO uncertainty further
• If we reduce the variance of MO even more, to 49:
  – UA mean changes a little more and variance reduces again
  – Notice also how the emulation variance of the mean has increased (it was 0.004 for uniform)
  – This is because the design points cover the new ranges less thoroughly

    Estimate of mean output is 26.6068, with variance 0.014514
    Estimate of total output variance = 26.4372
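
The three sets of log results can be put side by side to see both trends at once: the output variance falls while the emulation standard error rises. This is a small sketch using only the numbers quoted in the logs; the labels and dictionary layout are illustrative.

```python
import math

# (mean, emulation variance of the mean, total output variance) from the logs.
results = {
    "uniform": (24.3088, 0.00422996, 72.9002),
    "normal": (26.4649, 0.0108452, 36.8522),
    "normal, MO variance 49": (26.6068, 0.014514, 26.4372),
}

baseline_variance = results["uniform"][2]
summary = {}
for label, (mean, emulation_var, output_var) in results.items():
    summary[label] = {
        "emulation_se": math.sqrt(emulation_var),
        "output_sd": math.sqrt(output_var),
        "variance_ratio": output_var / baseline_variance,  # relative to uniform
    }

for label, row in summary.items():
    print(f"{label:24s} se={row['emulation_se']:.3f} "
          f"sd={row['output_sd']:.2f} ratio={row['variance_ratio']:.2f}")
```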

Another exercise
• What happens if we reduce the uncertainty in MO to zero?
• Two ways to do this
  – Literally set variance to zero
  – Select “Some known, rest product normal” on Project dialog, check the tick box for MO in the mean and variance dialog
• What changes do you see in the UA?

Cross-validation
• In the Project dialog, look at the bottom menu box, labelled “Cross-validation”
• There are 3 options
  – None
  – Leave-one-out
  – Leave final 20% out
• Cross-validation (CV) is a way of checking the emulator fit
  – Default is None because CV takes time

Leave-one-out CV
• After estimating roughness and other parameters, GEM predicts each training run point using only the remaining n-1 points
• Results appear in log window

    Cross Validation Root Mean-Squared Error = 0.844452
    Cross Validation Root Mean-Squared Relative Error = 4.00836 percent
    Cross Validation Root Mean-Squared Standardised Error = 1.01297
    Cross Validation variances range from 0.173433 to 2.89026
    Written cross-validation means to file cvpredmeans.txt
    Written cross-validation variances to file cvpredvars.txt

• The standardised error is close to 1 (Model 1, output 4, uniform inputs)
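
The three summary statistics in the log can be sketched from the observed outputs and the cross-validation predictive means and variances. The formulas below are the usual definitions and are an assumption about what GEM-SA computes, not something the slides confirm; `cv_summary` and the toy numbers are hypothetical. If the predictive variances are well calibrated, the standardised error should be near 1, which is why that value is the one to watch.

```python
import numpy as np

def cv_summary(observed, pred_mean, pred_var):
    """Root mean-squared error, relative error (percent) and standardised error."""
    observed = np.asarray(observed, dtype=float)
    pred_mean = np.asarray(pred_mean, dtype=float)
    pred_var = np.asarray(pred_var, dtype=float)
    error = observed - pred_mean
    return {
        "rmse": float(np.sqrt(np.mean(error ** 2))),
        "relative_percent": float(100 * np.sqrt(np.mean((error / observed) ** 2))),
        "standardised": float(np.sqrt(np.mean(error ** 2 / pred_var))),
    }

# Hypothetical toy data, not taken from the Model 1 example.
toy = cv_summary(
    observed=[10.0, 20.0, 30.0, 40.0],
    pred_mean=[11.0, 19.0, 31.0, 39.0],
    pred_var=[1.0, 1.0, 1.0, 1.0],
)
print(toy)
```

With real results, the predictive means and variances might be read from cvpredmeans.txt and cvpredvars.txt, for instance with `np.loadtxt`, assuming those files are plain numeric text.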

Leave out final 20% CV
• This is an even better check, because it tests the emulator on data that have not been used in any way to build it
• Emulator is built on first 80% of data and used to predict last 20%
• [Marc, zero standardised error??!!!]

    Cross Validation Root Mean-Squared Error = 0.959898
    Cross Validation Root Mean-Squared Relative Error = 4.65714 percent
    Cross Validation Root Mean-Squared Standardised Error = 0
    Cross Validation variances range from 0.182214 to 2.17168
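
The 80/20 split itself is simple to sketch: the first 80% of the training runs build the emulator and the final 20% are held back for checking. The helper `split_final_fraction` is hypothetical, and how GEM-SA rounds when the number of runs is not a multiple of five is not stated in the slides, so the rounding here is an assumption.

```python
def split_final_fraction(inputs, outputs, holdout=0.2):
    """Split runs into a leading training part and a trailing hold-out part."""
    n_runs = len(outputs)
    n_test = int(round(n_runs * holdout))  # rounding rule is an assumption
    n_train = n_runs - n_test
    train = (inputs[:n_train], outputs[:n_train])
    test = (inputs[n_train:], outputs[n_train:])
    return train, test

# Hypothetical toy design with 10 runs of a 2-input model.
toy_inputs = [[i, i * 0.5] for i in range(10)]
toy_outputs = [float(i) for i in range(10)]
(train_x, train_y), (test_x, test_y) = split_final_fraction(toy_inputs, toy_outputs)
print(len(train_y), len(test_y))
```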

Other options
• There are various other options associated with the emulator building that we have not dealt with
• But we’ve done the main things that should be considered in practice
• And it’s enough to be going on with!

When it all goes wrong
• How do we know when the emulator is not working?
  – Large roughness parameters
      • Especially ones hitting the limit of 99
  – Large emulation variance on UA mean
  – Poor CV standardised prediction error
      • Especially when some are extremely large
• In such cases, see if a larger training set helps
  – Other ideas like transforming output scale
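
The warning signs above could be turned into a rough automatic check. In the sketch below only the roughness limit of 99 comes from the slides; the function `emulator_warnings`, the variance ratio threshold and the acceptable band for the standardised error are illustrative assumptions with no special authority.

```python
def emulator_warnings(roughness, emulation_variance, total_variance,
                      cv_standardised_error, roughness_limit=99.0,
                      max_variance_ratio=0.01, standardised_band=(0.5, 2.0)):
    """Return a list of warning messages; an empty list means nothing was flagged."""
    warnings = []
    # Roughness parameters that have hit the limit (1-based input numbers).
    at_limit = [i + 1 for i, r in enumerate(roughness) if r >= roughness_limit]
    if at_limit:
        warnings.append(f"roughness at the limit for input(s) {at_limit}")
    # Emulation variance of the UA mean compared with the output variance
    # (the threshold is an assumption).
    if emulation_variance > max_variance_ratio * total_variance:
        warnings.append("emulation variance is large relative to output variance")
    # Standardised CV error far from 1 (the band is an assumption).
    low, high = standardised_band
    if not low <= cv_standardised_error <= high:
        warnings.append("CV standardised error is far from 1")
    return warnings

# Model 1, output 4, uniform inputs: values copied from the logs.
model1 = emulator_warnings(
    roughness=[0.227332, 0.0256299, 0.00388643, 74.0941, 0.963724, 1.22783, 2.42148],
    emulation_variance=0.00422996,
    total_variance=72.9002,
    cv_standardised_error=1.01297,
)
print(model1)
```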
