Engineering Systems Analysis for Design Richard de Neufville © Massachusetts Institute of Technology Uncertainty assessment Slide 1 of 35

Uncertainty Assessment
• The quantified description of the uncertainty concerning situations and outcomes
• Presentation Objectives:
  1. Concepts
     – The problem and issues
     – Means of assessment
  2. Analytic Procedures
     – Regression analysis
     – Bayes Theorem and Likelihood Ratios

Part 1: Concepts
• The problem and issues
• Means of assessment

The Problem
• New Design Paradigm requires us to determine Distribution of Uncertainties
• How can we do this?
• Issues:
  1. Procedure depends on data available
     – No one way
  2. Systematic Biases exist throughout
     – Need to be aware and careful

Methods of Assessment -- Types
• Logic
  – Example: Probability (Queen) in a deck of cards
  – Primary Content of Intro. Probability Subjects
  – Not really useful to assess System Uncertainty
• Frequency
• Statistical Methods
• Judgment
More on each coming up

Frequency Method
• Analysis Approach
  – Assemble available data
  – Find frequency of occurrence of event of interest
• Example 1: Probability of Rainfall in a location
  – Use historical data from rain gauges
• Example 2: P(failure of major dams) ~ /dam/year
  – Source: Baecher et al, data through 1960

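The frequency approach above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical rain-gauge record; the counts and the helper name `frequency_estimate` are made up for the example and are not data from the lecture.

```python
def frequency_estimate(events, exposure):
    """Relative frequency: observed events per unit of exposure."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return events / exposure


# Hypothetical record (illustrative numbers only): 30 years of daily
# gauge readings, of which 2190 days recorded rain.
rainy_days = 2190
days_observed = 30 * 365

p_rain = frequency_estimate(rainy_days, days_observed)
print(f"P(rain on a given day) ~ {p_rain:.2f}")
```

The same function would apply to a rate such as failures per dam-year: events are the observed failures and exposure is the number of dams times the years observed. The estimate is only as good as the assumption that the record is stationary and representative.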
Frequency Method -- Issues
• Assumes process is “stationary”, that is
  – Future will be like past; no change in mechanism
  – Is this always true?
  – No! Global warming may change rainfall pattern
• Assumes data representative
  – That it represents item of interest
  – When might this not be true?
  – Technology of construction changes… Different types of dams, etc.

Statistical Models
• Analysis Approach
  – Assemble Data, Choose a set to analyze
  – Posit a set of variables (X = price, etc) of interest
  – Posit form of variables (Price, ΔPrice, Relative P…)
  – Posit form of equations linking them, f(X)
  – Do statistical analysis (example: least squares)
• Example: Future Demand = f(price, income, etc.) + error
  – Obtain: D = a 0 + a 1 P b + a 2 P c +…

Statistical Models – General Issues
• All the issues associated with Frequency Method
  – Statistical Analysis is a more sophisticated version of Frequency Method
• Specifically, assumes that
  – Process is stationary
  – Data are representative

Statistical Models – Data Issues
• Looks precise and technical, BUT
• GIGO -- “Garbage in, Garbage out”
  – Results depend closely on data set chosen (see slides on regression analysis later on)
  – Consider history of oil prices: which period should we choose? Everything available? Since OPEC 1, 1974? Last 20 years? Last 10 years?

History of Oil Prices
[Figure: history of oil prices]
Source:

Statistical Models – Data Issues
• Conceptual “house of cards”
• Consider Assumptions
  – Posit a set of variables
    Which ones? Note: adding any factor increases statistical fit
  – Posit form of variables (Price, ΔPrice, etc.)
    Which form? When you choose travel, what influences you: total, relative or differential cost?
  – Posit form of equations linking them, f(X)
    Which form? Linear, exponential, log…

Statistical Models – Time Issues
• Special issue when model seeks to determine “rate of change over time”
• $Y = a e^{rt}$, where r = rate per period and t = time in periods
• Data is a “time series”, as for oil price
• Issue: Any variable that grows or declines steadily over time also fits the data well
  – Example: Los Angeles Air Travel correlates well with “prisoners in Oregon”, “divorces in France”
  – Another possible form of GIGO
• Easy to get good correlation, but correlation ≠ causality

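A small sketch of why trending time series correlate so easily. It assumes two made-up series that share nothing except steady growth; the growth rates, noise level, seed, and names (`air_travel`, `unrelated`) are all hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
periods = range(30)

# Two independent series, each of the form a * e^(r t) with +/-5% noise.
air_travel = [100 * math.exp(0.05 * t) * (1 + rng.uniform(-0.05, 0.05)) for t in periods]
unrelated = [20 * math.exp(0.03 * t) * (1 + rng.uniform(-0.05, 0.05)) for t in periods]

r = pearson_r(air_travel, unrelated)
print(f"correlation between the two unrelated series: {r:.3f}")
```

The correlation comes out high even though neither series influences the other; the shared upward trend alone produces it.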
Expert Judgment Method
• Also known as “Subjective Probability”
• Analysis Method
  – Identify “experts”
  – Ask them questions such as “What will be the performance of a new technology?”
• Several variations, concerning how group of experts interact, come to consensus or not…
  – “Delphi” method, named after mythical oracle

Expert Judgment -- Issues
• Who are the experts?
  – May be easy to identify who is not expert
  – BUT, knowledgeable insiders may have “trained incapacity”, that is, be blind to new factors
• Overconfidence
  – Distribution typically much broader than we imagine (remember class experience)
• Insensitivity to New Information
  – Information typically should cause us to change opinions more than it does

Summary of Part 1
• Available methods each have difficulties.
• These are conceptual and will not be eliminated by better math or technique!
• Such issues contribute to the fact that “forecasts are ‘always’ wrong”
• We have to be properly modest about how far analysis will take us.
• Do the analysis, but appreciate weaknesses

Part 2: Analytic Procedures
• Regression analysis
• Bayes’ Theorem
• … and Likelihood Ratios

Regression Analysis -- Concept
• Object: “best fit” of data to function Y = f(X)
• Y often called the “dependent” variable
• Xs then called the “independent” variables
• Xs then “explain” the variation in Y
• Think of Y as deformation of spring under load, X, …
  – Changes in X do not account for all deformation observed: other factors and errors

Regression Analysis – Set up
• Object: “best fit” of data to function Y = f(X)
• Assumptions about data
  – Each measurement of the independent variables (ex: Weights, $X_i$) is correct
  – Dependent variables (ex: Deflection, $Y_i$) have ‘errors’
• Model then is: Deflection = a (Weight)
  – And we recognize that measured values $Ym_i$ will differ from predicted values $Y_i$ by an ‘error’ $e_i$
  – $Ym_i = Y_i + e_i = a X_i + e_i$

Example data
[Figure: example data]

Regression Analysis – Math
• Object: find model that gives “best fit” to data
  – That is, model that minimizes total “error”
    Total Error ≡ Sum of squared errors = $\sum_i (e_i)^2$
• Why does this make sense?
• Because it is the most likely fit when the errors follow a Gaussian distribution, that is, are bell-shaped or random
• Solution Concept:
  – Optimization criterion: ∂(Total Error) / ∂(parameter) = 0
  – Solve for each parameter of model (ex: “a”)

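For the one-parameter spring model, Deflection = a (Weight), the optimization criterion can be worked out by hand: setting the derivative of Σ (Ym_i − a X_i)² with respect to a to zero gives a = Σ X_i Ym_i / Σ X_i². The sketch below applies that closed form to made-up measurements; the data values and the function names are hypothetical, not the lecture's example data.

```python
def fit_slope_through_origin(xs, ys):
    """Least-squares slope a for the model y = a * x (no intercept)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero; slope is undefined")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def sum_squared_errors(a, xs, ys):
    """Total error: sum of squared differences between measured and predicted."""
    return sum((y - a * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical lab readings: weights applied and deflections measured.
weights = [1, 2, 3, 4, 5]
deflections = [2.1, 3.9, 6.2, 7.8, 10.1]

a = fit_slope_through_origin(weights, deflections)
print(f"best-fit a = {a:.3f}")
print(f"total squared error = {sum_squared_errors(a, weights, deflections):.4f}")
```

Any other value of a gives a larger total squared error on the same data, which is what "best fit" means here.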
Regression Analysis – Graph
[Figure: regression graph, from Excel using LINEST formula]

Regression Analysis - Practice
A pause for demonstration of Excel analysis you can use

Note: Lab work versus Real World
• Object: “best fit” of data to function Y = f(X)
• If, as in lab, you control all factors and vary a few, then you can be clear on independent and dependent variables and show causality
• Example: weights on spring cause deflection
• Thus you can write equation:
  Deflection = a(Weight) + measurement errors
• Is this true in “real world”, outside lab?

Real World
• It is known that there is a close correlation, a good statistical fit, between number of firemen at a fire and amount of fire damage:
  Fire damage = f (number of firemen)
• What does it say about effect of firemen?
• Nothing much! In this case, correlation is “spurious”:
  Size of fire drives both Damage and Number of Firemen, so the two are also related to each other

Consider Effect of Different Periods
• The data are all for the price of Google stock (GOOG) starting in August 2004
• First, the results for 3 years ending Sept 2007
• Next, 3 years ending Sept 2008: DIFFERENT!
• How about 4 years? YET ANOTHER!
• Results depend on set of data!!!

Best Fit GOOG
[Figure: original data and best fit, r = 3.54% / month]

Best Fit GOOG
[Figure: original data and best fit, r = 1.43% / month]

Best Fits GOOG
[Figure: original data; best fit, r = 2.79% / month; best fit 04 – 07, r = 3.54% / month]

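The dependence on the chosen period is easy to reproduce. The sketch below fits Y = a e^(rt) by ordinary least squares on ln(Y), which is a common linearization and an assumption here, since the slides do not say how the GOOG fits were computed. The price series is synthetic (steady growth followed by a decline), not GOOG data, and all names are hypothetical.

```python
import math


def fit_growth_rate(ts, ys):
    """Fit Y = a * exp(r * t) by least squares on ln(Y); return r."""
    logs = [math.log(y) for y in ys]
    n = len(ts)
    mt, ml = sum(ts) / n, sum(logs) / n
    stt = sum((t - mt) ** 2 for t in ts)
    stl = sum((t - mt) * (l - ml) for t, l in zip(ts, logs))
    return stl / stt


# Synthetic monthly prices: continuous growth of 0.04 per month for
# 36 months, then a decline of 0.03 per month for 12 months.
months = list(range(49))
prices = [
    100 * math.exp(0.04 * t if t <= 36 else 0.04 * 36 - 0.03 * (t - 36))
    for t in months
]

r_first3 = fit_growth_rate(months[:37], prices[:37])  # months 0..36
r_last3 = fit_growth_rate(months[12:], prices[12:])   # months 12..48
r_all4 = fit_growth_rate(months, prices)              # months 0..48

for label, r in [("first 3 years", r_first3), ("last 3 years", r_last3), ("all 4 years", r_all4)]:
    print(f"{label}: r = {100 * r:.2f}% / month")
```

Three windows over the same series give three different growth rates, with the full-period estimate falling between the other two.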
Revision of Estimates – Bayes Theorem
• Definitions
  – P(E): Prior Probability of Event E
  – P(E/O): Posterior P(E), after observation O is made. This is the goal of the analysis.
  – P(O/E): Conditional probability that O is associated with E
  – P(O): Probability of Event (Observation) O
• Theorem: P(E/O) = P(E) {P(O/E) / P(O)}
• Note: Importance of revision depends on:
  – rarity of observation O
  – extremes of P(O/E)

Application of Bayes Theorem
• At a certain educational establishment:
  P(students) = 2/3    P(staff) = 1/3
  P(fem/students) = 1/4    P(fem/staff) = 1/2
• What is the probability that a woman on campus is a student? {i.e., what is P(student/fem)?}
  P(student/fem) = P(student) {P(fem/student) / P(fem)}
• Here P(fem) = P(student) P(fem/student) + P(staff) P(fem/staff) = (2/3)(1/4) + (1/3)(1/2) = 1/3
• Thus: P(student/fem) = (2/3) {(1/4) / (1/3)} = 1/2

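The campus example can be checked with exact fractions. This is a minimal sketch; the function name `posterior` and the dictionary layout are illustrative choices, and P(O) is computed from the total probability over the two groups.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Bayes Theorem over a set of mutually exclusive hypotheses.

    prior[h] is P(h); likelihood[h] is P(O/h). Returns P(h/O) for each h.
    """
    p_obs = sum(prior[h] * likelihood[h] for h in prior)  # P(O)
    return {h: prior[h] * likelihood[h] / p_obs for h in prior}


prior = {"student": Fraction(2, 3), "staff": Fraction(1, 3)}
p_fem_given = {"student": Fraction(1, 4), "staff": Fraction(1, 2)}

result = posterior(prior, p_fem_given)
print(result["student"])
```

Because students are twice as numerous but half as likely to be women, the two effects cancel and a woman on campus is equally likely to be a student or staff.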
Likelihood Ratios, LR
• Definitions
  – $P(\bar{E})$ = P(E does not occur), so $P(E) + P(\bar{E}) = 1.0$
  – $LR = P(E) / P(\bar{E})$; therefore $P(E) = LR / (1 + LR)$
  – $LR_i$ = LR after i observations

Likelihood Ratios (2)
• Formulas
  – After a single observation $O_j$:
    $LR_1 = \dfrac{P(E)\,\{P(O_j/E) / P(O_j)\}}{P(\bar{E})\,\{P(O_j/\bar{E}) / P(O_j)\}}$
  – The conditional likelihood ratio for $O_j$:
    $CLR_j = P(O_j/E) / P(O_j/\bar{E})$
  – After N observations:
    $LR_N = LR_0 \prod_j (CLR_j)^{N_j}$, where $N_j$ = number of observations of type $O_j$

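The step from the single-observation formula to the product form can be written out as follows. This is a sketch of the reasoning, and it assumes that successive observations are independent of one another given E (and given not-E), which the slides do not state explicitly.

```latex
% Apply Bayes Theorem to E and to \bar{E}; P(O_j) cancels in the ratio.
\begin{align*}
LR_1 &= \frac{P(E/O_j)}{P(\bar{E}/O_j)}
      = \frac{P(E)\,P(O_j/E)/P(O_j)}{P(\bar{E})\,P(O_j/\bar{E})/P(O_j)}
      = \frac{P(E)}{P(\bar{E})}\cdot\frac{P(O_j/E)}{P(O_j/\bar{E})}
      = LR_0 \cdot CLR_j \\
% Each further (conditionally independent) observation multiplies in its own CLR.
LR_N &= LR_0 \prod_j (CLR_j)^{N_j}
\end{align*}
```

The posterior probability is then recovered from the ratio with P(E) = LR / (1 + LR).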
Application of LR (1)
• Bottle-making machines can be either OK or defective
  P(D) = 0.1
• The frequency of cracked bottles depends upon the state of the machine
  P(C/D) = 0.2
  P(C/OK) = 0.05

Application of LR (2)
Picking up 5 bottles at random from a machine, we find {2 cracked, 3 uncracked}
What is the Prob (machine defective)?
$LR_0 = P(D) / P(OK) = 0.1/0.9 = 1/9$
$CLR_C = 0.2/0.05 = 4$
$CLR_{UC} = 0.8/0.95 = 16/19$
$LR_5 = (1/9)\,(4)^2\,(16/19)^3 = 1.06$
$P(D/\{2C, 3UC\}) = 1.06 / (1 + 1.06) = 0.51$

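The bottle calculation can be reproduced with a short function. This is a sketch under the same assumption the likelihood-ratio product relies on, namely that the bottles are independent draws given the machine state; the function name `update_likelihood_ratio` and its argument layout are illustrative.

```python
def update_likelihood_ratio(prior, observations):
    """Return (LR_N, posterior) after a set of observations.

    prior: P(E) before any observation.
    observations: list of (CLR_j, N_j) pairs, one per observation type.
    """
    lr = prior / (1 - prior)          # LR_0 = P(E) / P(not E)
    for clr, count in observations:
        lr *= clr ** count            # multiply in CLR_j, N_j times
    return lr, lr / (1 + lr)          # P(E) = LR / (1 + LR)


p_defective = 0.1
clr_cracked = 0.2 / 0.05      # P(C/D) / P(C/OK)
clr_uncracked = 0.8 / 0.95    # P(UC/D) / P(UC/OK)

lr5, p_post = update_likelihood_ratio(
    p_defective, [(clr_cracked, 2), (clr_uncracked, 3)]
)
print(f"LR_5 = {lr5:.3f}, P(defective | 2 cracked, 3 uncracked) = {p_post:.3f}")
```

Two cracked bottles out of five raise the probability of a defective machine from 0.1 to about 0.515, since each cracked bottle multiplies the odds by 4 while each uncracked one reduces them only slightly.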
Take-aways from today
• Many ways to try to estimate uncertainties
• None are clear winners, each has its own use
• Statistical analysis may be best, a good way to make sense out of available data
• HOWEVER, each method has its difficulties
• Statistical analysis is a conceptual “house of cards”
• Above all: Correlation does not prove Cause
• Bayes Theorem: more on this later