Slide 1: Simulation and Uncertainty
Tony O'Hagan, University of Sheffield
I-Sim Workshop, Fontainebleau, 6 July 2007
Slide 2: Outline
- Uncertainty
- Example: bovine tuberculosis
- Uncertainty analysis
- Elicitation
- Case study 1: inhibiting platelet aggregation
- Propagating uncertainty
- Case study 2: cost-effectiveness
- Conclusions
Slide 3: Two kinds of uncertainty
- Aleatory (randomness): the number of heads in 10 tosses of a fair coin; the mean of a sample of 25 from a N(0,1) distribution
- Epistemic (lack of knowledge): the atomic weight of ruthenium; the number of deaths at Agincourt
- Often both arise together: the number of patients who respond to a drug in a trial; the mean height of a sample of 25 men in Fontainebleau
Slide 4: Two kinds of probability
- Frequency probability: long-run frequency in many repetitions; appropriate only for purely aleatory uncertainty
- Subjective (or personal) probability: degree of belief; appropriate for aleatory, epistemic and mixed uncertainties
- Consider, for instance, the probability that the next president of the USA is a Republican
Slide 5: Uncertainty and statistics
- Data are random: repeatable
- Parameters are uncertain but not random: unique
- Uncertainty in data is mixed, but aleatory if we condition on (fix) the parameters, e.g. the likelihood function
- Uncertainty in parameters is epistemic: if we condition on the data, nothing aleatory remains
Slide 6: Two kinds of statistics
- Frequentist: based on frequency probability; confidence intervals, significance tests, etc.; inferences valid only under long-run repetition; does not make probability statements about parameters
- Bayesian: based on personal probability; inferences conditional on the actual data obtained; makes probability statements about parameters
Slide 7: Example: bovine tuberculosis
- Consider a model for the spread of tuberculosis (TB) in cows
- In the UK, TB is spread primarily by badgers
- The model is built to assess the reduction in TB in cows if local culling (i.e. killing) of badgers is introduced
Slide 8: How the model might look
Simulation model components:
- Location of badger setts, litter size and fecundity
- Spread of badgers
- Rates of transmission of disease
- Success rate of culling
Slide 9: Uncertainty in the TB model
- Simulation: replicate runs give different outcomes (aleatory)
- Parameter uncertainty: e.g. mean (and distribution of) litter size, dispersal range, transmission rates (epistemic)
- Structural uncertainty: alternative modelling assumptions (epistemic)
- Interest is in properties of the simulation distribution, e.g. the probability of reducing bovine TB incidence below a threshold (with optimal culling)
- All are functions of the parameters and the model structure
Slide 10: General structure
- Uncertain model parameters (and structure) X, with known distribution; true value X_T
- Object of interest Y_T = Y(X_T), possibly optimised over control parameters
- Model output Z(X), related to Y(X), e.g. Z(X) = Y(X) + error (toy sketch below); we can run the model for any X
- Uncertainty about Y_T comes from two sources: we don't know X_T (epistemic); even if we knew X_T, we can only observe Z(X_T) (aleatory)
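To make the notation concrete, here is a minimal toy sketch in Python; the function Y, the noise level and the input distribution are all invented for illustration. The simulator is meant to compute Y(X), each run only returns a noisy Z(X), and on top of that we are unsure which input X_T is the true one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy model: Y(x) is the quantity the simulator is meant to compute.
def Y(x):
    return np.exp(-x) * np.sin(3 * x)       # "true" output, unknown in practice

# Each run returns Z(x) = Y(x) + aleatory simulation noise.
def Z(x, n_rep=1):
    return Y(x) + rng.normal(0.0, 0.1, size=n_rep)

# Epistemic uncertainty: the true input x_T is unknown; we only have a
# (here invented) distribution describing our beliefs about it.
x_draws = rng.normal(0.5, 0.2, size=5)
print([Z(x, n_rep=3).mean() for x in x_draws])
```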
Slide 11: Uncertainty analysis
- Find the distribution of Y_T
- Challenges: specifying the distribution of X; computing Z(X); identifying the distribution of Z(X) given Y(X); propagating uncertainty in X
Slide 12: Parameter distributions
- Necessarily personal, even if we have data (e.g. a sample of badger litter sizes)
- Expert judgement generally plays a part; it may be formal or informal
- Formal elicitation of expert knowledge is a seriously non-trivial business
- There is a substantial body of literature, particularly in psychology
Slide 13: Case study 1
- A pharmaceutical company is developing a new drug to reduce platelet aggregation in patients with acute coronary syndrome (ACS)
- The primary comparator is clopidogrel
- The case study concerns elicitation of expert knowledge prior to reporting of the Phase 2a trial
- Required in order to do Bayesian clinical trial simulation
- 5 elicitation sessions with several experts over a total of about 3 days
- The analysis was revisited after the Phase 2a and 2b trials
Slide 14: Simulating SVEs
SVE = secondary vascular event; IPA = inhibition of platelet aggregation
Patient loop (sketched in code below):
- Patient enters
- Randomise to new drug / clopidogrel
- Generate mean IPA for each drug
- Generate the IPA-SVE relationship
- Generate patient IPA
- Generate whether the patient has an SVE
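A hypothetical Python sketch of this patient loop. The distributions, parameter values and risk model below are invented placeholders, not the company's actual trial simulator; the Beta(11.5, 1.2) for mean IPA on the new drug is the elicited distribution quoted on a later slide.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_trial(n_patients=1000):
    """Invented sketch of one simulated trial following the slide's patient loop."""
    # Epistemic draws: mean IPA (as a proportion) per arm, baseline SVE risk,
    # and the IPA -> SVE relationship. All numbers are placeholders.
    mean_ipa = {"new": rng.beta(11.5, 1.2), "clopidogrel": rng.beta(8.0, 3.0)}
    baseline_risk = rng.beta(2.0, 18.0)        # baseline SVE risk
    risk_slope = rng.normal(-0.8, 0.2)         # log relative risk per unit IPA

    events = {"new": 0, "clopidogrel": 0}
    counts = {"new": 0, "clopidogrel": 0}
    for _ in range(n_patients):
        arm = "new" if rng.random() < 0.5 else "clopidogrel"   # randomise
        # Patient-level IPA varies around the arm mean (aleatory).
        ipa = float(np.clip(rng.normal(mean_ipa[arm], 0.05), 0.0, 1.0))
        risk = baseline_risk * np.exp(risk_slope * ipa)        # SVE risk given IPA
        events[arm] += rng.random() < risk                     # does an SVE occur?
        counts[arm] += 1
    return {arm: events[arm] / counts[arm] for arm in events}

print(simulate_trial())
```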
Slide 15: Distributions elicited
Many distributions were actually elicited:
1. Mean IPA (efficacy on the biomarker) for each drug and dose
2. Patient-level variation in IPA around the mean
3. Relative risk of SVE conditional on individual patient IPA
4. Baseline SVE risk
5. Other things to do with side effects
Here we look only at elicitation of the distribution of mean IPA for a high dose of the new drug.
- Judgements were made at the time; knowledge now is of course quite different!
- But decisions had to be made then about the Phase 2b trial: whether to go ahead or drop the drug; size of sample, how many doses, etc.
Slide 16: Elicitation record
- Elicitation title, session, date, start time, attendance and roles
- ORIENTATION: purpose of elicitation; this record ("This document will form an auditable record of the elicitation process."); participants' expertise; nature of uncertainty; effective elicitation; strengths/weaknesses
- PRACTICE: objective: practice elicitation, eliciting knowledge about the population of Portugal; etc.
- IPA DISTRIBUTIONS: objective: to elicit a joint probability distribution for individual patient IPA; etc.
- Definitions: IPA is platelet aggregation inhibition, on a scale of 0 to 100 (i.e. 0% to 100%); etc.
- Evidence: healthy volunteer data, about 150 volunteers in total; etc.
- Structuring
- Session ended
Slide 17: Eliciting one distribution
Mean IPA (%) for the high dose:
- Range: 80 to 100
- Median: 92
- Probabilities: P(over 95) = 0.4, P(under 85) = 0.2
Chosen distribution: Beta(11.5, 1.2) (checked in the sketch below)
- Median 93
- P(over 95) = 0.36, P(under 85) = 0.20, P(under 80) = 0.11
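A quick check of the summaries implied by the chosen distribution, assuming (as the quoted probabilities suggest) that the Beta(11.5, 1.2) is placed directly on the 0-1 IPA proportion scale; scipy is used for the Beta quantiles.

```python
from scipy import stats

# Elicited distribution for mean IPA, treated as a proportion on [0, 1].
dist = stats.beta(11.5, 1.2)

print("median (%):", 100 * dist.median())   # slide quotes ~93
print("P(IPA > 95%):", dist.sf(0.95))       # slide quotes 0.36
print("P(IPA < 85%):", dist.cdf(0.85))      # slide quotes 0.20
print("P(IPA < 80%):", dist.cdf(0.80))      # slide quotes 0.11
```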
Slide 18: Propagating uncertainty
- The usual approach is Monte Carlo (sketched below)
- Randomly draw parameter sets X_i, i = 1, 2, ..., N from the distribution of X
- Run the model for each parameter set to get outputs Y_i = Y(X_i), i = 1, 2, ..., N
- Assume for now that we can do big enough runs to ignore the difference between Z(X) and Y(X)
- These outputs are a sample from the distribution of Y_T
- Use the sample to make inferences about this distribution (generally frequentist, but fundamentally epistemic)
- Impractical if computing each Y_i is computationally intensive
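A minimal Monte Carlo propagation sketch; the simulator Y and the distribution of X are invented stand-ins for a real model and an elicited input distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def Y(x):
    """Hypothetical stand-in for an expensive simulator output."""
    return np.exp(-x) * np.sin(3 * x)

# Draw N parameter sets from the (invented) distribution of X ...
N = 10_000
X = rng.normal(0.5, 0.2, size=N)

# ... and run the model for each; the outputs are a sample from the distribution of Y_T.
Ysamp = Y(X)

print("mean:", Ysamp.mean())
print("95% interval:", np.percentile(Ysamp, [2.5, 97.5]))
```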
Slide 19: Optimal balance of resources
- Consider the situation where each Z(X_i) is an average over n individuals, and Y(X_i) could be obtained by using a very large n
- The total computing effort is then N*n individuals: simulation within simulation
- Suppose the variance between individuals is v, the variance of Y(X) is w, and we are interested in E(Y(X)) and w
- Then the optimal inner sample size is approximately n = 1 + v/w (see the sketch below)
- This is of the order of 36 times more efficient than using a very large n
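A sketch of the nested ("simulation within simulation") scheme using the slide's suggested inner size n ≈ 1 + v/w. The simulator, variances and input distribution are invented, and the ANOVA-style estimator of w below is one natural way to implement the idea, not necessarily the exact estimator of the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)
v_true = 4.0     # invented between-individual variance v
w_guess = 0.05   # invented prior guess at w = Var(Y(X)), used only to choose n

def simulate_individuals(x, n):
    """Hypothetical patient-level simulator: n individual outcomes around Y(x)."""
    y = np.exp(-x) * np.sin(3 * x)      # stand-in for the unknown Y(x)
    return y + rng.normal(0.0, np.sqrt(v_true), size=n)

n = int(round(1 + v_true / w_guess))    # slide's approximately optimal inner size
N = 500                                 # outer sample of parameter sets

X = rng.normal(0.5, 0.2, size=N)        # invented distribution of X
groups = np.array([simulate_individuals(x, n) for x in X])   # N x n individuals

group_means = groups.mean(axis=1)
v_hat = groups.var(axis=1, ddof=1).mean()       # within-group (individual) variance
w_hat = group_means.var(ddof=1) - v_hat / n     # ANOVA-style estimate of Var(Y(X))

print("n per parameter set:", n)
print("E[Y(X)] estimate:", group_means.mean())
print("estimated v, w:", v_hat, w_hat)
```

The subtraction in `w_hat` removes the inflation of the between-group variance caused by averaging only n individuals per parameter set, since each group mean has variance w + v/n.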
Slide 20: Emulation
- When even this efficiency gain is not enough, or when the conditions don't hold, we may be able to propagate uncertainty through emulation
- An emulator is a statistical model/approximation for the function Y(X)
- It is trained on a set of model runs Y_i = Y(X_i) or Z_i = Z(X_i), but the X_i are not chosen randomly (inference is now Bayesian)
- It runs much faster than the original simulator
- Think neural net or response surface, but better!
Slide 21: Gaussian process
- The emulator represents Y(.) as a Gaussian process
- The prior distribution embodies only a belief that Y(X) is a smooth, continuous function of X
- Conditioning on the training set gives the posterior GP
- The posterior mean function is a fast approximation to Y(.); the posterior variance expresses the additional uncertainty
- Unlike a neural net or response surface, the GP emulator correctly encodes the training data (a sketch follows)
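A small illustration of a GP emulator, using scikit-learn's GaussianProcessRegressor as a stand-in for a dedicated emulation package; the simulator and design points are invented.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def simulator(x):
    """Hypothetical expensive simulator (invented for illustration)."""
    return np.exp(-x) * np.sin(3 * x)

# A small designed set of training runs (not random draws).
X_train = np.linspace(0.0, 2.0, 5).reshape(-1, 1)
y_train = simulator(X_train).ravel()

# Smooth GP prior; conditioning on the runs gives the posterior emulator.
gp = GaussianProcessRegressor(kernel=ConstantKernel(1.0) * RBF(length_scale=0.5),
                              normalize_y=True)
gp.fit(X_train, y_train)

# Posterior mean approximates Y(.); posterior std is the emulator's own
# uncertainty, essentially zero at the training runs and growing between them.
X_new = np.linspace(0.0, 2.0, 9).reshape(-1, 1)
mean, std = gp.predict(X_new, return_std=True)
print(np.c_[X_new.ravel(), mean, std])
```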
Slide 22: 2 code runs
(Figure: emulator for a model with one input and one output, trained on 2 runs)
- The emulator estimate interpolates the data
- Emulator uncertainty grows between data points
Slide 23: 3 code runs
(Figure: the same emulator after adding a third run)
- Adding another point changes the estimate and reduces the uncertainty
Slide 24: 5 code runs
(Figure: the emulator after 5 runs)
- And so on
Slide 25: Then what?
- Given enough training data points we can emulate any model accurately, so that the posterior variance is small "everywhere"
- Typically this can be done with orders of magnitude fewer model runs than traditional methods
- Use the emulator to make inferences about other things of interest, e.g. uncertainty analysis, calibration, optimisation (an uncertainty-analysis sketch follows)
- Conceptually very straightforward in the Bayesian framework, but of course it can be computationally hard
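One way to use the emulator for uncertainty analysis, continuing the same invented example: push a large Monte Carlo sample of X through the cheap emulator and fold in the emulator's own posterior variance via the law of total variance.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def simulator(x):
    return np.exp(-x) * np.sin(3 * x)        # hypothetical expensive model

rng = np.random.default_rng(0)

# Train the emulator on a handful of designed runs.
X_train = np.linspace(0.0, 2.0, 8).reshape(-1, 1)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
gp.fit(X_train, simulator(X_train).ravel())

# Propagate an (invented) input distribution through the emulator.
X_samples = rng.normal(0.5, 0.2, size=(20_000, 1))
mean, std = gp.predict(X_samples, return_std=True)

print("E[Y_T] estimate:", mean.mean())
# Total variance = spread of the emulator mean over X + average emulator (code) variance.
print("Var[Y_T] estimate:", mean.var() + (std ** 2).mean())
```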
Slide 26: Case study 2
- Clinical trial simulation coupled to an economic model: simulation within simulation
- Outer simulation of clinical trials, producing trial outcome results in the form of posterior distributions for drug efficacy, incorporating parameter uncertainty
- Inner simulation of cost-effectiveness (the NICE decision): for each trial outcome, simulate patient outcomes with those efficacy distributions (and many other uncertain parameters)
- Like the "optimal balance of resources" slide, but a complex clinical trial simulation replaces simply drawing from the distribution of X
Slide 27: Emulator solution
- 5 emulators were built: for the means and variances of (population mean) incremental costs and QALYs, and their covariance
- Together these characterised the cost-effectiveness acceptability curve (CEAC), which was basically our Y(X) (see the sketch below)
- For any given trial design and drug development protocol, we could assess the uncertainty (due to all causes) regarding whether the final Phase 3 trial would produce good enough results for the drug to be:
  1. Licensed for use
  2. Adopted as cost-effective by the UK National Health Service
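For context, a minimal sketch of how a cost-effectiveness acceptability curve is read off from simulated incremental costs and QALYs. The numbers below are invented, and in the real case study these quantities came from the five emulators rather than from direct simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical samples of population-mean incremental cost (GBP) and
# incremental QALYs for the new drug versus the comparator.
d_cost = rng.normal(1500.0, 600.0, size=10_000)
d_qaly = rng.normal(0.08, 0.05, size=10_000)

# CEAC: probability the drug is cost-effective, i.e. that the incremental net
# benefit lam * dQALY - dCost is positive, as a function of the
# willingness-to-pay threshold lam.
for lam in (10_000, 20_000, 30_000, 50_000):
    p = np.mean(lam * d_qaly - d_cost > 0)
    print(f"lambda = {lam} GBP/QALY: P(cost-effective) = {p:.2f}")
```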
Slide 28: Conclusions
- The distinction between epistemic and aleatory uncertainty is useful
- Recognising that uncertainty about the parameters of a model (and its structural assumptions) is epistemic is useful
- Expert judgement is an integral part of specifying distributions
- Uncertainty analysis of a stochastic simulation model is conceptually a nested simulation: optimal balance of sample sizes; more efficient computation using emulators
Slide 29: References
On elicitation:
- O'Hagan, A. et al. (2006). Uncertain Judgements: Eliciting Expert Probabilities. Wiley. www.shef.ac.uk/beep
On optimal resource allocation:
- O'Hagan, A., Stevenson, M.D. and Madan, J. (2007). Monte Carlo probabilistic sensitivity analysis for patient level simulation models: efficient estimation of mean and variance using ANOVA. Health Economics (in press). Download from tonyohagan.co.uk/academic
On emulators:
- O'Hagan, A. (2006). Bayesian analysis of computer code outputs: a tutorial. Reliability Engineering and System Safety 91, 1290-1300. mucm.group.shef.ac.uk