Uncertain Judgements: Eliciting Experts’ Probabilities, Anthony O’Hagan et al. 2006. Review by Samu Mäntyniemi.

Contents
- Fundamentals of probability and judgement
- The elicitation context
- The psychology of judgement under uncertainty
- The elicitation of probabilities
- Eliciting distributions – general
- Eliciting and fitting a parametric distribution
- Eliciting distributions – uncertainty and imprecision
- Evaluating elicitation
- Multiple experts
- Published examples
- Guidance on best practice

Fundamentals of probability and judgement
- Aleatory and epistemic uncertainty (see the small simulation sketch below):
  - Aleatory: randomness of the system itself; e.g. coin tossing, “natural” variability of population size
  - Epistemic: uncertainty about a fixed quantity; e.g. the distance between two cities, or the amount of aleatory uncertainty
- “Probability does not exist”: there is no universally “correct” probability
- Subjective probabilities are typically not pre-formed; they are constructed in response to elicitation
- Compared with observed events, experts may predict poorly or very well
  - But this is not the point: the point is to formalise what the experts think
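To make the distinction concrete, here is a small two-level Monte Carlo sketch of my own (the survival-probability numbers are made up, not from the book): epistemic uncertainty about a fixed survival probability, plus the aleatory variation that would remain even if that probability were known exactly.

```python
# Illustrative sketch: separating epistemic and aleatory uncertainty
# in a two-level Monte Carlo simulation (all numbers are hypothetical).
import numpy as np

rng = np.random.default_rng(1)

# Epistemic: we do not know the true survival probability p of a cohort.
# An assumed expert belief about p is represented by a Beta distribution.
p_draws = rng.beta(a=8, b=12, size=10_000)   # uncertainty about a fixed quantity

# Aleatory: even if p were known exactly, the number of survivors out of
# 100 individuals would still vary from cohort to cohort.
survivors = rng.binomial(n=100, p=p_draws)   # randomness of the system itself

# Epistemic-only spread (uncertainty about the expected count) vs total spread.
print("SD of 100*p (epistemic only):       ", np.std(100 * p_draws).round(2))
print("SD of survivors (epistemic+aleatory):", np.std(survivors).round(2))
```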

The elicitation context
Best practice: a face-to-face interview, in five stages:
1. Background and preparation
   - Identify the model -> identify the variables to elicit
2. Identify and recruit experts
   - Evidence of expertise, willingness, no stake in the findings
3. Motivating and training the experts
   - What will be done with the results? Basic knowledge of probability
4. Structuring and decomposition
   - Structure of the model; review the evidence the experts will use
5. The elicitation itself
   - Ask questions that yield summaries of a distribution, fit a distribution, check the result with the expert

The elicitation context: different roles
1. The decision maker (a person or a group)
   - The user of the results – the client
2. The substantive expert
   - The person whose knowledge is to be expressed
3. The statistician
   - Expert in the methodology
4. The facilitator
   - Expert in the process of elicitation; manages the dialogue with the expert

The psychology of judgement under uncertainty
- Humans do not usually act as rational agents
  - “Rational” in the sense of probability and decision theory
  - They rely on easily adopted strategies (heuristics)
- Substantive expertise does not guarantee expertise in probability assessment
- Several kinds of bias may arise from the way the questions are framed
  - Bias? At the least, the probability assessment comes out differently
  - For example, probabilities for cause of death framed as
    - “Heart attack” or “something else”, versus
    - “Heart attack” or “accident, cancer or something else”
- “Bias” can potentially be reduced, but care needs to be taken

The elicitation of probabilities
- What is mathematically equivalent may not be psychologically equivalent
- Experts can be taught to make probability assessments
- The interpretation of verbal expressions is highly variable and context specific
- An analogy to frequency is often useful, but care is needed
- Some methods:
  - Direct estimation: ask for the probability
  - Response scales: e.g. a line with “impossible” and “certain” at its ends
  - Probability wheels or pie charts
  - Bets: would you rather bet on event A, or on getting “heads” in a toss of a fair coin?
  - Distribute a pile of objects (chips) into bins (see the sketch below)
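As a concrete illustration of the last method, here is a small sketch of my own (the bins and chip counts are made up): the expert's chip allocation is read as a discrete probability distribution and summarised.

```python
# A minimal sketch of the "chips in bins" (roulette) method: the expert
# allocates chips to bins, and the allocation is read as a discrete
# probability distribution. Bin edges and chip counts are hypothetical.
import numpy as np

bin_edges = np.array([0, 10, 20, 30, 40, 50, 60])   # hypothetical quantity of interest
chips     = np.array([1, 4, 8, 5, 2, 0])            # expert's allocation of 20 chips

probs = chips / chips.sum()                          # probability per bin
mids  = (bin_edges[:-1] + bin_edges[1:]) / 2         # bin midpoints

mean = np.sum(probs * mids)
cdf  = np.cumsum(probs)
median_bin = np.searchsorted(cdf, 0.5)               # first bin where cumulative probability reaches 0.5

print("Implied probabilities:", probs.round(2))
print("Implied mean:", mean.round(1))
print(f"Median lies in bin [{bin_edges[median_bin]}, {bin_edges[median_bin + 1]})")
```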

Eliciting distributions – General
- In practice: elicit a small number of summaries and fit a distribution to them
- There is no “true” distribution; the fitted distribution tries to be a good representation of the expert's opinion
- Feedback is important (see the sketch below)
  - The fitted distribution implies things the expert did not say
  - Revise the distribution until the expert accepts it
- Univariate distributions
  - Summaries based on probabilities, quantiles and credible intervals
  - Intervals with high probability content (>0.9) are typically poorly assessed (based on calibration studies)
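A minimal sketch of my own of the fit-and-feedback idea, assuming a normal distribution and made-up elicited summaries: fit the distribution to the stated median and quartiles, then report back implied statements the expert never made, for them to accept or revise.

```python
# Fit a normal distribution to an elicited median and quartiles, then feed
# back implied summaries for the expert to check. All numbers are hypothetical.
from scipy import stats

median, q25, q75 = 120.0, 105.0, 135.0   # hypothetical elicited summaries

mu = median                               # for a normal, median = mean
sigma = (q75 - q25) / (stats.norm.ppf(0.75) - stats.norm.ppf(0.25))   # IQR / 1.349

fitted = stats.norm(loc=mu, scale=sigma)

# Feedback: statements the expert did not make directly.
lo, hi = fitted.ppf([0.05, 0.95])
print(f"Implied 90% interval: ({lo:.0f}, {hi:.0f})")
print(f"Implied P(X > 160) = {fitted.sf(160):.3f}")
# If the expert finds these implausible, revise the summaries or the family.
```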

Eliciting and fitting a parametric distribution I
- The choice of distribution is to some extent a question of mathematical convenience
  - The more complex the model, the more important this is
- “Overfitting” is recommended: ask for more summaries than are needed to identify the distribution (see the sketch below)
- The bisection method:
  1. What is the median?
  2. Suppose new information told you the true value is below the median: what would the median of your new distribution be? -> 25% quantile
  3. Suppose new information told you the true value is above the median: what would the median of your new distribution be? -> 75% quantile
  - Could be continued further
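A minimal sketch of the “overfitting” idea, with made-up numbers and an assumed gamma family: elicit more quantiles than the two parameters require, fit by least squares, and use the leftover discrepancies to spot judgements worth revisiting with the expert.

```python
# "Overfitting": fit a two-parameter gamma distribution to five elicited
# quantiles by least squares; residuals flag apparent inconsistencies.
import numpy as np
from scipy import stats, optimize

probs    = np.array([0.05, 0.25, 0.50, 0.75, 0.95])     # cumulative probabilities asked about
elicited = np.array([40.0, 70.0, 95.0, 130.0, 210.0])   # hypothetical elicited quantiles

def loss(params):
    shape, scale = np.exp(params)                        # keep both parameters positive
    return np.sum((stats.gamma.ppf(probs, a=shape, scale=scale) - elicited) ** 2)

res = optimize.minimize(loss, x0=np.log([4.0, 25.0]), method="Nelder-Mead")
shape, scale = np.exp(res.x)

fitted_q = stats.gamma.ppf(probs, a=shape, scale=scale)
print("Fitted gamma: shape=%.2f scale=%.1f" % (shape, scale))
print("Elicited:", elicited)
print("Fitted:  ", fitted_q.round(1))
# Large discrepancies at particular quantiles suggest revisiting those judgements.
```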

Eliciting and fitting a parametric distribution II
- Interactive computing is almost essential
  - New questions based on earlier answers, and immediate feedback
  - Identify apparent inconsistencies and correct them
  - Interactive graphics
- Many published elicitation methods are devised theoretically and have never been used in practice

Eliciting distributions – uncertainty and imprecision
- The experts' stated probabilities are proxies for their actual beliefs
- The fitted distribution is in turn a proxy for those probabilities
- What to do with this uncertainty?
  - Upper and lower bounds could be used in a sensitivity analysis (see the sketch below)
  - A probabilistic analysis of the probabilities…? A conceptual swamp?
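One way the bound-based sensitivity analysis could look in practice; the distributions and the downstream quantity below are purely illustrative assumptions of mine, not from the book.

```python
# Bound-based sensitivity analysis: repeat a downstream calculation under an
# optimistic and a pessimistic reading of the elicited distribution and see
# whether the conclusion changes. All distributions are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

candidates = {
    "lower":   stats.beta(a=3, b=17),   # the expert might really mean something this low
    "central": stats.beta(a=5, b=15),   # the fitted distribution
    "upper":   stats.beta(a=8, b=12),   # ... or something this high
}

for label, prior in candidates.items():
    m = prior.rvs(size=20_000, random_state=rng)   # mortality rate draws
    pop_next_year = 1000 * (1 - m)                 # toy downstream quantity
    p_below_800 = np.mean(pop_next_year < 800)
    print(f"{label:7s}: P(population < 800) = {p_below_800:.2f}")
```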

Evaluating elicitation
- Was the elicitation successful?
  - It should describe what the expert thinks, which is difficult to assess
- In some cases the true value is eventually revealed and the stated probabilities can be compared with it, e.g. in weather forecasting
  - There are many scoring methods for such comparisons (a Brier-score sketch follows below)
- Poor calibration against the true values can mean:
  - Poor expert knowledge
  - Poor elicitation of the expert's knowledge
  - Or both
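A small sketch of one such comparison, using the Brier score (one standard score among the many methods available); the forecasts and outcomes are made up.

```python
# Scoring probability forecasts against later-revealed binary outcomes
# with the Brier score, plus a coarse calibration check.
import numpy as np

forecasts = np.array([0.9, 0.7, 0.2, 0.8, 0.1, 0.6])   # expert's P(event happens)
outcomes  = np.array([1,   1,   0,   0,   0,   1  ])   # what actually happened

brier = np.mean((forecasts - outcomes) ** 2)            # 0 is perfect; always saying 0.5 scores 0.25
print(f"Brier score: {brier:.3f}")

# Coarse calibration check: among events the expert called likely (p >= 0.6),
# how often did the event actually occur?
likely = forecasts >= 0.6
print(f"Stated mean probability {forecasts[likely].mean():.2f} "
      f"vs observed frequency {outcomes[likely].mean():.2f} for the 'likely' forecasts")
```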

Multiple experts
- Averaging across experts is seen as the simplest and most robust way of combining expert knowledge
- Cooke's method: a weighted average of the experts (see the sketch below)
  - Better experts get a higher weight
  - Test (seed) questions from the same subject area; weights are determined by performance on those
  - The WGBAST method seems to be an extension of Cooke's method
- “Problem”: similar experts may result in too much weight on the view that they share
- Group elicitation might have even greater potential
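A sketch in the spirit of Cooke's method: a performance-weighted linear opinion pool. The simplification is mine (the full classical model combines calibration and information scores), and all distributions and scores below are made up.

```python
# Performance-weighted linear opinion pool (a simplified stand-in for
# Cooke's classical model; everything here is hypothetical).
import numpy as np
from scipy import stats

# Each expert's fitted distribution for the quantity of interest.
experts = [stats.norm(100, 15), stats.norm(120, 10), stats.norm(90, 25)]

# Performance on seed questions, summarised here as Brier scores (lower = better).
brier = np.array([0.10, 0.22, 0.15])
weights = (1 / brier) / np.sum(1 / brier)    # one crude way to turn scores into weights

# Linear pool: the combined density is the weighted sum of the experts' densities.
def pooled_pdf(x):
    return sum(w * e.pdf(x) for w, e in zip(weights, experts))

# The mean of the mixture is the weighted sum of the experts' means.
pooled_mean = sum(w * e.mean() for w, e in zip(weights, experts))

print("Weights:", weights.round(2))
print("Pooled mean: %.1f" % pooled_mean)
print("Pooled density at 110: %.4f" % pooled_pdf(110.0))
```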

Summary
- A good introduction to the state of the art in elicitation
- What about elicitation of model structure?
- A lot of space is devoted to calibration, even though the sensibility of calibration is itself questioned?
- The state of the art is a rather messy field; new methods and theories are likely to evolve
- FEM could take a leading role in actively using and developing elicitation procedures