Slide 1. Decision-Principles to Justify Carnap's Updating Method and to Suggest Corrections of Probability Judgments
Peter P. Wakker, Economics Dept., Maastricht University
Nir Friedman (opening)
Slide 2. [Two word lists: "Good words" and "Bad words". Terms shown include: agent, Bayesian network, learning, elicitation, diagram, causality, utility, reasoning; and dimension, map, density, labels, player, ancestral graph, generative, dynamics, bound, filtering, iteration.]
Slide 3. "Decision theory = probability theory + utility theory."
Bayesian networkers care about probability theory. But why care about utility theory? (1) It is important for decisions. (2) It helps in studying probabilities: if you are interested in the processing of probabilities, the tools of utility theory can still be useful.
Slide 4. Outline
1. Decision Theory: Empirical Work (on Utility)
2. A New Foundation of (Static) Bayesianism
3. Carnap's Updating Method
4. Corrections of Probability Judgments Based on Empirical Findings
Slide 5. 1. Decision Theory: Empirical Work
(Hypothetical) measurement of the popularity of internet sites.
For simplicity, Assumption: we compare internet sites that differ only regarding (randomness in) waiting time.
Question: how does random waiting time affect the popularity of internet sites? Through the average waiting time?
Slide 6. A more refined procedure: take not the average waiting time, but the average of how people feel about the waiting time, i.e., the (subjectively perceived) cost of waiting time. Problem: users' subjectively perceived cost of waiting time may be nonlinear.
Slide 7. [Graph: subjectively perceived cost of waiting (vertical axis, with gridlines at 1/6, …, 5/6) against waiting time in seconds (horizontal axis, with values near 5, 7, 9, and 14).]
Slide 8. For simplicity, Assumption: the internet can be in two states only, fast or slow, with P(fast) = 2/3 and P(slow) = 1/3. How to measure the subjectively perceived cost C of waiting time?
Slide 9. Tradeoff (TO) method
Elicit a chain of indifferences between prospects (slow: x, fast: y), starting from t_0 = 0:
(slow: 25, fast: t_1) ~ (slow: 35, fast: t_0)
(slow: 25, fast: t_2) ~ (slow: 35, fast: t_1)
…
(slow: 25, fast: t_6) ~ (slow: 35, fast: t_5)
Under expected cost (EC), with P(slow) = 1/3 and P(fast) = 2/3, the first indifference gives
(1/3) C(25) + (2/3) C(t_1) = (1/3) C(35) + (2/3) C(t_0),
so C(t_1) − C(t_0) = (1/2) (C(35) − C(25)). Likewise for every link in the chain:
C(t_2) − C(t_1) = … = C(t_6) − C(t_5) = C(t_1) − C(t_0) = (1/2) (C(35) − C(25)).
Slide 10. Subjective cost of waiting time
Normalize: C(t_0) = 0; C(t_6) = 1. Since all steps C(t_j) − C(t_{j−1}) are equal, consequently C(t_j) = j/6.
[Graph: the points (t_0, 0), (t_1, 1/6), …, (t_5, 5/6), (t_6, 1).]
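To make the measurement concrete, here is a minimal sketch (mine, not from the talk; the elicited times are hypothetical illustrations) of turning a standard sequence t_0, …, t_6 into the cost function C with C(t_j) = j/6:

```python
# Sketch: from an elicited standard sequence t0 < t1 < ... < t6 to the
# subjective cost C, with C(t_j) = j/6 and linear interpolation in between.
# The elicited times below are hypothetical illustrations, not data.

ELICITED = [0.0, 5.0, 7.0, 9.0, 12.0, 14.0, 17.0]  # t0, ..., t6 (seconds)

def subjective_cost(t, seq=ELICITED):
    """Piecewise-linear C normalized by C(t0) = 0, C(t6) = 1."""
    n = len(seq) - 1
    if t <= seq[0]:
        return 0.0
    if t >= seq[-1]:
        return 1.0
    for j in range(n):
        if seq[j] <= t <= seq[j + 1]:
            frac = (t - seq[j]) / (seq[j + 1] - seq[j])
            return (j + frac) / n

for j, t in enumerate(ELICITED):
    print(f"C(t{j}) = C({t}) = {subjective_cost(t):.3f}")  # equals j/6
```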
Slide 11. Tradeoff (TO) method revisited
The same indifferences, (slow: 25, fast: t_1) ~ (slow: 35, fast: t_0 = 0), etc., but now with unknown, possibly misperceived, probabilities: decision weights λ_1 for slow and λ_2 for fast. EC then gives
λ_1 (C(35) − C(25)) = λ_2 (C(t_1) − C(t_0)) = λ_2 (C(t_2) − C(t_1)) = … = λ_2 (C(t_6) − C(t_5)).
The unknown ratio λ_1/λ_2 is the same in every link, so the equalities
C(t_1) − C(t_0) = C(t_2) − C(t_1) = … = C(t_6) − C(t_5)
still follow: the TO method measures C even under unknown or misperceived probabilities.
Slide 12. Measure subjective/unknown probabilities from elicited choices
If P(slow) = p, the indifference (slow: 25, fast: t_1) ~ (slow: 35, fast: t_0) gives
p (C(35) − C(25)) = (1 − p) (C(t_1) − C(t_0)),
so
p = (C(t_1) − C(t_0)) / ((C(35) − C(25)) + (C(t_1) − C(t_0))).
Abdellaoui (2000), Bleichrodt & Pinto (2000), Management Science.
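As a worked toy example (my numbers, all hypothetical), the formula applies directly once C has been measured:

```python
# Sketch: recovering the subjective probability p = P(slow) from the
# indifference (slow: 25, fast: t1) ~ (slow: 35, fast: t0), with t0 = 0
# and t1 = 5. The measured costs below are hypothetical.

C = {0: 0.0, 5: 1 / 6, 25: 0.55, 35: 0.80}  # hypothetical measured costs

# Indifference under EC: p*(C(35) - C(25)) = (1 - p)*(C(t1) - C(t0)).
p = (C[5] - C[0]) / ((C[35] - C[25]) + (C[5] - C[0]))
print(f"P(slow) = {p:.3f}")  # 0.400 with these numbers
```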
Slide 13. What if the data are inconsistent?
Say some observations show C(t_2) − C(t_1) = C(t_1) − C(t_0), while other observations show C(t_2′) − C(t_1) = C(t_1) − C(t_0) for t_2′ > t_2. Then you have empirically falsified the EC model!
Definition. Tradeoff consistency holds if this never happens.
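A hedged sketch (my operationalization, not the talk's) of testing tradeoff consistency on repeated elicitations:

```python
# Sketch: tradeoff consistency requires repeated elicitations of the same
# standard-sequence element (e.g., t2) to agree; disagreement (t2' > t2)
# falsifies the EC model. The tolerance is an assumption.

def tradeoff_consistent(elicitations, tol=1e-6):
    """elicitations: repeated measurements [t0, t1, ...] of the same
    standard sequence; consistent iff all repetitions agree."""
    for values in zip(*elicitations):
        if max(values) - min(values) > tol:
            return False
    return True

print(tradeoff_consistent([[0, 5, 7, 9], [0, 5, 7, 9]]))    # True
print(tradeoff_consistent([[0, 5, 7, 9], [0, 5, 8.5, 9]]))  # False: t2' > t2
```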
Slide 14. Theorem. The EC model holds if and only if tradeoff consistency holds.
Descriptive application: the EC model is falsified iff tradeoff consistency is violated.
Slide 15. 2. A New Foundation of (Static) Bayesianism
Normative application: you can convince a client to use EC iff you can convince the client that tradeoff consistency is reasonable.
Slide 16. 3. Carnap's Updating Method
We examine Rudolf Carnap's (1952, 1980) ideas about the Dirichlet family of probability distributions.
Slide 17. Example. A doctor, say YOU, has to choose the treatment of a patient standing before you. The patient has exactly one ("true") disease from the set D = {d_1, …, d_s} of possible diseases. You are uncertain which disease is the true one.
Slide 18. For simplicity, Assumption: the results of treatment can be expressed in monetary terms.
Definition. Treatment (d_i:1): if the true disease is d_i, it saves $1 compared with the common treatment; otherwise, it is equally expensive.
Slide 19. Treatment (d_i:1) yields $1 if the true disease is d_i and $0 under each other disease in d_1, …, d_s. Being uncertain which disease d_j is true, you are uncertain what the outcome (money saved) of the treatment will be.
Slide 20. Assumption. When deciding on your patient, you have observed t similar patients in the past and found out their true diseases.
Notation. E = (E_1, …, E_t), where E_i describes the disease of the i-th patient.
Slide 21. Assumption. You are Bayesian; so, you maximize expected utility (EU).
Slide 22. Imagine someone, say me, gives you advice: "Given the info E, the probabilities are to be taken as follows."
Slide 23.
p_i^E = (λ/(t + λ)) p_i^0 + (t/(t + λ)) (n_i/t),
where n_i/t is the observed relative frequency of d_i in E_1, …, E_t, and λ > 0 is a subjective parameter (as are the p_i^0's).
Appealing! A natural way to integrate
- subject-matter info (p_i^0) and
- statistical information (n_i/t).
The subjective parameters disappear as t → ∞.
Alternative interpretation: combining evidence.
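A minimal sketch of the rule (assuming the reconstruction above; the prior, λ, and counts are hypothetical):

```python
# Sketch of Carnap's inductive method: p_i^E = (lam * p0_i + n_i) / (t + lam),
# a weighted mix of prior opinion p0 and observed frequencies n_i / t.
# All numbers below are hypothetical.

def carnap_update(p0, counts, lam):
    t = sum(counts)
    return [(lam * p + n) / (t + lam) for p, n in zip(p0, counts)]

p0 = [0.5, 0.3, 0.2]      # subject-matter info p_i^0
counts = [10, 70, 20]     # n_i over t = 100 observed patients

for lam in (1.0, 10.0, 1000.0):
    print(lam, [round(p, 3) for p in carnap_update(p0, counts, lam)])
# Small lam: close to the frequencies n_i/t. Large lam: close to p0.
# As t grows with lam fixed, the subjective parameters wash out.
```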
Slide 24. Appealing advice, but ad hoc!
Why not weight t² instead of t? Why not take a geometric mean? Why not let λ depend on t and n_i, and on the other n_j's? Decision theory can make things less ad hoc.
An aside. The main mathematical problem: to formulate everything in terms of the "naïve space," as Grünwald & Halpern (2002) call it.
Slide 25. Let us change the subject. Forget about the advice, for the time being.
Slide 26. (1) Wouldn't you want to satisfy positive relatedness of the observations:
(d_i:1) ~_E $x  ⟹  (d_i:1) ≽_(E,d_i) $x?
(One more observed case of d_i should not make the d_i-treatment less attractive.)
Slide 27. (2) Wouldn't you want to satisfy past-exchangeability:
(d_i:1) ~_E $x  ⟺  (d_i:1) ~_E′ $x
whenever E = (E_1, …, E_{m−1}, d_j, d_k, E_{m+2}, …, E_t) and E′ = (E_1, …, E_{m−1}, d_k, d_j, E_{m+2}, …, E_t), for some m < t and j, k?
Slide 28. [Diagram: the past observations E_1, …, E_t, with counts n_1, …, n_s of the diseases (n_i cases of d_i), and the case d_i to be predicted at time t+1; arrows mark past-exchangeability and disjoint causality (slides 29-31).]
Slide 29. (3) Wouldn't you want to satisfy future-exchangeability:
Assume $x ~_E (d_j:y) and $y ~_(E,d_j) (d_k:z); interpretation: $x ~_E (d_j and then d_k: z).
Assume $x′ ~_E (d_k:y′) and $y′ ~_(E,d_k) (d_j:z′); interpretation: $x′ ~_E (d_k and then d_j: z′).
Then x = x′ ⟹ z = z′. Interpretation: [d_j then d_k] is as likely as [d_k then d_j], given E.
Slide 30. (4) Wouldn't you want to satisfy disjoint causality: for all E and distinct i, j, k,
(d_i:1) ~_(E,d_j) $x  ⟺  (d_i:1) ~_(E,d_k) $x?
A violation: bad nutrition as a common cause of d_1 and d_2, with a different cause for d_3 (see the diagram on slide 28).
Slide 31. Theorem. Assume s ≥ 3. The following are equivalent:
(i) (a) tradeoff consistency; (b) positive relatedness of observations; (c) exchangeability (past and future); (d) disjoint causality.
(ii) EU holds for each E with a fixed U, and the probabilities follow Carnap's inductive method:
p_i^E = (λ/(t + λ)) p_i^0 + (t/(t + λ)) (n_i/t).
A decision-theoretic surprise!
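Slide 16 mentioned the Dirichlet family; as a hedged numeric cross-check (my framing, not the talk's), Carnap's formula coincides with the posterior mean of a Dirichlet prior with parameters α_i = λ·p_i^0:

```python
# Numeric cross-check (my framing): Carnap's rule equals the posterior mean
# of a Dirichlet prior with alpha_i = lam * p0_i after observing counts n_i.

def carnap(p0, counts, lam):
    t = sum(counts)
    return [(lam * p + n) / (t + lam) for p, n in zip(p0, counts)]

def dirichlet_posterior_mean(alpha, counts):
    total = sum(alpha) + sum(counts)
    return [(a + n) / total for a, n in zip(alpha, counts)]

p0, lam, counts = [0.5, 0.3, 0.2], 4.0, [3, 1, 6]
print(carnap(p0, counts, lam))
print(dirichlet_posterior_mean([lam * p for p in p0], counts))  # identical
```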
Slide 32. 4. Corrections of Probability Judgments Based on Empirical Findings
Abdellaoui (2000), Bleichrodt & Pinto (2000) (and many others): subjective probabilities are nonadditive. Assume the simple model: (A:x) is evaluated by W(A)U(x), with U(0) = 0; W is nonadditive and may be a Dempster-Shafer belief function. Only nonnegative outcomes.
Slide 33. Tversky & Fox (1995): two-stage model, W = w ∘ φ.
φ: direct psychological judgment of probability;
w: turns judgments of probability into decision weights.
w can be measured from cases where objective probabilities are known.
Slide 34. Economists/AI: w is convex. This enhances superadditivity:
W(A ∪ B) ≥ W(A) + W(B) for disjoint A, B
(e.g., Dempster-Shafer belief functions).
Slide 35. Psychologists: [Graph of w against p; the empirically found shape is inverse-S rather than convex.]
Slide 36. For moderate p and q: w(p + q) ≤ w(p) + w(q) (subadditivity). The w component thus enhances subadditivity of W,
W(A ∪ B) ≤ W(A) + W(B) for disjoint events A, B,
contrary to the common assumptions about belief functions above.
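A quick numeric illustration (mine, using the Tversky & Kahneman 1992 parametric form as an assumed stand-in for an empirical w):

```python
# Illustration with an assumed parametric inverse-S weighting function
# (Tversky & Kahneman 1992, gamma = 0.61): for moderate p, q it is
# subadditive, w(p + q) <= w(p) + w(q).

GAMMA = 0.61

def w(p, g=GAMMA):
    return p**g / (p**g + (1 - p)**g) ** (1 / g)

for p, q in [(0.2, 0.3), (0.3, 0.3), (0.25, 0.4)]:
    print(f"w({p + q:.2f}) = {w(p + q):.3f} <= "
          f"w({p}) + w({q}) = {w(p) + w(q):.3f}")
```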
Slide 37. φ = w⁻¹ ∘ W: a behavioral derivation of the expert's judgment.
Tversky & Fox (1995): more nonlinearity in φ than in w; φ's and W's deviations from linearity are of the same nature as Figure 3.
Tversky & Wakker (1995): formal definitions.
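A sketch of the inversion step (my illustration; the parametric w and the elicited W(A) are hypothetical stand-ins, since in practice w is measured from known-probability choices):

```python
# Sketch: phi = w^{-1}(W) in the two-stage model W = w o phi. The parametric
# w (Tversky & Kahneman 1992, gamma = 0.61) and the elicited weight W(A)
# are hypothetical stand-ins for measured quantities.

from scipy.optimize import brentq

GAMMA = 0.61

def w(p, g=GAMMA):
    return p**g / (p**g + (1 - p)**g) ** (1 / g)

def w_inverse(y, g=GAMMA):
    """w is strictly increasing on [0, 1], so invert by root-finding."""
    return brentq(lambda p: w(p, g) - y, 0.0, 1.0)

W_A = 0.42  # decision weight of event A, elicited from choices (hypothetical)
print(f"judged probability phi(A) = {w_inverse(W_A):.3f}")
```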
Slide 38. Non-Bayesians: alternatives to the Dempster-Shafer belief functions; no degeneracy after multiple updating. Figure 3 for φ and W: lack of sensitivity towards varying degrees of uncertainty. Fig. 3 reflects absence of information better than convexity does.
Slide 39. Fig. 3 comes from data and suggests new concepts, e.g., info-sensitivity instead of conservativeness/pessimism. Bayesians: Fig. 3 suggests how to correct expert judgments.
Slide 40. Support theory (Tversky & Koehler 1994). Typical finding: for disjoint A_j,
φ(A_1) + … + φ(A_n) − φ(A_1 ∪ … ∪ A_n)
increases as n increases.