Slide 1. Decision-Principles to Justify Carnap's Updating Method and to Suggest Corrections of Probability Judgments
Peter P. Wakker, Economics Dept., Maastricht University
Nir Friedman (opening)
Slide 2. [Two word lists: "Good words" and "Bad words". Terms shown include: agent, Bayesian network, learning, elicitation, diagram, causality, utility, reasoning; and dimension, map, density, labels, player, ancestral graph, generative, dynamics, bound, filtering, iteration.]
Slide 3. "Decision theory = probability theory + utility theory."
Bayesian networkers care about probability theory. But why care about utility theory? (1) It is important for decisions. (2) It helps in studying probabilities: if you are interested in the processing of probabilities, the tools of utility theory can still be useful.
Slide 4. Outline
1. Decision Theory: Empirical Work (on Utility)
2. A New Foundation of (Static) Bayesianism
3. Carnap's Updating Method
4. Corrections of Probability Judgments Based on Empirical Findings
Slide 5. 1. Decision Theory: Empirical Work
(Hypothetical) measurement of the popularity of internet sites.
For simplicity, Assumption: we compare internet sites that differ only regarding (randomness in) waiting time.
Question: how does random waiting time affect the popularity of internet sites? Through the average waiting time?
Slide 6. A more refined procedure: take not the average waiting time, but the average of how people feel about the waiting time, i.e., the (subjectively perceived) cost of waiting time. Problem: users' subjectively perceived cost of waiting time may be nonlinear.
Slide 7. [Graph: subjectively perceived cost of waiting (vertical axis, with gridlines at 1/6, …, 5/6) against waiting time in seconds (horizontal axis, with values near 5, 7, 9, and 14).]
Slide 8. For simplicity, Assumption: the internet can be in two states only, fast or slow, with P(fast) = 2/3 and P(slow) = 1/3. How to measure the subjectively perceived cost C of waiting time?
Slide 9. Tradeoff (TO) method
Elicit a chain of indifferences between prospects (slow: x, fast: y), starting from t_0 = 0:
(slow: 25, fast: t_1) ~ (slow: 35, fast: t_0)
(slow: 25, fast: t_2) ~ (slow: 35, fast: t_1)
…
(slow: 25, fast: t_6) ~ (slow: 35, fast: t_5)
Under expected cost (EC), with P(slow) = 1/3 and P(fast) = 2/3, the first indifference gives
(1/3) C(25) + (2/3) C(t_1) = (1/3) C(35) + (2/3) C(t_0),
so C(t_1) − C(t_0) = (1/2) (C(35) − C(25)). Likewise for every link in the chain:
C(t_2) − C(t_1) = … = C(t_6) − C(t_5) = C(t_1) − C(t_0) = (1/2) (C(35) − C(25)).
Slide 10. Subjective cost of waiting time
Normalize: C(t_0) = 0; C(t_6) = 1. Since all steps C(t_j) − C(t_{j−1}) are equal, consequently C(t_j) = j/6.
[Graph: the points (t_0, 0), (t_1, 1/6), …, (t_5, 5/6), (t_6, 1).]
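To make the measurement concrete, here is a minimal sketch (mine, not from the talk; the elicited times are hypothetical illustrations) of turning a standard sequence t_0, …, t_6 into the cost function C with C(t_j) = j/6:

```python
# Sketch: from an elicited standard sequence t0 < t1 < ... < t6 to the
# subjective cost C, with C(t_j) = j/6 and linear interpolation in between.
# The elicited times below are hypothetical illustrations, not data.

ELICITED = [0.0, 5.0, 7.0, 9.0, 12.0, 14.0, 17.0]  # t0, ..., t6 (seconds)

def subjective_cost(t, seq=ELICITED):
    """Piecewise-linear C normalized by C(t0) = 0, C(t6) = 1."""
    n = len(seq) - 1
    if t <= seq[0]:
        return 0.0
    if t >= seq[-1]:
        return 1.0
    for j in range(n):
        if seq[j] <= t <= seq[j + 1]:
            frac = (t - seq[j]) / (seq[j + 1] - seq[j])
            return (j + frac) / n

for j, t in enumerate(ELICITED):
    print(f"C(t{j}) = C({t}) = {subjective_cost(t):.3f}")  # equals j/6
```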
Slide 11. Tradeoff (TO) method revisited
The same indifferences, (slow: 25, fast: t_1) ~ (slow: 35, fast: t_0 = 0), etc., but now with unknown, possibly misperceived, probabilities: decision weights λ_1 for slow and λ_2 for fast. EC then gives
λ_1 (C(35) − C(25)) = λ_2 (C(t_1) − C(t_0)) = λ_2 (C(t_2) − C(t_1)) = … = λ_2 (C(t_6) − C(t_5)).
The unknown ratio λ_1/λ_2 is the same in every link, so the equalities
C(t_1) − C(t_0) = C(t_2) − C(t_1) = … = C(t_6) − C(t_5)
still follow: the TO method measures C even under unknown or misperceived probabilities.
Slide 12. Measure subjective/unknown probabilities from elicited choices
If P(slow) = p, the indifference (slow: 25, fast: t_1) ~ (slow: 35, fast: t_0) gives
p (C(35) − C(25)) = (1 − p) (C(t_1) − C(t_0)),
so
p = (C(t_1) − C(t_0)) / ((C(35) − C(25)) + (C(t_1) − C(t_0))).
Abdellaoui (2000), Bleichrodt & Pinto (2000), Management Science.
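As a worked toy example (my numbers, all hypothetical), the formula applies directly once C has been measured:

```python
# Sketch: recovering the subjective probability p = P(slow) from the
# indifference (slow: 25, fast: t1) ~ (slow: 35, fast: t0), with t0 = 0
# and t1 = 5. The measured costs below are hypothetical.

C = {0: 0.0, 5: 1 / 6, 25: 0.55, 35: 0.80}  # hypothetical measured costs

# Indifference under EC: p*(C(35) - C(25)) = (1 - p)*(C(t1) - C(t0)).
p = (C[5] - C[0]) / ((C[35] - C[25]) + (C[5] - C[0]))
print(f"P(slow) = {p:.3f}")  # 0.400 with these numbers
```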
Slide 13. What if the data are inconsistent?
Say some observations show C(t_2) − C(t_1) = C(t_1) − C(t_0), while other observations show C(t_2′) − C(t_1) = C(t_1) − C(t_0) for t_2′ > t_2. Then you have empirically falsified the EC model!
Definition. Tradeoff consistency holds if this never happens.
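A hedged sketch (my operationalization, not the talk's) of testing tradeoff consistency on repeated elicitations:

```python
# Sketch: tradeoff consistency requires repeated elicitations of the same
# standard-sequence element (e.g., t2) to agree; disagreement (t2' > t2)
# falsifies the EC model. The tolerance is an assumption.

def tradeoff_consistent(elicitations, tol=1e-6):
    """elicitations: repeated measurements [t0, t1, ...] of the same
    standard sequence; consistent iff all repetitions agree."""
    for values in zip(*elicitations):
        if max(values) - min(values) > tol:
            return False
    return True

print(tradeoff_consistent([[0, 5, 7, 9], [0, 5, 7, 9]]))    # True
print(tradeoff_consistent([[0, 5, 7, 9], [0, 5, 8.5, 9]]))  # False: t2' > t2
```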
Slide 14. Theorem. The EC model holds if and only if tradeoff consistency holds.
Descriptive application: the EC model is falsified iff tradeoff consistency is violated.
Slide 15. 2. A New Foundation of (Static) Bayesianism
Normative application: you can convince a client to use EC iff you can convince the client that tradeoff consistency is reasonable.
Slide 16. 3. Carnap's Updating Method
We examine Rudolf Carnap's (1952, 1980) ideas about the Dirichlet family of probability distributions.
Slide 17. Example. A doctor, say YOU, has to choose the treatment of a patient standing before you. The patient has exactly one ("true") disease from the set D = {d_1, …, d_s} of possible diseases. You are uncertain which disease is the true one.
Slide 18. For simplicity, Assumption: the results of treatment can be expressed in monetary terms.
Definition. Treatment (d_i:1): if the true disease is d_i, it saves $1 compared with the common treatment; otherwise, it is equally expensive.
Slide 19. Treatment (d_i:1) yields $1 if the true disease is d_i and $0 under each other disease in d_1, …, d_s. Being uncertain which disease d_j is true, you are uncertain what the outcome (money saved) of the treatment will be.
Slide 20. Assumption. When deciding on your patient, you have observed t similar patients in the past and found out their true diseases.
Notation. E = (E_1, …, E_t), where E_i describes the disease of the i-th patient.
Slide 21. Assumption. You are Bayesian; so, you maximize expected utility (EU).
Slide 22. Imagine someone, say me, gives you advice: "Given the info E, the probabilities are to be taken as follows."
Slide 23.
p_i^E = (λ/(t + λ)) p_i^0 + (t/(t + λ)) (n_i/t),
where n_i/t is the observed relative frequency of d_i in E_1, …, E_t, and λ > 0 is a subjective parameter (as are the p_i^0's).
Appealing! A natural way to integrate
- subject-matter info (p_i^0) and
- statistical information (n_i/t).
The subjective parameters disappear as t → ∞.
Alternative interpretation: combining evidence.
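A minimal sketch of the rule (assuming the reconstruction above; the prior, λ, and counts are hypothetical):

```python
# Sketch of Carnap's inductive method: p_i^E = (lam * p0_i + n_i) / (t + lam),
# a weighted mix of prior opinion p0 and observed frequencies n_i / t.
# All numbers below are hypothetical.

def carnap_update(p0, counts, lam):
    t = sum(counts)
    return [(lam * p + n) / (t + lam) for p, n in zip(p0, counts)]

p0 = [0.5, 0.3, 0.2]      # subject-matter info p_i^0
counts = [10, 70, 20]     # n_i over t = 100 observed patients

for lam in (1.0, 10.0, 1000.0):
    print(lam, [round(p, 3) for p in carnap_update(p0, counts, lam)])
# Small lam: close to the frequencies n_i/t. Large lam: close to p0.
# As t grows with lam fixed, the subjective parameters wash out.
```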
Slide 24. Appealing advice, but ad hoc!
Why not weight t² instead of t? Why not take a geometric mean? Why not let λ depend on t and n_i, and on the other n_j's? Decision theory can make things less ad hoc.
An aside. The main mathematical problem: to formulate everything in terms of the "naïve space," as Grünwald & Halpern (2002) call it.
Slide 25. Let us change the subject. Forget about the advice, for the time being.
Slide 26. (1) Wouldn't you want to satisfy positive relatedness of the observations:
(d_i:1) ~_E $x  ⟹  (d_i:1) ≽_(E,d_i) $x?
(One more observed case of d_i should not make the d_i-treatment less attractive.)
Slide 27. (2) Wouldn't you want to satisfy past-exchangeability:
(d_i:1) ~_E $x  ⟺  (d_i:1) ~_E′ $x
whenever E = (E_1, …, E_{m−1}, d_j, d_k, E_{m+2}, …, E_t) and E′ = (E_1, …, E_{m−1}, d_k, d_j, E_{m+2}, …, E_t), for some m < t and j, k?
Slide 28. [Diagram: the past observations E_1, …, E_t, with counts n_1, …, n_s of the diseases (n_i cases of d_i), and the case d_i to be predicted at time t+1; arrows mark past-exchangeability and disjoint causality (slides 29-31).]
Slide 29. (3) Wouldn't you want to satisfy future-exchangeability:
Assume $x ~_E (d_j:y) and $y ~_(E,d_j) (d_k:z); interpretation: $x ~_E (d_j and then d_k: z).
Assume $x′ ~_E (d_k:y′) and $y′ ~_(E,d_k) (d_j:z′); interpretation: $x′ ~_E (d_k and then d_j: z′).
Then x = x′ ⟹ z = z′. Interpretation: [d_j then d_k] is as likely as [d_k then d_j], given E.
Slide 30. (4) Wouldn't you want to satisfy disjoint causality: for all E and distinct i, j, k,
(d_i:1) ~_(E,d_j) $x  ⟺  (d_i:1) ~_(E,d_k) $x?
A violation: bad nutrition as a common cause of d_1 and d_2, with a different cause for d_3 (see the diagram on slide 28).
Slide 31. Theorem. Assume s ≥ 3. The following are equivalent:
(i) (a) tradeoff consistency; (b) positive relatedness of observations; (c) exchangeability (past and future); (d) disjoint causality.
(ii) EU holds for each E with a fixed U, and the probabilities follow Carnap's inductive method:
p_i^E = (λ/(t + λ)) p_i^0 + (t/(t + λ)) (n_i/t).
A decision-theoretic surprise!
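Slide 16 mentioned the Dirichlet family; as a hedged numeric cross-check (my framing, not the talk's), Carnap's formula coincides with the posterior mean of a Dirichlet prior with parameters α_i = λ·p_i^0:

```python
# Numeric cross-check (my framing): Carnap's rule equals the posterior mean
# of a Dirichlet prior with alpha_i = lam * p0_i after observing counts n_i.

def carnap(p0, counts, lam):
    t = sum(counts)
    return [(lam * p + n) / (t + lam) for p, n in zip(p0, counts)]

def dirichlet_posterior_mean(alpha, counts):
    total = sum(alpha) + sum(counts)
    return [(a + n) / total for a, n in zip(alpha, counts)]

p0, lam, counts = [0.5, 0.3, 0.2], 4.0, [3, 1, 6]
print(carnap(p0, counts, lam))
print(dirichlet_posterior_mean([lam * p for p in p0], counts))  # identical
```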
Slide 32. 4. Corrections of Probability Judgments Based on Empirical Findings
Abdellaoui (2000), Bleichrodt & Pinto (2000) (and many others): subjective probabilities are nonadditive. Assume the simple model: (A:x) is evaluated by W(A)U(x), with U(0) = 0; W is nonadditive and may be a Dempster-Shafer belief function. Only nonnegative outcomes.
Slide 33. Tversky & Fox (1995): two-stage model, W = w ∘ φ.
φ: direct psychological judgment of probability;
w: turns judgments of probability into decision weights.
w can be measured from cases where objective probabilities are known.
Slide 34. Economists/AI: w is convex. This enhances superadditivity:
W(A ∪ B) ≥ W(A) + W(B) for disjoint A, B
(e.g., Dempster-Shafer belief functions).
Slide 35. Psychologists: [Graph of w against p; the empirically found shape is inverse-S rather than convex.]
Slide 36. For moderate p and q: w(p + q) ≤ w(p) + w(q) (subadditivity). The w component thus enhances subadditivity of W,
W(A ∪ B) ≤ W(A) + W(B) for disjoint events A, B,
contrary to the common assumptions about belief functions above.
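A quick numeric illustration (mine, using the Tversky & Kahneman 1992 parametric form as an assumed stand-in for an empirical w):

```python
# Illustration with an assumed parametric inverse-S weighting function
# (Tversky & Kahneman 1992, gamma = 0.61): for moderate p, q it is
# subadditive, w(p + q) <= w(p) + w(q).

GAMMA = 0.61

def w(p, g=GAMMA):
    return p**g / (p**g + (1 - p)**g) ** (1 / g)

for p, q in [(0.2, 0.3), (0.3, 0.3), (0.25, 0.4)]:
    print(f"w({p + q:.2f}) = {w(p + q):.3f} <= "
          f"w({p}) + w({q}) = {w(p) + w(q):.3f}")
```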
Slide 37. φ = w⁻¹ ∘ W: a behavioral derivation of the expert's judgment.
Tversky & Fox (1995): more nonlinearity in φ than in w; φ's and W's deviations from linearity are of the same nature as Figure 3.
Tversky & Wakker (1995): formal definitions.
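A sketch of the inversion step (my illustration; the parametric w and the elicited W(A) are hypothetical stand-ins, since in practice w is measured from known-probability choices):

```python
# Sketch: phi = w^{-1}(W) in the two-stage model W = w o phi. The parametric
# w (Tversky & Kahneman 1992, gamma = 0.61) and the elicited weight W(A)
# are hypothetical stand-ins for measured quantities.

from scipy.optimize import brentq

GAMMA = 0.61

def w(p, g=GAMMA):
    return p**g / (p**g + (1 - p)**g) ** (1 / g)

def w_inverse(y, g=GAMMA):
    """w is strictly increasing on [0, 1], so invert by root-finding."""
    return brentq(lambda p: w(p, g) - y, 0.0, 1.0)

W_A = 0.42  # decision weight of event A, elicited from choices (hypothetical)
print(f"judged probability phi(A) = {w_inverse(W_A):.3f}")
```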
Slide 38. Non-Bayesians: alternatives to the Dempster-Shafer belief functions; no degeneracy after multiple updating. Figure 3 for φ and W: lack of sensitivity towards varying degrees of uncertainty. Fig. 3 reflects absence of information better than convexity does.
Slide 39. Fig. 3 comes from data and suggests new concepts, e.g., info-sensitivity instead of conservativeness/pessimism. Bayesians: Fig. 3 suggests how to correct expert judgments.
Slide 40. Support theory (Tversky & Koehler 1994). Typical finding: for disjoint A_j,
φ(A_1) + … + φ(A_n) − φ(A_1 ∪ … ∪ A_n)
increases as n increases.