The influence of hierarchy on probability judgment
David A. Lagnado, David R. Shanks
University College London
Level of hierarchy can modulate judgment
- Consider two statements about the next World Cup:
  - It is most likely that Brazil will win
  - It is most likely that a European team will win
- These appear to support opposing predictions, but both may be true
- This shows the importance of the level at which probabilistic information is represented
Hierarchical structure
- Pervasive feature of how we represent the world
- Reflects pre-existing physical and social hierarchies
- Readily generated through conceptual combination
- Category hierarchies serve both to organize our knowledge and to structure our inferences
Inference using a hierarchy
- One powerful feature of a category hierarchy is that, given information about categories at one level, you can make inferences about categories at another level.
- This allows you to exclude alternatives, or reduce the number you need to consider.
[Figure: newspaper hierarchy. Tabloid: Mirror, Sun. Broadsheet: Times, Guardian.]
Probabilistic inference using a hierarchy
- In many real-world situations we must base our initial category judgments on imperfect cues, degraded stimuli, or statistical data.
- What effect do such probabilistic contexts have on the hierarchical inferences that we are licensed to make?
[Figure: newspaper hierarchy. Tabloid: Mirror, Sun. Broadsheet: Times, Guardian.]
Commitment heuristic
- Commitment heuristic: when people select the most probable category at the superordinate level, they assume that it contains the most probable subordinate category.
- This leads to the neglect of subordinates from the less probable superordinate.
[Figure: newspaper hierarchy. Tabloid: Mirror, Sun. Broadsheet: Times, Guardian.]
How adaptive is this heuristic?
- The efficacy of such a heuristic depends on the precise structure of the environment.
- In certain environments it confers considerable advantages:
  - it increases inferential power by focusing on the appropriate subcategories
  - it reduces computational demands by avoiding complex Bayesian calculations
- But in some environments it can lead to anomalous judgments and inferences.
Non-aligned hierarchy
Tabloid 60: Mirror 30, Sun 30
Broadsheet 40: Times 5, Guardian 35
- In the above sample the most frequently read type of paper is a Tabloid, but the most frequently read paper is a Broadsheet (the Guardian).
- Non-aligned hierarchy: the most probable superordinate category does not contain the most probable subordinate category.
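The definition above can be checked mechanically. The following is a minimal sketch in Python, using the readership counts from this slide; the function names (`most_probable_type`, `most_probable_paper`, `is_aligned`, `commitment_choice`) are illustrative and not part of the original talk, and `commitment_choice` is only one possible reading of the commitment heuristic.

```python
# Readership counts from the slide, grouped by type of paper.
readership = {
    "Tabloid": {"Mirror": 30, "Sun": 30},
    "Broadsheet": {"Times": 5, "Guardian": 35},
}


def most_probable_type(counts):
    """Return the superordinate category with the largest total count."""
    return max(counts, key=lambda t: sum(counts[t].values()))


def most_probable_paper(counts):
    """Return (paper, parent type) for the largest count across all types."""
    flat = {
        paper: (n, paper_type)
        for paper_type, papers in counts.items()
        for paper, n in papers.items()
    }
    best = max(flat, key=lambda p: flat[p][0])
    return best, flat[best][1]


def is_aligned(counts):
    """True when the most probable type contains the most probable paper."""
    _, parent = most_probable_paper(counts)
    return parent == most_probable_type(counts)


def commitment_choice(counts):
    """Illustrative commitment heuristic: search only inside the most probable type."""
    papers = counts[most_probable_type(counts)]
    return max(papers, key=papers.get)


if __name__ == "__main__":
    print("Most probable type:", most_probable_type(readership))
    print("Most probable paper:", most_probable_paper(readership))
    print("Aligned:", is_aligned(readership))
    print("Commitment heuristic picks:", commitment_choice(readership))
```

With these counts the heuristic can only return a tabloid, so it never considers the Guardian, which is the neglect of subordinates described earlier.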
Real world examples
- Word frequencies: the superordinate BE- is more frequent than BU-, but the subordinate BUT is more frequent than any of the other subordinates (BET, BED…etc.)
- NHS statistics on survival rates for operations in different areas and sub-areas: you are more likely to survive a hip operation in Surrey than in Essex, but the best sub-area for survival is Colchester (in Essex).
Experiments 1 and 2
- Learning phase: participants are exposed to a non-aligned hierarchical environment in which they learn to predict voting behavior from newspaper readership.
- 100 trials of ‘reading/voting profiles’
Screen during learning phase
[Figure: screen layout, built up over three slides.]
- Types of paper shown: Broadsheet, Tabloid
- Papers shown: Chronicle, Herald, Reporter, Globe
- Reading profile for J. K.
- Response options: ○ Liberal ○ Progressive
- Outcome feedback
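Structure of environment
[Figure: readership counts by type and paper, linked to the voting outcomes.]
Tabloid 60: Mirror 30, Sun 30
Broadsheet 40: Times 5, Guardian 35
Outcomes: Party A, Party B (figure label: 50)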
Judgment phase
- X is selected at random
- Baseline: What is the probability that X votes for one party rather than the other?
- Type: Which type of paper is X most likely to read? Then the probability question.
- Paper: Which paper is X most likely to read? Then the probability question.
Results of Experiment 1
[Figure: probability ratings for Party B rather than Party A, with judgments divided into those based on aligned and non-aligned choices.]
Experiment 2
- Replication of Experiment 1, with frequency as well as probability response formats
- Tests the frequentist hypothesis that probability biases are reduced with a frequency format
Results of Experiment 2
[Figure: mean ratings for Party B rather than Party A, collapsed across probability and frequency ratings.]
Summary of Results
- Participants allow their initial probability judgment about category membership (newspaper readership) to shift their rating of the probability of a related outcome (voting preference), even though all judgments are made on the basis of the same statistical data.
- When their prior choices were non-aligned, this led to a switch in predictions about the outcome category.
Conclusions
- These biases are explicable by the commitment heuristic:
  - The priming question commits people to just one inferential path, leading them to compute an erroneous estimate for the final probability.
- This is understandable given the complexity of the normative Bayesian computation.
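Comparison of Bayesian and commitment heuristic computations (just type level inference)
- Bayesian computation: after the question "Type of paper?", both branches (Tabloid and Broadsheet, each leading to Party A or Party B) are weighted:
  P(A) = P(A | Tabloid) P(Tabloid) + P(A | Broadsheet) P(Broadsheet) = 0.5
- Simplified heuristic computation: after the question "Type of paper?", only the Tabloid branch to Party A is followed:
  P(A) = 0.77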
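To make the contrast concrete, here is a small Python sketch of the two computations. It rests on assumptions that are not stated on the slide: that the heuristic value 0.77 is P(A | Tabloid), and that P(Tabloid) = 0.6 and P(Broadsheet) = 0.4 follow from the readership counts (60 and 40 out of 100). The value of P(A | Broadsheet) is not given, so the code derives the value implied by those assumptions rather than asserting one.

```python
# Values taken from the slides.
P_A_OVERALL = 0.5      # Bayesian result reported on the slide
P_A_HEURISTIC = 0.77   # heuristic result reported on the slide

# ASSUMPTIONS (not stated on this slide):
p_tabloid = 60 / 100               # from the readership counts
p_broadsheet = 40 / 100            # from the readership counts
p_a_given_tabloid = P_A_HEURISTIC  # heuristic follows the Tabloid branch only


def total_probability(p_a_given_t, p_t, p_a_given_b, p_b):
    """Law of total probability over the two types of paper."""
    return p_a_given_t * p_t + p_a_given_b * p_b


# P(A | Broadsheet) implied by the assumptions above (derived, not reported).
p_a_given_broadsheet = (P_A_OVERALL - p_a_given_tabloid * p_tabloid) / p_broadsheet

bayesian = total_probability(
    p_a_given_tabloid, p_tabloid, p_a_given_broadsheet, p_broadsheet
)
heuristic = p_a_given_tabloid  # ignores the Broadsheet branch entirely

if __name__ == "__main__":
    print(f"Implied P(A | Broadsheet): {p_a_given_broadsheet:.3f}")
    print(f"Bayesian P(A):  {bayesian:.2f}")
    print(f"Heuristic P(A): {heuristic:.2f}")
```

Under these assumptions the heuristic overestimates P(A) by 0.27, because the neglected Broadsheet branch is the one that pulls the overall probability back down to 0.5.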
Conclusions
- Simplifying heuristic that assumes the environment is aligned
- Empowers inference when the hierarchical structure is aligned; otherwise it can lead to error
- Suggests a tendency to reason as if a probable conclusion is true
Process level accounts
- Associative model: people learn predictive relations between category options (at both levels of the hierarchy) and the outcome. At test, responses to category questions prime the appropriate associations and lead to a biased rating of the outcome.
- Frequency-based model: people encode event frequencies in the learning phase. At test, the response to the category question serves as the reference class for subsequent conditional probability judgments about voting preferences.
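The reference-class idea in the frequency-based model can be sketched as follows. The joint counts below are HYPOTHETICAL: they were chosen only so that the readership totals match the slides (Tabloid 60, Broadsheet 40) and are not the experimental data. The function name `prob_party_b` is likewise illustrative.

```python
# HYPOTHETICAL joint counts of readers by type, paper, and vote.
# Only the readership totals are taken from the slides.
joint = {
    "Tabloid": {
        "Mirror": {"A": 23, "B": 7},
        "Sun": {"A": 23, "B": 7},
    },
    "Broadsheet": {
        "Times": {"A": 3, "B": 2},
        "Guardian": {"A": 1, "B": 34},
    },
}


def prob_party_b(counts, paper_type=None, paper=None):
    """Relative frequency of Party B voters within a reference class.

    With no arguments the reference class is everyone (the baseline question);
    `paper_type` or `paper` narrows it to the category chosen at test.
    """
    a_total = b_total = 0
    for t, papers in counts.items():
        if paper_type is not None and t != paper_type:
            continue
        for name, votes in papers.items():
            if paper is not None and name != paper:
                continue
            a_total += votes["A"]
            b_total += votes["B"]
    return b_total / (a_total + b_total)


if __name__ == "__main__":
    print("Baseline:", prob_party_b(joint))
    print("After choosing Tabloid:", prob_party_b(joint, paper_type="Tabloid"))
    print("After choosing Guardian:", prob_party_b(joint, paper="Guardian"))
```

In this illustration the same encoded frequencies yield a low rating for Party B when the reference class is Tabloid and a high rating when it is the Guardian, which is the kind of switch the model is meant to explain.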
Implications
- Importance of the level at which probabilistic data is represented to (or by) a decision maker
  - E.g., using NHS statistics to decide on a hospital
  - How do people search through hierarchical statistical data?
- People’s judgments can be manipulated by the level at which statistical information is represented
- More generally, in multi-step inferences people are susceptible to biased probability judgments