1
Wisdom of Crowds in Human Memory: Reconstructing Events by Aggregating Memories across Individuals
Mark Steyvers
Department of Cognitive Sciences, University of California, Irvine
Joint work with: Brent Miller, Pernille Hemmer, Mike Yi, Michael Lee, Bill Batchelder, Paolo Napoletano
2
Wisdom of crowds phenomenon
Group estimate often performs as well as or better than the best individual in the group
3
Examples of wisdom of crowds phenomenon
Who Wants to Be a Millionaire?
Galton’s Ox (1907): median of individual estimates comes close to the true answer
4
Tasks studied in our research
Ordering/ranking problems
  declarative memory: order of US presidents, ranking cities by size
  episodic memory: order of events (i.e., serial recall)
  predictive rankings: fantasy football
Matching problems
  assign N items to N responses
  e.g., match paintings to artists, or flags to countries
Traveling Salesman problems
  find shortest route between cities
(all are problems involving permutations)
5
Recollecting order from declarative memory
Place these presidents in the correct order (along a time axis)
Presidents shown: Ulysses S. Grant, James Garfield, Rutherford B. Hayes, Abraham Lincoln, Andrew Johnson
6
Recollecting order from episodic memory
http://www.youtube.com/watch?v=a6tSyDHXViM&feature=related
7
Place scenes in correct order (serial recall)
[Figure: scenes A, B, C, D to be placed along a time axis]
8
Goal: aggregating responses
Individual orderings: D A B C; A B D C; B A D C; A C B D; A D B C
Aggregation algorithm produces a group answer
Ground truth: A B C D
Group answer = ground truth?
9
Bayesian Approach
Individual orderings: D A B C; A B D C; B A D C; A C B D; A D B C
Generative model links the group answer (A B C D) to the individual orderings
Group answer = latent random variable
10
Task constraints
No communication between individuals
There is always a true answer (ground truth)
Aggregation algorithm never has access to ground truth
  unsupervised methods
  ground truth only used for evaluation
11
Research Goals
Aggregation of permutation data
  going beyond numerical estimates or multiple choice questions
  combinatorially complex
Incorporate individual differences
  going beyond models that treat every vote equally
  assume some individuals might be “experts”
Take cognitive processes into account
  going beyond mere statistical aggregation
Hierarchical Bayesian models
12
Part I: Ordering Problems
13
Experiment 1
Task: order all 44 US presidents
Methods
  26 participants (college undergraduates)
  Names of presidents written on cards
  Cards could be shuffled on large table
14
Measuring performance
Kendall’s tau: the number of adjacent pairwise swaps needed to turn an individual’s ordering into the true order
Ordering by individual: A B E C D
True order: A B C D E
A B E C D → A B C E D (1 swap) → A B C D E (1 + 1 = 2 swaps)
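Counting adjacent swaps is equivalent to counting the pairs of items that appear in opposite relative order in the two orderings. A minimal sketch of that count for the example on this slide follows; the function name `kendall_tau_distance` is our own and does not come from the talk.

```python
from itertools import combinations

def kendall_tau_distance(ordering, truth):
    """Count item pairs whose relative order differs between two orderings."""
    # Position of each item in the true order.
    position = {item: i for i, item in enumerate(truth)}
    # Rewrite the individual's ordering as a list of true positions.
    ranks = [position[item] for item in ordering]
    # Each pair that is out of order costs exactly one adjacent swap.
    return sum(1 for a, b in combinations(ranks, 2) if a > b)

print(kendall_tau_distance("ABECD", "ABCDE"))  # 2
```

A distance of 0 means the ordering is perfect; a fully reversed list of five items gives the maximum of 10.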
15
Empirical Results
[Figure: the level expected from random guessing is marked]
16
Many methods for analyzing rank data…
Probabilistic models: Thurstone (1927), Mallows (1957), Plackett-Luce (1975), Lebanon-Mao (2008)
Spectral methods: Diaconis (1989)
Heuristic methods from voting theory: Borda count
… however, many of these approaches were developed for preference rankings
17
Bayesian models constrained by human cognition
Extension of Thurstone’s (1927) model
Extension of Estes’ (1972) perturbation model
18
Bayesian Thurstonian Approach
Each item (A, B, C) has a true coordinate on some dimension
19
Bayesian Thurstonian Approach
… but there is noise because of encoding and/or retrieval error (shown for Person 1)
20
Bayesian Thurstonian Approach
Each person’s mental representation is based on (latent) samples of these distributions (shown for Person 1)
21
Bayesian Thurstonian Approach
The observed ordering is based on the ordering of the samples
Person 1, observed ordering: A < B < C
22
Bayesian Thurstonian Approach
People draw from distributions with common means but different variances
Person 1, observed ordering: A < B < C
Person 2, observed ordering: A < C < B
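The generative story on these slides can be simulated in a few lines. The sketch below assumes Gaussian noise around each item's true coordinate and a single noise level per person; the coordinates, noise levels, and the name `simulate_ordering` are hypothetical values picked for illustration, not parameters from the talk.

```python
import random

def simulate_ordering(true_means, sigma, rng):
    """Draw one noisy sample per item and return the items sorted by sample."""
    samples = {item: rng.gauss(mu, sigma) for item, mu in true_means.items()}
    return sorted(samples, key=samples.get)

# Hypothetical coordinates shared by everyone (common means).
true_means = {"A": 1.0, "B": 2.0, "C": 3.0}
rng = random.Random(0)

careful = simulate_ordering(true_means, 0.1, rng)  # person with low noise
noisy = simulate_ordering(true_means, 2.0, rng)    # person with high noise
print(careful, noisy)
```

With a small `sigma` the samples rarely cross, so the observed ordering tends to match the true one; with a large `sigma` swaps such as A < C < B become likely.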
23
Graphical Model Notation
plate: j = 1..3
shaded = observed
not shaded = latent
24
Graphical Model of Bayesian Thurstonian Model
Latent ground truth
Individual noise level
Mental representation
Observed ordering
(plate over j individuals)
25
Inference
Need the posterior distribution
Markov chain Monte Carlo
  Gibbs sampling on …
  Metropolis-Hastings on … and …
Draw 400 samples
  group ordering based on average of … across samples
26
(Weak) wisdom of crowds effect
Model’s ordering is as good as the best individual (but not better)
27
Inferred Distributions for 44 US Presidents
[Figure legend: median and minimum sigma]
George Washington (1)
John Adams (2)
Thomas Jefferson (3)
James Madison (4)
James Monroe (6)
John Quincy Adams (5)
Andrew Jackson (7)
Martin Van Buren (8)
William Henry Harrison (21)
John Tyler (10)
James Knox Polk (18)
Zachary Taylor (16)
Millard Fillmore (11)
Franklin Pierce (19)
James Buchanan (13)
Abraham Lincoln (9)
Andrew Johnson (12)
Ulysses S. Grant (17)
Rutherford B. Hayes (20)
James Garfield (22)
Chester Arthur (15)
Grover Cleveland 1 (23)
Benjamin Harrison (14)
Grover Cleveland 2 (25)
William McKinley (24)
Theodore Roosevelt (29)
William Howard Taft (27)
Woodrow Wilson (30)
Warren Harding (26)
Calvin Coolidge (28)
Herbert Hoover (31)
Franklin D. Roosevelt (32)
Harry S. Truman (33)
Dwight Eisenhower (34)
John F. Kennedy (37)
Lyndon B. Johnson (36)
Richard Nixon (39)
Gerald Ford (35)
James Carter (38)
Ronald Reagan (40)
George H.W. Bush (41)
William Clinton (42)
George W. Bush (43)
Barack Obama (44)
28
Model can predict individual performance
[Figure: inferred noise level for each individual vs. distance to ground truth; one point per individual]
29
Extension of Estes (1972) Perturbation Model
Main idea: item order is perturbed locally
Our extension: perturbation noise varies between individuals and items
[Figure: true order A B C D E compared with a recalled order of the same items]
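As a rough illustration of local perturbation (a simplification for intuition, not the model used in the talk), the sketch below lets neighboring items trade places with probability `theta` over a few sweeps. The names `perturb`, `theta`, and `steps` are assumptions introduced here.

```python
import random

def perturb(true_order, theta, steps, rng):
    """Swap adjacent items with probability theta, repeated for several sweeps."""
    order = list(true_order)
    for _ in range(steps):
        for i in range(len(order) - 1):
            if rng.random() < theta:
                # Exchange the items at positions i and i + 1.
                order[i], order[i + 1] = order[i + 1], order[i]
    return order

rng = random.Random(0)
print(perturb("ABCDE", theta=0.1, steps=3, rng=rng))
```

With a small `theta`, items mostly stay near their true positions. A larger `theta` could stand in for a noisier individual or a harder item, which is the kind of variation the extension on this slide allows for.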
30
Modified Perturbation Model
31
Strong wisdom of crowds effect
Perturbation model’s ordering is better than the best individual
32
Inferred Perturbation Matrix and Item Accuracy
[Figure; labeled items: Abraham Lincoln, Richard Nixon, James Carter]
33
Alternative Heuristic Models
Many heuristic methods from voting theory
  e.g., Borda count method
Suppose we have 10 items
  assign a count of 10 to first item, 9 for second item, etc.
  add counts over individuals
  order items by the Borda count
  i.e., rank by average rank across people
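The Borda steps listed above can be sketched as follows, applied to the five individual orderings from the "Goal: aggregating responses" slide. The function name `borda_aggregate` and the alphabetical tie-break are our assumptions; the slide does not say how ties are resolved.

```python
from collections import defaultdict

def borda_aggregate(orderings):
    """Sum position-based counts over individuals and sort by total count."""
    scores = defaultdict(int)
    for ordering in orderings:
        n = len(ordering)
        for position, item in enumerate(ordering):
            # First item gets n points, second gets n - 1, and so on.
            scores[item] += n - position
    # Highest total first; ties broken alphabetically (an assumption).
    return sorted(scores, key=lambda item: (-scores[item], item))

orderings = ["DABC", "ABDC", "BADC", "ACBD", "ADBC"]
print(borda_aggregate(orderings))  # ['A', 'B', 'D', 'C']
```

Every individual counts equally here, which is the property the Bayesian models in this talk relax by inferring a noise level per person.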
34
Model Comparison
[Figure; label: Borda]
35
Experiment 2
78 participants
17 problems, each with 10 items
  Chronological events
  Physical measures
  Purely ordinal problems, e.g., Ten Amendments, Ten Commandments
36
Example results
[Panel labels: Perturbation Model, Thurstonian Model]
 1. Oregon (1)
 2. Utah (2)
 3. Nebraska (3)
 4. Iowa (4)
 5. Alabama (6)
 6. Ohio (5)
 7. Virginia (7)
 8. Delaware (8)
 9. Connecticut (9)
10. Maine (10)

 1. Freedom of speech & relig... (1)
 2. Right to bear arms (2)
 3. No quartering of soldiers... (3)
 4. No unreasonable searches (4)
 5. Due process (5)
 6. Trial by Jury (6)
 7. Civil Trial by Jury (7)
 8. No cruel punishment (8)
 9. Right to non-specified ri... (10)
10. Power for the States & Pe... (9)
37
Average results over 17 Problems
Strong wisdom of crowds effect across problems
[Figure labels: Individuals, Mean]
38
Predicting problem difficulty
[Figure: std (dispersion of noise levels across individuals) vs. distance of group answer to ground truth; labeled problems: ordering states geographically, city size rankings]
39
Effect of Group Composition
How many individuals do we need to average over?
40
Effect of Group Size: random groups
41
Experts vs. Crowds
Can we find experts in the crowd? Can we form small groups of experts?
Approach
  Form a group for some particular task
  Select individuals with the smallest sigma (“experts”) based on previous tasks
  Vary the number of previous tasks
42
Group Composition based on prior performance
[Figure panels: T = 0, T = 2, T = 8, where T = # previous tasks; x-axis: group size (best individuals first)]
43
Methods for Selecting Experts
Endogenous: no feedback required
Exogenous: selecting people based on actual performance
44
Aggregating Episodic Memories
Study this sequence of images
45
Place the images in correct sequence (serial recall)
A B C D E F G H I J
46
Average results across 6 problems
[Figure label: Mean]
47
Example calibration result for individuals (pizza sequence; perturbation model)
[Figure: inferred noise level vs. distance to ground truth; one point per individual]
48
Predictive Rankings: fantasy football
South Australian Football League (32 people rank 9 teams)
Australian Football League (29 people rank 16 teams)
49
Part II: Matching Problems
50
Study these combinations
51
Find all matching pairs
[Figure: items labeled 1 to 5 to be matched with items labeled A to E]
52
Experiment
15 subjects
8 problems
  4 problems with 5 items
  4 problems with 10 items
53
Mean accuracy across 8 problems
54
Bayesian Matching Model
Proposed process
  match “known” items
  guess between remaining ones
Individual differences
  some items easier to know
  some participants know more
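The proposed process can be simulated directly: known items are matched correctly and the leftover responses are shuffled among the unknown items. This sketch uses one hypothetical probability `p_know` for every item and person, whereas the model on this slide lets that probability depend on person ability and item easiness. The name `simulate_matching` is ours.

```python
import random

def simulate_matching(items, truth, p_know, rng):
    """Match known items correctly, then guess randomly among the rest."""
    known = [item for item in items if rng.random() < p_know]
    unknown = [item for item in items if item not in known]
    # Responses that were not used up by the known items.
    leftover = [truth[item] for item in unknown]
    rng.shuffle(leftover)
    response = {item: truth[item] for item in known}
    response.update(zip(unknown, leftover))
    return response

# Made-up problem: five items and their correct responses.
truth = {"1": "A", "2": "B", "3": "C", "4": "D", "5": "E"}
rng = random.Random(0)
print(simulate_matching(list(truth), truth, p_know=0.6, rng=rng))
```

Because guesses are drawn only from the unused responses, a person can still get an unknown item right by elimination, which is one reason accuracy alone understates how much guessing took place.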
55
Graphical Model
Latent ground truth
Observed matching
Knowledge state
Prob. of knowing
Person ability
Item easiness
(plates over i items and j individuals)
56
Modeling results across 8 problems
57
Calibration at level of items and people (for weapons and faces 10 items problem)
[Figure panels: ITEMS, INDIVIDUALS]
58
Varying number of individuals
59
How predictive are subject-provided confidence ratings?
[Figure: accuracy vs. # guesses estimated by individual, and accuracy vs. # guesses estimated by model (based on variable A)]
Correlations shown: r = -.50, r = -.81
60
Another matching problem
Languages: Dutch, Danish, Yiddish, Thai, Vietnamese, Chinese, Georgian, Russian, Japanese
Labels: A B C D E F G H I
Greetings: godt nytår; gelukkig nieuwjaar; a gut yohr; С Новым Годом; สวัสดีปีใหม่; Chúc Mừng Năm Mới; გილოცავთ ახალ წელს
61
Experiment
17 participants
8 matching problems, e.g.
  car logos and brand names
  first and last names of philosophers
  flags and countries
  Greek symbols and letter names
Number of items varied between 10 and 24
  with 24 items, we have 24! possibilities
62
Modeling Results – Declarative Tasks
63
Calibration at level of items and people (for paintings problem)
[Figure panels: ITEMS, INDIVIDUALS]
64
How predictive are subject-provided confidence ratings?
[Figure: accuracy vs. # guesses estimated by individual, and accuracy vs. # guesses estimated by model (based on variable A)]
Correlations shown: r = -.42, r = -.77
65
Part III: Traveling Salesman Problems
66
Find the shortest route between cities
[Figure: problem B30-21; tours by Individual 5, Individual 83, and Individual 60, plus the optimal tour]
67
Dataset
Vickers, Bovet, Lee, & Hughes (2003)
83 participants
7 problems of 30 cities
68
TSP Aggregation Problem
Propose a good solution based on all individual solutions
Task constraints
  Data consists of city order only
  No access to city locations
69
Approach
Find tours with edges for which many individuals agree
Calculate agreement matrix A
  A = n × n matrix, where n is the number of cities
  a_ij indicates the number of participants that connect cities i and j
Find tour that maximizes … (this itself is a non-Euclidean TSP problem)
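Building the agreement matrix needs only the city orders, not the city locations. The sketch below assumes each tour is a closed loop (the last city connects back to the first) and that cities are numbered from 0; both are assumptions, as are the made-up tours and the name `agreement_matrix`. It covers only the counting step, not the search for the best tour.

```python
def agreement_matrix(tours, n):
    """a[i][j] = number of participants whose closed tour links cities i and j."""
    a = [[0] * n for _ in range(n)]
    for tour in tours:
        for k, city in enumerate(tour):
            nxt = tour[(k + 1) % len(tour)]  # wrap around to close the tour
            # Edges are undirected, so count both directions.
            a[city][nxt] += 1
            a[nxt][city] += 1
    return a

tours = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 2, 1, 3]]  # made-up data
A = agreement_matrix(tours, 4)
print(A[1][2], A[0][2])  # 3 1
```

Edges used by many participants get large counts, so a tour assembled from high-count edges reflects what the group agrees on.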
70
Line thickness = agreement
71
Blue = Aggregate Tour
72
Results averaged across 7 problems
[Figure label: aggregate]
73
Results (weight: c = 2.0)
           Individuals              Model
Problem    subj min.   subj mean    path length   # subj better   # subj same   # subj worse
A30.14     +0.000%     +3.246%      +0.491%       1 or 11         18 or 8       64
A30.24     +0.000%     +4.791%      +1.424%       16              2             65
A30.48     +0.078%     +5.936%      +0.159%       1               1             81
B30.04     +0.121%     +5.502%      +0.193%       2               2             79
B30.11     +0.000%     +4.992%      +0.162%       1               5             77
B30.21     +0.042%     +5.325%      +0.042%       0               3             80
B30.27     +1.229%     +5.497%      +4.965%       46              0             37
All        +1.718%     +5.041%      +1.064%       0               0             83
A30.14: the better/same counts appear fused as 11864 in the slide text (1/18/64 or 11/8/64)
All, Individuals: best individual performance across 7 problems
All, Model: model performance across 7 problems; outperforms best individual
74
Part IV: Summary & Conclusions
75
When do we get wisdom of crowds effect?
Independent errors
  different people knowing different things
Some minimal number of individuals
  10-20 individuals often sufficient
76
What are methods for finding experts?
1) Self-reported expertise
  unreliable; has led to claims of “myth of expertise”
2) Based on explicit scores by comparing to ground truth
  but ground truth might not be immediately available
3) Endogenously discover experts
  Use the crowd to discover experts
  Small groups of experts can be effective
77
What to do about systematic biases?
In some tasks, individuals systematically distort the ground truth
  spatial and temporal distortions
  memory distortions (e.g., false memory)
  decision-making distortions
Does this diminish the wisdom of crowds effect?
  maybe… but a model that predicts these systematic distortions might be able to “undo” them
78
Conclusion
Effective aggregation of human judgments requires cognitive models
Psychology and cognitive science can inform aggregation models
79
That’s all
Do the experiments yourself: http://psiexp.ss.uci.edu/
80
Online Experiments
Experiment 1 (Prior knowledge): http://madlab.ss.uci.edu/dem2/examples/
Experiment 2a (Serial Recall), study sequence of still images: http://madlab.ss.uci.edu/memslides/
Experiment 2b (Serial Recall), study video: http://madlab.ss.uci.edu/dem/
81
Graphical Model
Latent ground truth
Observed matching
Knowledge state
Prob. of knowing
Item and person parameters
(plates over i items and j individuals)
82
MDS solution of pairwise tau distances
[Figure annotation: distance to truth]
83
MDS solution of pairwise tau distances
84
Hierarchical Bayesian Models
Generative models
  ordering information
  cognitively plausible
  individual differences
Group response = probability distribution over all permutations of N items
  With N = 44 items, we have 44! > 10^53 combinations
  Approximate inference methods: MCMC
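The size of the permutation space is easy to check with exact integer arithmetic. This is only a quick sanity check of the bound quoted on the slide; the variable name `n_orderings` is ours.

```python
import math

# Number of distinct orderings of 44 items.
n_orderings = math.factorial(44)

print(n_orderings > 10**53)   # True
print(f"{n_orderings:.3e}")   # order of magnitude in scientific notation
```

Enumerating that many candidate orderings is out of reach, which is why the slide turns to approximate inference with MCMC.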
85
Model incorporating overall person ability
[Graphical model: overall ability and task-specific ability; plates over j individuals and m tasks]
86
Average results over 17 Problems
[Figure labels: Mean, new model]
87
Thurstonian Model – stereotyped event sequences
88
Thurstonian Model – “random” videos
89
Heuristic Aggregation Approach
Combinatorial optimization problem
  maximizes agreement in assigning N items to N responses
Hungarian algorithm
  construct a count matrix M
  M_ij = number of people that paired item i with response j
  find row and column permutations to maximize diagonal sum
  O(n^3)
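To make the objective concrete, the sketch below builds the count matrix and then finds the maximum-agreement assignment by exhaustive search over permutations. This brute-force search is a stand-in that is practical only for small N; the Hungarian algorithm named on the slide reaches the same optimum in O(n^3). The function names and the three made-up responses are assumptions.

```python
from itertools import permutations

def count_matrix(matchings, n):
    """M[i][j] = number of people who paired item i with response j."""
    m = [[0] * n for _ in range(n)]
    for matching in matchings:
        for item, response in enumerate(matching):
            m[item][response] += 1
    return m

def best_assignment(m):
    """Exhaustively find the item-to-response assignment with maximal agreement."""
    n = len(m)
    return max(permutations(range(n)),
               key=lambda perm: sum(m[i][perm[i]] for i in range(n)))

# Each list gives, per item index, the response index one person chose.
matchings = [[0, 1, 2], [0, 2, 1], [1, 0, 2]]  # made-up responses
M = count_matrix(matchings, 3)
print(best_assignment(M))  # (0, 1, 2)
```

In this example no single person needs to be fully correct for the group assignment to be: the majority pairing for each item is recovered jointly, subject to each response being used once.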
90
Hungarian Algorithm Example
[Figure legend: correct, incorrect]