1
Wisdom of Crowds and Rank Aggregation
Mark Steyvers
Department of Cognitive Sciences, University of California, Irvine
Joint work with: Brent Miller, Pernille Hemmer, Mike Yi, Michael Lee
2
Wisdom of crowds phenomenon
Aggregating over individuals in a group often leads to an estimate that is better than any of the individual estimates
3
Examples of wisdom of crowds phenomenon
Galton’s Ox (1907): the median of individual weight estimates came close to the true answer
Prediction markets
4
Our research: ranking problems
What is the correct chronological order?
Example orderings along a time axis:
Ulysses S. Grant, James Garfield, Rutherford B. Hayes, Abraham Lincoln, Andrew Johnson
James Garfield, Ulysses S. Grant, Rutherford B. Hayes, Andrew Johnson, Abraham Lincoln
5
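Aggregating ranking data
Individual orderings: D A B C; A B D C; B A D C; A C B D; A D B C
An aggregation algorithm turns these into a group answer
Ground truth: A B C D
Question: is the group answer equal to the ground truth?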
6
Task constraints
No communication between individuals
There is always a true answer (ground truth)
Unsupervised algorithms: no feedback is available; ground truth is only used for evaluation
7
Unsupervised models for ranking data
Many models were developed for preference rankings and voting situations, where there is no known ground truth
Classic models: Thurstone (1927); Mallows (1957); Fligner and Verducci (1986); Diaconis (1989)
Voting methods: e.g. Borda count (1770)
Machine learning applications
Information retrieval and meta-search: e.g. Klementiev, Roth et al. (2008; 2009); Lebanon & Mao (2008); Dwork et al. (2001)
Multi-object tracking: e.g. Huang, Guestrin, Guibas (2009); Kondor, Howard, Jebara (2007)
8
Unsupervised Approach
Individual orderings: D A B C; A B D C; B A D C; A C B D; A D B C
Generative model: the ground truth is latent (unknown)
Incorporate individual differences
9
Overview of talk
Reconstruct the order of US presidents
Effect of group size and expertise
Reconstruct the order of events
Traveling Salesman Problem
10
Experiment: 26 individuals order all 44 US presidents
George Washington, John Adams, Thomas Jefferson, James Madison, James Monroe, John Quincy Adams, Andrew Jackson, Martin Van Buren, William Henry Harrison, John Tyler, James Knox Polk, Zachary Taylor, Millard Fillmore, Franklin Pierce, James Buchanan, Abraham Lincoln, Andrew Johnson, Ulysses S. Grant, Rutherford B. Hayes, James Garfield, Chester Arthur, Grover Cleveland 1, Benjamin Harrison, Grover Cleveland 2, William McKinley, Theodore Roosevelt, William Howard Taft, Woodrow Wilson, Warren Harding, Calvin Coolidge, Herbert Hoover, Franklin D. Roosevelt, Harry S. Truman, Dwight Eisenhower, John F. Kennedy, Lyndon B. Johnson, Richard Nixon, Gerald Ford, James Carter, Ronald Reagan, George H.W. Bush, William Clinton, George W. Bush, Barack Obama
11
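Measuring performance
Kendall’s Tau: the number of adjacent pair-wise swaps needed to turn an individual’s ordering into the true order
Example: ordering by individual A B E C D, true order A B C D E
Swap E and C: A B C E D (1 swap)
Swap E and D: A B C D E (1+1 swaps)
Kendall’s Tau = 2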
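The swap count on this slide can be computed without simulating the swaps: the minimum number of adjacent swaps equals the number of item pairs that appear in opposite relative order in the two rankings. A minimal sketch, assuming both orderings contain the same items exactly once (the function name is illustrative, not from the talk):

```python
from itertools import combinations

def kendall_tau_distance(ordering, true_order):
    """Count item pairs that appear in opposite relative order in the two rankings."""
    # Position of each item in the true order.
    true_pos = {item: idx for idx, item in enumerate(true_order)}
    # Translate the individual's ordering into true-order positions.
    ranks = [true_pos[item] for item in ordering]
    # Every inverted pair costs exactly one adjacent swap to repair.
    return sum(1 for a, b in combinations(ranks, 2) if a > b)

print(kendall_tau_distance("ABECD", "ABCDE"))  # 2, matching the slide's example
```

This pairwise count is quadratic in the number of items, which is fine for the 10-item and 44-item problems in the talk.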
12
Empirical Results
[figure; the plot marks the level of random guessing]
13
Thurstonian Model
A. George Washington B. James Madison C. Andrew Jackson
Each item has a true coordinate on some dimension
14
Thurstonian Model
A. George Washington B. James Madison C. Andrew Jackson
… but there is noise because of encoding errors
15
Thurstonian Model
A. George Washington B. James Madison C. Andrew Jackson
Each person’s mental encoding is based on a single sample from each distribution
16
Thurstonian Model
A. George Washington B. James Madison C. Andrew Jackson
The observed ordering is based on the ordering of the samples
Example samples: A < C < B
17
Thurstonian Model
A. George Washington B. James Madison C. Andrew Jackson
The observed ordering is based on the ordering of the samples
Example samples: A < B < C
18
Thurstonian Model
A. George Washington B. James Madison C. Andrew Jackson
Important assumption: across individuals, the standard deviation can vary but the means cannot
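The generative story on the Thurstonian slides can be sketched as a small simulation: each individual draws one noisy sample per item around a shared latent coordinate and reports the items sorted by those samples. This is an illustration only. The Gaussian noise, the coordinate values, and the noise levels below are assumptions made for the example, not values from the talk.

```python
import random

def sample_ordering(means, sigma, rng):
    """Draw one noisy sample per item and return the items sorted by their samples."""
    samples = {item: rng.gauss(mu, sigma) for item, mu in means.items()}
    return sorted(samples, key=samples.get)

# Hypothetical latent coordinates shared by all individuals (illustrative values only).
means = {"Washington": 1.0, "Madison": 4.0, "Jackson": 7.0}
# Hypothetical per-individual noise levels: the means are shared, only sigma differs.
sigmas = {"person_1": 0.0, "person_2": 1.0, "person_3": 5.0}

rng = random.Random(0)
for person, sigma in sigmas.items():
    print(person, sample_ordering(means, sigma, rng))
```

An individual with zero noise always reproduces the latent order, while a large sigma makes orderings such as A < C < B increasingly likely.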
19
Graphical Model of Extended Thurstonian Model
Nodes: latent group means; individual noise level; mental representation; observed ordering
Plate: j individuals
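One way to write down the dependencies named in this graphical model is given below. The symbols are assumptions introduced here for illustration: \mu_i for the latent group mean of item i, \sigma_j for the noise level of individual j, x_{ij} for the mental representation, and y_j for the observed ordering. The Gaussian form is also an assumption, and the priors on \mu and \sigma are not given in the transcript, so they are left out.

```latex
\begin{align*}
x_{ij} &\sim \mathcal{N}\!\left(\mu_i,\ \sigma_j^{2}\right)
  && \text{mental representation of item } i \text{ for individual } j \\
y_j &= \operatorname{argsort}\left(x_{1j}, \dots, x_{Nj}\right)
  && \text{observed ordering of individual } j
\end{align*}
```

Inference then runs in the reverse direction: given only the observed orderings y_j, estimate the group means and each individual's noise level.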
20
Inferred Distributions for 44 US Presidents
George Washington (1), John Adams (2), Thomas Jefferson (3), James Madison (4), James Monroe (6), John Quincy Adams (5), Andrew Jackson (7), Martin Van Buren (8), William Henry Harrison (21), John Tyler (10), James Knox Polk (18), Zachary Taylor (16), Millard Fillmore (11), Franklin Pierce (19), James Buchanan (13), Abraham Lincoln (9), Andrew Johnson (12), Ulysses S. Grant (17), Rutherford B. Hayes (20), James Garfield (22), Chester Arthur (15), Grover Cleveland 1 (23), Benjamin Harrison (14), Grover Cleveland 2 (25), William McKinley (24), Theodore Roosevelt (29), William Howard Taft (27), Woodrow Wilson (30), Warren Harding (26), Calvin Coolidge (28), Herbert Hoover (31), Franklin D. Roosevelt (32), Harry S. Truman (33), Dwight Eisenhower (34), John F. Kennedy (37), Lyndon B. Johnson (36), Richard Nixon (39), Gerald Ford (35), James Carter (38), Ronald Reagan (40), George H.W. Bush (41), William Clinton (42), George W. Bush (43), Barack Obama (44)
error bars = median and minimum sigma
21
Calibration of individuals
[figure labels: inferred noise level for each individual; distance to ground truth; individual]
22
Wisdom of crowds effect
23
Alternative Heuristic Models
Many heuristic methods from voting theory, e.g., the Borda count method
Suppose we have 10 items
Assign a count of 10 to the first item, 9 to the second item, etc.
Add counts over individuals
Order items by the Borda count, i.e., rank by average rank across people
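The Borda steps listed on this slide can be sketched in a few lines. The five input orderings are small illustrative examples over four items rather than data from the experiments, and the alphabetical tie-break is an assumption, since the slide does not give a tie rule.

```python
from collections import defaultdict

def borda_aggregate(orderings):
    """Return items sorted by total Borda count, highest first."""
    counts = defaultdict(int)
    for ordering in orderings:
        n = len(ordering)
        for position, item in enumerate(ordering):
            # First place earns n points, second place n - 1, and so on.
            counts[item] += n - position
    # Ties are broken alphabetically here (an assumption, not from the slide).
    return sorted(counts, key=lambda item: (-counts[item], item))

orderings = ["DABC", "ABDC", "BADC", "ACBD", "ADBC"]
print(borda_aggregate(orderings))  # ['A', 'B', 'D', 'C']
```

Every individual contributes with equal weight here, which is the property the Thurstonian model relaxes by giving each person their own noise level.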
24
Model Comparison
[figure; label: Borda]
25
Overview of talk
Reconstruct the order of US presidents
Effect of group size and expertise
Reconstruct the order of events
Traveling Salesman Problem
26
Experiment
78 participants
17 ordering problems, each with 10 items
Chronological events
Physical measures
Purely ordinal problems, e.g. Ten Amendments, Ten Commandments
27
Ordering states west-east
Oregon (1), Utah (2), Nebraska (3), Iowa (4), Alabama (6), Ohio (5), Virginia (7), Delaware (8), Connecticut (9), Maine (10)
28
Ordering Ten Amendments
Freedom of speech & religion (1), Right to bear arms (2), No quartering of soldiers (4), No unreasonable searches (3), Due process (5), Trial by Jury (6), Civil Trial by Jury (7), No cruel punishment (8), Right to non-specified rights (10), Power for the States & People (9)
29
Ordering Ten Commandments
30
Effect of Group Size: random subgroups
31
How effective are small groups of experts?
Want to find experts endogenously, without feedback
Approach: select individuals with the smallest estimated noise levels based on previous tasks
We are identifying general expertise (“Spearman’s g”)
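The selection step described on this slide amounts to sorting individuals by their estimated noise level and keeping the lowest few. A minimal sketch, assuming the noise estimates from previous tasks are already available as a dictionary (the participant labels and values are made up for illustration):

```python
def select_experts(noise_by_person, k):
    """Return the k individuals with the smallest estimated noise levels."""
    return sorted(noise_by_person, key=noise_by_person.get)[:k]

# Hypothetical noise estimates inferred from previous tasks (made-up values).
estimated_noise = {"p1": 2.4, "p2": 0.7, "p3": 1.5, "p4": 0.9}
print(select_experts(estimated_noise, 2))  # ['p2', 'p4']
```

No ground truth enters this step, which is what makes the selection endogenous: the noise estimates come from the model fit to earlier orderings, not from scoring anyone against the correct answer.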
32
Group Composition based on prior performance
[figure labels: # previous tasks T = 0, T = 2, T = 8; group size (best individuals first)]
33
Endogenous: no feedback required
Exogenous: selecting people based on actual performance
34
Overview of talk
Reconstruct the order of US presidents
Effect of group size and expertise
Reconstruct the order of events
Traveling Salesman Problem
35
Recollecting Order from Episodic Memory
Study this sequence of images
36
Place the images in correct sequence (serial recall)
A B C D E F G H I J
37
Average results across 6 problems
[figure; label: Mean]
38
Calibration of individuals
[figure labels: inferred noise level; distance to ground truth; individual (pizza sequence; perturbation model)]
39
Overview of talk
Reconstruct the order of US presidents
Effect of group size and expertise
Reconstruct the order of events
Traveling Salesman Problem
40
Find the shortest route between cities
[figure: problem B30-21; panels: Individual 5, Individual 83, Individual 60, Optimal]
41
Dataset Vickers, Bovet, Lee, & Hughes (2003) 83 participants 7 problems of 30 cities
42
TSP Aggregation Problem
Data consists of city order only
No access to city locations
43
Heuristic Approach
Idea: find tours with edges for which many individuals agree
Calculate agreement matrix A
A = n × n matrix, where n is the number of cities
a_ij indicates the number of participants that connect cities i and j
Find the tour that maximizes agreement along its edges (this is itself a non-Euclidean TSP)
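The agreement-matrix idea can be sketched as follows. The objective formula did not survive in the transcript, so this example assumes the simplest reading: score a closed tour by the sum of a_ij over its edges. The brute-force search only works for a handful of cities and stands in for whatever solver was actually used. The three input tours are made up for illustration.

```python
from itertools import permutations

def agreement_matrix(tours, n):
    """a[i][j] = number of participants whose tour links cities i and j."""
    a = [[0] * n for _ in range(n)]
    for tour in tours:
        # Pair each city with its successor; the last city wraps to the first.
        for i, j in zip(tour, tour[1:] + tour[:1]):
            a[i][j] += 1
            a[j][i] += 1
    return a

def best_agreement_tour(a):
    """Brute-force the closed tour with the largest summed agreement (small n only)."""
    n = len(a)

    def score(tour):
        return sum(a[i][j] for i, j in zip(tour, tour[1:] + tour[:1]))

    # Fix city 0 as the start so rotations of the same tour are not re-scored.
    candidates = ([0] + list(rest) for rest in permutations(range(1, n)))
    return max(candidates, key=score)

# Hypothetical tours from three participants over five cities (city order only).
tours = [[0, 1, 2, 3, 4], [0, 1, 2, 4, 3], [0, 2, 1, 3, 4]]
a = agreement_matrix(tours, 5)
print(best_agreement_tour(a))  # [0, 1, 2, 3, 4]
```

Only the city orders are used, never the city locations, which matches the constraint of the aggregation problem. For the 30-city problems in the dataset, exhaustive search is infeasible and a TSP heuristic over the agreement scores would be needed.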
44
Line thickness = agreement
45
Blue = Aggregate Tour
46
Results averaged across 7 problems aggregate
47
Summary
Combine ordering / ranking data: going beyond numerical estimates or multiple choice questions
Incorporate individual differences: assume some individuals might be “experts”; going beyond models that treat every vote equally
Applications: combine multiple eyewitness accounts; combine solutions in complex problem-solving situations; fantasy football
48
That’s all
Do the experiments yourself: http://psiexp.ss.uci.edu/
49
Predictive Rankings: fantasy football
South Australian Football League (32 people rank 9 teams)
Australian Football League (29 people rank 16 teams)
50
Predicting problem difficulty
[figure labels: std (dispersion of noise levels across individuals); distance of group answer to ground truth; labeled points: ordering states geographically, city size rankings]
51
Related Concepts in Supervised Learning
Boosting: combining multiple classifiers
Bagging (Bootstrap Aggregating)