Hierarchical Bayesian Models for Aggregating Retrieved Memories across Individuals Mark Steyvers Department of Cognitive Sciences University of California,


1 Hierarchical Bayesian Models for Aggregating Retrieved Memories across Individuals. Mark Steyvers, Department of Cognitive Sciences, University of California, Irvine. Joint work with: Michael Lee, Brent Miller, Pernille Hemmer, Bill Batchelder, Paolo Napoletano.

2 Ordering problem: what is the correct order in time of these Presidents? Thomas Jefferson, Andrew Jackson, James Monroe, George Washington, John Adams.

3 Goal: aggregating responses. Individual orderings (D A B C; A B D C; B A D C; A C B D; A D B C) go into an aggregation algorithm, which outputs a group answer. Ground truth: A B C D. Question: does the group answer equal the ground truth?

4 Bayesian Approach. A generative model treats the ground truth (A B C D) as the latent common cause of the individual orderings (D A B C; A B D C; B A D C; A C B D; A D B C).

5 Important notes: no communication between individuals; there is always a true answer (ground truth); the aggregation algorithm never has access to the ground truth; the ground truth is only used for evaluation.

6 Matching problem: match Rembrandt, Van Gogh, Monet, and Renoir to A, B, C, D.

7 Wisdom of crowds phenomenon: the crowd estimate is often better than that of any individual in the crowd. (Think of independent noise influencing each individual.)

8 Examples of the wisdom of crowds phenomenon: Who Wants to Be a Millionaire?; Galton’s Ox (1907): the median of individual estimates comes close to the true answer.

9 Limitations of current “wisdom of crowds” research. Studies are restricted to numeric or categorical judgments and to simple averaging schemes: mode, median, mean. No treatment of individual differences: every “vote” is treated equally; downplayed role of expertise.

10 Cultural Consensus Theory (CCT), e.g., Romney, Batchelder, and Weller (1987): finds the “answer key” to multiple choice questions when the ground truth is lost; takes person and item differences into account. An informal version of CCT was also developed for ranking data.

11 Research goals: generalize the “wisdom of crowds” effect to more complex data. Aggregation of permutations: ranking data; matching (assignment) data.

12 Hierarchical Bayesian Models. Probability distributions over all permutations of items: with N items, there are N! permutations; e.g., when N = 44, we have 44! > 10^53 permutations. Approximate inference methods: MCMC. Cognitively plausible generative processes. Treatment of individual differences.
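
The size of the permutation space on this slide can be checked directly. The short sketch below is an illustration added to the transcript, not part of the talk; the function name count_orderings is hypothetical.

```python
import math


def count_orderings(n_items: int) -> int:
    """Number of distinct orderings (permutations) of n_items items."""
    return math.factorial(n_items)


n_presidents = 44
total = count_orderings(n_presidents)

# Exact integer arithmetic, so the comparison with 10^53 is exact.
print(f"{n_presidents}! exceeds 10^53: {total > 10**53}")
print(f"{n_presidents}! has {len(str(total))} decimal digits")
```

Because the space is this large, enumerating all orderings is out of reach, which is why the slide turns to approximate inference.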

13 Part I: Ordering Problems

14 Experiment 1. Task: order all 44 US presidents. Methods: 26 participants (college undergraduates); names of presidents written on cards; cards could be shuffled on a large table.

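15 Measuring performance. Kendall’s Tau: the number of adjacent pair-wise swaps needed to turn the participant ordering into the ground truth. Example: participant ordering 1 2 5 3 4, ground truth 1 2 3 4 5; the first swap gives 1 2 3 5 4 (= 1), the second gives 1 2 3 4 5 (= 1+1), so τ = 2.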
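
The distance on this slide can be computed by counting item pairs that appear in opposite order in the two orderings, which equals the minimum number of adjacent swaps. The sketch below is an added illustration; the function name kendall_tau_distance is hypothetical, and it assumes both orderings contain the same items with no ties.

```python
from itertools import combinations


def kendall_tau_distance(ordering, truth):
    """Count item pairs whose relative order differs between ordering and truth.

    This count equals the minimum number of adjacent pairwise swaps needed
    to turn `ordering` into `truth`.
    """
    position = {item: index for index, item in enumerate(ordering)}
    discordant = 0
    # combinations() preserves input order, so `first` precedes `second` in truth.
    for first, second in combinations(truth, 2):
        if position[first] > position[second]:
            discordant += 1
    return discordant


participant = [1, 2, 5, 3, 4]
ground_truth = [1, 2, 3, 4, 5]
print(kendall_tau_distance(participant, ground_truth))  # the slide's example: 2
```

The pairwise count is quadratic in the number of items, which is fine for 44 presidents.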

16 Empirical Results (figure: τ, with the random guessing level marked)

17 Many approaches for analyzing rank data… Probabilistic models: Thurstone (1927), Mallows (1957), Plackett-Luce (1975), Lebanon-Mao (2008). Spectral methods: Diaconis (1989). Heuristic methods from voting theory: Borda count. … however, many of these approaches were developed for preference rankings.

18 Bayesian Thurstonian Approach: each item (A, B, C) has a true coordinate on some dimension.

19 Bayesian Thurstonian Approach: … but there is noise because of encoding and/or retrieval error (Person 1; items A, B, C).

20 Bayesian Thurstonian Approach: each person’s mental representation is based on (latent) samples from these distributions (Person 1; items A, B, C).

21 Bayesian Thurstonian Approach: the observed ordering is based on the ordering of the samples. Person 1, observed ordering: A < B < C.

22 Bayesian Thurstonian Approach: people draw from distributions with common means but different variances. Person 1, observed ordering: A < B < C. Person 2, observed ordering: A < C < B.
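
The generative story in slides 18 to 22 can be simulated in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the noise is assumed Gaussian, the item coordinates and per-person sigma values are made up for illustration, and the function name simulate_ordering is hypothetical.

```python
import random


def simulate_ordering(true_means, sigma, rng):
    """Draw one latent sample per item and return the items sorted by sample.

    true_means: dict mapping item -> true coordinate (shared by all people)
    sigma:      this person's noise level (assumed Gaussian here)
    rng:        a random.Random instance, passed in for reproducibility
    """
    samples = {item: rng.gauss(mean, sigma) for item, mean in true_means.items()}
    return sorted(samples, key=samples.get)


# Hypothetical coordinates for three items; the true order is A < B < C.
true_means = {"A": 1.0, "B": 2.0, "C": 3.0}
# Hypothetical noise levels: a precise person and a noisy one.
person_sigmas = {"Person 1": 0.2, "Person 2": 1.5}

rng = random.Random(0)
for person, sigma in person_sigmas.items():
    print(person, simulate_ordering(true_means, sigma, rng))
```

A person with a small sigma almost always reproduces the true order, while a person with a large sigma often swaps neighbouring items, which is the individual difference the model is meant to capture.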

23 Graphical Model Notation: j = 1..3; shaded = observed; not shaded = latent.

24 Graphical Model of Bayesian Thurstonian Model: latent ground truth; individual ability; mental representation; observed ordering; repeated over j individuals.

25 Inference: need the posterior distribution. Markov Chain Monte Carlo: Gibbs sampling on [symbol missing]; Metropolis-Hastings on [symbol missing] and [symbol missing]. Draw 400 samples; the group ordering is based on the average of [symbol missing] across samples.

26 Wisdom of Crowds effect (figure: τ): the model’s ordering is as good as the best individual.

27 Inferred Distributions for 44 US Presidents (figure labels: median and minimum sigma): George Washington (1), John Adams (2), Thomas Jefferson (3), James Madison (4), James Monroe (6), John Quincy Adams (5), Andrew Jackson (7), Martin Van Buren (8), William Henry Harrison (21), John Tyler (10), James Knox Polk (18), Zachary Taylor (16), Millard Fillmore (11), Franklin Pierce (19), James Buchanan (13), Abraham Lincoln (9), Andrew Johnson (12), Ulysses S. Grant (17), Rutherford B. Hayes (20), James Garfield (22), Chester Arthur (15), Grover Cleveland 1 (23), Benjamin Harrison (14), Grover Cleveland 2 (25), William McKinley (24), Theodore Roosevelt (29), William Howard Taft (27), Woodrow Wilson (30), Warren Harding (26), Calvin Coolidge (28), Herbert Hoover (31), Franklin D. Roosevelt (32), Harry S. Truman (33), Dwight Eisenhower (34), John F. Kennedy (37), Lyndon B. Johnson (36), Richard Nixon (39), Gerald Ford (35), James Carter (38), Ronald Reagan (40), George H.W. Bush (41), William Clinton (42), George W. Bush (43), Barack Obama (44).

28 Model is calibrated (figure: σ against τ): individuals with large sigma are far from the truth.

29 Alternative Models. Many heuristic methods from voting theory, e.g., the Borda count method. Suppose we have 10 items: assign a count of 10 to the first item, 9 to the second item, etc.; add counts over individuals; order items by the Borda count, i.e., rank by average rank across people.
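
The Borda procedure described on this slide is short enough to write out. The sketch below is an added illustration; the function name borda_order is hypothetical, the alphabetical tie-break is an assumption (the slide does not say how ties are resolved), and the five orderings are the ones shown on slide 3.

```python
def borda_order(orderings):
    """Aggregate complete orderings with the Borda count.

    Each person gives n points to their first item, n - 1 to the second,
    and so on. Items are then sorted by total points, highest first.
    Ties are broken alphabetically (an assumption for this sketch).
    """
    n_items = len(orderings[0])
    counts = {}
    for ordering in orderings:
        for rank, item in enumerate(ordering):
            counts[item] = counts.get(item, 0) + (n_items - rank)
    return sorted(counts, key=lambda item: (-counts[item], str(item)))


# The five individual orderings from the aggregation slide.
responses = [
    ["D", "A", "B", "C"],
    ["A", "B", "D", "C"],
    ["B", "A", "D", "C"],
    ["A", "C", "B", "D"],
    ["A", "D", "B", "C"],
]
print(borda_order(responses))  # ['A', 'B', 'D', 'C']
```

On these five responses the Borda totals are A = 18, B = 13, D = 12, C = 7, so the group answer differs from the ground truth A B C D by one adjacent swap. Every person counts equally here, in contrast to the Thurstonian model's per-person sigma.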

30 Model Comparison (figure: τ)

31 Experiment 2: 78 participants; 17 problems, each with 10 items. Chronological events; physical measures; purely ordinal problems, e.g., Ten Amendments, Ten Commandments.

32 Ordering states west-east: Oregon (1), Utah (2), Nebraska (3), Iowa (4), Alabama (6), Ohio (5), Virginia (7), Delaware (8), Connecticut (9), Maine (10).

33 Ordering Ten Amendments

34 Ordering Ten Commandments: Worship any other God (1), Make a graven image (7), Take the Lord's name in vain (2), Break the Sabbath (3), Dishonor your parents (4), Murder (6), Commit adultery (8), Steal (5), Bear false witness (9), Covet (10).

35 Average results over 17 problems (figure: mean τ for Individuals, Thurstonian Model, Borda count, Mode)

36 Effect of Group Composition: how many individuals do we need to average over?

37 Effect of Group Size: random groups (figure: τ)

38 Experts vs. Crowds. Can we find experts in the crowd? Can we form small groups of experts? Approach: form a group for some particular task; select individuals with the smallest sigma (“experts”) based on previous tasks; vary the number of previous tasks.
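
The selection step on this slide can be sketched as follows. This is an illustration only: the slide does not say how sigma estimates from several previous tasks are combined, so averaging them is an assumption, and the function name select_experts and the example values are hypothetical.

```python
def select_experts(sigma_history, group_size):
    """Pick the group_size people with the smallest average inferred sigma.

    sigma_history: dict mapping person -> list of sigma estimates, one per
                   previous task (averaging them is an assumption here)
    """
    mean_sigma = {
        person: sum(sigmas) / len(sigmas)
        for person, sigmas in sigma_history.items()
    }
    # Smallest sigma first: the "best individuals first" ordering.
    return sorted(mean_sigma, key=mean_sigma.get)[:group_size]


# Hypothetical sigma estimates from T = 2 previous tasks.
history = {"p1": [0.5, 0.7], "p2": [0.2, 0.3], "p3": [1.0, 0.9]}
print(select_experts(history, 2))  # ['p2', 'p1']
```

No ground truth enters this selection; it relies only on the sigma values the model infers from the crowd's own responses.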

39 Group Composition based on prior performance (figure: τ against group size, best individuals first; panels for T = 0, T = 2, T = 8, where T = # previous tasks)

40 Methods for Selecting Experts. Endogenous: no feedback required. Exogenous: selecting people based on actual performance. (Figure: two τ panels.)

41 Model incorporating overall person ability: overall ability (j individuals); task specific ability (m tasks, j individuals).

42 Average results over 17 problems (figure: mean τ, including the new model)

43 Part II: Ordering Problems in Episodic Memory

44 Another ordering problem: order A, B, C, D in time. Video: http://www.youtube.com/watch?v=29VGZtnCD30&feature=related

45 Experiment 3: 26 participants; 6 videos: 3 videos with stereotyped event sequences (e.g., wedding) and 3 “unpredictable” videos (e.g., the example video); extracted 10 stills for testing. Method: study video followed by immediate ordering test of 10 items.

46 Bayesian Thurstonian Model (example with τ = 3)

47 Two other examples (τ = 1 and τ = 0)

48 Overall Results (figure: mean τ)

49 Part III: Matching Problems

50 Example Matching Problem (one-to-one). Languages: Dutch, Danish, Yiddish, Thai, Vietnamese, Chinese, Georgian, Russian, Japanese. Labels: A, B, C, D, E, F, G, H, I. Phrases: godt nytår; gelukkig nieuwjaar; a gut yohr; С Новым Годом; สวัสดีปีใหม่; Chúc Mừng Nǎm Mới; გილოცავთ ახალ წელს.

51 Experiment: 17 participants; 8 matching problems, e.g., car logos and brand names, first and last names of philosophers, flags and countries, Greek symbols and letter names. Number of items varied between 10 and 24; with 24 items, we have 24! possibilities.

52 Overall Results

53 Heuristic Aggregation Approach. Combinatorial optimization problem: maximize agreement in assigning N items to N responses. Hungarian algorithm: construct a count matrix M, where M_ij = number of people that paired item i with response j; find row and column permutations to maximize the diagonal sum; O(n^3).
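
The count matrix and the objective on this slide can be illustrated on a tiny problem. Note that the sketch below does not implement the Hungarian algorithm: it uses a brute-force search over all assignments, which finds the same optimum but has factorial cost and is only usable for very small N. The function names and the three example responses are hypothetical.

```python
from itertools import permutations


def count_matrix(responses, n_items):
    """Build M, where M[i][j] = number of people who paired item i with response j.

    responses: one list per person; person[i] is the response index
               that person assigned to item i.
    """
    matrix = [[0] * n_items for _ in range(n_items)]
    for person in responses:
        for item, response in enumerate(person):
            matrix[item][response] += 1
    return matrix


def best_assignment(matrix):
    """Brute-force stand-in for the Hungarian algorithm (small N only).

    Returns the one-to-one assignment of items to responses with the
    largest total agreement, together with that total.
    """
    n_items = len(matrix)
    best, best_score = None, -1
    for assignment in permutations(range(n_items)):
        score = sum(matrix[item][assignment[item]] for item in range(n_items))
        if score > best_score:
            best, best_score = list(assignment), score
    return best, best_score


# Three hypothetical people matching 3 items to 3 responses.
responses = [[0, 1, 2], [0, 2, 1], [1, 0, 2]]
M = count_matrix(responses, 3)
print(M)                   # [[2, 1, 0], [1, 1, 1], [0, 1, 2]]
print(best_assignment(M))  # ([0, 1, 2], 5)
```

For problems with 10 to 24 items, as in the experiment, an O(n^3) assignment solver such as the Hungarian algorithm is needed in place of the exhaustive loop.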

54 Hungarian Algorithm Example (figure legend: correct, incorrect)

55 Hungarian Algorithm Results (2)

56 Bayesian Matching Model. Proposed process: match “known” items; guess between remaining ones. Individual differences: some items are easier to know; some participants know more. Example: Dutch, Danish, Yiddish, Russian; godt nytår, gelukkig nieuwjaar, a gut yohr, С Новым Годом.
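
The proposed process (match the known items, then guess among the rest) can be simulated directly. This is a sketch under assumptions, not the authors' model code: guessing is taken to be a uniformly random one-to-one assignment of the leftover responses, the probabilities of knowing are supplied by hand rather than derived from person ability and item easiness, and the function name simulate_matching is hypothetical.

```python
import random


def simulate_matching(truth, p_know, rng):
    """Simulate one person's matching response.

    truth:  truth[i] is the correct response for item i
    p_know: p_know[i] is the probability this person knows item i
    rng:    a random.Random instance, passed in for reproducibility
    """
    n_items = len(truth)
    known = [rng.random() < p_know[i] for i in range(n_items)]
    # Known items are matched correctly.
    answer = [truth[i] if known[i] else None for i in range(n_items)]
    # The responses left over are shuffled and handed out as guesses.
    remaining = [truth[i] for i in range(n_items) if not known[i]]
    rng.shuffle(remaining)
    for i in range(n_items):
        if answer[i] is None:
            answer[i] = remaining.pop()
    return answer


languages = ["Dutch", "Danish", "Yiddish", "Russian"]
p_know = [0.9, 0.5, 0.2, 0.8]  # hypothetical values for illustration
print(simulate_matching(languages, p_know, random.Random(0)))
```

Because guesses are drawn only from the unclaimed responses, every simulated answer is a valid one-to-one matching, and a guess can still be correct by chance.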

57 Graphical Model: latent ground truth; knowledge state; prob. of knowing (person ability, item easiness); observed matching; over i items and j individuals.

58 Overall Modeling Results

59 Calibration at the level of items and people (for the paintings problem); panels: ITEMS, INDIVIDUALS

60 How predictive are subject-provided confidence ratings? (Figure: accuracy against # guesses estimated by individual and against # guesses estimated by model, based on variable A; r = -.42, r = -.77.)

61 Part IV: Open Issues

62 When do we get the wisdom of crowds effect? Independent errors: different people knowing different things. Population response centered around the ground truth. Some minimal number of individuals: 10-20 individuals are often sufficient.

63 What are methods for finding experts? 1) Self-reported expertise: unreliable; this has led to claims of a “myth of expertise”. 2) Explicit scores obtained by comparing to ground truth, but ground truth might not be immediately available. 3) Endogenously discover experts: use the crowd to discover experts; small groups of experts can be effective.

64 What to do about systematic biases? In some tasks, individuals systematically distort the ground truth: spatial and temporal distortions; memory distortions (e.g., false memory); decision-making distortions. Does this diminish the wisdom of crowds effect? Maybe… but a model that predicts these systematic distortions might be able to “undo” them.

65 Can we build domain specific models? The Thurstonian model was applied to a wide variety of problems. How about domain specific models, e.g., applying serial recall models to serial recall? These could better specify sources of noise and model systematic biases.

66 That’s all. Do the experiments yourself: http://psiexp.ss.uci.edu/

67 Other slides

68 Results separated by problem

69 Notes. Noise in Thurstonian models: acquisition/encoding noise; retrieval noise. Link to crowd within (Ed Vul): are our results due to wisdom of crowds or of individuals? Probably a bit of both, and we cannot tell with our experiments. However, there is probably a fair amount of encoding noise that would not benefit from repeated measurements within individuals. Different individuals probably do know different things.

70 To Do. Compare explicitly estimated number of guesses with latent confidence. Identifiability issue: fix mean A? Hierarchical model: test on small numbers of subjects. Model comparisons on small sets of subjects. Look at kurtosis of sigma distributions.

71 Modeling Group Serial Recall. Goal: infer the distribution over orderings of events given verbal reports, i.e., P( original order | verbal report ). Many models for serial recall, e.g., Estes Perturbation model (1972), Shiffrin & Cook (1978), SOB (2002), SIMPLE (2007), but many of these models do not have a likelihood function p( item 1, item 2, …, item N | memory contents ).

72 Bayesian Algorithm: not every person has equal weight (figure legend: correct, incorrect)

73 Summary of Findings. Extended wisdom of crowds to combinatorial problems: approximate inference (MCMC) to infer probability distributions over permutations. Bayesian methods that are calibrated: we can tell who is likely to be accurate without having ground truth available.

74 Graphical Model: latent ground truth; knowledge state; prob. of knowing; item and person parameters; observed matching; over i items and j individuals.

75 When do we get the Wisdom of Crowds effect? Analyze model performance in a variety of tasks.

76 MDS solution of pairwise tau distances (figure annotation: distance to truth)

77 MDS solution of pairwise tau distances

78 Modeling Performance Across Tasks. The current model is applied independently across tasks. Extend the hierarchical model with a random effects approach to tasks: each person has an overall ability (Spearman’s “g”); ability in a specific task varies around overall ability.

