Hierarchical Bayesian Models for Aggregating Retrieved Memories across Individuals
Mark Steyvers, Department of Cognitive Sciences, University of California, Irvine
Joint work with: Michael Lee, Brent Miller, Pernille Hemmer, Bill Batchelder, Paolo Napoletano

Ordering problem: what is the correct order of these Presidents in time? Two example orderings: (1) Thomas Jefferson, Andrew Jackson, James Monroe, George Washington, John Adams; (2) Andrew Jackson, Thomas Jefferson, James Monroe, John Adams, George Washington.

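Goal: aggregating responses. Individual orderings (D A B C; A B D C; B A D C; A C B D; A D B C) are passed to an aggregation algorithm, which outputs a group answer. Is the group answer equal to the ground truth (A B C D)?

Bayesian Approach: a generative model explains the individual orderings (D A B C; A B D C; B A D C; A C B D; A D B C) as arising from the ground truth (A B C D), which is a latent common cause.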

Important notes: There is no communication between individuals. There is always a true answer (ground truth). The aggregation algorithm never has access to the ground truth; the ground truth is used only for evaluation.

Matching problem: match the artists Rembrandt, Van Gogh, Monet, and Renoir to the paintings A, B, C, D.

Wisdom of crowds phenomenon: the crowd estimate is often better than any individual in the crowd. (Think of independent noise influencing each individual.)

Examples of the wisdom of crowds phenomenon: Who Wants to Be a Millionaire?; Galton’s Ox (1907): the median of individual estimates comes close to the true answer.

Limitations of current “wisdom of crowds” research: Studies are restricted to numeric or categorical judgments and to simple averaging schemes (mode, median, mean). There is no treatment of individual differences: every “vote” is treated equally, and the role of expertise is downplayed.

Cultural Consensus Theory (CCT), e.g., Romney, Batchelder, and Weller (1987): finds the “answer key” to multiple-choice questions when the ground truth is lost; takes person and item differences into account. An informal version of CCT has also been developed for ranking data.

Research goals: generalize the “wisdom of crowds” effect to more complex data, namely the aggregation of permutations: ranking data and matching (assignment) data.

Hierarchical Bayesian Models: probability distributions over all permutations of items. With N items, there are N! permutations; e.g., when N = 44, there are 44! > 10^53 permutations. Approximate inference methods: MCMC. Cognitively plausible generative processes. Treatment of individual differences.
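As a quick check on the size of the permutation space mentioned above, the following minimal sketch computes 44! with Python's standard library. The variable names are illustrative only and do not come from the slides.

```python
import math

n_items = 44
# Number of distinct orderings (permutations) of 44 items.
n_permutations = math.factorial(n_items)

# Print in scientific notation and confirm the bound quoted on the slide.
print(f"{n_permutations:.3e}")
print(n_permutations > 10**53)
```

Enumerating a space of this size is not feasible, which is why the slide points to approximate inference.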

Part I: Ordering Problems

Experiment 1. Task: order all 44 US presidents. Methods: 26 participants (college undergraduates); names of presidents were written on cards; cards could be shuffled on a large table.

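Measuring performance. Kendall’s tau (τ): the number of adjacent pairwise swaps needed to turn the participant’s ordering into the ground truth. Example: participant ordering 1 2 5 3 4, ground truth 1 2 3 4 5. Swapping 5 and 3 gives 1 2 3 5 4 (τ = 1); swapping 5 and 4 gives 1 2 3 4 5 (τ = 1 + 1 = 2).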
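The distance described on this slide can be sketched in a few lines. This is a minimal illustration, assuming both orderings contain the same items; the function name `kendall_tau_distance` is hypothetical. It counts item pairs that appear in opposite relative order, which equals the minimum number of adjacent pairwise swaps.

```python
def kendall_tau_distance(ordering, truth):
    """Count item pairs whose relative order differs between the two orderings.

    This pair count equals the minimum number of adjacent swaps needed
    to transform `ordering` into `truth`.
    """
    position = {item: rank for rank, item in enumerate(truth)}
    ranks = [position[item] for item in ordering]
    swaps = 0
    for i in range(len(ranks)):
        for j in range(i + 1, len(ranks)):
            if ranks[i] > ranks[j]:  # pair is out of order relative to the truth
                swaps += 1
    return swaps


# The example from the slide: two adjacent swaps are needed.
print(kendall_tau_distance([1, 2, 5, 3, 4], [1, 2, 3, 4, 5]))  # → 2
```

The quadratic loop is fine for 10 or 44 items; a merge-sort based count would be the usual choice for much longer lists.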

Empirical Results (figure; annotation: τ for random guessing)

Many approaches for analyzing rank data: probabilistic models (Thurstone, 1927; Mallows, 1957; Plackett-Luce, 1975; Lebanon-Mao, 2008); spectral methods (Diaconis, 1989); heuristic methods from voting theory (Borda count); … However, many of these approaches were developed for preference rankings.

Bayesian Thurstonian Approach: Each item (A, B, C) has a true coordinate on some dimension.

Bayesian Thurstonian Approach: … but there is noise because of encoding and/or retrieval error (Person 1).

Bayesian Thurstonian Approach: Each person’s mental representation is based on (latent) samples of these distributions.

Bayesian Thurstonian Approach: The observed ordering is based on the ordering of the samples. Observed ordering for Person 1: A < B < C.

Bayesian Thurstonian Approach: People draw from distributions with common means but different variances. Observed ordering for Person 1: A < B < C. Observed ordering for Person 2: A < C < B.
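The generative process described on these slides can be sketched as a short simulation. This is an illustration under stated assumptions, not the authors' implementation: the noise is assumed to be Gaussian, and the function name `simulate_orderings` and its parameters are hypothetical.

```python
import random


def simulate_orderings(true_means, sigmas, seed=0):
    """Simulate one observed ordering per person.

    true_means: dict mapping item -> true coordinate (shared by everyone).
    sigmas: one noise level per person (assumed Gaussian standard deviation).
    """
    rng = random.Random(seed)
    orderings = []
    for sigma in sigmas:
        # Latent mental representation: one noisy sample per item.
        samples = {item: rng.gauss(mu, sigma) for item, mu in true_means.items()}
        # Observed ordering: items sorted by their sampled coordinates.
        orderings.append(sorted(samples, key=samples.get))
    return orderings


truth = {"A": 1.0, "B": 2.0, "C": 3.0}
# Person 1 is precise (small sigma); person 2 is noisy (large sigma).
for ordering in simulate_orderings(truth, sigmas=[0.1, 2.0]):
    print(" < ".join(ordering))
```

A person with a small sigma will usually reproduce the true order, while a person with a large sigma will often swap neighboring items, which is the individual difference the model is meant to capture.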

Graphical Model Notation (example plate: j = 1..3): shaded = observed, not shaded = latent.

Graphical Model of Bayesian Thurstonian Model: latent ground truth; individual ability; mental representation; observed ordering; plate over j individuals.

Inference: We need the posterior distribution [symbols lost]. Markov chain Monte Carlo: Gibbs sampling on [symbol lost]; Metropolis-Hastings on [symbol lost] and [symbol lost]. Draw 400 samples; the group ordering is based on the average of [symbol lost] across samples.

Wisdom of Crowds effect (figure): the model’s ordering is as good as the best individual.

Inferred Distributions for 44 US Presidents (figure legend: median and minimum sigma): George Washington (1), John Adams (2), Thomas Jefferson (3), James Madison (4), James Monroe (6), John Quincy Adams (5), Andrew Jackson (7), Martin Van Buren (8), William Henry Harrison (21), John Tyler (10), James Knox Polk (18), Zachary Taylor (16), Millard Fillmore (11), Franklin Pierce (19), James Buchanan (13), Abraham Lincoln (9), Andrew Johnson (12), Ulysses S. Grant (17), Rutherford B. Hayes (20), James Garfield (22), Chester Arthur (15), Grover Cleveland 1 (23), Benjamin Harrison (14), Grover Cleveland 2 (25), William McKinley (24), Theodore Roosevelt (29), William Howard Taft (27), Woodrow Wilson (30), Warren Harding (26), Calvin Coolidge (28), Herbert Hoover (31), Franklin D. Roosevelt (32), Harry S. Truman (33), Dwight Eisenhower (34), John F. Kennedy (37), Lyndon B. Johnson (36), Richard Nixon (39), Gerald Ford (35), James Carter (38), Ronald Reagan (40), George H.W. Bush (41), William Clinton (42), George W. Bush (43), Barack Obama (44).

Model is calibrated (figure: σ versus τ): individuals with large sigma are far from the truth.

Alternative Models: many heuristic methods from voting theory, e.g., the Borda count method. Suppose we have 10 items: assign a count of 10 to the first item, 9 to the second item, etc.; add the counts over individuals; order the items by their Borda count, i.e., rank by average rank across people.
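The Borda count steps above can be sketched as follows. This is a minimal illustration with hypothetical toy orderings of four items; the function name `borda_order` is an assumption, and ties are broken by the order of items in the first person's response, a choice the slides do not specify.

```python
def borda_order(orderings):
    """Aggregate orderings by Borda count.

    Each ordering lists the same items from first to last. With n items,
    the first item receives n points, the second n - 1, and so on.
    """
    n = len(orderings[0])
    counts = {item: 0 for item in orderings[0]}
    for ordering in orderings:
        for position, item in enumerate(ordering):
            counts[item] += n - position
    # Highest total first; Python's stable sort keeps first-response order on ties.
    return sorted(counts, key=lambda item: -counts[item])


# Five hypothetical individual orderings of the items A-D.
responses = [list("DABC"), list("ABDC"), list("BADC"), list("ACBD"), list("ADBC")]
print(borda_order(responses))  # → ['A', 'B', 'D', 'C']
```

Unlike the Thurstonian model, this scheme weights every person equally, so it cannot discount unreliable individuals.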

Model Comparison

Experiment 2: 78 participants; 17 problems, each with 10 items. Problem types: chronological events; physical measures; purely ordinal problems, e.g., Ten Amendments, Ten Commandments.

Ordering states west-east: Oregon (1), Utah (2), Nebraska (3), Iowa (4), Alabama (6), Ohio (5), Virginia (7), Delaware (8), Connecticut (9), Maine (10).

Ordering Ten Amendments

Ordering Ten Commandments: Worship any other God (1), Make a graven image (7), Take the Lord's name in vain (2), Break the Sabbath (3), Dishonor your parents (4), Murder (6), Commit adultery (8), Steal (5), Bear false witness (9), Covet (10).

Average results over 17 Problems (figure: mean τ for the Thurstonian model, Borda count, mode, and individuals)

Effect of Group Composition: how many individuals do we need to average over?

Effect of Group Size: random groups

Experts vs. Crowds: Can we find experts in the crowd? Can we form small groups of experts? Approach: form a group for some particular task; select individuals with the smallest sigma (“experts”) based on previous tasks; vary the number of previous tasks.

Group Composition based on prior performance (figure: τ against group size, best individuals first, for T = 0, T = 2, and T = 8 previous tasks)

Methods for Selecting Experts. Endogenous: no feedback required. Exogenous: selecting people based on actual performance.

Model incorporating overall person ability (graphical model: overall ability; task-specific ability; plates over j individuals and m tasks)

Average results over 17 Problems (figure: mean τ, including the new model)

Part II: Ordering Problems in Episodic Memory

Another ordering problem (stills A, B, C, D to be ordered in time): http://www.youtube.com/watch?v=29VGZtnCD30&feature=related

Experiment 3: 26 participants; 6 videos: 3 videos with stereotyped event sequences (e.g., wedding) and 3 “unpredictable” videos (e.g., the example video); 10 stills were extracted for testing. Method: study of the video followed by an immediate ordering test of the 10 items.

Bayesian Thurstonian Model (example with τ = 3)

Two other examples (τ = 1 and τ = 0)

Overall Results (figure: mean τ)

Part III: Matching Problems

Example Matching Problem (one-to-one). Languages: Dutch, Danish, Yiddish, Thai, Vietnamese, Chinese, Georgian, Russian, Japanese. Greetings labeled A to I (those recovered): godt nytår; gelukkig nieuwjaar; a gut yohr; С Новым Годом; สวัสดีปีใหม่; Chúc Mừng Nǎm Mới; გილოცავთ ახალ წელს.

Experiment: 17 participants; 8 matching problems, e.g., car logos and brand names; first and last names of philosophers; flags and countries; Greek symbols and letter names. The number of items varied between 10 and 24; with 24 items, there are 24! possibilities.

Overall Results

Heuristic Aggregation Approach: a combinatorial optimization problem that maximizes agreement in assigning N items to N responses. Hungarian algorithm: construct a count matrix M, where M_ij = number of people that paired item i with response j; find row and column permutations that maximize the diagonal sum; complexity O(n^3).
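The objective described above can be illustrated with a small sketch that builds the count matrix and then finds the maximum-agreement assignment. Note the swap: for brevity this uses exhaustive search over permutations rather than the Hungarian algorithm, so it is only practical for a handful of items; the Hungarian algorithm reaches the same optimum in O(n^3). Function names and the toy data are hypothetical.

```python
from itertools import permutations


def count_matrix(matchings, n):
    """M[i][j] = number of people who paired item i with response j."""
    M = [[0] * n for _ in range(n)]
    for matching in matchings:
        for item, response in enumerate(matching):
            M[item][response] += 1
    return M


def best_assignment(M):
    """Return the one-to-one assignment with the largest total agreement.

    Exhaustive search stands in for the Hungarian algorithm here and is
    only feasible for small n (it examines all n! assignments).
    """
    n = len(M)
    return max(
        permutations(range(n)),
        key=lambda perm: sum(M[i][perm[i]] for i in range(n)),
    )


# Three hypothetical participants; matching[i] is the response chosen for item i.
matchings = [(0, 1, 2), (0, 2, 1), (1, 0, 2)]
M = count_matrix(matchings, n=3)
print(best_assignment(M))  # → (0, 1, 2)
```

In this toy case no response to item 1 has a majority, but the one-to-one constraint resolves it: responses 0 and 2 are claimed by items 0 and 2, leaving response 1.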

Hungarian Algorithm Example (figure legend: correct, incorrect)

Hungarian Algorithm Results (2)

Bayesian Matching Model. Proposed process: match “known” items; guess between the remaining ones. Individual differences: some items are easier to know; some participants know more. Example items: Dutch, Danish, Yiddish, Russian; godt nytår, gelukkig nieuwjaar, a gut yohr, С Новым Годом.
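The proposed process (match known items, then guess among the rest) can be sketched as a simulation of a single participant. This is a hedged illustration only: the function name `simulate_matching`, the per-item probabilities in `p_know`, and the way they would be derived from person ability and item easiness are assumptions, not details given on the slide.

```python
import random


def simulate_matching(truth, p_know, rng):
    """Simulate one participant's one-to-one matching.

    truth[i] is the correct response for item i.
    p_know[i] is the (assumed) probability that this person knows item i.
    """
    n = len(truth)
    known = [rng.random() < p_know[i] for i in range(n)]
    # Known items are matched correctly.
    response = [truth[i] if known[i] else None for i in range(n)]
    # Unknown items are guessed by shuffling the responses still available.
    remaining = [truth[i] for i in range(n) if not known[i]]
    rng.shuffle(remaining)
    for i in range(n):
        if not known[i]:
            response[i] = remaining.pop()
    return response


rng = random.Random(1)
truth = ["Dutch", "Danish", "Yiddish", "Russian"]
print(simulate_matching(truth, p_know=[0.9, 0.5, 0.2, 0.9], rng=rng))
```

Because guesses are drawn only from the unclaimed responses, a participant who knows all but one item always gets the last one right, a property of one-to-one matching that simple per-item voting ignores.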

Graphical Model: latent ground truth; observed matching; knowledge state; probability of knowing; person ability; item easiness; plates over i items and j individuals.

Overall Modeling Results

Calibration at level of items and people (for paintings problem); figure panels: items, individuals.

How predictive are subject-provided confidence ratings? Accuracy against the number of guesses estimated by the individual: r = -.42. Accuracy against the number of guesses estimated by the model (based on variable A): r = -.77.

Part IV: Open Issues

When do we get a wisdom of crowds effect? Independent errors: different people knowing different things. Population response centered around the ground truth. Some minimal number of individuals: 10-20 individuals are often sufficient.

What are methods for finding experts? 1) Self-reported expertise: unreliable; this has led to claims of a “myth of expertise”. 2) Explicit scores obtained by comparing to ground truth; but ground truth might not be immediately available. 3) Endogenously discover experts: use the crowd to discover experts; small groups of experts can be effective.

What to do about systematic biases? In some tasks, individuals systematically distort the ground truth: spatial and temporal distortions; memory distortions (e.g., false memory); decision-making distortions. Does this diminish the wisdom of crowds effect? Maybe… but a model that predicts these systematic distortions might be able to “undo” them.

Can we build domain-specific models? The Thurstonian model was applied to a wide variety of problems. How about domain-specific models, e.g., applying serial recall models to serial recall? Such models could better specify sources of noise and model systematic biases.

That’s all. Do the experiments yourself: http://psiexp.ss.uci.edu/
