Rethinking the ESP Game
Stephen Robertson, Milan Vojnovic, Ingmar Weber*
Microsoft Research & Yahoo! Research
*This work was done while I was a visiting researcher at MSRC.
- 2 - The ESP Game – Live Demo
Show it live. (2 min)
Alternative version.
- 3 - The ESP Game – Summary
– Two players try to agree on a label to be added to an image
– No way to communicate
– Entered labels are only revealed at the end
– Known labels are “off-limits”
– ESP refers to “extrasensory perception”: read the other person’s mind
- 4 - The ESP Game – History
– Developed by Luis von Ahn and Laura Dabbish at CMU in 2004
– Goal: improve image search
– Licensed by Google in 2006
– A prime example of harvesting human intelligence for difficult tasks
– Many variants (music, shapes, …)
- 5 - The ESP Game – Strengths and Weaknesses
Strengths
– Creative approach to a hard problem
– Fun to play
– Vast majority of labels are appropriate
– Difficult to spam
– Powerful idea: reaching consensus with little or no communication
- 6 - The ESP Game – Strengths and Weaknesses
Weaknesses
– The ultimate objective is ill-defined
– Finds mostly general labels
– There are already millions of images for these
– The “lowest common denominator” problem
– Human time is used sub-optimally
- 7 - A “Robot” Playing the ESP Game
Video of recorded play.
- 8 - The ESP Game – Labels are Predictable
– Synonyms are redundant: “guy” => “man” for 81% of images
– Co-occurrence reduces “new” information: “clouds” => “sky” for 68% of images
– Colors are easy to agree on: “black” alone accounts for 3.3% of all occurrences
- 9 - How to Predict the Next Label
T = {“beach”, “water”}, next label t = ?
- 10 - How to Predict the Next Label
Want to know:
– P(“blue” next label | {“beach”, “water”})
– P(“car” next label | {“beach”, “water”})
– P(“sky” next label | {“beach”, “water”})
– P(“bcn” next label | {“beach”, “water”})
Problem of data sparsity!
- 11 - How to Predict the Next Label
Want to know:
P(t next label | T) = P(T | t next label) · P(t) / P(T)
(Bayes’ Theorem: P(A, B) = P(A|B) · P(B) = P(B|A) · P(A))
Use conditional independence …
Thought experiment: give a random topic to two people and ask each to think of 3 related terms.
- 12 - Conditional Independence
[Figure: two players, p1 and p2, each suggest terms for a hidden topic. For “Spain”: Madrid, sun, paella, beach, soccer, flamenco. For “blue”: sky, water, eyes, azul, blau, bleu.]
P(A, B | C) = P(A | C) · P(B | C)
P(“p1: sky”, “p2: azul” | “blue”) = P(“p1: sky” | “blue”) · P(“p2: azul” | “blue”)
- 13 - How to Predict the Next Label
P({s1, s2} | t) · P(t) / P(T) = P(s1 | t) · P(s2 | t) · P(t) / P(T)
(The conditional independence assumption is violated in practice, but “close enough”.)
P(s | t) will still be zero very often → smoothing:
P̂(s | t) = (1 − λ) · P(s | t) + λ · P(s)
The λ · P(s) term is a non-zero background probability.
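A minimal sketch of this smoothing step in Python; the count structures are assumptions for illustration (the slide only gives the formula, and λ = 0.85 comes from the next slide):

    # Linearly interpolated smoothing, as on the slide:
    # P_hat(s | t) = (1 - lam) * P(s | t) + lam * P(s)
    def smoothed_cond_prob(s, t, cooc, counts, total, lam=0.85):
        """cooc[(s, t)]: tag sets containing both s and t (assumed structure);
        counts[x]: tag sets containing x; total: total label occurrences."""
        p_s_given_t = cooc.get((s, t), 0) / counts[t] if counts.get(t) else 0.0
        p_s = counts.get(s, 0) / total  # the non-zero background probability
        return (1 - lam) * p_s_given_t + lam * p_s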
- 14 - How to Predict the Next Label
P(t next label | T already present) = ∏_{s ∈ T} P̂(s | t) · P(t) / C
where C is a normalizing constant.
λ is chosen using a “validation set”; λ = 0.85 in the experiments.
Model trained on ~13,000 tag sets.
Also see: naïve Bayes classifier (conditional independence assumption + Bayes’ Theorem).
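Putting slides 11–14 together, a sketch of the full next-label predictor, i.e. a naïve Bayes ranker over candidate labels (smoothed_cond_prob is the sketch above; the data structures remain assumptions):

    from math import prod

    # P(t next | T) is proportional to prod_{s in T} P_hat(s | t) * P(t).
    def next_label_distribution(T, vocab, cooc, counts, total, lam=0.85):
        scores = {}
        for t in vocab:
            if t in T:
                continue  # t is already present, so it cannot be the next label
            p_t = counts.get(t, 0) / total
            likelihood = prod(
                smoothed_cond_prob(s, t, cooc, counts, total, lam) for s in T
            )
            scores[t] = likelihood * p_t
        C = sum(scores.values())  # the normalizing constant from the slide
        return {t: v / C for t, v in scores.items()} if C else scores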
- 15 - Experimental Results: Part 1
Number of
– games played: 205
– images encountered: 1,335
– images w/ OLT (off-limits terms): 1,105
Percentage w/ match
– all images: 69%
– only images with OLTs: 81%
– all entered tags: 17%
Av. number of labels entered
– per image: 4.1
– per game: 26.7
Agreement index
– mean: 2.6
– median: 2.0
The “robot” plays reasonably well, and it plays human-like.
- 16 - Quantifying “Predictability” and “Information”
So, labels are fairly predictable. But how can we quantify “predictability”?
- 17 - Quantifying “Predictability” and “Information”
Examples of varying predictability:
– “sunny” vs. “cloudy” tomorrow in BCN
– The roll of a cubic die
– The next single letter in “barcelo*”
– The next single letter in “re*”
– The clicked search result for “yahoo research”
- 18 - Entropy and Information
An event occurring with probability p corresponds to an information of −log₂(p) bits …
… the number of bits required to encode it in an optimally compressed encoding.
Example: compressed weather forecast:
P(“sunny”) = 0.5 → 0 (1 bit)
P(“cloudy”) = 0.25 → 10 (2 bits)
P(“rain”) = 0.125 → 110 (3 bits)
P(“thunderstorm”) = 0.125 → 111 (3 bits)
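The slide’s numbers check out; a quick verification in Python:

    from math import log2

    forecast = {"sunny": 0.5, "cloudy": 0.25, "rain": 0.125, "thunderstorm": 0.125}

    # Information content -log2(p) reproduces the code lengths above: 1, 2, 3, 3 bits.
    for event, p in forecast.items():
        print(event, -log2(p))

    # Expected information (= expected code length here) is the Shannon entropy: 1.75 bits.
    print(sum(-p * log2(p) for p in forecast.values()))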
- 19 - Entropy and Information
p = 1 → 0 bits of information
– The cubic die showed a number in [1,6]
p ≈ 0 → many, many bits of information
– The numbers drawn in the lottery
“information” = “amount of surprise”
- 20 - Entropy and Information
Expected information for p₁, p₂, …, pₙ:
Σᵢ −pᵢ · log₂(pᵢ) = (Shannon) entropy
You might not know the true p₁, p₂, …, pₙ, but believe they are p̂₁, p̂₂, …, p̂ₙ. Then, w.r.t. p̂, you observe
Σᵢ −pᵢ · log₂(p̂ᵢ),
which is minimized for p̂ = p.
Here p̂ is given by the earlier model, and p is then observed.
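In other words, the model is scored by its cross-entropy against what players actually typed. A small sketch (the example distributions are made up):

    from math import log2

    # Bits observed when outcomes follow p but are predicted by model q:
    # sum_i -p_i * log2(q_i); this equals the entropy of p exactly when q == p.
    def cross_entropy(p, q):
        return sum(-pi * log2(qi) for pi, qi in zip(p, q) if pi > 0)

    p = [0.5, 0.25, 0.125, 0.125]        # observed label frequencies
    print(cross_entropy(p, p))           # 1.75 bits: the entropy itself
    print(cross_entropy(p, [0.25] * 4))  # 2.0 bits: a worse (uniform) model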
- 21 - Experimental Results: Part 2
Av. information per position of label in tag set:
position  1    2    3    4    5
bits      9.2  8.5  8.0  7.7  7.5
Later labels are more predictable.
(Equidistribution = 12.3 bits; “static” distribution = 9.3 bits.)
Av. information per position of human suggestions:
position  1    2    3    4    5+
bits      8.7  9.4  10.0 10.6 11.7
The human thinks harder and harder.
- 22 - Improving the ESP Game
– Score points according to −log₂(p), the number of bits of information added to the system (see the sketch after this list)
– Have an activation time limit for “obvious” labels: removes the immediate satisfaction for simple matches
– Hide the off-limits terms: players have to be more careful to avoid “obvious” labels
– Try to match “experts”: use previous tags or meta information
– Educate players: use previously labeled images to unlearn behavior
– Automatically expand the off-limits list: easy, but 10+ terms is not practical
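A minimal sketch of the first proposal, assuming the predictor from slide 14 supplies the probability p (the function name and the probability floor are mine, not the authors’):

    from math import log2

    # Score an agreed label by -log2(p): predictable labels such as "sky" given
    # {"beach", "water"} earn few points, surprising agreed labels earn many.
    def score_for_match(label, model_dist):
        p = model_dist.get(label, 1e-9)  # floor for labels the model never predicted
        return -log2(p)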
- 23 - Questions
Thank you!
ingmar@yahoo-inc.com