CS 2750: Machine Learning Active Learning and Crowdsourcing Prof. Adriana Kovashka University of Pittsburgh April 18, 2016
Collecting data on Amazon Mechanical Turk (www.mturk.com): a requester posts a task such as "Is this a dog? Yes / No"; a broker routes the task to workers, and each answer earns a small payment (e.g., $0.01). [Diagram: workers, broker, task, answer, pay] Alex Sorokin
Annotation protocols: type keywords, select relevant images, click on landmarks, outline something, ... anything else. Alex Sorokin
Type keywords $0.01 Alex Sorokin
Select examples $0.02 Alex Sorokin
Outline something $0.01 http://visionpc.cs.uiuc.edu/~largescale/results/production-3-2/results_page_013.html Alex Sorokin
Motivation: [one annotated example] × 100,000 = $5,000. Custom annotations, large scale, low price. Alex Sorokin
Issues: quality (how good is it? how can we be sure?) and price (how should it be priced?). Alex Sorokin
Ensuring annotation quality: consensus / multiple annotations / "wisdom of the crowd"; qualification exams; gold standard questions; grading tasks (a second tier of workers who grade others). A minimal sketch of two of these checks follows below. Adapted from Alex Sorokin
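The sketch below is an illustration, not part of the original slides: it shows majority-vote consensus over redundant labels and a gold-standard screen for workers; the function names and thresholds are assumptions.

```python
from collections import Counter

def consensus_label(labels, min_agreement=0.6):
    """Majority vote over labels from multiple workers for one item.

    Returns the winning label, or None if agreement is too low
    (the item could then be sent out for additional labels).
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label if votes / len(labels) >= min_agreement else None

def passes_gold_standard(worker_answers, gold, min_accuracy=0.8):
    """Screen a worker using gold standard questions with known answers."""
    asked = [q for q in gold if q in worker_answers]
    if not asked:
        return False
    correct = sum(worker_answers[q] == gold[q] for q in asked)
    return correct / len(asked) >= min_accuracy

# Example: three workers answer "Is this a dog?"
print(consensus_label(["yes", "yes", "no"]))          # -> yes
print(passes_gold_standard({"img1": "yes", "img2": "no"},
                           {"img1": "yes", "img2": "yes"}))  # -> False (50% < 80%)
```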
Pricing: a trade-off between throughput and cost. Higher pay can actually attract scammers; some studies find that the most accurate results are achieved when Turkers do tasks for free. Adapted from Alex Sorokin
Games with a purpose: Luis von Ahn Associate professor at CMU One of the “fathers” of crowdsourcing Created the ESP Game, Peekaboom, and several other “games with a purpose”
THE ESP GAME: A TWO-PLAYER ONLINE GAME. PARTNERS DON’T KNOW EACH OTHER AND CAN’T COMMUNICATE; THE ONLY THING THEY HAVE IN COMMON IS AN IMAGE. OBJECT OF THE GAME: TYPE THE SAME WORD. Luis von Ahn and Laura Dabbish. “Labeling Images with a Computer Game.” CHI 2004.
THE ESP GAME [Gameplay screenshot: PLAYER 1 and PLAYER 2 each type guesses (CAR, BOY, KID, HAT, CAR) until they match. SUCCESS! YOU AGREE ON CAR.] Luis von Ahn
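A minimal sketch, as an assumption rather than von Ahn's actual implementation, of the agreement rule that turns two players' guesses for an image into a label:

```python
def esp_match(guesses_p1, guesses_p2):
    """Return a word both players typed for the same image (a new label), or None."""
    common = {g.lower() for g in guesses_p1} & {g.lower() for g in guesses_p2}
    return next(iter(common), None)

print(esp_match(["car", "boy", "kid"], ["hat", "car"]))  # -> car
```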
THE ESP GAME IS FUN: 4.1 MILLION LABELS FROM 23,000 PLAYERS. MANY PEOPLE PLAY OVER 20 HOURS A WEEK. Luis von Ahn
WHY DO PEOPLE LIKE THE ESP GAME? Luis von Ahn
“THE ESP GAME GIVES ITS PLAYERS A WEIRD AND BEAUTIFUL SENSE OF ANONYMOUS INTIMACY. ON THE ONE HAND, YOU HAVE NO IDEA WHO YOUR PARTNER IS. ON THE OTHER HAND, THE TWO OF YOU ARE BRINGING YOUR MINDS TOGETHER IN A WAY THAT LOVERS WOULD ENVY.” Luis von Ahn
“STRANGELY ADDICTIVE.” “IT’S SO MUCH FUN TRYING TO GUESS WHAT OTHERS THINK. YOU HAVE TO STEP OUTSIDE OF YOURSELF TO MATCH.” “IT’S FAST-PACED.” “HELPS ME LEARN ENGLISH.” Luis von Ahn
LOCATING OBJECTS IN IMAGES: THE ESP GAME TELLS US IF AN IMAGE CONTAINS A SPECIFIC OBJECT, BUT DOESN’T SAY WHERE IN THE IMAGE THE OBJECT IS. SUCH INFORMATION WOULD BE EXTREMELY USEFUL FOR COMPUTER VISION RESEARCH. Luis von Ahn
PAINTBALL GAME: PLAYERS SHOOT AT OBJECTS ON THE IMAGE (“SHOOT THE: CAR”). WE GIVE POINTS AND CHECK ACCURACY BY GIVING PLAYERS IMAGES FOR WHICH WE ALREADY KNOW WHERE THE OBJECT IS. Luis von Ahn
REVEALING IMAGES [Gameplay screenshot: GUESSER and REVEALER views; the revealer uses a brush to uncover parts of the image while the guesser types guesses (CAR) and sees the partner’s guess.] Luis von Ahn
Summary: Collecting annotations from humans. Crowdsourcing allows very cheap data collection. Getting high-quality annotations can be tricky, but there are many ways to ensure quality. One way to obtain high-quality data fast is to phrase your data collection as a game. What to do when data is expensive to obtain?
Crowdsourcing + Active Learning [Pipeline diagram: unlabeled data → find data near the decision boundary → show data, collect and filter labels → training labels + training data → features → classifier training → trained classifier] James Hays
Active Learning Traditional active learning reduces supervision by obtaining labels for the most informative or uncertain examples first. [Mackay 1992, Freund et al. 1997, Tong & Koller 2001, Lindenbaum et al. 2004, Kapoor et al. 2007 ...] Sudheendra Vijayanarasimhan
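A minimal sketch of one common instantiation, uncertainty sampling, under the assumption of a probabilistic binary classifier; scikit-learn and the toy data are illustrative choices, not prescribed by the slides:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def most_uncertain(clf, X_unlabeled, k=5):
    """Indices of the k unlabeled examples the classifier is least sure about
    (probability closest to 0.5 for a binary problem, i.e., near the decision boundary)."""
    proba = clf.predict_proba(X_unlabeled)
    margin = np.abs(proba[:, 1] - 0.5)
    return np.argsort(margin)[:k]

# One round of the active learning loop; labels for the chosen items would
# come from annotators (e.g., Mechanical Turk workers).
rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(20, 2)); y_labeled = (X_labeled[:, 0] > 0).astype(int)
X_pool = rng.normal(size=(200, 2))

clf = LogisticRegression().fit(X_labeled, y_labeled)
query_idx = most_uncertain(clf, X_pool, k=5)
print("Ask annotators to label pool items:", query_idx)
```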
Visual Recognition With Humans in the Loop ECCV 2010, Crete, Greece Steve Branson Catherine Wah Florian Schroff Boris Babenko Serge Belongie Peter Welinder Pietro Perona
What type of bird is this? Field guides are difficult for average users; computer vision doesn't work perfectly (yet); research is mostly on basic-level categories. [Speaker notes: Suppose you are taking a hike and come across this bird, and you would like to know what kind it is. You pull out your birding field guide, fumble through it forever, and never figure it out. You plug the photo into a leading computer vision algorithm; if you're lucky, it might say it's a chair. You learn two things: 1) field guides don't work well for non-experts, and 2) computer vision only handles basic categories and doesn't perform that well.] Steve Branson
Visual Recognition With Humans in the Loop What kind of bird is this? Parakeet Auklet Steve Branson
Motivation: supplement visual recognition with the human capacity for visual feature extraction to tackle difficult (fine-grained) recognition problems. Progress is typically viewed as handling increasingly difficult data while maintaining full autonomy; here, the authors instead view progress as reducing the human effort needed on difficult data. Brian O'Neill
Categories of Recognition
Basic-Level (Airplane? Chair? Bottle? ...): easy for humans, hard for computers
Subordinate (American Goldfinch? Indigo Bunting? ...): hard for humans, hard for computers
Parts & Attributes (Yellow Belly? Blue Belly? ...): easy for humans, hard for computers
Steve Branson
Visual 20 Questions Game: Blue belly? no. Cone-shaped beak? yes. Striped wing? yes. ... American Goldfinch? yes. Hard classification problems can be turned into a sequence of easy ones; users can answer these questions, although it doesn't mean they necessarily answer them correctly. Steve Branson
Recognition With Humans in the Loop [Diagram: computer vision → question (Cone-shaped beak? yes) → computer vision → ... → American Goldfinch? yes] Computers: reduce the number of required questions. Humans: drive up the accuracy of vision algorithms. Steve Branson
Example Questions Steve Branson
Basic Algorithm [Flow diagram: input image → computer vision → pick the question with maximum expected information gain → Question 1: Is the belly black? A: NO → update → Question 2: Is the bill hooked? A: YES → ...] Steve Branson
Some definitions: set of possible questions $Q = \{q_1, \dots, q_n\}$; possible answers to question $i$: $\mathcal{A}_i$; possible confidences in answer $i$: $\mathcal{V} = \{$Guessing, Probably, Definitely$\}$; user response $u_i = (a_i, r_i) \in \mathcal{A}_i \times \mathcal{V}$; history of user responses at time $t$: $U^{t-1} = \{u_1, \dots, u_{t-1}\}$. Brian O'Neill
Question selection: seek the question $j$ (e.g., "What color is the belly of the bird?") that gives the maximum expected information gain (entropy reduction) given the image $x$ and the set of previous user responses $U^{t-1}$:

$I(c; u_j \mid x, U^{t-1}) = H(c \mid x, U^{t-1}) - \sum_{u_j \in \mathcal{A}_j \times \mathcal{V}} p(u_j \mid x, U^{t-1}) \, H(c \mid x, u_j \cup U^{t-1})$

where $p(u_j \mid x, U^{t-1})$ is the probability of obtaining response $u_j$ to the evaluated question given the image and response history, $H(c \mid x, u_j \cup U^{t-1})$ is the entropy when that response is added to the history, $H(c \mid x, U^{t-1})$ is the entropy at this iteration (before the response to the evaluated question is added to the history), and $H(c \mid x, U) = -\sum_c p(c \mid x, U) \log p(c \mid x, U)$. Brian O'Neill
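A minimal numerical sketch of this criterion, as an illustration rather than the authors' implementation: class posteriors are plain dictionaries, and `class_posterior`, `response_model`, and the toy question below are assumptions.

```python
import math

def entropy(posterior):
    """H(c | x, U) for a {class: probability} posterior."""
    return -sum(p * math.log(p) for p in posterior.values() if p > 0)

def expected_information_gain(question, image, history, class_posterior, response_model, responses):
    """I(c; u_j | x, U^{t-1}): entropy now minus expected entropy after one more answer."""
    h_now = entropy(class_posterior(image, history))
    h_after = 0.0
    for u in responses:  # each u is a possible (answer, confidence) response
        p_u = response_model(question, u, image, history)
        h_after += p_u * entropy(class_posterior(image, history + [(question, u)]))
    return h_now - h_after

def select_question(questions, image, history, class_posterior, response_model, responses):
    """Pick the not-yet-asked question with maximum expected information gain."""
    answered = {q for q, _ in history}
    return max((q for q in questions if q not in answered),
               key=lambda q: expected_information_gain(q, image, history,
                                                       class_posterior, response_model, responses))

# Toy usage: two classes, one yes/no question (confidences omitted for brevity).
posterior = lambda img, hist: ({"goldfinch": 0.7, "bunting": 0.3} if not hist else
                               ({"goldfinch": 0.95, "bunting": 0.05} if hist[-1][1] == "yes"
                                else {"goldfinch": 0.2, "bunting": 0.8}))
resp_model = lambda q, u, img, hist: 0.7 if u == "yes" else 0.3
print(expected_information_gain("Yellow belly?", None, [], posterior, resp_model, ["yes", "no"]))
```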
Results: with just computer vision, accuracy is 19%; user responses drive performance from 19% up to 68%. Fewer questions need to be asked when computer vision is used. Adapted from Steve Branson
Summary: Human-in-the-loop learning. To make intelligent use of human labeling effort during training, have the computer vision algorithm learn actively by selecting the questions that are most informative. To combine the strengths of humans and imperfect vision algorithms, use a human in the loop at recognition time.