1
CS 2750: Machine Learning
Active Learning and Crowdsourcing
Prof. Adriana Kovashka University of Pittsburgh April 18, 2016
2
Collecting data on Amazon Mechanical Turk
Figure (Alex Sorokin): the Mechanical Turk workflow. A task ("Is this a dog? o Yes o No", pay $0.01) is posted through the broker to workers, who return the answer ("Yes") for the listed payment.
3
Annotation protocols (Alex Sorokin):
- Type keywords
- Select relevant images
- Click on landmarks
- Outline something
- ... anything else ...
4
Type keywords: $0.01 (Alex Sorokin)
5
Select examples: $0.02 (Alex Sorokin)
6
Outline something: $0.01 (Alex Sorokin)
7
Motivation: (price per task) × 100,000 tasks = $5,000.
Custom annotations. Large scale. Low price.
Alex Sorokin
9
Issues (Alex Sorokin):
- Quality: how good is it? How can we be sure?
- Price: how should we price it?
10
Ensuring Annotation Quality
- Consensus / multiple annotation / "wisdom of the crowd"
- Qualification exam
- Gold standard questions
- Grading tasks: a second tier of workers who grade others
Adapted from Alex Sorokin
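To make the first two ideas concrete, here is a minimal sketch (not from the slides) of gold-standard filtering followed by majority-vote consensus; the (worker, item, label) data layout and the min_accuracy threshold are assumptions made for the example.

```python
from collections import Counter, defaultdict

def filter_and_aggregate(labels, gold, min_accuracy=0.8):
    """Aggregate crowd labels by majority vote, keeping only workers who
    score well on gold-standard (known-answer) items.

    labels: list of (worker_id, item_id, label) triples
    gold:   dict mapping item_id -> correct label for gold-standard items
    """
    # Score each worker on the gold-standard items they answered.
    correct, total = defaultdict(int), defaultdict(int)
    for worker, item, label in labels:
        if item in gold:
            total[worker] += 1
            correct[worker] += int(label == gold[item])
    trusted = {w for w in total if correct[w] / total[w] >= min_accuracy}

    # Majority vote ("wisdom of the crowd") over the trusted workers' labels.
    votes = defaultdict(Counter)
    for worker, item, label in labels:
        if worker in trusted and item not in gold:
            votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}
```

For example, filter_and_aggregate(raw_labels, gold_answers) would return one consensus label per non-gold item, ignoring workers whose accuracy on the gold questions falls below the chosen threshold.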
11
Pricing: a trade-off between throughput and cost.
- Higher pay can actually attract scammers.
- Some studies find that the most accurate results are achieved when Turkers do tasks for free.
Adapted from Alex Sorokin
12
Games with a purpose: Luis von Ahn
- Associate professor at CMU
- One of the "fathers" of crowdsourcing
- Created the ESP Game, Peekaboom, and several other "games with a purpose"
13
The ESP Game: a two-player online game.
- Partners don't know each other and can't communicate.
- Object of the game: type the same word.
- The only thing they have in common is an image.
Luis von Ahn and Laura Dabbish. "Labeling Images with a Computer Game." CHI 2004.
14
The ESP Game (example round, Luis von Ahn): the two players type guesses ("car", "boy", "kid", "hat") until both have typed "car". Success! You agree on "car".
15
The ESP Game is fun: 4.1 million labels with 23,000 players.
Many people play over 20 hours a week.
Luis von Ahn
16
Why do people like the ESP Game?
Luis von Ahn
17
"The ESP Game gives its players a weird and beautiful sense of anonymous intimacy. On the one hand, you have no idea who your partner is. On the other hand, the two of you are bringing your minds together in a way that lovers would envy."
Luis von Ahn
18
Player quotes (Luis von Ahn):
- "Strangely addictive"
- "It's so much fun trying to guess what others think. You have to step outside of yourself to match."
- "It's fast-paced"
- "Helps me learn English"
19
Locating objects in images
The ESP Game tells us if an image contains a specific object, but it doesn't say where in the image the object is. Such information would be extremely useful for computer vision research.
Luis von Ahn
20
Paintball game: players shoot at objects in the image ("Shoot the: car"). We give points and check accuracy by giving players images for which we already know where the object is.
Luis von Ahn
21
Revealing images (figure, Luis von Ahn): the revealer sees the word ("car") and brushes away parts of the image to reveal it; the guesser types guesses, which appear to the revealer as the partner's guess.
22
Summary: Collecting annotations from humans
- Crowdsourcing allows very cheap data collection.
- Getting high-quality annotations can be tricky, but there are many ways to ensure quality.
- One way to obtain high-quality data fast is to phrase your data collection as a game.
- What to do when data is expensive to obtain?
23
Crowdsourcing Active Learning
Pipeline (James Hays): unlabeled data → find data near the decision boundary → show data to annotators, collect and filter labels → training labels + training data → features → classifier training → trained classifier.
24
Active Learning: traditional active learning reduces supervision by obtaining labels for the most informative or uncertain examples first [MacKay 1992, Freund et al., Tong & Koller 2001, Lindenbaum et al., Kapoor et al.].
Sudheendra Vijayanarasimhan
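As a rough illustration of "find data near the decision boundary", here is a minimal uncertainty-sampling sketch; the linear SVM, the array names, and the batch size are assumptions for the example, not anything specified in the slides.

```python
import numpy as np

def select_most_uncertain(classifier, unlabeled_pool, batch_size=10):
    """Return indices of the unlabeled examples closest to the decision
    boundary of a binary classifier (smallest |decision value|), i.e. the
    examples the classifier is most uncertain about."""
    margins = np.abs(classifier.decision_function(unlabeled_pool))
    return np.argsort(margins)[:batch_size]

# One round of the loop (hypothetical arrays X_labeled, y_labeled, X_unlabeled):
#   from sklearn.svm import SVC
#   clf = SVC(kernel="linear").fit(X_labeled, y_labeled)
#   query_idx = select_most_uncertain(clf, X_unlabeled)
#   ...send X_unlabeled[query_idx] to annotators, add the returned labels,
#   ...then retrain and repeat.
```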
25
Visual Recognition With Humans in the Loop
ECCV 2010, Crete, Greece. Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Serge Belongie, Peter Welinder, Pietro Perona
26
What type of bird is this?
Field guides are difficult for average users. Computer vision doesn't work perfectly (yet). Research is mostly on basic-level categories.
Speaker notes: Suppose you are taking a hike and come across this bird. You would like to know what kind of bird it is. What would you do? You pull out your birding field guide, you fumble through it forever, and never figure it out (show Sibley). You plug it into your leading computer vision algorithm (show bird; if you're lucky, it might also say it's a chair). You learn two things: 1) field guides don't work, and 2) computer vision only handles basic categories and doesn't perform that well.
Steve Branson
27
Visual Recognition With Humans in the Loop
What kind of bird is this? Parakeet Auklet Steve Branson
28
Motivation:
- Supplement visual recognition with the human capacity for visual feature extraction to tackle difficult (fine-grained) recognition problems.
- Typical progress is viewed as increasing data difficulty while maintaining full autonomy.
- Here, the authors view progress as a reduction in human effort on difficult data.
Brian O'Neill
29
Categories of Recognition
- Basic-level (Airplane? Chair? Bottle? ...): easy for humans, hard for computers.
- Subordinate (American Goldfinch? Indigo Bunting? ...): hard for humans, hard for computers.
- Parts & attributes (Yellow belly? Blue belly? ...): easy for humans, hard for computers.
Steve Branson
30
Visual 20 Questions Game (Steve Branson):
Blue belly? no. Cone-shaped beak? yes. Striped wing? yes. ... American Goldfinch? yes.
(It doesn't mean they necessarily answer them correctly.) Hard classification problems can be turned into a sequence of easy ones.
31
Recognition With Humans in the Loop
Figure (Steve Branson): computer vision alternates with user questions ("Cone-shaped beak? yes") until a label is reached ("American Goldfinch? yes").
Computers: reduce the number of required questions. Humans: drive up the accuracy of vision algorithms.
32
Example Questions Steve Branson
33
Example Questions Steve Branson
34
Example Questions Steve Branson
35
Basic Algorithm (Steve Branson): the input image is processed by computer vision, and the system repeatedly asks the question with the maximum expected information gain (Question 1: "Is the belly black?" A: no; Question 2: "Is the bill hooked?" A: yes; ...), updating after each answer.
36
Some definitions (Brian O'Neill):
- q_1, ..., q_n: the set of possible questions
- A_i: the possible answers to question i
- C_i: the possible confidences in answer i (Guessing, Probably, Definitely)
- u_i = (a_i, c_i): a user response
- U^t = {u_1, ..., u_t}: the history of user responses at time t
37
Question selection: seek the question (e.g., "What color is the belly of the bird?") that gives the maximum expected information gain (entropy reduction) given the image x and the set of previous user responses U^{t-1}. With z denoting the class label, pick

  argmax_j I(z; u_j | x, U^{t-1}),
  where I(z; u_j | x, U^{t-1}) = H(z | x, U^{t-1}) − Σ_{u_j} p(u_j | x, U^{t-1}) H(z | x, u_j ∪ U^{t-1})

- p(u_j | x, U^{t-1}): probability of obtaining response u_j to the evaluated question, given the image and the response history
- H(z | x, u_j ∪ U^{t-1}): entropy when the response is added to the history
- H(z | x, U^{t-1}): entropy at this iteration (before the response to the evaluated question is added to the history)
Brian O'Neill
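A minimal sketch of this selection rule under simplifying assumptions: the arrays p_class (standing in for p(z | x, U^{t-1})) and p_answer_given_class (standing in for p(u | z)) are hypothetical placeholders, not the paper's actual probabilistic models.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_information_gain(p_class, p_answer_given_class):
    """Expected entropy reduction for one candidate question.

    p_class:              (C,) class posterior p(z | x, U^{t-1})
    p_answer_given_class: (A, C) response model p(u | z) for this question
    """
    h_before = entropy(p_class)                      # H(z | x, U^{t-1})
    p_answer = p_answer_given_class @ p_class        # p(u | x, U^{t-1}), shape (A,)
    h_after = 0.0
    for a in range(p_answer.shape[0]):
        if p_answer[a] == 0:
            continue
        # Class posterior after observing response a (Bayes' rule).
        posterior = p_answer_given_class[a] * p_class / p_answer[a]
        h_after += p_answer[a] * entropy(posterior)  # E_u[H(z | x, u ∪ U^{t-1})]
    return h_before - h_after

# Ask the question whose answer is expected to reduce entropy the most:
#   best_q = max(questions, key=lambda q: expected_information_gain(p_class, answer_model[q]))
```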
38
Results: users drive performance from 19% (just computer vision) to 68%. Fewer questions are asked when computer vision is used.
Adapted from Steve Branson
39
Summary: Human-in-the-loop learning
- To make intelligent use of the human labeling effort during training, have the computer vision algorithm learn actively by selecting the most informative queries.
- To combine the strengths of humans and imperfect vision algorithms, use a human in the loop at recognition time.