CS 2750: Machine Learning
Active Learning and Crowdsourcing
Prof. Adriana Kovashka, University of Pittsburgh
April 18, 2016

Collecting data on Amazon Mechanical Turk
[Diagram] A broker (www.mturk.com) distributes tasks to workers, e.g. "Is this a dog? o Yes o No"; a worker returns an answer ("Yes") and is paid, e.g. $0.01 per task.
(Alex Sorokin)
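
As a rough illustration of this workflow, here is a minimal sketch of posting such a yes/no task with the boto3 MTurk client. This is not from the slides: the title, reward, image URL, and HTML form below are made-up examples, and the HTML omits the submission plumbing a real HIT needs.

```python
import boto3

# Hypothetical example: post an "Is this a dog?" HIT paying $0.01 per answer.
# MTurk's API lives in us-east-1; credentials are assumed to be configured.
mturk = boto3.client("mturk", region_name="us-east-1")

question_xml = """
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <form>
      <img src="IMAGE_URL"/>
      <p>Is this a dog?</p>
      <input type="radio" name="answer" value="yes"/> Yes
      <input type="radio" name="answer" value="no"/> No
    </form>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>
"""

hit = mturk.create_hit(
    Title="Is this a dog?",            # what workers see in the task list
    Description="Answer a yes/no question about one image.",
    Reward="0.01",                     # pay per assignment, in USD
    MaxAssignments=3,                  # collect several labels per image
    LifetimeInSeconds=24 * 3600,       # how long the HIT stays available
    AssignmentDurationInSeconds=300,   # time a worker has to answer
    Question=question_xml,
)
print(hit["HIT"]["HITId"])
```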

Annotation protocols
- Type keywords
- Select relevant images
- Click on landmarks
- Outline something
- ... anything else ...
(Alex Sorokin)

Type keywords: example task, $0.01. (Alex Sorokin)

Select examples: example task, $0.02. (Alex Sorokin)

Outline something: example task, $0.01. http://visionpc.cs.uiuc.edu/~largescale/results/production-3-2/results_page_013.html (Alex Sorokin)

Motivation
- (cost per annotation) × 100,000 = $5000
- Custom annotations
- Large scale
- Low price
(Alex Sorokin)

Issues
- Quality: how good is it? How to be sure?
- Price: how to price it?
(Alex Sorokin)

Ensuring annotation quality
- Consensus / multiple annotations / "wisdom of the crowd"
- Qualification exam
- Gold standard questions
- Grading tasks: a second tier of workers who grade others
(Adapted from Alex Sorokin)
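
To make the first two ideas above concrete, here is a small sketch (not from the slides) that combines gold-standard filtering with majority-vote consensus over redundant labels; the data layout and accuracy threshold are assumptions for illustration.

```python
from collections import Counter

def filter_and_aggregate(labels, gold, min_gold_accuracy=0.75):
    """labels: {worker_id: {item_id: label}}; gold: {item_id: true_label}
    for a small set of items whose answers are already known."""
    # 1) Keep only workers who do well enough on the gold-standard items.
    trusted = {}
    for worker, answers in labels.items():
        gold_items = [i for i in answers if i in gold]
        if not gold_items:
            continue
        acc = sum(answers[i] == gold[i] for i in gold_items) / len(gold_items)
        if acc >= min_gold_accuracy:
            trusted[worker] = answers

    # 2) Majority vote ("wisdom of the crowd") over the trusted workers.
    votes = {}
    for answers in trusted.values():
        for item, label in answers.items():
            votes.setdefault(item, []).append(label)
    return {item: Counter(v).most_common(1)[0][0] for item, v in votes.items()}

# Toy usage: three workers label two images; "img1" is a gold question.
labels = {"w1": {"img1": "dog", "img2": "cat"},
          "w2": {"img1": "dog", "img2": "cat"},
          "w3": {"img1": "cat", "img2": "dog"}}   # w3 fails the gold check
print(filter_and_aggregate(labels, gold={"img1": "dog"}))  # {'img1': 'dog', 'img2': 'cat'}
```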

Pricing
- Trade-off between throughput and cost
- Higher pay can actually attract scammers
- Some studies find that the most accurate results are achieved if Turkers do tasks for free
(Adapted from Alex Sorokin)

Games with a purpose: Luis von Ahn
- Associate professor at CMU
- One of the "fathers" of crowdsourcing
- Created the ESP Game, Peekaboom, and several other "games with a purpose"

The ESP Game
- Two-player online game
- Partners don't know each other and can't communicate
- Object of the game: type the same word
- The only thing the partners have in common is an image
Luis von Ahn and Laura Dabbish. "Labeling Images with a Computer Game." CHI 2004.

The ESP Game (example round)
[Animation] The two players type guesses for the same image (car, boy, kid, hat, ...); as soon as both have typed "car", the round succeeds ("Success! You agree on car") and the agreed word becomes a label for the image. (Luis von Ahn)
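
A minimal sketch of the agreement rule at the heart of the game (my own illustration, not code from the slides): a word is accepted as a label only once both players have typed it.

```python
def esp_round(guesses_p1, guesses_p2):
    """Simulate one ESP round from two interleaved guess streams.
    Returns the first word both players have typed, or None if they never agree."""
    seen_p1, seen_p2 = set(), set()
    for g1, g2 in zip(guesses_p1, guesses_p2):
        seen_p1.add(g1)
        seen_p2.add(g2)
        agreed = seen_p1 & seen_p2
        if agreed:
            return agreed.pop()   # this word becomes a label for the image
    return None

# Toy usage: both players eventually type "car".
print(esp_round(["car", "boy", "kid"], ["hat", "dog", "car"]))  # -> "car"
```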

The ESP Game is fun
- 4.1 million labels collected with 23,000 players
- There are many people who play over 20 hours a week
(Luis von Ahn)

Why do people like the ESP Game? (Luis von Ahn)

"The ESP Game gives its players a weird and beautiful sense of anonymous intimacy. On the one hand, you have no idea who your partner is. On the other hand, the two of you are bringing your minds together in a way that lovers would envy." (Luis von Ahn)

Player quotes:
- "Strangely addictive"
- "It's so much fun trying to guess what others think. You have to step outside of yourself to match."
- "It's fast-paced"
- "Helps me learn English"
(Luis von Ahn)

Locating objects in images
- The ESP Game tells us if an image contains a specific object, but doesn't say where in the image the object is
- Such information would be extremely useful for computer vision research
(Luis von Ahn)

Paintball game
- Players shoot at objects in the image ("Shoot the: car")
- We give points and check accuracy by giving players images for which we already know where the object is
(Luis von Ahn)

Revealing images
[Diagram] One player (the revealer) sees the image and a word (e.g. "car") and uses a brush to reveal parts of the image; the other player (the guesser) sees only the revealed regions and tries to guess the word from the partner's reveals. (Luis von Ahn)

Summary: Collecting annotations from humans
- Crowdsourcing allows very cheap data collection
- Getting high-quality annotations can be tricky, but there are many ways to ensure quality
- One way to obtain high-quality data fast is by phrasing your data collection as a game
- What to do when data is expensive to obtain?

Crowdsourcing → Active Learning
[Pipeline diagram] Unlabeled data → "show data, collect and filter labels" (crowdsourcing) or "find data near the decision boundary" (active learning) → training labels + training data → features → classifier training → trained classifier. (James Hays)

Active Learning
Traditional active learning reduces supervision by obtaining labels for the most informative or uncertain examples first.
[MacKay 1992, Freund et al. 1997, Tong & Koller 2001, Lindenbaum et al. 2004, Kapoor et al. 2007, ...]
(Sudheendra Vijayanarasimhan)
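
As an illustration of "most uncertain first", here is a small sketch of margin-based uncertainty sampling with scikit-learn. It is my own example rather than anything prescribed by the slides: the pool item whose top two class probabilities are closest is the one queried next.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def next_query(clf, pool):
    """Return the index of the pool example the classifier is least sure about
    (smallest margin between its two most probable classes)."""
    probs = clf.predict_proba(pool)
    sorted_probs = np.sort(probs, axis=1)
    margins = sorted_probs[:, -1] - sorted_probs[:, -2]
    return int(np.argmin(margins))

# Toy usage with made-up 2-D data: fit on a few labels, then pick a query.
rng = np.random.default_rng(0)
X_labeled = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.2], [0.9, 0.8]])
y_labeled = np.array([0, 1, 0, 1])
X_pool = rng.uniform(0, 1, size=(100, 2))

clf = LogisticRegression().fit(X_labeled, y_labeled)
i = next_query(clf, X_pool)
print("ask a human to label pool item", i, X_pool[i])
```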

Visual Recognition With Humans in the Loop
Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Serge Belongie, Peter Welinder, Pietro Perona
ECCV 2010, Crete, Greece

What type of bird is this?
- Field guides are difficult for average users
- Computer vision doesn't work perfectly (yet)
- Research has focused mostly on basic-level categories
Speaker notes: Suppose you are taking a hike and come across this bird, and you would like to know what kind of bird it is. What would you do? You pull out your birding field guide, fumble through it forever, and never figure it out. You plug the photo into your leading computer vision algorithm, and if you're lucky it might say it's a chair. You learn two things: 1) field guides don't work, and 2) computer vision handles only basic categories and doesn't perform that well.
(Steve Branson)

Visual Recognition With Humans in the Loop
What kind of bird is this? Parakeet Auklet.
(Steve Branson)

Motivation
- Supplement visual recognition with the human capacity for visual feature extraction to tackle difficult (fine-grained) recognition problems
- Typical progress is viewed as increasing data difficulty while maintaining full autonomy
- Here, the authors view progress as a reduction in human effort on difficult data
(Brian O'Neill)

Categories of Recognition
- Basic-level (Airplane? Chair? Bottle? ...): easy for humans, hard for computers
- Subordinate (American Goldfinch? Indigo Bunting? ...): hard for humans, hard for computers
- Parts & attributes (Yellow belly? Blue belly? ...): easy for humans, hard for computers
(Steve Branson)

Visual 20 Questions Game
[Example] Blue belly? no. Cone-shaped beak? yes. Striped wing? yes. American Goldfinch? yes.
- Users can answer such questions, though it doesn't mean they necessarily answer them correctly
- Hard classification problems can be turned into a sequence of easy ones
(Steve Branson)

Recognition With Humans in the Loop
[Diagram] Computer vision proposes a question ("Cone-shaped beak?"), the user answers ("yes"), computer vision updates, and so on until a prediction ("American Goldfinch? yes").
- Computers: reduce the number of required questions
- Humans: drive up the accuracy of vision algorithms
(Steve Branson)

Example Questions (three slides of example question screenshots) (Steve Branson)

Basic Algorithm
[Diagram] Input image → computer vision → pick the question with maximum expected information gain → user answers (e.g. Question 1: "Is the belly black?" A: no; Question 2: "Is the bill hooked?" A: yes) → repeat.
(Steve Branson)
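
The loop in this figure can be sketched in a few lines of Python (my own paraphrase, with made-up helper names such as `expected_information_gain`, `posterior`, and `ask_user`); the class posterior is re-estimated after every answer.

```python
def humans_in_the_loop(image, questions, posterior, expected_information_gain,
                       ask_user, max_questions=20, confident=0.95):
    """Iteratively ask the most informative question until the class posterior
    is confident enough or the question budget is exhausted.
    All helper callables here are illustrative placeholders."""
    history = []                                  # user responses so far
    p = posterior(image, history)                 # dict: class -> p(class | image, history)
    for _ in range(max_questions):
        if max(p.values()) >= confident:
            break
        asked = {q for q, _ in history}
        remaining = [q for q in questions if q not in asked]
        if not remaining:
            break
        # Pick the unasked question with maximum expected information gain.
        q = max(remaining, key=lambda q: expected_information_gain(q, image, history))
        answer = ask_user(q)                      # e.g. ("no", "Definitely")
        history.append((q, answer))
        p = posterior(image, history)             # update the class posterior
    return max(p, key=p.get), history
```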

Some definitions:
- Q = {q_1, ..., q_n}: the set of possible questions
- A_i: the possible answers to question q_i
- C_i: the possible confidences in an answer to q_i (Guessing, Probably, Definitely)
- u_i = (a_i, c_i): a user response to question q_i
- U^{t-1}: the history of user responses collected before time t
(Brian O'Neill)

Question selection
Seek the question (e.g. "What color is the belly of the bird?") that gives the maximum expected information gain (entropy reduction) given the image x and the set of previous user responses:

$IG(q_i; x, U^{t-1}) = \sum_{u_i} p(u_i \mid x, U^{t-1}) \left[ H(c \mid x, U^{t-1}) - H(c \mid x, u_i \cup U^{t-1}) \right]$

where
- $p(u_i \mid x, U^{t-1})$: probability of obtaining response $u_i$ to the evaluated question, given the image and the response history
- $H(c \mid x, u_i \cup U^{t-1})$: entropy when the response is added to the history
- $H(c \mid x, U^{t-1})$: entropy at this iteration (before the response to the evaluated question is added to the history)
(Brian O'Neill)
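
A small numerical sketch of this criterion (my own illustration with toy numbers, not the paper's code), which could stand in for the `expected_information_gain` helper assumed in the earlier loop sketch: for each candidate question, average the entropy drop over the possible answers, weighted by how likely each answer is under the current class posterior. For brevity it folds the image and response history into that posterior and ignores the confidence levels.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_information_gain(p_class, p_answer_given_class):
    """p_class: current posterior over classes, shape (C,).
    p_answer_given_class: p(answer | class) for one question, shape (C, A).
    Returns the expected entropy reduction from asking that question."""
    h_before = entropy(p_class)
    p_answer = p_class @ p_answer_given_class            # p(u_i), shape (A,)
    gain = 0.0
    for a in range(p_answer_given_class.shape[1]):
        if p_answer[a] == 0:
            continue
        # Posterior over classes if the user gave answer a (Bayes' rule).
        p_post = p_class * p_answer_given_class[:, a] / p_answer[a]
        gain += p_answer[a] * (h_before - entropy(p_post))
    return gain

# Toy usage: 3 bird classes, a yes/no question that separates class 0 from 1 and 2.
p_class = np.array([0.5, 0.3, 0.2])
p_yes_no = np.array([[0.9, 0.1],     # class 0 almost always answers "yes"
                     [0.1, 0.9],     # classes 1 and 2 almost always answer "no"
                     [0.1, 0.9]])
print(expected_information_gain(p_class, p_yes_no))
```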

Results
- Users drive performance: 19% → 68%
- Fewer questions are asked if computer vision is used
- Just computer vision: 19%
(Adapted from Steve Branson)

Summary: Human-in-the-loop learning
- To make intelligent use of the human labeling effort during training, have the computer vision algorithm learn actively by selecting those questions that are most informative
- To combine the strengths of humans and imperfect vision algorithms, use a human in the loop at recognition time