Crowdsourcing 04/11/2013 Neelima Chavali ECE 6504.

Roadmap: Introduction; Adaptively Learning the Crowd Kernel; The ESP Game; CrowdClustering; Experiment

Introduction “The practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people, especially an online community” (Wikipedia). Crowdsourcing combines the efforts of crowds of volunteers or part-time workers to achieve a significant result.

Applications: Testing & Refining a Product (Netflix); Market Research (Threadless); Knowledge Management (Wikipedia); Customer Service (My Starbucks Idea); R&D; Computer Vision/Machine Learning; and many more fields

ADAPTIVELY LEARNING THE CROWD KERNEL Paper 1

ML on a New Domain: Describe the dataset as a d-dimensional representation of every object in the domain. This requires expertise. Two representations: – Feature vector representation – Kernel representation Slide credit: O. Tamuz

1. INPUT Slide credit: O. Tamuz

2. CROWD QUERIES Slide credit: O. Tamuz

3. OUTPUT Slide credit: O. Tamuz

ADAPTIVE ALGORITHM Turk random triples. Turk the “most informative” triples. Maximum-likelihood fit to a logistic or relative model using gradient descent. We use a probabilistic model plus information gain to decide how informative a triple is. Slide credit: O. Tamuz
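A minimal sketch of how the informativeness of a triple could be scored. The slide does not give the formula, so the form of the relative model below, the smoothing constant `mu`, and the idea of representing uncertainty about object `a` as a prior over a handful of candidate positions are all assumptions made for illustration; the function names are hypothetical.

```python
import numpy as np

def relative_prob(a, b, c, mu=0.05):
    """P(worker says 'a is more like b than c') under an assumed relative model."""
    d_ab = np.sum((a - b) ** 2)
    d_ac = np.sum((a - c) ** 2)
    return (mu + d_ac) / (2 * mu + d_ab + d_ac)

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def information_gain(prior, candidates, b, c, mu=0.05):
    """Expected entropy reduction about where object a lies,
    from asking 'is a more like b or more like c?'.

    prior      -- probabilities over candidate positions for a (sums to 1)
    candidates -- array of candidate positions, one row per candidate
    """
    like_b = np.array([relative_prob(x, b, c, mu) for x in candidates])
    p_b = np.sum(prior * like_b)            # marginal chance of answer 'b'
    post_b = prior * like_b / p_b           # posterior if the worker says 'b'
    post_c = prior * (1 - like_b) / (1 - p_b)
    expected = p_b * entropy(post_b) + (1 - p_b) * entropy(post_c)
    return entropy(prior) - expected

# Hypothetical usage: two candidate positions, one near b and one near c.
candidates = np.array([[0.0, 0.0], [1.0, 0.0]])
prior = np.array([0.5, 0.5])
b, c = np.array([0.0, 0.0]), np.array([1.0, 0.0])
gain = information_gain(prior, candidates, b, c)
```

An adaptive loop would evaluate this score for many candidate pairs (b, c) and send the highest-scoring triple to the crowd; a pair with b equal to c scores zero because the answer carries no information.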

LURE OF ADAPTIVITY Tie store: Bow ties, Neck ties, Tie clips, Scarves Slide credit: O. Tamuz

PERFORMANCE EVALUATION 20 Questions metric: A random object is chosen secretly. The system asks 20 questions and then ranks objects in terms of likelihood. Dataset: 75 ties + 75 tiles + 75 flags Slide credit: O. Tamuz
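The 20 Questions metric can be sketched as a small simulation. This is an illustration under stated assumptions, not the evaluation code from the paper: the answer model, the constant `mu`, the random (non-adaptive) choice of questions, and the names `answer_prob` and `twenty_questions_rank` are all hypothetical.

```python
import numpy as np

def answer_prob(X, b, c, mu=0.05):
    """For every object a (row of X), P('a is more like b than c'),
    using an assumed relative model on squared distances."""
    d_ab = np.sum((X - X[b]) ** 2, axis=1)
    d_ac = np.sum((X - X[c]) ** 2, axis=1)
    return (mu + d_ac) / (2 * mu + d_ab + d_ac)

def twenty_questions_rank(X, target, n_questions=20, seed=0):
    """Simulate the game: answers are sampled for the secret target,
    a posterior over all objects is updated, and the target's final rank
    (1 = most likely) is returned together with the posterior."""
    rng = np.random.default_rng(seed)
    n = len(X)
    posterior = np.full(n, 1.0 / n)
    for _ in range(n_questions):
        b, c = rng.choice(n, size=2, replace=False)   # random question
        p = answer_prob(X, b, c)
        says_b = rng.random() < p[target]             # simulated answer
        posterior *= p if says_b else (1 - p)
        posterior /= posterior.sum()
    order = np.argsort(-posterior)
    rank = int(np.where(order == target)[0][0]) + 1
    return rank, posterior

# Hypothetical usage with a random 2-D embedding of 30 objects.
X = np.random.default_rng(1).normal(size=(30, 2))
rank, posterior = twenty_questions_rank(X, target=7)
```

A lower average rank over many secret targets indicates an embedding that lets the system identify objects with fewer questions.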

LABELING IMAGES WITH A COMPUTER GAME Paper 2

IMAGE SEARCH ON THE WEB USES FILENAMES AND HTML TEXT Slide Credit: Luis von Ahn

THE ESP GAME: TWO-PLAYER ONLINE GAME. PARTNERS DON’T KNOW EACH OTHER AND CAN’T COMMUNICATE. THE ONLY THING IN COMMON IS AN IMAGE. OBJECT OF THE GAME: TYPE THE SAME WORD. Slide Credit: Luis von Ahn

THE ESP GAME: PLAYER 1 / PLAYER 2. GUESSING: CAR. GUESSING: BOY. GUESSING: CAR. SUCCESS! YOU AGREE ON CAR. SUCCESS! YOU AGREE ON CAR. GUESSING: KID. GUESSING: HAT. Slide Credit: Luis von Ahn
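The agreement rule can be sketched in a few lines. The event format, the timestamps, and the function name `first_agreement` are hypothetical; the guess words are illustrative and only loosely follow the slide.

```python
def first_agreement(events):
    """Return (word, seconds) for the first word typed by both players,
    or None if the players never agree.

    events -- iterable of (seconds, player, word) with player in {1, 2}
    """
    seen = {1: set(), 2: set()}
    for seconds, player, word in sorted(events):
        word = word.strip().lower()          # compare case-insensitively
        other = 2 if player == 1 else 1
        if word in seen[other]:              # partner already typed it
            return word, seconds
        seen[player].add(word)
    return None

# Hypothetical guess streams for one image.
events = [
    (1.0, 1, "car"),
    (2.0, 2, "boy"),
    (3.5, 1, "hat"),
    (4.0, 1, "kid"),
    (6.0, 2, "CAR"),
]
result = first_agreement(events)
```

Because neither player sees the other's guesses, the agreed word is likely to describe the image, which is what makes it usable as a label.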

© 2004 Carnegie Mellon University, all rights reserved. Patent Pending. Slide Credit: Luis von Ahn

WHAT ABOUT CHEATING? IF A PAIR PLAYS TOO FAST, WE DON’T RECORD THE WORDS THEY AGREE ON. Slide Credit: Luis von Ahn

WHAT ABOUT CHEATING? WE GIVE PLAYERS TEST IMAGES FOR WHICH WE KNOW ALL THE COMMON LABELS. WE ONLY STORE A PLAYER’S GUESSES IF THEY SUCCESSFULLY LABEL THE TEST IMAGES. Slide Credit: Luis von Ahn
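The two anti-cheating checks (discarding agreements that come too fast, and requiring success on test images) could be combined as in the sketch below. The 2-second threshold, the rule that one matching guess counts as success, and the function names are assumptions; the slides do not give the actual values or criteria.

```python
def passes_test_image(guesses, known_labels):
    """True if at least one guess matches a known common label
    (assumed success criterion; known_labels is a set of lowercase words)."""
    return any(g.strip().lower() in known_labels for g in guesses)

def should_store(agreement_seconds, test_rounds, min_seconds=2.0):
    """Decide whether a player's labels are recorded.

    agreement_seconds -- time the pair took to agree on a word
    test_rounds       -- list of (guesses, known_labels) for test images
    min_seconds       -- hypothetical 'too fast' threshold
    """
    if agreement_seconds < min_seconds:      # pair played suspiciously fast
        return False
    return all(passes_test_image(g, k) for g, k in test_rounds)

# Hypothetical usage.
tests = [(["Dog", "grass"], {"dog", "puppy"}), (["sky"], {"sky", "cloud"})]
honest = should_store(5.2, tests)
too_fast = should_store(0.4, tests)
failed_test = should_store(5.2, [(["aaa"], {"dog"})])
```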

THE ESP GAME IS FUN: 3.2 MILLION LABELS WITH 22,000 PLAYERS. MANY PEOPLE PLAY OVER 20 HOURS A WEEK. Slide Credit: Luis von Ahn

LABELING THE ENTIRE WEB: INDIVIDUAL GAMES IN YAHOO! AND MSN AVERAGE OVER 10,000 PLAYERS AT A TIME. 5000 PEOPLE PLAYING SIMULTANEOUSLY CAN LABEL ALL IMAGES ON GOOGLE IN 30 DAYS! Slide Credit: Luis von Ahn

A FEW MILLION LABELS CAN IMPROVE IMAGE SEARCH, CAN BE USED TO IMPROVE COMPUTER VISION, AND CAN BE USED TO IMPROVE ACCESSIBILITY FOR THE VISUALLY IMPAIRED. Slide Credit: Luis von Ahn

CROWDCLUSTERING Paper 3

What did they do? Use crowdsourcing to discover categories

How? Approach: Each worker is given M images to cluster. Images are represented as points in a d-dimensional Euclidean space (hidden variables). Atomic clusters: Dirichlet process mixture model. Worker: a pairwise binary classifier with a bias (hidden variables). A worker’s tendency to label a pair of images as belonging to the same cluster is modelled as a pairwise logistic regression.
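A minimal sketch of the worker model described above. It assumes the pairwise activation is the bilinear form x_a^T W x_b plus a bias tau, passed through a logistic function; the slide only says "pairwise logistic regression", so this parameterization and the names `W`, `tau`, and `same_cluster_prob` are assumptions.

```python
import numpy as np

def same_cluster_prob(x_a, x_b, W, tau):
    """P(worker puts images a and b in the same cluster).

    x_a, x_b -- hidden d-dimensional image embeddings
    W        -- worker-specific symmetric d x d matrix (assumed form)
    tau      -- worker-specific bias
    """
    activation = x_a @ W @ x_b + tau
    return 1.0 / (1.0 + np.exp(-activation))

# Hypothetical usage in two dimensions with an identity worker matrix.
W = np.eye(2)
x = np.array([1.0, 0.0])
p_same = same_cluster_prob(x, x, W, tau=0.0)        # aligned embeddings
p_diff = same_cluster_prob(x, -x, W, tau=0.0)       # opposed embeddings
p_biased = same_cluster_prob(x, -x, W, tau=2.0)     # worker who merges a lot
```

The bias captures how readily a given worker groups images together, while the matrix captures which directions of the hidden space that worker pays attention to.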

How? Approach: The number of atomic cluster centres, together with their means and covariances, needs to be estimated.

EXPERIMENTS

Color?

Crowdsourcing on Mechanical Turk

Crowdsourcing on Mechanical Turk

Results: Black, Red

Results: Lavender (male)

Results: Purple (female)

Results: Pink (female)

Results: Violet (female)

Acknowledgements: Dr. Parikh, Pavan Ghatty