How Crowdsourcable is Your Task? Carsten Eickhoff Arjen P. de Vries WSDM 2011 Workshop on Crowdsourcing for Search and Data Mining (CSDM 2011), Hong Kong,

Presentation transcript:

How Crowdsourcable is Your Task?
Carsten Eickhoff, Arjen P. de Vries
WSDM 2011 Workshop on Crowdsourcing for Search and Data Mining (CSDM 2011), Hong Kong, China, February 9–12, 2011.

2 Outline
The Crowdsourcing Boom
Crowdsourcing, a Tale of Great Romance
A Journey to the Dark Side of Crowdsourcing
Is all Lost?
Conclusions

3 The Crowdsourcing Boom
Billions of judgements are being crowdsourced each year
CrowdFlower – Judgement volume doubled ( )
Significant numbers of research publications rely on crowdsourcing to create scientific resources
...but is it actually reliable?

5 Crowdsourcing – A Tale of Great Romance
Summer 2008
How do I quickly get a large number of judgements?
Task: Message grouping for discourse understanding
Crowdsourcing produced very reliable results

7 Crowdsourcing – A Tale of Great Romance
Fall 2008
Crowdsourcing has become a standard data source
The excitement wears off

10 Crowdsourcing – A Tale of Great Romance
A dark and cold day in late autumn 2009
You need judgements for yet another experiment
You get cheated!
Again and again...

12 A Journey to the Dark Side
Task-based overview
What is it that malicious workers do?
Do we have remedies?

13 A Journey to the Dark Side
Task: Closed class questions
Possible cheat: uniform answering (all yes/no)
Possible cheat: arbitrary answers
Remedy: Good gold standard data helps
Pitfall: Cheaters who think about the task at hand can cause a lot of trouble (e.g. relevance judgements)
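The gold-standard remedy for closed class questions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the data layout (`answers` as worker to item to label, `gold` as item to known label) and all thresholds are assumptions chosen for illustration. It flags workers who score poorly on gold items or who give nearly the same label everywhere.

```python
from collections import Counter


def flag_suspicious_workers(answers, gold, min_gold_accuracy=0.7,
                            max_uniform_share=0.95, min_items=5):
    """Return {worker_id: [reasons]} for workers who look unreliable.

    answers: {worker_id: {item_id: label}}  (hypothetical layout)
    gold:    {item_id: label} for items with a known correct answer
    All thresholds are illustrative assumptions, not values from the talk.
    """
    flagged = {}
    for worker, judged in answers.items():
        reasons = []

        # Check 1: accuracy on the gold standard items this worker judged.
        gold_items = [item for item in judged if item in gold]
        if gold_items:
            correct = sum(judged[item] == gold[item] for item in gold_items)
            if correct / len(gold_items) < min_gold_accuracy:
                reasons.append("low gold accuracy")

        # Check 2: uniform answering, i.e. one label dominates all answers.
        if len(judged) >= min_items:
            top_count = Counter(judged.values()).most_common(1)[0][1]
            if top_count / len(judged) >= max_uniform_share:
                reasons.append("uniform answers")

        if reasons:
            flagged[worker] = reasons
    return flagged
```

As the pitfall on the slide suggests, a check like this would probably miss cheaters who think about the task, for example someone who knows that most documents in a relevance judgement task are non-relevant and answers accordingly.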

14 A Journey to the Dark Side
Task: Open class questions
Possible cheat (1): Copy and paste standard text
Possible cheat (2): Copy and paste domain-specific text
Remedy: (1) is easy to detect. (2) is problematic
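Cheat (1) is easy to detect because the same text shows up again and again. The following Python sketch is one possible way to do this, under assumptions: responses arrive as `(worker_id, item_id, text)` tuples, and the repeat threshold is arbitrary. It only catches near-verbatim repeats by the same worker; it does not address cheat (2), where pasted text is on-topic and differs per item.

```python
from collections import defaultdict


def normalise(text):
    """Lower-case and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())


def find_repeated_answers(responses, min_repeats=3):
    """Find free-text answers that one worker pasted into several items.

    responses: iterable of (worker_id, item_id, text) tuples (assumed layout)
    Returns {(worker_id, normalised_text): [item_ids]} for every answer the
    worker gave to at least `min_repeats` different items.
    """
    seen = defaultdict(set)
    for worker, item, text in responses:
        seen[(worker, normalise(text))].add(item)
    return {key: sorted(items)
            for key, items in seen.items()
            if len(items) >= min_repeats}
```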

15 A Journey to the Dark Side
Task: Internal quality control
Possible cheat: artificially boost your own confidence
Possible cheat: even worse, do so in a network
Remedy: We need a better confidence measure than prior acceptance rate
Pitfall: Due to the large scale of HITs it is hard to find a reliable confidence measure

16 A Journey to the Dark Side
Task: External quality control
Setup: redirect workers to your own site and let them do the HITs there
Possible cheat: make up confirmation token
Possible cheat: re-use genuine token
Possible cheat: claim that you did not get a token
Remedy: all of the above are easy to detect
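One way the token cheats could be detected is sketched below in Python. The scheme is an assumption, not something described in the talk: the external site derives each confirmation token from the worker and HIT identifiers with a keyed hash, so made-up tokens fail verification and genuine tokens cannot be redeemed twice or by another worker. `SECRET_KEY`, `issue_token` and `check_token` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical private key, known only to the external site.
SECRET_KEY = b"replace-with-a-private-key"


def issue_token(worker_id, hit_id):
    """Derive a confirmation token bound to one worker and one HIT."""
    message = f"{worker_id}:{hit_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:16]


def check_token(worker_id, hit_id, token, redeemed):
    """Classify a submitted token as 'ok', 'invalid' or 'reused'.

    redeemed: a set of (worker_id, hit_id) pairs that were already accepted;
    it is updated in place when a token is accepted.
    """
    expected = issue_token(worker_id, hit_id)
    # Constant-time comparison; a made-up or borrowed token will not match.
    if not hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8")):
        return "invalid"
    if (worker_id, hit_id) in redeemed:
        return "reused"
    redeemed.add((worker_id, hit_id))
    return "ok"
```

A claim that no token was shown could then be checked against the site's own record of which worker and HIT pairs reached the final page, assuming such a log is kept.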

18 Is all Lost?
Posterior detection and filtering of cheaters works reliably
But we waste resources (money, time, nerves...)
Can we discourage cheaters from doing our HIT in the first place?

19 Is all Lost?
Which HIT types do cheaters like?
The Summer 2008 HIT hardly attracted any cheaters
The one in Autumn was swamped by them
The Summer task required a lot of creativity whereas the Autumn one was a straightforward relevance judgement

20 Is all Lost?
Hypothesis: “If the HIT conveys the impression of requiring creativity, cheaters are less likely to take it.”
2 HIT types
– Suitability for children
– Standard relevance judgements

21 Task/Interface Design

22 Crowd Filtering

23 Conclusion
The share of malicious workers can be significantly reduced by making your task:
Innovative
Creative
Non-repetitive
Crowd Filtering can help to reduce the share of malicious workers at the cost of higher completion time.
Previous acceptance rate is not a robust predictor of worker reliability

24 Thank You!

25 Questions, Remarks, Concerns?