
1 How Crowdsourcable is Your Task? Carsten Eickhoff and Arjen P. de Vries. WSDM 2011 Workshop on Crowdsourcing for Search and Data Mining (CSDM 2011), Hong Kong, China, February 9–12, 2011.

2 Outline
– The Crowdsourcing Boom
– Crowdsourcing, a Tale of Great Romance
– A Journey to the Dark Side of Crowdsourcing
– Is all Lost?
– Conclusions

3 The Crowdsourcing Boom
– Billions of judgements are being crowdsourced each year
– CrowdFlower: judgement volume doubled between 2009 and 2010
– A significant number of research publications rely on crowdsourcing to create scientific resources
– ...but is it actually reliable?

4 Outline
– The Crowdsourcing Boom
– Crowdsourcing, a Tale of Great Romance
– A Journey to the Dark Side of Crowdsourcing
– Is all Lost?
– Conclusions

5 Crowdsourcing – A Tale of Great Romance
– Summer 2008: How do I quickly get a large number of judgements?
– Task: message grouping for discourse understanding
– Crowdsourcing produced very reliable results

7 Crowdsourcing – A Tale of Great Romance
– Fall 2008: Crowdsourcing has become a standard data source
– The excitement wears off

8 Crowdsourcing – A Tale of Great Romance
– A dark and cold day in late autumn 2009: you need judgements for yet another experiment

9 Crowdsourcing – A Tale of Great Romance
– A dark and cold day in late autumn 2009: you need judgements for yet another experiment
– You get cheated!

10 Crowdsourcing – A Tale of Great Romance
– A dark and cold day in late autumn 2009: you need judgements for yet another experiment
– You get cheated! Again and again...

11 Outline
– The Crowdsourcing Boom
– Crowdsourcing, a Tale of Great Romance
– A Journey to the Dark Side of Crowdsourcing
– Is all Lost?
– Conclusions

12 A Journey to the Dark Side
– Task-based overview
– What is it that malicious workers do?
– Do we have remedies?

13 A Journey to the Dark Side
– Task: closed-class questions
– Possible cheat: uniform answering (all yes/no)
– Possible cheat: arbitrary answers
– Remedy: good gold-standard data helps (a minimal check is sketched below)
– Pitfall: cheaters who think about the task at hand can cause a lot of trouble (e.g. relevance judgements)
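
Not part of the slides: a minimal sketch of how the gold-standard remedy and a uniform-answering check could be applied in practice. The record layout, thresholds, and function names are illustrative assumptions.

```python
from collections import Counter, defaultdict

def flag_workers(judgements, gold, min_gold_accuracy=0.7, max_label_share=0.9):
    """judgements: list of (worker_id, question_id, answer) tuples.
    gold: dict mapping a subset of question_ids to their known answers.
    Returns the workers who fail the gold check or answer near-uniformly."""
    by_worker = defaultdict(list)
    for worker, question, answer in judgements:
        by_worker[worker].append((question, answer))

    flagged = set()
    for worker, answers in by_worker.items():
        # Accuracy on the embedded gold-standard questions only.
        gold_hits = [answer == gold[q] for q, answer in answers if q in gold]
        if gold_hits and sum(gold_hits) / len(gold_hits) < min_gold_accuracy:
            flagged.add(worker)
        # Uniform answering: one label dominates nearly every judgement.
        top_count = Counter(answer for _, answer in answers).most_common(1)[0][1]
        if top_count / len(answers) >= max_label_share:
            flagged.add(worker)
    return flagged
```

As the pitfall above notes, a worker who actually reads the task can pass both checks, so this only removes the crudest forms of cheating.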

14 A Journey to the Dark Side
– Task: open-class questions
– Possible cheat (1): copy and paste standard text
– Possible cheat (2): copy and paste domain-specific text
– Remedy: (1) is easy to detect (see the sketch below); (2) is problematic
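
Not part of the slides: for open-class answers, cheat (1) can be caught by comparing each worker's free-text responses against one another; the record layout and threshold are assumptions. Pasted domain-specific text, cheat (2), will usually slip past such a check.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def find_pasted_answers(answers, similarity_threshold=0.9):
    """answers: list of (worker_id, question_id, text) free-text responses.
    Flags workers who submit near-identical text for different questions."""
    by_worker = defaultdict(list)
    for worker, _, text in answers:
        by_worker[worker].append(" ".join(text.lower().split()))

    suspicious = set()
    for worker, texts in by_worker.items():
        for i in range(len(texts)):
            for j in range(i + 1, len(texts)):
                # Near-identical answers to different questions point to
                # copy-and-pasted boilerplate rather than genuine input.
                if SequenceMatcher(None, texts[i], texts[j]).ratio() >= similarity_threshold:
                    suspicious.add(worker)
    return suspicious
```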

15 A Journey to the Dark Side
– Task: internal quality control
– Possible cheat: artificially boost your own confidence
– Possible cheat: even worse, do so in a network
– Remedy: we need a better confidence measure than the prior acceptance rate (one candidate is sketched below)
– Pitfall: due to the large scale of HITs it is hard to find a reliable confidence measure
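
Not part of the slides: one candidate for a better confidence measure than the prior acceptance rate is agreement with the majority vote on redundantly judged items; a sketch with an assumed data layout. It is only meaningful when items receive enough overlapping judgements, and colluding workers (the "network" cheat above) can still inflate each other's scores.

```python
from collections import Counter, defaultdict

def agreement_confidence(judgements):
    """judgements: list of (worker_id, item_id, label) triples, where each
    item is judged by several workers. Returns, per worker, the share of
    labels that agree with the item's majority vote."""
    labels_per_item = defaultdict(list)
    for _, item, label in judgements:
        labels_per_item[item].append(label)
    majority = {item: Counter(labels).most_common(1)[0][0]
                for item, labels in labels_per_item.items()}

    agree, total = Counter(), Counter()
    for worker, item, label in judgements:
        total[worker] += 1
        agree[worker] += int(label == majority[item])
    return {worker: agree[worker] / total[worker] for worker in total}
```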

16 A Journey to the Dark Side
– Task: external quality control
– Setup: redirect workers to your own site, let them do the HITs there, and have them report a confirmation token back
– Possible cheat: make up a confirmation token
– Possible cheat: re-use a genuine token
– Possible cheat: claim that you did not get a token
– Remedy: all of the above are easy to detect (see the token sketch below)
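
Not part of the slides: the detection promised in the remedy becomes mechanical if the external site binds the confirmation token to the worker and HIT, for example with an HMAC; the key handling and field names below are assumptions.

```python
import hashlib
import hmac

SECRET_KEY = b"keep-this-private-on-the-external-site"  # illustrative placeholder

def issue_token(worker_id, hit_id):
    # The token is derived from the worker/HIT pair, so it cannot be
    # guessed (made up) or transferred to a different assignment.
    message = f"{worker_id}:{hit_id}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:16]

def verify_token(worker_id, hit_id, token, seen_tokens):
    expected = issue_token(worker_id, hit_id)
    if not hmac.compare_digest(expected, token):
        return "invalid"   # made-up or mistyped token
    if token in seen_tokens:
        return "reused"    # genuine token submitted a second time
    seen_tokens.add(token)
    return "accepted"
```

A claim of never having received a token can then be checked against the external site's own completion log instead of being taken on trust.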

17 Outline
– The Crowdsourcing Boom
– Crowdsourcing, a Tale of Great Romance
– A Journey to the Dark Side of Crowdsourcing
– Is all Lost?
– Conclusions

18 Is all Lost?
– Posterior detection and filtering of cheaters works reliably
– But we waste resources (money, time, nerves...)
– Can we discourage cheaters from doing our HIT in the first place?

19 Is all Lost?
– Which HIT types do cheaters like?
– The Summer 2008 HIT hardly attracted any cheaters; the one in Autumn was swamped by them
– The Summer task required a lot of creativity, whereas the Autumn one was a straightforward relevance judgement

20 Is all Lost?
– Hypothesis: "If the HIT conveys the impression of requiring creativity, cheaters are less likely to take it."
– Two HIT types: suitability for children, and standard relevance judgements

21 Task/Interface Design

22 Crowd Filtering

23 Conclusion
– The share of malicious workers can be significantly reduced by making your task innovative, creative, and non-repetitive
– Crowd filtering can help to reduce the share of malicious workers, at the cost of higher completion time
– The previous acceptance rate is not a robust predictor of worker reliability

24 Thank You!

25 Questions, Remarks, Concerns? c.eickhoff@tudelft.nl

