Crowdsourcing research data
UMBC ebiquity, 2010-03-01
Overview
How did we get into this
Crowdsourcing defined
Amazon Mechanical Turk
CrowdFlower
Two examples
– Annotating tweets for named entities
– Evaluating word clouds
Conclusions
Motivation
Needed to train a named entity recognizer for Twitter statuses
– Need human judgments on 1000s of tweets to identify named entities of type PER, ORG, or LOC
Example: "anand drove to boston to see the red sox play" → anand = PER, boston = LOC, red sox = ORG
NAACL 2010 Workshop: Creating Speech and Language Data With Amazon's Mechanical Turk
Shared task papers: what can you do with $100?
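The judgments being collected can be stored as simple token/tag pairs. The sketch below is illustrative Python, not material from the talk, showing one way to represent the example tweet and its labels.

```python
# A minimal sketch (not from the slides) of one way to represent a
# labeled tweet: (token, tag) pairs, with O marking tokens outside
# any named entity.
annotated_tweet = [
    ("anand", "PER"), ("drove", "O"), ("to", "O"), ("boston", "LOC"),
    ("to", "O"), ("see", "O"), ("the", "O"),
    ("red", "ORG"), ("sox", "ORG"), ("play", "O"),
]

# Pull out just the entity tokens, e.g., to compare two workers' answers.
entities = [(tok, tag) for tok, tag in annotated_tweet if tag != "O"]
print(entities)  # [('anand', 'PER'), ('boston', 'LOC'), ('red', 'ORG'), ('sox', 'ORG')]
```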
Crowdsourcing
Crowdsourcing = Crowd + Outsourcing
Tasks normally performed by employees are outsourced via an open call to a large community
Some examples
– Netflix Prize
– InnoCentive: solve R&D challenges
– DARPA Network Challenge
Web Crowdsourcing
Ideal fit for the Web
Lots of custom examples
– ESP Game, now Google Image Labeler
– reCAPTCHA
– Galaxy Zoo: amateur astronomers classify galaxy images
General crowdsourcing services
– Amazon Mechanical Turk
– CrowdFlower
Amazon Mechanical Turk
Amazon service since 2005
Some tasks can't be done well by computers and some require human judgments
Amazon's name for a unit of work: Human Intelligence Task (HIT)
Requesters define tasks & upload data; workers (aka Turkers) do tasks and get paid
HITs are generally low value, e.g., $0.02 each or $4-$5/hour; Amazon takes 10% (see the cost sketch below)
Examples of HITs
– Add keywords to images
– Crop images
– Spam identification (generating a test set to train a NN)
– Subtitling, speech-to-text
– Adult content analysis
– Facial recognition
– Proofreading
– OCR correction/verification
– Annotate text
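At those rates, the cost of a labeling job is easy to estimate. The job parameters below (1,000 tweets, 3 judgments each, $0.02 per judgment) are illustrative assumptions, not figures from the talk.

```python
# Illustrative cost estimate for an annotation job at typical HIT rates.
tweets = 1000             # items to annotate
assignments_per_item = 3  # redundant judgments per tweet, for agreement checks
reward_per_hit = 0.02     # dollars paid to the worker per judgment
amazon_fee = 0.10         # Amazon's cut on top of worker payments

worker_cost = tweets * assignments_per_item * reward_per_hit
total_cost = worker_cost * (1 + amazon_fee)
print(f"Worker payments: ${worker_cost:.2f}, total with fee: ${total_cost:.2f}")
# Worker payments: $60.00, total with fee: $66.00
```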
Original Mechanical Turk
The Turk: the first chess-playing automaton hoax
Constructed in 1770; toured the US and Europe for over 80 years
Played Napoleon Bonaparte and Benjamin Franklin
MTurk quality control
How do you ensure the work delivered via Mechanical Turk is of good quality?
Define qualifications, give a pre-test, mix in tasks with known answers
Requesters can reject answers
– Manually
– Automatically: when there are multiple assignments, a worker won't get paid unless two other people give the same result (see the agreement sketch after this list)
– Turkers have no recourse other than ratings of requesters
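A minimal sketch of that automatic-rejection rule, under assumptions not in the slides (the function name and threshold are illustrative): a worker's answer for an item is accepted only if at least two other workers gave the same answer.

```python
from collections import Counter

def accept_answers(answers_by_worker, min_agreeing_others=2):
    """Decide which workers get paid for one item.

    answers_by_worker maps worker id -> that worker's answer. A worker is
    accepted only if at least min_agreeing_others *other* workers gave
    the same answer, mirroring the rule described on the slide.
    """
    counts = Counter(answers_by_worker.values())
    return {
        worker: counts[answer] - 1 >= min_agreeing_others  # minus the worker themself
        for worker, answer in answers_by_worker.items()
    }

# Example: three workers agree on PER, one says ORG and is rejected.
print(accept_answers({"w1": "PER", "w2": "PER", "w3": "PER", "w4": "ORG"}))
# {'w1': True, 'w2': True, 'w3': True, 'w4': False}
```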
AMT Demo: annotating named entities
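The demo was run through the MTurk requester web interface. For a rough idea of how a similar named-entity HIT could be posted programmatically, here is a sketch using the boto3 MTurk client against the requester sandbox; boto3 postdates this 2010 talk, and the question HTML and parameter values are assumptions, not the HIT shown in the demo.

```python
import boto3

# Sketch only: boto3 postdates this talk; all values here are illustrative.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

tweet = "anand drove to boston to see the red sox play"

# Simplified HTMLQuestion; a real HIT would fill in assignmentId via JavaScript.
question_xml = f"""
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <html><body>
      <form name="mturk_form" method="post" action="https://workersandbox.mturk.com/mturk/externalSubmit">
        <input type="hidden" name="assignmentId" value="" id="assignmentId"/>
        <p>Tweet: {tweet}</p>
        <p>List each person (PER), organization (ORG), and location (LOC) mentioned:</p>
        <textarea name="entities" rows="3" cols="60"></textarea>
        <p><input type="submit" value="Submit"/></p>
      </form>
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>
"""

response = mturk.create_hit(
    Title="Label named entities in a tweet",
    Description="Mark the people, organizations, and locations in a short tweet",
    Keywords="annotation, named entities, twitter",
    Reward="0.02",
    MaxAssignments=3,
    LifetimeInSeconds=24 * 60 * 60,
    AssignmentDurationInSeconds=5 * 60,
    Question=question_xml,
)
print(response["HIT"]["HITId"])
```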
CrowdFlower
Commercial effort by Dolores Labs
Sits on top of AMT
Real-time results
Choose among multiple worker channels, like AMT and Samasource
Quality control measures
What motivates Mechanical Turkers?
Adapted from the Dolores Labs blog
CrowdFlower Markup Language (CML)
Interactive form builder
CML tags for radio buttons, checkboxes, multiline text, etc.
Analytics
Per worker stats
Gold Standards
Ensure quality and prevent scammers from giving bad results
Interface to monitor gold stats
If a worker makes a mistake on a known result, they are notified and shown the mistake
Error rates without a gold standard are more than twice as high as with one
Helps in two ways
– improves worker accuracy
– allows CrowdFlower to determine who is giving accurate answers (see the sketch below)
Adapted from http://crowdflower.com/docs
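A minimal sketch of that second point, with an assumed data layout and accuracy cutoff that are not from the slides: score each worker against the gold items they answered and flag those below the cutoff.

```python
# Illustrative sketch: estimate worker accuracy from gold (known-answer)
# items; the 0.7 cutoff and data layout are assumptions, not CrowdFlower's
# actual mechanism.
gold_answers = {"item1": "PER", "item2": "LOC", "item3": "ORG"}

worker_responses = {
    "w1": {"item1": "PER", "item2": "LOC", "item3": "ORG"},  # all correct
    "w2": {"item1": "ORG", "item2": "LOC", "item3": "PER"},  # mostly wrong
}

MIN_ACCURACY = 0.7  # assumed cutoff

def worker_accuracy(responses):
    graded = [(item, ans) for item, ans in responses.items() if item in gold_answers]
    correct = sum(1 for item, ans in graded if ans == gold_answers[item])
    return correct / len(graded) if graded else 0.0

for worker, responses in worker_responses.items():
    acc = worker_accuracy(responses)
    print(f"{worker}: accuracy={acc:.2f} trusted={acc >= MIN_ACCURACY}")
# w1: accuracy=1.00 trusted=True
# w2: accuracy=0.33 trusted=False
```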
Conclusion
Ask us after spring break how it went
You might find AMT useful for collecting annotations or judgments for your research
$25-$50 can go a long way
AMT Demo: Which is the better word cloud?