Crowdsourcing research data
UMBC ebiquity, 2010-03-01
Overview
How did we get into this
Crowdsourcing defined
Amazon Mechanical Turk
CrowdFlower
Two examples
– Annotating tweets for named entities
– Evaluating word clouds
Conclusions
Motivation
Needed to train a named entity recognizer for Twitter statuses
– Need human judgments on 1000s of tweets to identify named entities of type PER, ORG, or LOC
Example: "anand drove to boston to see the red sox play" → anand = PER, boston = LOC, red sox = ORG
NAACL 2010 Workshop: Creating Speech and Language Data With Amazon's Mechanical Turk
Shared task papers: what can you do with $100?
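The judgments being collected can be stored as simple token/tag pairs. The sketch below is illustrative Python, not material from the talk, showing one way to represent the example tweet and its labels.

```python
# A minimal sketch (not from the slides) of one way to represent a
# labeled tweet: (token, tag) pairs, with O marking tokens outside
# any named entity.
annotated_tweet = [
    ("anand", "PER"), ("drove", "O"), ("to", "O"), ("boston", "LOC"),
    ("to", "O"), ("see", "O"), ("the", "O"),
    ("red", "ORG"), ("sox", "ORG"), ("play", "O"),
]

# Pull out just the entity tokens, e.g., to compare two workers' answers.
entities = [(tok, tag) for tok, tag in annotated_tweet if tag != "O"]
print(entities)  # [('anand', 'PER'), ('boston', 'LOC'), ('red', 'ORG'), ('sox', 'ORG')]
```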
Crowdsourcing
Crowdsourcing = Crowd + Outsourcing
Tasks normally performed by employees are outsourced via an open call to a large community
Some examples
– Netflix Prize
– InnoCentive: solve R&D challenges
– DARPA Network Challenge
Web Crowdsourcing
Ideal fit for the Web
Lots of custom examples
– ESP Game, now Google Image Labeler
– reCAPTCHA
– Galaxy Zoo: amateur astronomers classify galaxy images
General crowdsourcing services
– Amazon Mechanical Turk
– CrowdFlower
Amazon Mechanical Turk
Amazon service since 2005
Some tasks can't be done well by computers and some require human judgments
Amazon's name for a unit of work: Human Intelligence Task (HIT)
Requesters define tasks & upload data; workers (aka Turkers) do tasks and get paid
HITs are generally low value, e.g., $0.02 each or $4-$5/hour; Amazon takes 10% (see the cost sketch below)
Examples of HITs
– Add keywords to images
– Crop images
– Spam identification (generating a test set to train a NN)
– Subtitling, speech-to-text
– Adult content analysis
– Facial recognition
– Proofreading
– OCR correction/verification
– Annotate text
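At those rates, the cost of a labeling job is easy to estimate. The job parameters below (1,000 tweets, 3 judgments each, $0.02 per judgment) are illustrative assumptions, not figures from the talk.

```python
# Illustrative cost estimate for an annotation job at typical HIT rates.
tweets = 1000             # items to annotate
assignments_per_item = 3  # redundant judgments per tweet, for agreement checks
reward_per_hit = 0.02     # dollars paid to the worker per judgment
amazon_fee = 0.10         # Amazon's cut on top of worker payments

worker_cost = tweets * assignments_per_item * reward_per_hit
total_cost = worker_cost * (1 + amazon_fee)
print(f"Worker payments: ${worker_cost:.2f}, total with fee: ${total_cost:.2f}")
# Worker payments: $60.00, total with fee: $66.00
```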
Original Mechanical Turk
The Turk: the first chess-playing automaton hoax
Constructed in 1770; toured the US and Europe for over 80 years
Played Napoleon Bonaparte and Benjamin Franklin
MTurk quality control
How do you ensure the work delivered via Mechanical Turk is of good quality?
Define qualifications, give a pre-test, mix in tasks with known answers
Requesters can reject answers
– Manually
– Automatically: when there are multiple assignments, a worker won't get paid unless two other people give the same result (see the agreement sketch after this list)
– Turkers have no recourse other than ratings of requesters
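A minimal sketch of that automatic-rejection rule, under assumptions not in the slides (the function name and threshold are illustrative): a worker's answer for an item is accepted only if at least two other workers gave the same answer.

```python
from collections import Counter

def accept_answers(answers_by_worker, min_agreeing_others=2):
    """Decide which workers get paid for one item.

    answers_by_worker maps worker id -> that worker's answer. A worker is
    accepted only if at least min_agreeing_others *other* workers gave
    the same answer, mirroring the rule described on the slide.
    """
    counts = Counter(answers_by_worker.values())
    return {
        worker: counts[answer] - 1 >= min_agreeing_others  # minus the worker themself
        for worker, answer in answers_by_worker.items()
    }

# Example: three workers agree on PER, one says ORG and is rejected.
print(accept_answers({"w1": "PER", "w2": "PER", "w3": "PER", "w4": "ORG"}))
# {'w1': True, 'w2': True, 'w3': True, 'w4': False}
```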
AMT Demo: annotating named entities
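The demo was run through the MTurk requester web interface. For a rough idea of how a similar named-entity HIT could be posted programmatically, here is a sketch using the boto3 MTurk client against the requester sandbox; boto3 postdates this 2010 talk, and the question HTML and parameter values are assumptions, not the HIT shown in the demo.

```python
import boto3

# Sketch only: boto3 postdates this talk; all values here are illustrative.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

tweet = "anand drove to boston to see the red sox play"

# Simplified HTMLQuestion; a real HIT would fill in assignmentId via JavaScript.
question_xml = f"""
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <html><body>
      <form name="mturk_form" method="post" action="https://workersandbox.mturk.com/mturk/externalSubmit">
        <input type="hidden" name="assignmentId" value="" id="assignmentId"/>
        <p>Tweet: {tweet}</p>
        <p>List each person (PER), organization (ORG), and location (LOC) mentioned:</p>
        <textarea name="entities" rows="3" cols="60"></textarea>
        <p><input type="submit" value="Submit"/></p>
      </form>
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>
"""

response = mturk.create_hit(
    Title="Label named entities in a tweet",
    Description="Mark the people, organizations, and locations in a short tweet",
    Keywords="annotation, named entities, twitter",
    Reward="0.02",
    MaxAssignments=3,
    LifetimeInSeconds=24 * 60 * 60,
    AssignmentDurationInSeconds=5 * 60,
    Question=question_xml,
)
print(response["HIT"]["HITId"])
```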
CrowdFlower
Commercial effort by Dolores Labs
Sits on top of AMT
Real-time results
Choose among multiple worker channels, like AMT and Samasource
Quality control measures
What motivates Mechanical Turkers?
Adapted from the Dolores Labs blog
CrowdFlower Markup Language (CML)
Interactive form builder
CML tags for radio buttons, checkboxes, multiline text, etc.
Analytics
Per worker stats
Gold Standards
Ensure quality and prevent scammers from giving bad results
Interface to monitor gold stats
If a worker makes a mistake on a known result, they are notified and shown the mistake
Error rates without a gold standard are more than twice as high as with one
Helps in two ways
– improves worker accuracy
– allows CrowdFlower to determine who is giving accurate answers (see the sketch below)
Adapted from http://crowdflower.com/docs
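A minimal sketch of that second point, with an assumed data layout and accuracy cutoff that are not from the slides: score each worker against the gold items they answered and flag those below the cutoff.

```python
# Illustrative sketch: estimate worker accuracy from gold (known-answer)
# items; the 0.7 cutoff and data layout are assumptions, not CrowdFlower's
# actual mechanism.
gold_answers = {"item1": "PER", "item2": "LOC", "item3": "ORG"}

worker_responses = {
    "w1": {"item1": "PER", "item2": "LOC", "item3": "ORG"},  # all correct
    "w2": {"item1": "ORG", "item2": "LOC", "item3": "PER"},  # mostly wrong
}

MIN_ACCURACY = 0.7  # assumed cutoff

def worker_accuracy(responses):
    graded = [(item, ans) for item, ans in responses.items() if item in gold_answers]
    correct = sum(1 for item, ans in graded if ans == gold_answers[item])
    return correct / len(graded) if graded else 0.0

for worker, responses in worker_responses.items():
    acc = worker_accuracy(responses)
    print(f"{worker}: accuracy={acc:.2f} trusted={acc >= MIN_ACCURACY}")
# w1: accuracy=1.00 trusted=True
# w2: accuracy=0.33 trusted=False
```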
Conclusion
Ask us after spring break how it went
You might find AMT useful for collecting annotations or judgments for your research
$25-$50 can go a long way
AMT Demo: Which is the better word cloud?