Conducting Behavioral Research on Amazon’s Mechanical Turk

Presentation transcript:

Conducting Behavioral Research on Amazon’s Mechanical Turk
Nathalie Popovic
30.05.2015, Konstanz

Behavioral research methods

How-To Guide to MTurk
- What is Amazon’s Mechanical Turk?
- Why Mechanical Turk?
- Who are the Workers?
- How to Become a Requester?
- How to Create a Study?
- How to Ensure Quality?
- What about Ethics and Privacy?
- Turker Communities and Useful Websites

What is Amazon’s Mechanical Turk?
“Amazon’s Mechanical Turk is a crowdsourcing Internet marketplace that enables individuals and businesses (known as Requesters) to coordinate the use of human intelligence to perform tasks that computers are currently unable to do.” (Amazon Mechanical Turk. (2015, June 29). In Wikipedia, the free encyclopedia. Retrieved from https://en.wikipedia.org/w/index.php?title=Amazon_Mechanical_Turk&oldid=669241647)
“A labor market for microtasks” (Huang, Zhang, Parkes, Gajos, & Chen, 2010)

Why Mechanical Turk?
- Stable subject availability
- Large set of people willing to participate in experiments for relatively low payment
- Subject pool diversity (age, ethnicity, socioeconomic status, language, country of origin)
- Low cost and built-in payment mechanism
- Faster theory/experiment cycle
- Validity of worker behavior (e.g., Paolacci et al., 2010)

How does it work?
Requesters post HITs, and Workers complete them.
HIT = human intelligence task

Who are the Workers?
- Countries of origin: 100,000 workers from 100 different countries (Pontin, 2007); 50% USA, 40% India (Ipeirotis, 2010)
- Gender: slightly more female (55%)
- Mean age: 32 years (in my study: 37 years)
- Income: broadly matches the income distribution of the general US population, but US workers on Mechanical Turk are shifted towards lower income levels (Ipeirotis, 2010)
- Main reasons for participation (Ipeirotis, 2010): money! India: primary source of income. USA: secondary source of income (also entertainment and education)
- Video: http://www.newyorker.com/culture/culture-desk/video-turking-for-respect

Most workers come from the US and India, because Amazon allows cash payments only in dollars and rupees. We find that approximately 50% of the workers come from the United States and 40% come from India. Country of origin tends to change the motivating reasons for workers to participate in the marketplace. Significantly more workers from India participate on Mechanical Turk because the online marketplace is a primary source of income, while in the US most workers consider Mechanical Turk a secondary source of income. While money is a primary motivating reason for workers to participate in the marketplace, workers also cite a variety of other motivating reasons, including entertainment and education.

How to Become a Requester?
- Create a Requester account and an Amazon Payments account
- What you need:
  - E-mail address (advice: use a unique e-mail address for running studies)
  - Credit card
  - U.S. billing address (create one with International Parcel Services)
- Small rate of failing the seriousness check, probably due to the high relevance of the study topic

How to Create a Study?
- Create a HIT
  - Internal HIT (using Amazon templates)
  - External HIT (link to study)
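For an external HIT, the requester hands MTurk a small XML document that points to the study's own URL. The following is a minimal sketch of building that document in Python. The schema URL is the one MTurk documents for ExternalQuestion. The study URL, the frame height, and the helper name `external_question_xml` are assumptions made for illustration, not part of the original slides.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace of MTurk's ExternalQuestion schema.
EXTERNAL_QUESTION_XSD = (
    "http://mturk-requester.s3.amazonaws.com/schemas/2006-07-14/ExternalQuestion.xsd"
)


def external_question_xml(study_url: str, frame_height: int = 600) -> str:
    """Build the XML payload that links a HIT to an externally hosted study.

    `study_url` is a hypothetical address; it is XML-escaped so that
    query strings containing '&' stay well-formed.
    """
    return (
        f'<ExternalQuestion xmlns="{EXTERNAL_QUESTION_XSD}">'
        f"<ExternalURL>{escape(study_url)}</ExternalURL>"
        f"<FrameHeight>{frame_height}</FrameHeight>"
        "</ExternalQuestion>"
    )


# Hypothetical study link, for illustration only.
xml_payload = external_question_xml("https://example.org/my-study?cond=a&v=1")

# Parse the payload back to confirm it is well-formed XML.
root = ET.fromstring(xml_payload)
print(root.tag)
```

The resulting string would be passed as the question when the HIT is created. An internal HIT needs no such payload because Amazon hosts the template.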

Internal HIT, survey template 19.05.2015 Seriousness Checks in Internet-Based Research

Qualification of Workers Internal HIT (using Amazon Templates) External HIT (Link to Study)

How to Create a Study?
- Payment:
  - 10% service fee to Amazon
  - Reservation wage: $1.38 per hour (Ipeirotis, 2010)
  - Average effective hourly wage of $4.80 for workers (Ipeirotis, 2010)
  - Bonus
  - Little to no effect of wage on quality of work (Marge et al., 2010; Mason & Watts, 2009)
  - Start by paying less than the expected reservation wage, then increase the wage if the rate of completed work is too low
  - Offer a lottery to subjects
- Completion time: depends on payment, how long the HIT takes, number of HITs posted, type of task, and reputation of the requester. My study: 100 subjects in 2 hours
- Reject or accept work
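The payment figures above can be turned into a quick budget check. This is a rough sketch only: the 10% fee and the $1.38 reservation wage come from the slide, while the example pay of $0.50 per HIT, the 10-minute duration, and the assumption that the fee also applies to bonuses are illustrative choices, not values from the talk.

```python
FEE_RATE = 0.10          # service fee to Amazon, as stated on the slide
RESERVATION_WAGE = 1.38  # dollars per hour (Ipeirotis, 2010, as cited on the slide)


def study_cost(n_subjects, pay_per_hit, bonus_per_subject=0.0, fee_rate=FEE_RATE):
    """Total cost to the requester: worker payments plus the service fee.

    Assumes the fee is charged on payments and bonuses alike.
    """
    worker_pay = n_subjects * (pay_per_hit + bonus_per_subject)
    return worker_pay * (1 + fee_rate)


def effective_hourly_wage(pay_per_hit, minutes_per_hit):
    """Hourly wage implied by the pay for one HIT and its expected duration."""
    return pay_per_hit * 60 / minutes_per_hit


# Hypothetical study: 100 subjects, $0.50 per HIT, about 10 minutes each.
total = study_cost(n_subjects=100, pay_per_hit=0.50)
wage = effective_hourly_wage(pay_per_hit=0.50, minutes_per_hit=10)

print(f"Total cost: ${total:.2f}")
print(f"Effective hourly wage: ${wage:.2f}")
print("Above reservation wage:", wage >= RESERVATION_WAGE)
```

Comparing the implied hourly wage with the reservation wage gives a starting point for the "start low, then raise" strategy: if completions arrive too slowly, raise `pay_per_hit` and recompute.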

How to Create a Study? Synchronous Experiments
- Build a subject panel, for example by running small preliminary experiments, or by running a different study and asking participants whether they would like to be notified about future studies
- Notify the panel about upcoming experiments
- Provide a “waiting room”
- If you need an experiment with n subjects, use a panel with 3n subjects
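The panel rule and the waiting room can be sketched in a few lines. The factor of 3 is the rule of thumb from the slide. The function names and the simple first-come, first-served waiting room are assumptions for illustration, not a description of any particular experiment platform.

```python
def panel_size(n_required, oversample=3):
    """Number of panel members to recruit for a session of n_required subjects."""
    return oversample * n_required


def form_session(waiting, n_required):
    """Start a session once enough workers are in the waiting room.

    `waiting` lists worker IDs in arrival order. Returns the first
    n_required arrivals, or None while the room is still too empty.
    """
    if len(waiting) < n_required:
        return None
    return waiting[:n_required]


# Hypothetical session needing 20 simultaneous subjects.
print(panel_size(20))

# Hypothetical waiting room with three arrivals and a two-person game.
print(form_session(["W1", "W2", "W3"], n_required=2))
print(form_session(["W1"], n_required=2))
```

Oversampling the panel compensates for members who do not show up at the announced time, so that the waiting room fills quickly enough to start the session.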

How to Ensure Quality?
- Problems: spammers, bots
- Solutions:
  - Set qualification criteria (e.g. 95% approval rate)
  - Verifiable questions (e.g. “What is 2 + 2?”), attention checks
  - Seriousness checks
  - Use check questions that take the same amount of effort as other questions but have a verifiable answer
  - Advisable to make clear that workers will not be paid if the check is not answered correctly
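The two screening ideas above, an approval-rate threshold and a verifiable question, can be applied to collected data as follows. This is a minimal sketch: the field names `approval_rate` and `check_answer`, the sample responses, and the accepted spellings of the answer are all hypothetical. On MTurk itself, the approval-rate criterion is normally set as a qualification before workers can accept the HIT.

```python
MIN_APPROVAL_RATE = 95          # qualification criterion from the slide
CHECK_ANSWERS = {"4", "four"}   # accepted answers to "What is 2 + 2?"


def passes_screening(response):
    """Keep a response only if the worker meets the approval threshold
    and answered the verifiable question correctly."""
    approved = response["approval_rate"] >= MIN_APPROVAL_RATE
    answer = response["check_answer"].strip().lower()
    return approved and answer in CHECK_ANSWERS


# Hypothetical responses, for illustration only.
responses = [
    {"worker_id": "W1", "approval_rate": 99, "check_answer": "4"},
    {"worker_id": "W2", "approval_rate": 97, "check_answer": "22"},
    {"worker_id": "W3", "approval_rate": 80, "check_answer": "four"},
    {"worker_id": "W4", "approval_rate": 95, "check_answer": " Four "},
]

kept = [r["worker_id"] for r in responses if passes_screening(r)]
print(kept)
```

Normalizing the answer (trimming whitespace, lowercasing) avoids rejecting attentive workers over formatting, which matters when payment depends on the check.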

What about Ethics and Privacy?
- Informed consent (purpose of the study, risks and benefits of the research, contact information of the researcher)
- Debriefing (purpose of the experiment, contact details of the researcher)
- Panel without deception
- Compensation: hours and working conditions are wholly determined by workers
- Confidentiality: with a template HIT, Amazon has access to the data

Turker Community
- Off-site reputation systems: Turkopticon, Turker Nation
- Rate requesters based on communicativity, generosity, fairness and promptness

Turkopticon

Turker Nation

Turker Community
- Off-site reputation systems: Turkopticon, Turker Nation
- Rate requesters based on communicativity, generosity, fairness and promptness
- Requesters should introduce themselves before posting HITs
- Workers’ reactions to a study can provide useful insights into the method
- Keep a professional rapport with workers, as if they were employees

Useful Websites
- A Computer Scientist in a Business School: www.behind-the-enemy-lines.com (Panos Ipeirotis, Leonard N. Stern School of Business of New York University)
- Mturk tracker: http://www.mturk-tracker.com/#/general
- Experimental Turk: https://experimentalturk.wordpress.com/ (blog reporting evidence concerning the reliability of Amazon Mechanical Turk as an online subject pool for experiments)
- Deneme: http://groups.csail.mit.edu/uid/deneme/ (a blog of experiments on Amazon Mechanical Turk; MIT’s Center for People, Software and Information)

Bibliography
Huang, E., Zhang, H., Parkes, D. C., Gajos, K. Z., & Chen, Y. (2010). Toward automatic task design: A progress report. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 77–85). New York: ACM.
Ipeirotis, P. G. (2010b). Demographics of Mechanical Turk (Tech. Rep. No. CeDER-10-01). New York: New York University. Retrieved from http://hdl.handle.net/2451/29585. March.
Marge, M., Banerjee, S., & Rudnicky, A. I. (2010). Using the Amazon Mechanical Turk for transcription of spoken language. In J. Hansen (Ed.), Proceedings of the 2010 IEEE Conference on Acoustics, Speech and Signal Processing (pp. 5270–5273). IEEE.
Mason, W. A., & Watts, D. J. (2009). Financial incentives and the performance of crowds. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 77–85). New York: ACM.
Mason, W., & Suri, S. (2012). Conducting behavioral research on Amazon’s Mechanical Turk. Behavior Research Methods, 44(1), 1–23. doi:10.3758/s13428-011-0124-6
Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5, 411–419.
Pontin, J. (2007). Artificial intelligence, with help from the humans. New York Times. March.