Published by Friedrich Hofer. Modified over 7 years ago.
1
Conducting Behavioral Research on Amazon’s Mechanical Turk
Nathalie Popovic, Konstanz
2
Behavioral research methods
3
How-To Guide to MTurk
- What is Amazon’s Mechanical Turk?
- Why Mechanical Turk?
- Who are the Workers?
- How to Become a Requester?
- How to Create a Study?
- How to Ensure Quality?
- What about Ethics and Privacy?
- Turker Communities and Useful Websites
4
What is Amazon’s Mechanical Turk?
“Amazon’s Mechanical Turk is a crowdsourcing Internet marketplace that enables individuals and businesses (known as Requesters) to coordinate the use of human intelligence to perform tasks that computers are currently unable to do.” (Amazon Mechanical Turk. (2015, June 29). In Wikipedia, the free encyclopedia.)
“A labor market for microtasks” (Huang, Zhang, Parkes, Gajos, & Chen, 2010)
6
Why Mechanical Turk?
- Stable subject availability
- Large set of people willing to participate in experiments for relatively low payment
- Subject pool diversity (age, ethnicity, socioeconomic status, language, country of origin)
- Low cost and built-in payment mechanism
- Faster theory/experiment cycle
- Validity of worker behavior (e.g., Paolacci et al., 2010)
7
How does it work?
Requesters post HITs; Workers complete them.
HIT = human intelligence task
8
Who are the Workers?
- Countries of origin: workers from 100 different countries (Pontin, 2007); 50% USA, 40% India (Ipeirotis, 2010)
- Gender: slightly more female (55%)
- Mean age: 32 years (in my study: 37 years)
- Income: matches the income distribution in the general US population, but the income level of US workers on Mechanical Turk is shifted towards lower income levels (Ipeirotis, 2010)
- Main reasons for participation (Ipeirotis, 2010): money! India: primary source of income. USA: secondary source of income (also entertainment and education)
- Most workers come from the US and India, because Amazon allows cash payments only in dollars and rupees
We find that approximately 50% of the workers come from the United States and 40% come from India. Country of origin tends to change the motivating reasons for workers to participate in the marketplace. Significantly more workers from India participate on Mechanical Turk because the online marketplace is a primary source of income, while in the US most workers consider Mechanical Turk a secondary source of income. While money is a primary motivating reason for workers to participate in the marketplace, workers also cite a variety of other motivating reasons, including entertainment and education.
9
How to Become a Requester?
Create a Requester account and an Amazon Payments account.
What you need:
- Address (advice: use a separate address just for running studies)
- Credit card
- U.S. billing address (create one with International Parcel Services)
Note: the small rate of failed seriousness checks is probably due to the high relevance of the study topic.
10
How to Create a Study?
Create a HIT:
- Internal HIT (using Amazon templates)
- External HIT (link to study)
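For an external HIT, Mechanical Turk expects a small XML document that points to the study URL. The sketch below only builds that XML string; it does not contact Amazon. The study URL and the frame height are hypothetical values chosen for illustration, and the schema namespace should be checked against the current MTurk documentation before use.

```python
from xml.sax.saxutils import escape

# Namespace of the ExternalQuestion schema (verify against current MTurk docs).
EXTERNAL_QUESTION_XMLNS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def build_external_question(study_url: str, frame_height: int = 600) -> str:
    """Return ExternalQuestion XML that embeds an externally hosted study."""
    return (
        f'<ExternalQuestion xmlns="{EXTERNAL_QUESTION_XMLNS}">'
        # escape() turns "&" in query strings into "&amp;" so the XML stays valid.
        f"<ExternalURL>{escape(study_url)}</ExternalURL>"
        f"<FrameHeight>{frame_height}</FrameHeight>"
        "</ExternalQuestion>"
    )


# Hypothetical study URL, for illustration only.
question_xml = build_external_question("https://example.org/study?cond=a&v=2")
print(question_xml)
```

A string like this can be passed as the `Question` argument when creating a HIT programmatically, for example with the `create_hit` call of the boto3 MTurk client.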
11
Internal HIT, survey template
Seriousness Checks in Internet-Based Research
12
Seriousness Checks in Internet-Based Research
15
Qualification of Workers
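A worker qualification such as a minimum approval rate can be written down as a small requirement structure. The sketch below builds one as a plain dictionary in the shape the MTurk API uses for qualification requirements. The system qualification ID is an assumption quoted from memory of the MTurk documentation and should be verified there; the helper name is hypothetical.

```python
# Assumed system qualification ID for "percent of assignments approved".
# Verify this value in the MTurk documentation before relying on it.
PERCENT_APPROVED_ID = "000000000000000000L0"


def approval_rate_requirement(min_percent: int = 95) -> dict:
    """Build a requirement: worker approval rate must be >= min_percent."""
    if not 0 <= min_percent <= 100:
        raise ValueError("min_percent must be between 0 and 100")
    return {
        "QualificationTypeId": PERCENT_APPROVED_ID,
        "Comparator": "GreaterThanOrEqualTo",
        "IntegerValues": [min_percent],
    }


requirement = approval_rate_requirement(95)
print(requirement)
```

A list of such dictionaries can be supplied as `QualificationRequirements` when a HIT is created through the API; in the web interface the same restriction is set in the HIT properties.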
17
How to Create a Study?
Payment:
- 10% service fee to Amazon
- Reservation wage: $1.38 per hour (Ipeirotis, 2010)
- Average effective hourly wage of $4.80 for workers (Ipeirotis, 2010)
- Bonus
- Little to no effect of wage on quality of work (Marge et al., 2010; Mason & Watts, 2009)
- Start by paying less than the expected reservation wage, then increase the wage if the rate of completed work is too low
- Offer a lottery to subjects
Completion time:
- Depends on payment, how long the HIT takes, number of HITs posted, type of task, and reputation of the requester
- My study: 100 subjects in 2 hours
Reject or accept work
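The payment figures above can be turned into a quick budget check. This is a minimal sketch: the 10% fee and the $1.38 reservation wage come from the slide, while the reward of 50 cents and the 6-minute duration are made-up example values. Amounts are kept in cents to avoid floating-point rounding in the totals.

```python
RESERVATION_WAGE_CENTS = 138  # $1.38 per hour, as cited on the slide


def study_cost_cents(n_subjects: int, reward_cents: int, fee_percent: int = 10) -> int:
    """Total cost: rewards plus the service fee (fee rounded up to a full cent)."""
    rewards = n_subjects * reward_cents
    fee = (rewards * fee_percent + 99) // 100  # ceiling division
    return rewards + fee


def hourly_wage_cents(reward_cents: int, minutes: float) -> float:
    """Effective hourly wage implied by a reward and the time a HIT takes."""
    return reward_cents * 60 / minutes


# Hypothetical example: 100 subjects, 50 cents each, 6 minutes per HIT.
total = study_cost_cents(n_subjects=100, reward_cents=50)
wage = hourly_wage_cents(reward_cents=50, minutes=6)
print(f"total cost: ${total / 100:.2f}")
print(f"effective wage: ${wage / 100:.2f}/h")
print("above reservation wage:", wage >= RESERVATION_WAGE_CENTS)
```

If the effective wage falls below the reservation wage and HITs complete too slowly, the reward can be raised step by step, as suggested above.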
18
How to Create a Study?
Synchronous experiments:
- Build a subject panel, e.g., by running small preliminary experiments, or by running a different study and asking participants whether they would like to be notified about future studies
- Notify the panel about upcoming experiments
- Provide a “waiting room”
- If the experiment needs n subjects, use a panel of 3n subjects
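The panel rule and the waiting room can be sketched in a few lines. This is an illustrative toy, not a description of any particular experiment platform: the class name, the worker IDs, and the in-memory list are all assumptions, and a real synchronous study would need a server that tracks who is still connected.

```python
from dataclasses import dataclass, field


def panel_size(n_needed: int, oversample: int = 3) -> int:
    """Rule of thumb from the slide: recruit a panel of 3n for n subjects."""
    return oversample * n_needed


@dataclass
class WaitingRoom:
    """Holds arriving workers until enough are present to start a session."""
    n_needed: int
    waiting: list = field(default_factory=list)

    def arrive(self, worker_id: str):
        """Register a worker; return the group once n_needed are waiting."""
        if worker_id not in self.waiting:  # ignore repeated arrivals
            self.waiting.append(worker_id)
        if len(self.waiting) >= self.n_needed:
            group = self.waiting[: self.n_needed]
            self.waiting = self.waiting[self.n_needed :]
            return group
        return None


print(panel_size(20))
room = WaitingRoom(n_needed=2)
first = room.arrive("worker_a")   # not enough people yet
second = room.arrive("worker_b")  # group is complete
print(first, second)
```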
19
How to Ensure Quality?
Spammers, bots
Solutions:
- Set qualification criteria (e.g., 95% approval rate)
- Verifiable questions (e.g., “What is ?”), attention checks
- Seriousness checks: same amount of effort as other questions, but with a verifiable answer
- Advisable to make clear that workers will not be paid if the check is not answered correctly
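Verifiable questions make it possible to screen submissions automatically before accepting or rejecting work. The sketch below assumes submissions are available as simple dictionaries; the field names, the check question `check_sum`, and its answer are hypothetical and only illustrate the idea.

```python
def passes_checks(answers: dict, answer_key: dict) -> bool:
    """True if every verifiable question was answered as expected."""
    return all(answers.get(question) == expected
               for question, expected in answer_key.items())


def split_submissions(submissions: list, answer_key: dict):
    """Split submissions into (accepted, rejected) lists of worker IDs."""
    accepted, rejected = [], []
    for sub in submissions:
        target = accepted if passes_checks(sub["answers"], answer_key) else rejected
        target.append(sub["worker_id"])
    return accepted, rejected


# Hypothetical check question with a known answer.
answer_key = {"check_sum": "4"}
submissions = [
    {"worker_id": "w1", "answers": {"check_sum": "4", "q1": "agree"}},
    {"worker_id": "w2", "answers": {"check_sum": "7", "q1": "agree"}},
    {"worker_id": "w3", "answers": {"q1": "disagree"}},  # check left blank
]
accepted, rejected = split_submissions(submissions, answer_key)
print("accept:", accepted)
print("reject:", rejected)
```

Treating a blank check as a failure is a design choice in this sketch; whether to reject such work should match what participants were told in advance.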
20
What about Ethics and Privacy?
- Informed consent (purpose of the study, risks and benefits of the research, contact information of the researcher)
- Debriefing (purpose of the experiment, contact details of the researcher)
- Panel without deception
- Compensation: hours and working conditions are wholly determined by workers
- Confidentiality: with a template HIT, Amazon has access to the data
21
Turker Community
Off-site reputation systems:
- Turkopticon
- Turker Nation
Rate requesters based on communicativity, generosity, fairness and promptness
22
Turkopticon
23
Turker Nation
24
Turker Community
Off-site reputation systems:
- Turkopticon
- Turker Nation
Rate requesters based on communicativity, generosity, fairness and promptness
- Requesters should introduce themselves before posting HITs
- Workers’ reactions to a study can provide useful insights into the method
- Keep a professional rapport with workers, as if they were employees
25
Useful Websites
- A Computer Scientist in a Business School (Panos Ipeirotis, Leonard N. Stern School of Business of New York University)
- Mturk tracker
- Experimental Turk: blog reporting evidence concerning the reliability of Amazon Mechanical Turk as an online subject pool for experiments
- Deneme: a blog of experiments on Amazon Mechanical Turk
- MIT’s Center for People, Software and Information
26
Bibliography
Huang, E., Zhang, H., Parkes, D. C., Gajos, K. Z., & Chen, Y. (2010). Toward automatic task design: A progress report. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 77–85). New York: ACM.
Ipeirotis, P. G. (2010). Demographics of Mechanical Turk (Tech. Rep. No. CeDER-10-01). New York: New York University. March.
Marge, M., Banerjee, S., & Rudnicky, A. I. (2010). Using the Amazon Mechanical Turk for transcription of spoken language. In J. Hansen (Ed.), Proceedings of the 2010 IEEE Conference on Acoustics, Speech and Signal Processing (pp. 5270–5273). IEEE.
Mason, W. A., & Watts, D. J. (2009). Financial incentives and the performance of crowds. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 77–85). New York: ACM.
Mason, W., & Suri, S. (2012). Conducting behavioral research on Amazon’s Mechanical Turk. Behavior Research Methods, 44(1), 1–23. doi: /s
Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5, 411–419.
Pontin, J. (2007). Artificial intelligence, with help from the humans. New York Times. March.