Conducting Behavioral Research on Amazon’s Mechanical Turk

Presentation transcript:

Conducting Behavioral Research on Amazon’s Mechanical Turk
Nathalie Popovic
30.05.2015, Konstanz

Behavioral research methods

How-To Guide to MTurk
- What is Amazon’s Mechanical Turk?
- Why Mechanical Turk?
- Who are the Workers?
- How to Become a Requester?
- How to Create a Study?
- How to Ensure Quality?
- What about Ethics and Privacy?
- Turker Communities and Useful Websites

What is Amazon’s Mechanical Turk?
“Amazon’s Mechanical Turk is a crowdsourcing Internet marketplace that enables individuals and businesses (known as Requesters) to coordinate the use of human intelligence to perform tasks that computers are currently unable to do.” (Amazon Mechanical Turk. (2015, June 29). In Wikipedia, the free encyclopedia. Retrieved from https://en.wikipedia.org/w/index.php?title=Amazon_Mechanical_Turk&oldid=669241647)
“A labor market for microtasks” (Huang, Zhang, Parkes, Gajos, & Chen, 2010)


Why Mechanical Turk?
- Stable subject availability
- Large set of people willing to participate in experiments for relatively low payment
- Subject pool diversity (age, ethnicity, socioeconomic status, language, country of origin)
- Low cost and built-in payment mechanism
- Faster theory/experiment cycle
- Validity of worker behavior (e.g., Paolacci et al., 2010)

How does it work?
Requesters post HITs (human intelligence tasks); workers complete them.

Who are the Workers?
- Countries of origin: 100,000 workers from 100 different countries (Pontin, 2007); approximately 50% from the USA and 40% from India (Ipeirotis, 2010)
- Gender: slightly more female (55%)
- Mean age: 32 years (in my study: 37 years)
- Income: the income distribution of US workers resembles that of the general US population but is shifted toward lower income levels (Ipeirotis, 2010)
- Main reasons for participation (Ipeirotis, 2010): money. In India, Mechanical Turk is a primary source of income; in the USA, it is a secondary source of income. Workers also cite entertainment and education.
- Video: http://www.newyorker.com/culture/culture-desk/video-turking-for-respect
Most workers come from the US and India because Amazon allows cash payments only in US dollars and Indian rupees. Country of origin tends to change the reasons workers give for participating in the marketplace. Significantly more workers from India participate on Mechanical Turk because the online marketplace is a primary source of income, while in the US most workers consider Mechanical Turk a secondary source of income. While money is a primary motivation for participating in the marketplace, workers also cite a variety of other reasons, including entertainment and education.

How to become a requester?
Create a Requester account and an Amazon Payments account. What you need:
- E-mail address (advice: use a unique e-mail address for running studies)
- Credit card
- U.S. billing address (create one with International Parcel Services)
Small rate of failed seriousness checks, probably due to the high relevance of the study topic.

How to Create a Study?
Create a HIT:
- Internal HIT (using Amazon templates)
- External HIT (link to study)
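For an external HIT, the link to the study is wrapped in a small XML document. The following is a minimal sketch, not part of the original slides, that builds such a document with the Python standard library only. The function name `build_external_question`, the example URL, and the default frame height are illustrative assumptions; the XML namespace is the ExternalQuestion schema as commonly documented for MTurk and should be checked against the current documentation.

```python
from xml.sax.saxutils import escape

# ExternalQuestion schema namespace (assumption: verify against current MTurk docs).
EXTERNAL_QUESTION_XMLNS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def build_external_question(study_url: str, frame_height: int = 600) -> str:
    """Return ExternalQuestion XML that embeds an externally hosted study.

    study_url    -- link to the study (hypothetical example below)
    frame_height -- height in pixels of the frame shown to the worker
    """
    return (
        f'<ExternalQuestion xmlns="{EXTERNAL_QUESTION_XMLNS}">'
        f"<ExternalURL>{escape(study_url)}</ExternalURL>"
        f"<FrameHeight>{frame_height}</FrameHeight>"
        "</ExternalQuestion>"
    )


# Hypothetical study link; "&" must be escaped inside XML, which escape() handles.
question_xml = build_external_question("https://example.org/my-study?cond=a&lang=en")
print(question_xml)
```

A string like this would typically be passed as the `Question` argument when creating the HIT programmatically (for example with the `create_hit` call of the boto3 MTurk client), instead of using the web interface.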

Internal HIT, survey template 19.05.2015 Seriousness Checks in Internet-Based Research




Qualification of Workers Internal HIT (using Amazon Templates) External HIT (Link to Study)


How to Create a Study?
Payment:
- 10% service fee to Amazon
- Reservation wage: $1.38 per hour (Ipeirotis, 2010)
- Average effective hourly wage of $4.80 for workers (Ipeirotis, 2010)
- Bonus
- Little to no effect of wage on quality of work (Marge et al., 2010; Mason & Watts, 2009)
- Start by paying less than the expected reservation wage, then increase the wage if the rate of completed work is too low
- Offer a lottery to subjects
Completion time:
- Depends on payment, how long the HIT takes, number of HITs posted, type of task, and reputation of the requester
- My study: 100 subjects in 2 hours
Reject or accept work
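The payment figures above can be turned into a quick budget check. The sketch below is illustrative only: the function names, the 5-cent step, and the example numbers are assumptions, and the 10% fee is the figure quoted on the slide, which may differ from Amazon's current fees.

```python
def total_cost_cents(n_subjects: int, reward_cents: int, fee_percent: int = 10) -> int:
    """Total requester cost in cents: worker rewards plus the service fee."""
    rewards = n_subjects * reward_cents
    fee = round(rewards * fee_percent / 100)
    return rewards + fee


def hourly_wage_dollars(reward_cents: int, minutes_per_hit: float) -> float:
    """Effective hourly wage implied by a reward and the time a HIT takes."""
    return reward_cents / 100 * 60 / minutes_per_hit


def next_reward_cents(reward_cents: int, completed: int, target: int, step_cents: int = 5) -> int:
    """Raise the reward by one step if fewer HITs were completed than targeted."""
    return reward_cents + step_cents if completed < target else reward_cents


# Hypothetical study: 100 subjects, 50 cents each, about 10 minutes per HIT.
print(total_cost_cents(100, 50))       # cost in cents, including the fee
print(hourly_wage_dollars(50, 10))     # compare with the $1.38 reservation wage
print(next_reward_cents(50, completed=20, target=40))
```

Comparing the implied hourly wage with the reservation wage and the average effective wage quoted above gives a rough sense of whether a HIT is likely to attract workers before any money is spent.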

How to Create a Study?
Synchronous experiments:
- Build a subject panel, either by running small preliminary experiments or by running a different study and asking participants whether they would like to be notified about future studies
- Notify the panel about upcoming experiments
- Provide a “waiting room”
- If you need an experiment with n subjects, use a panel with 3n subjects
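The panel rule of thumb and the waiting room can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the presenter's implementation: `required_panel_size` and `WaitingRoom` are hypothetical names, and a real waiting room would also need timeouts and a way to handle workers who leave.

```python
def required_panel_size(n_subjects: int, oversampling: int = 3) -> int:
    """Panel size for a synchronous experiment, using the 3n rule of thumb."""
    return oversampling * n_subjects


class WaitingRoom:
    """Collects arriving workers and releases them in groups of a fixed size."""

    def __init__(self, group_size: int):
        self.group_size = group_size
        self.waiting = []

    def arrive(self, worker_id: str):
        """Register a worker; return a full group when one is ready, else None."""
        if worker_id not in self.waiting:
            self.waiting.append(worker_id)
        if len(self.waiting) >= self.group_size:
            group = self.waiting[: self.group_size]
            self.waiting = self.waiting[self.group_size :]
            return group
        return None


# Hypothetical session: an experiment that needs 20 simultaneous subjects.
print(required_panel_size(20))
room = WaitingRoom(group_size=2)
print(room.arrive("worker-a"))  # not enough workers yet
print(room.arrive("worker-b"))  # a pair is released
```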

How to Ensure Quality?
Spammers and bots
Solutions:
- Set qualification criteria (e.g., 95% approval rate)
- Verifiable questions (e.g., “What is 2 + 2?”), attention checks
- Seriousness checks
- Check questions should take the same amount of effort as other questions but have a verifiable answer
- Advisable to make clear that workers will not be paid if these questions are not answered correctly
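The two solutions above, a qualification criterion and verifiable questions, could look roughly as follows in code. This is a sketch under assumptions: the dictionary mirrors the shape of a qualification requirement in the MTurk API, and the ID `000000000000000000L0` is believed to be the system qualification for the percentage of approved assignments, which should be verified against the current MTurk documentation. The response records, question keys, and function names are hypothetical.

```python
# Assumed shape of an MTurk qualification requirement (verify ID before use).
APPROVAL_RATE_REQUIREMENT = {
    "QualificationTypeId": "000000000000000000L0",  # assumed: percent assignments approved
    "Comparator": "GreaterThanOrEqualTo",
    "IntegerValues": [95],
}

# Verifiable questions and their expected answers (hypothetical keys).
ANSWER_KEY = {"check_sum": "4"}  # "What is 2 + 2?"


def passes_checks(response: dict, answer_key: dict) -> bool:
    """True if every verifiable question was answered correctly."""
    return all(
        str(response.get(question, "")).strip() == expected
        for question, expected in answer_key.items()
    )


def split_responses(responses: list, answer_key: dict):
    """Separate responses into those that pass and those that fail the checks."""
    accepted = [r for r in responses if passes_checks(r, answer_key)]
    rejected = [r for r in responses if not passes_checks(r, answer_key)]
    return accepted, rejected


# Hypothetical data: one careful worker, one wrong answer, one skipped check.
responses = [
    {"worker": "w1", "check_sum": "4"},
    {"worker": "w2", "check_sum": "5"},
    {"worker": "w3"},
]
accepted, rejected = split_responses(responses, ANSWER_KEY)
print([r["worker"] for r in accepted], [r["worker"] for r in rejected])
```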

What about Ethics and Privacy?
- Informed consent (purpose of the study, risks and benefits of the research, contact information of researcher)
- Debriefing (purpose of experiment, contact details of researcher)
- Panel without deception
- Compensation: hours and working conditions are wholly determined by workers
- Confidentiality: with a template HIT, Amazon has access to the data

Turker Community
- Off-site reputation systems: Turkopticon, Turker Nation
- Rate requesters based on communicativity, generosity, fairness, and promptness

Turkopticon

Turker Nation

Turker Community
- Requesters should introduce themselves before posting HITs
- Workers’ reactions to a study can provide useful insights into the method
- Keep a professional rapport with workers, as if they were employees

Useful Websites
- A Computer Scientist in a Business School (www.behind-the-enemy-lines.com): Panos Ipeirotis, Leonard N. Stern School of Business, New York University
- Mturk tracker: http://www.mturk-tracker.com/#/general
- Experimental Turk (https://experimentalturk.wordpress.com/): blog reporting evidence concerning the reliability of Amazon Mechanical Turk as an online subject pool for experiments
- Deneme (http://groups.csail.mit.edu/uid/deneme/): a blog of experiments on Amazon Mechanical Turk; MIT’s Center for People, Software and Information

Bibliography
Huang, E., Zhang, H., Parkes, D. C., Gajos, K. Z., & Chen, Y. (2010). Toward automatic task design: A progress report. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 77–85). New York: ACM.
Ipeirotis, P. G. (2010b, March). Demographics of Mechanical Turk (Tech. Rep. No. CeDER-10-01). New York: New York University. Retrieved from http://hdl.handle.net/2451/29585
Marge, M., Banerjee, S., & Rudnicky, A. I. (2010). Using the Amazon Mechanical Turk for transcription of spoken language. In J. Hansen (Ed.), Proceedings of the 2010 IEEE Conference on Acoustics, Speech and Signal Processing (pp. 5270–5273). IEEE.
Mason, W. A., & Watts, D. J. (2009). Financial incentives and the performance of crowds. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 77–85). New York: ACM.
Mason, W., & Suri, S. (2012). Conducting behavioral research on Amazon’s Mechanical Turk. Behavior Research Methods, 44(1), 1–23. doi:10.3758/s13428-011-0124-6
Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5, 411–419.
Pontin, J. (2007, March). Artificial intelligence, with help from the humans. New York Times.