Estimating the Completion Time of Crowdsourced Tasks using Survival Analysis Jing Wang, New York University Siamak Faridani, University of California,

Slides:

Advertisements

Similar presentations

What is a CAT?. Introduction COMPUTER ADAPTIVE TEST + performance task.

Advertisements

Controlling for Time Dependent Confounding Using Marginal Structural Models in the Case of a Continuous Treatment O Wang 1, T McMullan 2 1 Amgen, Thousand.

Topic : Process Management Lecture By: Rupinder Kaur Lecturer IT, SRS Govt. Polytechnic College for Girls,Ludhiana.

Unsupervised Modeling of Twitter Conversations

Statistical Topic Modeling part 1

Anindya Ghose Panos Ipeirotis Arun Sundararajan Stern School of Business New York University Opinion Mining using Econometrics A Case Study on Reputation.

Cox Model With Intermitten and Error-Prone Covariate Observation Yury Gubman PhD thesis in Statistics Supervisors: Prof. David Zucker, Prof. Orly Manor.

Efficient Autoscaling in the Cloud using Predictive Models for Workload Forecasting Roy, N., A. Dubey, and A. Gokhale 4th IEEE International Conference.

1 Unsupervised Learning With Non-ignorable Missing Data Machine Learning Group Talk University of Toronto Monday Oct 4, 2004 Ben Marlin Sam Roweis Rich.

Collaborative Ordinal Regression Shipeng Yu Joint work with Kai Yu, Volker Tresp and Hans-Peter Kriegel University of Munich, Germany Siemens Corporate.

Project 2: ATM’s & Queues

Volatility Chapter 9 Risk Management and Financial Institutions 2e, Chapter 9, Copyright © John C. Hull

Blogging in the Classroom Blogging Assignment and Expectations MSTI 131 Introduction to Educational Technology Fall 2010 Prof. Nichole Heinsler What is.

Survival Analysis for Risk-Ranking of ESP System Performance Teddy Petrou, Rice University August 17, 2005.

Macroeconomics Prof. Juan Gabriel Rodríguez

Cox Proportional Hazards Regression Model Mai Zhou Department of Statistics University of Kentucky.

Blogging Using Schoolwires Blogging Using Schoolwires Gail Dickinson

Connecting With Your Online Students NYSMATYC Math on the Web Themed Session Saturday, April 5, 2008 Mary Beth Orrange

Get Another Label? Using Multiple, Noisy Labelers Joint work with Victor Sheng and Foster Provost Panos Ipeirotis Stern School of Business New York University.

Website Content, Forms and Dynamic Web Pages. Electronic Portfolios Portfolio: – A collection of work that clearly illustrates effort, progress, knowledge,

Designing Ranking Systems for Hotels on Travel Search Engines by Mining User-Generated and Crowd sourced Content Author - Anindya Ghose, Panagiotis G.

Add image. 3 “ Content is NOT king ” today 3 40 analog cable digital cable Internet 100 infinite broadcast Time Number of TV channels.

Single and Multiple Spell Discrete Time Hazards Models with Parametric and Non-Parametric Corrections for Unobserved Heterogeneity David K. Guilkey.

Towards Boosting Video Popularity via Tag Selection Elizeu Santos-Neto, Tatiana Pontes, Jussara Almeida, Matei Ripeanu University of British Columbia -

1. An Overview of the Data Analysis and Probability Standard for School Mathematics? 2.

Dr Laura Bonnett Department of Biostatistics. UNDERSTANDING SURVIVAL ANALYSIS.

Volunteer Angler Data Collection and Methods of Inference Kristen Olson University of Nebraska-Lincoln February 2,

Week 2 Assignment Part 2 Web Resources Katherine Driscoll CSC 1011/31/08.

Optimizing Plurality for Human Intelligence Tasks Luyi Mo University of Hong Kong Joint work with Reynold Cheng, Ben Kao, Xuan Yang, Chenghui Ren, Siyu.

An Analysis of Assessor Behavior in Crowdsourced Preference Judgments Dongqing Zhu and Ben Carterette University of Delaware.

Designing Ranking Systems for Consumer Reviews: The Economic Impact of Customer Sentiment in Electronic Markets Anindya Ghose Panagiotis Ipeirotis Stern.

Time-dependent covariates and further remarks on likelihood construction Presenter Li,Yin Nov. 24.

Search - on the Web and Locally Related directly to Web Search Engines: Part 1 and Part 2. IEEE Computer. June & August 2006.

Finding and Sharing Information in a Site Module 5.

Visit The World’s First Guaranteed Results Marketing Firm Topic: Social Branding.

Various topics Petter Mostad Overview Epidemiology Study types / data types Econometrics Time series data More about sampling –Estimation.

A Comparison of Two MCMC Algorithms for Hierarchical Mixture Models Russell Almond Florida State University College of Education Educational Psychology.

Learning Agenda Emotions & Sales Article Sutton & Rafaeli Understanding the phenomenon Conducting an observational study –qualitative & quantitative info.

Academic procrastination of undergraduates: Low self-efficacy to self-regulate predicts higher levels of procrastination Source: Contemporary Educational.

Lecture 12: Cox Proportional Hazards Model

Statistics: Unlocking the Power of Data Lock 5 Exam 2 Review STAT 101 Dr. Kari Lock Morgan 11/13/12 Review of Chapters 5-9.

Improving Search Results Quality by Customizing Summary Lengths Michael Kaisser ★, Marti Hearst  and John B. Lowe ★ University of Edinburgh,  UC Berkeley,

Estimating Volatilities and Correlations Chapter 21.

LOGO Identifying Opinion Leaders in the Blogosphere Xiaodan Song, Yun Chi, Koji Hino, Belle L. Tseng CIKM 2007 Advisor ： Dr. Koh Jia-Ling Speaker ： Tu.

01/20151 EPI 5344: Survival Analysis in Epidemiology Actuarial and Kaplan-Meier methods February 24, 2015 Dr. N. Birkett, School of Epidemiology, Public.

01/20151 EPI 5344: Survival Analysis in Epidemiology Cox regression: Introduction March 17, 2015 Dr. N. Birkett, School of Epidemiology, Public Health.

Human and Optimal Exploration and Exploitation in Bandit Problems Department of Cognitive Sciences, University of California. A Bayesian analysis of human.

Qualitative and Limited Dependent Variable Models ECON 6002 Econometrics Memorial University of Newfoundland Adapted from Vera Tabakova’s notes.

© 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Estimating Volatilities and Correlations

CREATE, IMPLEMENT AND ENJOY! Blogs,Wikis & RSS Readers.

REGRESSION MODEL FITTING & IDENTIFICATION OF PROGNOSTIC FACTORS BISMA FAROOQI.

EPI 5344: Survival Analysis in Epidemiology Week 6 Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa 03/2016.

Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.

1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 8.1: Cohort sampling for the Cox model.

Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 13: Multiple, Logistic and Proportional Hazards Regression.

Using Blog Properties to Improve Retrieval Gilad Mishne (ICWSM 2007)

Topic Modeling for Short Texts with Auxiliary Word Embeddings

The Life of a MongoDB GitHub Commit

Online Multiscale Dynamic Topic Models

The need for standardization

Jeffrey E. Korte, PhD BMTRY 747: Foundations of Epidemiology II

Michal Rosen-Zvi University of California, Irvine

Process Scheduling B.Ramamurthy 4/11/2019.

Process Scheduling B.Ramamurthy 4/7/2019.

INF 141: Information Retrieval

GhostLink: Latent Network Inference for Influence-aware Recommendation

Kaplan-Meier survival curves and the log rank test

Presentation transcript:

Estimating the Completion Time of Crowdsourced Tasks using Survival Analysis Jing Wang, New York University Siamak Faridani, University of California, Berkeley Panos Ipeirotis, New York Univesity 1

2 Crowdsourcing: Pricing and Time to completion?  Many firms use crowdsourcing for a variety of tasks  Still unclear how to price  Prior results indicate that price does not affect quality (Mason and Watts, 2009)  …but it does affect completion time  Unclear how long it will take for a task to finish

3 Data Set: Mechanical Turk Tracker (  Crawled Amazon Mechanical Turk hourly (now every min)  Captured full market state (content, position, and characteristics of all available HITs).  15 months of data (now >24 months)  165,368 HIT groups  6,701,406 HIT assignments from 9,436 requesters  Value of the HITs: $529,259 [guesstimate ~10% of actual value]  Missing very short tasks (posted and disappeared in <1hr)  Do not observe HIT redundancy

4 Completion Times: Power-laws HIT completion time: Time_last_seen – Time_first_posted

5 Completion Times: Power-laws and Censoring HIT completion time: Time_last_seen – Time_first_posted Censoring Effects Jumps/Outliers: Expiration Different slope: Requesters taking down HITs

6 Parameter estimation  Maximum Likelihood Estimation, controlling for censored data  Power-law parameter α~1.5  Power-laws with α<2 do not have well-defined mean value  Sample average increases as sample size increases

7 Why Power-laws?  Queuing theory model by (Cobham, 1954):  If workers pick tasks from two priority queues, completion time follows power-law with α=1.5  Chilton et al, HCOMP 2010: workers rank either by “most recently posted” or by “most HITs available”  Result: Inherent unpredictability of completion time  Real solution: Amazon should change the interface  But let’s see how other factors affect completion time

8 Survival Analysis  Examine and model the time it takes for events to occur  In our case: Event = HIT gets completed  Survival function S(t):  Probability that tasks will last longer than t  Used stratified Cox Proportional Hazards Model

9 Covariates Examined  HIT Characteristics  Monetary reward  Number of HITs  Length in characters  HIT topic (based on Latent Dirichlet Allocation analysis)  Market Characteristics  Day of the week (when HIT was first posted)  Time of the day (when HIT was first posted)  Requester Characteristics  Activities of requester until time of submission  Existing lifetime of requester

10 Effect of Price: Mostly monotonic  Half-life for $0.025 reward ~ 2 days  Half-life for $1 reward ~ 12 hours h(t) = 1.035^price 40% speedup for 10x price

11 Covariates Examined  HIT Characteristics  Monetary reward  Number of HITs  Length in characters  HIT topic (based on Latent Dirichlet Allocation analysis)  Market Characteristics  Day of the week (when HIT was first posted)  Time of the day (when HIT was first posted)  Requester Characteristics  Activities of requester until time of submission  Existing lifetime of requester

12 Effect of #HITs: Monotonic, but sublinear h(t) = 0.998^#HITs  10 HITs  2% slower than 1 HIT  100 HITs  19% slower than 1 HIT  1000 HITs  87% slower than 1 HIT or, 1 group of 1000  7 times faster than 1000 sequential groups of 1

13 Covariates Examined  HIT Characteristics  Monetary reward  Number of HITs  Length in characters (increases lifetime)  HIT topic (based on Latent Dirichlet Allocation analysis)  Market Characteristics  Day of the week (when HIT was first posted)  Time of the day (when HIT was first posted)  Requester Characteristics  Activities of requester until time of submission  Existing lifetime of requester

14 HIT Topics topic 1 : cw castingwords podcast transcribe english mp3 edit confirm snippet grade topic 2: data collection search image entry listings website review survey opinion topic 3: categorization product video page smartsheet web comment website opinion topic 4: easy quick survey money research fast simple form answers link topic 5: question answer nanonano dinkle article write writing review blog articles topic 6: writing answer article question opinion short advice editing rewriting paul topic 7: transcribe transcription improve retranscribe edit answerly voic answer

15 Effect of Topic: The CastingWords Effect topic 1 : cw castingwords podcast transcribe english mp3 edit confirm snippet grade topic 2: data collection search image entry listings website review survey opinion topic 3: categorization product video page smartsheet web comment website opinion topic 4: easy quick survey money research fast simple form answers link topic 5: question answer nanonano dinkle article write writing review blog articles topic 6: writing answer article question opinion short advice editing rewriting paul topic 7: transcribe transcription improve retranscribe edit answerly voic query question answer

16 Effect of Topic: Surveys=fast (even with redundancy!) topic 1 : cw castingwords podcast transcribe english mp3 edit confirm snippet grade topic 2: data collection search image entry listings website review survey opinion topic 3: categorization product video page smartsheet web comment website opinion topic 4: easy quick survey money research fast simple form answers link topic 5: question answer nanonano dinkle article write writing review blog articles topic 6: writing answer article question opinion short advice editing rewriting paul topic 7: transcribe transcription improve retranscribe edit answerly voic query question answer

17 Effect of Topic: Writing takes time topic 1 : cw castingwords podcast transcribe english mp3 edit confirm snippet grade topic 2: data collection search image entry listings website review survey opinion topic 3: categorization product video page smartsheet web comment website opinion topic 4: easy quick survey money research fast simple form answers link topic 5: question answer nanonano dinkle article write writing review blog articles topic 6: writing answer article question opinion short advice editing rewriting paul topic 7: transcribe transcription improve retranscribe edit answerly voic query question answer

18 Covariates Examined  HIT Characteristics  Monetary reward  Number of HITs  Length in characters (increases lifetime)  HIT topic (based on Latent Dirichlet Allocation analysis)  Market Characteristics: Not affecting  Day of the week (when HIT was first posted)  Time of the day (when HIT was first posted)  Requester Characteristics  Activities of requester until time of submission  Existing lifetime of requester (1yr ~ 50% speedup)

19 Covariates Examined  HIT Characteristics  Monetary reward  Number of HITs  Length in characters (increases lifetime)  HIT topic (based on Latent Dirichlet Allocation analysis)  Market Characteristics: Not affecting  Day of the week (when HIT was first posted)  Time of the day (when HIT was first posted)  Requester Characteristics  Activities of requester until time of submission  Existing lifetime of requester Why? We look at long-running HITs until completion…

20 Covariates Examined  HIT Characteristics  Monetary reward  Number of HITs  Length in characters (increases lifetime)  HIT topic (based on Latent Dirichlet Allocation analysis)  Market Characteristics: Not affecting  Day of the week (when HIT was first posted)  Time of the day (when HIT was first posted)  Requester Characteristics  Activities of requester until time of submission  Existing lifetime of requester (1yr ~ 50% speedup)

21 Conclusions  Completion times for tasks in Amazon Mechanical Turk follow a heavy tail distribution. (Paper studying MicroTasks.com has similar conclusions.)  Sample averages cannot be used to predict the expected completion time of a task.  By fitting a Cox proportional hazards regression model to the data collected from AMT, we showed the effect of various HIT parameters in the completion time of the task  “Base survival function” still a power-law  Still difficult to predict

22 Lessons Learned and Future Work  Current survival analysis too naive:  Ignores many interactions across variables  Need time-dependent covariates (market changes over time)  More frequent crawling does not change the results  Important: Analysis ignores “refilling” of HITs TODO:  Better to model directly the HIT assignment disappearance rate (how many #HITs done per minute)  Use queuing model theories  Use hierarchical version of LDA and dynamic models (#topics and shifts in topics over time)

Any Questions?