Optimizing Plurality for Human Intelligence Tasks Luyi Mo University of Hong Kong Joint work with Reynold Cheng, Ben Kao, Xuan Yang, Chenghui Ren, Siyu.

Slides:

Advertisements

Similar presentations

Answering Approximate Queries over Autonomous Web Databases Xiangfu Meng, Z. M. Ma, and Li Yan College of Information Science and Engineering, Northeastern.

Advertisements

Reinforcement Learning

Problems and Their Classes

Incentivize Crowd Labeling under Budget Constraint

CrowdER - Crowdsourcing Entity Resolution

Probabilistic Skyline Operator over Sliding Windows Wenjie Zhang University of New South Wales & NICTA, Australia Joint work: Xuemin Lin, Ying Zhang, Wei.

Cleaning Uncertain Data with Quality Guarantees Reynold Cheng, Jinchuan Chen, Xike Xie 2008 VLDB Presented by SHAO Yufeng.

Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna

Query Optimization of Frequent Itemset Mining on Multiple Databases Mining on Multiple Databases David Fuhry Department of Computer Science Kent State.

Fast Algorithms For Hierarchical Range Histogram Constructions

TI: An Efficient Indexing Mechanism for Real-Time Search on Tweets Chun Chen 1, Feng Li 2, Beng Chin Ooi 2, and Sai Wu 2 1 Zhejiang University, 2 National.

The Greedy Method1. 2 Outline and Reading The Greedy Method Technique (§5.1) Fractional Knapsack Problem (§5.1.1) Task Scheduling (§5.1.2) Minimum Spanning.

Chapter 5 Fundamental Algorithm Design Techniques.

Iterative Optimization and Simplification of Hierarchical Clusterings Doug Fisher Department of Computer Science, Vanderbilt University Journal of Artificial.

Estimating the Completion Time of Crowdsourced Tasks using Survival Analysis Jing Wang, New York University Siamak Faridani, University of California,

Multi-Task Compressive Sensing with Dirichlet Process Priors Yuting Qi 1, Dehong Liu 1, David Dunson 2, and Lawrence Carin 1 1 Department of Electrical.

All Hands Meeting, 2006 Title: Grid Workflow Scheduling in WOSE (Workflow Optimisation Services for e- Science Applications) Authors: Yash Patel, Andrew.

1 Efficient Subgraph Search over Large Uncertain Graphs Ye Yuan 1, Guoren Wang 1, Haixun Wang 2, Lei Chen 3 1. Northeastern University, China 2. Microsoft.

Static Optimization of Conjunctive Queries with Sliding Windows over Infinite Streams Presented by: Andy Mason and Sheng Zhong Ahmed M.Ayad and Jeffrey.

A Generic Framework for Handling Uncertain Data with Local Correlations Xiang Lian and Lei Chen Department of Computer Science and Engineering The Hong.

Stabbing the Sky: Efficient Skyline Computation over Sliding Windows COMP9314 Lecture Notes.

Content Based Image Clustering and Image Retrieval Using Multiple Instance Learning Using Multiple Instance Learning Xin Chen Advisor: Chengcui Zhang Department.

Integrating Bayesian Networks and Simpson’s Paradox in Data Mining Alex Freitas University of Kent Ken McGarry University of Sunderland.

Crowdsourcing research data UMBC ebiquity,

INFERRING NETWORKS OF DIFFUSION AND INFLUENCE Presented by Alicia Frame Paper by Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Kraus.

Collaborative Ordinal Regression Shipeng Yu Joint work with Kai Yu, Volker Tresp and Hans-Peter Kriegel University of Munich, Germany Siemens Corporate.

The community-search problem and how to plan a successful cocktail party Mauro SozioAris Gionis Max Planck Institute, Germany Yahoo! Research, Barcelona.

Mining Long Sequential Patterns in a Noisy Environment Jiong Yang, Wei Wang, Philip S. Yu, Jiawei Han SIGMOD 2002.

Evaluating Top-k Queries over Web-Accessible Databases Nicolas Bruno Luis Gravano Amélie Marian Columbia University.

Mariam Salloum (YP.com) Xin Luna Dong (Google) Divesh Srivastava (AT&T Research) Vassilis J. Tsotras (UC Riverside) 1 Online Ordering of Overlapping Data.

Graph-based consensus clustering for class discovery from gene expression data Zhiwen Yum, Hau-San Wong and Hongqiang Wang Bioinformatics, 2007.

Towards Boosting Video Popularity via Tag Selection Elizeu Santos-Neto, Tatiana Pontes, Jussara Almeida, Matei Ripeanu University of British Columbia -

Crowd-Augmented Social Aware Search Soudip Roy Chowdhury & Bogdan Cautis.

 1  Outline  stages and topics in simulation  generation of random variates.

Reynold Cheng†, Eric Lo‡, Xuan S

Stochastic Algorithms Some of the fastest known algorithms for certain tasks rely on chance Stochastic/Randomized Algorithms Two common variations – Monte.

Mehdi Kargar Aijun An York University, Toronto, Canada Keyword Search in Graphs: Finding r-cliques.

1 On Querying Historical Evolving Graph Sequences Chenghui Ren $, Eric Lo *, Ben Kao $, Xinjie Zhu $, Reynold Cheng $ $ The University of Hong Kong $ {chren,

Ranking Queries on Uncertain Data: A Probabilistic Threshold Approach Wenjie Zhang, Xuemin Lin The University of New South Wales & NICTA Ming Hua,

Trust-Aware Optimal Crowdsourcing With Budget Constraint Xiangyang Liu 1, He He 2, and John S. Baras 1 1 Institute for Systems Research and Department.

« Pruning Policies for Two-Tiered Inverted Index with Correctness Guarantee » Proceedings of the 30th annual international ACM SIGIR, Amsterdam 2007) A.

ON INCENTIVE-BASED TAGGING Xuan S. Yang, Reynold Cheng, Luyi Mo, Ben Kao, David W. Cheung {xyang2, ckcheng, lymo, kao, The University.

Department of Computer Science City University of Hong Kong Department of Computer Science City University of Hong Kong 1 A Statistics-Based Sensor Selection.

Data-Centric Human Computation Jennifer Widom Stanford University.

Introduction to Algorithms By Mr. Venkatadri. M. Two Phases of Programming A typical programming task can be divided into two phases: Problem solving.

Major objective of this course is: Design and analysis of modern algorithms Different variants Accuracy Efficiency Comparing efficiencies Motivation thinking.

Detecting Dominant Locations from Search Queries Lee Wang, Chuang Wang, Xing Xie, Josh Forman, Yansheng Lu, Wei-Ying Ma, Ying Li SIGIR 2005.

Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.

Mehdi Kargar Aijun An York University, Toronto, Canada Keyword Search in Graphs: Finding r-cliques.

Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.

Mining Document Collections to Facilitate Accurate Approximate Entity Matching Presented By Harshda Vabale.

Decision Trees Binary output – easily extendible to multiple output classes. Takes a set of attributes for a given situation or object and outputs a yes/no.

DeepDive Model Dongfang Xu Ph.D student, School of Information, University of Arizona Dec 13, 2015.

Graph Data Management Lab, School of Computer Science Branch Code: A Labeling Scheme for Efficient Query Answering on Tree

Histograms for Selectivity Estimation, Part II Speaker: Ho Wai Shing Global Optimization of Histograms.

PERSONALIZED DIVERSIFICATION OF SEARCH RESULTS Date: 2013/04/15 Author: David Vallet, Pablo Castells Source: SIGIR’12 Advisor: Dr.Jia-ling, Koh Speaker:

 Effective Multi-Label Active Learning for Text Classification Bishan yang, Juan-Tao Sun, Tengjiao Wang, Zheng Chen KDD’ 09 Supervisor: Koh Jia-Ling Presenter:

Predicting Consensus Ranking in Crowdsourced Setting Xi Chen Mentors: Paul Bennett and Eric Horvitz Collaborator: Kevyn Collins-Thompson Machine Learning.

Spring 2008The Greedy Method1. Spring 2008The Greedy Method2 Outline and Reading The Greedy Method Technique (§5.1) Fractional Knapsack Problem (§5.1.1)

Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,

Definition of the Hidden Markov Model A Seminar Speech Recognition presentation A Seminar Speech Recognition presentation October 24 th 2002 Pieter Bas.

Crowdsourcing High Quality Labels with a Tight Budget Qi Li 1, Fenglong Ma 1, Jing Gao 1, Lu Su 1, Christopher J. Quinn 2 1 SUNY Buffalo; 2 Purdue University.

Outline Introduction State-of-the-art solutions

Moran Feldman The Open University of Israel

Distributed Submodular Maximization in Massive Datasets

Data Integration with Dependent Sources

Objective of This Course

Adaptive entity resolution with human computation

Computational Advertising and

Presentation transcript:

Optimizing Plurality for Human Intelligence Tasks Luyi Mo University of Hong Kong Joint work with Reynold Cheng, Ben Kao, Xuan Yang, Chenghui Ren, Siyu Lei, David Cheung, and Eric Lo

Outline 2  Introduction  Problem Definition & Solution for Multiple Choice Questions  Experiments  Extension to Other HIT Types  Conclusions

Crowdsourcing Systems  Harness human effort to solve problems  Examples:  Amazon Mechanical Turk (AMT), CrowdFlower  HITs: Human Intelligence Tasks  Entity resolution, sort and join, filtering, tagging $$

Plurality of HITs  Imperfect answers from a single worker  make careless mistakes  misinterpret the HIT requirement  Specify sufficient plurality of a HIT (number of workers required to perform that HIT) Combined answer HIT result HIT

Plurality Assignment Problem (PAP)  Plurality has to be limited  A HIT is associated with a cost  Requester has limited budget  Requester requires time to verify HIT results $$ Budget PAP: wisely assign the right pluralities to various HITs to achieve overall high-quality results

Our Goal  Manually assigning pluralities is tedious if not infeasible  AMT on 28 th October, 2012  90,000 HITs submitted by Top-10 requesters  Algorithms for automating the process of plurality assignment are needed!

Related Work (1)  Designing HIT questions  Human-assisted graph search [1]  Entity resolution [2] [3]  Identifying entity with maximum value [4]  Data filtering [5] / data labeling [6] [1] Human-assisted graph search: it’s okay to ask questions. A. Parameswaran et al. VLDB’11 [2] CrowdER: crowdsourcing entity resolution. J. Wang et al. VLDB’12 [3] Question selection for crowd entity resolution. S. Whang et al. Stanford TechReport, 2012 [4] So who won? Dynamic max discovery with the crowd. S. guo et al. SIGMOD’12 [5] Crowdscreen: Algorithms for filtering data with humans. A. Parameswaran et al. SIGMOD’12 [6] Active learning for crowd-sourced databases. B. Mozafari et al. Technical Report, 2012 We consider how to obtain high-quality results for HITs

Related Work (2)  Determining Plurality  Minimum plurality of an multiple choice question (MCQ) to satisfy user-given threshold [7] [8] We study plurality assignment to a set of different HITs  Assigning specific workers to perform specific binary questions [9] [7] CDAS: A crowdsourcing data analytics system. X. Liu et al. VLDB’12 [8] Automan: A platform for integrating human-based and digital computation. D. Barowy et al. OOPSLA’12 [9] Whom to ask? Jury selection for decision making tasks on micro-blog services. C. Cao et al. VLDB’12 We study properties of HITs, and the relationship between HIT cost and quality.

Multiple Choice Questions (MCQs)  Most popular type  AMT on 28 th Oct, 2012  About three quarters of HITs are MCQs  Examples  Sentiment analysis, categorizing objects, assigning rating scores, etc.

Data Model  Set of HITs  For each HIT  contains a single MCQ  plurality (i.e., workers are needed)  cost (i.e., units of reward are given for completing )

Quality Model  Capture the goodness of HIT result’s for  MCQ quality  likelihood that the result is correct after it has been performed by k workers  Factors that affect MCQ quality  plurality: k  Worker’s accuracy (or accuracy): probability that a randomly-chosen worker provides a correct answer for estimated from similar HITs whose true answer is known

Problem Definition  Input  budget  Set of HITs  Output  pluralities for every HITs  Objective  maximize overall average quality

Solutions  Optimal Solution  Dynamic Programming  Not efficient for HIT sets that contain thousands of HITs 60,000 HITs extracted from AMT Execution time: 10 hours

quality increasing rate  Properties of MCQ quality function  Monotonicity  Diminishing return  PAP is approximable for HITs with these two properties  Greedy  Select the “best” HIT and increase its plurality until budget is exhausted  Selection criteria: the one with largest marginal gain  Theoretical approximation ratio = 2 Greedy: 2-approximate algorithm

Grouping techniques  Observations  Many HITs submitted by the same requesters are given the same cost and of very similar nature  Intuition  Group HITs of the same cost and quality function  More or less the same plurality for HITs in one group  Main idea  Select a “representative HIT” from each group  Evaluate its plurality by DP or Greedy  Deduce each HIT’s plurality from the representative HIT

Experiments  Synthetic  Generated based on the extraction of an AMT requester’s HITs information on Oct 28 th, 2012  Statistics 67,075 HITs 12 groups (same cost and accuracy) Costs vary from $0.08 to $0.24 Accuracy of each group is randomly selected from [0.5, 1]

Effectiveness  Competitors  Random: arbitrarily pick a HIT to increase its plurality until budget is exhausted  Even: divide the budget evenly across all HITs 5% 20% 3 per HIT Greedy is close-to- optimal in practice

Performance (1)  DP and Greedy are implemented using grouping techniques  Greedy is efficient! 1,000x 10,000x

Performance (2)  Grouping techniques  20 times faster than non-group solutions 12 groups vs. 60,000 HITs

 Examples  Enumeration Query  Tagging Query  Solution framework  Quality estimator Derive accuracy in MCQ  PA algorithm Greedy for HITs demonstrate monotonicity and diminishing return Other HIT Types

Enumeration Query  Objective  obtain a complete set of distinct elements for a set query  Quality function from [10]  Satisfy monotonicity and diminishing return  Greedy can be applied [10] Crowdsourced Enumeration Queries. MJ Franklin et al. ICDE’13

Tagging Query  Objective  obtain keywords (or tags) that best describe an object  Empirical results from [11]  Power-law relationship with number of workers’ answers [11] On Incentive-based Tagging. X. Yang et al. ICDE’13

Conclusions  Problem of setting plurality for HITs  Develop effective and efficient plurality algorithms for HITs whose quality demonstrates monotonicity and diminishing return  Future work  Study the extensions to support other kinds of HITs (i.e. Enumeration Query and Tagging Query)

Thank you! Contact Info: Luyi Mo University of Hong Kong 24

Results on Real Data  MCQs  Sentiment analysis (positive, neutral, negative) for 100 comments extracted from Youtube videos  25 answers collected per MCQ  Statistics  Similar trends to that of synthetic data in effectiveness and performance analysis

Results on Real Data (2)  Correlation between the MCQ quality and Real quality  Real quality: goodness of answers obtained from workers after the MCQ reached the assigned plurality  Correlation is over 99.8% MCQ quality is a good indicator