Quizz: Targeted Crowdsourcing with a Billion (Potential) Users Panos Ipeirotis, Stern School of Business, New York University Evgeniy Gabrilovich, Google.

Slides:



Advertisements
Similar presentations
Managing Your Staff.
Advertisements

Panos Ipeirotis Stern School of Business
What is economics?.
Topics we will discuss tonight: 1.Introduction to Google Adwords platform 2.Understanding how to text ads are used. Display advertising will not be discussed.
Rewarding Crowdsourced Workers Panos Ipeirotis New York University and Google Joint work with: Jing Wang, Foster Provost, Josh Attenberg, and Victor Sheng;
1 Developing a Predictive Model for Internet Video Quality-of-Experience Athula Balachandran, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica,
Decentralisation without choice and competition: how we can make health systems work better without markets Ian Greener Durham University 1.
Supporting research to make patients, and the NHS, better Delivering commercial cancer trials through the NCRN.
INTRODUCTION TO AP BIOLOGY What is AP Biology  AP Biology is designed to be the equivalent of a University Introductory Biology Course  It.
Science as a Process Chapter 1 Section 2.
Amit Goyal Laks V. S. Lakshmanan RecMax: Exploiting Recommender Systems for Fun and Profit University of British Columbia
Basics of Statistical Estimation
Performance Evaluation Sponsored Search Markets Giovanni Neglia INRIA – EPI Maestro 4 February 2013.
Incentivize Crowd Labeling under Budget Constraint
 Make better decisions Usually business decisions  Build theory Understand the world better.
By Nicole Lopez & Travis Smith Motivating Adult Learners.
Digital Marketing Overview Tpugliese Adapted from Anton Koekemoer | April 2012.
All Things Search Attracting and understanding website visitors.
My Carbon Footprint: What am I going to do about it? Science Project Due Date: May 7 th ????
What is marketing? Marketing is the management process that identifies, anticipates and satisfies customer requirements profitably. Marketing focuses.
Intelligent Online Marketing Local Digital Advertising Specific To Your Business Presented by: Dexter Nelson Live Minder Trend Analysis & Marketing.
Over 60% of the U.S. population is online with over 170 million users in the United States! The Internet is viewed more than the newspaper industry and.
CS246 Search Engine Bias. Junghoo "John" Cho (UCLA Computer Science)2 Motivation “If you are not indexed by Google, you do not exist on the Web” --- news.com.
So what is the game about...  You and your team are going to be exclusively responsible for all the decisions and actions during the gaming period.
Application of CRM (Customer Relationship Management) in Libraries.
StaffCV Recruitment Management Solution A New Paradigm In Recruiting StaffCV Recruitment Management Solution A New Paradigm In Recruiting.
CLICK FRAUD Alexander Tuzhilin By Vinny Rey. Why was the study done? Google was getting sued by advertisers because of click fraud. Google agreed to have.
ENHANCING YOUR MARKETING STRATEGIES WITH ONLINE VIDEO How Video Marketing Works Presentation By:
IMA CIM Overview. IMA Mission “Provide a knowledge-sharing platform for business professionals where proven Internet.
Advantages and disadvantages for the affiliate marketing Done by : Abdallah Abu Afash Mahmoud basal For : Rasha Atallah.
1 Power, Power Curves and Sample Size. 2 Planning Reliable and Efficient Tests For any job, you need a tool that offers the right amount of power for.
Christopher Harris Informatics Program The University of Iowa Workshop on Crowdsourcing for Search and Data Mining (CSDM 2011) Hong Kong, Feb. 9, 2011.
Providing Full Service Technology Support in Self-Service Times Objective: To discuss the challenges, pitfalls and opportunities of the current technology.
Keys to a Successful Landing Page Andrew Eisner Vice President, Client Services.
Anindya Ghose Sha Yang Stern School of Business New York University An Empirical Analysis of Sponsored Search Performance in Search Engine Advertising.
Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers Victor Sheng, Foster Provost, Panos Ipeirotis KDD 2008 New York.
1/20 Remco Chang (Computer Science) Paul Han (Tufts Medical / Maine Medical) Holly Taylor (Psychology) Improving Health Risk Communication: Designing Visualizations.
Utilities, Customers & SMS Rudi Leitner. Who in this room has a mobile phone? Who in this room has ever sent a text (SMS) message?
Optimizing Marketing Spend Through Multi-Source Conversion Attribution David Jenkins.
Search Engine Optimization (SEO) …a brief introduction Billy Howard,
You have a web site. How do you make money from it? Get lots of targeted traffic to it. How do you get this traffic? Marketing Social Media Search.
Evaluating What’s Been Learned. Cross-Validation Foundation is a simple idea – “ holdout ” – holds out a certain amount for testing and uses rest for.
Pay Per Click (PPC) – Advertising Jump Start Your Internet Marketing! IMMEDIATE RESULTS WITH UNMATCHED COST – EFFICIENCY! Not only can you begin generating.
Promotion of e-Commerce sites. A business which uses e- commerce to trade online must also advertise. Several traditional methods can be used, such as.
Using Crowdsourcing to Support Pro-Environmental Community Activism Elaine Massung, David Coyle, Kirsten Cater, Marc Jay, Chris Preist CHI 2013 Session:
Classsourcing: Crowd-Based Validation of Question-Answer Learning Objects Jakub Šimko, Marián Šimko, Mária Bieliková, Jakub Ševcech, Roman Burger
Is Your E-commerce Site Losing Customers? Sharon Taylor.
Motivating Adult Learners Why is understanding what motivates adult learners important? Adults comprise a large proportion of the workforce as well as,
Motivating adult learners can sometimes be a challenge. This module will provide you with information on how to design instructional content that will.
Motivating adult learners can sometimes be a challenge.
Motivating adult learners can sometimes be a challenge. This module will provide you with information on how to design instructional content that will.
© Nous Infosystems Pvt. Ltd. – Confidential Social Engagement for Banks and Financial Services Leveraging 19 years of expertise in global software services.
93% of online activities begin with search billion+ searches conducted worldwide each month. 5 92% of internet users search. 2.
Don’t Leave Your Money on the Table Your Guide to Paid Search Advertising Ron Adelman WSI Internet Marketing Consultant.
KiloBytes Technologies “New Face Of Technology” / Website: SEOwww.kilobytes.inSEO.
GOOGLE TAG MANAGER. INTRODUCTION Google Tag Manager (GTM) is a free solution, introduced in October Google Tag Manager (GTM) is a free solution,
New Hire Packet Automation Factors for Decision Making.
Adventures in Crowdsourcing Panos Ipeirotis Stern School of Business New York University Thanks to: Jing Wang, Marios Kokkodis, Foster Provost, Josh Attenberg,
Bayesian Optimization. Problem Formulation Goal  Discover the X that maximizes Y  Global optimization Active experimentation  We can choose which values.
Using to your Advantage
Comparing Bayesian and Frequentist Inference for Decision-Making
Skype 24/7 Support : BroadNet-SMS Lebanon : /1/2 Dubai : London :
Carla SEO/SEM.
SEO Breanna Sanders.
Google AdWords API. Want a more tailored AdWords Account
Digital Marketing Overview
Nicole Steen-Dutton, ClickDimensions
Training Deck – SEM & Facebook Ads
Digital Marketing Offerings
What is Public Relations? PR vs. Advertising
Presentation transcript:

Quizz: Targeted Crowdsourcing with a Billion (Potential) Users Panos Ipeirotis, Stern School of Business, New York University Evgeniy Gabrilovich, Google Work done while on sabbatical at Google

The sabbatical mission… We have a billion users… leverage their knowledge …

Knowledge Graph: Things not Strings

Still incomplete… Date of birth of Bayes (…uncertain…) Symptom of strep throat Side effects of treximet Who is Cristiano Ronaldo dating When is Jay Z playing in New York What is the customer service number for Google …

The sabbatical mission… Lets create a new crowdsourcing system… We have a billion users… leverage their knowledge …

Ideally…

But often…

The common solution…

Volunteers vs. hired workers Hired workers provide predictability …but have a monetary cost …are motivated by money (spam, misrepresent qualifications…) …extrinsic rewards crowd-out intrinsic motivation …do not always have the knowledge Volunteers cost less (…) …but difficult to predict success

Key Challenge Crowdsource in a predictable manner, with knowledgeable users, without introducing monetary rewards

Quizz

Calibration vs. Collection Calibration questions (known answer): Evaluating user competence on topic at hand Collection questions (unknown answer): Asking questions for things we do not know Trust more answers coming from competent users Tradeoff Learn more about user quality vs. getting answers (technical solution: use a Markov Decision Process)

Model: Markov Decision Process a correct b incorrect c collection V = c*IG[a,b] [a+1,b,c] V = c*IG[a+1,b] [a,b+1,c] V = c*IG[a,b+1] [a,b,c+1] V = (c+1)*IG[a,b] Exploit Explore User drops out User continues User correct User incorrect

Challenges Why would anyone come and play this game? Why would knowledgeable users come? Wouldnt it be simpler to just pay?

Attracting Visitors: Ad Campaigns

Running Ad Campaigns: Objectives We want to attract good users, not just clicks We do not want to think hard about keyword selection, appropriate ad text, etc. We want automation across thousands of topics (from treatment side effects to celebrity dating)

Solution: Treat Quizz as eCommerce Site

Feedback: Value of click

User Value: Information Gain Value of user: total information contributed Information gain is additive: #questions x infogain Information gain for question with n choices, user quality q Random user quality: q=1/n IG(q,n) = 0 Perfect user quality: q=1 IG(q,n) = log(n) Using a Bayesian version to accommodate for uncertainty about q

How to measure quality? Naïve (and unstable) approach: q = correct/total Bayesian approach: q is latent, with uniform prior Then q follows Beta(a,b) distr (a: correct, b:incorrect)

Expected Information Gain

Effect of Ad Targeting (Perhaps it is just more users?) Control: Ad campaign with no feedback, all keywords across quizzes [optimizes for clicks] Treatment: Ad campaign with feedback enabled [optimizes for conversions] Clicks/visitors: Same Conversion rate: 34% vs 13% (~3x more users participated) Number of answers: 2866 vs 279 (~10x more answers submitted) Total Information Gain: 7560 bits vs 610 bits (~11.5x more bits)

Effect of Optimizing for Conversion Value Control: Feedback on conversion event but no value Treatment: Feedback provides information gain per click Clicks/visitors: Same Conversion rate: 39% vs 30% (~30% more users participated) Number of answers: 1683 vs 1183 (~42% more answers submitted) Total Information Gain: 4690 bits vs 2870 bits (~63% more bits)

Example of Targeting: Medical Quizzes Our initial goal was to use medical topics as a evidence that some topics are not crowdsourcable Our hypothesis failed: They were the best performing quizzes… Users coming from sites such as Mayo Clinic, WebMD, … (i.e., pronsumers, not professionals)

Participation is important! Really useful users

Immediate feedback helps most – Knowing the correct answer 10x more important than knowing whether given answer was correct – Conjecture: Users also want to learn TreatmentEffect Show if user answer correct+2.4% Show the correct answer+20.4% Score: % of correct answers+2.3% Score: # of correct answers-2.2% Score: Information gain+4.0% Show statistics for performance of other users+9.8% Leaderboard based on percent correct-4.8% Leaderboard based on total correct answers-1.5%

Showing score is moderately helpful – Be careful what you incentivize though – Total Correct incentivizes quantity, not quality TreatmentEffect Show if user answer correct+2.4% Show the correct answer+20.4% Score: % of correct answers+2.3% Score: # of correct answers-2.2% Score: Information gain+4.0% Show statistics for performance of other users+9.8% Leaderboard based on percent correct-4.8% Leaderboard based on total correct answers-1.5%

Competitiveness (how other users performed) helps significantly TreatmentEffect Show if user answer correct+2.4% Show the correct answer+20.4% Score: % of correct answers+2.3% Score: # of correct answers-2.2% Score: Information gain+4.0% Show statistics for performance of other users+9.8% Leaderboard based on percent correct-4.8% Leaderboard based on total correct answers-1.5%

Leaderboards are tricky! – Initially, strong positive effect – Over time, effect became strongly negative – All-time leaderboards considered harmful TreatmentEffect Show if user answer correct+2.4% Show the correct answer+20.4% Score: % of correct answers+2.3% Score: # of correct answers-2.2% Score: Information gain+4.0% Show statistics for performance of other users+9.8% Leaderboard based on percent correct-4.8% Leaderboard based on total correct answers-1.5%

Cost/Benefit Analysis

Self-selection and participation Low performing users naturally drop out With paid users, monetary incentives keep them Submitted answers % correct

Comparison with paid crowdsourcing Best paid user – 68% quality, 40 answers (~1.5 minutes per question) – Quality-equivalency: 13 99% accuracy, 23 90% accuracy – 5 cents/question, or $3/hr to match advertising cost of unpaid users Knowledgeable users are much faster and more efficient Submitted answers % correct

Citizen Science Applications Google gives $10K/month to nonprofits in ad budget Climate CoLab experiment running – Doubled traffic with only $20/day – Targets political activist groups (not only climate) Additional experiments: Crowdcrafting, ebird, Weendy

Conclusions New way to run crowdsourcing, targeting with ads Engages unpaid users, avoids problems with extrinsic rewards Provides access to expert users, not available labor platforms Experts not always professionals (e.g., Mayo Clinic users) Nonprofits can use Google Ad Grants to attract (for free) participants to citizen science projects