Replicability and Reproducibility in Crowdsourcing & Social Media

Panos Ipeirotis, Stern School of Business, New York University

Two Main Types of Research
- Get data sets from a company
  - Usually contain user behavior information
  - NDAs, IP agreements, and other strings attached
  - Example: Travelocity browsing and purchase data
- Generate data sets through crowdsourcing
  - Get humans by paying (e.g., Mechanical Turk)
  - Get humans through social media, email, etc.
  - Example: Hire humans to participate in quizzes

Replicability with Corporate Data: Hard
- Impossible to share corporate data publicly
- Even if possible, privacy snafus prevent most companies from even allowing public disclosure of their users' data
- Negatives:
  - Unclear how to deal with this issue
  - If replicability is impossible, is it even research?
- Positive:
  - Should stop taking N=1 experiments too seriously
  - Reproducibility is the only way forward

Replicability with Collected Behavioral Data: Easier
- When hiring/engaging humans, possible to share data
  - Need to pass IRB (often easy, sometimes a pain)
  - Anonymization not that hard
- Can simply post data on a web page
  - But this allows only for statistical double-checking
- Can also share code for replicating experimental setting
- Costly to run the experiment again
  - Thousands of dollars needed to pay participants, or to run advertising campaigns on the web
- In reality, we want to reproduce in similar, not identical, settings
  - If a result is not robust to small perturbations, it is not a result
  - It is a waste of resources to try to replicate
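
To make the "anonymization not that hard" point concrete, here is a minimal sketch of one way to replace participant identifiers before posting collected data. It is an illustration under stated assumptions, not the procedure used in the talk: the field name `worker_id`, the function `pseudonymize`, and the sample records are all hypothetical. Note that this only pseudonymizes the identifier column; other fields (free text, timestamps, locations) can still identify people and would need separate review.

```python
import hashlib
import hmac
import secrets


def pseudonymize(records, id_field="worker_id", key=None):
    """Return copies of `records` with `id_field` replaced by a keyed hash.

    `id_field` and the record layout are assumptions for this sketch.
    """
    # A random secret key that is never published stops others from hashing
    # known worker IDs and matching them against the released data.
    key = key or secrets.token_bytes(32)
    cleaned = []
    for record in records:
        digest = hmac.new(key, str(record[id_field]).encode("utf-8"), hashlib.sha256)
        row = dict(record)  # copy, so the raw data stays untouched
        row[id_field] = digest.hexdigest()[:16]
        cleaned.append(row)
    return cleaned


# Hypothetical quiz responses collected from paid participants.
raw = [
    {"worker_id": "A1EXAMPLE", "question": 1, "correct": True},
    {"worker_id": "A1EXAMPLE", "question": 2, "correct": False},
    {"worker_id": "B2EXAMPLE", "question": 1, "correct": True},
]
released = pseudonymize(raw)
```

Within one call the same participant always maps to the same pseudonym, so per-participant statistics can still be double-checked from the released file.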