Human Computation and Crowdsourcing Uichin Lee KSE652 Social Computing Systems Design and Analysis.



The Rise of Crowdsourcing
By Jeff Howe (Wired Magazine, 2006)
Remember outsourcing? Sending jobs to India and China is so 2003. The new pool of cheap labor: everyday people using their spare cycles to create content, solve problems, even do corporate R&D.

The Rise of Crowdsourcing
– The Professional
– The Packager
– The Tinkerer
– The Masses
– And the Age of the Crowd

The Professional
The story of Claudia Menashe:
– A project director at the National Health Museum in Washington, DC
– Putting together a series of interactive kiosks devoted to potential pandemics like the avian flu
– An exhibition designer created a plan for the kiosks; now she wants images to accompany the text
Hire a photographer? Or use pre-existing images (stock photographs)?

The Professional
She ran across a stock photo collection by Mark
Mark: “I’ll give you a discount: how about $100-$150 per photograph? That’s about half of what a corporate client would pay!”
Claudia: “Great! I’ll buy 4 images!! We don’t have much money…”

The Professional
“One dollar!! That’s a steal!!”
iStockphoto: a marketplace for the work of amateur photographers (e.g., homemakers, students, engineers, dancers), with over 20,000 contributors who charge about $1 to $5 per basic image

The Packager
Viral videos: how to repurpose content to make compelling TV on a budget?
Web Junk 20 at VH1: an American television program in which VH1 and iFilm collaborate to highlight the twenty funniest and most interesting clips collected from the Internet that week
Michael Hirschorn (creator of Web Junk 20)
– “I knew we offered something YouTube couldn’t; television. Everyone wants to be on TV”
Next-generation TV: user-generated content
– As user-generated TV matures, the users will become more proficient and the networks better at ferreting out the best of the best
UGC is everywhere, e.g., in education (Khan Academy)

The Tinkerer: The Future of Corporate R&D
InnoCentive
– Launched in 2001 to connect with brainpower outside the company
– Companies pay solvers anywhere from $10,000 to $100,000 per solution
– Jill Panetta (CSO) says more than 30% of the problems are solved!
The odds of a solver’s success increased in fields in which they had no formal expertise
– “The strength of weak ties” (Mark Granovetter)
Similar services:
Ed Melcarek
– On most Saturdays, Melcarek attacks problems that have stumped some of the best corporate scientists at Fortune 100 companies
– “Not bad for a few weeks’ work” (e.g., Colgate problem: $25,000)
P&G’s R&D:
– “We have 9,000 people on our R&D staff and up to 1.5 million researchers working through our external networks”

The Masses
The Turk:
– The first machine capable of beating a human at chess, built around the late 1760s by a Hungarian nobleman named Wolfgang von Kempelen
Amazon’s Mechanical Turk
– Crowdsourcing for the masses (no specific talents required)
– Web-based marketplace that helps companies find people to perform “human intelligence tasks” (HITs) that computers are lousy at
– Examples: identifying items in a photo, skimming real estate documents to find identifying information, writing short product descriptions
– HITs pay from a few cents to a few dollars or more
“Human Intelligence inside”
Our focus: The Masses
– Labor marketplaces, games, ubiquitous sensing, social networking/Q&A

The Age of the Crowd
Distributed computing projects: UC Berkeley’s SETI@home
– Tapping into the unused processing power of millions of individual computers
“Distributed labor networks”
– Using the Internet (and Web 2.0) to exploit the spare processing power of millions of human brains
Successful examples?
– Open source software: a network of passionate, geeky volunteers could write code just as well as highly paid developers at Microsoft or Sun Microsystems
– Wikipedia: creating a sprawling and surprisingly comprehensive online encyclopedia
– eBay, Facebook: can’t exist without the contributions of users

The Age of the Crowd
– The productive potential of millions of plugged-in enthusiasts is attracting the attention of old-line businesses too
– For the last decade or so, companies have been looking overseas for cheap labor
– But now it doesn’t matter where the laborers are, as long as they are connected to the Internet

The Age of the Crowd
– Technological advances in everything (from product design software to digital video cameras) are breaking down the cost barriers that once separated amateurs from professionals
– Crowds (e.g., hobbyists, part-timers, dabblers) now suddenly have a market for their efforts
– Smart companies across industries tap the latent talent of the crowd
– “The labor isn’t always free, but it costs a lot less than paying traditional employees. It’s not outsourcing: it’s crowdsourcing”

Human Computation: A Survey and Taxonomy of a Growing Field Alexander J. Quinn, Benjamin B. Bederson CHI 2011

Human Computation
– Computer scientists (in the artificial intelligence field) have been trying to emulate human-like abilities (e.g., language, visual processing, reasoning) using computers
– Alan Turing wrote in 1950: “The idea behind digital computers may be explained by saying that these machines are intended to carry out any operations which could be done by a human computer.”
– L. von Ahn wrote a doctoral thesis about human computation in 2005
– The field is now thriving: business, art, R&D, HCI, databases, artificial intelligence, etc.

Definition of Human Computation
The term dates back to 1938 in philosophy and psychology literature, and to 1950 in computer science literature (by Turing)
Modern usage is inspired by von Ahn’s 2005 dissertation titled “Human Computation”
– “…a paradigm for utilizing human processing power to solve problems that computers cannot yet solve.”

Definition of Human Computation
– “…the idea of using human effort to perform tasks that computers cannot yet perform, usually in an enjoyable manner.” (Law, von Ahn 2009)
– “…a new research area that studies the process of channeling the vast internet population to perform tasks or provide data towards solving difficult problems that no known efficient computer algorithms can yet solve” (Chandrasekar, et al., 2010)
– “…a technique that makes use of human abilities for computation to solve problems.” (Yuen, Chen, King, 2009)
– “…a technique to let humans solve tasks, which cannot be solved by computers.” (Schall, Truong, Dustdar, 2008)
– “A computational process that involves humans in certain steps…” (Yang, et al., 2008)
– “…systems of computers and large numbers of humans that work together in order to solve problems that could not be solved by either computers or humans alone” (Quinn, Bederson, 2009)
– “…a new area of research that studies how to build systems, such as simple casual games, to collect annotations from human users.” (Law, et al., 2009)

Related Ideas
– Crowdsourcing
– Social computing
– Data mining
– Collective intelligence

Crowdsourcing
“Crowdsourcing is the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call.” (Jeff Howe)
Human computation replaces computers with humans, whereas crowdsourcing replaces traditional human workers with members of the public
– HC: replacement of computers with humans
– CS: replacement of insourced workers with crowdsourced workers
Some crowdsourcing tasks can be considered human computation tasks
– Hiring crowdsourced workers for translation jobs
– Machine translation (fast, but low quality) vs. human translation (slow, high quality)

Social Computing
Definition from Wikipedia:
– “… supporting any sort of social behavior in or through computational systems” (e.g., blogs, email, IM, SNS, wikis, social bookmarking)
– “… supporting computations that are carried out by groups of people” (e.g., collaborative filtering, online auctions, prediction markets, reputation systems)
Some other definitions:
– “… applications and services that facilitate collective action and social interaction online with rich exchange of multimedia information and evolution of aggregate knowledge…” (Parameswaran, Whinston, 2007)
– “… the interplay between persons' social behaviors and their interactions with computing technologies” (Dryer, Eisbach, Ark, 1999)

Data Mining
Data mining is defined broadly as “the application of specific algorithms for extracting patterns from data.” (Fayyad, Piatetsky-Shapiro, Smyth, 1996)
While data mining deals with human-created data, it does not involve human computation
– Google PageRank “only” uses human-created data (links)
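To make the PageRank remark concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production algorithm. The tiny link graph, the damping factor of 0.85, and the function name `pagerank` are illustrative assumptions. The point is that the only input is the set of links people created; no person is asked to do any work at ranking time.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    links maps each page to the list of pages it links to
    (the human-created data). Returns a dict page -> score.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "c".
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

In this toy graph, page "c" collects the most incoming links and ends up with the highest score, while "d", which nobody links to, ends up with the lowest.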

Collective Intelligence
Overarching notion: large groups of loosely organized people can accomplish great things by working together
– Traditional study focused on “decision making capabilities by a large group of people”
Taxonomical “genome” of collective intelligence
– “… groups of individuals doing things collectively that seem intelligent” (Malone, 2009)
Collective intelligence generally encompasses human computation and social computing

Relationship Diagram
[Figure: diagram relating Collective Intelligence, Data Mining, Crowdsourcing, Social Computing, and Human Computation]

Classifying Human Computation
Motivation
– What motivates people to perform HC?
Human skill
– What kinds of human skills do HC tasks require?
Aggregation
– How to combine the results of HC tasks?
Quality control
– How to control the quality of the results of HC tasks?
Processing order of different roles
– Roles (requester, worker, computer)
Task-request cardinality
– Requester vs. worker cardinality

Motivation: Examples
Pay (financial rewards): Mechanical Turk (online labor marketplace), ChaCha (mobile Q&A), LiveOps (a distributed call center)
Altruism (just helping other people for good): helpfindjim.com (Jim Gray), Naver KiN, Yahoo! Answers
Enjoyment (fun): Games With A Purpose (GWAP), e.g., ESP Game, Tag-a-Tune
Reputation (recognition): volunteer translators at childrenslibrary.org, Naver KiN, Yahoo! Answers
Implicit work: reCAPTCHA

Quality Control: Examples
Output agreement: ESP Game (a game for labeling images); an answer is accepted if the pair agree on the same answer
Input agreement: Tag-a-Tune; two humans listen to different inputs (music), describe what they hear, and try to decide whether they are listening to the same music or different music
Economic models: when money is a motivating factor, some economic models can be used to elicit quality answers (e.g., a game-theoretic model of the worker’s rating to reduce the incentive to cheat)
Defensive task design: design tasks so that it is difficult to cheat (e.g., comprehension questions)
Redundancy: each task is given to multiple people to separate the wheat from the chaff
Statistical filtering: filter or aggregate the data in some way that removes the effects of irrelevant work
Multilevel review: one set of workers does the work; a second set reviews the results and rates the quality (e.g., Soylent: find-fix-verify)
Automatic check: fold.it (protein folding game); answers are easy to check by computer, but hard to find
Reputation system: workers are motivated to provide quality answers by a reputation scoring system; Mechanical Turk, Naver KiN, etc.
Expert check: a trusted expert skims or cross-checks results for relevance and apparent accuracy
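The redundancy entry above can be sketched as a simple majority vote over the answers collected for each task. This is only one possible way to combine redundant answers. The function name `majority_vote`, the `min_agreement` threshold, and the sample labels are hypothetical and do not come from any of the systems named in the slides.

```python
from collections import Counter


def majority_vote(answers, min_agreement=0.5):
    """Combine redundant answers for one task.

    Returns (answer, agreement_ratio). The answer is None when no
    single answer is given by more than min_agreement of the workers.
    """
    if not answers:
        return None, 0.0
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    ratio = votes / len(answers)
    if ratio <= min_agreement:
        # Not enough agreement: treat the task as unresolved.
        return None, ratio
    return answer, ratio


# Hypothetical labels: each image was given to three workers.
labels = {
    "img1": ["cat", "cat", "dog"],
    "img2": ["car", "bus", "tree"],
}
results = {task: majority_vote(answers) for task, answers in labels.items()}
```

Here "img1" is accepted as "cat" because two of three workers agree, while "img2" stays unresolved and could be reissued to more workers or passed to an expert check.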

Aggregation: Examples
Collection (to build a knowledge base): artificial intelligence research; building a large DB of common-sense facts (e.g., people can’t brush their hair with a table). Examples: ESP Game, reCAPTCHA, FACTory, Verbosity, etc.
Wisdom of crowds (statistical processing of data): the average guess of normal people can be very close to the actual outcome; e.g., Ask500people, NewsFutures, Iowa Electronic Markets
Search: a large number of volunteers sift through photos or videos, searching for some desired scientific phenomenon, person, or object; e.g., helpfindjim.com, Stardust@home project
Iterative improvement: giving the answers of a previous worker to elicit better answers; e.g., MonoTrans
Active learning: classifier training; selects the samples that could potentially give the best training benefit and sends them for manual annotation
Genetic algorithm (search/optimization): Free Knowledge Exchange, PicBreeder
None (if an independent task is performed): VizWiz (a mobile app that lets a blind user take a photo and ask a question)
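As a rough illustration of the wisdom-of-crowds entry above, the sketch below aggregates independent numeric guesses with the mean and the median. The guesses are made-up numbers and `aggregate_guesses` is a hypothetical helper; the services listed in the slide may aggregate quite differently (for example, through market prices).

```python
import statistics


def aggregate_guesses(guesses):
    """Summarize independent numeric guesses from a crowd."""
    return {
        "mean": statistics.mean(guesses),
        # The median is less sensitive to a few wild guesses.
        "median": statistics.median(guesses),
    }


# Hypothetical guesses for one quantity, including one outlier (1500).
guesses = [950, 1020, 1100, 980, 1500, 1010, 940]
summary = aggregate_guesses(guesses)
```

The single outlier pulls the mean well above the median, which is one reason a statistical filtering step is often applied before or during aggregation.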

Human Skills, Processing Order, Task-Request Cardinality

Human Skills: Examples
Visual recognition: ESP Game
Language understanding: Soylent
Basic human communication: ChaCha

Processing Order: Examples
Computer → Worker (>> Requester): reCAPTCHA
Worker (player) → Requester → Computer (aggregation): ESP Game (image labeling)
Computer → Worker → Requester → Computer: Cyc infers a large number of common-sense facts → FACTory, a GWAP in which workers (players) solve problems; Cyc performs the aggregation
Requester → Worker: Mechanical Turk

Task-Request Cardinality: Examples
One-to-one (one worker to one task): ChaCha
Many-to-many (many workers to many tasks): ESP Game
Many-to-one (many workers to one task): helpfindjim.com (Jim Gray)
Few-to-one (few workers to one task): VizWiz
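One way to work with the six classification dimensions is to record each system as a small structured entry and then query by dimension. The sketch below is only an illustration: the class `HCSystem`, the helper `find`, and the field names are hypothetical, and each entry fills in only the values that the tables above list for that system (unlisted dimensions stay `None`).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class HCSystem:
    """One human computation system, described along the six dimensions."""
    name: str
    motivation: Optional[str] = None
    quality_control: Optional[str] = None
    aggregation: Optional[str] = None
    human_skill: Optional[str] = None
    processing_order: Optional[Tuple[str, ...]] = None
    cardinality: Optional[str] = None


# Values taken from the tables above; missing entries are left as None.
SYSTEMS = [
    HCSystem(
        name="ESP Game",
        motivation="enjoyment",
        quality_control="output agreement",
        aggregation="collection",
        human_skill="visual recognition",
        processing_order=("worker", "requester", "computer"),
        cardinality="many-to-many",
    ),
    HCSystem(
        name="reCAPTCHA",
        motivation="implicit work",
        aggregation="collection",
        processing_order=("computer", "worker"),
    ),
    HCSystem(
        name="ChaCha",
        motivation="pay",
        human_skill="basic human communication",
        cardinality="one-to-one",
    ),
    HCSystem(name="VizWiz", aggregation="none", cardinality="few-to-one"),
]


def find(systems, **criteria):
    """Return the names of systems whose fields match all given criteria."""
    return [
        s.name
        for s in systems
        if all(getattr(s, field) == value for field, value in criteria.items())
    ]
```

For example, `find(SYSTEMS, aggregation="collection")` returns the systems that build a knowledge base, and `find(SYSTEMS, motivation="pay")` returns the paid ones.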

Summary
Definition of human computation and crowdsourcing
Relationship with other related ideas
Classifying human computation and crowdsourcing systems
– Motivation, human skill, aggregation, quality control, processing order, task-request cardinality
– Nature of collaboration, architecture, recruitment, human skill