The Collaborative Organization of Knowledge D. Spinellis and P. Louridas Strong Regularities in Online Peer Production D. Wilkinson Ziyad Aljarboua Monday,

Presentation transcript:

The Collaborative Organization of Knowledge D. Spinellis and P. Louridas Strong Regularities in Online Peer Production D. Wilkinson Ziyad Aljarboua Monday, November 10, Harvard University

Intro - Wikipedia: Free multilingual encyclopedia launched in 2001. Operated by the non-profit Wikimedia Foundation. Contains 2,610,291 articles in English and 10 million in total. 236 active language editions. Content written by volunteers. 2

Intro - Wikipedia: Developed by Jimmy Wales and Larry Sanger. Wales was named in Time’s 2006 list of the world’s most influential people. “Largest and most popular general reference work on the internet” (Wikipedia). Source: Wikipedia 3

Intro - Wikipedia: No formal peer review; changes take effect immediately. New articles are created by registered users but can be edited by anyone. Redistribution, creation of derivative works, and commercial use of content are permitted. 25,000 to 60,000 page requests per second. 50% of traffic to Wikipedia comes from Google. 4

Intro - Wikipedia Wikipedia contributors by country Source: Wikipedia 5

Intro - Wikipedia Article Count from Jan, 2001 to Sep 2007 Source: Wikipedia 6

Wikipedia - Concerns: Michael Scott from The Office: "Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you are getting the best possible information." Quality of articles undermined. Bias: content reflects contributors’ interests. 7

Wikipedia - vandalism 8

The Collaborative Organization of Knowledge: Attempts to study Wikipedia’s growth: how human knowledge is recorded and organized through an open collaborative process (in Wikipedia). Examines the relationship between existing articles and referenced nonexistent articles. How do existing entries foster the development of new entries? 9

The Collaborative Organization of Knowledge: Examines the recorded evolutionary development of Wikipedia's structure through article revisions and contributions. Motivation: Wikipedia’s coverage has not declined while its scope sharply increased. 10

Growth: Technologies and open participation policy behind rapid growth – Edit with no prior authorization – Edit history for all pages – Watchlist that alerts users to changes in their selected pages – Ability to revert changes if a page is vandalized – Ability to lock entries against revisions – Ease of linking to other articles – Categorizing articles using markup tags 11

The Study: Processed all material on Wikipedia as of February 2006 (485 GB of XML documents). Examined all recorded changes (28.2 million revisions on 1.9 million pages) and how entries were created and linked. 12

General findings: Reverting is returning a page to a previous version, most of the time to undo vandalism. 4% of article revisions were reverts. Average time to revert a vandalized page is 13 hours. 11% of pages that were reverted at least once had been vandalized at least once. Most reverted and revised: George W. Bush, with 28,000 revisions (2*9,300 reverts and vandalism). 2,441 entries (0.13%) locked. 20% of articles were stubs. 13

Conclusion 1: Creation of new Wikipedia entries is not a random process but is related to the references to nonexistent articles. “what drives Wikipedia growth is the inclusion of red links, ie references to articles that do not exist yet.” Wikipedia 14

Conclusion 1 15

Conclusion 1: Mean number of references to a nonexistent article rose exponentially until the article was created. Once the article is created, the mean rises linearly or levels off. 16

Inflationary/deflationary hypothesis: Inflationary hypothesis: the number of links to nonexistent articles increases at a higher rate than new article creation. Wikipedia sits at a midpoint between the two scenarios (thin coverage vs. decline in growth rate). 17

Wikipedia growth *Incomplete includes nonexistent articles and stubs 18

Wikipedia growth: Between 2003 and 2006, the number of entries increased from 140,000 to 1.4 million, and the ratio of complete to incomplete articles remained roughly the same. Growth of Wikipedia is partly attributed to splitting of articles (depth in articles translates into breadth). Rate of article creation vs. rate of knowledge expansion? 19

Wikipedia content: The process of adding new articles, which depends on currently referenced nonexistent articles, leads to content balance. Articles are more likely to be written because they are popular (have many references leading to them) than because a contributor is interested. Won't most references originating from an article link to articles similar in subject? (Assumes knowledge is a fully connected graph.) 20

Finding 1: The process of referencing a nonexistent article and subsequently defining that article appears to be a collaborative effort. The person who referenced a nonexistent article and the person who started the referenced article were the same in only 3% of the cases. Wikipedia growth is limited by the number of contributors, not by individual contributors! 21

Conclusion 2 Wikipedia is a scale-free network 22

Scale-Free Network: Degree of a node = number of connections to other nodes. Degree distribution: probability distribution of degrees over the entire network. For degree j: P(j) = (# nodes with degree j) / (# nodes), i.e. the fraction of all nodes that have degree j. 23
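The definition of P(j) above can be made concrete with a small sketch. The function name, the toy edge list, and the choice to treat links as undirected are all assumptions for illustration; pages with no links at all are simply not counted here.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {degree j: fraction of nodes with degree j} for an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n_nodes = len(degree)
    # Count how many nodes share each degree value, then normalize.
    counts = Counter(degree.values())
    return {j: c / n_nodes for j, c in sorted(counts.items())}

# Hypothetical toy graph: page "A" links to three pages, plus one extra link.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
dist = degree_distribution(toy_edges)
print(dist)
```

In the toy graph, A has degree 3, B and C have degree 2, and D has degree 1, so the fractions sum to 1 over the four nodes.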

Scale-Free Network: A network whose degree distribution follows a power law, i.e. the degree distribution approaches 1/j^s as j increases. The fraction of nodes with degree j decreases as j (the number of connections) increases. 24
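One rough way to read off the exponent s in 1/j^s is the slope of log P(j) against log j. The sketch below fits that slope by least squares on a synthetic distribution; the value s = 2.5 and the helper name are assumptions, and a straight-line fit on log-log axes is only a crude estimator for real data (maximum-likelihood methods are generally preferred).

```python
import math

def fit_power_law_exponent(dist):
    """Least-squares slope of log P(j) against log j; returns the estimated s."""
    xs = [math.log(j) for j in dist]
    ys = [math.log(p) for p in dist.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var

# Synthetic distribution with an assumed exponent, normalized to sum to 1.
s_true = 2.5
weights = {j: j ** -s_true for j in range(1, 101)}
total = sum(weights.values())
synthetic = {j: w / total for j, w in weights.items()}

s_est = fit_power_law_exponent(synthetic)
print(round(s_est, 3))
```

Because the synthetic data is an exact power law, the fit recovers the assumed exponent up to floating-point error; measured degree distributions are noisy in the tail and will not be this clean.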

Scale-Free network 25 Source: Wikipedia

Building the network: Models explaining why Wikipedia is scale-free: – Power laws as the result of an optimization process – Power laws as the result of a growth model (preferential attachment model). [Equations not preserved in transcript: simple network; Wikipedia; expected # of references] 26
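A minimal sketch of a preferential attachment growth model follows. It is a generic textbook-style version (each new node adds one link to an existing node chosen in proportion to its degree), not necessarily the exact model used in the paper; the function name, the one-link-per-node rule, and the network size are assumptions.

```python
import random
from collections import Counter

def grow_network(n_nodes, seed=0):
    """Grow a network in which each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    # Start with two nodes joined by one link. Each node appears in this
    # list once per link end, so a uniform pick is degree-proportional.
    endpoints = [0, 1]
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        endpoints.extend([new_node, target])
    return Counter(endpoints)  # node -> degree

degrees = grow_network(10_000)
share_degree_one = sum(1 for d in degrees.values() if d == 1) / len(degrees)
print(max(degrees.values()), round(share_degree_one, 2))
```

The run should show a few heavily linked hubs alongside a majority of nodes with a single link, which is the qualitative signature of a scale-free network.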

Building the network 27

Strong Regularities in Online Peer Production D. Wilkinson 28

Introduction: Open source software development, blogs, wikis, social networks… Some of the most visited websites … and they continue to grow. Do online peer production systems share common macroscopic properties? 29

Objective: Describe strong macroscopic regularities in people’s contributions to peer production systems (PPS): the distribution of user participation and activity per topic. Examine the basic dynamical rules guiding the evolution of PPS. Why is the distribution of levels of user participation a power law? Not a psychological analysis of contributors. 30

Methodology: Examines 4 different PPS: Wikipedia, Bugzilla, Digg, Essembly. Data analyzed are exhaustive: they involve all users and contributions.
System     Time span   Users    Topics   Contributions
Wikipedia  6y, 10m     5.07M    1.5M     50M
Bugzilla   6y, 7m      111K     357K     3.08M
Digg       3y          1.05M    3.57M    105M
Essembly   1y, 4m      12.04K   24.9K    1.31M
31

PPSs – Wikipedia – Essembly: social network for individuals to discuss and vote on political matters and organize to take action – Bugzilla: bug-tracking system where developers report and collaborate to fix bugs – Digg: news aggregator 32

User Participation: Power law distribution: a few dedicated members account for most activity. Focus on inactive users (generality). % inactive: – Wikipedia: 71% of editors – Bugzilla: 95% of commenters – Digg: 61% of voters; 56% of submitters – Essembly: 83% of voters; 53% of submitters. Inactivity threshold: – Digg & Essembly: 3 months – Wikipedia & Bugzilla: 6 months 33

User Participation [Figure panels: Essembly votes, Digg votes, Essembly resolves, Bugzilla comments, Wikipedia edits, Digg submissions] 34

User Contributions: The power law exponent is strongly related to the system’s barrier to contribution (cost of contributing). Both active and inactive users have distributions of contributions that follow a power law. 35

Participation Momentum: When do people stop participating? There is momentum associated with a user’s participation: the probability of stopping is inversely proportional to the number of contributions already made. 36
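The link between this stopping rule and a power law can be checked numerically. The sketch below assumes the probability of stopping after the i-th contribution is alpha / i, with alpha = 0.5 chosen arbitrarily; the function name and the constant are illustrative, not taken from the paper. Under that assumption, the probability of making at least k contributions is the product of (1 - alpha / i) for i from 1 to k - 1, which falls off roughly as k^(-alpha).

```python
import math

def survival(k_max, alpha):
    """surv[k] = probability that a user makes at least k contributions,
    when the chance of stopping after the i-th contribution is alpha / i."""
    surv = [None, 1.0]  # index 0 unused; every counted user made 1 contribution
    for i in range(1, k_max):
        surv.append(surv[-1] * (1 - alpha / i))
    return surv

alpha = 0.5  # assumed constant of proportionality (must be at most 1)
surv = survival(10_000, alpha)

# Slope on log-log axes between k = 1,000 and k = 10,000.
slope = (math.log(surv[10_000]) - math.log(surv[1_000])) / (
    math.log(10_000) - math.log(1_000)
)
print(round(slope, 3))
```

The slope should sit very close to -alpha, so the count of users with exactly k contributions falls off roughly as k^(-(alpha + 1)): a straight line on log-log axes, as in a power law.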

Participation Momentum 37

Exponent Significance: Probability to contribute proportional to contribution cost (exponent). Power law exponent reflects the cost to make a contribution. 38

User Participation: The distribution of contribution counts over all users (active + inactive) also follows a power law, but with a smaller exponent. [Figure panels: % inactive users; all users; inactive users] 39

Activity per topic: # contributions per topic (e.g. # edits per article). Popular topics attract more users → more edits. Results: – Distribution of contributions per topic is lognormal – Lognormal mean and variance depend linearly on time for topics where novelty decay is not a factor – Contributions to a topic increase its visibility and popularity. 40

Activity per Topic: Contributions → popularity → more contributions (multiplicative reinforcement mechanism). [Figure panels: Wikipedia, Essembly, Digg] 41

Activity per Topic [Figure axis labels: number of articles; log(number of edits); number of resolves; log(number of votes)] 42

Activity per Topic: Variance and mean of the lognormal depend linearly on the age (t) of the topic. 43
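A small simulation can illustrate why multiplicative reinforcement gives a lognormal whose mean and variance grow linearly with topic age. This is a sketch under stated assumptions, not the paper's fitted model: each topic's contribution count is multiplied at every time step by (1 + growth), with growth drawn uniformly from 0 to 0.2, and all names and parameters are hypothetical. The log of the count is then a sum of independent terms, so its mean and variance both scale with the number of steps.

```python
import math
import random
import statistics

def simulate_log_counts(n_topics, t_steps, seed=0):
    """Return log(contribution count) for each topic after t_steps of
    multiplicative growth with a random rate at every step."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n_topics):
        log_n = 0.0  # every topic starts with one contribution
        for _ in range(t_steps):
            growth = rng.uniform(0.0, 0.2)  # assumed random growth rate
            log_n += math.log(1 + growth)
        logs.append(log_n)
    return logs

early = simulate_log_counts(2000, 50, seed=1)   # topics of age t = 50
late = simulate_log_counts(2000, 100, seed=2)   # topics of age t = 100

mean_ratio = statistics.mean(late) / statistics.mean(early)
var_ratio = statistics.variance(late) / statistics.variance(early)
print(round(mean_ratio, 2), round(var_ratio, 2))
```

Doubling the age should roughly double both the mean and the variance of the log counts, so both ratios should come out near 2, with more sampling noise in the variance.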

Popularity factor – interface design: Digg vs. Essembly vs. Wikipedia. A small number of topics attract the vast majority of contributions (long-tail log dist. plots). 44

Discussion: How does the size of a group coactively working together affect the results? 45

Sources: Wikipedia; D. Spinellis and P. Louridas, “The Collaborative Organization of Knowledge”; D. Wilkinson, “Strong Regularities in Online Peer Production”. 46