Wikipedia Network Analysis: Commonality detection among Wikipedia authors Deepthi Sajja
How many of you have ever edited a Wikipedia article? How many of you have analyzed or read about the Wikipedia network?
Introduction: Evolution of Wikipedia. Wikipedia began in 2001 as a complementary project for Nupedia, whose articles were written by experts and reviewed under a formal process. Wikipedia's goal was to make a publicly editable encyclopedia.
Content distribution of Wikipedia
Growth of Wikipedia: As of October 25, 2016, there are 5,269,891 articles.
Why Do People Write for Wikipedia? Interviews were conducted with 22 volunteer encyclopedia writers in the fall of 2004 and spring of 2005. The volunteers included people who spent up to 30 hours a week on Wikipedia.
Motivation behind the contributors: Like scientists, contributors to Wikipedia seek to collaboratively identify and publish true facts about the world.
Motivation behind the contributors: credibility. Wikipedia has indirect attribution of authorship. Most articles have been edited numerous times by numerous people, so explicit attribution would seem to be impossible.
Inequality of Contributions: A great number of authors make few contributions. A small group of authors contributes to a large number of articles, while the rest contribute to one or two articles and mostly edit existing articles.
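One simple way to put a number on this inequality is the share of all edits made by the most active fraction of authors. The sketch below is only an illustration: the function name `top_share`, the flat list of author names (one entry per edit), and the 10% default cutoff are all assumptions, not part of the original analysis.

```python
from collections import Counter

def top_share(edit_authors, top_fraction=0.1):
    """Fraction of all edits made by the most active `top_fraction` of authors.

    `edit_authors` is a hypothetical input: one author name per edit.
    """
    # Edits per author, largest first.
    counts = sorted(Counter(edit_authors).values(), reverse=True)
    if not counts:
        return 0.0
    # Always include at least one author in the "top" group.
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / sum(counts)

# Toy data (made up): one heavy contributor and two occasional ones.
edits = ["a"] * 8 + ["b", "c"]
print(top_share(edits, top_fraction=0.34))
```

A value close to 1 would mean that a small core of authors accounts for nearly all edits.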
Slowing Growth of Wikipedia: Until 2007, the growth of Wikipedia in content and editors was characterized as fundamentally exponential in nature.
Active editors analysis
Why network structure matters: Disputed vs. undisputed articles. We need to look at structural features of the network rather than just at the articles' attribute measures. Edit histories and reverts vary with the degree of bipolarity.
Objective: To analyze the future growth of the Wikipedia network, centered on individual articles and their contributors rather than the whole network.
Why previous works are not so reliable: Most of the analysis was done between 2004 and 2008. The Wikipedia network is different from typical social networks. Growth is purely dependent on contributors. There is a potential risk of core authors exhausting their contributions.
Features considered: Number of contributors editing a Wikipedia article. Rather than the number of edits across the whole network and the contributors' active history, I consider contributors per article over a one-year span, based on the latest data dumps, and go back up to five years for comparison purposes.
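The contributors-per-article feature could be computed roughly as follows. This is a minimal sketch under stated assumptions: revisions are taken to be already extracted from a dump as `(article, author, timestamp)` tuples, timestamps are assumed to be ISO 8601 strings such as `2016-10-25T12:00:00Z`, and the one-year span is approximated by the calendar year. The function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def contributors_per_article_by_year(revisions):
    """Map (article, year) -> number of distinct contributors.

    `revisions` is an iterable of (article, author, timestamp) tuples;
    this record layout is an assumption for illustration.
    """
    authors = defaultdict(set)
    for article, author, timestamp in revisions:
        year = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").year
        # A set keeps each contributor once, however many edits they made.
        authors[(article, year)].add(author)
    return {key: len(names) for key, names in authors.items()}

# Toy data (made up).
revisions = [
    ("A", "u1", "2016-01-01T00:00:00Z"),
    ("A", "u2", "2016-05-01T00:00:00Z"),
    ("A", "u1", "2016-06-01T00:00:00Z"),
    ("A", "u3", "2015-03-01T00:00:00Z"),
]
print(contributors_per_article_by_year(revisions))
```

Comparing the counts for the same article across the last five years then gives the per-article trend.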
Features considered (contd): Key authors of the article. Previous attempts ranked authors based on their global contributions to Wikipedia articles. Here, identify the key authors of an individual article based on its edit and revert history and the level of information presented. Find out how frequently the article creator becomes the key author.
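As a rough illustration of the key-author idea, the sketch below scores each author by the bytes they added in edits that were not reverted, and then checks how often the article creator comes out on top. The scoring rule is an assumption chosen for simplicity, not the method of this work; the `(author, bytes_added, was_reverted)` record layout and both function names are hypothetical.

```python
from collections import defaultdict

def key_author(revisions):
    """Return the author with the largest surviving contribution, or None.

    `revisions` is a chronological list of (author, bytes_added, was_reverted)
    tuples for one article (hypothetical layout).
    """
    score = defaultdict(int)
    for author, bytes_added, was_reverted in revisions:
        if not was_reverted:
            # Count only added content; deletions do not lower the score.
            score[author] += max(bytes_added, 0)
    return max(score, key=score.get) if score else None

def creator_key_author_rate(articles):
    """Fraction of articles whose creator (first revision) is the key author."""
    if not articles:
        return 0.0
    hits = sum(
        1 for revs in articles.values()
        if revs and revs[0][0] == key_author(revs)
    )
    return hits / len(articles)

# Toy data (made up): the creator stays key author of "A" but not of "B".
articles = {
    "A": [("u1", 100, False), ("u2", 500, True), ("u2", 50, False)],
    "B": [("u3", 10, False), ("u4", 200, False)],
}
print(creator_key_author_rate(articles))
```

Any other per-edit measure of information level could be substituted for `bytes_added` without changing the structure.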
Features considered (contd): Frequency with which sets of authors contribute to common Wikipedia articles.
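The commonality feature can be sketched by counting, for every set of authors, the number of articles they have all contributed to. The example below defaults to pairs (`size=2`), which is an assumption; the function name and the article-to-authors mapping are hypothetical, and enumerating all sets this way would only be practical for small author groups per article.

```python
from collections import Counter
from itertools import combinations

def coauthor_set_counts(article_authors, size=2):
    """Count how many articles each author set of the given size shares.

    `article_authors` maps an article title to the authors who edited it
    (hypothetical input).
    """
    counts = Counter()
    for authors in article_authors.values():
        # Deduplicate and sort so each set has one canonical tuple form.
        for group in combinations(sorted(set(authors)), size):
            counts[group] += 1
    return counts

# Toy data (made up).
article_authors = {
    "A": ["u1", "u2", "u3"],
    "B": ["u2", "u1"],
    "C": ["u3"],
}
print(coauthor_set_counts(article_authors).most_common(1))
```

Author sets with high counts are the candidates for commonality among contributors.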
Questions?