Wikipedia Network Analysis: Commonality detection among Wikipedia authors Deepthi Sajja.

1 Wikipedia Network Analysis: Commonality detection among Wikipedia authors
Deepthi Sajja

2 Objective Determine the number of contributors editing a Wikipedia article.
Identify the key authors of the article.

3 Objective Identify the key authors of the article.
Compare the growth rate with that of Wikipedias in other languages.

4 Data Extraction Wikimedia database dumps: all pages with complete edit history (several terabytes for English alone). Another way: export a small set of pages using Wikipedia's Export page.

5 Computing Edit Network
WikiEvent tool: WikiEvent is used to extract the revisions in chunks. WikiEvent input: page history files in XML format.

6 Edit types: add, delete, restore, undelete. Target: page or user.
Consider an example of three revisions on one page: in Revision 1, user Alice adds some new text to the page; in Revision 2, user Bob deletes this text; in Revision 3, user Charlie reverts Bob's edit, setting the page text back to the one submitted in Revision 1.

7 WikiEvent Output
PageTitle;RevisionID;Time(calendar);Time(milliseconds);InteractionType;WordCount;ActiveUser;Target
"Social network analysis"; ; T21:08:52Z; ;added;196;" ";"Social network analysis"
"Social network analysis"; ; T06:13:44Z; ;added;10;" ";"Social network analysis"
"Social network analysis"; ; T06:13:44Z; ;deleted;192;" ";" "
"Social network analysis"; ; T22:42:43Z; ;added;54;"Davodd";"Social network analysis"
"Social network analysis"; ; T22:42:43Z; ;deleted;7;"Davodd";" "
"Social network analysis"; ; T13:29:11Z; ;added;1;" ";"Social network analysis"
"Social network analysis"; ; T13:29:11Z; ;deleted;1;" ";"Davodd
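Output in this layout can be read with a few lines of Python. The following is a minimal sketch, assuming the file is semicolon-separated with double-quoted strings as in the sample above. The field names in FIELDS, the function read_events, and the sample rows (revision IDs, timestamps, user names) are invented for illustration and are not taken from the slides.

```python
import csv
import io

# Column names follow the header shown on the slide; the Python-side
# names are our own choice.
FIELDS = ["PageTitle", "RevisionID", "TimeCalendar", "TimeMillis",
          "InteractionType", "WordCount", "ActiveUser", "Target"]

# Hypothetical sample rows; IDs, timestamps and users are made up.
SAMPLE = (
    'PageTitle;RevisionID;Time(calendar);Time(milliseconds);'
    'InteractionType;WordCount;ActiveUser;Target\n'
    '"Social network analysis";1001;2004-01-01T21:08:52Z;1072991332000;'
    'added;196;"Alice";"Social network analysis"\n'
    '"Social network analysis";1002;2004-01-02T06:13:44Z;1073024024000;'
    'deleted;192;"Bob";"Alice"\n'
)


def read_events(stream):
    """Parse semicolon-separated edit events into a list of dicts."""
    reader = csv.reader(stream, delimiter=";", quotechar='"')
    events = []
    for row in reader:
        if not row or row[0] == "PageTitle":
            continue  # skip empty lines and the header row
        if len(row) != len(FIELDS):
            continue  # skip truncated or malformed rows
        event = dict(zip(FIELDS, row))
        event["WordCount"] = int(event["WordCount"])
        events.append(event)
    return events


events = read_events(io.StringIO(SAMPLE))
for e in events:
    print(e["ActiveUser"], e["InteractionType"], e["WordCount"], "->", e["Target"])
```

For a real export, io.StringIO(SAMPLE) would be replaced by an open file handle.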

8 From this output we can calculate the number of edits each contributor performed on a single article.
We can filter the data by selecting the specific interaction types we need in the CSV file.
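As a rough illustration of both steps, the sketch below counts revisions per contributor and filters events by interaction type. It assumes that one revision can produce several events (for example an added and a deleted event with the same revision ID), so edits are counted as distinct revisions. The event dictionaries, user names, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical events; a single revision may yield more than one event.
events = [
    {"RevisionID": "1", "InteractionType": "added", "WordCount": 196, "ActiveUser": "Alice"},
    {"RevisionID": "2", "InteractionType": "added", "WordCount": 10, "ActiveUser": "Bob"},
    {"RevisionID": "2", "InteractionType": "deleted", "WordCount": 192, "ActiveUser": "Bob"},
    {"RevisionID": "3", "InteractionType": "added", "WordCount": 54, "ActiveUser": "Alice"},
]


def filter_by_type(events, interaction_type):
    """Keep only events of one interaction type, e.g. 'added'."""
    return [e for e in events if e["InteractionType"] == interaction_type]


def edits_per_user(events):
    """Count distinct revisions per user, so multi-event revisions count once."""
    seen = {(e["ActiveUser"], e["RevisionID"]) for e in events}
    return Counter(user for user, _ in seen)


def words_per_user(events):
    """Sum the word counts of the given events per user."""
    totals = Counter()
    for e in events:
        totals[e["ActiveUser"]] += e["WordCount"]
    return totals


edit_counts = edits_per_user(events)
added_words = words_per_user(filter_by_type(events, "added"))
print(dict(edit_counts))
print(dict(added_words))
```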

9 Importing network to Visone
Visone: used for the analysis and visualization of complex networks. The CSV file with the computed edit events can be imported into visone.

10 Data Filtering

11 Event iterator

12 Event network Specify the link attributes:
how events of various types add to these attributes, and how the attributes change over time. This is required to compute the evolution of the event network.


14 Attributes: added, deleted, restored, undeleted
Specify a halftime, which defines how fast attributes decay over time. Useful for network snapshots over a time slot.

15 A halftime that is zero or negative indicates that the respective attribute does not decay over time. The attributes added and deleted have no decay; their weights simply add up. An attribute such as recently added may decay.
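One way to read the halftime rule is as exponential decay: each event's contribution halves every halftime time units, and a zero or negative halftime switches decay off. The slides do not give visone's exact formula, so the sketch below is an assumption about the general idea, and the function name decayed_weight is our own.

```python
def decayed_weight(events, now, halftime):
    """Sum event weights as seen at time `now`.

    events   -- list of (time, weight) pairs
    halftime -- time after which a contribution has halved;
                zero or negative means no decay (weights just add up)
    """
    total = 0.0
    for t, w in events:
        if t > now:
            continue  # the event has not happened yet at this snapshot
        if halftime <= 0:
            total += w
        else:
            total += w * 0.5 ** ((now - t) / halftime)
    return total


# Two hypothetical events: 100 words at time 0, 50 words at time 10.
history = [(0, 100), (10, 50)]
no_decay = decayed_weight(history, now=10, halftime=0)
with_decay = decayed_weight(history, now=10, halftime=10)
print(no_decay, with_decay)
```

With a halftime of 10, the first event has halved by time 10 while the second still counts in full, which is the kind of behaviour an attribute like recently added would need.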

16 We need to specify the identity of weights in the weight function table.
To establish the identity: Attribute: deleted; Event type: deleted. Here the weights of events of type deleted are added up. For attribute: interacted; Event type: added, on

17 Network Visualization
Bipartite network: one set of nodes for pages, one set of nodes for the users who contributed. The link attributes encode (in our case) the number of words added.
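The same bipartite structure can be sketched outside visone with plain dictionaries. This is only an illustration under the assumption that each added event links its active user to the page it was made on; the events, user names, page titles, and the function build_bipartite are hypothetical.

```python
from collections import defaultdict

# Hypothetical edit events.
events = [
    {"PageTitle": "Social network analysis", "InteractionType": "added",
     "WordCount": 196, "ActiveUser": "Alice"},
    {"PageTitle": "Social network analysis", "InteractionType": "added",
     "WordCount": 10, "ActiveUser": "Bob"},
    {"PageTitle": "Social network analysis", "InteractionType": "deleted",
     "WordCount": 192, "ActiveUser": "Bob"},
    {"PageTitle": "Social network analysis", "InteractionType": "added",
     "WordCount": 54, "ActiveUser": "Alice"},
    {"PageTitle": "Graph theory", "InteractionType": "added",
     "WordCount": 5, "ActiveUser": "Alice"},
]


def build_bipartite(events):
    """Return (users, pages, weights).

    weights[(user, page)] is the total number of words the user added
    to the page; only events of type 'added' create or strengthen links.
    """
    users, pages = set(), set()
    weights = defaultdict(int)
    for e in events:
        if e["InteractionType"] != "added":
            continue
        users.add(e["ActiveUser"])
        pages.add(e["PageTitle"])
        weights[(e["ActiveUser"], e["PageTitle"])] += e["WordCount"]
    return users, pages, dict(weights)


users, pages, weights = build_bipartite(events)
for (user, page), words in sorted(weights.items()):
    print(f"{user} -> {page}: {words} words added")
```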


19 Questions?
