Measuring Reliability in Wikipedia Wen-Yuan Zhu 2007.11.13.

Presentation transcript:

Measuring Reliability in Wikipedia Wen-Yuan Zhu

Outline: Introduction; Some Wikipedia terms; Basic concept of measuring reliability; A way to measure reliability; Conclusion; Reference

Introduction Wikipedia is the most popular collaboratively written online encyclopedia. It exhibits rich phenomena that differ from those of the Internet at large and of ordinary websites.

Some Wikipedia terms

Some Wikipedia terms(2) featured article – considered to be among the best articles in Wikipedia – as determined by Wikipedians – at present, there are 1683 featured articles

Some Wikipedia terms(3) if an article is a featured article, the featured-article icon is shown in the right corner of the page

Some Wikipedia terms(4) articles are reviewed at Wikipedia:Featured article candidates according to the Wikipedia:Featured article criteria

Some Wikipedia terms(5) make sure that the article meets all of the featured article criteria – consensus must be reached that it meets the criteria

Some Wikipedia terms(6) articles that no longer meet the criteria can be proposed for improvement or removal at Wikipedia:Featured article review

Some Wikipedia terms(7) clean-up article – cleanup issues that this project covers may include wikification, spelling, grammar, tone, and sourcing – anyone can request cleanup of a page at Wikipedia:Cleanup

Some Wikipedia terms(8)

Basic concept of measuring reliability The higher an article's link ratio, the higher its reliability. This part follows [2].
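
The slide does not define the link ratio, so the following is only a minimal sketch under an assumption: the link ratio of a term is taken to be the number of times the term appears as a wiki link (`[[term]]` or `[[term|label]]`) divided by the total number of times it appears. The function name `link_ratio` and the sample texts are hypothetical and exist only for illustration.

```python
import re


def link_ratio(term: str, texts: list[str]) -> float:
    """Assumed definition: linked occurrences of `term` / all occurrences of `term`."""
    # Matches [[term]] and [[term|some label]] (MediaWiki link syntax).
    link_pattern = re.compile(
        r"\[\[" + re.escape(term) + r"(?:\|[^\]]*)?\]\]", re.IGNORECASE
    )
    # Matches every occurrence of the term, linked or not.
    any_pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)

    linked = sum(len(link_pattern.findall(text)) for text in texts)
    total = sum(len(any_pattern.findall(text)) for text in texts)
    # A term that never occurs gets ratio 0.0 instead of a division error.
    return linked / total if total else 0.0


# Illustrative data only: one linked and two unlinked mentions of "Paris".
sample_texts = [
    "[[Paris]] is the capital of France. Paris is large.",
    "I visited Paris last year.",
]
ratio = link_ratio("Paris", sample_texts)
```

Under this assumed definition, a term that other articles usually link to when they mention it gets a ratio close to 1.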

Basic concept of measuring reliability(2) classes of terms

Basic concept of measuring reliability(3) relation between full names and short names

Basic concept of measuring reliability(4) relation between PageRank and link ratio

Basic concept of measuring reliability(5) Link data alone is not enough to measure reliability. Too many other factors influence the reliability of a Wikipedia article.

A way to measure reliability Use Bayesian statistics to model reliability in Wikipedia. Use revision history to assess the reliability of a Wikipedia article. This part follows [3].

A way to measure reliability(2)

A way to measure reliability(3) article trust – trustworthiness of a version of an article; fragment trust – trustworthiness of a fragment in a version of an article; author trust – trustworthiness of an author

A way to measure reliability(4) […] is the version of an article; […] is the trust value of the author who revised […]; […] is the trust value of […]; […] is the inserted content in […] by […]; […] is the deleted content in […] by […]; […] is the size of […]

A way to measure reliability(5)

A way to measure reliability(6) Dynamic Bayesian networks – defined by a pair […]: […] is the graph structure of the network; […] is the set of the network’s conditional density distributions

A way to measure reliability(7) from […] to […], the state at the […] revision is represented as a quad […]; the states satisfy the Markov property – since […], […]
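
As a reminder of what the Markov property means in general, the statement below uses a hypothetical symbol $S_i$ for the state at the $i$-th revision. The slide's own symbols are not available in this transcript, so this is the generic textbook form, not necessarily the exact expression shown on the slide.

```latex
% Generic first-order Markov property; S_i is an assumed name for the
% state at the i-th revision.
P(S_{i+1} \mid S_0, S_1, \ldots, S_i) = P(S_{i+1} \mid S_i)
```

In words: given the state at the current revision, the next state does not depend on any earlier revision.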

A way to measure reliability(8)

A way to measure reliability(9) to determine the posterior density distribution of […] – it is fully characterized by […] and […]

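A way to measure reliability(10) the Beta distribution $f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}$, where $B(\alpha, \beta)$ is the beta function, with $\alpha > 0$ and $\beta > 0$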
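
To make the Beta distribution on this slide concrete, here is a small sketch that evaluates the standard Beta density and its mean, $\alpha / (\alpha + \beta)$. The function names and the example parameters are assumptions chosen for illustration; they are not taken from the slides or from [3].

```python
import math


def beta_function(alpha: float, beta: float) -> float:
    """B(alpha, beta) = Gamma(alpha) * Gamma(beta) / Gamma(alpha + beta)."""
    # lgamma keeps the computation stable for large parameters.
    return math.exp(
        math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    )


def beta_pdf(x: float, alpha: float, beta: float) -> float:
    """Standard Beta density, evaluated on the open interval (0, 1)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if not 0.0 < x < 1.0:
        return 0.0  # endpoints are treated as outside the support in this sketch
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / beta_function(alpha, beta)


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Illustrative parameters only.
density_at_half = beta_pdf(0.5, 2.0, 2.0)
mean_value = beta_mean(2.0, 6.0)
```

A distribution with a larger `alpha` relative to `beta` puts more mass near 1, which corresponds to a higher mean.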

A way to measure reliability(11)

A way to measure reliability(12) assume […]; let […] be the mean of […]; then […] or […]

A way to measure reliability(13)

A way to measure reliability(14) featured articles – considered highly trustworthy; clean-up articles – considered untrustworthy; normal articles – the remaining articles

A way to measure reliability(15) administrators – registered authors – anonymous authors – blocked users –

A way to measure reliability(16) a set of English-language articles from the Geography category in Wikipedia in January […]: 50 featured articles, 50 clean-up articles, and 768 normal articles, classified manually

A way to measure reliability(17) U.S. National Forest in Wikipedia created by an anonymous author

A way to measure reliability(18) […] is the mean of the posterior density distribution

A way to measure reliability(19) A classifier was developed from the aforementioned 50 featured articles and 50 clean-up articles. The training set contains 100 pairs […], where […] is the trust value of an article and […] is its class.

A way to measure reliability(20) The learned rule for featured articles is […]. A test set of 200 new articles (48,805 revisions) was evaluated; the prediction accuracy is 82%.
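
The learned rule itself is not preserved in this transcript. As a rough illustration of how a rule could be learned from (trust value, class) pairs, the sketch below fits a single threshold on the trust value and reports accuracy on held-out pairs. The threshold approach, the function names, and the tiny data sets are all assumptions for illustration; they are not the classifier or the data from [3].

```python
def learn_threshold(pairs: list[tuple[float, str]]) -> float:
    """Pick the trust threshold that classifies the most training pairs correctly.

    `pairs` must be non-empty; labels are 'featured' or 'cleanup'.
    """
    values = sorted({trust for trust, _ in pairs})
    # Candidates: the smallest value, plus midpoints between neighbouring values.
    candidates = [values[0]] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_correct = candidates[0], -1
    for threshold in candidates:
        correct = sum(
            (trust >= threshold) == (label == "featured") for trust, label in pairs
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def predict(trust: float, threshold: float) -> str:
    """Assumed rule: trust at or above the threshold means 'featured'."""
    return "featured" if trust >= threshold else "cleanup"


def accuracy(pairs: list[tuple[float, str]], threshold: float) -> float:
    """Fraction of pairs whose predicted class matches the given class."""
    hits = sum(predict(trust, threshold) == label for trust, label in pairs)
    return hits / len(pairs)


# Made-up training and test pairs, for illustration only.
training_pairs = [(0.9, "featured"), (0.8, "featured"), (0.3, "cleanup"), (0.4, "cleanup")]
test_pairs = [(0.7, "featured"), (0.5, "cleanup"), (0.65, "cleanup")]
threshold = learn_threshold(training_pairs)
test_accuracy = accuracy(test_pairs, threshold)
```

The same accuracy computation applies to any test set: with 200 test articles, an accuracy of 82% corresponds to 164 correctly classified articles.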

A way to measure reliability(21) using the trust track to predict events

A way to measure reliability(22) the method has some problems – an author's reliability is not constant – the classifier's test set is too small – the criterion for predicting events is unclear

Conclusion This talk gave an overview of Wikipedia and of measuring reliability in Wikipedia, introduced some ways to measure that reliability, and pointed out the difficult problems involved.

Reference [1] [2] D. McGuinness, H. Zeng, P. da Silva, L. Ding, D. Narayanan, and M. Bhaowal. Investigations into trust for collaborative information repositories: A Wikipedia case study. In Proceedings of the Workshop on Models of Trust for the Web, [3] H. Zeng, M. Alhoussaini, L. Ding, R. Fikes, and D. McGuinness. Computing trust from revision history. In Intl. Conf. on Privacy, Security and Trust, 2006.