Investigations into Trust for Collaborative Information Repositories: A Wikipedia Case Study Deborah L. McGuinness, Co-Director Knowledge.

Presentation transcript:

Investigations into Trust for Collaborative Information Repositories: A Wikipedia Case Study
Deborah L. McGuinness, Co-Director Knowledge Systems, Artificial Intelligence Lab, Stanford University
Joint work with: Honglei Zeng, Paulo Pinheiro da Silva, Li Ding, Dhyanesh Narayanan, and Mayukh Bhaowal

May 22, 2006 MTW - McGuinness
Big Picture
- Research theme: Make question answering systems more operational to users (agents/humans) by providing explanations for answers…
- In many settings, explanations require some notion of trust in information and/or sources

Trust is a Critical Emerging Component in Social Collaborative Information Spaces
- Goal: Allow users to access, view, and analyze information informed by trust ratings. This enables users (and agents) to:
  - Assess the trustworthiness of documents that are collaboratively created and updated
  - Monitor the changes in trustworthiness of dynamic documents and provide timely notifications of possible malicious content modification
  - Identify trustworthy information with visualization tools
  - Access shareable trust information among heterogeneous systems
  - Enable new design paradigms for Wikis with built-in trust components, e.g., target text analytic tools at more trustworthy documents or document fragments within a larger resource such as Wikipedia

Some Issues Relevant to Collaborative Information Repositories/Wikis and Trust
- Revisions: a key characteristic of Wikis
  - Some social collaborative spaces, such as Wikis, allow (and sometimes promote) updates to posts from others.
  - Note that this differs from traditional bulletin boards, archived mailing lists, etc., which only support revision by way of follow-up posts
- Rating-based systems
  - Some web systems support and encourage explicit ratings of contributors and contributions
  - Wikis have no explicit trust encoding support
  - Simple rating schemes may not work (e.g., an article rated trustworthy may no longer be trustworthy once it is modified)
- We are exploring computational approaches to trust exploiting prominent Wiki features including:
  - Citation-based trust approach (Wiki articles are interlinked via citations/hyperlinks)
  - Revision-history based trust approach

Terms
- Concepts
  - Article
  - Version (of an article)
  - Fragment
  - Author
- Relations
  - An article may have multiple versions, each of which reflects the modification made by an author on a previous version
  - A version can be split into multiple fragments, each of which is entirely contributed by a single author
[Diagram labels: Article, Version, Fragment, Author, hasFragment:[1,p], hasVersion:[1,n], hasAuthor:[1,m], hasAuthor:[1,1]]
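The concepts and relations on this slide can be sketched as a small data model. This is only an illustration: the class and field names are hypothetical, and attaching hasAuthor:[1,m] to the article level is a guess, since the diagram itself is not preserved in the transcript.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Author:
    name: str


@dataclass
class Fragment:
    text: str
    author: Author  # hasAuthor:[1,1]: a fragment comes from exactly one author


@dataclass
class Version:
    author: Author  # the author whose modification produced this version
    fragments: List[Fragment] = field(default_factory=list)  # hasFragment:[1,p]


@dataclass
class Article:
    title: str
    versions: List[Version] = field(default_factory=list)  # hasVersion:[1,n]

    def authors(self) -> List[Author]:
        """All distinct authors, in order of first appearance (assumed hasAuthor:[1,m])."""
        seen: List[Author] = []
        for version in self.versions:
            for author in [version.author] + [f.author for f in version.fragments]:
                if author not in seen:
                    seen.append(author)
        return seen


# Placeholder strings stand in for real article text.
trovatore = Author("Trovatore")
anonymous = Author("anonymous")
v0 = Version(trovatore, [Fragment("original text", trovatore)])
v1 = Version(anonymous, [Fragment("original text", trovatore),
                         Fragment("inserted text", anonymous)])
article = Article("Natural number", [v0, v1])
```

Keeping the author on each fragment, rather than only on the version, is what lets trust be computed per fragment later on.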

Citation-based Trust
- Derive trust based on the citation relationships among articles
  - For example, a well-cited article may be more trustworthy than an article that has no citations
- In the same family as the well known (Google) PageRank.
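For readers unfamiliar with the PageRank family mentioned above, here is a minimal textbook PageRank by power iteration. It is not the authors' citation-trust algorithm; the function name, the damping factor of 0.85, and the toy link graph are all assumptions made for illustration.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Textbook PageRank: links maps each page to the pages it cites."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy graph: three articles cite C, and C cites A.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
```

In this toy graph the heavily cited article C ends up with the highest score, which matches the intuition on the slide that a well-cited article may be more trustworthy.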

Link-ratio Algorithm
- Link-ratio of an article (i.e., the page with title x): the ratio between the number of citation occurrences of the encyclopedia term x and the number of total occurrences of x (citations and non-citations).
  - For example, “Seattle” appears 3855 times in Wikipedia, 1408 of which are citations (the other mentions are not hyperlinked). The link-ratio value of “Seattle” is 1408/3855 ≈ 0.365
- Generally speaking, the higher the link-ratio value of an article is, the more trustworthy the article is.
- Issue: there may be no incentive to link to an encyclopedia entry (e.g. the “love” article vs. the “Gauss's law” article)
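The link-ratio definition above is a single division, sketched here with the “Seattle” counts from the slide. The function name and the input validation are illustrative additions.

```python
def link_ratio(citation_occurrences: int, total_occurrences: int) -> float:
    """Ratio of hyperlinked mentions of a term to all of its mentions."""
    if total_occurrences <= 0:
        raise ValueError("total_occurrences must be positive")
    if not 0 <= citation_occurrences <= total_occurrences:
        raise ValueError("citation_occurrences must lie between 0 and the total")
    return citation_occurrences / total_occurrences


seattle = link_ratio(1408, 3855)
print(f"{seattle:.3f}")  # prints 0.365
```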

Revision History-based Trust (an example of the “natural number” article in Wikipedia)
- When (an anonymous author) inserted new content into the “natural number” page, originated by Trovatore, there could be an assumption of implicit trust in the original document fragment(s).
[Diagram labels: Trovatore, isAuthorOf, Content Insertion, v0: Oct 7, 2005, v1: Dec 1, 2005]
Natural number can mean either a positive integer (1, 2, 3,...) or a non-negative integer (0, 1, 2, 3, …) The former definition is generally used in number theory, while the latter is preferred in set theory.

Deriving Trust from Revision History
- Revision operations (insertion, deletion, modification) imply trust.
  - The trustworthiness of the revised article depends on the trustworthiness of the previous version, the author of the last revision, and the amount of text involved in the last revision.
- Revision history is widely available in cooperative information systems:
  - Collaborative Software Development (CVS)
  - Cooperative Document Authoring (Wikipedia)

A formulation of Revision Trust
- (Assumption) The trustworthiness of a new article fragment is (only) dependent on its author.
- (Assumption) The trustworthy content of a revised fragment f' is the trustworthy content of the previous fragment f minus the trustworthy content that the author a removed from f (e.g., a fragment f could be more trustworthy if the deletion made by a has removed inaccuracies in f).
- t_f, t_f' and t_a are the trust values of f, f' and a respectively; |f|, |f'| and |D| are the sizes of f, f' and D (D is the deleted text).
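The slide's own equations are not preserved in this transcript, so the following is only one possible reading of the two assumptions, not the authors' formula. It assumes that "trustworthy content" means trust value times size, that |f'| = |f| - |D| (deletion only), and that the trustworthy content removed by the author is (1 - t_a) · t_f · |D|, so that a fully trusted author is taken to delete only untrustworthy text. All function names are hypothetical.

```python
def new_fragment_trust(t_a: float) -> float:
    """First assumption: a new fragment inherits its author's trust value."""
    return t_a


def revised_fragment_trust(t_f: float, size_f: int, size_deleted: int,
                           t_a: float) -> float:
    """Trust of f' after author a deletes size_deleted units from fragment f."""
    if not (0.0 <= t_f <= 1.0 and 0.0 <= t_a <= 1.0):
        raise ValueError("trust values must lie in [0, 1]")
    if not 0 <= size_deleted < size_f:
        raise ValueError("deleted size must be non-negative and smaller than |f|")
    size_new = size_f - size_deleted                     # |f'| = |f| - |D|
    trusted_before = t_f * size_f                        # trustworthy content of f
    trusted_removed = (1.0 - t_a) * t_f * size_deleted   # assumed share removed by a
    return min(1.0, (trusted_before - trusted_removed) / size_new)


# An untrusted author's deletion leaves the trust value unchanged, while a
# fully trusted author's deletion concentrates the remaining trusted content.
unchanged = revised_fragment_trust(0.5, 100, 20, t_a=0.0)
improved = revised_fragment_trust(0.5, 100, 20, t_a=1.0)
```

Under these assumptions the result matches the slide's example: a deletion by a trustworthy author can make the fragment more trustworthy than before.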

Inference Web and PML
- Inference Web is an infrastructure for providing explanations of results from web applications. It provides tools such as browsers, abstractors, checkers, summarizers, and combiners to manipulate and present justifications.
- PML is the interlingua representation language for Inference Web. Proof Markup Language (PML) is a representation language designed to encode information agents may need in order to evaluate results, including where information came from and how it was manipulated.
- PML has an OWL encoding (and XML serialization)
- PML can be (and has been) used to represent justification of information manipulation steps done by theorem provers (e.g., JTP, SNARK), text analytic tools (e.g., UIMA), task processors (e.g., SPARK), rule engines/systems (e.g., CWM, Cybercop), etc.
- The main components concern inference representation and provenance issues such as author, source, etc.
- Our current work expands PML to include representation primitives for trust.

A Sample PML encoding
[Figure callouts: fragment, fragment trust, author trust]

Proof Markup Language: Node Sets and Inference Steps
[Diagram labels: iw:NodeSet, iw:hasConclusion, iw:hasLanguage: en, iw:isConsequenceOf, iw:InferenceStep, iw:hasRule, Direct Assertion (DA), iw:hasSourceUsage]
Conclusion: In mathematics, a natural number is either a positive integer (1, 2, 3, 4,...) or a non-negative integer (0, 1, 2, 3, 4,...).
Encoding this conclusion in PML: articleID, author, timestamp

Proof Markup Language: Aggregated Trust Relation
[Diagram labels: iw:AggregatedTrustRelation, iw:hasTrustedParty, iw:hasTrustingParty, iw:hasTrustValue, Wikipedia, author]
A trivial conclusion: In mathematics, a natural number is either a positive integer (1, 2, 3, 4,...) or a non-negative integer (0, 1, 2, 3, 4,...).
Encoding trust conclusion in PML: Wikipedia, author

Application: Trust View in Wikipedia
[Architecture diagram labels:
- Wikipedia Database (article, revision, author)
- Wikipedia DB processor
- Article D: (version, author)+
- Fragmentation Service
- Article D: (fragment, version)+, (fragment, author)+
- Trust Valuation Service (inputs also include Article D: (version, author)+, citations, …)
- Article D: (fragment, trust)+, (version, trust)+, (author, trust)+
- Trust Rendering Service
- PML for D; HTML for D
- Wikipedia view: user clicks the “trust” tab; user clicks the “pml” tab]
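The valuation and rendering stages named in this architecture can be sketched roughly as follows. This is a toy illustration rather than the authors' services: it skips fragmentation by taking (fragment, author) pairs as input, assigns each fragment its author's trust value, and uses an assumed red-to-green color scale. Every function name and the default trust of 0.5 are hypothetical.

```python
from html import escape
from typing import Dict, List, Tuple


def value_fragments(fragments: List[Tuple[str, str]],
                    author_trust: Dict[str, float],
                    default_trust: float = 0.5) -> List[Tuple[str, float]]:
    """Valuation stage: turn (fragment, author)+ into (fragment, trust)+."""
    return [(text, author_trust.get(author, default_trust))
            for text, author in fragments]


def trust_color(trust: float) -> str:
    """Map a trust value in [0, 1] to a hex color from red (0) to green (1)."""
    trust = min(1.0, max(0.0, trust))
    red = round(255 * (1.0 - trust))
    green = round(255 * trust)
    return f"#{red:02x}{green:02x}00"


def render_html(valued: List[Tuple[str, float]]) -> str:
    """Rendering stage: wrap each fragment in a span colored by its trust."""
    return "".join(
        f'<span style="background-color:{trust_color(trust)}" '
        f'title="trust {trust:.2f}">{escape(text)}</span>'
        for text, trust in valued
    )


# Placeholder fragments and trust values, purely for illustration.
fragments = [("original text ", "Trovatore"), ("inserted <text>", "anonymous")]
valued = value_fragments(fragments, {"Trovatore": 1.0, "anonymous": 0.0})
page = render_html(valued)
```

Separating valuation from rendering mirrors the diagram: the same (fragment, trust)+ data could feed either an HTML view or a PML encoding.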

Wikipedia Article without Trust View

Wikipedia Article with Citation Trust View
[Screenshot callout: Multiple Trust View Tab]
Fragments are colored per their trust values computed from Citation Trust (default mode).

Wikipedia Article with Revision Trust View
Fragments are colored per their trust values computed from Revision Trust.

Conclusion
- Inference Web and PML can be used to support encoding and presentation of trust related to information in social collaborative information repositories such as Wikis.
- We have designed and implemented a simple trust representation that extends PML and included support for the extension in our IW tools.
- More sophisticated trust modeling and trust processing is expected to be required.
- We are investigating
  - Models of trust
  - Trust aggregation from multiple sources and multiple algorithms
  - Refinements and usage of revision-based trust
  - Additional trust approaches and their combination
  - New applications utilizing (sharable) trust information
More info:
- Inference Web: iw.stanford.edu
- Simple examples of PML markup with wiki demo: foto.stanford.edu/mediawiki/index.php/Main_Page

Extra

Abstract PML
[Diagram labels:
- wiki:ArticleVersion
- wiki:hasFragmentList
- iw:NodeSet: In mathematics, a natural number is either a positive integer …
- iw:NodeSet (fragment n) …
- iw:hasSource
- iw:Person: Oleg Alexandrov
- iw:Person (author m)
- iw:Organization: Wikipedia
- iw:AggregatedTrust: fragment trust is
- iw:AggregatedTrust: author trust is
- iw:hasTrustingParty, iw:hasTrustedParty]
Note: Green nodes are in IW registry