UMBC an Honors University in Maryland The Semantic Web in use: Analyzing FOAF Documents Li Ding, Lina Zhou, Tim Finin and Anupam Joshi University of Maryland,


The Semantic Web in use: Analyzing FOAF Documents
Li Ding, Lina Zhou, Tim Finin and Anupam Joshi
University of Maryland, Baltimore County
DARPA contract F and NSF awards ITR-IIS and ITR-IIS provided partial research support for this work

Outline
Motivation
Introduction
  - The six popular ontologies
  - FOAF vocabulary
  - Why FOAF
Building FOAF Document collection
  - FOAF Document Identification
  - FOAF Document Discovery
  - Popular Properties of foaf:Person
Applications
  - Personal Information Fusion
  - Social Network Analysis

The Semantic Web
The semantic web vision is that information and services are described using shared ontologies in KR-like markup languages, making them accessible to machines (programs).
How do we get there?
  - What kind of ontologies? IEEE SUO? Cyc?
  - What kind of languages? RDF? OWL? RuleML?
It’s reasonable to start with the simple and move toward the complex
  - From Dublin Core to Cyc
  - From RDF to OWL and beyond
Significant semantic web content exists today
  - Using simple vocabularies (e.g., FOAF) and RDF/RDFS

The Semantic Web
The more important word in “Semantic Web” is the latter
The KR aspects of the SW were taken off the shelf, the result of 25 years of research done in the AI community
Remember hypertext? It was a nice research backwater going back to the 50’s (recall Memex and Xanadu)
  - Hypertext was forever changed by the Web
  - So maybe the web will forever change KR
TBL: “The Semantic Web will globalize KR, just as the WWW globalized hypertext”

Web of what?
What features does the web bring to the table?
“Anyone can say anything about anything”
The meaning of RDF terms will be (partly) determined socially
It’s a web of documents, services, agents and people

What kind of Ontologies?
[Figure: the ontology spectrum, after Deborah L. McGuinness (Stanford)]
Levels, from least to most expressive: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restriction; Disjointness, Inverse, part of…; General Logical constraints
Groupings: Vocabularies, Taxonomies, Simple Ontologies, Expressive Ontologies
Examples placed on the spectrum: Wordnet, UMLS, DB Schema, OO, RDF, RDFS, DAML, OWL, IEEE SUO, CYC

The Semantic Web Today
There are several simple RDF vocabularies that are widely used today
  - Dublin Core
  - RSS
  - FOAF
It’s instructive to study how these are being used today
And to track how their usage changes

The Six Most Popular Ontologies
RDF, DC, RSS, FOAF, RDFS, MCVB
The statistics are generated by

A use case: FOAF
FOAF (Friend of a Friend) is a simple ontology to describe people and their social networks.
  - See the foaf project page:
We recently crawled the web and discovered over 1,500,000 valid RDF FOAF files.
  - Most of these are from several blogging systems that encode basic user info in foaf
  - See
Tim Finin 2410…37262c252e

FOAF vocabulary

FOAF: why RDF? Extensibility!
FOAF vocabulary provides 50+ basic terms for making simple claims about people
FOAF files can use other RDF terms too: RSS, MusicBrainz, Dublin Core, Wordnet, Creative Commons, blood types, starsigns, …
RDF guarantees freedom of independent extension
  - OWL provides fancier data-merging facilities
Result: Freedom to say what you like, using any RDF markup you want, and have RDF crawlers merge your FOAF documents with others’ and know when you’re talking about the same entities.
After Dan Brickley

No free lunch!
Consequence: We must plan for lies, mischief, mistakes, stale data, slander
Dataset is out of control, distributed, dynamic
Importance of knowing who-said-what
  - Anyone can describe anyone
  - We must record data provenance
  - Modeling and reasoning about trust is critical
Legal, privacy and etiquette issues emerge
Welcome to the real world
After Dan Brickley

FOAF example using XML (three slides; most of the RDF/XML markup did not survive in the transcript)
  - Slide 1: an rdf:RDF root element declaring the rdf and foaf namespaces; surviving text “Tim Finin”
  - Slide 2: surviving text “Tim Finin”, “Tim”
  - Slide 3: surviving text “Tim Finin”, “Anupam Joshi”
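
The slides' original markup is not available, so the following is only a minimal sketch of what a small FOAF document in RDF/XML might look like, generated with Python's standard library. The choice of properties (foaf:name, foaf:nick, foaf:knows) and the function name `build_foaf` are assumptions made for illustration; only the two namespace URIs and the names come from the FOAF and RDF vocabularies and the slides.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and FOAF.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("foaf", FOAF)

def build_foaf(name, nick, friend_name):
    """Build a tiny FOAF document: one person who knows one other person.

    Hypothetical helper; the property selection is an assumption.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    person = ET.SubElement(root, f"{{{FOAF}}}Person")
    ET.SubElement(person, f"{{{FOAF}}}name").text = name
    ET.SubElement(person, f"{{{FOAF}}}nick").text = nick
    # foaf:knows wraps a nested description of the friend.
    knows = ET.SubElement(person, f"{{{FOAF}}}knows")
    friend = ET.SubElement(knows, f"{{{FOAF}}}Person")
    ET.SubElement(friend, f"{{{FOAF}}}name").text = friend_name
    return ET.tostring(root, encoding="unicode")

doc = build_foaf("Tim Finin", "Tim", "Anupam Joshi")
print(doc)
```

Nesting the friend inside foaf:knows keeps the example in a single file; real FOAF files often point to the friend's own document with rdfs:seeAlso instead.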

FOAF isn’t the only one
Other ontologies are used to publish social information
Swoogle finds >360 RDFS or OWL classes with the local name “person.”

Lots of FOAF tools

Why FOAF
Information Creators
  - Community membership management
  - Unique Person Identification (privacy preserved)
  - Indicating Authorship
Information Consumers
  - Provenance tracking
  - Social networking: expose community information to newcomers, match interests
  - Trust building block
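
One way FOAF supports privacy-preserving identification is foaf:mbox_sha1sum, which the FOAF vocabulary defines as the SHA1 hash of a mailbox URI, including the mailto: prefix. A minimal sketch follows; the address is a made-up example and the function name is hypothetical.

```python
import hashlib

def mbox_sha1sum(email):
    """Return the hex SHA1 digest of the mailto: URI for an address."""
    uri = "mailto:" + email
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()

# Two documents that publish the same digest can be linked to the same
# mailbox without either of them revealing the address itself.
digest = mbox_sha1sum("someone@example.org")
print(digest)
```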

Studying how FOAF is being used
What counts as a FOAF document?
How can we find foaf documents?

Identify a FOAF document
1. D is an RDF document.
2. D uses the FOAF namespace.
3. The RDF graph serialized by D contains the sub-graph below.
4. D defines one and only one Person instance.
D is a generic FOAF document when 1, 2, 3 are met
D is a strict FOAF document when 1, 2, 3, 4 are met
Sub-graph: X rdf:type foaf:Person; X foaf:Y Z
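
The four conditions can be sketched as a small classifier over an already-parsed graph. This is an illustration under stated assumptions, not the authors' implementation: the graph is a plain list of (subject, predicate, object) string triples, `None` stands for a document that failed to parse as RDF, foaf:Y is read as "any property in the FOAF namespace", and condition 4 is read as "exactly one subject typed foaf:Person".

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def classify_foaf_document(triples):
    """Return "strict", "generic" or "not FOAF" for a parsed document."""
    if triples is None:                      # condition 1: must be RDF
        return "not FOAF"
    triples = list(triples)
    # Condition 2: some predicate or object comes from the FOAF namespace.
    uses_foaf = any(
        isinstance(term, str) and term.startswith(FOAF)
        for _, p, o in triples
        for term in (p, o)
    )
    # Condition 3: some X is typed foaf:Person and has a FOAF property.
    persons = {s for s, p, o in triples
               if p == RDF_TYPE and o == FOAF + "Person"}
    has_subgraph = any(s in persons and p.startswith(FOAF)
                       for s, p, _ in triples)
    if not (uses_foaf and has_subgraph):
        return "not FOAF"
    # Condition 4 (assumed reading): exactly one Person instance.
    return "strict" if len(persons) == 1 else "generic"

alice = "http://example.org/alice"
one_person = [(alice, RDF_TYPE, FOAF + "Person"),
              (alice, FOAF + "name", "Alice")]
print(classify_foaf_document(one_person))
```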

Different FOAF collections
DS-Swoogle
  - FOAF documents selected from Swoogle’s database of ~340K semantic web documents
  - Swoogle selects at most 1000 documents from any site
DS-FOAF
  - Custom crawler found 1.5M FOAF documents, most from a few large blog sites (e.g., livejournal)
DS-FOAF-Small
  - Subset of ~7K non-blog FOAF documents from ~1K sites defining ~37K people

FOAF document Discovery
Bootstrap: using web search engine (Got 10,000 docs)
Discovery: using rdfs:seeAlso semantics (Got 1.5M docs)
[Table: Top 7 FOAF websites]
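
The bootstrap-then-follow idea can be sketched as a breadth-first traversal over rdfs:seeAlso links. This is a sketch under assumptions, not the crawler used in the study: `see_also_links` is a hypothetical callback that returns the rdfs:seeAlso targets of a document, and a toy in-memory dictionary stands in for fetching and parsing real documents.

```python
from collections import deque

def discover(seeds, see_also_links, limit=1000):
    """Breadth-first discovery starting from seed URLs.

    see_also_links(url) is assumed to return the rdfs:seeAlso targets
    found in the document at url.
    """
    seeds = list(dict.fromkeys(seeds))   # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    found = []
    while queue and len(found) < limit:
        url = queue.popleft()
        found.append(url)
        for nxt in see_also_links(url):
            if nxt not in seen:          # visit each document once
                seen.add(nxt)
                queue.append(nxt)
    return found

# Toy link structure standing in for the web.
toy_web = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
print(discover(["a"], lambda u: toy_web.get(u, [])))
```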

From DS-Swoogle
17 SWDs add to the definition of foaf:Person
  - e.g., defining superclasses, disjointness, etc.
162 properties are defined for foaf:Person
  - e.g., properties whose domain is foaf:Person
74 properties defined as relations between people
  - e.g., properties with both domain and range of foaf:Person
582 properties used
  - e.g., used to assert something of a foaf:Person instance

Popular properties of foaf:Person
Top 10 popular properties (per document)
Rank | non-blog (26,936) | liveJournal.com (20,298,073) | DS-FOAF-SMALL* (33,790)
1 | foaf:mbox_sha1sum (0.84) | foaf:mbox_sha1sum (1.0) | foaf:name (0.80)
2 | foaf:homepage (0.66) | dc:description (1.0) | foaf:mbox_sha1sum (0.71)
3 | foaf:name (0.64) | dc:title (1.0) | foaf:nick (0.51)
4 | foaf:nick (0.61) | foaf:nick (1.0) | foaf:homepage (0.40)
5 | foaf:weblog (0.60) | foaf:page (1.0) | foaf:depiction (0.35)
6 | foaf:knows (0.44) | foaf:weblog (0.99) | foaf:weblog (0.30)
7 | foaf:mbox (0.38) | rdfs:seeAlso (0.85) | foaf:knows (0.28)
8 | foaf:img (0.38) | foaf:knows (0.85) | foaf:surname (0.27)
9 | bio:olb (0.35) | foaf:dateOfBirth (0.71) | foaf:firstName (0.26)
10 | rdfs:seeAlso (0.34) | foaf:interest (0.67) | rdfs:seeAlso (0.26)
11 | | | foaf:mbox (0.26)
*DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7276 evenly sampled documents.

Popular properties of foaf:Person
Top 10 popular properties (per instance)
Rank | non-blog (26,936) | liveJournal.com (20,298,073) | DS-FOAF-SMALL* (33,790)
1 | foaf:name (0.84) | dc:title (1.74) | foaf:name (0.69)
2 | foaf:knows (0.79) | foaf:interest (1.68) | foaf:mbox_sha1sum (0.65)
3 | foaf:homepage (0.63) | foaf:nick (1.04) | rdfs:seeAlso (0.39)
4 | foaf:mbox_sha1sum (0.51) | foaf:weblog (1.00) | foaf:nick (0.26)
5 | rdfs:seeAlso (0.40) | rdfs:seeAlso (0.99) | foaf:homepage (0.18)
6 | dc:title (0.31) | foaf:knows (0.95) | foaf:mbox (0.15)
7 | foaf:nick (0.22) | foaf:page (0.95) | foaf:weblog (0.15)
8 | foaf:weblog (0.18) | dc:description (0.046) | foaf:firstName (0.11)
9 | foaf:mbox (0.15) | foaf:mbox_sha1sum (0.046) | foaf:surname (0.11)
10 | daml:equivalentTo (0.13) | foaf:dateOfBirth (0.046) | foaf:depiction (0.10)
11 | | | foaf:knows (0.07)
*DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7276 evenly sampled documents.
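
The slides do not define the two ratios, so the sketch below rests on an assumed reading: "per document" is the fraction of documents in which a property is used at least once, and "per instance" is the total number of uses divided by the number of foaf:Person instances (which would explain values above 1). The input layout and the function name are hypothetical.

```python
from collections import Counter

def property_popularity(documents):
    """documents: list of documents; each document is a list of person
    instances; each instance is a list of property names (with repeats).

    Returns (per_document, per_instance) dictionaries.
    """
    doc_hits = Counter()      # documents using the property at least once
    uses = Counter()          # total uses across all instances
    n_instances = 0
    for doc in documents:
        in_this_doc = set()
        for instance in doc:
            n_instances += 1
            uses.update(instance)
            in_this_doc.update(instance)
        doc_hits.update(in_this_doc)
    per_document = {p: c / len(documents) for p, c in doc_hits.items()}
    per_instance = {p: c / n_instances for p, c in uses.items()}
    return per_document, per_instance

sample = [
    [["foaf:name", "foaf:knows", "foaf:knows"]],
    [["foaf:name"], ["foaf:nick"]],
]
per_doc, per_inst = property_popularity(sample)
print(per_doc, per_inst)
```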

Extracting social networks
Three steps
1. Discovering foaf instances
2. Merging instances representing the same person
3. Linking people via foaf:knows and other foaf based relations
  - e.g., quaffing:drankBeerWith
Integrating other SNA data
  - e.g., from co-author relationships mined from citeseer

Merging instances
Named instances
Inverse functional properties
Set of nearly inverse functional properties
OWL constraints
rdfs:seeAlso
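
Merging on inverse functional properties can be sketched with a union-find structure: two instances that share a value for such a property are placed in the same group. This is an illustrative sketch, not the authors' fusion procedure. The property list uses foaf:mbox_sha1sum, foaf:mbox and foaf:homepage, which FOAF declares inverse functional; the instance layout and the function name `fuse` are assumptions.

```python
IFPS = ("foaf:mbox_sha1sum", "foaf:mbox", "foaf:homepage")

def fuse(instances):
    """instances: list of dicts mapping property name to a single value.

    Returns groups of instance indices judged to be the same person.
    """
    parent = list(range(len(instances)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}                         # (property, value) -> index
    for idx, inst in enumerate(instances):
        for prop in IFPS:
            if prop in inst:
                key = (prop, inst[prop])
                if key in first_seen:
                    union(idx, first_seen[key])
                else:
                    first_seen[key] = idx

    groups = {}
    for idx in range(len(instances)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())

people = [
    {"foaf:name": "Tim Finin", "foaf:mbox_sha1sum": "aaa"},
    {"foaf:nick": "Tim", "foaf:mbox_sha1sum": "aaa",
     "foaf:homepage": "http://example.org/tim"},
    {"foaf:homepage": "http://example.org/tim"},
    {"foaf:name": "Someone Else", "foaf:mbox_sha1sum": "bbb"},
]
print(fuse(people))
```

Because the merge is transitive, a single wrong value in one document can fuse unrelated people, which is the kind of collision the later slides warn about.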

Collecting Personal Information

Caution: Collision? Mistake!

SNA1: Instances of foaf:Person/doc
Zipf’s distribution
Sloppy tail: a few FOAF documents contain thousands of instances
[Figure: cumulative distribution]

SNA2: Instances of foaf:Person/group
A group refers to a fused person
Zipf’s distribution
Sloppy tail: some instances are wrongly fused due to incorrect FOAF documents
[Figure: cumulative distribution]

Degree analysis
For social networks, the in-degree and out-degree measures of a person are of interest
Can be used to identify hubs and authorities or to compute other interesting properties or rankings
Analyzing most large social networks reveals that in-degree and out-degree follow a power law or Zipf distribution
We found that to be the case for social networks induced by foaf documents.
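
In-degree and out-degree can be computed directly from the list of foaf:knows links. The sketch below assumes edges are (source, target) pairs between fused groups and that duplicate links count once; the function names are hypothetical. The histogram of degree values is what one would plot on log-log axes to look for a power law.

```python
from collections import Counter

def degrees(knows_edges):
    """Return (in_degree, out_degree) counters for directed knows links."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in set(knows_edges):   # count each distinct link once
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg

def degree_histogram(deg):
    """Map each degree value to the number of nodes having it."""
    return sorted(Counter(deg.values()).items())

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
in_deg, out_deg = degrees(edges)
print(degree_histogram(in_deg), degree_histogram(out_deg))
```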

SNA3: In-degree of group
Zipf’s distribution
Sharp tail: few FOAF documents have large in-degrees
[Figure: cumulative distribution]

SNA4: Out-degree of group
Zipf’s distribution
Sloppy tail: few person directory documents
[Figure: cumulative distribution]

SNA5: Patterns of FOAF Network
Four types of group
  - Isolated
  - Only in: only one inlink (97%)
  - Only out
  - Both (intermediate)
Basic Patterns:
  - Singleton: (isolated)
  - Star: (only out) an active person publishes friends
  - Clique: a small group
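
The four group types follow from whether a group has incoming links, outgoing links, both, or neither. A minimal sketch, assuming directed (source, target) pairs for foaf:knows links; the function name and labels are chosen for illustration.

```python
def classify_groups(nodes, knows_edges):
    """Label each node as isolated, only in, only out, or both."""
    has_out = {src for src, _ in knows_edges}
    has_in = {dst for _, dst in knows_edges}
    labels = {}
    for n in nodes:
        if n in has_in and n in has_out:
            labels[n] = "both"
        elif n in has_in:
            labels[n] = "only in"
        elif n in has_out:
            labels[n] = "only out"
        else:
            labels[n] = "isolated"
    return labels

# A small star (a publishes friends b and c), a chain through c, and a loner.
labels = classify_groups("abcde", [("a", "b"), ("a", "c"), ("c", "d")])
print(labels)
```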

SNA6: Size of components
Zipf’s distribution
Sloppy head: singleton
Sloppy tail: blog websites (e.g.
[Figure: cumulative distribution]
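
Component sizes can be obtained with a breadth-first search over the knows graph. The sketch assumes links are treated as undirected, so the result is the set of weakly connected components; whether the study used weak or strong connectivity is not stated in the slides.

```python
from collections import defaultdict, deque

def component_sizes(nodes, knows_edges):
    """Return component sizes, largest first, ignoring link direction."""
    adj = defaultdict(set)
    for a, b in knows_edges:
        adj[a].add(b)
        adj[b].add(a)
    # Include nodes that appear only as edge endpoints.
    all_nodes = set(nodes) | set(adj)
    seen, sizes = set(), []
    for start in all_nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            n = queue.popleft()
            size += 1
            for m in adj.get(n, ()):
                if m not in seen:
                    seen.add(m)
                    queue.append(m)
        sizes.append(size)
    return sorted(sizes, reverse=True)

sizes = component_sizes("abcdef", [("a", "b"), ("b", "c"), ("d", "e")])
print(sizes)
```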

SNA7: Growth of FOAF network
The data suggests that there is a natural evolution for a social network:
(1) disjointed star-like, connected components
(2) link together to form trees and forests,
(3) eventually forming a scale-free network

SNA7: Growth of FOAF network
[Figure: network snapshots for stages 1, 2 and 3]

The Map of FOAF network
[Figure: map of the FOAF network, June 2004; labeled regions: Blog.livedoor.jp, non-blog]

Conclusions
The semantic web is evolving
There is a growing volume of RDF content
FOAF is one of the early successes.
FOAF data is being used
FOAF data is relatively easy to collect and analyze
FOAF data is a good source for social network information

Questions?
Demo:
Swoogle:
ebiquity group: