@ How the Semantic Web is Being Used: An Analysis of FOAF Documents
  Li Ding, Lina Zhou, Tim Finin, Anupam Joshi
  eBiquity Lab, Department of CSEE, University of Maryland Baltimore County
@ Outline
  Introduction
    The six popular ontologies
    FOAF vocabulary
    Why FOAF
  Building FOAF Document collection
    FOAF Document Identification
    FOAF Document Discovery
    Popular Properties of foaf:Person
  Applications
    Personal Information Fusion
    Social Network Analysis
@ The Six Most Popular Ontologies: RDF, DC, RSS, FOAF, RDFS, MCVB. The statistics are generated by
@ FOAF vocabulary
@ Why FOAF
  Information Creators
    Community membership management
    Unique Person Identification (privacy preserved)
    Indicating Authorship
  Information Consumers
    Provenance tracking
    Social networking
    Expose community information to newcomers
    Match interests
    Trust building block
@ Identify a FOAF document
  1. D is an RDF document.
  2. D uses the FOAF namespace.
  3. The RDF graph serialized by D contains the sub-graph: X rdf:type foaf:Person; X foaf:Y Z.
  4. D defines one and only one master Person.
  D is a generic FOAF document when conditions 1, 2, 3 are met.
  D is a strict FOAF document when conditions 1, 2, 3, 4 are met.
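The identification conditions above can be sketched as a small check over parsed triples. This is only an illustration under stated assumptions: the document is assumed to be already parsed into `(subject, predicate, object)` string tuples (condition 1), `foaf:Y` is read as "any property in the FOAF namespace", and `count_master_persons` is a hypothetical caller-supplied function, because the slides do not say how a master Person is recognized.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def is_generic_foaf(triples):
    """Conditions 2 and 3: some X typed foaf:Person also carries a foaf: property."""
    persons = {s for s, p, o in triples if p == RDF_TYPE and o == FOAF + "Person"}
    return any(s in persons and p.startswith(FOAF) for s, p, _ in triples)


def is_strict_foaf(triples, count_master_persons):
    """Condition 4 on top of the generic test.

    count_master_persons is a hypothetical helper supplied by the caller;
    it must return how many master Persons the document defines.
    """
    return is_generic_foaf(triples) and count_master_persons(triples) == 1


doc = [
    ("_:me", RDF_TYPE, FOAF + "Person"),
    ("_:me", FOAF + "name", "Alice"),
]
print(is_generic_foaf(doc))
```

A document that only types a node as foaf:Person, without any further FOAF property, fails the generic test in this sketch.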
@ FOAF document Discovery
  Bootstrap: using a web search engine (got 10,000 docs)
  Discovery: using rdfs:seeAlso semantics (got 1.5M docs)
  Top 7 FOAF websites
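The discovery step can be pictured as a breadth-first crawl that follows rdfs:seeAlso links outward from the bootstrap set. The sketch below is not the authors' crawler. `fetch_triples` is a hypothetical callback that returns the triples of a document, and `FAKE_WEB` is an invented in-memory stand-in for the web so the example runs offline.

```python
from collections import deque

RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def discover(seed_urls, fetch_triples, limit=1000):
    """Follow rdfs:seeAlso links breadth-first, starting from the seed documents."""
    seen = set(seed_urls)
    queue = deque(seed_urls)
    while queue:
        url = queue.popleft()
        for _, predicate, target in fetch_triples(url):
            if predicate == RDFS_SEEALSO and target not in seen:
                if len(seen) >= limit:
                    return seen  # stop once the crawl budget is used up
                seen.add(target)
                queue.append(target)
    return seen


# Invented documents standing in for the real web.
FAKE_WEB = {
    "http://example.org/a.rdf": [("_:a", RDFS_SEEALSO, "http://example.org/b.rdf")],
    "http://example.org/b.rdf": [
        ("_:b", RDFS_SEEALSO, "http://example.org/c.rdf"),
        ("_:b", RDFS_SEEALSO, "http://example.org/a.rdf"),
    ],
    "http://example.org/c.rdf": [],
}

found = discover(["http://example.org/a.rdf"], lambda url: FAKE_WEB.get(url, []))
print(sorted(found))
```

The `seen` set keeps cycles between documents from looping forever.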
@ Popular properties of foaf:Person (1/2)
  Top 10 popular properties (per document)
  Rank  non-blog (26,936)         liveJournal.com (20,298,073)  DS-FOAF-SMALL* (33,790)
  1     foaf:mbox_sha1sum (0.84)  foaf:mbox_sha1sum (1.0)       foaf:name (0.80)
  2     foaf:homepage (0.66)      dc:description (1.0)          foaf:mbox_sha1sum (0.71)
  3     foaf:name (0.64)          dc:title (1.0)                foaf:nick (0.51)
  4     foaf:nick (0.61)          foaf:nick (1.0)               foaf:homepage (0.40)
  5     foaf:weblog (0.60)        foaf:page (1.0)               foaf:depiction (0.35)
  6     foaf:knows (0.44)         foaf:weblog (0.99)            foaf:weblog (0.30)
  7     foaf:mbox (0.38)          rdfs:seeAlso (0.85)           foaf:knows (0.28)
  8     foaf:img (0.38)           foaf:knows (0.85)             foaf:surname (0.27)
  9     bio:olb (0.35)            foaf:dateOfBirth (0.71)       foaf:firstName (0.26)
  10    rdfs:seeAlso (0.34)       foaf:interest (0.67)          rdfs:seeAlso (0.26)
  11    -                         -                             foaf:mbox (0.26)
  *DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7276 evenly sampled documents.
@ Popular properties of foaf:Person (2/2)
  Top 10 popular properties (per instance)
  Rank  non-blog (26,936)         liveJournal.com (20,298,073)  DS-FOAF-SMALL* (33,790)
  1     foaf:name (0.84)          dc:title (1.74)               foaf:name (0.69)
  2     foaf:knows (0.79)         foaf:interest (1.68)          foaf:mbox_sha1sum (0.65)
  3     foaf:homepage (0.63)      foaf:nick (1.04)              rdfs:seeAlso (0.39)
  4     foaf:mbox_sha1sum (0.51)  foaf:weblog (1.00)            foaf:nick (0.26)
  5     rdfs:seeAlso (0.40)       rdfs:seeAlso (0.99)           foaf:homepage (0.18)
  6     dc:title (0.31)           foaf:knows (0.95)             foaf:mbox (0.15)
  7     foaf:nick (0.22)          foaf:page (0.95)              foaf:weblog (0.15)
  8     foaf:weblog (0.18)        dc:description (0.046)        foaf:firstName (0.11)
  9     foaf:mbox (0.15)          foaf:mbox_sha1sum (0.046)     foaf:surname (0.11)
  10    daml:equivalentTo (0.13)  foaf:dateOfBirth (0.046)      foaf:depiction (0.10)
  11    -                         -                             foaf:knows (0.07)
  *DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7276 evenly sampled documents.
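One way to read the two rankings is sketched below. The definitions are assumptions, since the slides do not spell them out: "per document" is taken as the fraction of documents in which at least one foaf:Person uses the property, and "per instance" as the total number of uses divided by the number of foaf:Person instances (which would explain values above 1, such as 1.74). The input layout and the sample data are invented for illustration.

```python
from collections import Counter


def property_popularity(documents):
    """documents: list of documents; each document is a list of person
    instances; each instance is a list of property names (repeats allowed)."""
    doc_hits = Counter()   # number of documents that use the property at least once
    uses = Counter()       # total number of uses across all instances
    n_instances = 0
    for persons in documents:
        props_in_doc = set()
        for props in persons:
            n_instances += 1
            uses.update(props)
            props_in_doc.update(props)
        doc_hits.update(props_in_doc)
    per_document = {p: c / len(documents) for p, c in doc_hits.items()}
    per_instance = {p: c / n_instances for p, c in uses.items()}
    return per_document, per_instance


docs = [
    [["foaf:name", "foaf:knows", "foaf:knows"], ["foaf:name"]],
    [["foaf:nick"], ["foaf:name"]],
]
per_document, per_instance = property_popularity(docs)
print(per_document)
print(per_instance)
```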
@ Collecting Personal Information
@ Caution: Collision? Mistake!
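Fusing personal information can be sketched as merging foaf:Person instances that share the value of an identifying property. The slides do not say which properties were used for fusion. The sketch assumes foaf:mbox_sha1sum, foaf:mbox and foaf:homepage, which the FOAF vocabulary declares inverse functional, and the instance identifiers and values are invented.

```python
ID_PROPS = ("foaf:mbox_sha1sum", "foaf:mbox", "foaf:homepage")  # assumed choice


def fuse(instances):
    """instances: dict of instance id -> dict of property -> value.
    Returns a list of sets; each set is one fused person (a group)."""
    parent = {i: i for i in instances}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_owner = {}  # (property, value) -> first instance seen with it
    for inst, props in instances.items():
        for prop in ID_PROPS:
            if prop in props:
                key = (prop, props[prop])
                if key in first_owner:
                    parent[find(inst)] = find(first_owner[key])
                else:
                    first_owner[key] = inst

    groups = {}
    for inst in instances:
        groups.setdefault(find(inst), set()).add(inst)
    return list(groups.values())


people = {
    "doc1#me": {"foaf:mbox_sha1sum": "abc", "foaf:name": "Alice"},
    "doc2#a": {"foaf:mbox_sha1sum": "abc", "foaf:homepage": "http://example.org/~alice"},
    "doc3#x": {"foaf:homepage": "http://example.org/~alice"},
    "doc4#bob": {"foaf:mbox_sha1sum": "def"},
}
groups = fuse(people)
print(sorted(sorted(g) for g in groups))
```

The same mechanism explains the caution: a single document that publishes a wrong or placeholder value for one of these properties is enough to merge unrelated people into one group.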
@ SNA1: Instances of foaf:Person per doc. Zipf’s distribution. Sloppy tail: a few person-directory documents contain thousands of instances. Plot: cumulative distribution.
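A cumulative distribution of this kind, and a rough check for Zipf-like behaviour, can be sketched as follows. Two assumptions are made: "cumulative" is read as the complementary form P(X >= x), and the power-law check is a plain least-squares slope on log-log axes, which is only a crude estimate. The sample counts are invented.

```python
import math
from collections import Counter


def cumulative_distribution(values):
    """Return sorted (x, P(X >= x)) pairs for a list of positive counts."""
    counts = Counter(values)
    remaining = len(values)
    points = []
    for x in sorted(counts):
        points.append((x, remaining / len(values)))
        remaining -= counts[x]
    return points


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x); needs at least two distinct x."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


persons_per_doc = [1, 1, 1, 1, 2, 2, 3, 10]  # invented sample
ccdf = cumulative_distribution(persons_per_doc)
print(ccdf)
print(loglog_slope(ccdf))
```

A roughly straight, downward-sloping line on log-log axes is the usual visual sign of a Zipf-like distribution; a heavy or "sloppy" tail shows up as points that drift away from that line at large x.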
@ SNA2: Instances of foaf:Person per group. A group refers to a fused person. Zipf’s distribution. Sloppy tail: some instances are wrongly fused due to incorrect FOAF documents. Plot: cumulative distribution.
@ SNA3: In-degree of group. Zipf’s distribution. Sharp tail: few FOAF documents have large in-degrees. Plot: cumulative distribution.
@ SNA4: Out-degree of group. Zipf’s distribution. Sloppy tail: a few person-directory documents. Plot: cumulative distribution.
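In-degree and out-degree per group can be computed from a directed edge list. The sketch assumes that an edge (s, t) means group s states foaf:knows about group t, that duplicate edges are collapsed, and that self-loops are ignored. None of these choices is stated in the slides, and the edge data is invented.

```python
from collections import Counter


def degrees(edges):
    """edges: iterable of (source_group, target_group) pairs."""
    distinct = {(s, t) for s, t in edges if s != t}  # drop duplicates and self-loops
    out_degree = Counter(s for s, _ in distinct)
    in_degree = Counter(t for _, t in distinct)
    return in_degree, out_degree


edges = [("a", "b"), ("a", "c"), ("a", "b"), ("b", "c"), ("c", "c")]
in_degree, out_degree = degrees(edges)
print(dict(in_degree))
print(dict(out_degree))
```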
@ SNA5: Patterns of FOAF Network
  Four types of group
    Isolated
    Only in: only one inlink (97%)
    Only out
    Both (intermediate)
  Basic Patterns
    Singleton: (isolated)
    Star: (only out) an active person publishes friends
    Clique: a small group
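The four group types follow directly from whether a group has incoming links, outgoing links, both, or neither. A minimal sketch, assuming a directed edge list between groups and ignoring self-loops (the node and edge data are invented):

```python
def classify_groups(nodes, edges):
    """Label each group as 'isolated', 'only in', 'only out' or 'both'."""
    has_in = {t for s, t in edges if s != t}
    has_out = {s for s, t in edges if s != t}
    kinds = {}
    for node in nodes:
        if node in has_in and node in has_out:
            kinds[node] = "both"
        elif node in has_in:
            kinds[node] = "only in"
        elif node in has_out:
            kinds[node] = "only out"
        else:
            kinds[node] = "isolated"
    return kinds


nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "c")]
kinds = classify_groups(nodes, edges)
print(kinds)
```

In this toy graph, "a" is the centre of a star (only out), "b" is intermediate, "c" is only linked to, and "d" is a singleton.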
@ SNA6: Size of components Zipf’s distribution Sloppy head: singleton Sloppy tail: blog websites (e.g. Cumulative distribution
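Component sizes can be obtained with a breadth-first search over the group graph. The sketch assumes weakly connected components, meaning link direction is ignored; the slides do not say which notion of component was used. The data is invented.

```python
from collections import defaultdict, deque


def component_sizes(nodes, edges):
    """Return the sizes of weakly connected components, largest first."""
    neighbours = defaultdict(set)
    for s, t in edges:
        neighbours[s].add(t)
        neighbours[t].add(s)  # ignore direction

    seen = set()
    sizes = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        sizes.append(size)
    return sorted(sizes, reverse=True)


nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
sizes = component_sizes(nodes, edges)
print(sizes)
```

Singletons appear as components of size 1 at the head of the size distribution, while a large community such as a blog site would appear as one very large component in the tail.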
@ SNA7: Growth of FOAF network (figure panels 1, 2, 3)
@ The Map of FOAF network (Jun 2004). Figure labels: Blog.livedoor.jp, non-blog
@ Questions? Demo: Swoogle: eBiquity group: