Download presentation
Presentation is loading. Please wait.
Published byChristiana Webster Modified over 9 years ago
1
@ Presented by eBiquity group, UMBC CIKM’04, Nov 12, 2004 SwoogleSwoogle SwoogleSwoogle search and metadata for the semantic web Partial research support was provided by DARPA contract F30602-00-0591 and by NSF by awards NSF-ITR-IIS-0326460 and NSF-ITR-IDM-0219649.
2
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 2 Outline Motivation Concepts Demo Architecture document discovery metadata creation ontology rank Status Summary http://swoogle.umbc.eduhttp://swoogle.umbc.edu/
3
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 3 Motivation (Google + Web) has made us all smarter something similar is needed by people and software agents for information on the semantic web
4
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 4 Motivation – Common Questions Find an ontology What are the ontologies about “time” ? Shall I use an existing ontology or create one? Find instance data Show me the instances of a class “http://foo.com/Person”? Gather relevant information for my application. Characterize the Semantic Web How many RDF documents are online? What are the most popular ontologies ? What graph properties does the semantic web have? Does namespace URI link to the corresponding ontology?
5
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 5 The Role of Swoogle in Semantic Web Semantic Web Services Data Service Software Agents, Applications SW data service database (Web) document RDF document uses Directory/Digest Service Service Finder digests searches Data Finder SwoogleSwoogle
6
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 6 Related work Ontology based annotation & search Annotate web documents SHOE (UMCP, 1997) Ontobroker (AIFB, karlsruhe, 1998), WebKB (Martin & Eklund, 1999), QuizRDF (BT,2002) Annotate proper reference & relations CREAM (AIFB,2003) Ontology repositories Ontology level DAML Ontology Library Schema Web SemWebCentral Term level W3C’s Ontaria (2004) Ontology management systems Stanford’s Ontolingua IBM’s Snobase Based on both ontology and instance document Automated discovery Search and rank ontologies and terms Digest but not store Create metadata based on RDF and OWL semantics Provide services to both human and software agents Swoogle aims to be a Google-like online ontology repository
7
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 7 Concepts Document A Semantic Web Document (SWD) is an online document written in semantic web languages (i.e. RDF and OWL). An ontology document (SWO) is a SWD that contains mostly term definition (i.e. classes and properties). It corresponds to T-Box in Description Logic. An instance document (SWI or SWDB) is a SWD that contains mostly class individuals. It corresponds to A-Box in Description Logic. Term A term is a non-anonymous RDF resource which is the URI reference of either a class or a property. Individual An individual refers to a non-anonymous RDF resource which is the URI reference of a class member. In swoogle, a document D is a valid SWD iff. JENA* correctly parses D and produces at least one triple. *JENA is a Java framework for writing Semantic Web applications. http://www.hpl.hp.com/semweb/jena2.htmhttp://www.hpl.hp.com/semweb/jena2.htm rdf:type rdfs:Class foaf:Person rdf:type foaf:Person http://.../foaf.rdf#finin
8
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 8 Concepts Example wordNet:Agent rdf:type rdfs:Class rdfs:subClassOf foaf:Person http://xmlns.com/foaf/1.0/ foaf:mbox rdfs:domain rdf:type rdf:Property Property Class SWO http://foo.com/foaf.rdf#finin foaf:mbox rdf:type finin@umbc.edu foaf:Person http://foo.com/foaf.rdf#finin SWI Individual SWD Term NOTE: Qualified Names (QName) are used to shorten well-known namespaces as follows rdf: => http://www.w3.org/1999/02/22-rdf-syntax-ns#" rdfs: => http://www.w3.org/2000/01/rdf-schema foaf: => http://xmlns.com/foaf/1.0/ wordNet: => http://xmlns.com/wordnet/1.6/
9
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 9 Demo Find “Time” Ontology (Swoogle Search) 1 2 3 4 Digest “Time” Ontology Document view Term view Find Term “Person” (Ontology Dictionary) Digest Term “Person” Class properties (Instance) properties 5 Swoogle Statistics
10
Find “Time” Ontology We can use a set of keywords to search ontology. For example, “time, before, after” are basic concepts for a “Time” ontology. Demo 1
11
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 11 Usage of Terms in SWD foaf:mbox rdf:type finin@umbc.edu foaf:Person http://www.cs.umbc.edu/~finin/foaf.rdf wordNet:Agent rdf:type rdfs:Class rdfs:subClassOf foaf:Person http://xmlns.com/foaf/1.0/ foaf:mbox rdfs:domain rdf:type rdf:Property populated Class defined Class populated Property defined Property http://foo.com/foaf.rdf#finin foaf:mbox rdf:type finin@umbc.edu foaf:Person http://foo.com/foaf.rdf defined Individual
12
Digest “Time” Ontology (term view) Demo 2(a) …………. TimeZone before intAfter
13
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 13 Document Metadata Web document metadata When/how discovered/fetched Suffix of URL Last modified time Document size SWD metadata Language features OWL species RDF encoding Statistical features Defined/used terms Declared/used namespaces Ontology Ratio Ontology Rank Ontology annotation Label Version Comment Related Relational Metadata Links to other SWDs Imported SWDs Referenced SWDs Extended SWDs Prior version Links to terms Classes/Properties defined/used
14
Digest “Time” Ontology (document view) Demo 2(b)
15
Find Term “Person” Demo 3 Not capitalized! URIref is case sensitive!
16
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 16 Term Metadata: An integrated definition Class Definition rdfs:subClassOf -- foaf:Agent rdfs:label – “Person” Properties (from SWI) foaf:name dc:title Properties (from SWO) foaf:mbox foaf:name foaf:mbox rdfs:domain Onto 1 owl:Class rdf:type “Person” rdfs:label foaf:Agent rdfs:subClassOf Onto 2 foaf:name rdf:type “Tim Finin” SWD3 foaf:Person
17
Digest Term “Person” Demo 4 167 different properties 562 different properties
18
Demo 5 Swoogle Statistics
19
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 19 Swoogle Architecture metadata creation data analysis interface SWD discovery SWD Metadata Web Service Web Server SWD Cache The Web Candidate URLs Web Crawler SWD Reader IR analyzerSWD analyzer Agent Service
20
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 20 1. SWD Discovery Swoogle uses three crawlers to discover likely SWD URLs A Google Crawler uses Google to find URLs using keywords: http://www.w3.org/2000/01/rdf-schema,... File type suffices:.rdf,.owl A Focused Crawler crawls through HTML files recursively within the given website. A SWD Crawler crawls through SWDs and discover URLs according to term semantics. To determine the likely SWD URLs: Non-swd extension filter:.jpg,.mp3, and etc. Protocol filter: file://, urn:, and etc. Namespace of RDF resources in SWD
21
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 21 2. Metadata Creation Document metadata General metadata SWD metadata Ontology metadata Term Metadata (definition) Class property (Instance) property: i.e. class-property bond Relational metadata TermDocument Term rdfs:subClassOf, rdfs:domain…rdfs:seeAlso, … Document Uses, Defines,…owl:imports,…
22
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 22 2.1 Ontology Ratio Why? The fuzzy distinction between ontology and instance document Given a SWD foo, and let C(foo): the set of classes defined in foo P(foo): the set of properties defined in foo I(foo): the set of instances defined in foo Ontology Ratio as a heuristic to do the classification 0: pure SWI 1: pure SWO > 0.8: foo is said to be an ontology.
23
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 23 2.2 Relational Metadata Inter-document relation rdfs:seeAlso IMport (IM) e.g. owl:import Similar/Equal SWD Inter-term relation EXtension (EX) e.g. rdfs:subClassOf use-TerM (TM)e.g. rdf:range use-INdividual (IN)e.g. owl:sameAs Prior Version (PV, IPV, CPV) Generalized inter-document relations Generalized from individual level relation Capture more relations while with less complexity Usage Link SWDs Ontology rank
24
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 24 SWOs SWIs HTML documents Images Audio files Video files 3. Data analysis: Ranking SWD Why? Ranking captures page importance and popularity Ranking has been proven useful in HTML search. SWD is different from HTML and has more semantics So, a new SWD ranking mechanism is needed ! Related ideas? Google ’ s PageRank Kleinberg ’ s HITS
25
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 25 3.1 Random surfer model (PageRank) How PageRank is computed? page A’s rank is Where {T i } are the pages that link to A C(X): # of page X’s out links d is a damping factor (e.g., 0.85) Compute by iterating until converge Uniform probability of following any link is convention in the Web but not in the SW Links have semantics that influence the probability of following them Rational users read an ontology and all ontologies it referenced. Jump to a random page Follow a random link bored? no yes read page
26
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 26 3.2 Rational Random Surfer Model Weighted random behavior Rational behavior Rank of a SWI Rank of a a SWO Jump to a random page Follow a random link bored? no yes read page Read referenced SWOs SWO? yes no where TC(A) is transitive closure of SWOs referencing A. 1 2 1 2
27
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 27 3.3 Ontology Rank Example foaf:mbox rdf:type finin@umbc.edu foaf:Person http://www.cs.umbc.edu/~finin/foaf.rdf wordNet:Person rdf:type rdfs:Class rdfs:subClassOf foaf:Person http://xmlns.com/foaf/1.0/ TM http://www.w3.org/2000/01/rdf-schema rdfs:subClassOf rdf:Property rdf:type http://xmlns.com/wordnet/1.6/ rdfs:Class rdf:type wordNet:Individual rdfs:subClassOf wordNet:Person EX
28
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 28 3.3 Ontology Rank Example (cont’d) http://www.cs.umbc.edu/~finin/foaf.rdf http://xmlns.com/wordnet/1.6/ http://xmlns.com/foaf/1.0/ EX TM http://www.w3.org/2000/01/rdf-schema rawPR =0.2 rawPR =100 rawPR =3 rawPR =300 PR =0.2 PR =100 PR =103 PR =403
29
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 29 Current Status Swoogle Watch reported (Nov 7, 2004) 40 M triples 270 K SWDs: 4k ontologies 144 K terms: 91K classes & 51K properties Ongoing work Ontology Dictionary Swoogle Statistics Web Service interface (see Swoogle website) IR with the Semantic Web ( Content search) Character N-Grams Bag of URIrefs Swangling
30
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 30 Summary Swoogle (Mar, 2004) Swoogle2 (Sep, 2004) Swoogle3 Automated SWD discovery SWD metadata creation and search Ontology rank (rational surfer model) Swoogle watch Web Interface Ontology dictionary Swoogle statistics Web service interface (WSDL) Bag of URIref IR search Better crawl & refresh strategies More metadata (ontology mapping) More IR features Better web service interfaces Capture and store all triples More reasoning 2005 2004
31
@ SwoogleSwoogle SwoogleSwoogle MotivationConceptsDemoArchitectureStatusSummary Swoogle, cikm'04 -- http://swoogle.umbc.edu/ 31 The End Website: http://swoogle.umbc.eduhttp://swoogle.umbc.edu Slides at: http://ebiquity.umbc.edu/v2.1/resource/html/id/66/http://ebiquity.umbc.edu/v2.1/resource/html/id/66/ Demo: http://ebiquity.umbc.edu/v2.1/resource/html/id/65/http://ebiquity.umbc.edu/v2.1/resource/html/id/65/
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.