Swoogle: search and metadata for the semantic web
Presented by eBiquity group, UMBC. CIKM’04, Nov 12, 2004.

Presentation transcript:

Swoogle: search and metadata for the semantic web
Presented by eBiquity group, UMBC. CIKM’04, Nov 12, 2004.
Partial research support was provided by DARPA contract F and by NSF awards NSF-ITR-IIS and NSF-ITR-IDM

Outline
- Motivation
- Concepts
- Demo
- Architecture
  - document discovery
  - metadata creation
  - ontology rank
- Status
- Summary

Motivation
- (Google + Web) has made us all smarter.
- Something similar is needed by people and software agents for information on the semantic web.

Motivation – Common Questions
Find an ontology
- What are the ontologies about “time”?
- Shall I use an existing ontology or create one?
Find instance data
- Show me the instances of a class “
- Gather relevant information for my application.
Characterize the Semantic Web
- How many RDF documents are online?
- What are the most popular ontologies?
- What graph properties does the semantic web have?
- Does a namespace URI link to the corresponding ontology?

The Role of Swoogle in the Semantic Web
[Diagram. Labels: Software Agents, Applications; Semantic Web Services; Data Service; SW data service; database; (Web) document; RDF document; Directory/Digest Service (Service Finder, Data Finder); Swoogle. Edge labels: uses, digests, searches.]

Related work
Ontology based annotation & search
- Annotate web documents: SHOE (UMCP, 1997), Ontobroker (AIFB, Karlsruhe, 1998), WebKB (Martin & Eklund, 1999), QuizRDF (BT, 2002)
- Annotate proper reference & relations: CREAM (AIFB, 2003)
Ontology repositories
- Ontology level: DAML Ontology Library, Schema Web, SemWebCentral
- Term level: W3C’s Ontaria (2004)
Ontology management systems
- Stanford’s Ontolingua
- IBM’s Snobase
Swoogle aims to be a Google-like online ontology repository
- Based on both ontology and instance documents
- Automated discovery
- Search and rank ontologies and terms
- Digest but not store
- Create metadata based on RDF and OWL semantics
- Provide services to both human and software agents

Concepts
Document
- A Semantic Web Document (SWD) is an online document written in semantic web languages (i.e. RDF and OWL).
- An ontology document (SWO) is a SWD that contains mostly term definitions (i.e. classes and properties). It corresponds to the T-Box in Description Logic.
- An instance document (SWI or SWDB) is a SWD that contains mostly class individuals. It corresponds to the A-Box in Description Logic.
Term
- A term is a non-anonymous RDF resource which is the URI reference of either a class or a property.
Individual
- An individual is a non-anonymous RDF resource which is the URI reference of a class member.
In Swoogle, a document D is a valid SWD iff JENA* correctly parses D and produces at least one triple.
*JENA is a Java framework for writing Semantic Web applications.
[Figure. Labels: foaf:Person rdf:type rdfs:Class (a term); a resource with rdf:type foaf:Person (an individual).]

Concepts Example
[Figure. An SWO containing the terms foaf:Person (Class: rdf:type rdfs:Class, rdfs:subClassOf wordNet:Agent) and foaf:mbox (Property: rdf:type rdf:Property, rdfs:domain foaf:Person); an SWI containing an Individual with rdf:type foaf:Person and a foaf:mbox value. Both are SWDs.]
NOTE: Qualified Names (QNames) are used to shorten well-known namespaces as follows: rdf: => rdfs: => foaf: => wordNet: =>

Demo
1. Find “Time” Ontology (Swoogle Search)
2. Digest “Time” Ontology
   - Document view
   - Term view
3. Find Term “Person” (Ontology Dictionary)
4. Digest Term “Person”
   - Class properties
   - (Instance) properties
5. Swoogle Statistics

Demo 1: Find “Time” Ontology
We can use a set of keywords to search for an ontology. For example, “time”, “before”, and “after” are basic concepts for a “Time” ontology.

Usage of Terms in SWD
[Figure. The example triples annotated with usage labels:
- foaf:Person rdf:type rdfs:Class: defined Class
- foaf:mbox rdf:type rdf:Property: defined Property
- a resource with rdf:type foaf:Person: populated Class, defined Individual
- foaf:mbox used as a predicate: populated Property
Other labels: foaf:Person rdfs:subClassOf wordNet:Agent; foaf:mbox rdfs:domain foaf:Person.]
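
The usage labels on this slide can be illustrated with a small Python sketch. This is not Swoogle's code: the function name `term_usage`, the tuple representation of triples, the sets of class/property meta-types, and the individual `ex:alice` are all assumptions made for illustration.

```python
RDF_TYPE = "rdf:type"
# Assumed meta-types that mark a resource as a class or property definition.
CLASS_TYPES = {"rdfs:Class", "owl:Class"}
PROPERTY_TYPES = {"rdf:Property", "owl:ObjectProperty", "owl:DatatypeProperty"}
# Assumption: built-in vocabulary predicates are not counted as populated properties.
BUILTIN_PREFIXES = ("rdf:", "rdfs:", "owl:")


def term_usage(triples):
    """Classify how terms are used in a list of (subject, predicate, object) triples."""
    usage = {
        "defined_class": set(),
        "defined_property": set(),
        "populated_class": set(),
        "populated_property": set(),
        "defined_individual": set(),
    }
    for s, p, o in triples:
        if p == RDF_TYPE:
            if o in CLASS_TYPES:
                usage["defined_class"].add(s)
            elif o in PROPERTY_TYPES:
                usage["defined_property"].add(s)
            else:
                # s is declared a member of class o.
                usage["populated_class"].add(o)
                usage["defined_individual"].add(s)
        elif not p.startswith(BUILTIN_PREFIXES):
            # A user-level predicate is being instantiated.
            usage["populated_property"].add(p)
    return usage


example = [
    ("foaf:Person", "rdf:type", "rdfs:Class"),
    ("foaf:Person", "rdfs:subClassOf", "wordNet:Agent"),
    ("foaf:mbox", "rdf:type", "rdf:Property"),
    ("foaf:mbox", "rdfs:domain", "foaf:Person"),
    ("ex:alice", "rdf:type", "foaf:Person"),           # hypothetical individual
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
]
result = term_usage(example)
```

In this example foaf:Person ends up both defined and populated, which is why a single document can mix ontology-like and instance-like content.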

Demo 2(a): Digest “Time” Ontology (term view)
[Screenshot: term list including TimeZone, before, intAfter, …]

Document Metadata
Web document metadata
- When/how discovered/fetched
- Suffix of URL
- Last modified time
- Document size
SWD metadata
- Language features: OWL species, RDF encoding
- Statistical features: defined/used terms, declared/used namespaces, Ontology Ratio
- Ontology Rank
Ontology annotation
- Label
- Version
- Comment
Related Relational Metadata
- Links to other SWDs: imported SWDs, referenced SWDs, extended SWDs, prior version
- Links to terms: classes/properties defined/used

Demo 2(b): Digest “Time” Ontology (document view)

Demo 3: Find Term “Person”
Not capitalized! URIref is case sensitive!

Term Metadata: An integrated definition
Class Definition
- rdfs:subClassOf: foaf:Agent
- rdfs:label: “Person”
Properties (from SWI)
- foaf:name
- dc:title
Properties (from SWO)
- foaf:mbox
- foaf:name
[Figure. The definition of foaf:Person is assembled from several documents. Onto 1: rdf:type owl:Class, rdfs:label “Person”. Onto 2: rdfs:subClassOf foaf:Agent, and foaf:mbox with rdfs:domain foaf:Person. SWD3: an instance with rdf:type foaf:Person and foaf:name “Tim Finin”.]

Digest Term “Person” Demo different properties 562 different properties

Demo 5: Swoogle Statistics

Swoogle Architecture
[Diagram. Stages: SWD discovery, metadata creation, data analysis, interface. Components: The Web, Candidate URLs, Web Crawler, SWD Reader, SWD Cache, SWD Metadata, IR analyzer, SWD analyzer, Web Server, Web Service, Agent Service.]

SWD Discovery
Swoogle uses three crawlers to discover likely SWD URLs
- A Google Crawler uses Google to find URLs using keywords and file type suffixes: .rdf, .owl
- A Focused Crawler crawls through HTML files recursively within a given website.
- A SWD Crawler crawls through SWDs and discovers URLs according to term semantics.
To determine the likely SWD URLs:
- Non-SWD extension filter: .jpg, .mp3, etc.
- Protocol filter: file://, urn:, etc.
- Namespace of RDF resources in SWD
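
The extension and protocol filters listed on this slide could look roughly like the following Python sketch. The function name `is_candidate_swd_url` and the exact contents of the filter sets are assumptions; the slide gives only a few sample entries and ends each list with "etc.".

```python
import posixpath
from urllib.parse import urlparse

# Sample entries taken from the slide; a real filter list would be longer.
NON_SWD_EXTENSIONS = {".jpg", ".mp3"}
REJECTED_SCHEMES = {"file", "urn"}


def is_candidate_swd_url(url):
    """Return True if the URL passes the protocol and extension filters."""
    parsed = urlparse(url)
    if parsed.scheme.lower() in REJECTED_SCHEMES:
        return False
    # splitext only looks at the last path segment, so "/v1.0/people" has no extension.
    extension = posixpath.splitext(parsed.path.lower())[1]
    return extension not in NON_SWD_EXTENSIONS


candidates = [
    "http://example.org/onto/time.owl",
    "http://example.org/photos/me.JPG",
    "urn:isbn:0451450523",
    "file:///tmp/data.rdf",
    "http://example.org/foaf",
]
kept = [u for u in candidates if is_candidate_swd_url(u)]
```

A filter like this only discards unlikely URLs cheaply; whether a surviving URL is really an SWD still has to be decided by fetching and parsing it.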

Metadata Creation
Document metadata
- General metadata
- SWD metadata
- Ontology metadata
Term metadata (definition)
- Class property
- (Instance) property, i.e. class-property bond
Relational metadata

            Term                             Document
Term        rdfs:subClassOf, rdfs:domain…    rdfs:seeAlso, …
Document    Uses, Defines, …                 owl:imports, …

Ontology Ratio
Why?
- The distinction between ontology and instance documents is fuzzy
Given a SWD foo, let
- C(foo): the set of classes defined in foo
- P(foo): the set of properties defined in foo
- I(foo): the set of instances defined in foo
Ontology Ratio as a heuristic to do the classification
- 0: pure SWI
- 1: pure SWO
- > 0.8: foo is said to be an ontology.
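
The slide's own formula did not survive in this transcript. One ratio consistent with the stated endpoints (0 for a pure SWI, 1 for a pure SWO) is (|C| + |P|) / (|C| + |P| + |I|). The Python sketch below uses that form as an assumption; the function names are hypothetical.

```python
def ontology_ratio(classes, properties, instances):
    """Fraction of defined resources that are terms (classes or properties).

    Assumed form: (|C| + |P|) / (|C| + |P| + |I|).
    """
    defined_terms = len(classes) + len(properties)
    total = defined_terms + len(instances)
    if total == 0:
        raise ValueError("document defines no classes, properties, or instances")
    return defined_terms / total


def is_ontology(ratio, threshold=0.8):
    """Apply the slide's heuristic: a ratio above 0.8 counts as an ontology."""
    return ratio > threshold


# A hypothetical document defining 4 classes, 5 properties, and 1 instance.
ratio = ontology_ratio(
    classes={"c1", "c2", "c3", "c4"},
    properties={"p1", "p2", "p3", "p4", "p5"},
    instances={"i1"},
)
```

Here the ratio is 9/10, so the document would be treated as an ontology even though it also contains one instance.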

Relational Metadata
Inter-document relation
- rdfs:seeAlso
- IMport (IM), e.g. owl:imports
- Similar/Equal SWD
Inter-term relation
- EXtension (EX), e.g. rdfs:subClassOf
- use-TerM (TM), e.g. rdfs:range
- use-INdividual (IN), e.g. owl:sameAs
- Prior Version (PV, IPV, CPV)
Generalized inter-document relations
- Generalized from individual-level relations
- Capture more relations with less complexity
Usage
- Link SWDs
- Ontology rank
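
As a rough illustration of how term-level relations might be generalized into document-level links, here is a Python sketch. It is an assumption-laden toy, not Swoogle's method: the predicate table holds only the four examples named on the slide, and `document_links`, `defined_in`, and the document identifiers are hypothetical.

```python
# Example predicates from the slide mapped to relation codes.
PREDICATE_TO_RELATION = {
    "owl:imports": "IM",
    "rdfs:subClassOf": "EX",
    "rdfs:range": "TM",
    "owl:sameAs": "IN",
}


def document_links(source_doc, triples, defined_in):
    """Lift triples found in source_doc to (source, relation, target) document links.

    defined_in maps a term or individual to the document assumed to define it.
    """
    links = set()
    for _subject, predicate, obj in triples:
        relation = PREDICATE_TO_RELATION.get(predicate)
        if relation is None:
            continue
        # owl:imports already names a document; other objects are looked up.
        target = obj if relation == "IM" else defined_in.get(obj)
        if target is not None and target != source_doc:
            links.add((source_doc, relation, target))
    return links


defined_in = {"foaf:Person": "foaf-doc", "wordNet:Agent": "wordnet-doc"}
links = document_links(
    "foaf-doc",
    [("foaf:Person", "rdfs:subClassOf", "wordNet:Agent")],
    defined_in,
)
```

Many term-level triples between the same two documents collapse into one typed link, which matches the slide's point about capturing relations with less complexity.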

3. Data analysis: Ranking SWD
[Figure. Labels: SWOs, SWIs, HTML documents, Images, Audio files, Video files.]
Why?
- Ranking captures page importance and popularity
- Ranking has been proven useful in HTML search.
- SWD is different from HTML and has more semantics
- So, a new SWD ranking mechanism is needed!
Related ideas?
- Google’s PageRank
- Kleinberg’s HITS

Random surfer model (PageRank)
How is PageRank computed?
- Page A’s rank is
  $$PR(A) = (1 - d) + d \sum_{i} \frac{PR(T_i)}{C(T_i)}$$
- where {T_i} are the pages that link to A, C(X) is the number of page X’s out-links, and d is a damping factor (e.g., 0.85)
- Compute by iterating until convergence
Following any link with uniform probability is the convention on the Web, but not in the SW
- Links have semantics that influence the probability of following them
- Rational users read an ontology and all ontologies it references.
[Flowchart: read page; bored? yes: jump to a random page; no: follow a random link.]
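
The iteration described on this slide can be sketched in Python using the standard PageRank update PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over the pages T that link to A. The function name, the dictionary-of-sets graph representation, and the convergence tolerance are assumptions.

```python
def pagerank(out_links, d=0.85, tol=1e-9, max_iter=1000):
    """Iterate the PageRank update until ranks change by less than tol.

    out_links maps each page to the set of pages it links to.
    """
    pages = set(out_links) | {t for targets in out_links.values() for t in targets}
    rank = {page: 1.0 for page in pages}
    for _ in range(max_iter):
        new_rank = {}
        for page in pages:
            # Each linking page passes on its rank divided by its out-degree C(T).
            incoming = sum(
                rank[src] / len(targets)
                for src, targets in out_links.items()
                if page in targets
            )
            new_rank[page] = (1 - d) + d * incoming
        delta = max(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Two hypothetical pages that both link to a third.
ranks = pagerank({"A": {"C"}, "B": {"C"}})
```

Every link out of a page gets the same share here, which is exactly the uniform-probability assumption the slide argues does not fit the Semantic Web.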

Rational Random Surfer Model
- Weighted random behavior
- Rational behavior
  - Rank of a SWI
  - Rank of a SWO
  where TC(A) is the transitive closure of SWOs referencing A
[Flowchart: read page; SWO? yes: read referenced SWOs; bored? yes: jump to a random page; no: follow a random link.]

Ontology Rank Example
[Figure. Documents linked by typed relations. Triples shown include: a resource with rdf:type foaf:Person and a foaf:mbox value; wordNet:Person rdf:type rdfs:Class; an rdfs:subClassOf link between foaf:Person and wordNet:Person; rdfs:subClassOf rdf:type rdf:Property; wordNet:Person rdfs:subClassOf wordNet:Individual. Link labels: TM, EX.]

Ontology Rank Example (cont’d)
[Figure. Four documents linked by EX and TM relations, each annotated with its raw and final rank:]

rawPR    0.2    100    3      300
PR       0.2    100    103    403
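
The numbers in this example are consistent with a simple reading of the rational surfer model: an SWI keeps its raw rank, while an SWO's rank is its own raw rank plus the raw ranks of all SWOs that reference it, directly or transitively. The Python sketch below reproduces the figures under that reading. The link structure, the document names, and the function `ontology_rank` are assumptions, since the figure's edges are not preserved in this transcript.

```python
def ontology_rank(raw_pr, references, is_swo):
    """Aggregate raw ranks: PR(A) = sum of rawPR over TC(A) for SWOs, rawPR(A) for SWIs.

    references maps a document to the set of documents it references.
    TC(A) is taken to include A itself plus every SWO that reaches A.
    """
    referenced_by = {doc: set() for doc in raw_pr}
    for src, targets in references.items():
        for dst in targets:
            referenced_by[dst].add(src)

    ranks = {}
    for doc in raw_pr:
        if not is_swo[doc]:
            ranks[doc] = raw_pr[doc]
            continue
        closure, stack = {doc}, [doc]
        while stack:
            current = stack.pop()
            for src in referenced_by[current]:
                # Only SWOs contribute to the closure in this sketch.
                if is_swo[src] and src not in closure:
                    closure.add(src)
                    stack.append(src)
        ranks[doc] = sum(raw_pr[x] for x in closure)
    return ranks


# Hypothetical chain: swi -> onto_a -> onto_b -> onto_c.
raw_pr = {"swi": 0.2, "onto_a": 100.0, "onto_b": 3.0, "onto_c": 300.0}
references = {"swi": {"onto_a"}, "onto_a": {"onto_b"}, "onto_b": {"onto_c"}}
is_swo = {"swi": False, "onto_a": True, "onto_b": True, "onto_c": True}
final = ontology_rank(raw_pr, references, is_swo)
```

Under these assumptions the most widely reused ontology accumulates the rank of everything built on top of it, which gives 103 and 403 for the last two documents.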

Current Status
Swoogle Watch reported (Nov 7, 2004)
- 40 M triples
- 270 K SWDs: 4k ontologies
- 144 K terms: 91K classes & 51K properties
Ongoing work
- Ontology Dictionary
- Swoogle Statistics
- Web Service interface (see Swoogle website)
- IR with the Semantic Web (content search)
  - Character N-Grams
  - Bag of URIrefs
  - Swangling

Summary
Swoogle (Mar, 2004)
- Automated SWD discovery
- SWD metadata creation and search
- Ontology rank (rational surfer model)
- Swoogle watch
- Web interface
Swoogle2 (Sep, 2004)
- Ontology dictionary
- Swoogle statistics
- Web service interface (WSDL)
- Bag of URIref IR search
Swoogle3
- Better crawl & refresh strategies
- More metadata (ontology mapping)
- More IR features
- Better web service interfaces
- Capture and store all triples
- More reasoning

The End
Website:
Slides at:
Demo: