UMBC an Honors University in Maryland 1 Finding and Ranking Knowledge on the Semantic Web Li Ding, Rong Pan, Tim Finin, Anupam Joshi, Yun Peng and Pranam.

Presentation transcript:

UMBC an Honors University in Maryland 1 Finding and Ranking Knowledge on the Semantic Web Li Ding, Rong Pan, Tim Finin, Anupam Joshi, Yun Peng and Pranam Kolari University of Maryland, Baltimore County  This work was partially supported by DARPA contract F , NSF grants CCR and IIS and grants from IBM, Fujitsu and HP.

This talk
– Motivation
– Swoogle overview
– Bots navigate the Semantic Web
– Ranking Semantic Web content
– Use cases and applications
– Conclusions

Google has made us smarter

But what about our agents?
[Figure: agents exchanging messages, with labels "tell" and "register"]
A Google for knowledge on the Semantic Web is needed by people and software agents.



Swoogle Architecture
[Figure: architecture diagram. Stages: SWD discovery, metadata creation, data analysis, interface. Components: The Web, Web Crawler, Candidate URLs, SWD Reader, SWD Cache, SWD Metadata, IR analyzer, SWD analyzer, Web Server, Web Service, Agent Service.]
Swoogle 2 (4/05): 340K SWDs, 48M triples, 5K SWOs, 97K classes, 55K properties, 7M individuals
Swoogle 3 (11/05): 700K SWDs, 135M triples, 7.7K SWOs

Find “Time” Ontology. We can use a set of keywords to search for an ontology. For example, “time, before, after” are basic concepts for a “Time” ontology. Demo 1

Digest “Time” Ontology (document view) Demo 2(a)

Digest “Time” Ontology (term view) Demo 2(b)
[Screenshot: term list including TimeZone, before, intAfter, …]

Find Term “Person” Demo 3 Not capitalized! URIref is case sensitive!

Digest Term “Person” Demo different properties 562 different properties

Demo 5(a) Swoogle Today

Demo 5(b) Swoogle Statistics
[Chart labels: FOAF, Trustix, W3C, Stanford]

Swoogle’s Triple Store lets you shop and check out your triples into any of several reasoners.

Summary
Versions: Swoogle (Mar 2004), Swoogle2 (Sep 2004), Swoogle3 (July 2005)
Features, in slide order:
– Automated SWD discovery
– SWD metadata creation and search
– Ontology rank (rational surfer model)
– Swoogle watch
– Web interface
– Ontology dictionary
– Swoogle statistics
– Web service interface (WSDL)
– Bag of URIref IR search
– Triple shopping cart
– Better (re-)crawling strategies
– Better navigation models
– Index instance data
– More metadata (ontology mapping and OWL-S services)
– Better web service interfaces
– IR component for string literals


The Semantic Web Onion
[Figure: nested layers, listed here from outermost to innermost]
– Universal RDF Graph: the “Semantic Web” (about 10M documents)
– RDF Document: physically hosting knowledge (about 100 triples per SWD on average)
– Class-instance: triples modifying the same subject
– Molecule: finest lossless set of triples
– Triple: atomic knowledge block
– Resource, Literal
Swoogle maintains metadata about objects in different layers of the Semantic Web Onion.

Semantic Web Navigation Model
[Figure: navigation diagram. Nodes: RDF graph, Resource, Web, literal, SWT, SWD, SWO. Entry points: Term Search, Document Search. Link labels: uses, populates, defines, officialOnto, isDefinedBy, owl:imports, rdfs:seeAlso, rdfs:isDefinedBy, isUsedBy, isPopulatedBy, rdfs:subClassOf, sameNamespace, sameLocalname, extends, class-property bond.]
Navigating the HTML web is simple; there’s just one kind of link. The SW has more kinds of links and hence more navigation paths.

Semantic Web Navigation Model (continued)
[Figure: the same navigation diagram as on the previous slide]
Relations in 1 and 3 and parts of 4 require a global view to discover.

An Example
[Figure: RDF graph with nodes foaf:Person, foaf:Agent, foaf:mbox, owl:Class, owl:Thing and owl:InverseFunctionalProperty, and edges labeled rdfs:subClassOf, rdf:type, rdfs:domain, rdfs:range, rdfs:seeAlso and owl:imports]
We navigate the Semantic Web via links in the physical layer of RDF documents and also via links in the “logical” layer defined by the semantics of RDF and OWL.
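To make the two kinds of links concrete, here is a minimal Python sketch of how a bot might collect candidate URLs from the triples of one document. It is an illustration under stated assumptions, not Swoogle's crawler: the function names, the choice of predicates to follow, and the namespace-splitting rule are all hypothetical, and the sample triples are made up.

```python
# Hypothetical sketch: collect URLs worth crawling from one document's triples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Document-level links a bot can follow directly (assumed selection).
FOLLOW = {
    "http://www.w3.org/2002/07/owl#imports",
    "http://www.w3.org/2000/01/rdf-schema#seeAlso",
    "http://www.w3.org/2000/01/rdf-schema#isDefinedBy",
}


def namespace_of(uri):
    """Guess the document that defines a term by cutting at '#' or the last '/'."""
    if "#" in uri:
        return uri.split("#", 1)[0]
    return uri.rsplit("/", 1)[0] + "/"


def candidate_urls(triples):
    """triples: iterable of (subject, predicate, object) strings."""
    urls = set()
    for _subject, predicate, obj in triples:
        # Explicit document-to-document links.
        if predicate in FOLLOW and obj.startswith("http"):
            urls.add(obj)
        # Links implied by term usage: every property has a defining namespace.
        urls.add(namespace_of(predicate))
        # Classes used via rdf:type also point at a defining namespace.
        if predicate == RDF_TYPE and obj.startswith("http"):
            urls.add(namespace_of(obj))
    return urls


# Made-up sample data for illustration only.
sample = [
    ("http://example.org/tim.rdf#me", RDF_TYPE, "http://xmlns.com/foaf/0.1/Person"),
    ("http://example.org/tim.rdf#me",
     "http://www.w3.org/2000/01/rdf-schema#seeAlso",
     "http://example.org/friends.rdf"),
]
found = candidate_urls(sample)
```

The explicit links come straight from the document, while the namespace links only become visible once the terms are interpreted.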


Rank has its privilege
Google introduced a new approach to ranking query results using a simple “popularity” metric.
– It was a big improvement!
Swoogle also ranks its query results
– When searching for an ontology, class or property, wouldn’t one want to see the most used ones first?
Ranking SW content requires different algorithms for different kinds of SW objects
– For SWDs, SWTs, individuals, “assertions”, molecules, etc.

Google’s PageRank
A page’s rank is a function of how many links point to it and the rank of the pages hosting those links.
The “random surfer” model provides the intuition:
(1) Jump to a random page
(2) Select and follow a random link on the page and repeat until ‘bored’
(3) If bored, go to (1)
Pages are ranked by the relative frequency with which they are visited.
[Figure: flowchart of the loop: jump to a random page, follow a random link, bored? (no: follow another link; yes: jump again)]
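The random surfer loop can be approximated numerically with the standard power iteration. The sketch below is a generic textbook formulation, not code from the talk; the damping value 0.85 (the probability of not getting "bored"), the iteration count and the toy link graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to.

    damping is the assumed probability that the surfer follows a link
    instead of jumping to a random page.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # A bored surfer lands on any page with equal probability.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Otherwise the surfer follows one of the page's links at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links forces a random jump.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph (made up): page "c" is linked from three pages, "d" from none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy graph the most linked-to page, "c", ends up with the highest visit frequency and the unlinked page "d" with the lowest.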

Ranking Semantic Web Documents
Target: a pure SW dataset
– Nodes: a collection of online SWDs (330K SWDs, 1.5% are labeled as ontologies)
– Links: in addition to hyperlinks, term-level relations are generalized into TM, EX, IM
Rational surfer model (extension of weighted PageRank)
– Semantic content (term-level relations) is encoded into links
– The rank of a node is iteratively spread via links
– The weight/capacity of a link varies according to link semantics
– Weight is propagated to imported ontologies
Evaluation
– Method: compare OntoRank with PageRank at promoting ontologies, using the same pure SW dataset
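One way to read "weight of link varies according to link semantics" is a PageRank variant in which a node splits its rank among outgoing links in proportion to a per-type weight. The sketch below illustrates only that idea. The weight values, the damping factor and the sample graph are hypothetical; the slides do not give Swoogle's actual parameters.

```python
# Hypothetical per-type link weights; the talk does not state the real values.
LINK_WEIGHTS = {"IM": 3.0, "EX": 2.0, "TM": 1.0}


def weighted_pagerank(links, damping=0.85, iterations=50):
    """links: list of (source, target, link_type) with link_type in LINK_WEIGHTS."""
    nodes = sorted({node for s, t, _ in links for node in (s, t)})
    n = len(nodes)
    # Total outgoing weight per node, used to normalize each link's share.
    out_weight = {node: 0.0 for node in nodes}
    for source, _target, kind in links:
        out_weight[source] += LINK_WEIGHTS[kind]
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        # Rank flows along each link in proportion to the link's weight.
        for source, target, kind in links:
            new_rank[target] += damping * rank[source] * LINK_WEIGHTS[kind] / out_weight[source]
        # Nodes without outgoing links spread their rank uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        for node in nodes:
            new_rank[node] += damping * dangling / n
        rank = new_rank
    return rank


# Made-up sample: two instance documents and two ontologies.
sample_links = [
    ("doc1", "ontoA", "IM"),
    ("doc1", "ontoB", "TM"),
    ("doc2", "ontoA", "TM"),
    ("ontoB", "ontoA", "EX"),
]
wpr = weighted_pagerank(sample_links)
```

With these assumed weights, doc1 passes three quarters of its outgoing rank to ontoA and one quarter to ontoB.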

An Example
[Figure: four documents connected by EX and TM links]
wPR: 0.2, 100, 3, 300
OntoRank: 0.2, 100, 103, 403
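The numbers in this example are consistent with a rule in which a document's OntoRank is its own wPR plus the wPR of every document that reaches it, directly or transitively, through EX or TM links. The sketch below assumes that rule and a chain-shaped graph; the edge layout and the document names d1 to d4 are guesses, since the figure is not available in the transcript.

```python
def onto_rank(wpr, links):
    """wpr: {doc: weighted PageRank}; links: (source, target) pairs.

    Assumed rule: OntoRank(d) = wPR(d) + sum of wPR over all documents
    that reach d by following links transitively.
    """
    incoming = {}
    for source, target in links:
        incoming.setdefault(target, set()).add(source)
    result = {}
    for doc in wpr:
        seen = set()
        stack = list(incoming.get(doc, ()))
        while stack:
            other = stack.pop()
            if other in seen or other == doc:
                continue
            seen.add(other)
            stack.extend(incoming.get(other, ()))
        result[doc] = wpr[doc] + sum(wpr[other] for other in seen)
    return result


# wPR values from the slide; the two edges are an assumed layout.
slide_wpr = {"d1": 0.2, "d2": 100, "d3": 3, "d4": 300}
assumed_links = [("d2", "d3"), ("d3", "d4")]
scores = onto_rank(slide_wpr, assumed_links)
```

Under this assumed layout the result matches the slide: d3 gets 3 + 100 = 103 and d4 gets 300 + 3 + 100 = 403, while d1 and d2 keep their wPR.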

Ontology Dictionary
Motivation
– One ontology does not always provide all the needed vocabulary
– Many scenarios require assembling terms from multiple ontologies
DIY ontology engineering
1. Search for an appropriate class C
2. Search for popular properties used for modifying C’s class instances
3. Go back to step 1 if more classes are needed

Ranking Semantic Web Terms
Pr(Term|Doc) can be measured by the normalized value of the product of the term’s
– Popularity: how many SWDs are using the term
– Frequency: how many times the term is used in the SWD
SWDs are accessed non-uniformly, as modeled by OntoRank
TermRank estimates a term’s importance as ∑ Pr(Term|Doc) * OntoRank(Doc)
Evaluation
– Compare TermRank with term popularity for the 10 highest-rated terms, with an analytical evaluation
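A small worked version of the TermRank formula may help. The sketch below assumes that "normalized" means each document's popularity-times-frequency products are scaled to sum to 1 within that document; the slides do not spell this out. The usage counts and OntoRank values are made up.

```python
def term_rank(usage, onto_rank):
    """usage: {doc: {term: frequency in doc}}; onto_rank: {doc: score}.

    Returns {term: sum over docs of Pr(term|doc) * OntoRank(doc)}.
    """
    # Popularity: number of documents that use the term.
    popularity = {}
    for counts in usage.values():
        for term in counts:
            popularity[term] = popularity.get(term, 0) + 1
    scores = {}
    for doc, counts in usage.items():
        raw = {term: popularity[term] * freq for term, freq in counts.items()}
        total = sum(raw.values())
        if total == 0:
            continue
        for term, weight in raw.items():
            # Assumed normalization: Pr(term|doc) sums to 1 within each document.
            scores[term] = scores.get(term, 0.0) + (weight / total) * onto_rank[doc]
    return scores


# Made-up sample data for illustration only.
sample_usage = {
    "swd1": {"foaf:Person": 2, "foaf:name": 2},
    "swd2": {"foaf:Person": 1, "dc:title": 1},
}
sample_onto_rank = {"swd1": 10.0, "swd2": 4.0}
term_scores = term_rank(sample_usage, sample_onto_rank)
```

Because each document distributes its whole OntoRank among its terms, the term scores here add up to the total OntoRank of 14, and foaf:Person, used in both documents, ranks highest.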

Class-Property Bonds (example: foaf:Person)
Class definition
– rdfs:subClassOf: foaf:Agent
– rdfs:label: “Person”
Class-property bonds introduced by instances
– foaf:name
– dc:title
Class-property bonds introduced by ontology
– foaf:mbox
– foaf:name
[Figure: triples about foaf:Person drawn from three documents (SWD1, SWD2, SWD3). Labels: rdf:type, owl:Class, rdfs:comment “a human being”, rdfs:subClassOf, foaf:Agent, rdfs:domain, foaf:mbox, foaf:name “Tim Finin”, dc:title “Tim’s FOAF File”.]
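Instance-introduced bonds can be counted by pairing each instance's rdf:type with the other properties used on that instance. The sketch below is a simplified illustration, not Swoogle's implementation: it uses prefixed names as plain strings, ignores ontology-introduced bonds (rdfs:domain), and runs on made-up triples.

```python
from collections import Counter

RDF_TYPE = "rdf:type"


def instance_bonds(triples):
    """Count (class, property) pairs observed on typed instances.

    triples: iterable of (subject, predicate, object) strings.
    """
    # First pass: record the classes each subject is declared to belong to.
    types = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types.setdefault(subject, set()).add(obj)
    # Second pass: every other property on a typed subject bonds to its classes.
    bonds = Counter()
    for subject, predicate, _obj in triples:
        if predicate != RDF_TYPE:
            for cls in types.get(subject, ()):
                bonds[(cls, predicate)] += 1
    return bonds


# Made-up instance data echoing the slide's labels.
sample_triples = [
    ("_:p1", "rdf:type", "foaf:Person"),
    ("_:p1", "foaf:name", "Tim Finin"),
    ("_:p1", "dc:title", "Tim's FOAF File"),
    ("_:p2", "rdf:type", "foaf:Person"),
    ("_:p2", "foaf:name", "Another Person"),
]
bond_counts = instance_bonds(sample_triples)
```

Sorting the counts for a given class gives the "popular properties" needed in step 2 of the DIY ontology engineering procedure.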


Supporting Semantic Web Developers
Finding SW content
– Ontologies, classes, properties, molecules, triples, partial ontology mappings, authoritative copies
– Ad hoc data collection
Exploring how the SW is being used, e.g.
– Computing basic statistics
– Ranking properties used with foaf:person
And misused
– Finding common typos

Applications and use cases
Supporting Semantic Web developers, e.g.,
– Ontology designers
– Vocabulary discovery
– Who’s using my ontologies or data?
– Etc.
Searching specialized collections, e.g.,
– Proofs in Inference Web
– Text Meaning Representations of news stories in SemNews
Supporting SW tools, e.g.,
– Discovering mappings between ontologies



Will it Scale? How?
Here’s a rough estimate of the data in RDF documents on the semantic web based on Swoogle’s crawling
[Table: … marks digits or labels missing from the transcript]
System/date   Terms     Documents   Individuals   Triples   Bytes
Swoogle2      1.5x…     …x10^5      7x10^6        5x10^7    7x10^9
Swoogle3      2x10^5    7x…         …             …x10^7    1x…
…             …x10^5    5x10^6      5x10^7        5x10^8    5x…
…             …x10^5    5x10^7      5x10^8        5x10^9    5x10^11
We think Swoogle’s centralized approach can be made to work for the next few years if not longer.

How much reasoning?
SwoogleN (N<=3) does limited reasoning
– It’s expensive
– It’s not clear how much should be done
More reasoning would benefit many use cases
– e.g., type hierarchy
Recognizing specialized metadata
– e.g., that ontology A maps some terms from B to C

Conclusion
The web will contain the world’s knowledge in forms accessible to people and computers
– We need better ways to discover, index, search and reason over SW knowledge
SW search engines address different tasks than HTML search engines
– So they require different techniques and APIs
Swoogle-like systems can help create consensus ontologies and foster best practices

Annotated in OWL. For more information