Swoogle – Semantic Search Engine
Web-enhanced Information Management
Bin Wang

Outline
- Background Introduction
  - Semantic Web
  - Semantic Search
- Swoogle – Semantic Search Engine
  - Swoogle Architecture
  - Semantic Web documents
  - Finding SWDs
  - Ranking SWDs
  - Swoogle Indexing and Retrieval
- Conclusion

Background Introduction
- What is the Semantic Web?
  - An evolving extension of the WWW.
  - The semantics of information and services on the web are well defined.
  - This makes it possible for the web to understand and satisfy requests from people and machines to use web content.

Background Introduction
- What is the Semantic Web?
  [Figure: The Semantic Web Layers]

Background Introduction
- What is Semantic Search?
  - A set of techniques for managing documents, in particular semantically supported document retrieval.
  - Search takes two forms, navigational search and research search; Semantic Search belongs to the second category.
  - It attempts to augment and improve traditional search results by using data from the Semantic Web.

Swoogle – Semantic Search Engine
- Swoogle is a crawler-based indexing and retrieval system for the Semantic Web, that is, for RDF and OWL documents encoded in XML and N3.
- It automatically discovers Semantic Web documents (SWDs), indexes their metadata, and answers queries about them.
- SWDs are characterized by semantic annotation and meaningful references to other SWDs; conventional search engines do not take advantage of these features.

Swoogle Search Interface
[Figure: Swoogle search interface, developed by UMBC]

What Swoogle can do
- Finding appropriate ontologies
  - Users can query for ontologies that contain specified terms anywhere in the document.
  - The ontologies returned are ranked by the Ontology Rank algorithm.
- Finding instance data
  - It helps users integrate data distributed across the web.
- Characterizing the Semantic Web
  - It reveals interesting structural properties of the Semantic Web by extracting metadata, especially inter-document relations.

Swoogle Architecture
- Four main components: SWD discovery, metadata creation, data analysis, and interface.

Swoogle Architecture
- SWD discovery component
  - Discovers potential SWDs throughout the web.
  - Keeps information about SWDs up to date.
- Metadata creation component
  - Generates objective metadata about SWDs at both the syntactic level and the semantic level.
- Data analysis component
  - Derives analytical reports, such as the classification of SWOs and SWDBs, the rank of SWDs, and the IR index of SWDs.
- Interface component

Semantic Web Documents (SWDs)
- A SWD is a document written in a Semantic Web language (one based on RDF, e.g. RDFS, DAML+OIL, or OWL) that is online and accessible to web users and software agents.
- There are two kinds of SWDs:
  - SWOs (Semantic Web Ontologies)
  - SWDBs (Semantic Web Databases)

Semantic Web Documents (SWDs)
- SWO (Semantic Web Ontology)
  - A SWD in which a significant proportion of the statements define new terms or extend the definitions of terms defined in other SWDs.
- SWDB (Semantic Web Database)
  - A SWD that does not define or extend a significant number of terms.
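The SWO/SWDB distinction above can be illustrated with a small sketch. It is a simplification under stated assumptions, not Swoogle's actual classifier: triples are plain `(subject, predicate, object)` tuples with prefixed names, the sets `TERM_CLASSES` and `EXTENDING_PREDICATES` are hypothetical choices of what counts as defining or extending a term, and the `threshold` of 0.5 is an arbitrary stand-in for "significant proportion".

```python
RDF_TYPE = "rdf:type"

# Assumption: typing a resource with one of these classes counts as defining a term.
TERM_CLASSES = {
    "rdfs:Class", "owl:Class", "rdf:Property",
    "owl:ObjectProperty", "owl:DatatypeProperty",
}

# Assumption: these predicates count as extending the definition of a term.
EXTENDING_PREDICATES = {
    "rdfs:subClassOf", "rdfs:subPropertyOf", "rdfs:domain", "rdfs:range",
}


def term_statement_ratio(triples):
    """Return the fraction of statements that define or extend a term."""
    if not triples:
        return 0.0
    hits = 0
    for _subject, predicate, obj in triples:
        defines = predicate == RDF_TYPE and obj in TERM_CLASSES
        extends = predicate in EXTENDING_PREDICATES
        if defines or extends:
            hits += 1
    return hits / len(triples)


def classify_swd(triples, threshold=0.5):
    """Label a document "SWO" or "SWDB"; the threshold is illustrative only."""
    return "SWO" if term_statement_ratio(triples) >= threshold else "SWDB"
```

A document made up mostly of class and property declarations would come out as an SWO, while one that only asserts facts about individuals would come out as an SWDB.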

Finding SWDs
- Develop a Google Crawler that searches for URLs using the Google Web Service.
  - Starts with file-type extensions that are good SWD indicators (e.g. .rdf, .owl, .daml, and .n3).
- Develop a Focused Crawler that crawls documents within a given website.
  - Only crawls URLs relative to the given base URL.
  - Invites the Semantic Web community to submit URLs.
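The two filters described above can be sketched with the standard library alone. This is a hedged illustration, not Swoogle's crawler code: the function names `looks_like_swd` and `within_base` are hypothetical, and "relative to the given base URL" is interpreted here as "same host and under the base URL's directory".

```python
from urllib.parse import urljoin, urlparse

# File extensions the slide lists as good SWD indicators.
SWD_EXTENSIONS = (".rdf", ".owl", ".daml", ".n3")


def looks_like_swd(url):
    """Return True if the URL path ends with a typical SWD extension."""
    path = urlparse(url).path.lower()
    return path.endswith(SWD_EXTENSIONS)


def within_base(base_url, link):
    """Resolve link against base_url and check that it stays under the base.

    Assumption: "under the base" means same host and same directory prefix.
    """
    absolute = urljoin(base_url, link)
    base, target = urlparse(base_url), urlparse(absolute)
    base_dir = base.path.rsplit("/", 1)[0] + "/"
    return target.netloc == base.netloc and target.path.startswith(base_dir)
```

A focused crawl would then follow only the links for which `within_base` holds, and could give priority to those for which `looks_like_swd` holds.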

Finding SWDs
- Develop the Jena2-based Swoogle Crawler.
  - It verifies whether a document is a SWD.
  - It revisits discovered URLs to check for updates.
- Some heuristics are used to discover new SWDs through semantic relations:
  - A URIref is highly likely to be the URL of a SWD.
  - owl:imports links to an external ontology, which is a SWD.
  - etc.
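The two heuristics named above can be sketched as follows. This is an illustration under assumptions, not the Jena2-based crawler itself: triples are plain tuples of strings, any term starting with `http://` or `https://` is treated as a URIref, and the function name `candidate_swd_urls` and the reason labels are hypothetical.

```python
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"
HTTP_PREFIXES = ("http://", "https://")


def strip_fragment(uri):
    """Drop the #fragment so a term URI points at its defining document."""
    return uri.split("#", 1)[0]


def candidate_swd_urls(triples, source_url):
    """Map candidate document URLs to the heuristic that suggested them.

    "uriref":  the URL appeared as (the document part of) a URIref.
    "imports": the URL is the object of an owl:imports statement.
    """
    found = {}
    for subject, predicate, obj in triples:
        for term in (subject, predicate, obj):
            if term.startswith(HTTP_PREFIXES):
                found.setdefault(strip_fragment(term), "uriref")
        if predicate == OWL_IMPORTS:
            # An explicit import is the stronger signal, so it overrides "uriref".
            found[strip_fragment(obj)] = "imports"
    # The document being parsed is already known, so it is not a new candidate.
    found.pop(strip_fragment(source_url), None)
    return found
```

Each candidate would still need to be fetched and parsed to confirm that it really is a SWD, as the verification step on the slide describes.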

SWD Metadata
- Metadata is collected to make SWD search more efficient and effective.
- It is derived from the content of SWDs as well as from the relations among SWDs.
- Swoogle identifies three categories of metadata:
  - Basic metadata
  - Relations
  - Analytical results, such as SWO/SWDB classification and SWD ranking

SWD Metadata
- 1. Basic metadata: the syntactic and semantic features of a SWD.
  - Language feature: properties describing the syntactic or semantic features of a SWD.
  - RDF statistics: properties summarizing the node distribution of the RDF graph of a SWD.
  - Ontology annotation: properties that describe a SWD as an ontology.

SWD Metadata
- 2. Relations among SWDs: Swoogle focuses on SWD-level relations, which generalize RDF node-level relations. The following relations are captured:
  - TM/IN: captures term reference relations between two SWDs.
  - IM: shows that an ontology imports another ontology.
  - EX: shows that an ontology extends another ontology.
  - PV: shows that an ontology is a prior version of another.
  - CPV: shows that an ontology is a prior version of, and is compatible with, another.
  - IPV: shows that an ontology is a prior version of, and is incompatible with, another.

Ranking SWDs
- Rational Random Surfer
  - A user arrives at a given page either by addressing it directly or by following one of the links pointing to it.
  - Different links may stand for different relations and therefore carry different weights.
[Figure: rational random surfer flowchart. Decision nodes: "bored?" and "SWO?". Actions: "Jump to a random page", "Follow a random link", "Explore all linked SWOs".]

Ranking SWDs
- Rational Random Surfer: raw rank
  - T(x): the set of SWDs that x links to
  - L(a): the set of SWDs that link to a
  - d: a damping factor, typically set to 0.85
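For reference, the symbols above fit a PageRank-style recurrence of roughly the following form. This is recalled from the Swoogle paper (Ding et al., 2004) rather than taken from the slide, so the helper notation f(x, a), links(x, a) and weight(l) should be treated as an assumption and checked against that paper.

```latex
\begin{align*}
\mathrm{rawPR}(a) &= (1 - d) + d \sum_{x \in L(a)} \mathrm{rawPR}(x)\,\frac{f(x,a)}{f(x)} \\
f(x,a) &= \sum_{l \in \mathrm{links}(x,a)} \mathrm{weight}(l) \\
f(x) &= \sum_{a' \in T(x)} f(x,a')
\end{align*}
```

Here links(x, a) stands for the set of semantic links from x to a, and weight(l) depends on the relation a link represents, which is how different relations come to carry different weights.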

Ranking SWDs
- Rational Random Surfer: final rank
  - TC(a) is the transitive closure of SWOs imported by a.
  - Swoogle computes the rank of SWDBs using the first formula and the rank of SWOs using the second.
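The two-stage ranking can be sketched in a few lines. This is a hedged reading of the slides, not Swoogle's implementation: the raw rank is assumed to be a weighted PageRank-style iteration with damping factor d, the final rank of an SWDB is assumed to be its raw rank, and the final rank of an SWO is assumed to be the sum of raw ranks over TC(a). Whether TC(a) includes a itself is not stated on the slide; the sketch includes it. The link weights passed in are hypothetical.

```python
def raw_rank(links, d=0.85, iterations=50):
    """Iterate a weighted PageRank-style raw rank.

    links maps (source, target) to the combined weight of the links
    from source to target; weights must be positive.
    """
    nodes = {node for pair in links for node in pair}
    out_weight = {node: 0.0 for node in nodes}
    for (source, _target), weight in links.items():
        out_weight[source] += weight

    rank = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        updated = {node: 1.0 - d for node in nodes}
        for (source, target), weight in links.items():
            # Each source passes on its rank in proportion to link weight.
            updated[target] += d * rank[source] * weight / out_weight[source]
        rank = updated
    return rank


def import_closure(swo, imports):
    """Return swo plus every SWO reachable from it through imports."""
    seen, stack = {swo}, [swo]
    while stack:
        for imported in imports.get(stack.pop(), ()):
            if imported not in seen:
                seen.add(imported)
                stack.append(imported)
    return seen


def final_rank(doc, is_swo, raw, imports):
    """SWDBs keep their raw rank; SWOs sum raw rank over their import closure."""
    if not is_swo:
        return raw[doc]
    return sum(raw[x] for x in import_closure(doc, imports))
```

With two data documents linking to one ontology that in turn imports a base ontology, the importing ontology's final rank would be its own raw rank plus that of the base ontology.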

Swoogle Indexing and Retrieval
- Swoogle adapts Sire, a custom indexing and retrieval engine:
  - It employs a TF/IDF model with a standard cosine similarity metric.
  - It indexes discovered documents using either character n-grams or URIrefs as keywords, both to find relevant documents and to compute the similarity among a set of documents.
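A minimal sketch of TF/IDF weighting over character n-grams with cosine similarity is shown below. It illustrates the general technique only and is not Sire's code: the n-gram length of 3, the `log(N / df) + 1` form of IDF, and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Split text into overlapping, lowercased character n-grams."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n=3):
    """Build one sparse TF/IDF vector (a dict) per document."""
    counts = [Counter(char_ngrams(doc, n)) for doc in docs]
    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter(gram for count in counts for gram in count)
    total = len(docs)
    return [
        {gram: tf * (math.log(total / doc_freq[gram]) + 1.0)
         for gram, tf in count.items()}
        for count in counts
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(gram, 0.0) for gram, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Swapping `char_ngrams` for a function that returns the URIrefs found in a document would give the second keyword scheme the slide mentions, with the rest of the pipeline unchanged.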

Conclusion
- Swoogle is a prototype crawler-based indexing and retrieval system for Semantic Web documents.
- One of the interesting properties computed for each SWD is its ontology rank. It is based on the rational random surfer model, which differs from the model used in conventional search engines.

References
- Li Ding, Tim Finin, Anupam Joshi, Rong Pan, R. Scott Cost, Yun Peng, Pavan Reddivari, Vishal Doshi, Joel Sachs. Swoogle: a search and metadata engine for the semantic web. Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, November 08-13, 2004, Washington, D.C., USA.
- R. Guha, Rob McCool, Eric Miller. Semantic search. Proceedings of the 12th International Conference on World Wide Web, May 20-24, 2003, Budapest, Hungary.
- Tim Berners-Lee, James Hendler, Ora Lassila. The Semantic Web. Scientific American Magazine, May 17, 2001.

Thank You!