Problems in Semantic Search Krishnamurthy Viswanathan and Varish Mulwad {krishna3, varish1} AT umbc DOT edu



Agenda
Introduction
Swoogle
Cool things others do
Swoogle facts/figures
Our ideas
References

Why is Semantic Search significant?

Swoogle
Swoogle is a search engine for Semantic Web (SW) documents
It offers the following services:
– Search SW ontologies and documents
– Search SW terms, i.e. URIs that have been defined as classes and properties
– Provide metadata of SW documents and support browsing the Semantic Web

Swoogle
Swoogle supports two relevant query types:
– Ontology: searches a small collection that consists only of Semantic Web ontologies
– Document: searches all SW documents; this search space is much larger
Swoogle indexes only the document’s URL, the terms defined in the document, explicit descriptions of the document, and the namespaces used by the document
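The four indexed fields listed above can be pictured as one record per document. The following is a minimal sketch only: the class name `SwdIndexEntry`, its field names, and the substring matching are assumptions made for illustration and do not describe Swoogle's actual Lucene schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SwdIndexEntry:
    """Hypothetical record holding the four kinds of data the slide lists."""
    url: str
    defined_terms: List[str] = field(default_factory=list)
    description: str = ""
    namespaces: List[str] = field(default_factory=list)

    def matches(self, keyword: str) -> bool:
        # Case-insensitive substring match over every indexed field.
        needle = keyword.lower()
        haystack = [self.url, self.description, *self.defined_terms, *self.namespaces]
        return any(needle in text.lower() for text in haystack)


entry = SwdIndexEntry(
    url="http://example.org/people.owl",
    defined_terms=["Person", "Organization"],
    description="A small ontology about people",
    namespaces=["http://xmlns.com/foaf/0.1/"],
)
print(entry.matches("person"))   # True
print(entry.matches("protein"))  # False
```

Because nothing else from the document body is stored in such a record, a keyword that appears only in instance data would not match, which is the limitation the slide points at.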

Swoogle capabilities
Web search:
– Basic metadata: e.g. url, desc, ns
– Document metadata: hasEncoding, hasLength, etc.
– RDF metadata: hasGrammar, hasCntTriple, etc.
Advanced search using Lucene features
REST-based services: compose an HTTP GET query and retrieve the results as RDF/XML

Examples of REST queries
A query is represented as a URL:
– REST_QUERY ::= SERVICE_URI ? PARAMS
Example: search SW documents that are classified as ontologies (ontoRatio > 0)
– queryType: e.g. search_swd_ontology
– searchString: user constructed (see manual)
– key
Resulting parameter string: queryType=search_swd_ontology&searchString=person&key=demo
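Composing such a query amounts to URL-encoding the three parameters and appending them to the service URI. A minimal sketch follows; `SERVICE_URI` is a placeholder because the transcript does not give the real service address, and `build_rest_query` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder only: the actual Swoogle service URI is not given in the slides.
SERVICE_URI = "http://example.org/swoogle-service"


def build_rest_query(query_type: str, search_string: str, key: str = "demo") -> str:
    """Build REST_QUERY ::= SERVICE_URI ? PARAMS from the three slide parameters."""
    params = {
        "queryType": query_type,
        "searchString": search_string,
        "key": key,
    }
    return f"{SERVICE_URI}?{urlencode(params)}"


url = build_rest_query("search_swd_ontology", "person")
print(url)
# http://example.org/swoogle-service?queryType=search_swd_ontology&searchString=person&key=demo
```

Using `urlencode` rather than string concatenation keeps a search string that contains spaces or reserved characters valid inside the URL.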

Cool things other semantic search engines do …

Sindice
Sindice is a Semantic Web search engine created at the Digital Enterprise Research Institute (DERI)
Interesting things to note about Sindice:
– Architecture
– Indexing

Sindice
Sindice's architecture follows the cloud computing paradigm
Sindice uses Hadoop / Nutch to distribute crawling across multiple machines
Collected data is stored in HBase, a distributed column store
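One common way to spread a crawl over several machines is to assign each URL to a worker by hashing its host name, so that all pages of one site go to the same machine. The sketch below illustrates only that idea under assumed names (`worker_for`, `partition`); it is not the Hadoop or Nutch API and does not claim to reproduce Sindice's configuration.

```python
import hashlib
from urllib.parse import urlsplit


def worker_for(url: str, num_workers: int) -> int:
    """Pick a worker from a stable hash of the URL's host name."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def partition(urls, num_workers: int):
    """Split a crawl queue into one bucket per worker."""
    buckets = [[] for _ in range(num_workers)]
    for url in urls:
        buckets[worker_for(url, num_workers)].append(url)
    return buckets


queue = [
    "http://example.org/a.rdf",
    "http://example.org/b.rdf",
    "http://example.com/foaf.rdf",
    "http://example.net/onto.owl",
]
buckets = partition(queue, num_workers=4)
for index, bucket in enumerate(buckets):
    print(index, bucket)
```

Keeping a host on a single worker makes per-site politeness limits easy to enforce locally, at the cost of uneven load when a few hosts dominate the queue.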

Sindice
Sindice indexes based on:
– Inverse Functional Properties (IFPs)
– URIs
– Literals (keywords)
IFP: an OWL property whose value uniquely identifies its subject
Benefit: faster retrieval
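The three index types can be pictured as three lookup tables that map a URI, an (IFP, value) pair, or a keyword to the documents mentioning it. This is a toy in-memory sketch under stated assumptions: the tuple layout, the `IFPS` set, and `build_indexes` are invented for illustration, and Sindice's real index structures are not shown in the slides.

```python
from collections import defaultdict

# Assumed set of inverse functional properties for this toy example.
IFPS = {"foaf:mbox"}


def build_indexes(statements):
    """statements: (document, subject, predicate, object, object_is_literal) tuples."""
    uri_index = defaultdict(set)      # URI -> documents mentioning it
    ifp_index = defaultdict(set)      # (IFP, value) -> documents mentioning the pair
    keyword_index = defaultdict(set)  # lower-cased word -> documents containing it
    for doc, subj, pred, obj, is_literal in statements:
        uri_index[subj].add(doc)
        if pred in IFPS:
            ifp_index[(pred, obj)].add(doc)
        if is_literal:
            for word in obj.lower().split():
                keyword_index[word].add(doc)
        else:
            uri_index[obj].add(doc)
    return uri_index, ifp_index, keyword_index


statements = [
    ("doc1", "ex:alice", "foaf:mbox", "mailto:alice@example.org", False),
    ("doc1", "ex:alice", "foaf:name", "Alice Smith", True),
    ("doc2", "ex:asmith", "foaf:mbox", "mailto:alice@example.org", False),
    ("doc2", "ex:asmith", "foaf:name", "A. Smith", True),
]
uri_index, ifp_index, keyword_index = build_indexes(statements)

print(sorted(ifp_index[("foaf:mbox", "mailto:alice@example.org")]))  # ['doc1', 'doc2']
print(sorted(uri_index["ex:alice"]))                                 # ['doc1']
```

The IFP lookup finds both documents even though they use different subject URIs, because the shared mailbox value identifies the same entity.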

Watson – A gateway to the Semantic Web
From the Knowledge Media Institute (KMi) at the Open University in the UK
Interesting things to note about Watson:
– Considers implicit semantic relationships
– Quality of semantic documents
– “Rich access” to semantic data

Watson
Implicit relationships between Semantic Web documents
– Equivalence (duplicate detection)
Quality of semantic documents
“Richer” access to semantic data
– Web interface for humans
– SPARQL endpoint
– Java/SOAP and REST APIs
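Duplicate detection can be illustrated by fingerprinting each document's set of triples and grouping documents that share a fingerprint. This is a simplified sketch, not Watson's documented method: it assumes triples are plain string tuples without blank nodes, and `fingerprint` and `group_duplicates` are hypothetical names.

```python
import hashlib
from collections import defaultdict


def fingerprint(triples) -> str:
    """Order-independent hash of a document's triples (assumes no blank nodes)."""
    canonical = "\n".join(sorted(" ".join(triple) for triple in set(triples)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_duplicates(documents):
    """documents: dict of URL -> list of (subject, predicate, object) string triples."""
    groups = defaultdict(list)
    for url, triples in documents.items():
        groups[fingerprint(triples)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


documents = {
    "http://a.example/people.rdf": [
        ("ex:alice", "rdf:type", "foaf:Person"),
        ("ex:alice", "foaf:name", "Alice"),
    ],
    "http://b.example/copy.rdf": [
        ("ex:alice", "foaf:name", "Alice"),
        ("ex:alice", "rdf:type", "foaf:Person"),
    ],
    "http://c.example/other.rdf": [
        ("ex:bob", "rdf:type", "foaf:Person"),
    ],
}
print(group_duplicates(documents))
# [['http://a.example/people.rdf', 'http://b.example/copy.rdf']]
```

Sorting the triples before hashing makes the fingerprint independent of serialization order, so two copies of the same graph hosted at different URLs fall into one group.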

Others
Semantic Web Search Engine (SWSE)
– Pipelined architecture for crawling and indexing
– Improved index and storage structure
Falcons
– Class subsumption reasoning
– Includes a triple store
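Class subsumption reasoning means that a search for instances of a class also returns instances of its subclasses. A minimal sketch of the idea, assuming the hierarchy is given as (subclass, superclass) pairs; the function names and sample data are illustrative and do not reflect how Falcons implements it.

```python
from collections import defaultdict


def subclass_closure(root, sub_class_of):
    """Return root plus every class that is directly or transitively a subclass of it."""
    children = defaultdict(set)
    for child, parent in sub_class_of:
        children[parent].add(child)
    seen = {root}
    stack = [root]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:  # also guards against cycles in the hierarchy
                seen.add(child)
                stack.append(child)
    return seen


def instances_of(cls, sub_class_of, types):
    """types: (individual, asserted class) pairs."""
    classes = subclass_closure(cls, sub_class_of)
    return sorted(individual for individual, asserted in types if asserted in classes)


sub_class_of = [("Student", "Person"), ("PhDStudent", "Student"), ("Person", "Agent")]
types = [("alice", "PhDStudent"), ("bob", "Person"), ("acme", "Organization")]

print(instances_of("Person", sub_class_of, types))  # ['alice', 'bob']
```

Without the closure step, a query for `Person` would miss `alice`, who is only asserted to be a `PhDStudent`.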

PowerAqua
Multi-ontology-based QA system powered by PowerMap and Watson
Takes input in the form of natural-language (NL) queries
Handles factual queries that can be expressed as one or more linguistic triples
Handles common wh-questions

PowerAqua
Key challenges in answering NL questions:
– Locating the ontologies relevant to a particular query
– Identifying semantically sound relationships
– Combining information from multiple queries

Swoogle facts/figures
The search engine components currently run on 4 machines
These machines host the crawler, the Lucene index, the MySQL database, etc., and access the NFS
Swoogle accesses approximately 20,000 pages every day (these are queued)
About 1,731,371 pure SW documents have been discovered

Swoogle facts/figures
The Swoogle crawler has a large queue of documents waiting to be crawled and indexed
Swoogle accesses metadata and index files over NFS, which makes information retrieval slower

Our Ideas: Research and Engineering
Acquire new hardware
Parallelize Swoogle
Focus on a particular domain
Project Swoogle as a search engine for agents

Our Ideas: Research and Engineering
Improve Swoogle’s indexing scheme
Analyze Swoogle’s ranking scheme
Use Swoogle metadata
Improve the usability of the website
Google-like services

References
Li Ding et al., “Swoogle: A Search and Metadata Engine for the Semantic Web”, Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management, November
P. Mika, G. Tummarello, “Web Semantics in the Clouds”, IEEE Intelligent Systems, Volume 23, Issue 5 (September 2008)
E. Oren, R. Delbru, M. Catasta, R. Cyganiak, H. Stenzhorn, G. Tummarello, “Sindice.com: A document-oriented lookup index for open linked data”, International Journal of Metadata, Semantics and Ontologies, 3(1)
Mathieu d’Aquin et al., “Watson: A Gateway for the Semantic Web”, Poster session of the European Semantic Web Conference, ESWC 2007
Gong Cheng, Weiyi Ge, Honghan Wu, Yuzhong Qu, “Searching Semantic Web Objects Based on Class Hierarchies”, WWW 2008 Workshop on Linked Data on the Web

Questions?