Research Problems in Semantic Web Search, Varish Mulwad


1 Research Problems in Semantic Web Search, Varish Mulwad

2 Agenda
- Introduction
- Swoogle
- Swoogle’s competition
  - Sindice
  - Semantic Web Search Engine (SWSE)
  - Watson
  - Falcon
- Research problems and issues with Swoogle
- References

3 Introduction
[Diagram labels: Web; Dr. Finin’s FOAF Profile; Your Agent]
- Possible because the data is in a machine-understandable form such as RDF or OWL
- But how will the agent find all this data? Search engines?

4 Introduction
[Figure panels: Traditional Search Engine Results; Semantic Web Search Engine Results]

5 Swoogle
- Swoogle is a crawler-based indexing and retrieval system for the Semantic Web
- Swoogle crawls and discovers documents written in RDF and OWL
- Swoogle classifies a Semantic Web Document (SWD) as one of:
  - Semantic Web Ontology (SWO): defines new terms
  - Semantic Web Database (SWDB): makes assertions about individuals
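The SWO/SWDB distinction above can be illustrated with a small sketch. It is not Swoogle's actual classifier: the triple representation, the set of term-defining types, the 0.5 threshold, and the function name `classify_swd` are all assumptions made for illustration. The idea is simply to compare how many typed resources are classes or properties (new terms) with how many are individuals.

```python
# Hypothetical sketch: label a document as SWO or SWDB from its rdf:type triples.
# The vocabulary below and the threshold are illustrative assumptions.
RDF_TYPE = "rdf:type"
TERM_TYPES = {
    "rdfs:Class",
    "owl:Class",
    "rdf:Property",
    "owl:ObjectProperty",
    "owl:DatatypeProperty",
}


def classify_swd(triples, threshold=0.5):
    """Return 'SWO', 'SWDB', or 'unknown' for a list of (s, p, o) triples."""
    terms = 0        # resources typed as a class or property (new terms)
    individuals = 0  # resources typed as anything else (assertions about individuals)
    for _subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue
        if obj in TERM_TYPES:
            terms += 1
        else:
            individuals += 1
    total = terms + individuals
    if total == 0:
        return "unknown"
    return "SWO" if terms / total >= threshold else "SWDB"


ontology_doc = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:knows", "rdf:type", "owl:ObjectProperty"),
]
instance_doc = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
]
print(classify_swd(ontology_doc), classify_swd(instance_doc))
```

A real system would parse the RDF with a proper library and would likely report a ratio rather than a hard label, since many documents mix term definitions with instance data.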

6 Swoogle SWOOGLE DEMO

7 Swoogle Architecture

8 Swoogle Architecture: SWD Discovery Component
- Google crawler, using the Google web service
  - Searches for file types with the extensions “.rdf”, “.owl”, “.n3”
  - Google returns at most 1000 results per query
- A focused crawler
  - Crawls documents within a given website
  - Applies extension and focus constraints
- A Swoogle crawler
  - Jena-based crawler
  - Explores semantic links between SWDs
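The extension and focus constraints of the focused crawler can be sketched as two URL filters. This is only an illustration under assumptions: the function names, the exact-host match for "within a given website", and the example URLs are hypothetical and not taken from Swoogle.

```python
from urllib.parse import urlparse

# File extensions named on the slide as indicating Semantic Web documents.
SWD_EXTENSIONS = (".rdf", ".owl", ".n3")


def in_focus(url, site_host):
    """Focus constraint (assumed): the URL must be on the given website's host."""
    return urlparse(url).netloc.lower() == site_host.lower()


def looks_like_swd(url):
    """Extension constraint: the URL path ends with a known SWD extension."""
    return urlparse(url).path.lower().endswith(SWD_EXTENSIONS)


def select_candidates(urls, site_host):
    """Keep URLs that satisfy both constraints, preserving input order."""
    return [u for u in urls if in_focus(u, site_host) and looks_like_swd(u)]


links = [
    "http://example.org/people/foaf.rdf",
    "http://example.org/index.html",
    "http://other.example/onto.owl",
    "http://example.org/vocab/terms.N3",
]
print(select_candidates(links, "example.org"))
```

In a full crawler, the focus check would also decide which HTML pages to follow, while the extension check would decide which fetched URLs to hand to an RDF parser.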

9 Swoogle Architecture: Metadata Creation
- Basic metadata
  - Encoding: “RDF/XML”, “N-Triple”, “N3”
  - Language: RDF, RDFS, OWL, DAML+OIL
  - OWL species: OWL Lite, OWL DL, OWL Full
- Relations among SWDs
  - Reference relationships among SWDs
  - Inter-ontology relationships

10 Swoogle Architecture
- Data analysis component
  - Classifies each SWD as an SWO or SWDB
  - Computes the rank of each SWD
- Web-based interface
  - Human user interface: http://swoogle.umbc.edu
  - Web services using a REST interface
  - Agent service

11 Sindice
- Created at the Digital Enterprise Research Institute (DERI)
- Key features:
  - Collects SWDs and indexes them on resource URIs, Inverse Functional Properties (IFPs), and keywords
  - Uses the Hadoop parallel architecture

12 Sindice
- Inverse Functional Property (IFP): an OWL property whose value uniquely identifies its subject (for example, foaf:mbox)
- Sindice uses three indexes:
  - URI index
  - IFP index
  - Keyword index
- Benefit: faster retrieval of data
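The three indexes can be pictured as three inverted maps from a lookup key to the documents that mention it. The sketch below is a toy in-memory version, not Sindice's implementation: the class name `TripleIndex`, the four-field triple tuples, and the choice of `foaf:mbox` and `foaf:homepage` as the known IFPs are assumptions for illustration.

```python
from collections import defaultdict

# Assumed set of inverse functional properties; a real system would read
# owl:InverseFunctionalProperty declarations from the ontologies themselves.
IFP_PREDICATES = {"foaf:mbox", "foaf:homepage"}


class TripleIndex:
    """Toy index mapping URIs, IFP values, and keywords to document URLs."""

    def __init__(self):
        self.by_uri = defaultdict(set)      # resource URI -> documents
        self.by_ifp = defaultdict(set)      # (IFP predicate, value) -> documents
        self.by_keyword = defaultdict(set)  # lowercase token -> documents

    def add_document(self, doc_url, triples):
        """Index triples given as (subject, predicate, object, object_is_literal)."""
        for subject, predicate, obj, obj_is_literal in triples:
            self.by_uri[subject].add(doc_url)
            if predicate in IFP_PREDICATES:
                self.by_ifp[(predicate, obj)].add(doc_url)
            if obj_is_literal:
                for token in obj.lower().split():
                    self.by_keyword[token].add(doc_url)
            else:
                self.by_uri[obj].add(doc_url)


index = TripleIndex()
index.add_document("http://a.example/foaf.rdf", [
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org", False),
    ("ex:alice", "foaf:name", "Alice Example", True),
])
index.add_document("http://b.example/people.rdf", [
    ("other:a1", "foaf:mbox", "mailto:alice@example.org", False),
])
# Both documents describe the same person under different URIs; the IFP index links them.
print(sorted(index.by_ifp[("foaf:mbox", "mailto:alice@example.org")]))
```

The IFP lookup is what lets a client find every document about one entity even when the documents use different URIs for it.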

13 Sindice
Hadoop is used in the following manner:
- Sindice employs Hadoop/Nutch to distribute the crawling job across multiple machines
- Collected data is stored in HBase, a distributed column-based store
- Large datasets are handled efficiently across the cluster using a MapReduce implementation
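To make the MapReduce idea concrete, here is a single-process sketch of the map, shuffle, and reduce steps for building a keyword index. It does not use the Hadoop API and is not Sindice's job definition; the function names and sample documents are hypothetical. On a cluster, the map and reduce calls would run on different machines and the shuffle would be handled by the framework.

```python
from collections import defaultdict
from itertools import chain


def map_phase(doc_url, text):
    """Emit one (token, doc_url) pair per distinct token in the document."""
    for token in set(text.lower().split()):
        yield token, doc_url


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all documents for one token into a sorted posting list."""
    return key, sorted(set(values))


def build_index(documents):
    """Run map, shuffle, and reduce over a {doc_url: text} mapping."""
    pairs = chain.from_iterable(map_phase(u, t) for u, t in documents.items())
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


documents = {
    "d1": "semantic web search",
    "d2": "web crawler",
}
keyword_index = build_index(documents)
print(keyword_index["web"])
```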

14 Sindice SINDICE DEMO

15 SWSE
- The Semantic Web Search Engine (SWSE) was also created at the Digital Enterprise Research Institute (DERI)
- SWSE uses a “Multicrawler”, a pipelined architecture for crawling

16 Watson
- Created at the Knowledge Media Institute (KMi) at the Open University, UK
- Major design principles:
  - Considers explicit and implicit relations between ontologies
  - Ranks ontologies with a focus on quality over popularity

17 Watson WATSON DEMO

18 Falcon
- Falcon is a Semantic Web search engine created at the Institute of Web Science in China
- Falcon allows keyword-based queries on:
  - Objects
  - Concepts
  - Documents
- Falcon performs class subsumption reasoning
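Class subsumption reasoning means that an object typed with a subclass also counts as an instance of every superclass. The sketch below shows the idea on a hand-written class hierarchy; it is not Falcon's implementation, and the function names, class names, and sample objects are hypothetical.

```python
def superclasses(cls, subclass_of):
    """Return cls plus all of its ancestors, following subclass edges."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.add(current)
        stack.extend(subclass_of.get(current, ()))
    return seen


def objects_of_class(target, object_types, subclass_of):
    """Return objects whose asserted class is target or a subclass of target."""
    return sorted(
        obj
        for obj, cls in object_types.items()
        if target in superclasses(cls, subclass_of)
    )


# Direct subclass edges: child class -> list of parent classes.
subclass_of = {
    "Professor": ["Academic"],
    "Academic": ["Person"],
    "Student": ["Person"],
}
# Asserted type of each object.
object_types = {
    "alice": "Professor",
    "bob": "Student",
    "acme": "Organization",
}
print(objects_of_class("Person", object_types, subclass_of))
```

Restricting a keyword query to the class `Person` would then also return objects asserted only as `Professor` or `Student`.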

19 Falcon FALCON DEMO

20 Summary
- Swoogle
  - Keyword-based search
  - Searches ontologies and instance data
- Others
  - Sindice: indexes on URI, IFP, and keywords; uses the Hadoop architecture
  - SWSE: pipelined architecture for crawling
  - Watson: implicit relations between SWDs
  - Falcon: class subsumption reasoning

21 Issues
- Crawling
  - Swoogle’s crawler runs as a single thread on one machine
  - This limits the number of SWDs discovered and revisited
- Possible solutions
  - Use of the Hadoop architecture
  - Use of Grub
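As a rough illustration of why a single-threaded crawler is a bottleneck, the sketch below fetches each round of the crawl frontier with a pool of worker threads. It is a stand-in on one machine, not the Hadoop or Grub approach the slide proposes, and everything in it is an assumption: the function `crawl_parallel`, its parameters, and the fake link graph used in place of real HTTP fetches.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_parallel(seed_urls, fetch_links, max_workers=4, max_rounds=3):
    """Breadth-first crawl that fetches each frontier in parallel.

    fetch_links(url) must return the list of URLs linked from that document.
    """
    seen = set(seed_urls)
    frontier = list(seed_urls)
    for _ in range(max_rounds):
        if not frontier:
            break
        # Fetch every URL in the current frontier concurrently.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(fetch_links, frontier))
        next_frontier = []
        for links in results:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return seen


# Fake link graph standing in for network fetches.
link_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def fake_fetch(url):
    return link_graph.get(url, [])


print(sorted(crawl_parallel(["a"], fake_fetch)))
```

A distributed version would partition the frontier across machines instead of threads, which is the role Hadoop plays in the Sindice design described earlier.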

22 Other Issues
- Crawling large structured datasets like DBpedia
- More reasoning
- More services

23 References
- Li Ding et al., “Swoogle: A Search and Metadata Engine for the Semantic Web”, Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management, November 2004.
- P. Mika, G. Tummarello, “Web Semantics in the Clouds”, IEEE Intelligent Systems, 23(5), September 2008.
- E. Oren, R. Delbru, M. Catasta, R. Cyganiak, H. Stenzhorn, G. Tummarello, “Sindice.com: A Document-Oriented Lookup Index for Open Linked Data”, International Journal of Metadata, Semantics and Ontologies, 3(1), 2008.
- Mathieu d’Aquin et al., “Watson: A Gateway for the Semantic Web”, Poster session of the European Semantic Web Conference (ESWC), 2007.
- Gong Cheng, Weiyi Ge, Honghan Wu, Yuzhong Qu, “Searching Semantic Web Objects Based on Class Hierarchies”, WWW 2008 Workshop on Linked Data on the Web, 2008.

24 Questions?

