© Paul Buitelaar – November 2007, Busan, South-Korea Evaluating Ontology Search Towards Benchmarking in Ontology Search Paul Buitelaar, Thomas.

Presentation transcript:

© Paul Buitelaar – November 2007, Busan, South-Korea

Evaluating Ontology Search
Towards Benchmarking in Ontology Search

Paul Buitelaar, Thomas Eigner
Competence Center Semantic Web & Language Technology Lab
DFKI GmbH, Saarbrücken, Germany

Overview
- Ontology Search
  - Knowledge reuse (integration with Ontology Learning)
- OntoSelect
  - Browse (ontologies, labels, classes, properties)
  - Search by topic
- Evaluating Ontology Search
  - Benchmark (evaluation) data set
  - Experiment (compare SWOOGLE, OntoSelect)
- Conclusions

Ontology Search
- More and more ontologies are published on the (Semantic) Web
  - Available as RDFS or OWL files (also still DAML)
- This opens up possibilities for reuse of knowledge
  - Access through ontology search engines and/or (manual/automatic) organization in ontology libraries
- But: it is increasingly hard to find the right ontology for your application
  - Increasing research in ontology search/selection (Alani et al., Buitelaar et al., Ding et al., Sabou et al.): SWOOGLE, OntoSelect, Watson

OntoSelect
Ontology Library and Search Engine
- Monitors the web for ontologies with automatic harvesting and indexing
- Browse and search
  - On ontologies, classes, properties and (multilingual) labels
  - Ontology search integrates relevance feedback over Wikipedia for search term
- Ontology publishing
  - Submit ontologies - will be automatically integrated
- Statistics
  - On formats, languages, labels used, ontology publishing

Paul Buitelaar, Thomas Eigner, Thierry Declerck. OntoSelect: A Dynamic Ontology Library with Support for Ontology Selection. In: Proc. of the Demo Session at the International Semantic Web Conference, Hiroshima, Japan, Nov

OntoSelect – Browse

Ontology Search

Keyword as Wikipedia Topic

Keyword Expansion (Extraction)
Relevance Feedback from Wikipedia
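The slides do not say how expansion terms are extracted from the Wikipedia page. As a rough sketch only, one simple form of relevance feedback takes the most frequent content words on the page for the search keyword and adds them to the query. The function name `expand_keyword`, the stopword list and the frequency-based ranking are all assumptions for illustration, not OntoSelect's actual method.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from OntoSelect).
STOPWORDS = {
    "the", "a", "an", "of", "and", "or", "in", "on", "to", "is", "are",
    "for", "by", "with", "as", "that", "this", "it", "from",
}


def expand_keyword(keyword: str, page_text: str, top_k: int = 5) -> list[str]:
    """Return the keyword plus the top_k most frequent content terms of its page.

    page_text is assumed to be the plain text of the Wikipedia page that
    corresponds to the keyword; fetching the page is out of scope here.
    """
    key = keyword.lower()
    tokens = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(
        t for t in tokens if t not in STOPWORDS and len(t) > 2 and t != key
    )
    # Sort by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [key] + [term for term, _ in ranked[:top_k]]
```

The expanded term list can then be matched against ontology labels instead of the single keyword.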

Ranked Results (Browsable)

Search Criteria
Relevance criteria address ontology content, structure, status:
- Coverage - Term Matching
  How many of the terms in a text collection are covered by labels for classes and properties?
- Structure - Properties Relative to Classes
  How detailed is the knowledge structure that the ontology represents?
- Connectedness - Number of Included Ontologies
  Is the ontology connected to other ontologies and how well established are these?
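The three criteria can be read as numeric scores that are combined with weights. The sketch below is one possible reading under stated assumptions: the `Ontology` record, the exact formulas (fraction of covered terms, properties per class, raw count of included ontologies) and the linear combination are illustrative choices, not formulas given in the slides. In particular, it ignores how well established the included ontologies are.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical minimal record of an indexed ontology."""
    labels: set[str]          # lowercased class and property labels
    num_classes: int
    num_properties: int
    imports: list[str] = field(default_factory=list)  # included ontologies


def coverage(onto: Ontology, terms: set[str]) -> float:
    """Fraction of query/collection terms matched by a class or property label."""
    if not terms:
        return 0.0
    return len(terms & onto.labels) / len(terms)


def structure(onto: Ontology) -> float:
    """Properties relative to classes; 0.0 for an ontology without classes."""
    return onto.num_properties / onto.num_classes if onto.num_classes else 0.0


def connectedness(onto: Ontology) -> float:
    """Number of included ontologies (their own standing is not modelled)."""
    return float(len(onto.imports))


def relevance(onto: Ontology, terms: set[str],
              weights: tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the three criteria; the weights are free parameters."""
    w_cov, w_str, w_con = weights
    return (w_cov * coverage(onto, terms)
            + w_str * structure(onto)
            + w_con * connectedness(onto))
```

Ranking then amounts to sorting the candidate ontologies by `relevance` for the (expanded) query terms; different weight tuples correspond to different search configurations.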

Evaluation – Benchmark
Benchmark: 15 Wikipedia topics and 57 manually assigned ontologies out of 1056 cached through OntoSelect

The 15 Wikipedia topics were selected out of the set of all (37284) class/property labels in OntoSelect, by:
- Filtering out labels that did not correspond to a Wikipedia page
  -> 5658 labels / topics
- Using the 5658 labels as search terms in SWOOGLE to filter out labels that returned fewer than 10 ontologies (out of the 1056 in OntoSelect)
  -> 3084 labels / topics
- Manually selecting useful topics out of the 3084 labels, e.g. we left out very short labels (‘v’) and very abstract ones (‘thing’)
  -> 50 topics
- Randomly selecting 15 topics, for which we manually checked the ontologies retrieved from OntoSelect and SWOOGLE
  -> 15 topics with 57 assigned ontologies
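The selection procedure is a filtering funnel followed by a random sample. A minimal sketch, assuming the three checks are supplied as callables: `has_wikipedia_page`, `swoogle_hit_count` and `is_useful` are hypothetical stand-ins for the Wikipedia lookup, the SWOOGLE query and the manual judgment, and the fixed `seed` is only there to make the sample repeatable.

```python
import random
from typing import Callable, Iterable


def select_topics(
    labels: Iterable[str],
    has_wikipedia_page: Callable[[str], bool],
    swoogle_hit_count: Callable[[str], int],
    is_useful: Callable[[str], bool],
    sample_size: int = 15,
    min_hits: int = 10,
    seed: int = 0,
) -> list[str]:
    """Apply the funnel from the slide and return a sorted random sample."""
    # Step 1: keep labels that correspond to a Wikipedia page.
    with_page = [l for l in labels if has_wikipedia_page(l)]
    # Step 2: drop labels that return fewer than min_hits ontologies.
    enough_hits = [l for l in with_page if swoogle_hit_count(l) >= min_hits]
    # Step 3: stand-in for the manual selection of useful topics.
    useful = [l for l in enough_hits if is_useful(l)]
    # Step 4: random sample (all candidates if fewer than sample_size remain).
    rng = random.Random(seed)
    return sorted(rng.sample(useful, min(sample_size, len(useful))))
```

In the reported benchmark the stage sizes were 37284, 5658, 3084, 50 and 15; the manual relevance check of retrieved ontologies happens after this selection and is not modelled here.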

Evaluation – Benchmark by Topic
15 (Wikipedia) topics with number of assigned ontologies:
- Atmosphere (2)
- Biology (11)
- City (3) katyn/CMSC828y/location.daml
- Communication (10)
- Economy (1)
- Infrastructure (2)
- Institution (1)
- Math (3)
- Military (5)
- Newspaper (2)
- Oil (0)
- Production (1)
- Publication (6)
- Railroad (1)
- Tourism (9)

Evaluation – Experiment
Comparison of (average) results between SWOOGLE and OntoSelect

Use OntoSelect benchmark
- 15 topics (queries)
- 57 assigned ontologies (relevance assessments)
- 1056 ontologies (data set)

Use different configurations for OntoSelect
- With/without keyword expansion/extraction
- With/without class names (in addition to labels)
- With/without property labels
- Weighting of relevance criteria
- …
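The slides speak of average results but do not name the measure. A common way to compare two search engines on such a benchmark is precision and recall at a cutoff, averaged over topics. The sketch below assumes that setup; the cutoff `k`, the function names and the decision to skip topics without any assigned ontology (such as Oil) are assumptions, not details from the experiment.

```python
def precision_recall(ranked: list[str], relevant: set[str],
                     k: int = 10) -> tuple[float, float]:
    """Precision and recall of the top-k results for one topic."""
    top = ranked[:k]
    hits = sum(1 for onto in top if onto in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def macro_average(runs: dict[str, list[str]],
                  judgments: dict[str, set[str]],
                  k: int = 10) -> tuple[float, float]:
    """Average precision and recall at k over all topics with assessments.

    runs maps a topic to the ranked ontology list an engine returned;
    judgments maps a topic to its manually assigned ontologies.
    """
    # Topics with no assigned ontology cannot contribute to recall; skip them.
    topics = [t for t, rel in judgments.items() if rel]
    if not topics:
        return 0.0, 0.0
    scores = [precision_recall(runs.get(t, []), judgments[t], k) for t in topics]
    mean_p = sum(p for p, _ in scores) / len(scores)
    mean_r = sum(r for _, r in scores) / len(scores)
    return mean_p, mean_r
```

Running `macro_average` once per engine or per OntoSelect configuration over the same `judgments` gives directly comparable numbers.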

Evaluation – Results

Evaluation – Weighting of 'title'

Conclusions
It is too early to draw conclusions from the evaluation
- Many more configurations (weights) to compare
- Extend the benchmark
- Comparison with other ontology search engines

Main contribution of the presented work
- First comprehensive benchmark for topic-driven evaluation of ontology search
- (Extended) Benchmark will be made publicly available