Ontology Evaluation and Ranking using OntoQA Samir Tartir and I. Budak Arpinar Large-Scale Distributed Information Systems Lab University of Georgia


Ontology Evaluation and Ranking using OntoQA Samir Tartir and I. Budak Arpinar Large-Scale Distributed Information Systems Lab University of Georgia The First IEEE International Conference on Semantic Computing September 17-19, 2007 Irvine, California, USA

Outline Why ontology evaluation? OntoQA –Overview –Metrics –Overall Score –Results Future Work

Why Ontology Evaluation? With several ontologies to choose from, users often face the problem of selecting the one that is most suitable for their needs. Ontology developers also need a way to evaluate their work.

OntoQA is a suite of metrics that evaluates the content of ontologies by analyzing their schemas and instances in different aspects, such as the distribution of classes on the inheritance tree of the schema, the distribution of class instances, and the connectivity between instances of different classes. OntoQA –is tunable –requires minimal user involvement –considers both the schema and the instances of a populated ontology.

OntoQA Overview [slide figure: system overview diagram; only the label "Keywords" survives in the transcript]

I. Schema Metrics Address the design of the ontology schema. A schema can be hard to evaluate: domain-expert consensus, subjectivity, etc. Metrics: –Relationship diversity –Schema deepness

I. Schema Metrics –Relationship diversity: this measure differentiates an ontology that contains mostly inheritance relationships (≈ a taxonomy) from an ontology that contains a diverse set of relationships. –Schema deepness: this measure describes the distribution of classes across the levels of the ontology's inheritance tree.
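The two schema metrics can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the toy schema representation (lists of classes, subclass links, and other relationships) and the function names are assumptions; the formulas follow the slide's descriptions (relationship diversity = share of non-inheritance relationships; schema deepness = average number of direct subclasses per class).

```python
# Sketch of the two schema metrics on a toy schema representation.
# relationship diversity = |other relationships| /
#                          (|subclass links| + |other relationships|)
# schema deepness        = |subclass links| / |classes|

def relationship_diversity(subclass_links, other_relationships):
    """Close to 0 -> mostly a taxonomy; close to 1 -> diverse relationships."""
    total = len(subclass_links) + len(other_relationships)
    return len(other_relationships) / total if total else 0.0

def schema_deepness(classes, subclass_links):
    """Average number of direct subclasses per class."""
    return len(subclass_links) / len(classes) if classes else 0.0

# Toy schema: Student and Professor subclass Person, plus one object property.
classes = ["Person", "Student", "Professor", "Course"]
subclasses = [("Student", "Person"), ("Professor", "Person")]
relations = [("teaches", "Professor", "Course")]

print(relationship_diversity(subclasses, relations))  # 1/3: mostly taxonomic
print(schema_deepness(classes, subclasses))           # 0.5
```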

II. Instance Metrics Evaluate the placement, distribution, and relationships of instance data, which can indicate the effectiveness of the schema design and the amount of knowledge contained in the ontology.

II. Instance Metrics Overall KB Metrics –This group of metrics gives an overall view on how instances are represented in the KB. Class-Specific Metrics –This group of metrics indicates how each class defined in the ontology schema is being utilized in the KB. Relationship-Specific Metrics –This group of metrics indicates how each relationship defined in the ontology schema is being utilized in the KB.

Overall KB Metrics Class Utilization –Evaluates how the classes defined in the schema are being utilized in the KB. Class Instance Distribution –Evaluates how instances are spread across the classes of the schema: CID = StdDev(Inst(Ci)). Cohesion (connectedness) –Used to discover instance "islands".
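Two of these KB-wide metrics are simple enough to sketch directly. The `{class: [instances]}` mapping and the function names below are assumptions for illustration; the formulas follow the slide: class utilization = fraction of schema classes with at least one instance, and CID = standard deviation of the per-class instance counts.

```python
# Sketch of two overall KB metrics on a toy knowledge base held as a
# {class_name: [instance, ...]} mapping.
from statistics import pstdev

def class_utilization(schema_classes, kb):
    """Fraction of schema classes that have at least one instance."""
    populated = sum(1 for c in schema_classes if kb.get(c))
    return populated / len(schema_classes) if schema_classes else 0.0

def class_instance_distribution(schema_classes, kb):
    """CID = StdDev(Inst(Ci)): spread of instance counts across classes."""
    counts = [len(kb.get(c, [])) for c in schema_classes]
    return pstdev(counts) if counts else 0.0

schema_classes = ["Person", "Student", "Professor", "Course"]
kb = {"Student": ["alice", "bob"], "Professor": ["carol"], "Course": []}

print(class_utilization(schema_classes, kb))           # 0.5 (2 of 4 classes used)
print(class_instance_distribution(schema_classes, kb)) # stddev of [0, 2, 1, 0]
```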

Class-Specific Metrics Class Connectivity (centrality) –This metric evaluates the importance of a class based on the relationships of its instances with instances of other classes in the ontology. Class Importance (popularity) –This metric evaluates the importance of a class based on the number of instances it contains compared to other classes in the ontology. Relationship Utilization –This metric evaluates how the relationships defined for each class in the schema are being used at the instance level.
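The first two class-specific metrics can be illustrated on the same toy KB. The data shapes (instances as `{class: [ids]}`, relationship instances as `(subject_class, relation, object_class)` triples) are assumptions, not the paper's data model; importance is the class's share of all instances, and connectivity counts relationship instances touching the class.

```python
# Sketch of two class-specific metrics on a toy KB.

def class_importance(cls, kb):
    """Share of all KB instances that belong to this class."""
    total = sum(len(v) for v in kb.values())
    return len(kb.get(cls, [])) / total if total else 0.0

def class_connectivity(cls, relationship_instances):
    """Number of relationship instances connecting this class to others."""
    return sum(1 for (s, _rel, o) in relationship_instances if cls in (s, o))

kb = {"Student": ["alice", "bob"], "Professor": ["carol"]}
rel_instances = [
    ("Professor", "teaches", "Course"),
    ("Student", "enrolledIn", "Course"),
    ("Student", "advisedBy", "Professor"),
]

print(class_importance("Student", kb))               # 2/3
print(class_connectivity("Student", rel_instances))  # 2
```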

Relationship-Specific Metrics Relationship Importance (popularity) –This metric measures the percentage of instances of a relationship with respect to the total number of relationship instances in the KB.
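The relationship-importance metric described above reduces to a frequency count; a minimal sketch, again with an assumed triple representation:

```python
# Sketch of relationship importance: the fraction of all relationship
# instances in the KB that use a given relationship.
from collections import Counter

def relationship_importance(rel, relationship_instances):
    counts = Counter(r for (_s, r, _o) in relationship_instances)
    total = sum(counts.values())
    return counts[rel] / total if total else 0.0

rel_instances = [
    ("Student", "enrolledIn", "Course"),
    ("Student", "enrolledIn", "Course"),
    ("Professor", "teaches", "Course"),
    ("Student", "advisedBy", "Professor"),
]
print(relationship_importance("enrolledIn", rel_instances))  # 0.5 (2 of 4)
```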

Ontology Score Calculation Score = Σi Wi · Metrici, where Metrici ∈ {Relationship diversity, Schema deepness, Class utilization, Cohesion, Avg(Connectivity(Ci)), Avg(Importance(Ci)), Avg(Relationship utilization(Ci)), Avg(Importance(Ri)), #Classes, #Relationships, #Instances} and Wi is a set of tunable metric weights.
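The score is a weighted sum of the individual metrics. A minimal sketch; the metric values and weights below are made up for illustration (the slide only says the weights are tunable):

```python
# Sketch of the overall score: Score = sum over i of W_i * Metric_i.

def ontology_score(metrics, weights):
    """Weighted sum of named metric values."""
    return sum(weights[name] * value for name, value in metrics.items())

# Illustrative (made-up) metric values and tunable weights.
metrics = {"relationship_diversity": 0.4,
           "schema_deepness": 0.6,
           "class_utilization": 0.5}
weights = {"relationship_diversity": 1.0,
           "schema_deepness": 2.0,
           "class_utilization": 0.5}

print(ontology_score(metrics, weights))  # 0.4 + 1.2 + 0.25 = 1.85
```

Raising a weight (e.g. on schema size) biases the ranking, which is how the two different rankings on the following slides are produced.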

Results – Swoogle Results for "Paper"
Symbol  Ontology URL
I       http://ebiquity.umbc.edu/ontology/conference.owl
II      http://kmi.open.ac.uk/semanticweb/ontologies/owl/aktive-portal-ontology-latest.owl
III     http:// (URL truncated in transcript)
IV      http:// (URL truncated in transcript)
V       http:// (URL truncated in transcript)
VI      http://owl.mindswap.org/2003/ont/owlweb.rdf
VII     http:// :9090/RDF/VRP/Examples/SWPG.rdfs (host truncated in transcript)
VIII    http:// (URL truncated in transcript)
IX      http:// (URL truncated in transcript)

OntoQA Ranking - 1 OntoQA Results for "Paper" with default metric weights

OntoQA Ranking - 2 OntoQA Results for "Paper" with metric weights biased towards larger schema size

OntoQA vs. Users
Ontology  OntoQA Rank  Average User Rank
I         2            9
II        5            1
III       6            5
IV        1            6
V         8            8
VI        4            4
VII       7            2
VIII      3            7
Pearson's Correlation Coefficient = 0.80

Future Work Enable the user to specify an ontology library (e.g. OBO) to limit the search to ontologies that exist in that specific library. Use BRAHMS instead of Sesame as the data store, since BRAHMS is more efficient at handling the large ontologies that are common in bioinformatics.

Questions