Big Data that might benefit from ontology technology, but why this usually fails
Barry Smith
National Center for Ontological Research



The strategy of annotation
Databases describe data using multiple heterogeneous labels.
If we can annotate (tag) these labels using terms from common controlled vocabularies, then a virtual arm's-length integration can be achieved, providing:
- immediate benefits for search and retrieval
- a starting point for the creation of net-centric reference data
- potential longer-term benefits for reasoning
- with no need to modify existing systems, code or data
See Ceusters et al. Proceedings of DILS
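The annotation strategy on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the database names, column labels, values and vocabulary terms below are all hypothetical and do not come from the talk. The point it shows is that the annotations live in a separate layer, so the source records are searched through shared terms without being modified.

```python
# Hypothetical records from two source databases with heterogeneous labels.
# None of these names or values come from the slides.
SOURCE_RECORDS = [
    {"db": "hr_east", "label": "Emp_DOB", "value": "1980-02-11"},
    {"db": "hr_west", "label": "birth_dt", "value": "1975-09-30"},
    {"db": "hr_west", "label": "dept_cd", "value": "R&D"},
]

# The annotation layer, kept apart from the sources:
# (database, label) -> term from a shared controlled vocabulary (assumed terms).
ANNOTATIONS = {
    ("hr_east", "Emp_DOB"): "date of birth",
    ("hr_west", "birth_dt"): "date of birth",
    ("hr_west", "dept_cd"): "organizational unit",
}


def search_by_term(term, records=SOURCE_RECORDS, annotations=ANNOTATIONS):
    """Return every source record whose label is annotated with `term`."""
    return [
        record
        for record in records
        if annotations.get((record["db"], record["label"])) == term
    ]


hits = search_by_term("date of birth")
print([(record["db"], record["label"]) for record in hits])
```

A single query on the shared term retrieves records from both databases even though their labels differ, and the source records themselves are never rewritten.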

Ontologies facilitate grouping of annotations
Annotation counts: brain 20, hindbrain 15, rhombomere 10
Query ‘brain’ without ontology: 20
Query ‘brain’ with ontology: 45
String searches yield partial results, rest on manual effort and on familiarity with existing database contents
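The arithmetic behind the two query results can be reproduced with a small Python sketch. The counts are the ones on the slide. The hierarchy (rhombomere under hindbrain, hindbrain under brain) is an assumption inferred from the totals, and the function names are hypothetical.

```python
# Direct annotation counts, as given on the slide.
DIRECT_COUNTS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Assumed hierarchy (child -> parent); the slide implies it but does not state
# whether the relation is is_a or part_of.
PARENT = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def descendants(term, parent=PARENT):
    """Return `term` together with every term below it in the hierarchy."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in found and child not in found:
                found.add(child)
                changed = True
    return found


def count_without_ontology(term):
    """Plain string match: only annotations made directly to `term`."""
    return DIRECT_COUNTS.get(term, 0)


def count_with_ontology(term):
    """Expand the query to all descendant terms and sum their annotations."""
    return sum(DIRECT_COUNTS.get(t, 0) for t in descendants(term))


print(count_without_ontology("brain"))  # 20
print(count_with_ontology("brain"))     # 45 = 20 + 15 + 10
```

The expanded query also works at intermediate levels: under the same assumed hierarchy, a query for hindbrain would gather its own 15 annotations plus the 10 made to rhombomere.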

Examples of where this method works
- Reference Genome Annotation Project
- Human resources data in large organizations
- Military intelligence data
Salmen et al. in
Other potential areas of application: Crime, Public health, Insurance, Finance

But normally the method does not work
- Semantic technology (OWL, …) seeks to break down data silos
- Unfortunately it is now so easy to create ontologies that myriad incompatible ontologies are being created in ad hoc ways, leading to the creation of new, semantic silos
- The Semantic Web framework as currently conceived and governed by the W3C (modeled on HTML) yields minimal standardization
- The more semantic technology is successful, the more we fail to achieve our goals

Reasons for this effect
- Just as it’s easier to build a new database, so it’s easier to build a new ontology for each new project
- You will not get paid for reusing existing ontologies (Let a million ontologies bloom)
- There are no ‘good’ ontologies, anyway (just arbitrary choices of terms and relations …)
- Information technology (hardware) changes constantly, not worth the effort of getting things right

How to do it right?
- how to create an incremental, evolutionary process, where what is good survives, and what is bad fails
- create a scenario in which people will find it profitable to reuse ontologies, terminologies and coding systems which have been tried and tested
- silo effects will be avoided and results of investment in Semantic Technology will cumulate effectively

Uses of ‘ontology’ in PubMed abstracts

By far the most successful: GO (Gene Ontology)

GO provides a controlled vocabulary of terms for use in annotating (tagging) biological data
- multi-species, multi-disciplinary, open source
- built and maintained by domain experts
- contributing to the cumulativity of scientific results obtained by distinct research communities
- natural language and logical definitions for all terms to support consistent human application and computational exploitation
- rigorous governance process
- feedback loop connects users to editors

How to do it right
Ontologies should mimic the methodology used by the GO (following the principles of the OBO Foundry):
- ontologies in the same field should be developed in coordinated fashion to ensure that there is exactly one ontology for each subdomain
- ontologies should be developed incrementally in a way that builds on successful user testing at every stage