State of the Art for Ontology Repositories Frank Olken National Science Foundation CISE/IIS/III Presentation to Ontology Summit NIST Gaithersburg,

Presentation transcript:

State of the Art for Ontology Repositories
Frank Olken, National Science Foundation, CISE/IIS/III
Presentation to Ontology Summit, NIST, Gaithersburg, MD
v05, April 28, 2008

F. Olken, Ontology Summit

Disclaimer
Opinions expressed in this talk are solely those of the author, and do not reflect the positions of the National Science Foundation, CISE, IIS, or Lawrence Berkeley National Laboratory.

This talk: I will address key issues in the design and implementation of ontology repositories and some of the major technologies being used to address these issues.

Outline
- What is an ontology repository?
- Why does one want one?
- Macro vs. micro issues
- Implementation issues

Implementation Issues
- Ontology acquisition, ingestion
- Macro vs. micro issues
- Centralized vs. decentralized
- Ontology representation
- Ontology search, query
- Ontology integration
- Auxiliary tools
- SOA, etc.

What is an Ontology Repository?
- System for storing, searching, retrieving multiple ontologies
- Support for ontology integration
- Variously:
  - Tools for ontology creation, editing, visualization
  - Tools for ontology annotation, curation, ...

Multiple Ontologies
This is the source of the hardest problems in building ontology repositories:
- Scale
- Diverse ontology representations
- Ontology integration (mapping)
- Namespace issues
- Complex provenance issues

Why would you want an OR?
- You need to deal with multiple ontologies
- Usual reasons for ontologies:
  - Natural Language Processing support
  - Data integration, exchange
  - Data semantics
  - Support for DB queries
  - DB, application design
  - Classification / indexing of documents, etc.
  - Creation / maintenance / use of controlled vocabularies

Ontology Acquisition
- Manual acquisition and loading
  - e.g., XMDR
  - Useful if ontology representations are very diverse
- Spidering the web to find ontologies (e.g., Nutch)
- Google (etc.) search to find ontologies
- How does one recognize an ontology?
  - Use of OWL, RDF, CL, etc.
  - Lots of is-a, part-of relations ...
  - Comments that assert the file is an ontology

Ontology Ingestion
- Parsing ontology, syntactic validation
- Consistency checking (no cycles in partial orders: taxonomies, partonomies)
- Conversion to common representation (?)
  - Syntactic translation
  - Semantic translation, e.g., CWA vs. OWA
- Indexing, transitive closure computations, ...
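
The consistency check named on this slide (no cycles in a taxonomy or partonomy) can be illustrated with a depth-first search. This is only a minimal sketch: the function name `find_cycle` and the (child, parent) edge format are assumptions made for the example, not part of any repository discussed in the talk, and the recursive form would need to be made iterative for very deep hierarchies.

```python
from collections import defaultdict

def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the relation is acyclic.

    `edges` is an iterable of (child, parent) pairs for a single relation
    such as is-a or part-of (hypothetical input format).
    """
    graph = defaultdict(list)
    for child, parent in edges:
        graph[child].append(parent)

    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    color = defaultdict(int)          # every node starts WHITE

    def visit(node, path):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            if color[nxt] == GREY:    # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None
```

An ingestion step could reject, or flag for curation, any ontology for which `find_cycle` returns a non-empty result on its is-a or part-of edges.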

Centralized vs. Federated Architectures
- Centralized: collect ontologies into one place
  - High startup, maintenance costs
  - Fast retrieval, facilitates integration
- Federated: ontologies stay put
  - Low startup, maintenance costs
  - Less performance, reliability
  - More requirements on ontology sites
- Hybrid
  - Centralize ontology level metadata, indices
  - Leave individual ontologies in place

Macro vs. Micro-level Issues
- Macro-level
  - Searching across a collection of ontologies and their metadata
- Micro-level
  - Searching, inferencing, within individual ontologies

Macro & Micro similarities
Most (not all) macro and micro level issues are essentially the same and can use the same technologies for implementation.

Macro-level Support
- Over collections of ontologies
- Use an ontology of ontologies
  - e.g., taxonomy of subject matter
- Ontology of ontology metadata

Ontology Search
- Text-based search
  - Natural language definitions
  - Symbols
  - e.g., Lucene, UIMA
- Semantic search
  - Over ontology representation (RDF, OWL, CL)
  - e.g., SPARQL, etc.
  - e.g., faceted search (e.g., Siderean)
  - e.g., navigation over taxonomies, etc.
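
As a rough illustration of text-based search over natural language definitions, the sketch below builds a small inverted index in plain Python. It is not how Lucene or UIMA work internally; the function names and the sample identifiers such as `ex:Car` are assumptions made for the example.

```python
import re
from collections import defaultdict

def build_index(definitions):
    """Map each lowercase word to the set of term ids whose definition uses it."""
    index = defaultdict(set)
    for term_id, text in definitions.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(term_id)
    return index

def search(index, query):
    """Return the term ids whose definitions contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)

# Hypothetical definitions keyed by term identifier.
definitions = {
    "ex:Car": "A road vehicle with four wheels",
    "ex:Bicycle": "A road vehicle with two wheels",
    "ex:Boat": "A vehicle that travels on water",
}
index = build_index(definitions)
```

Here `search(index, "road vehicle")` would return the identifiers of the two road vehicles. A semantic search would instead query the ontology's graph or logic representation.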

Ontology Representations
- Text
- Frames (OBO)
- Graphs (RDF)
- Logics (OWL-DL, OWL Full, CL)

Text Representation
- Obvious candidate for ontology representation of informal ontologies, with natural language definitions, etc.
- A lowest common denominator representation for more formal ontology representations
- Readily supports handling diverse ontology representations (must add tags for underlying ontology representation language)
- Only supports text search directly

Frame Representations
- Each frame is a collection of:
  - (slot, value) pairs, or
  - (slot, value list)
- Originally deployed in Lisp
- Secondary storage
  - Each frame is a BLOB
  - Or, decompose into finer grained DB entries
- Current uses:
  - OBO (open biological ontology) format
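
The two secondary-storage options for frames can be sketched as follows. The frame layout, the slot names, and the helper names are all hypothetical. JSON stands in for whatever BLOB encoding a real system would use.

```python
import json

# A frame as a collection of (slot, value list) entries (hypothetical content).
frame = {
    "id": "ex:Car",
    "slots": {
        "is_a": ["ex:Vehicle"],
        "wheels": [4],
        "propulsion": ["gasoline", "electric"],
    },
}

def to_blob(frame):
    """Option 1: store the whole frame as one opaque BLOB."""
    return json.dumps(frame, sort_keys=True).encode("utf-8")

def from_blob(blob):
    """Recover a frame from its BLOB form."""
    return json.loads(blob.decode("utf-8"))

def to_rows(frame):
    """Option 2: decompose into fine-grained (frame id, slot, value) rows."""
    return [
        (frame["id"], slot, value)
        for slot, values in frame["slots"].items()
        for value in values
    ]
```

The BLOB form is cheap to write and read whole but cannot be queried by slot without deserializing. The row form supports slot-level queries at the cost of reassembling frames on retrieval.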

Graph Representations
- a.k.a. semantic networks, semantic graphs
- Examples: RDF, RDF schemas, XLinks
- List of edges, each edge:
  - Subject
  - Predicate (relation name, attribute name)
  - Object (or attribute value)
- Very flexible
- Only support binary relations directly
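
A graph held as a list of (subject, predicate, object) edges can be queried with a simple pattern match, sketched below. The `match` helper and the `ex:` identifiers are assumptions for the example, and a `None` argument plays the role of a wildcard.

```python
# Each edge is a (subject, predicate, object) triple (hypothetical data).
triples = [
    ("ex:Car", "rdfs:subClassOf", "ex:Vehicle"),
    ("ex:Truck", "rdfs:subClassOf", "ex:Vehicle"),
    ("ex:Car", "ex:wheels", "4"),
]

def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every non-None component."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]
```

For instance, `match(triples, p="rdfs:subClassOf", o="ex:Vehicle")` lists the direct subclasses of `ex:Vehicle`. Because each edge relates exactly two things, a relation among three or more participants has to be encoded indirectly, typically through an extra node that stands for the relationship itself.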

Types of Graphs
- Trees
  - Simple taxonomies (isa), partonomies (partof)
- Multi-faceted classifications
  - Taxonomies with multiple facets
  - e.g., Vehicles: purpose, propulsion, wheels, axles, color
- Directed acyclic graphs
  - Multiple inheritance
  - Partial orders

Types of graphs
- Arbitrary directed graphs
  - Allow arbitrary binary relationships
- Named graphs
  - Allow separate inclusion hierarchy
  - Allow edges to point to/from subgraphs

Partial Orders
- Many ontologies are partial orders (i.e., directed acyclic graphs), e.g., taxonomies, partonomies, ...
- Mappings among partially ordered ontologies should be order preserving
- See work of Cliff Joslyn (PNNL)

Note: RDF graphs are collections of edges (triples)
- No naked nodes allowed

Graph Implementations
- Represent graph as:
  - Triple store (as on previous slide)
  - Quad store (support named graphs)
- Standalone system, relational DBMS, column store

Quad stores & Named graphs
- Quad stores allow named graphs
  - (named graph, subject, predicate, object)
- Named graphs (quads) allow one to name subgraphs (collections of edges) and to refer to them by name
- Hence, subjects and objects are no longer just nodes, but may be subgraphs (collections of edges)
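
The quad layout can be illustrated with plain tuples. In this sketch the graph names, predicates, and helper functions are all hypothetical. The third quad shows a graph name appearing in the subject position, so the statement is about a whole collection of edges.

```python
# Each quad is (named graph, subject, predicate, object).
quads = [
    ("ex:vehicles-v1", "ex:Car", "rdfs:subClassOf", "ex:Vehicle"),
    ("ex:vehicles-v1", "ex:Truck", "rdfs:subClassOf", "ex:Vehicle"),
    # A statement about the named graph itself: its name is the subject.
    ("ex:catalog", "ex:vehicles-v1", "ex:loadedOn", "2008-04-28"),
]

def edges_in(quads, graph_name):
    """Return the (s, p, o) edges that belong to one named graph."""
    return [(s, p, o) for g, s, p, o in quads if g == graph_name]

def statements_about(quads, graph_name):
    """Return the quads that use a graph name as their subject or object."""
    return [q for q in quads if q[1] == graph_name or q[2 + 1] == graph_name]
```

This is one way to attach metadata such as load dates or sources to an entire ontology or to a subset of its edges.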

Secondary storage of graphs
- Long skinny relations
  - Triples or quads
- Column stores (MonetDB, Vertica)
- Multiple indices sorted by:
  - subject, predicate, object, combinations, ...
- Clusters of edges (Cogito)
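
The idea of keeping multiple indices sorted by different column combinations can be sketched in memory with sorted lists and binary search. The class below is an illustration under assumed names (`TripleIndex`, `scan`) and does not describe the storage layout of any system named on the slide.

```python
from bisect import bisect_left

class TripleIndex:
    """Keeps three sorted copies of a triple set, one per column ordering."""

    # Column order for each index, as positions into an (s, p, o) triple.
    ORDERS = {"spo": (0, 1, 2), "pos": (1, 2, 0), "osp": (2, 0, 1)}

    def __init__(self, triples):
        unique = set(triples)
        self.indexes = {
            name: sorted(tuple(t[i] for i in order) for t in unique)
            for name, order in self.ORDERS.items()
        }

    def scan(self, name, *prefix):
        """Return (s, p, o) triples whose leading columns in index `name` equal `prefix`."""
        rows = self.indexes[name]
        order = self.ORDERS[name]
        results = []
        # Binary search finds the first row with the prefix; matching rows are contiguous.
        for row in rows[bisect_left(rows, prefix):]:
            if row[:len(prefix)] != prefix:
                break
            triple = [None, None, None]
            for position, column in enumerate(order):
                triple[column] = row[position]   # undo the column permutation
            results.append(tuple(triple))
        return results
```

A lookup by predicate and object would use the `pos` ordering, while a lookup by subject would use `spo`. The price is that every triple is stored once per index.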

Semantic graph query languages
- SPARQL is now the primary candidate
- Undergoing W3C standardization

Logic-based Ontology Representations
- Description Logic (e.g., OWL-DL)
  - Restricted to make it decidable and computationally tractable
  - Typically, lacks cardinality constraints, arithmetic
- Datalog (Horn clause logic + recursion)
  - Prolog based
- First Order Logic (e.g., Common Logic)
- IKL (FOL + name propositions)

Logic-based representations
- Precise, formal semantics
- Expressiveness (esp. FOL)
- Issues of scaling, decidability, computational tractability
  - Esp. for FOL
- Description Logics: growing usage
- DL + rules languages to approximate FOL

Ontology Integration
- Construct mappings between entities (concepts) in pairs of ontologies
- Mapping relations:
  - same_as, is_a, part_of, units_conversion
- Specify mappings via:
  - frames, graphs, or logic
- Graph-based mappings (C. Joslyn, PNNL)
- Logic-based mappings (PROMPT, N. Noy)

Partial Orders
- Many ontologies are partial orders (i.e., directed acyclic graphs), e.g., taxonomies, partonomies, ...
- Mappings among partially ordered ontologies should be order preserving
- See work of Cliff Joslyn (PNNL)
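
One way to read "order preserving" is that whenever a is below b in the source ontology, the image of a must be below (or equal to) the image of b in the target ontology. The check below is a sketch of that reading only, with hypothetical function names and a mapping given as a plain dictionary. It is not taken from the work cited on the slide.

```python
def leq_pairs(edges):
    """Reflexive-transitive closure of (child, parent) edges as a set of (x, y) pairs."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
        parents.setdefault(parent, set())
    pairs = set()
    for start in parents:
        seen, stack = {start}, [start]
        while stack:                          # walk upward from `start`
            for parent in parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        pairs.update((start, ancestor) for ancestor in seen)
    return pairs

def is_order_preserving(source_edges, target_edges, mapping):
    """True if x <= y in the source implies mapping[x] <= mapping[y] in the target.

    Source concepts missing from `mapping` are ignored (partial mappings allowed).
    """
    target = leq_pairs(target_edges)
    return all(
        mapping[x] == mapping[y] or (mapping[x], mapping[y]) in target
        for x, y in leq_pairs(source_edges)
        if x in mapping and y in mapping
    )
```

A mapping that sends a subclass above its own superclass in the target taxonomy would fail this check.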

Materialization of Partial Orders
- Partial orders = taxonomies, partonomies
- Typically specified as direct edges
  - Immediate is-a, or part-of relations
- Naïve implementation requires repeated traversal of the partial order graph
- Materialization of the transitive closure of the partial order (e.g., taxonomy) can reduce query times
- However, initialization and maintenance are expensive in time and storage
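
The trade-off on this slide can be made concrete with a small sketch that materializes every node's full ancestor set. The function name `materialize` is an assumption, the input is assumed to be acyclic, and the recursion would need to be replaced for very deep hierarchies.

```python
from collections import defaultdict

def materialize(edges):
    """Return {node: set of all ancestors} for acyclic (child, parent) edges."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    closure = {}

    def ancestors(node):
        if node not in closure:
            result = set()
            for parent in parents.get(node, ()):
                result.add(parent)
                result |= ancestors(parent)   # reuse already materialized sets
            closure[node] = result
        return closure[node]

    for node in list(parents):
        ancestors(node)
    return closure

# A chain of 5 concepts: 4 direct edges, but 10 materialized (node, ancestor) pairs.
chain = [(f"c{i}", f"c{i + 1}") for i in range(4)]
closure = materialize(chain)
```

After materialization, an is-a test is a single set lookup such as `"c4" in closure["c0"]` instead of a graph traversal. For a chain of n concepts the stored pairs grow as n(n-1)/2, and inserting or deleting one edge can invalidate many ancestor sets, which is the storage and maintenance cost the slide refers to.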

Ontology Constraints
- Type constraints
- Range, domain constraints
- Cardinality constraints on relations
- DB integrity constraints
  - Functional dependencies
  - Inclusion dependencies (foreign key constraints)
- Invertibility
- Disjointness (of subclasses)
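
Several of these constraints can be checked over instance data in the style of database integrity constraints, as sketched below. All names and data structures here are hypothetical. Note the assumption: the check is closed-world, so a missing type counts as a violation, whereas under open-world semantics (as in OWL) domain and range statements license inferences rather than rejections.

```python
from collections import Counter

def violations(triples, types, domain, range_, max_card, disjoint):
    """Return a list of constraint violations found in the instance data.

    triples:  (subject, predicate, object) facts
    types:    {instance: set of asserted classes}
    domain:   {predicate: class required of the subject}
    range_:   {predicate: class required of the object}
    max_card: {predicate: maximum number of values per subject}
    disjoint: pairs of classes that may share no instance
    """
    found = []
    for s, p, o in triples:
        if p in domain and domain[p] not in types.get(s, set()):
            found.append(("domain", s, p, o))
        if p in range_ and range_[p] not in types.get(o, set()):
            found.append(("range", s, p, o))
    counts = Counter((s, p) for s, p, _ in triples)
    for (s, p), n in counts.items():
        if p in max_card and n > max_card[p]:
            found.append(("cardinality", s, p, n))
    for a, b in disjoint:
        for instance, classes in types.items():
            if a in classes and b in classes:
                found.append(("disjoint", instance, a, b))
    return found
```

A maximum cardinality of 1 on a predicate corresponds to a functional dependency from the subject to that predicate's value.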

Need for Provenance
- Fiction: Ontologists write definitions ab initio
- Reality: Most definitions are written by:
  - Administrators (e.g., Code of Federal Regulations)
  - Legislatures (legislation)
  - Judges (court decisions)
  - Professional bodies (accounting regulations)

Implications for Provenance
- We need to track the provenance of definitions
- Typically this requires citations to external documents
- May also require tracking of individual definition decisions ...
- Varying granularity requirements
  - Individual definitions
  - Collections of axioms, definitions
- Examples: see ISO 11179, XMDR
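
The two granularities of provenance mentioned here can be sketched with a pair of record types. This is an illustrative data model only. It is not the ISO 11179 or XMDR metamodel, and every class name, field name, and citation string is a placeholder.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Citation:
    source: str          # external document, e.g. a regulation or court decision
    locator: str = ""    # section, paragraph, or page within that document

@dataclass
class Definition:
    term: str
    text: str
    citations: list = field(default_factory=list)   # definition-level provenance

@dataclass
class DefinitionSet:
    name: str
    definitions: list = field(default_factory=list)
    citations: list = field(default_factory=list)   # collection-level provenance

    def provenance_of(self, term):
        """Use a definition's own citations, else fall back to the collection's."""
        for definition in self.definitions:
            if definition.term == term:
                return definition.citations or self.citations
        raise KeyError(term)

# Placeholder content; the sources are invented labels, not real references.
collection = DefinitionSet(
    name="example-glossary",
    citations=[Citation("Example agency glossary (hypothetical)")],
    definitions=[
        Definition("wetland", "placeholder text",
                   [Citation("Example regulation (hypothetical)", "section 2")]),
        Definition("stream", "placeholder text"),
    ],
)
```

Tracking individual definition decisions would need further records (who decided, when, and on what basis), which this sketch leaves out.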

Other Tools
- Ontology creation tools
- Ontology editors
- Ontology differencing tools
- Ontology modularization tools (clustering, etc.)
- Ontology export
- Ontology visualization (e.g., graph visualization)
- Version management
- Access control

SOA: Service Oriented Architecture
- Very popular
- Permit distributed implementations
- Two major alternatives:
  - REST (Representational State Transfer)
    - Built on HTTP (get, put, delete, post operators)
    - URL/URI addresses for all objects
  - SOAP/WSDL
    - Based on XML Remote Procedure Calls

REST vs SOAP
- REST
  - Simple to implement
  - Requires little more than:
    - HTTP server
    - XML parsers
- SOAP
  - Much more software complexity
  - Lots of software tooling from commercial vendors
  - Better security?
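
To show why REST needs so little machinery, the sketch below maps the four HTTP operators onto an in-memory collection of ontology documents. It deliberately omits the HTTP server itself. The URL layout (`/ontologies/<name>`), the class name, and the generated identifiers are assumptions for the example, not an interface defined by any repository mentioned in the talk.

```python
class OntologyResource:
    """Dispatches (method, path, body) requests to an in-memory document store."""

    PREFIX = "/ontologies/"

    def __init__(self):
        self.documents = {}
        self.next_id = 1

    def handle(self, method, path, body=None):
        """Return a (status code, response body) pair."""
        # POST to the collection creates a document under a server-chosen URL.
        if method == "POST" and path == "/ontologies":
            name = f"ontology-{self.next_id}"
            self.next_id += 1
            self.documents[name] = body
            return 201, self.PREFIX + name
        if not path.startswith(self.PREFIX) or path == self.PREFIX:
            return 404, None
        name = path[len(self.PREFIX):]
        if method == "GET":
            if name in self.documents:
                return 200, self.documents[name]
            return 404, None
        if method == "PUT":      # create or replace at a client-chosen URL
            status = 200 if name in self.documents else 201
            self.documents[name] = body
            return status, None
        if method == "DELETE":
            if name not in self.documents:
                return 404, None
            del self.documents[name]
            return 204, None
        return 405, None
```

Every ontology has its own URL, and the client needs nothing beyond an HTTP library and a parser for whatever serialization the body uses.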

My advice on REST vs. SOAP: Use REST.

Ontology Repository Related Standards
- ISO/IEC Metadata Registries (version 3.0 of Part 3)
- OMG ODM Ontology Definition Metamodel
- ISO Topic Maps
- XML Topic Maps Specification (topicmaps.org)
- W3C OWL recommendations
- W3C RDF recommendations

Ontology Related Standards
- ISO/IEC Common Logic
- ISO TC 37 Terminology Services Standards
- W3C SKOS Simple Knowledge Organization System Reference
- ISO/IEC Metamodel Framework for Interoperability (ontology metadata)

Recapitulation
- Ontology repositories support storage, search, retrieval of multiple ontologies and ontology integration
- Macro-level and micro-level support and search pose similar problems
- A common ontology representation is desirable, but difficult
- Multiple ontology representations and ontology integration are the most difficult issues.

Acknowledgements
This work was supported by NSF IPA agreement with LBNL, IRD support. My earlier work on ontology repositories at LBNL was supported by EPA and DOD. The author would like to thank Joel Sachs, Mark Musen, Natasha Noy, Eric Neumann, Bob MacGregor, Cliff Joslyn, Kevin Keck, Elisa Kendall, Mala Mehrotra, Dan Abadi, Deb McGuinness, et al. for their remarks to me about knowledge representation, ontology repositories and ontology mappings.

Contact Information
Frank Olken
National Science Foundation
4201 Wilson Blvd., Suite 1125
Arlington, VA
Tel: (receptionist)
Tel: (direct)