1 eXtended Metadata Registry (XMDR) Interagency/International Cooperation on Ecoinformatics Ispra, Italy January 17, 2006 Bruce Bargmeyer, Lawrence Berkeley.

Presentation transcript:

1 eXtended Metadata Registry (XMDR) Interagency/International Cooperation on Ecoinformatics Ispra, Italy January 17, 2006 Bruce Bargmeyer, Lawrence Berkeley National Laboratory University of California Tel:

2 XMDR Project Collaboration
- Collaborative, interagency effort
  - EPA, USGS, NCI, Mayo Clinic, DOD, LBNL … & others
- Draws on and contributes to Interagency/International Cooperation on Ecoinformatics
- Involves Ecoterm, international, national, state, local government agencies, other organizations as content providers and potential users
- Interacts with many organizations around the world through ISO/IEC standards committees

3 XMDR Project Results: Bootstrapping Semantic Computing
- Design for next-generation metadata registries, expressed as a standard
- XMDR Prototype, open source software
- Content loaded in prototype: millions of concepts, terms, and relations between concepts
- Demonstrations for healthcare and the environment

4 Metadata Registry Extensions
- Register (and manage) any semantics that are useful for managing data.
  - E.g., this may include registering not only permissible values (concepts) and definitions, but may extend to registering the full concept systems in which the permissible values are found.
  - E.g., may want to register keywords, thesauri, taxonomies, ontologies, axiomatized ontologies ….
- Support traditional data management and data administration
- Lay the foundation for semantic computing: Semantics Service Oriented Architecture, Semantic Grids, semantics-based workflows, Semantic Web ….

5 Where have we been? Where are we planning to go?
[Diagram labels: System manuals; Data dictionaries; E; E3; XML & related standards; Semantic grids; E2; Semantics services (SSOA); Complex semantics management; Data engineering; Data Standards; XMDR Project; Semantics: Semantic Web; Data + ontology lifecycle management; Terminologies, ontologies; Data Management/Data Administration]

6 XMDR Draws Together
[Diagram labels: Metadata Registry; Terminology; Thesaurus; Themes; Data Standards; Ontology; GEMET; Structured Metadata; Users; Registries; Terminology]
[Diagram labels: CONCEPT; Referent; Refers To; Symbolizes; Stands For; “Rose”, “ClipArt”]

7 Concept System Store
[Diagram labels: Metadata Registry; Concept System; Thesaurus; Themes; Data Standards; Ontology; GEMET; Structured Metadata; Users]
Concept systems:
- Keywords
- Controlled Vocabularies
- Thesauri
- Taxonomies
- Ontologies
- Axiomatized Ontologies
(Essentially graphs: node-relation-node + axioms)

8 Management of Concept Systems
[Diagram labels: Metadata Registry; Concept System; Thesaurus; Themes; Data Standards; Ontology; GEMET; Structured Metadata; Users]
Concept system:
- Registration
- Harmonization
- Standardization
- Acceptance (vetting)
- Mapping (correspondences)

9 Life Cycle Management
[Diagram labels: Metadata Registry; Concept System; Thesaurus; Themes; Data Standards; Ontology; GEMET; Structured Metadata; Users]
Life cycle management:
- Data and Concept systems (ontologies)

10 Grounding Semantics
[Diagram labels: Metadata Registry; Concept System; Thesaurus; Themes; Data Standards; Ontology; GEMET; Structured Metadata; Users; Registries; Semantic Web; Ontologies]
RDF Triples:
- Subject (node URI)
- Verb (relation URI)
- Object (node URI)
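The subject/verb/object structure on this slide can be sketched in a few lines of plain Python. This is only an illustration, not the XMDR data model: the `Triple` type, the `match` helper, and every `http://example.org/...` URI are assumptions made for the example. The `rdfs:subClassOf` URI is the standard RDF Schema one.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # node URI
    predicate: str  # relation URI (the slide's "verb")
    obj: str        # node URI

# Hypothetical namespace and resources, for illustration only.
EX = "http://example.org/xmdr#"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

graph = {
    Triple(EX + "Marsh", RDFS_SUBCLASS_OF, EX + "Wetland"),
    Triple(EX + "Wetland", EX + "definedIn", EX + "ExampleThesaurus"),
}

def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )

# Everything asserted about classes directly below Wetland.
below_wetland = match(graph, predicate=RDFS_SUBCLASS_OF, obj=EX + "Wetland")
```

Because every node and relation is a URI, two concept systems that use the same URI for a concept end up sharing a node once their triples are put in the same set.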

11 XMDR Prototype Architecture: Initial Modules
[Diagram labels: Ontology Editor; Protege; OWL Ontology; MetadataValidator; AuthenticationService; MappingEngine; Registry External Interface; Jena, Xerces; Java; RetrievalIndex; FullTextIndex; Lucene; LogicBasedIndex; Jena, OWI KS Racer; RegistryStore; WritableRegistryStore; Subversion]
[Legend: Generalization; Composition (tight ownership); Aggregation (loose ownership)]

12 XMDR Prototype Architecture: Initial Implemented Modules
[Diagram labels: Ontology Editor; Protege; OWL Ontology; Registry External Interface; Java; RetrievalIndex; FullTextIndex; Lucene; LogicBasedIndex; Jena; Racer, etc.; RegistryStore; WritableRegistryStore; Subversion]
[Legend: Generalization; Composition (tight ownership); Aggregation (loose ownership)]

13 UML is Used for Metamodel, XMDR uses OWL, RDF & XML Schema
[Diagram labels: OWL; XMDR Ontology & annotations; XMDR’s Relax NG Schema; XMDR XML Schema; UML11179 Metamodel; Relational Schema; Relational Metadata; RDF Spec; TRang; XML Schema Language spec; XML; Objects; Types & Cardinalities; What things go in own files?; Which property direction stored?; Sequential ordering of properties; Triples: binary labeled relationships]

14 Refined XMDR Subclasses Improve Organization & Enable Inference

15 XMDR Example Content Loaded from Diverse Sources via LexGrid & XSLT
[Diagram labels: Original Source A; LexGrid Source A; XSLT script; Harold Solbrig (Mayo Clinic); Concept System A; A Concepts; A Relationships]
Content loaded to date: 2.7 million triples

16 XMDR Content List (partial)
- NBII Biocomplexity Thesaurus
- NCI Thesaurus (National Cancer Institute Thesaurus)
- NCI Data Elements (National Cancer Institute Data Standards Registry)
- UMLS (non-proprietary portions)
- GEMET (General Multilingual Environmental Thesaurus)
- EDR Data Elements (Environmental Data Registry)
- USGS Geographic Names Information System (GNIS)
- HL7 Terminology, Data Elements
- Mouse Anatomy
- GO (Gene Ontology)
- EPA Web Registry Controlled Vocabulary
- BioPAX Ontology
- NASA SWEET Ontologies
- …

17 NASA-JPL Semantic Web for Earth and Environmental Terminology
- SWEET written in OWL ontology language (W3C)
  - Can view with Internet Explorer 5+, Netscape 7+, etc.
  - Can also use OWL-specific tools (e.g., SWOOP, Protégé)
- Terms in other taxonomies can be mapped to SWEET using
  - Global Change Master Directory (GCMD)
  - CF Standard Names
- Earth Realms; Physical Phenomena (any transient feature); Physical Processes; Physical Properties; Physical Substances; Sun Realms; Biosphere Data; Data Centers; Human Activities; Material Things; Numerics; Sensors; Space; Time; Units

18 Content Loaded from EPA EDR and NASA SWEET
[Diagram labels: Ontology concepts & relationships; XMDR ontology; SWEET (OWL); java; EDR; XMDR files (ontologies)]

19 What happens to XMDR files before they can be used for text searching or inference?
[Diagram labels: Concept System A (A Concepts, A Relationships); Concept System B (B Concepts, B Relationships); etc. …; XMDR Ontology; Lucene; Lucene indexes; Jena; Model A; Model B; … etc; Union of all models]
[Diagram annotations: all xmdr files; each system (A, B, … etc) loaded individually]
- Text queries (Lucene)
- Inference queries (Jena)
- Search/query results are sets of URLs for the xmdr files pictured above
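The preparation step on this slide (a text index over the files, one model per concept system, and a union of all models) can be sketched with plain Python stand-ins. The sketch does not use Lucene or Jena; `SYSTEMS`, the file URLs, the texts, and both helper functions are hypothetical and only mirror the shape of the diagram.

```python
import re
from collections import defaultdict

# Hypothetical registry content: concept system -> {file URL: (text, triples)}.
SYSTEMS = {
    "A": {
        "http://example.org/xmdr/A/wetland": (
            "Wetland: land saturated with water",
            {("A:Wetland", "rdfs:subClassOf", "A:LandRegion")},
        ),
    },
    "B": {
        "http://example.org/xmdr/B/marsh": (
            "Marsh: a wetland dominated by grasses",
            {("B:Marsh", "rdfs:subClassOf", "A:Wetland")},
        ),
    },
}

def build_text_index(systems):
    """Inverted index: lowercase word -> set of file URLs (a stand-in for Lucene)."""
    index = defaultdict(set)
    for files in systems.values():
        for url, (text, _triples) in files.items():
            for word in re.findall(r"[a-z]+", text.lower()):
                index[word].add(url)
    return index

def build_models(systems):
    """One triple set per concept system, plus the union of all of them."""
    models = {
        name: set().union(*(triples for _text, triples in files.values()))
        for name, files in systems.items()
    }
    union_model = set().union(*models.values())
    return models, union_model

text_index = build_text_index(SYSTEMS)
models, union_model = build_models(SYSTEMS)

# A text query returns a set of file URLs, as the slide describes.
hits = text_index["wetland"]
```

Keeping a model per system as well as the union lets a query be scoped to one concept system or run across all of them.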

20 Enterprise Vocabulary Services (EVS) Concepts Unite NCI MDR
- Object Class: Chemopreventive Agent
- Property: NSCNumber
- Conceptual Domain: Agent
- Data Element Concept: Chemopreventive Agent NSC Number
- Data Element: Chemopreventive Agent Name
- Value Domain: NSC Code
- Context: caCORE
- Representation: Code
- Classification Schemes: caDSRTraining
- Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, …, Ursodiol
But how can we search/query such a complex system of metadata and vocabularies?

21 How to Search/Query Complex Concepts & Relationships
[Diagram legend: New Proposed Objects; Current Objects]

22 How Can Terminologies and Ontologies Help Manage Metadata?
- At the metadata registry schema level (ISO/IEC metamodel)
  - Ontologies specify formal relationships
  - Compute across the nodes and relations in the metamodel
    - Inheritance, aggregation, …
  - Search sub-classes & inverses, specify semantic pathways for indexing
- At the level of metadata (concept system) instances in a registry
  - Compute across the nodes, relations and axioms in concept systems
  - Connect metadata entities via shared terms
    - Via automatic indexing of metadata words
    - Via text values from specific metadata elements

23 XMDR RDF Graph Query Facilities Complement Text Query Capabilities
- SQL-like queries
  - e.g., names of ontologies in a registry
- Span items that are only indirectly connected
  - e.g., data elements associated with a conceptual domain
- Expand queries to subsumed classes in hierarchy
  - e.g., ConceptualDomain includes EnnumeratedConc..
- Transitivity
  - e.g., all subclasses subsumed by a higher order class
  - e.g., all superclasses (ancestors) of a particular class
- Least common ancestor
  - e.g., closest subsuming concept for 2 concepts

24 Example Subclass Queries: (Inference with Transitivity)
- Environmental:
  - What are all the (sub)types of Wetland (in SWEET)?
    RDQL: SELECT ?x WHERE (?x rdfs:subClassOf earthrealm:Wetland) USING earthrealm FOR
- Health:
  - Find all the types of "Lung Carcinoma"
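The "inference with transitivity" part of this query can be made concrete with a small sketch. A pattern over asserted subclass triples only reaches direct children; collecting all (sub)types means following the subclass links repeatedly. The class names below are invented placeholders, not actual SWEET content, and `all_subclasses` is a hypothetical helper rather than anything from the XMDR prototype.

```python
from collections import defaultdict

# Hypothetical direct subclass edges as (child, parent); not actual SWEET content.
SUBCLASS_OF = [
    ("Marsh", "Wetland"),
    ("Swamp", "Wetland"),
    ("SaltMarsh", "Marsh"),
    ("Wetland", "LandRegion"),
]

def all_subclasses(edges, root):
    """Return every class that is directly or transitively a subclass of root."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    found, stack = set(), [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:  # also guards against cycles
                found.add(child)
                stack.append(child)
    return found

wetland_types = all_subclasses(SUBCLASS_OF, "Wetland")
```

Here `SaltMarsh` is returned even though no triple links it to `Wetland` directly, which is the extra result a transitive reasoner contributes to the query.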

25 More Complex “Sibling” Queries: Concepts with Multiple Ancestors
- Health:
  - Find all the siblings of Breast Neoplasm
    - Note: This is complex, since Breast Neoplasm has two parents, Neoplasm by Site and Breast Disorder. The result would include both the by-site Neoplasms, such as Eye Neoplasm, Respiratory System Neoplasm, etc., and the Breast Disorder siblings, such as Non-Neoplastic Breast Disorder.
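A minimal sketch of the sibling query, using only the concepts named on this slide. The `PARENTS` mapping is a hand-built fragment for illustration (the real thesaurus has many more concepts), and `siblings` is a hypothetical helper; grouping the result by parent is one design choice among several.

```python
# Direct parents of each concept; a small fragment built from the slide's example.
PARENTS = {
    "Breast Neoplasm": {"Neoplasm by Site", "Breast Disorder"},
    "Eye Neoplasm": {"Neoplasm by Site"},
    "Respiratory System Neoplasm": {"Neoplasm by Site"},
    "Non-Neoplastic Breast Disorder": {"Breast Disorder"},
}

def siblings(parents, concept):
    """Map each direct parent of `concept` to the other concepts sharing that parent."""
    result = {}
    for parent in parents.get(concept, set()):
        result[parent] = {
            other
            for other, other_parents in parents.items()
            if parent in other_parents and other != concept
        }
    return result

breast_neoplasm_siblings = siblings(PARENTS, "Breast Neoplasm")
```

Grouping by parent keeps the two sibling families apart, so a caller can tell the by-site neoplasms from the breast disorders instead of receiving one mixed set.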

26 Least Common Ancestor Queries: (Inference with Transitivity)
- Health:
  - "Morphine Sulfate" and "Acetaminophen"
    - Least common ancestor should be Analgesic Agent (with multiple intervening concepts).
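One way to compute a least common ancestor is to walk upward from both concepts and pick the shared ancestor with the smallest combined distance. The sketch below assumes that definition. The slide only says that multiple concepts intervene, so the `Intermediate ...` and `Placeholder Root` names are placeholders, and both functions are hypothetical helpers.

```python
from collections import deque

# Direct parent links. Intermediate and root names are placeholders.
PARENTS = {
    "Morphine Sulfate": ["Intermediate A2"],
    "Intermediate A2": ["Intermediate A1"],
    "Intermediate A1": ["Analgesic Agent"],
    "Acetaminophen": ["Intermediate B1"],
    "Intermediate B1": ["Analgesic Agent"],
    "Analgesic Agent": ["Placeholder Root"],
}

def ancestor_distances(parents, start):
    """Breadth-first search upward: node -> fewest edges from start (start itself is 0)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist

def least_common_ancestor(parents, a, b):
    """Shared ancestor with the smallest combined distance, or None if there is none."""
    da = ancestor_distances(parents, a)
    db = ancestor_distances(parents, b)
    common = set(da) & set(db)
    if not common:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return min(common, key=lambda c: (da[c] + db[c], c))

lca = least_common_ancestor(PARENTS, "Morphine Sulfate", "Acetaminophen")
```

In a hierarchy with multiple parents there can be several equally close common ancestors; this sketch returns just one of them.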

27 Searching caDSR for Data Elements via Concepts and Vice-Versa
- Common Data Elements (CDEs) are "connected" to concepts through the Object Class and Property of the CDE. A query such as this should look for the CDE's Object Class derivation rule and select only those data elements associated with those object classes.
- Alternatively, you could query the caDSR Concept Class and find all related Object Classes (OCs) where the concept was flagged as "primary concept", then get all the Data Elements by leveraging the ISO relationships, e.g., an Object Class has related Data Element Concepts (DECs), and DECs have related Data Elements (DEs).
- Concepts can also be associated with Value Meanings. So: search Concept Class with the concept code, find all related Value Meanings, find all Value Domains that used those Value Meanings, then find all Data Elements that used those Value Domains.
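The last path above (concept code, then Value Meanings, then Value Domains, then Data Elements) is a chain of one-hop lookups, sketched here with plain dictionaries. Every identifier (`C-0001`, `VM-1`, `VD-10`, `DE-100`, ...) is made up, and the three mappings are a hypothetical stand-in for whatever the caDSR actually stores.

```python
# Hypothetical registry links; all identifiers are invented for illustration.
VALUE_MEANINGS_BY_CONCEPT = {"C-0001": {"VM-1", "VM-2"}}
VALUE_DOMAINS_BY_VALUE_MEANING = {"VM-1": {"VD-10"}, "VM-2": {"VD-10", "VD-11"}}
DATA_ELEMENTS_BY_VALUE_DOMAIN = {"VD-10": {"DE-100"}, "VD-11": {"DE-101", "DE-102"}}

def follow(links, keys):
    """Union of everything reachable in one hop from any of the given keys."""
    reached = set()
    for key in keys:
        reached |= links.get(key, set())
    return reached

def data_elements_for_concept(concept_code):
    """Concept -> Value Meanings -> Value Domains -> Data Elements."""
    value_meanings = follow(VALUE_MEANINGS_BY_CONCEPT, {concept_code})
    value_domains = follow(VALUE_DOMAINS_BY_VALUE_MEANING, value_meanings)
    return follow(DATA_ELEMENTS_BY_VALUE_DOMAIN, value_domains)

elements = data_elements_for_concept("C-0001")
```

The Object Class path would have the same shape with different mappings: concept to Object Classes, Object Classes to Data Element Concepts, and those to Data Elements.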

28 Reasoners Use OWL Ontologies to Augment RDF Graph Queries
[Diagram labels: OWL Ontology; OWL built-in rules; RDF Query (rdql/nrdql/SPARQL); Reasoner; Jena (main memory); result set includes subclasses, inverses, etc; metadata (xml/rdf files)]
Jena is a Java framework for building Semantic Web applications. Jena provides a programmatic environment for RDF, RDFS and OWL, including a rule-based inference engine. Jena is open source and grew out of work with the HP Labs Semantic Web Programme.
Introduction: Jena API Overview:
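How a rule-based reasoner can add subclasses and inverses to a result set can be illustrated with a toy forward-chaining loop. This is not Jena's rule engine or its API. It is a naive sketch with two hand-written rules (transitivity of `rdfs:subClassOf`, and `owl:inverseOf`), and the `ex:` properties and class names are hypothetical.

```python
def infer(triples):
    """Apply two rules repeatedly until no new triples appear (naive forward chaining)."""
    closed = set(triples)
    while True:
        new = set()
        # Rule 1: (a subClassOf b) and (b subClassOf c) => (a subClassOf c)
        for s, p, o in closed:
            if p == "rdfs:subClassOf":
                new |= {
                    (s, p, o2)
                    for s2, p2, o2 in closed
                    if p2 == "rdfs:subClassOf" and s2 == o
                }
        # Rule 2: (p1 inverseOf p2) and (x p1 y) => (y p2 x), in both directions
        inverse_pairs = {(s, o) for s, p, o in closed if p == "owl:inverseOf"}
        for p1, p2 in inverse_pairs:
            for s, p, o in closed:
                if p == p1:
                    new.add((o, p2, s))
                if p == p2:
                    new.add((o, p1, s))
        if new <= closed:
            return closed
        closed |= new

# Hypothetical asserted metadata.
asserted = {
    ("SaltMarsh", "rdfs:subClassOf", "Marsh"),
    ("Marsh", "rdfs:subClassOf", "Wetland"),
    ("ex:hasPart", "owl:inverseOf", "ex:partOf"),
    ("Registry", "ex:hasPart", "Index"),
}
entailed = infer(asserted)
```

A query run against `entailed` instead of `asserted` sees the indirect subclass link and the inverse statement without the query itself changing.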

29 Comparison of Different Reasoners (on 2.7m triples)

30 Challenges and Future Goals for XMDR Prototype
- Scalability & performance
- Tools
  - RDF tool adaptation for metadata registries
  - User-friendly interface
  - Form interface for registration & uploading metadata
- References to externally maintained sources
  - Data, ontologies, terminologies
- Evaluate alternative technologies
  - For different modules
- Demonstrate for key use cases and ecoinformatics applications

31 Challenges and Future Goals (cont)
- Progress proposals through standards committees
- Harmonization with W3C and OMG standards
- Incorporate Common Logic, Web Services, etc.
- Ontology Lifecycle Management (OLM)
- Improve link of concepts to data
- Generate schemas from axiomatized ontologies

32 Ecoinformatics Challenges
- How does this fit into the research, development, and demonstration activities of the Interagency/International Cooperation on Ecoinformatics?
- Should this be a part of the EU-US collaborative R&D?