Semantic Interoperability: caCORE and the Cancer Data Standards Repository (caDSR)
Jennifer Brush

Session Outline
- Audience Interview
  – What do you want to learn?
- Standard Vocabularies
  – NCI Thesaurus as part of EVS
- Metadata and Data Elements
  – Their differences and why we use them
- caCORE Infrastructure and caDSR
  – How it all fits together
- caDSR Tools
  – CDE Browser
  – CDE Curation Tool
  – Sentinel Tool
- Semantic Interoperability
  – UML Model Browser / Semantic Integration Workbench

Standard Vocabularies
- Facilitate translational research
- Integrate diverse data systems
- Improve the links between clinical research and the healthcare delivery system

Enterprise Vocabulary Services (EVS)
- Address NCI’s needs for controlled vocabulary and semantics
- Components:
  – NCI Thesaurus
    Stand-alone reference terminology
    One definition for cancer research
    Designed for annotation and database coding to facilitate data analysis and retrieval
  – NCI Metathesaurus
    Relational: links to multiple terminologies
    One or more definitions from multiple sources
    Designed for mapping cancer terms across terminologies throughout the cancer research community to facilitate integration

Use EVS
- Check and compare dictionary definitions.
- Find synonyms.
- Determine relationships to other concepts/terms.
- Identify and evaluate potential options when curating new CDEs or adding terms to permissible value lists.
- Provide links to related research publications.
- If you can’t find a term, you can submit a new one.
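As a rough illustration of the lookups listed above (definitions, synonyms, relationships to other concepts), here is a minimal in-memory sketch in Python. The `Concept` class, the `find_by_term` helper, the concept codes, and the sample definitions are all hypothetical; nothing here calls the real EVS services.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    code: str                       # placeholder code, not a real NCI Thesaurus code
    preferred_name: str
    definition: str
    synonyms: list[str] = field(default_factory=list)
    parents: list[str] = field(default_factory=list)   # codes of broader concepts


# Toy vocabulary; every entry is invented for illustration only.
VOCABULARY = {
    "X0000": Concept("X0000", "Molecular Structure", "Placeholder parent concept."),
    "X0001": Concept("X0001", "Gene", "A functional unit of heredity.",
                     synonyms=["Genes"], parents=["X0000"]),
}


def find_by_term(term: str) -> list[Concept]:
    """Return concepts whose preferred name or any synonym equals the term."""
    wanted = term.strip().lower()
    return [
        c for c in VOCABULARY.values()
        if wanted == c.preferred_name.lower()
        or wanted in (s.lower() for s in c.synonyms)
    ]


for concept in find_by_term("genes"):
    print(concept.code, concept.preferred_name, "-", concept.definition)
    print("  broader:", [VOCABULARY[p].preferred_name for p in concept.parents])
```

Keying every lookup on a concept code rather than on a display string is the point of the sketch: synonyms resolve to the same concept, so two curators using different words still land on one definition.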

Exercise 1 - Examine an EVS Term
- Complete: Exercise 1 from the “Semantic Interoperability” exercise handout.
- Time: 2 minutes

Exercise 1 - Examine an EVS Term
1. Navigate to the NCI Terminology Browser: nciterms.nci.nih.gov
2. Select the NCI Thesaurus
3. Select “Connect”
4. Enter “gene” in the Quick Search entry field, then select “Gene” from the results

[Screenshot callout: concept code]

Define Metadata
- Metadata is data about data
- Metadata describes the content, quality, condition, and other characteristics of data
- Example: If a question on a form reads “What is your age?”
  – What is the data?
  – What is the metadata?

caDSR Overview: Metadata Example: Age
- Data: 33, the answer to “What is your age?”, stored in a local database
- Metadata: describes the data, stored in the caDSR metadata repository
  – Person Self Reported Age (data element)
  – Person Self Reported Age (data element concept)
  – Person (object class)
  – Self Reported Age (property)
  – Age Values (value domain)
    Datatype: Numeric
    Max length: 10
    Version: 2.0
    High Value: 999
    Low Value: 0
    Type: Non-enumerated
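The split between the data (33) and the metadata (the "Age Values" value domain) can be sketched in a few lines of Python. The constraint values are copied from the slide; the `ValueDomain` class and the `is_valid` function are hypothetical names for this illustration and are not part of any caDSR API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    name: str
    datatype: str
    max_length: int
    low_value: int
    high_value: int


# Constraint values taken from the slide's "Age Values" value domain.
AGE_VALUES = ValueDomain("Age Values", "Numeric",
                         max_length=10, low_value=0, high_value=999)


def is_valid(raw: str, vd: ValueDomain) -> bool:
    """Check one raw answer against the value domain's constraints."""
    if len(raw) > vd.max_length:
        return False
    if not raw.isdigit():            # this sketch accepts whole numbers only
        return False
    return vd.low_value <= int(raw) <= vd.high_value


print(is_valid("33", AGE_VALUES))     # the data value from the slide
print(is_valid("1200", AGE_VALUES))   # above the High Value
```

The local database only ever stores the string "33"; everything that makes that value interpretable lives in the metadata record.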

Data Elements
- A data element is a standard way of describing and representing metadata
  – e.g. caDSR contains metadata based on the ISO/IEC 11179 metadata standard
- “Semantically Immutable Metadata” are data elements that are made up of one or more terms from a standard vocabulary
- “Semantically Interoperable Systems” base their data models (in our case, UML Class Diagrams) on metadata that is semantically immutable

Data Element Fundamentals
- Data Element Concept (DEC) = Object Class + Property
- Value Domain (VD), named by a Representation Term
- Data Element (DE) = DEC + VD
- Object Class + Property + Rep Term = Data Element

Data Element Fundamentals (example)
- Object Class: Person
- Property: Address
- Data Element Concept (DEC): Person Address
- Value Domain (VD): Zip Code
- Data Element (DE) = DEC + VD: Person Address Zip Code
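The composition rule above (Object Class + Property + Representation Term = Data Element) can be written out as a small Python sketch. The class names mirror the terms on the slide, but the classes themselves are hypothetical and much simpler than the real ISO/IEC 11179 model; the "State Code" value domain is borrowed from the reusable-components slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str
    prop: str

    @property
    def long_name(self) -> str:
        return f"{self.object_class} {self.prop}"


@dataclass(frozen=True)
class ValueDomain:
    representation_term: str


@dataclass(frozen=True)
class DataElement:
    dec: DataElementConcept
    vd: ValueDomain

    @property
    def long_name(self) -> str:
        # DE name = DEC name + representation term of the value domain
        return f"{self.dec.long_name} {self.vd.representation_term}"


person_address = DataElementConcept("Person", "Address")
zip_de = DataElement(person_address, ValueDomain("Zip Code"))
state_de = DataElement(person_address, ValueDomain("State Code"))  # same DEC, different VD

print(zip_de.long_name)
print(state_de.long_name)
```

Because the DEC and the VD are separate objects, one "Person Address" concept can be paired with several value domains without being redefined.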

Libraries of Reusable Components
[Diagram: a library of reusable DECs and VDs combined into a DE. Highlighted: ID# 106, Person Address Zip Code]

Libraries of Reusable Components (continued)
[Diagram: the same library of DECs and VDs. Highlighted: ID# 106, ID# 77, Person Address State Code]

How Data Elements are Used
- On Forms for data collection (CRFs)
- In Databases to describe database field attributes and constraints
- In information/UML Modeling
- Support APIs
- To describe application user interface components, validation rules, display name and format

Cancer Data Standards Repository (caDSR)
- Metadata repository and registry
- Based on the ISO/IEC 11179 standard for metadata registries
- Designed to integrate caCORE infrastructure
- Supports the development and deployment of Data Elements that are used as metadata descriptors

ISO: International Organization for Standardization
- ISO is a non-governmental network of the national standards institutes of 151 countries
- ISO has standards for mathematics, manufacturing, electrical, mechanical, and civil engineering, imaging, electronics, and information technology
- Benefits of using ISO/IEC 11179:
  – Metadata model fully supports the variations needed for biomedical applications
  – Easier to understand and share cancer research information

caCORE Components
[Diagram labels: Enterprise Vocabulary; Data Standards; Bioinformatics Objects]

caCORE Infrastructure
[Diagram labels: Vocabulary for CDE specification; Dictionary, thesaurus services; Domain object metadata; Common data elements (CDEs); Public APIs]

caDSR Tools: Purpose
- caDSR Tools are designed to:
  – Create, consume, distribute and promote ISO/IEC 11179 compliant metadata
  – Enable semantic consistency across research domains
  – Support the metadata life-cycle and governance processes

caDSR Tools
- CDE Browser / FormBuilder
  – Search for and Download Data Elements
  – Collect Data Elements onto Forms and Download Forms
- CDE Curation Tool
  – Curate (Create and Edit) Data Element Concepts, Value Domains and Data Elements
- Sentinel Tool
  – Create Alert Definitions to monitor changes to caDSR metadata

CDE Browser (Search & Download)
- caDSR Search Tree: Displays all the current caDSR Contexts. Users can search for groups of DEs by navigating the tree.
- Data Element Search Pane: This is the main search window. Users looking for Data Elements can enter a keyword or phrase.
- Navigation Menu: Use these buttons to navigate to the CDE cart, Form Builder, or back to Home (that is, back to this page).

Exercise 2 – Examine a Data Element in the CDE Browser
- Complete: Exercise 2 from the “Semantic Interoperability” exercise handout.
- Time: 5 minutes

Exercise 2 – Examine a Data Element in the CDE Browser
- Navigate to the CDE Browser
- Select the third option, “At least one of the terms”
- Enter “gene” in the search term field
- Scroll down to “Gene Identifier java.lang.Long” in the results list; select the Long Name to open the Data Element details window
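The “At least one of the terms” option used in this exercise amounts to an any-term match over Data Element long names. The Python sketch below shows that idea next to an all-terms match for contrast. The `search` function and its `mode` values are hypothetical, and the real CDE Browser may search other fields or support wildcards that are not modeled here. The sample long names come from this deck.

```python
# Long names that appear elsewhere in this deck; used as a toy search corpus.
LONG_NAMES = [
    "Gene Identifier java.lang.Long",
    "Gene Name java.lang.String",
    "Person Address Zip Code",
]


def search(names, query, mode="any"):
    """Return names containing any (or all) of the whitespace-separated terms.

    Matching is case-insensitive and on whole words only.
    """
    terms = query.lower().split()
    if not terms:
        return []
    combine = any if mode == "any" else all
    return [
        name for name in names
        if combine(term in name.lower().split() for term in terms)
    ]


print(search(LONG_NAMES, "gene"))                    # both Gene elements
print(search(LONG_NAMES, "gene zip", mode="any"))    # all three names
print(search(LONG_NAMES, "gene name", mode="all"))   # only the Gene Name element
```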

Exercise 2 – Examine a Data Element in the CDE Browser
- Answer the following questions:
  – What is the Long Name of the Data Element?
  – What is the Public ID of the Data Element?
  – What context owns the Data Element?
  – What is the Data Element Concept Long Name?
  – Are there permissible values for this Data Element?

Exercise 2 – Examine a Data Element in the CDE Browser
- Answers:
  – What is the Long Name of the Data Element? Gene Name java.lang.String
  – What is the Public ID of the Data Element?
  – What context owns the Data Element? caCORE
  – What is the Data Element Concept Long Name? Gene Name
  – Are there permissible values for this Data Element? No

CDE Curation Tool (Create/Edit Metadata Using EVS)

CDE Curation Tool (Create/Edit Existing Metadata)

Sentinel Tool (Monitor Changes to Metadata)
[Screenshot callouts: What to watch; When to watch; What to watch for; What to report]

Sentinel Tool Reports (View Changes Made to Metadata)
[Screenshot callouts: Change Blocks; Associated Blocks]
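Conceptually, an alert of this kind compares two states of the registry and reports what changed in the items being watched. The Python sketch below illustrates that comparison only; it is not how the Sentinel Tool is implemented. The `diff_metadata` function, the `DE-1` identifier, and the "after" values (version 2.1, high value 150) are invented for the example.

```python
def diff_metadata(before, after, watched_fields):
    """List (public_id, field, old, new) for watched fields that changed."""
    changes = []
    for public_id in sorted(before.keys() & after.keys()):   # what to watch
        for field_name in sorted(watched_fields):            # what to watch for
            old = before[public_id].get(field_name)
            new = after[public_id].get(field_name)
            if old != new:
                changes.append((public_id, field_name, old, new))
    return changes


# Two hypothetical snapshots of one data element record.
BEFORE = {"DE-1": {"long_name": "Person Self Reported Age",
                   "version": "2.0", "high_value": 999}}
AFTER = {"DE-1": {"long_name": "Person Self Reported Age",
                  "version": "2.1", "high_value": 150}}

report = diff_metadata(BEFORE, AFTER, {"version", "high_value"})
for public_id, field_name, old, new in report:                # what to report
    print(f"{public_id}: {field_name} changed from {old!r} to {new!r}")
```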

Semantic Integration Tools
- UML Model Browser
  – Browse administered items that are part of registered UML Models
  – Supports browsing, searching, and exporting the classes, attributes and relationships between classes of a UML domain model
- Semantic Integration Workbench
  – Guides users through the workflow process required for annotating a UML domain model
  – Tags UML Models with matching semantic concepts from the NCI Thesaurus

UML Model Browser
- Web-based
- Designed for UML model owners
- Search for and view UML model components in caDSR
  – classes
  – class attributes
  – associations between classes and attributes
  – ISO Components (metadata) related to those classes and attributes

UML Model Browser Interface
- UML Model Search Tree: Search for model components.
- Basic Class/Attribute Search Pane: Users looking for classes and attributes can enter search criteria here.
- Navigation Menu: Access other caDSR tools and resources.

UML Model Browser: UML Model Search Tree
- Displays current caDSR Contexts
- For each Context,
  – lists all the UML classes
  – grouped by project, subproject and package
- Search for classes by navigating the tree and clicking on a context, project, subproject or package
- Search for attributes by clicking on a class
[Screenshot callouts: project; subproject; package; class]
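The search tree is a plain hierarchy: context, project, subproject, package, class, and then the class attributes. A nested dictionary is enough to sketch it in Python. The path and the Gene class below come from Exercise 3 in this deck, reading the first attribute name as clusterId and leaving out the "Projects" folder node; the `find_classes` helper is hypothetical.

```python
MODEL_TREE = {
    "caCORE": {                                                # context
        "caCORE": {                                            # project
            "Cancer Bioinformatic Infrastructure Objects": {   # subproject
                "gov.nih.nci.cabio.domain": {                  # package
                    "Gene": ["clusterId", "fullName"],         # class -> attributes
                },
            },
        },
    },
}


def find_classes(node, path=()):
    """Yield (path, attributes) for every class found below the given node."""
    for name, child in node.items():
        if isinstance(child, dict):
            yield from find_classes(child, path + (name,))
        else:
            yield path + (name,), child


# Clicking a context corresponds to walking everything below it.
for path, attributes in find_classes(MODEL_TREE["caCORE"], ("caCORE",)):
    print(" > ".join(path), "->", attributes)
```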

UML Model Browser: UML Class - Model Tree Search Results
[Screenshot callouts: # Matches; ‘crumb trail’; Class Search Results; Package]

Exercise 3 – View Classes & Attributes in the UML Model Browser
- Complete: Exercise 3 from the “Semantic Interoperability” exercise handout.
- Time: 5 minutes

Exercise 3 – View Classes & Attributes in the UML Model Browser
1. Navigate to the UML Model Browser
2. Use the tree to navigate to the caCORE Project: caCORE > Projects > caCORE > Cancer Bioinformatic Infrastructure Objects > gov.nih.nci.cabio.domain
3. Scroll down the list of classes and select the “Gene” class
4. Answer the following:
   1. What are the two attributes in the Gene class?
   2. What project does the Gene class belong to?
   3. What context is this project in?
   4. What is the Public ID of the “Gene Name” data element?

Exercise 3 – View Classes & Attributes in the UML Model Browser
- Answers:
  – What are the two attributes in the Gene class? Gene clusterId, Gene fullName
  – What project does the Gene class belong to? caCORE
  – What context is this project in? caCORE
  – What is the Public ID of the “Gene Name” data element?

Semantic Integration Workbench
- Audience: caCORE SDK UML Model developers/users performing semantic annotation
- Performs the tasks associated with semantic annotation and review for loading of UML Models into caDSR
- Benefits:
  – Users select NCI Thesaurus concepts or existing metadata for UML model annotation
- Recommended Prerequisites
  – EVS terms
  – Enterprise Architect
  – UML Class Diagram as your domain model
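Semantic annotation, in rough terms, means breaking each UML class and attribute name into words and pairing every word with a vocabulary concept. The Python sketch below illustrates only that matching step, under heavy assumptions: the `CONCEPTS` table and its codes are placeholders rather than NCI Thesaurus content, and `split_identifier` and `annotate` are hypothetical helpers, not the workbench's actual logic.

```python
import re

# Placeholder word-to-concept table; the codes are invented for illustration.
CONCEPTS = {"gene": "X0001", "name": "X0002", "cluster": "X0003"}


def split_identifier(identifier):
    """Split a camelCase or PascalCase identifier into lowercase words."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", identifier)
    return [part.lower() for part in parts]


def annotate(class_name, attributes):
    """Map 'Class.attribute' to (word, concept code or None) pairs."""
    result = {}
    for attribute in attributes:
        words = split_identifier(class_name) + split_identifier(attribute)
        result[f"{class_name}.{attribute}"] = [(w, CONCEPTS.get(w)) for w in words]
    return result


for element, tags in annotate("Gene", ["fullName", "clusterId"]).items():
    print(element, tags)
```

Words that come back paired with `None` (here "full" and "id") are the cases a human reviewer would have to resolve by choosing or requesting a concept.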

SIW in the caCORE SDK Workflow
1. Design system and draw model (UML tool)
2. Perform Semantic Integration (SIW - Semantic Integration Workbench)
3. Register metadata (UML Loader)
4. Generate and deploy system (Code Generator)

Using the Semantic Integration Workbench
[Screenshot callouts: SIW Viewer Window; UML Entities; Mapped Concept]

NCICB Application Support
- Live Support: Monday – Friday, 8 am – 8 pm Eastern Time
  – Telephone support is available Monday to Friday, 8 am – 8 pm Eastern Time, excluding government holidays.
  – You may leave a message, send an email, or submit a support request via the Web at any time.
- Phone:
- Toll-free:
- Web:

Questions