EVS Data Curation
The processing and publication of data for web browsing and programmatic access

Data Curation Flowchart

Gene Ontology and Zebrafish
- Downloaded as OBO from web sites
- Processed with C++ program into Ontylog XML (OBO2TDE.exe)
- Processed with C++ program into OWL (ontyxToOWL.exe)
- Loaded using LoadNCIThesOWL.sh
- Metadata loaded using LoadMetadata
- Hierarchy and Sources manually edited

HL7 and VA_NDFRT
- Retrieved from sources
- Processed by Apelon into Ontylog XML
- Loaded into LexBIG using LoadNCIThesOwl and manifest
- Metadata loaded using LoadMetadata

MGED
- OWL file downloaded from source web site
- Loaded into Protégé
- Classified
- Inferred version exported as OWL file
- Loaded into LexBIG using LoadNCIThesOwl
- Metadata loaded using LoadMetadata
- Hierarchy and Sources manually edited

SNOMED, MedDRA and LOINC
- Extracted from the UMLS into RRF files
- Loaded into LexBIG using LoadUMLSFiles
- Metadata loaded using LoadMetadata

UMLS Semnet
- Downloaded from UMLS Semnet web site
- Loaded using LoadUMLSSemnet
- Metadata loaded using LoadMetadata

Metathesaurus
- Loaded from UMLS into MEME
- NCI Thesaurus imported monthly
- Other vocabularies added or removed
- NCI-specific edits made to data and relations
- Exported as RRF
- Imported to LexBIG using LoadNCIMeta
- Metadata loaded using LoadMetadata

Preparing TDE Thesaurus for MEME
- Thesaurus Ontylog XML baseline is processed through C++ app publishMEME.exe
- Current baseline compared to previous to get summary of new properties or roles
- Summary used to create import configuration file
- Baseline imported into MEME
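The baseline comparison step can be pictured as a set difference over property and role names. The following is a minimal sketch only: the slides do not describe how publishMEME.exe or the comparison is implemented, and the function name, the dictionary layout, and the example names are all hypothetical.

```python
def summarize_new_names(previous, current):
    """Return, per kind, the names in `current` that are absent from `previous`."""
    summary = {}
    for kind in ("properties", "roles"):
        old_names = set(previous.get(kind, ()))
        new_names = set(current.get(kind, ()))
        summary[kind] = sorted(new_names - old_names)
    return summary


# Hypothetical names, for illustration only (not taken from NCI Thesaurus).
previous_baseline = {"properties": ["Example_Property_A"], "roles": ["Example_Role_A"]}
current_baseline = {
    "properties": ["Example_Property_A", "Example_Property_B"],
    "roles": ["Example_Role_A"],
}

summary = summarize_new_names(previous_baseline, current_baseline)
print(summary)
```

A summary of this shape lists exactly the names that an import configuration would need to declare before the new baseline is loaded.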

Preparing Thesaurus for MEME

NCI Thesaurus from TDE
- Edited in TDE and exported to Ontylog XML by name
- Run through publishTDE to remove unpublishable properties
- Run through OntyxToOwl.exe to create OWL file by code
- Loaded into LexBIG using LoadNCIThesOWL
- Metadata loaded using LoadMetadata
- History generated from TDE baseline
- History loaded using LoadNCIHistory

NCI Thesaurus from TDE

NCI Thesaurus from Protégé
- Run OWL through application to get Ontylog XML by name
- Run Ontylog XML through publishTDE to remove unpublishable properties
- Run through OntylogtoOWL to get OWL by code
- Do history using the Ontylog XML

NCI Thesaurus History Processing
- evs_history records concept modifications made in editor
- These records are extracted monthly to consolidate and to remove identifying information
- Cleaned records are loaded into concept_history
- Full concept_history loaded into LexBIG for NCI Thesaurus
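The cleaning step might look like the sketch below. It makes several assumptions: that raw records are pipe-delimited in the order code, name, action, timestamp, modeler, host (the layout suggested by the log.out sample in this transcript), that "consolidate" means collapsing repeated edits to one concept on one day, and that the modeler and host are the identifying fields. The function name and the sample values are hypothetical.

```python
def clean_history(lines):
    """Drop modeler and host fields and collapse duplicate (code, action, date) edits."""
    seen = set()
    cleaned = []
    for line in lines:
        fields = line.strip().split("|")
        if len(fields) < 7:
            continue  # skip records that do not match the assumed layout
        code, action, timestamp = fields[1], fields[3].lower(), fields[4]
        edit_date = timestamp.split()[0] if timestamp.strip() else ""
        record = (code, action, edit_date)
        if record not in seen:  # assumed meaning of "consolidate"
            seen.add(record)
            cleaned.append(record)
    return cleaned


# Hypothetical raw records: timestamps, modelers and hosts are placeholders.
raw_records = [
    "|C3824|Lesion|Modify|2008-03-28 10:03:49.0|modeler_a|host-a.example.org|(null)|0",
    "|C3824|Lesion|Modify|2008-03-28 11:15:02.0|modeler_b|host-b.example.org|(null)|0",
]

cleaned_records = clean_history(raw_records)
print(cleaned_records)
```

Only the concept code, the action, and the date survive, which matches the columns shown for concept_history elsewhere in the slides.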

History

TDE to DTS

log.out
- New concepts created through Create or Split actions: C72675|Feet_First
- Concepts merged into other concepts: C17841|Oncologic_Surgeon
- Retired concepts (including merged): C17841|Oncologic_Surgeon
- New concepts not found in BSLN2: C73140|Ethaverine_
- Retired concepts not found in BSLN2: C73401|Maqui_Berry_Flavor
- Modify records corresponding to Retired_Kind are discarded: |C62920|Medical_Device_Unsafe_to_Use|Modify| …
- Modify records corresponding to new codes are discarded: |C72831|Pramiracetam_Hydrochloride|Modify| …
- Modify records corresponding to merged codes are discarded: |C3824|Lesion|Modify| :03:49.0|remennik|6116otsaremennl.nci.nih.gov|(null)|0
- Records corresponding to codes not found in BSLN2 are discarded: |C73140|Ethaverine_|New| :03:01.0|shaiu|MSDCorp-Mesh001.inside.msdinc.com|(null)|0
- WARNING: New codes created, then retired, but still found in BSLN2 (to be edited manually): C72675|Feet_First
- List of all remaining records
- List of all discarded records: |C72831|Pramiracetam_Hydrochloride|Modify| :02:56.0|shaiu|MSDCorp-Mesh001.inside.msdinc.com|(null)|0
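The discard rules named in log.out can be restated as a small filter. This is a sketch under stated assumptions, not the actual tool: it assumes "Retired_Kind" records are identified by a set of retired codes, that BSLN2 is available as a set of codes, and that each record reduces to a (code, action) pair. The function name and the set memberships in the example are illustrative.

```python
def partition_records(records, new_codes, merged_codes, retired_codes, baseline_codes):
    """Split (code, action) records into kept and discarded lists."""
    kept, discarded = [], []
    for code, action in records:
        is_modify = action.lower() == "modify"
        drop = code not in baseline_codes or (
            is_modify
            and (code in retired_codes or code in new_codes or code in merged_codes)
        )
        (discarded if drop else kept).append((code, action))
    return kept, discarded


# Codes reuse those shown in the log; the set memberships are assumed for illustration.
records = [
    ("C62920", "Modify"),  # retired concept
    ("C72831", "Modify"),  # newly created concept
    ("C3824", "Modify"),   # merged concept
    ("C73140", "New"),     # not in the baseline
    ("C72676", "New"),     # new concept present in the baseline
]
kept, discarded = partition_records(
    records,
    new_codes={"C72831", "C72676"},
    merged_codes={"C3824"},
    retired_codes={"C62920"},
    baseline_codes={"C62920", "C72831", "C3824", "C72676"},
)
print(kept)
print(discarded)
```

Returning both lists mirrors the log, which reports the remaining records and the discarded records separately.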

tde_history_report.txt
Spilanthes_oleracea (Code: C72446)
Number of modelers: 3
Modeler: shaiu
Modeler: thomas
Modeler: creech
Modeler: shaiu
Action: modify time: :03:58.0
Modeler: thomas
Action: modify time: :03:05.0
Action: modify time: :03:06.0
Modeler: creech
Action: modify time: :03:
Edited actions for the following concepts are discarded:
Concept codes requiring manual review:

DTS_history

DTS_history_script.sql
insert into concept_history(concept, editaction, editdate, reference) values ('C72675', 'create', '28-MAR-08', null);
insert into concept_history(concept, editaction, editdate, reference) values ('C72676', 'create', '28-MAR-08', null);

DTS_history_out.txt
|C72675|create|28-MAR-08|(null)
|C72676|create|28-MAR-08|(null)
|C62171|modify|28-MAR-08|(null)
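The two files carry the same records in different forms. The sketch below turns lines in the DTS_history_out.txt layout into statements in the DTS_history_script.sql layout. Whether the real process derives one file from the other is not stated in the slides, so the direction of the transformation and the function names are assumptions.

```python
def sql_literal(value):
    """Render a field as a SQL string literal, mapping '(null)' to null."""
    if value == "(null)":
        return "null"
    return "'" + value.replace("'", "''") + "'"


def to_insert(line):
    """Convert one '|concept|action|date|reference' line into an insert statement."""
    _, concept, action, date, reference = line.strip().split("|")
    values = ", ".join(sql_literal(v) for v in (concept, action, date, reference))
    return (
        "insert into concept_history(concept, editaction, editdate, reference) "
        f"values ({values});"
    )


history_lines = [
    "|C72675|create|28-MAR-08|(null)",
    "|C62171|modify|28-MAR-08|(null)",
]
statements = [to_insert(line) for line in history_lines]
for statement in statements:
    print(statement)
```

String formatting is used here only because the output is a script file; when executing against a live database, bound parameters would be the safer choice.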

DTS_history_out.out
Lists complete contents of both baselines.
Number of codes in {baseline A} :
Number of codes in {baseline B} :
Concepts found in {baseline B}: but not in {baseline A}
C72675 C
Concepts found in {baseline A}: but not in {baseline B} (should be empty).
Verify DTS_history_out.txt against baseline data.
New Concepts: 757
(1) C72675 (2) C
Concepts created through Split: 0
Split Concepts: 0
Retired Concepts: 4
(1) C20920 (2) C62920
Concepts retired through Merge: 5
(1) C14142
Merge Concepts: 5
(1) C1363
Modified Concepts: 1364
Invalid actions: 0
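The baseline comparison in this report amounts to two set differences. A minimal sketch follows, assuming each baseline is available as a list of concept codes; the function name and the tiny example baselines are hypothetical and far smaller than the real data.

```python
def compare_baselines(codes_a, codes_b):
    """Report code counts and the codes unique to each baseline."""
    a, b = set(codes_a), set(codes_b)
    return {
        "count_a": len(a),
        "count_b": len(b),
        "only_in_b": sorted(b - a),  # concepts new in baseline B
        "only_in_a": sorted(a - b),  # expected to be empty
    }


# Illustrative baselines built from codes that appear in the report.
baseline_a = ["C1363", "C20920"]
baseline_b = ["C1363", "C20920", "C72675"]

report = compare_baselines(baseline_a, baseline_b)
print(report)
```

A non-empty `only_in_a` would signal a concept that vanished between baselines, which the report flags as something that should not happen.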

Tiered Deployments
NCICB uses 4-tiered deployments
- Dev tier: used internally by EVS team to test software and data
- QA tier: used by QA and other software teams to test against new EVS software or data
- Stage tier: used to test software deployments in a near-production environment
- Production: available to outside users