ETRIKS Harmonization System October 14 2015 Fabien Richard CNRS, EISBM 1.

Slides:



Advertisements
Similar presentations
Global Congress Global Leadership Vision for Project Management.
Advertisements

Enabling Access to Sound Archives through Integration, Enrichment and Retrieval WP1. Project Management.
Cloud platforms Lead to Open and Universal access for people with Disabilities and for All WP Federating repositories of Solutions.
S&I Framework Provider Directories Initiative esMD Work Group October 19, 2011.
Bentley Systems, Incorporated
Slide 1 WGISS CEOS WGISS 22, Annapolis September 2006 Report on GEO Architecture & Data Committee (ADC) Report on GEO Architecture & Data Committee.
Validata Release Coordinator Accelerated application delivery through automated end-to-end release management.
1 CSL Workshop, October 13-14, 2005 ESDI Workshop on Conceptual Schema Language and Tools - Aim, Scope, and Issues to be Addressed Anders Friis-Christensen,
Building Enterprise Applications Using Visual Studio ®.NET Enterprise Architect.
BEDES & SEED Building Energy Data Exchange Specification & Standard Energy Efficiency Data Platform April 14, 2015 Robin Mitchell Lawrence Berkeley National.
S&I Data Provenance Initiative Presentation to the HITSC on Data Provenance September 10, 2014.
Supplement 02CASE Tools1 Supplement 02 - Case Tools And Franchise Colleges By MANSHA NAWAZ.
System Design/Implementation and Support for Build 2 PDS Management Council Face-to-Face Mountain View, CA Nov 30 - Dec 1, 2011 Sean Hardman.
© 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 1 August 15th, 2012 BP & IA Team.
RDA Wheat Data Interoperability Working Group Outcomes RDA Outputs P5 9 th March 2015, San Diego.
CSCI ClearQuest 1 Rational ClearQuest Michel Izygon - Jim Helm.
© 2009 IBM Corporation 1 RTC ClearQuest Importer and Synchronizer Lorelei Ngooi – RTC ClearQuest Synchronizer Lead.
MDC Open Information Model West Virginia University CS486 Presentation Feb 18, 2000 Lijian Liu (OIM:
Visually Execute Your Strategy. The Disconnect  2014 KPI Fire2 Strategic Goals KPIs Projects Poor Results.
US NITRD LSN-MAGIC Coordinating Team – Organization and Goals Richard Carlson NGNS Program Manager, Research Division, Office of Advanced Scientific Computing.
Cancer is heterogeneous disease! -> enabled characterization of new tumor subtypes for improving personalized treatment and ultimately achieving better.
OpenMDR: Alternative Methods for Generating Semantically Annotated Grid Services Rakesh Dhaval Shannon Hastings.
The OMII Perspective on Grid and Web Services At the University of Southampton.
Workflow and SharePoint Presented by Ben Geers. Overview What is workflow? Windows Workflow Foundation How does workflow apply to SharePoint? WSS v3 vs.
Digital Object Architecture
1 Carlos Rueda, Paul Alexander, John Graybeal Marine Metadata Interoperability Project (MMI) Monterey Bay Aquarium Research Institute (MBARI) The MMI Registry.
OpenNPC EXPAND WP5 Technical Meetings
Using the Open Metadata Registry (openMDR) to create Data Sharing Interfaces October 14 th, 2010 David Ervin & Rakesh Dhaval, Center for IT Innovations.
Natacha Fernandez-Ureña, Database Manager Jose Hernandez, Supervising Case Manager Shannon Skinner, Coordinator for Testing & Linkage to Care Dustin Przybilla,
FP WIKT '081 Marek Skokan, Ján Hreňo Semantic integration of governmental services in the Access-eGov project Faculty of Economics.
Public Health Tiger Team we will start the meeting 3 min after the hour DRAFT Project Charter April 15, 2014.
CIP Quality System for Genebank ISO 17025
Draft Policy Brief on Semantic Interoperability Editors Dipak Kalra and Mark Musen.
Clinical Collaboration Platform Overview ST Electronics (Training & Simulation Systems) 8 September 2009 Research Enablers  Consulting  Open Standards.
The data standards soup … Is the most exciting topic you can dream of.
HIT Standards Committee Overview and Progress Report March 17, 2010.
CASE (Computer-Aided Software Engineering) Tools Software that is used to support software process activities. Provides software process support by:- –
Cloud platforms Lead to Open and Universal access for people with Disabilities and for All WP Federating repositories of Solutions.
Dictionary based interchanges for iSURF -An Interoperability Service Utility for Collaborative Supply Chain Planning across Multiple Domains David Webber.
Workforce Scheduling Release 5.0 for Windows Implementation Overview OWS Development Team.
Metadata By N.Gopinath AP/CSE Metadata and it’s role in the lifecycle. The collection, maintenance, and deployment of metadata Metadata and tool integration.
NeuroLOG ANR-06-TLOG-024 Software technologies for integration of process and data in medical imaging A transitional.
Worldwide Protein Data Bank Common D&A Project Sequence Processing Modular Demo May 6, 2010 Project Deliverable.
System/SDWG Update Management Council Face-to-Face Flagstaff, AZ August 22-23, 2011 Sean Hardman.
Worldwide Protein Data Bank wwPDB Common D&A Project Full Project Team Meeting Rutgers March 16-19, 2010.
DANIELA KOLAROVA INSTITUTE OF INFORMATION TECHNOLOGIES, BAS Multimedia Semantics and the Semantic Web.
S&I FRAMEWORK PROPOSED INITIATIVE SUMMARIES Dr. Douglas Fridsma Office of Interoperability and Standards December 10, 2010.
NURHALIMA 1. Identify the trade-offs when using CASE Describe organizational forces for and against adoption of CASE tools Describe the role of CASE tools.
Oracle Business Intelligence Foundation - Commonly Used Features in Repository.
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
May 2007 CTMS / Imaging Interoperability Scenarios March 2009.
1 Geospatial Standards for Canada Proposed blueprint for Jean Brodeur and Cindy Mitchell.
Wait Time Project Implementation Strategy. Implementation Plan: Goals 1.To educate and provide clarification around the wait time project, wait time definitions,
1 © ATHENA Consortium 2006 Dynamic Requirements Definition System Interoperability Issues Mapping Nicolas Figay, EADS ATHENA M30 Intermediate Audit 4.-5.
Getting Semantic Technologies into the Business- Major Issues and Key Success Factors Martin Romacker, Principal Scientist Data/Information Architecture.
Esri UC 2014 | Technical Workshop | Address Maps and Apps for State and Local Government Allison Muise Nikki Golding Scott Oppmann.
EBI is an Outstation of the European Molecular Biology Laboratory. Semantic Interoperability Framework Sarala M. Wimalaratne (RICORDO project)
ETRIKS Platform for bioinformatics ISGC 17/03/15 Pengfei Liu, CC-IN2P3/CNRS.
IPDA Registry Definitions Project Dan Crichton Pedro Osuna Alain Sarkissian.
Building Enterprise Applications Using Visual Studio®
RDA 9th Plenary Breakout 3, 5 April :00-17:30
eHS AI component roadmap: Step I: prototype with fuzzy matching
INTAROS WP5 Data integration and management
Data Ingestion in ENES and collaboration with RDA
eHS AI component roadmap: Step I: prototype with fuzzy matching
Data challenges in the pharmaceutical industry
Data Warehouse.
The Re3gistry software and the INSPIRE Registry
Repository Platforms for Research Data Interest Group: Requirements, Gaps, Capabilities, and Progress Robert R. Downs1, 1 NASA.
Best Practices in Higher Education Student Data Warehousing Forum
Presentation transcript:

eTRIKS Harmonization System October Fabien Richard CNRS, EISBM 1

Presentation goals Overview of eTRIKS Harmonization System (eHS): its aims, its components, its future functionalities Key success factors Achievements and on going work 2

Why to harmonize data across projects? To increase and/or speed up: Data interoperability Cross-study analyses Advanced/integrated findings Understanding of disease mechanisms, drug targets and mode of action, patient stratification Delivering patient-tailored treatments Curation of patients 3

Mission of eTRIKS Harmonization System (1/2) Enable data harmonization across projects What does ‘data harmonization across projects’ require? Data of all projects are standardized by using: The same set of terminologies, The same user’s annotation information, and The same reporting information (ie MIG/template) 4

Mission of eTRIKS Harmonization System (eHS) (2/2) Enable data harmonization across projects What does ‘data harmonization across projects’ imply? Two kind of repositories: 5 Confidential DATA Public METADATA One instance Public terminologies Annotation Information Reporting Information (Templates) Multiple instances Project data

How to harmonize data across projects? eHS team has defined Data Harmonization Process / workflow: 1.Harmonize study activities, assays  templates, 2.Harmonize user’s variables  Controlled Vocabulary Term (CVT) variables, 3.Harmonize user’s values  CVT values. 6 Why this order? Metadata model: A template has a predefined list of CVT variables, A CVT variable has a predefined list of CVT values that come from: – A public eTRIKS-selected terminology, or – A eTRIKS list (if CVT values do not exist in the selected terminology).

Roles of eHS components in the data harmonization Several eHS components are required for data harmonization. 7 eSDR eTRIKS Study Design Report (eHS App.) Study trial design Study activity reports eTRIKS Study Design Report (eHS App.) Study trial design Study activity reports eHW eTRIKS Harmonization Wizard (eHS App.) Load and transform user’s data into harmonized data (based on user’s annotations) eTRIKS Harmonization Wizard (eHS App.) Load and transform user’s data into harmonized data (based on user’s annotations) ePLC eTRIKS Post-Loading Curation tool (eHS App.) Curator’s validation or correction of the user’s data annotations eTRIKS Post-Loading Curation tool (eHS App.) Curator’s validation or correction of the user’s data annotations eDVE eTRIKS Data Visualization and Export (eHS App.) Visualize data, select data subset, export it eTRIKS Data Visualization and Export (eHS App.) Visualize data, select data subset, export it eTI eTRIKS TranSMART Integration Structure data according to the TM master tree eTRIKS TranSMART Integration Structure data according to the TM master tree Biospeak DB Biospeak DB Biospeak DB Record data obtained or produced by the eHS App. : user’s data, harmonized data, study activities and design Biospeak DB Record data obtained or produced by the eHS App. : user’s data, harmonized data, study activities and design PROJECT DATA PUBLIC METADATA Terminology Server Repository of eTRIKS-selected public terminologies Curation Base Repository of curation policies and user’ annotations to CV terms Template Base Repository of templates eTRIKS Metadata Registry

Services between the eHS components along with the data harmonization workflow 8 eSDR + AI eHW + AI ePLC eDVE eTI Biospeak DB PROJECT DATAPUBLIC METADATA Terminology Server Curation Base Template Base eTRIKS Metadata Registry Study activities 1 Selection/ Annotations of templates 4 Queries of templates 2 User’s data and terms 5 Selection/ Annotations of CV terms 8 Queries of user’s annotations 9 Proposed templates 3 Proposed CV terms 7 Queries of CV terms 6 Validated or corrected user’s annotations 11 Queries of Harmonized data 13 Visualization of harmonized data 14 Selected data subset 15 Validated new CV terms 12 Validated annotations Validated new templates or template annotations 12 Visualization of user’s annotations 10 TM Master Tree schema 16 TM-formatted harmonized data 17 Load into TM 18 User’s inputseHS actions eHS inputs 1Data harmonization workflow steps Legend

Interdependency between the eHS components 9 eSDR eHW + AI ePLC eDVE eTI Biospeak DB PROJECT DATAPUBLIC METADATA Terminology Server Curation Base Template Base eTRIKS Metadata Registry Implications: All eHS components are needed for data harmonization across projects Coordination of the eHS component development Alignment between the data back-ends and inputs/requests of the applications/ front-ends

How to make eHS a long-term success? (1/3) Same approach as tranSMART: Make the eHS to be adopted by the broadest community of users. 10 Who are the users? Data providers (scientists and clinicians). 95 % of the users Data managers and curators. 5 % of the users eHS main target is users who are not experts in data management and curation. Consequence on the eHS application development: User interfaces must be intuitive (no need to read a manual) User interfaces must guide the users from what users know (eg their assays, studies, terms) toward what is needed for data harmonization (eg template, CV terms, terminologies)

How to make eHS a long-term success? (2/3) Other key success factors: Open-source code of the eHS components, A reference eMDR where curation knowledge, terminologies, template are publicly shared, and Standard and easy installation of Biospeak (eg docker). 11

How to make eHS a long-term success? (3/3) The higher is the eHS adoption by users: The higher adoption of good practices of data curation and reporting is, The more curation knowledge is captured in eMDR, The quicker/more automatic data harmonization is, The more time the curators have to resolve difficult cases of data curation, The higher the eHS sustainability is (raise of the developer’s community). 12

Achievements and on going progresses Achievements: Definition of data harmonization process, Definition of the eHS components, Definition of services between the eHS components, Evaluation of the existing tools for the development of eMDR, Implementation of the ‘matching’ algorithms for the Artificial Intelligence engine, Data model for molecular data in the Biospeak DB. 13 On going progress: eHS application eHW, Tests of for Artificial Intelligence engine, Design of tranSMART Master Tree, Deployment of terminology service in eTRIKS (eTRIKS Lab), Deployment of eHS sandbox in eTRIKS (eTRIKS Lab), Development plan for eMDR and the curation features of the eHS applications.

Goals of the eHS workshop 14 Understand the on going work, Understand the dependencies and needs of eHS components under development, Adjust and coordinate the development of the eHS components Identify the gaps (what has not been addressed yet), Define priorities of future work and timelines, Adjust the road map.

The eHS team 15 Adriano Barbosa Da Silva Maria Biryukov Francisco Bonachela Capdevilla Dorina Bratfalean Ibrahim Emam Wei Gu Fabien Richard Philippe Rocca-Serra Martin Romacker Venkata Satagopam