HL7 Clinical Genomics – Domain Information Model The HL7 Clinical Genomics Work Group Prepared by Amnon Shabo (Shvo), PhD HL7 Clinical Genomics WG Co-chair.

Presentation transcript:

HL7 Clinical Genomics – Domain Information Model
The HL7 Clinical Genomics Work Group
Prepared by Amnon Shabo (Shvo), PhD
HL7 Clinical Genomics WG: Co-chair and Modeling Facilitator; Pedigree (family health history) Co-editor; GTR Implementation Guide Primary Editor
HL7 Structured Documents WG: CDA R2 Co-editor; CCD Implementation Guide Co-editor

2 Overview of Activities
Main efforts (chronologically ordered):
- v3: Family History (Pedigree) Topic; Deprecated: Genetic Locus / Loci; Genetic Variations Topic; Gene Expression Topic
- v2: v2 Implementation Guides *
- CDA: A CDA Implementation Guide for Genetic Testing Reports (GTR)
- FHIR: Genomic resources; Genetic family history profile
- Common: Domain Analysis Models; Domain Information Model (DIM) describing the common semantics. The DIM is standards-independent and provides semantic alignment among the various specs.
Status legend: Normative; DSTU; Informative; Under Development
* The IG “Genetic Test Result Reporting to EHR” is modeled after the HL7 Version Implementation Guide: Orders And Observations; Interoperable Laboratory Result Reporting To EHR (US Realm), Release 1

3 CG Domain Information Models (DIMs)
- Represent agreed-upon and common CG semantics
- Are aligned with the CG DAM requirements and models
- Serve as a focal point to align the various CG specifications
- Balloted in 9/2014 at informative level
- Consist of two models:
  - Clinical Genomics Statement (CGS): a specialization of the HL7 Clinical Statement model and an extension of the GTR Clinical Genomics Statement
  - Family Health History (utilizing the CGS)
- Passed, as follows: Aff.: 29, Neg.: 12, Abst.: 69, NV: 24
- All comments regarding CGS have been reconciled (Pedigree comments still need to be reconciled)

4 Goals
- A top-down conceptual model aligns bottom-up designed specs
- Glossary (e.g., what a mutation is and when the term should be used)
- Concrete standards can match pieces of the puzzle, and further detail / constrain them

5 Modeling Interplays – Informing Each Other
[Diagram labels: CG Information Modeling (IM); DAM; DIM(s); FHIR; v2; CDA; Others; Profiles; Templates; IGs]

6 Clinical Genomic Statement - Core Principles
- Explicit representation of a 'genotype-phenotype' association.
- The association should not be pre-coordinated: a phenotype is not represented as part of a genetic variation representation. For example, a genetic biomarker is an explicit association of two distinct observations, with no pre-coordination of the interpretation into a single biomarker code.
- The association is a many-to-many relationship: a genetic variation can be associated with several phenotypes, and a phenotype can be associated with multiple ‘genotypes’.
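The many-to-many principle above can be sketched in a few lines of Python. This is only an illustration, not part of the DIM: the class name `GenPhenIndex`, its methods, and the identifiers such as `variant-1` are all hypothetical.

```python
from collections import defaultdict


class GenPhenIndex:
    """Hypothetical in-memory index of genotype-phenotype associations."""

    def __init__(self):
        # Two lookups keep both directions of the many-to-many relation.
        self._by_genotype = defaultdict(set)
        self._by_phenotype = defaultdict(set)

    def associate(self, genotype_id, phenotype_id):
        # The two observations stay distinct; only their ids are linked.
        self._by_genotype[genotype_id].add(phenotype_id)
        self._by_phenotype[phenotype_id].add(genotype_id)

    def phenotypes_of(self, genotype_id):
        return sorted(self._by_genotype.get(genotype_id, ()))

    def genotypes_of(self, phenotype_id):
        return sorted(self._by_phenotype.get(phenotype_id, ()))


index = GenPhenIndex()
index.associate("variant-1", "drug-responsive")
index.associate("variant-1", "phenotype-x")
index.associate("variant-2", "drug-responsive")
```

Because neither observation embeds the other, a new association can be added without changing how the variation or the phenotype is coded.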

7 Genetic Observations
- Capture various types of data, from known mutations to sequencing-based variations (e.g., somatic mutations in tumor tissues) to structural variations (e.g., large deletions, cytogenetics) to gene expression, and so forth.
- The core class GeneticObservation contains data that is not the result of downstream analysis, i.e., data that cannot be derived analytically from another observation available to the performer.
- A link to the GeneticObservationComponent class (zero to many) enables the post-coordination of any core data item (e.g., an HGVS string) into its unambiguous basic primitives.
- Key raw omics data that is the basis for creating this clinical genomics statement can be encapsulated.
- The encapsulated data can have links to the full-blown omics data (e.g., NGS sequencing data).

8 Phenotypes
- The Phenotype class can be subtyped to distinguish between predictive and observed phenotypes.
- A prediction is represented as a distinct observation with its own attributes, such as time, method, and performer.
- Links to the knowledge sources / methods / services used for the interpretation.

9 CGS DIM - Overview
[Diagram labels: The backbone: Gen-Phen association; The genetic observation(s); The Phenotype(s); Encapsulated raw data; Predictive phenotype; Observed phenotype]

10 Class: GeneticObservation
- Representation of any observation that cannot be analytically derived from other observations made by the performer. This includes genetic loci on the scale of a gene (e.g., genes, biomarkers, point DNA mutations) as well as structural variations (e.g., large DNA deletions or duplications).
- Note that omics data derived from this observation (e.g., RNA, proteins, expression) should be represented using the linked DerivedGeneticData class.

11 Class: Phenotype
Description:
- Representation of any data considered a phenotype of the source genetic observation. For each genetic observation (e.g., a variation) there could be multiple phenotypes; however, each phenotype object should represent a single phenotype. What counts as a 'single' phenotype might be determined by its intended use, e.g., in patient care, in clinical trials, or earlier on in different phases of research.
Attributes:
- id: a globally-unique id of this observation
- type: the type of phenotypic observation, e.g., drug responsiveness
- value: the actual observation (should be drawn from dedicated ontologies, e.g., Human Phenotype Ontology, phenotype-ontology.org/)

12 Association Class: Gen-Phen
This association class adds characteristics that cannot be represented through either side of the core association, e.g.:
- type of phenotype (observed or non-observed, e.g., predictive)
- type of relation, such as causality, pathogenic if the phenotype is a pathology, or non-causative (benign)
- status of this relation, e.g., established, provisional, presumed, preliminary, etc.
- method by which this association was created
- time when the association was created
- performers of this association
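As a rough sketch, the characteristics listed for the Gen-Phen association class could be carried by a small Python dataclass. The field names, the enum members, and the sample values (`obs-1`, `lab-123`, `hypothetical-rule-engine`) are assumptions made for illustration; the DIM itself does not prescribe them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List


class RelationType(Enum):
    CAUSATIVE = "causative"
    PATHOGENIC = "pathogenic"
    BENIGN = "benign"  # non-causative


class RelationStatus(Enum):
    ESTABLISHED = "established"
    PROVISIONAL = "provisional"
    PRESUMED = "presumed"
    PRELIMINARY = "preliminary"


@dataclass
class GenPhenAssociation:
    """Hypothetical carrier for data that belongs to the link itself."""
    genetic_observation_id: str
    phenotype_id: str
    phenotype_observed: bool      # False means non-observed, e.g. predictive
    relation_type: RelationType
    status: RelationStatus
    method: str                   # how the association was created
    created: datetime             # when the association was created
    performers: List[str] = field(default_factory=list)


assoc = GenPhenAssociation(
    genetic_observation_id="obs-1",
    phenotype_id="phen-1",
    phenotype_observed=False,
    relation_type=RelationType.CAUSATIVE,
    status=RelationStatus.PROVISIONAL,
    method="hypothetical-rule-engine",
    created=datetime(2014, 9, 1, tzinfo=timezone.utc),
    performers=["lab-123"],
)
```

Keeping these attributes on the association rather than on either observation lets the same variation carry, say, an established link to one phenotype and a preliminary link to another.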

13 Post-coordination of Gen. and Phen. Obs.
[Diagram labels: Genetic observation components; Phenotype components]

CGS DIM Core Association: ‘Gen-Phen’ 14

15 Class: GeneticObservationComponent
Description:
- Representation of components of the core genetic observation in a post-coordinated manner, i.e., breaking it down into its primitives in order to disambiguate data held by the GeneticObservation class, e.g., an HGVS string.
Attributes:
- type: the type of genetic observation component, e.g., variation location, variation length, etc.
- value: the actual component observation, e.g., if type is variation base length then value is the actual length
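To make post-coordination concrete, the sketch below breaks a simple HGVS-style string into type/value components. It is deliberately narrow: the regular expression only accepts single-base coding-DNA substitutions, the component type names are illustrative, and the input `NM_000000.0:c.100A>G` is a made-up string rather than a real variant.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GeneticObservationComponent:
    type: str
    value: str


# Assumption: only expressions shaped like "REF:c.123A>G" are handled.
_SUBSTITUTION = re.compile(
    r"^(?P<ref>[^:]+):c\.(?P<pos>\d+)(?P<old>[ACGT])>(?P<new>[ACGT])$"
)


def post_coordinate(hgvs: str) -> List[GeneticObservationComponent]:
    """Split a simple substitution into its basic primitives."""
    match = _SUBSTITUTION.match(hgvs)
    if match is None:
        raise ValueError(f"unsupported HGVS expression: {hgvs!r}")
    return [
        GeneticObservationComponent("reference sequence", match["ref"]),
        GeneticObservationComponent("variation location", match["pos"]),
        GeneticObservationComponent("variation length", "1"),
        GeneticObservationComponent("reference allele", match["old"]),
        GeneticObservationComponent("variant allele", match["new"]),
    ]


components = post_coordinate("NM_000000.0:c.100A>G")
```

A real implementation would need a full HGVS parser; the point here is only that each primitive becomes its own unambiguous type/value pair.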

16 Derived Genetic Data
[Diagram label: DerivedGeneticData: the results of downstream analysis of the source genetic observation]

17 Class: DerivedGeneticData
Description:
- Representation of any downstream data derived from the core genetic observation (including omics data), e.g., RNA, proteins, expression, etc.
Attributes:
- id: a globally-unique id of this observation
- type: the type of observation, e.g., characteristics of the resulting protein
- value: the actual observation
- performer: e.g., unique id of the genetic lab that generated this data
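The split between a core observation and its derived data might look like the following in Python. This is a sketch under assumptions: UUIDs stand in for the "globally-unique id", the `add_derived` helper is hypothetical, and the sample values are placeholders.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


def _new_id() -> str:
    # Assumption: a random UUID is an acceptable globally-unique id.
    return str(uuid.uuid4())


@dataclass
class DerivedGeneticData:
    type: str
    value: str
    performer: str
    id: str = field(default_factory=_new_id)


@dataclass
class GeneticObservation:
    value: str
    id: str = field(default_factory=_new_id)
    # Downstream results hang off the source observation, not inside it.
    derived: List[DerivedGeneticData] = field(default_factory=list)

    def add_derived(self, type: str, value: str, performer: str) -> DerivedGeneticData:
        item = DerivedGeneticData(type=type, value=value, performer=performer)
        self.derived.append(item)
        return item


observation = GeneticObservation(value="NM_000000.0:c.100A>G")
protein = observation.add_derived(
    "protein change", "illustrative amino-acid substitution", "lab-123"
)
```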

18 Molecular Specimen and Indication
[Diagram labels: Indication of performing the genetic observation; Specimen used for the genetic observation]

19 Encapsulated Key Raw Data Sets
[Diagram label: Key raw data sets are encapsulated in their native format, with references to the full-blown raw data]

20 Class: KeyRawOmicsData
Description:
- Representation of key data extracted from mass raw genomic data. For example, an exon's sequence that is considered key to this clinical genomics statement is extracted from some larger sequencing (up to whole exome sequencing) and encapsulated within this statement structure. Another example: certain lines of a VCF file (preferably its 'clinical-grade' version). The encapsulated data should contain references to the full-blown raw omics data.
Attributes:
- encapsulatedData: a placeholder for the key raw omics data, e.g., DNA sequences, variant calls, expression data, etc. Could be any type of data encoding: binary, text, XML, etc.
- reference: a pointer to the repository containing the full-blown data sets from which these key data items were extracted
- type: designates the notation used in the encapsulatedData attribute, e.g., VCF, FASTA, etc.
- schemaReference: a pointer to the schema governing the representation of the data in the encapsulatedData attribute
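The "certain lines of a VCF file" example can be sketched as follows: a few records in a region of interest are copied verbatim into the statement, alongside a pointer to the full file. The function name, the tiny sample VCF, and the `example.org` URLs are all hypothetical; real extraction would normally go through a proper VCF library.

```python
from dataclasses import dataclass


@dataclass
class KeyRawOmicsData:
    encapsulated_data: str   # key records, kept in their native notation
    reference: str           # pointer to the full-blown data set
    type: str                # notation of encapsulated_data, e.g. "VCF"
    schema_reference: str    # pointer to the schema governing that notation


def extract_vcf_records(vcf_text: str, chrom: str, start: int, end: int) -> str:
    """Keep only data lines whose CHROM and POS fall inside the region."""
    kept = []
    for line in vcf_text.splitlines():
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            kept.append(line)
    return "\n".join(kept)


# Made-up sample content, for illustration only.
SAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "7\t100\t.\tT\tG\t50\tPASS\t.",
    "7\t5000\t.\tC\tT\t50\tPASS\t.",
    "12\t150\t.\tG\tA\t50\tPASS\t.",
])

key_data = KeyRawOmicsData(
    encapsulated_data=extract_vcf_records(SAMPLE_VCF, "7", 1, 1000),
    reference="https://example.org/repository/sample-1.vcf",  # hypothetical
    type="VCF",
    schema_reference="https://example.org/schemas/vcf",       # hypothetical
)
```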

21 Predictive versus Observed Phenotype
[Diagram labels: Predictive Phenotype – a distinct observation; Observed Phenotype, that is, observed in the individual and associated with the genetic observation; Additional information pertinent to the core phenotype observation]

22 Class: ObservedPhenotype (ext. Phenotype)
Description:
- This is a sub-class of Phenotype, representing a phenotype observed in the subject, e.g., if the somatic mutation is known to be the reason for positive responsiveness to some drug, then that responsiveness is an observed phenotype.
Attributes:
- clinicalTime: the effective time for the subject of this observation
- ageOfOnset: use this attribute in cases where dates are not available (e.g., in clinical genomics statements pertaining to a relative of a patient in a family health history)
- healthRecordReference: a pointer to an information system (e.g., EHR system) holding the broader contextual data of this observation (this could be implemented following guidance provided by health information exchange systems, such as the specifications set out by the NwHIN in the US)

23 Class: PredictivePhenotype (ext. Phenotype)
Description:
- This is a sub-class of Phenotype, representing a phenotype that is likely to be manifested in the subject, e.g., the somatic mutation is likely to cause resistance to some drug
Attributes:
- method: the method by which this interpretation was made (e.g., an identified algorithm, a rule engine, a clinical decision support application/service, etc.)
- interpretationTime: the time this interpretation was made available for this Clinical Genomics statement instance
- performer: unique id of the entity making this interpretation (e.g., clinical decision support application/service)
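The two Phenotype sub-classes map naturally onto inheritance. The Python sketch below mirrors the attribute lists of the Phenotype, ObservedPhenotype and PredictivePhenotype slides, but the snake_case names, the optional defaults, the use of plain strings for times, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phenotype:
    id: str
    type: str    # e.g. "drug responsiveness"
    value: str   # ideally a term drawn from a dedicated ontology


@dataclass
class ObservedPhenotype(Phenotype):
    clinical_time: Optional[str] = None            # effective time, if known
    age_of_onset: Optional[int] = None             # fallback when dates are unavailable
    health_record_reference: Optional[str] = None  # pointer to e.g. an EHR system


@dataclass
class PredictivePhenotype(Phenotype):
    method: str = ""                               # algorithm, rule engine, CDS service...
    interpretation_time: Optional[str] = None
    performer: Optional[str] = None                # id of the interpreting entity


def is_predictive(phenotype: Phenotype) -> bool:
    return isinstance(phenotype, PredictivePhenotype)


observed = ObservedPhenotype(
    id="phen-1", type="drug responsiveness", value="responsive", age_of_onset=60
)
predicted = PredictivePhenotype(
    id="phen-2",
    type="drug responsiveness",
    value="resistant",
    method="hypothetical-cds-service",
    performer="cds-001",
)
```

Modeling the prediction as its own object gives it its own method, time and performer, separate from anything actually observed in the subject.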

24 Anatomic Pathology Specimen & Findings
[Diagram labels: Anatomic pathology findings with optional reference to the full AP report; Anatomic pathology specimen (prior to DNA/RNA extraction)]

25 Translational Clinical Genomics Statement
© Amnon Shabo (Shvo)
Case: Kobayashi et al. (2005) reported a patient with advanced non-small cell lung cancer in complete remission during treatment with Gefitinib (he was Gefitinib-responsive due to a somatic EGFR mutation). However, after 2 years he relapsed. Re-sequencing DNA of the EGFR gene in his tumor biopsy specimen at relapse revealed the presence of a second mutation. Structural modeling and biochemical studies showed that this second mutation led to the Gefitinib resistance.
[Diagram labels:
- Observation SequenceVariation (EGFR Variant id); Relation: cause; evidence; Observation ClinicalPhenotype (responsive)
- Observation SequenceVariation (EGFR Variant id); Relation: cause; simulation; Observation ClinicalPhenotype (resistant)
- Medication DrugTherapy: Gefitinib intake details (dose, time, etc.)
- Other labels: Relation: [SAS]; Relation: [subject]; Clinical genomics; Translational]
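The case on this slide can be written down as two small statements, one per EGFR variation. The sketch below is only a reading of the slide: the `Statement` tuple and its field names are hypothetical, and the variant descriptions are placeholders because the slide does not give the actual variant ids.

```python
from typing import List, NamedTuple


class Statement(NamedTuple):
    variation: str   # placeholder text; the slide only says "EGFR Variant id"
    relation: str    # type of relation between variation and phenotype
    basis: str       # how the relation was arrived at, per the slide labels
    phenotype: str
    drug: str


statements = [
    Statement("EGFR variant (first mutation)", "cause", "evidence",
              "responsive", "Gefitinib"),
    Statement("EGFR variant (second mutation)", "cause", "simulation",
              "resistant", "Gefitinib"),
]


def phenotypes_for(drug: str, items: List[Statement]) -> List[str]:
    """List the phenotypes associated with a given drug, in statement order."""
    return [s.phenotype for s in items if s.drug == drug]


gefitinib_phenotypes = phenotypes_for("Gefitinib", statements)
```

Each statement keeps the variation, the relation, and the phenotype as separate pieces, which is the explicit, non-pre-coordinated association the earlier slides call for.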

26 The End Thank you for your attention… Questions? Contact Amnon at Comments of general interest should be posted to the CG mailing list at