An extended SNOMED CT concept model for observations in molecular genetics

James R. Campbell MD (a,c); Geoffrey Talmon MD (a,d); Allison Cushman-Vokoun MD PhD (a); Daniel Karlsson PhD (b,c); W. Scott Campbell PhD (a,d)

(a) Departments of Internal Medicine and Pathology, University of Nebraska Medical Center
(b) Department of Biomedical Engineering, Linköping University, Linköping, Sweden
(c) Observables Project Group, IHTSDO
(d) iPALM SIG, IHTSDO
Outline
- Terminology challenges created by Personalized Medicine
- What is an ‘Observable entity’?
  - Where do they fit in the SNOMED CT concept model?
  - …the ONC terminology model?
- LOINC – SNOMED CT harmonization model for observable entities
- Anatomic and molecular pathology project at UNMC
  - …Application of the harmonized model to genomics and molecular pathology?
- Where do we go from here?
Revolution in healthcare
- Sequencing of the human genome has led to a flood of new observational data appearing in the scientific literature and now… in clinical records
- The majority of this clinical data is unstructured and not useful for decision support in the EHR or for clinical research
- Understanding of the genetic and molecular basis of human disease is now having an impact on therapy
- Challenge for precision medicine: to have sufficiently detailed molecular genetics observations and diagnostic data for clinical care and research
Limitations of ONC terminologies for observables in molecular pathology
- Research community
  - NCBI: Gene Ontology; HGNC; OMIM; Orphanet; Protein Ontology; Cell Ontology
- Clinical community
  - SNOMED CT:
    - No concept model for subcellular anatomy (genes) or molecular structure (proteins)
    - No concept model for Observable entity or molecular basis of disease in Clinical findings
    - Little content
  - LOINC 2.54: 1275 PCR; 1406 MOLGEN; 116 FISH; 1502 CELL MARKERS
    - LOINC defining attributes inadequate to fully specify what is being resulted
    - Provides only tag-level interoperability of molecular data
- No meaningful bridge joining genetic research findings with clinical concept models
Observable entity?
- IHTSDO
  - “Observables are considered to be partial observation results, where there is a defined part of the observation missing.”
  - “Concepts in this hierarchy can be thought of as representing a question or procedure which can produce an answer or a result”
  - “Observable + Evaluation (test) result = Finding”
- Regenstrief Institute
  - “…a set of universal names and ID codes for identifying laboratory and clinical test results”
  - “…master file of standard ‘test’ names and codes that will cover most of the entries in these files of operational laboratory systems”
  - “…organization of LOINC is divided first into four categories, ‘lab’, ‘clinical’, ‘attachments’ and ‘surveys’”
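The IHTSDO formula "Observable + Evaluation (test) result = Finding" can be sketched in a few lines of Python. This is only an illustration of the idea, not the SNOMED CT concept model itself: the class names, the `evaluate` helper, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observable:
    """A question that a test or procedure can answer (no result yet)."""
    name: str


@dataclass(frozen=True)
class Finding:
    """An observable paired with the result that answers it."""
    observable: Observable
    result: str

    def describe(self) -> str:
        return f"{self.observable.name} = {self.result}"


def evaluate(observable: Observable, result: str) -> Finding:
    """Observable + evaluation (test) result = Finding."""
    return Finding(observable, result)


# Hypothetical example values, for illustration only.
braf_status = Observable("BRAF V600E variant in excised malignancy")
finding = evaluate(braf_status, "detected")
print(finding.describe())
```

The observable on its own carries no result; only the pairing yields something that could sit on a problem list or in a report.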
LOINC – SNOMED CT Harmonization of Observable entities
- 2008 agreement between Regenstrief (RI) and IHTSDO: https://loinc.org/collaboration/ihtsdo/agreement.pdf
  - Extend and harmonize a shared concept model for 363787002|Observable entity|
  - Map and instantiate LOINC parts in SNOMED CT content
  - Jointly publish expression data set defining (lab) LOINC concepts within the harmonized concept model
  - Technology preview alpha January 2016
- Assertion: What is needed to support clinical decision making and research in molecular genetics is a unified domain ontology for Observable entities spanning clinical and research content, including genomics
MP: Domain ontology Terminology Use Cases
- Retrieve all colon cancer tissue specimens which had BRAF gene testing (IHC or sequencing)
- Retrieve all breast cancer cases that tested ER-, PR- and Her2/Neu-
- Identify all colon cancer cases with high degree microsatellite instability by PCR or absence of mismatch repair protein by IHC
- Identify all cases with Stage IIIa non-small cell lung cancer that had EGFR mutation testing by any method
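The first use case above can be sketched as a query over structured observations. This is a minimal illustration, assuming each observation records a diagnosis, a gene, and a technique, and that techniques sit in an ISA hierarchy so that "sequencing" also matches its subtypes. The hierarchy, the record layout, and the sample data are all hypothetical and are not drawn from SNOMED CT or LOINC content.

```python
from dataclasses import dataclass

# Hypothetical toy ISA hierarchy of techniques (child -> parent).
TECHNIQUE_PARENT = {
    "Pyrosequencing": "Sequencing",
    "Sanger sequencing": "Sequencing",
    "Sequencing": "Technique",
    "Immunohistochemistry": "Technique",
    "PCR": "Technique",
}


def is_a(technique, ancestor):
    """True if technique equals ancestor or descends from it."""
    current = technique
    while current is not None:
        if current == ancestor:
            return True
        current = TECHNIQUE_PARENT.get(current)
    return False


@dataclass(frozen=True)
class Observation:
    specimen_id: str
    diagnosis: str
    gene: str
    technique: str


def braf_tested_colon_specimens(observations):
    """Colon cancer specimens with BRAF testing by IHC or any sequencing method."""
    hits = {
        o.specimen_id
        for o in observations
        if o.diagnosis == "Colon cancer"
        and o.gene == "BRAF"
        and (is_a(o.technique, "Immunohistochemistry")
             or is_a(o.technique, "Sequencing"))
    }
    return sorted(hits)


# Hypothetical sample data, for illustration only.
observations = [
    Observation("S1", "Colon cancer", "BRAF", "Pyrosequencing"),
    Observation("S2", "Colon cancer", "BRAF", "Immunohistochemistry"),
    Observation("S3", "Colon cancer", "KRAS", "Pyrosequencing"),
    Observation("S4", "Breast cancer", "BRAF", "Sanger sequencing"),
    Observation("S5", "Colon cancer", "BRAF", "PCR"),
]
print(braf_tested_colon_specimens(observations))
```

The point of the sketch is that the query names "Sequencing" once and picks up pyrosequencing through the hierarchy, which tag-level codes without defining attributes cannot do.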
UNMC: Project for structured encoding of AP/MP cancer reports
- Objective: Detailed structured reporting of all anatomic and molecular pathology observations for all CAP synoptic cancer worksheets (82 types of malignancies)
- Proposal: Analyze detailed semantics of CAP worksheets; apply harmonized concept model to develop terminology requirements; deploy as real-time structured reporting from COPATH system interfaced to tissue biobank and EPIC
- Tooling: Nebraska Lexicon© extension namespace; SNOWOWL authoring platform; SNOMED CT International + US Extension + Technology preview; ELK 0.4.1 DL classifier
- Penciled into IHTSDO workplan for 2017
Fully Encoded Checksheet
Observables Project: Concept model draft
SNOMED extensions for MP: Genes and proteins
- Body structures>>Cell structures>>Nucleotide sequences and Genes
- Substances>>Proteins
- Qualifiers>>Techniques>>AP and MP methods
- Qualifiers>>Measurement Properties>>AP and MP properties
- Observable entity>>AP and MP observables
- Clinical findings>>Anatomic and molecular genetic observation results and disorders
Developments for MP: Genes and proteins
- Body structures>>Cell structures>>Nucleotide sequences and Genes
  - Define genes and related subcellular structures by mapped reference to scientific ontologies, classified in conjunction with SNOMED CT & LOINC Observables
Developments for MP: Genes and proteins
- Substances>>Proteins
  - Define proteins and related molecular structures by mapped reference to scientific ontologies, classified in conjunction with SNOMED CT & LOINC Observables
MP: Immunohistochemistry
Development for MP: Sequence datatypes
Molecular Pathology Finding: Fully defining observations of somatic sequence variants
Patient Problem List: Somatic mutation patient condition
Congenital (germline) mutations
- “Familial adenomatous polyposis”: cancer predilection syndrome found to originate with multiple genetic etiologies:
  - FAP 1: heterozygous germline mutation of the APC gene
  - Gardner syndrome: clinical variant of FAP 1; includes desmoid tumors and other hamartomas; mutation of APC gene
  - FAP 2: mutation of MUTYH
  - FAP 3: mutation of NTHL1
  - FAP 4: mutation of MSH3
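The relationship between the FAP subtypes and their causative genes can be sketched as a simple lookup. This is only an illustration of why a gene-level definition is useful for retrieval: the dictionary and the `subtypes_for_gene` helper are hypothetical, and the gene symbols follow HGNC (NTHL1 for FAP 3).

```python
# Causative gene for each familial adenomatous polyposis subtype (HGNC symbols).
FAP_SUBTYPE_GENE = {
    "FAP 1": "APC",
    "Gardner syndrome": "APC",  # clinical variant of FAP 1
    "FAP 2": "MUTYH",
    "FAP 3": "NTHL1",
    "FAP 4": "MSH3",
}


def subtypes_for_gene(gene):
    """Return all subtypes whose definition names the given gene, sorted by name."""
    return sorted(name for name, g in FAP_SUBTYPE_GENE.items() if g == gene)


print(subtypes_for_gene("APC"))
```

Defining each disorder by its gene lets one query ("germline APC mutation") retrieve both FAP 1 and Gardner syndrome without enumerating them by name.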
MP: Germline mutation use case: steps to define “Familial adenomatous polyposis 1”
MP: Germline mutations
Terminology development summary: Colorectal and Breast Cancer

SNOMED CT hierarchy | Anatomic Pathology (Concepts/Primitives) | Molecular Genetic (Concepts/Primitives) | Exemplar molecular extension concepts
Observable entities | 61/1  | 32/3   | “BRAF nucleotide sequence detected in excised malignancy”
Body Structures     | 10/9  | 29/3   | “BRAF gene locus”
Clinical findings   | 6/2   | 7/3    | “BRAF V600E variant identified in excised malignancy”
Procedures          | 2/1   |        |
Techniques          | 4/4   | 7/7    | “Pyrosequencing”
Property types      | 8/8   | 2/2    | “Sequence property”
Scale types         |       | 9/9    | “Variant call format”
Situations          | 1/0   |        |
Substances          | 0/0   | 11/11  | “BRAF human cellular protein”
Attributes          |       | 3/3    |
Qualifiers          |       |        |
TOTALS              | 88/29 | 100/41 |
Deployment of MP data
- HL7 interface to tissue biobank for indexing of tissue for research community
- ETL of data to i2b2 data warehouse
- Publication of laboratory/pathology observables ontology with NLM and RI
- Interface development and extension to structured pathology data for EHR
HLA Histocompatibility
Gene and associated protein
- HLA-A gene locus (cell structure)
  - ISA Nucleotide sequence
  - ISA Chromosome structure
  - Part of Chromosome pair 6
  - Chromosome region 6p22.1
- 49495002|HLA-A antigen (substance)|
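One reading of the slide above, with every listed relationship attached to the HLA-A gene locus, can be sketched as plain data. The record layout, the relationship labels, and the two helper functions are hypothetical; only the concept names and the identifier 49495002 come from the slide. `parse_concept` splits the `id|term|` notation used for SNOMED CT concepts in this deck.

```python
def parse_concept(expression: str) -> tuple[str, str]:
    """Split an 'id|term|' string into (id, term)."""
    identifier, term = expression.rstrip("|").split("|", 1)
    return identifier.strip(), term.strip()


# Assumed structure: all relationships hang off the gene locus concept.
HLA_A_GENE_LOCUS = {
    "name": "HLA-A gene locus (cell structure)",
    "relationships": [
        ("ISA", "Nucleotide sequence"),
        ("ISA", "Chromosome structure"),
        ("Part of", "Chromosome pair 6"),
        ("Chromosome region", "6p22.1"),
    ],
    "associated_protein": "49495002|HLA-A antigen (substance)|",
}


def targets(concept: dict, relation: str) -> list[str]:
    """Return the targets of every relationship of the given type, in order."""
    return [t for r, t in concept["relationships"] if r == relation]


print(targets(HLA_A_GENE_LOCUS, "ISA"))
print(parse_concept(HLA_A_GENE_LOCUS["associated_protein"]))
```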