Published by Frederica Morgan. Modified over 9 years ago.
1
Phenotype Capture in Genetic Variant Databases
Peng Chen
School of Computer and Information Science
chepy049@mymail.unisa.edu.au
Supervisor: Dr Jan Stanek
Research Fields: Health Informatics, Health Computer Science, Health Information System
2
Outline
Motivation
Research Question
Literature
Methodology
Phenotype Data Review Result
The openEHR Archetypes Review Result
Phenotype Capture Experiment Result
Conclusion
3
Motivation
1950s health computer science, EHR (Electronic Health Record)
Slow development
Bio-medical research & EHR systems
Genotype – Phenotype correlation
4
Research Question
Can the existing standard openEHR be used to capture and store phenotype data/clinical data?
Hypothesis one: most of the phenotype data in genetic variant databases is not coded, has little clinical detail, and is not stored in a consistent manner.
Hypothesis two: openEHR is potentially suitable as a standard for storing phenotype data.
5
Literature
Claustres et al. (2002) ‘Time for a Unified System of Mutation Description and Reporting: A Review of Locus-Specific Mutation Databases’
Mitropoulou et al. (2010) ‘Locus-specific database domain and data content analysis: evolution and content maturation toward clinical use’
Spath & Grimson (2011) ‘Applying the archetype approach to the database of a biobank information management system’
Chen et al. (2009) ‘Archetype-based conversion of EHR content models: pilot experience with a regional EHR system’
6
Methodology
Criteria form for phenotype review
1. Storage
   Collect phenotypes
   Internal storage
   Proprietary external storage
   Foreign external storage
2. Terminology
   Formal terminology
   Proprietary terms (mapped to a recognised terminology)
   External terminology used directly
   Recognised terminology
3. Coding standard
   Formal coding standard
   Proprietary codes (mapped to a recognised coding standard)
   External coding standard used directly
   Recognised coding standard
4. Granularity
   Overall granularity level
   Partial fine-grained phenotypes
5. Curation
   Curated
6. Multiple phenotypes
   Single phenotype
   Multiple phenotype
7. Case level
   Variant-level phenotypes
   Case-level phenotypes
8. Database
   Database family
   Platform
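One way to picture the criteria form is as a single record per reviewed database. The Python sketch below is illustrative only: the class name `PhenotypeReview`, the field names, and the enum values are assumptions derived from the headings on the form, not the author's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Storage(Enum):
    """Criterion 1: where the phenotype data is kept."""
    INTERNAL = "internal storage"
    PROPRIETARY_EXTERNAL = "proprietary external storage"
    FOREIGN_EXTERNAL = "foreign external storage"


class Standard(Enum):
    """Options shared by criterion 2 (terminology) and criterion 3 (coding)."""
    PROPRIETARY_MAPPED = "proprietary, mapped to a recognised standard"
    EXTERNAL_DIRECT = "external standard used directly"


@dataclass
class PhenotypeReview:
    """One hypothetical review record; every field name is illustrative."""
    database_family: str                              # 8. Database family
    platform: str                                     # 8. Platform
    collects_phenotypes: bool                         # 1. Collect phenotypes
    storage: Optional[Storage] = None                 # 1. Storage location
    terminology: Optional[Standard] = None            # 2. None: no formal terminology
    recognised_terminology: Optional[str] = None      # 2. Name of the terminology
    coding: Optional[Standard] = None                 # 3. None: no formal coding
    recognised_coding_standard: Optional[str] = None  # 3. Name of the coding standard
    fine_grained: bool = False                        # 4. False: low overall granularity
    partial_fine_grained: bool = False                # 4. Partial fine-grained phenotypes
    curated: bool = False                             # 5. Curation
    multiple_phenotypes: bool = False                 # 6. False: single phenotype
    case_level: bool = False                          # 7. False: variant-level


# A made-up record, not taken from the reviewed data set.
example = PhenotypeReview(
    database_family="LOVD",
    platform="MySQL DB",
    collects_phenotypes=True,
    storage=Storage.INTERNAL,
    curated=True,
    case_level=True,
)
```

Keeping terminology and coding as optional fields makes the "not coded" case the default, which matches the expectation stated in hypothesis one.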
7
Methodology
The openEHR phenotype capture model
8
Methodology
Data integration workflow towards a proposed health care EHR integration architecture
9
Phenotype Data Review Result
Reviewed 1224 databases; 978 collect phenotype data, all in internal storage.
40 (4.1%) have a formal terminology; 30 (3.1%) have formal coding.
959 (98%) store low-granularity phenotype data.
604 (62%) were curated by experts.
534 (54.6%) store single phenotype data; 444 (45.4%) store multiple phenotype data.
757 (77.4%) store phenotypes on a case basis; 221 (22.6%) on a variant basis.
63% of databases are LOVD.

Database family    Number    Platform
LOVD               614       MySQL
UMD                13        4D SQL DB

Platform               Number
MySQL DB               617
Web page table form    209
Web page free text     132
4D SQL DB              13
PDF table form         4
Excel table form       2
Web page bar chart     1
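The percentages on this slide are shares of the 978 databases that collect phenotype data. A minimal Python sketch for recomputing them from the reported counts follows; the dictionary keys and the helper name `share` are illustrative choices, while the counts are copied from the slide.

```python
# Counts as reported on the slide; 978 databases collect phenotype data.
TOTAL = 978

counts = {
    "formal terminology": 40,
    "formal coding": 30,
    "low granularity": 959,
    "curated by experts": 604,
    "single phenotype": 534,
    "multiple phenotypes": 444,
    "case level": 757,
    "variant level": 221,
    "LOVD family": 614,
}

platforms = {
    "MySQL DB": 617,
    "Web page table form": 209,
    "Web page free text": 132,
    "4D SQL DB": 13,
    "PDF table form": 4,
    "Excel table form": 2,
    "Web page bar chart": 1,
}


def share(count: int, total: int = TOTAL) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}

# Complementary categories should partition the 978 databases.
assert counts["single phenotype"] + counts["multiple phenotypes"] == TOTAL
assert counts["case level"] + counts["variant level"] == TOTAL
# The per-platform counts should add up to the same total.
assert sum(platforms.values()) == TOTAL

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The three assertions are simple consistency checks: each pair of complementary categories, and the platform breakdown, accounts for every database that collects phenotype data.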
10
Phenotype Data Review Result
Phenotype samples:
Sample 1: ‘MRX’, ‘ARRP’, ‘AMD’, ‘arCRD’, ‘CIPA or HSN IV (H406Y + G613V are polymorphisms)’, ‘Type I, type II, non syndromic recessive’
Sample 2: ‘Failure to thrive; Pneumocystis carinii pneumonia; Diarrhea; Marked lymphopenia’
Sample 3:
11
The openEHR Archetypes Review Result
Reviewed 283 existing openEHR archetypes
Multilingual translation mechanism
Term binding mechanism

Criteria                         Result
Number of terms                  7361
Number of term bindings          94
Coding system                    SNOMED-CT, LOINC
Has term binding                 7 (2.5% of archetypes)
Has multilingual translations    83 (29.3% of archetypes)
Languages                        English, German, Arabic, Portuguese, Japanese, Russian, Dutch, Chinese, Spanish, Farsi
Compile failure                  14
12
The openEHR Archetypes Review Result
Multilingual translation mechanism - example (ADL display)
ontology
    terminologies_available = <...>
    term_definitions = <
        …
        ["zh-cn"] = <
            items = <...
                ["at0004"] = <
                    text = <...>
                    description = <...>
        …
        ["de"] = <
            items = <...
                ["at0004"] = <
                    text = <...>
                    description = <"Der höchste arterielle Blutdruck eines Zyklus - gemessen in der systolischen oder Kontraktionsphase des Herzens.">
        …
        ["en"] = <
            items = <...
                ["at0004"] = <
                    text = <...>
                    description = <...>
    >
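The language keys in an ontology section like the one above can be listed with a small text scan. The Python sketch below assumes ADL 1.4 style text in which each language appears as a quoted key under `term_definitions`; the sample string, its placeholder term texts, and the function name `translation_languages` are all hypothetical, and a real review would more likely rely on a proper ADL parser.

```python
import re

# Hypothetical, heavily shortened ADL 1.4 ontology section. The term texts are
# placeholders, not the content of any published archetype.
SAMPLE_ADL = '''
ontology
    term_definitions = <
        ["en"] = <
            items = <
                ["at0004"] = <
                    text = <"placeholder term">
                    description = <"placeholder description">
                >
            >
        >
        ["de"] = <
            items = <
                ["at0004"] = <
                    text = <"Platzhalter">
                    description = <"Platzhalterbeschreibung">
                >
            >
        >
    >
'''

# Any quoted key that opens a block, e.g. ["en"] = <
KEY = re.compile(r'\["([^"]+)"\]\s*=\s*<')
# Node and constraint codes such as at0004 or ac0001, which are not languages.
CODE = re.compile(r"a[ct]\d+(?:\.\d+)*")


def translation_languages(adl_text: str) -> list[str]:
    """Return the language keys found after `term_definitions`, in order."""
    start = adl_text.find("term_definitions")
    if start == -1:
        return []
    # Stop at term_bindings, if present, so terminology names are not counted.
    end = adl_text.find("term_bindings", start)
    section = adl_text[start:] if end == -1 else adl_text[start:end]
    languages = []
    for key in KEY.findall(section):
        if not CODE.fullmatch(key) and key not in languages:
            languages.append(key)
    return languages


print(translation_languages(SAMPLE_ADL))
```

An archetype would count as having multilingual translations when this list contains more than one language.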
13
Multilingual translation mechanism - compare
14
The openEHR Archetypes Review Result
Term binding mechanism (ADL display)
term_bindings = <
    ["SNOMED-CT"] = <
        items = <
            ["at0000"] = <...>
            ["at0004"] = <...>
            ["at0005"] = <...>
            ["at0013"] = <...>
        >
    >
>
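Term bindings of this shape can be pulled out of archetype text with a regular expression. The following Python sketch assumes the ADL 1.4 binding form `["at0000"] = <[TERMINOLOGY::code]>`; the sample string and its codes are made-up placeholders, and the function name `term_bindings` is illustrative.

```python
import re
from collections import Counter

# Hypothetical term_bindings section; the codes are placeholders, not real
# SNOMED-CT concept identifiers.
SAMPLE_ADL = '''
    term_bindings = <
        ["SNOMED-CT"] = <
            items = <
                ["at0000"] = <[SNOMED-CT::0000001]>
                ["at0004"] = <[SNOMED-CT::0000002]>
                ["at0005"] = <[SNOMED-CT::0000003]>
                ["at0013"] = <[SNOMED-CT::0000004]>
            >
        >
    >
'''

# Captures the node code, the terminology identifier, and the bound code.
BINDING = re.compile(
    r'\["(at\d+(?:\.\d+)*)"\]\s*=\s*<\[([^\]:]+)::([^\]]+)\]>'
)


def term_bindings(adl_text: str) -> list[tuple[str, str, str]]:
    """Return (node, terminology, code) triples found after `term_bindings`."""
    start = adl_text.find("term_bindings")
    if start == -1:
        return []
    return [
        (m.group(1), m.group(2), m.group(3))
        for m in BINDING.finditer(adl_text, start)
    ]


bindings = term_bindings(SAMPLE_ADL)
per_terminology = Counter(terminology for _, terminology, _ in bindings)
print(len(bindings), dict(per_terminology))
```

Summing the number of triples over a set of archetype files would give a term binding count, and counting the files with a non-empty result would give the number of archetypes that have any binding at all.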
15
Phenotype Capture Experiment Result
The chosen sample:
The mapping of concepts:
16
Phenotype Capture Experiment Result
The openEHR archetypes mapping:
Evaluation: Diagnosis
Observation: Symptom
Action: Treatment

No.    Archetype                                                   Entry items
1      openEHR-EHR-EVALUATION.problem-diagnosis.v1.adl             Diagnosis
2      openEHR-EHR-OBSERVATION.lab_test-full_blood_count.v1.adl    Platelet count
3      openEHR-EHR-ACTION.procedure.v1.adl                         Procedure, Comments
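The mapping above can be written down as a small lookup table. In this Python sketch the archetype file names and entry items are taken from the slide, while the pairing of each concept with one archetype, the dictionary layout, and the helper `entry_class` are assumptions made for illustration.

```python
# Concept -> archetype file and the entry items filled in during the experiment.
ARCHETYPE_MAPPING = {
    "Diagnosis": {
        "archetype": "openEHR-EHR-EVALUATION.problem-diagnosis.v1.adl",
        "entry_items": ["Diagnosis"],
    },
    "Symptom": {
        "archetype": "openEHR-EHR-OBSERVATION.lab_test-full_blood_count.v1.adl",
        "entry_items": ["Platelet count"],
    },
    "Treatment": {
        "archetype": "openEHR-EHR-ACTION.procedure.v1.adl",
        "entry_items": ["Procedure", "Comments"],
    },
}


def entry_class(archetype_file: str) -> str:
    """Extract the entry class (e.g. EVALUATION) from an archetype file name."""
    # File names follow the pattern openEHR-EHR-<CLASS>.<concept>.<version>.adl
    qualified_class = archetype_file.split(".")[0]
    return qualified_class.split("-")[2]


for concept, target in ARCHETYPE_MAPPING.items():
    print(concept, "->", entry_class(target["archetype"]), target["entry_items"])
```

Deriving the entry class from the file name keeps the table in a single place, so the Evaluation, Observation, and Action labels cannot drift out of step with the archetypes they refer to.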
17
Phenotype Capture Experiment Result
Phenotype capture snapshots:
21
A conceptual patient-centric EHR data warehouse schema
22
Conclusion
The research results support both hypotheses and match the expected outcomes.
The openEHR standard is potentially suitable for storing clinical data, and even for integrating health information systems.
The multilingual translation mechanism and the term binding mechanism are two strong pieces of evidence for semantic interoperability between heterogeneous systems.
We need international cooperation on managing the archetypes and completing a full set of archetypes for health concepts.
We need international agreement on choosing terminologies and enhancing them to resolve semantic conflicts.
23
Conclusion
The philosophy and the future
A health care EHR integration architecture
Archetype-ontology
Cognitive IS
Human friendly
Robust, scalable, integrated
Semantic interoperability
Syntactic consistency
Data modelling neutral
Start from learning terms and concepts
IS essentially for communication
Ubiquitous information computing