Deep Phenotyping for Deep Learning (DPDL): Progress Report

1 Deep Phenotyping for Deep Learning (DPDL): Progress Report
Tzung-Chien Hsieh, Institut für Genomische Statistik und Bioinformatik, Universität Bonn, July 2018

2 Deep Gestalt
Deep Gestalt: a facial analysis framework proposed by FDNA that uses computer vision and deep learning to quantify similarities to genetic syndromes, trained on over 26,000 patient photos. Example shown: Schilbach-Rott Syndrome, Syndrome Rank 1.

First of all, I would like to introduce a facial image analysis technique called Deep Gestalt. It is a next-generation phenotyping technique proposed by FDNA that enables facial image analysis of patients with rare Mendelian disorders. It uses deep learning to train a model on more than 26,000 patient photos and to quantify the similarity to genetic disorders. We can therefore obtain similarity scores for genetic disorders by uploading a photo.

3 PEDIA approach
PEDIA: Prioritization of Exome Data with Image Analysis. Pipeline shown on the slide:
Symptoms -> Feature Match -> Symptom Analysis -> F-Score
Symptoms -> Phenomizer -> P-Score
Photo -> Gestalt Match -> Pattern Recognition -> G-Score
Exome -> Variant Filter -> Variant Scoring -> CADD
All scores -> Support vector machine -> PEDIA Score -> Diagnosis

How can we use this technique? In our previous PEDIA study, we integrated phenotype and genotype information for exome prioritization. The PEDIA score is the result of a machine learning approach that integrates multiple layers of information: symptoms, photos, and exome sequencing data. Currently we work with Phenomizer and feature-match scores, which are derived from the clinical description of a patient based on HPO terminology. Similarity scores from image analysis come from Face2Gene. On the molecular level, we currently work with the CADD score, obtained by annotating the VCF file. We then integrate the different scores with a support vector machine. The output is the PEDIA score. Finally, we simply sort by PEDIA score and make our diagnosis based on the resulting rank.
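The score integration described above can be illustrated with a minimal sketch. It assumes a linear decision function of the kind a trained linear SVM would provide; the weights, the bias, the `GeneScores` class, and the placeholder genes are all hypothetical and are not taken from the PEDIA study.

```python
from dataclasses import dataclass


@dataclass
class GeneScores:
    """Per-gene scores for one case (all values here are made-up placeholders)."""
    gene: str
    feature_match: float  # F-Score from symptom analysis
    phenomizer: float     # P-Score from Phenomizer
    gestalt: float        # G-Score from image analysis
    cadd: float           # CADD score of the gene's highest-scoring variant


# Hypothetical weights and bias standing in for a trained linear SVM.
WEIGHTS = {"feature_match": 0.8, "phenomizer": 0.6, "gestalt": 1.2, "cadd": 0.05}
BIAS = -2.0


def pedia_score(scores: GeneScores) -> float:
    """Linear decision value: weighted sum of the four scores plus a bias."""
    return sum(w * getattr(scores, name) for name, w in WEIGHTS.items()) + BIAS


def rank_genes(candidates: list[GeneScores]) -> list[GeneScores]:
    """Sort candidate genes from highest to lowest PEDIA-style score."""
    return sorted(candidates, key=pedia_score, reverse=True)


candidates = [
    GeneScores("GENE_B", feature_match=0.2, phenomizer=0.1, gestalt=0.1, cadd=30.0),
    GeneScores("GENE_A", feature_match=0.9, phenomizer=0.8, gestalt=0.7, cadd=25.0),
    GeneScores("GENE_C", feature_match=0.5, phenomizer=0.5, gestalt=0.9, cadd=5.0),
]
ranked = rank_genes(candidates)
for position, entry in enumerate(ranked, start=1):
    print(position, entry.gene, round(pedia_score(entry), 2))
```

In this toy input, the gene with the highest CADD score does not rank first, because its phenotype scores are low. That is the point of combining the layers rather than sorting by a single score.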

4 DPDL a framework enabling PEDIA and much more
Case-based search capabilities for disorders, genes, features, and mutations; phenotype space exploration.

To perform the PEDIA approach, we need to integrate different layers of information. Therefore, we implemented a database that stores all the similarity scores and genomic data. The data model is flexible, so that we can extend the features or add different types of data in the future. The compiled data is organized in a way that maximizes its searchability. For this reason, we propose storing the phenotype and genotype information in a case-based database. We store the data at the case level because we do not want to lose the clinical feature frequencies by aggregating the data to the disorder level.
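As a small illustration of why case-level storage matters, the sketch below computes per-disorder feature frequencies from individual cases. The case records and disorder names are invented placeholders, and `feature_frequency` is a hypothetical helper, not part of DPDL.

```python
from collections import Counter, defaultdict

# Placeholder case-level records: one entry per patient.
cases = [
    {"case_id": 1, "disorder": "Disorder X", "features": {"Intellectual disability", "Seizures"}},
    {"case_id": 2, "disorder": "Disorder X", "features": {"Intellectual disability"}},
    {"case_id": 3, "disorder": "Disorder X", "features": {"Intellectual disability", "Seizures"}},
    {"case_id": 4, "disorder": "Disorder Y", "features": {"Seizures"}},
]


def feature_frequency(case_records):
    """Return {disorder: {feature: fraction of that disorder's cases showing it}}."""
    n_cases = Counter()
    feature_counts = defaultdict(Counter)
    for case in case_records:
        n_cases[case["disorder"]] += 1
        # Each feature is counted at most once per case because features is a set.
        feature_counts[case["disorder"]].update(case["features"])
    return {
        disorder: {feature: count / n_cases[disorder] for feature, count in counts.items()}
        for disorder, counts in feature_counts.items()
    }


freq = feature_frequency(cases)
```

A disorder-level table that only lists "Disorder X: Intellectual disability, Seizures" could not tell us that seizures occur in two of three cases; the case-level records can.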

5 Database structure
Phenotypic Scores: the similarity scores of disorders, for example Gestalt scores and Phenomizer scores.
Mutation Scores: we annotate the VCF file with CADD and store the CADD score of each mutation.
Annotations: we store external annotation databases such as ClinVar (for variant classification) and dbSNP.
Disorder to Gene: we connect disorders to genes by importing the mapping from OMIM.
Features: Human Phenotype Ontology (HPO) terms, for example Intellectual Disability, Seizures.

Tables and columns shown on the slide:
Features: HPO term
Phenotypic Scores: Disorder, Score
Cases: Case ID, Name
Disorder_to_Gene: Disorder, Gene
Mutation Scores: Mutation, Gene, Score
Annotations: Mutation, Clinvar, dbSNP

Once we have obtained the dataset, we need to store it in the database. Now I will introduce our database design. First, we store the Human Phenotype Ontology annotations of each patient in the Features table.
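The table layout above could be sketched with SQLite as follows. This is only an illustration: the slide does not state the database engine, the exact column names, or how tables are linked, so the `case_id` foreign keys, the `score_type` column, and all inserted rows are assumptions and placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cases (case_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE features (case_id INTEGER REFERENCES cases(case_id), hpo_term TEXT);
CREATE TABLE phenotypic_scores (
    case_id INTEGER REFERENCES cases(case_id),
    disorder TEXT, score_type TEXT, score REAL);   -- score_type is an assumed column
CREATE TABLE disorder_to_gene (disorder TEXT, gene TEXT);
CREATE TABLE mutation_scores (
    case_id INTEGER REFERENCES cases(case_id),
    mutation TEXT, gene TEXT, cadd REAL);
CREATE TABLE annotations (mutation TEXT, clinvar TEXT, dbsnp TEXT);
""")

# Placeholder data for a single case.
conn.execute("INSERT INTO cases VALUES (1, 'case-001')")
conn.executemany("INSERT INTO features VALUES (?, ?)",
                 [(1, "Intellectual disability"), (1, "Seizures")])
conn.executemany("INSERT INTO phenotypic_scores VALUES (?, ?, ?, ?)",
                 [(1, "Disorder X", "gestalt", 0.9), (1, "Disorder Y", "gestalt", 0.3)])
conn.executemany("INSERT INTO disorder_to_gene VALUES (?, ?)",
                 [("Disorder X", "GENE_A"), ("Disorder Y", "GENE_B")])
conn.executemany("INSERT INTO mutation_scores VALUES (?, ?, ?, ?)",
                 [(1, "var1", "GENE_A", 24.0), (1, "var2", "GENE_A", 12.0),
                  (1, "var3", "GENE_B", 30.0)])


def candidate_genes(connection, case_id, score_type="gestalt"):
    """Per gene: the case's phenotypic score and its highest CADD score."""
    query = """
        SELECT d.gene, p.score, MAX(m.cadd)
        FROM phenotypic_scores AS p
        JOIN disorder_to_gene AS d ON d.disorder = p.disorder
        JOIN mutation_scores AS m ON m.gene = d.gene AND m.case_id = p.case_id
        WHERE p.case_id = ? AND p.score_type = ?
        GROUP BY d.gene, p.score
        ORDER BY p.score DESC
    """
    return connection.execute(query, (case_id, score_type)).fetchall()


rows = candidate_genes(conn, 1)
```

Joining through the disorder-to-gene table is what lets a disorder-level phenotype score be placed next to gene-level mutation scores for the same case.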

6 Application (1) – www.dpdl.org
Web-based exome prioritization platform. Slide labels: case submitter, HPO terms, F2G LABS, phenotype information, DPDL, sample, NGS facility, Life&Brain, VCF, PEDIA score, analysis team, report.

How can we use it? First, you send the patient's photo and annotate the HPO terms in Face2Gene (F2G); we retrieve the data and store it in our database. At the same time, you send the blood sample to the NGS facility. Once the sequencing is finished, we also store the VCF in our database. Then we can perform our PEDIA approach. The analysis team analyzes the prioritization results, generates a report, and sends it back to our database. You can then download the report from our website.

7 Application – www.dpdl.org
Here we give a quick demo of our website. The website is one of the projects funded by TRANSLATE-NAMSE. First, you see your patient list and can review each patient's data. You also find the results from PEDIA: the genes in the top 10 ranks, and a Manhattan plot as another visualization. Moreover, you can go into the VCF file to check the mutations in these genes. If you are interested in a specific gene, you can open a mutation, for example this missense variant, and review the information from external databases. We can also follow the ACMG guidelines and create the report in our database. Variants that you classify according to the ACMG guidelines are stored in DPDL and extend the knowledge base.

