Phenopackets: Making phenotype profiles FAIR++ for disease diagnosis and discovery
Melissa Haendel, April 18th, 2016, Force2016
@monarchinit @ontowonka haendel@ohsu.edu
FAIR++:
- Findable
- Accessible outside paywalls and private data sources
- Attributable
- Interoperable and Computable
- Reusable, exchangeable across contexts and disciplines
Prevailing clinical pipelines leverage only a tiny fraction of the data
[Diagram labels: POSSIBLE DISEASES; PATIENT EXOME / GENOME; PUBLIC GENOMIC DATA; Under-utilized data; PUBLIC PHENOTYPE, DISEASE DATA; PATIENT PHENOTYPES; PUBLIC ENVIRONMENT, DISEASE DATA; PATIENT ENVIRONMENT; DIAGNOSIS & TREATMENT]
It takes an interoperable village to diagnose a rare platelet syndrome
http://bit.ly/stim1paper
http://bit.ly/exomiser
Ranked STIM-1 variant maximally pathogenic based on cross-species G2P data, in the absence of traditional data sources.
[Figure labels: Phenotypic profile; Genes; Heterozygous, missense mutation STIM-1; N/A; MGI mouse Stim1Sax/Sax]
Notes: This was the novel case we solved. The UDP patient had a number of signs and symptoms, including various platelet abnormalities. The same heterozygous, missense mutation was seen in 2 patients and was ranked top by Exomiser. It had never been seen in any of the SNP databases and was predicted to be maximally pathogenic. Finally, a mouse curated by MGI, carrying a heterozygous, missense point mutation introduced by chemical mutagenesis, exhibited strikingly similar platelet abnormalities.
What if we all helped?
Image credit: https://pixabay
Biology central dogma
Genes + Environment = Phenotypes
Notes: The classic G+E=P. But a lot sits behind the "=" that can be applied to aid the linking. Standards for encoding and exchanging data must be up to these challenges.
Standard exchange formats exist for genes … but for phenotypes? Environment?
Genes: GFF, VCF, BED
Environment: (no format listed)
Phenotypes: PXF (NEW)
Introducing PhenoPackets
It’s exactly what you think it is: a packet of phenotype data to be used anywhere, written by anyone.
What does a PhenoPacket look like?
phenotype_profile:
  - entity: "person#1"
    phenotype:
      types:
        - id: "HP:0200055"
          label: "Small hands"
      onset:
        description: "during development"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "PMID:1"
            title: "age of onset example"
persons:
  - id: "#1"
    label: "Donald Trump"
    sex: "M"
Canonical JSON format
Image credits: upi.com
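The slide mentions a canonical JSON format alongside the YAML example. As a rough sketch of how the two relate, the same packet can be held as plain Python data and serialized with the standard `json` module. The nesting of the "Congenital onset" term under `onset.types` is an assumption about the flattened slide text, and the helper names `to_json` and `phenotype_ids` are illustrative only, not part of any PhenoPackets library.

```python
import json

# The slide's example packet as plain Python data.
# Assumption: the "Congenital onset" term belongs under onset -> types.
packet = {
    "phenotype_profile": [
        {
            "entity": "person#1",
            "phenotype": {
                "types": [{"id": "HP:0200055", "label": "Small hands"}],
                "onset": {
                    "description": "during development",
                    "types": [{"id": "HP:0003577", "label": "Congenital onset"}],
                },
            },
            "evidence": [
                {
                    "types": [
                        {"id": "ECO:0000033", "label": "Traceable Author Statement"}
                    ],
                    "source": [{"id": "PMID:1", "title": "age of onset example"}],
                }
            ],
        }
    ],
    "persons": [{"id": "#1", "label": "Donald Trump", "sex": "M"}],
}


def to_json(p):
    """Serialize a packet to JSON text with stable key order (hypothetical helper)."""
    return json.dumps(p, indent=2, sort_keys=True)


def phenotype_ids(p):
    """Collect the phenotype term IDs asserted in the profile (hypothetical helper)."""
    return [
        term["id"]
        for association in p["phenotype_profile"]
        for term in association["phenotype"]["types"]
    ]


if __name__ == "__main__":
    print(to_json(packet))
    print(phenotype_ids(packet))
```

Because the packet is only nested lists and dictionaries, the JSON text parses back into an identical structure, which is what makes it easy to exchange between tools.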
Phenopackets for laypersons
Layperson terms: Dry eyes; Developmental delay; Elevated liver function
phenotype_profile:
  - entity: "patient16"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "https://twitter.com/examplepatient/status/123456789"
Where patients can share: patient registries; social media
Notes: ReelDx, Inspire, PatientsLikeMe; post a phenopacket on Facebook.
https://bit.ly/hpo-layperson
Image credits: ngly1.org
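The layperson workflow above amounts to translating everyday wording into ontology terms and recording where the statement came from. A minimal sketch, assuming a lookup table keyed by lower-cased lay terms: only the "Dry eyes" to Alacrima mapping shown on the slide is included, and `LAY_TO_HPO` and `build_profile` are hypothetical names rather than an existing API. Terms without a known mapping are returned separately instead of being guessed.

```python
# Only the mapping shown on the slide is included here; a fuller table would
# come from the HPO layperson resource (https://bit.ly/hpo-layperson).
LAY_TO_HPO = {
    "dry eyes": {"id": "HP:0000522", "label": "Alacrima"},
}


def build_profile(entity, lay_terms, source_url):
    """Return (phenotype_profile, unmapped_terms) for one patient."""
    profile, unmapped = [], []
    for term in lay_terms:
        hpo = LAY_TO_HPO.get(term.strip().lower())
        if hpo is None:
            # No mapping known: leave the term for a curator, do not guess an ID.
            unmapped.append(term)
            continue
        profile.append(
            {
                "entity": entity,
                "phenotype": {"types": [dict(hpo)]},
                "evidence": [
                    {
                        "types": [
                            {
                                "id": "ECO:0000033",
                                "label": "Traceable Author Statement",
                            }
                        ],
                        "source": [{"id": source_url}],
                    }
                ],
            }
        )
    return profile, unmapped


profile, unmapped = build_profile(
    "patient16",
    ["Dry eyes", "Developmental delay", "Elevated liver function"],
    "https://twitter.com/examplepatient/status/123456789",
)
```

Keeping the unmapped terms visible, rather than dropping them, lets a curator or a richer lookup table fill them in later.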
If it is alive, it can be PhenoPackaged
- Patients & Cohorts
- Personalized Medicine
- Rare Disease Diagnosis
- Disease vectors
- Model Organisms
- Drug discovery & Development
- Epidemiological Monitoring
- Mechanistic Discovery
- Biodiversity
- Crops
- Domestic Animals
- Environmental Monitoring
- Genetic Engineering
Image credits: Mosquito image from https://pixabay.com/en/brazil-health-mosquito-news-virus-1300017/ (no attribution required). Some biodiversity images adapted from http://i.vimeocdn.com/video/417366050_1280x720.jpg
Phenopackets for journals (using the new JATS extension!)
- Each article can be associated with a phenopacket.
- Each phenopacket can be shared via DOI (or other ID) in any repository outside a paywall (e.g., Figshare, Zenodo, etc.).
Robinson, P. N., Mungall, C. J., & Haendel, M. (2015). Capturing phenotypes for precision medicine. Molecular Case Studies, 1(1), a000372. doi:10.1101/mcs.a000372
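The article-to-phenopacket association described above can be pictured as a small record that pairs the two identifiers. This is only a sketch: the record layout, the function name `publication_record`, and the packet identifier are all hypothetical, and nothing here reflects the actual tags of the JATS extension. The article DOI is the one cited on the slide, used purely as an example.

```python
import json


def publication_record(article_doi, packet_id, packet):
    """Pair an article with its phenopacket (hypothetical record layout)."""
    return {
        "article": {"doi": article_doi},
        "phenopacket": {"id": packet_id, "content": packet},
    }


record = publication_record(
    article_doi="10.1101/mcs.a000372",      # DOI cited on the slide, as an example
    packet_id="doi:EXAMPLE/phenopacket-1",  # placeholder, not a real identifier
    packet={"phenotype_profile": []},       # empty stand-in for a real packet
)

# Stable key order makes the deposited file easy to diff and to cite.
serialized = json.dumps(record, indent=2, sort_keys=True)
```

A record like this could be deposited in an open repository so the phenotype data stays reachable even when the article itself is behind a paywall.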
Acknowledgements
OHSU: Matt Brush, Kent Shefchek, Julie McMurry, Tom Conlin, Nicole Vasilevsky
Queen Mary College London: Damian Smedley, Jules Jacobson
Lawrence Berkeley: Chris Mungall, Suzanna Lewis, Jeremy Nguyen, Seth Carbon
Charité: Peter Robinson, Sebastian Kohler
RTI: Jim Balhoff
Garvan: Tudor Groza
Alfred Wegener: Pier Buttigieg
Cyverse: Ramona Walls
U of Pittsburgh: Harry Hochheiser
FUNDING: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; Phenotype Research Coordination Network (NSF-DEB-0956049)