DONNA MAGLOTT, PH.D. PRO AND MEDICAL GENETICS RESOURCES AT NCBI.


1 DONNA MAGLOTT, PH.D. PRO AND MEDICAL GENETICS RESOURCES AT NCBI

2 OPPORTUNITIES The medical genetics group is a relatively recent addition to the suite of resources at NCBI. It manages the NIH Genetic Testing Registry (GTR), ClinVar, and MedGen. These databases share the need to standardize the representation of genes, proteins, small molecules, variation, conditions, and phenotypes, with respect to both the explicit terms and the relationships among those terms. This presentation focuses on opportunities for using PRO within NCBI’s Medical Genetics group.

3 CASE STUDIES MEDICAL GENETICS: CLINVAR, GENE, GTR, MEDGEN

4 A QUICK TOUR From the home page…

5 USING THE RESOURCE SECTIONS

6 TRY ALL SECTIONS

7

8 MAJOR DOMAINS OF INFORMATION
Concept | NCBI database/Resource | Used in
Diseases and their defining features | MedGen (Diseases, Findings…) | ClinVar, dbVar, Gene, GTR, PheGenI, dbGaP
Drugs | MedGen (Pharmacologic Substance) | ClinVar, GTR
Genes and gene products | Gene, Nucleotide, Protein, HomoloGene, RefSeq | ClinVar, dbSNP, dbVar, GTR …
Biological processes, cellular components, molecular functions | --- | Gene
Interactions and pathways | Biosystems, Gene |
Variation | ClinVar, dbSNP, dbVar | ClinVar, dbSNP, dbVar…
Records connected by reciprocal, generic links via database identifiers
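The identifier-based links on this slide can also be followed programmatically. The following is a minimal sketch, not NCBI's own method: it assumes the E-utilities ELink endpoint and the Entrez database names `gene` and `medgen`, neither of which is named on the slide. The Gene ID 4292 is taken from the URL on slide 13, and `elink_url` is a hypothetical helper. The code only builds the request URL; it does not contact NCBI.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the public E-utilities base URL (not given in the slides).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(dbfrom: str, db: str, uid: str) -> str:
    """Build an ELink URL asking for records in `db` linked to `uid` in `dbfrom`."""
    params = {"dbfrom": dbfrom, "db": db, "id": uid, "retmode": "json"}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# Forward direction: from a Gene record to linked MedGen records.
# The reciprocal direction would swap dbfrom and db and use a MedGen identifier.
forward = elink_url("gene", "medgen", "4292")
print(forward)
```

Swapping `dbfrom` and `db` gives the reverse query, which is what "reciprocal" implies here: each database can be the starting point.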

9 SOME TALKING POINTS
- Except for RefSeq, curation minimal
- RefSeq-based with pointers to UniProtKB
- Use ontologies to acquire and represent standard terms
- Point to ontologies, but not used to support node-based query interfaces
- Capturing primary data that can be used to drive development of ontologies
- Some user communities think in terms of nucleotide only
- Data being submitted with uncertain significance
- Look for opportunities for adding value to NCBI’s databases and tools

10 GENE AND DATA STANDARDS
- Name of the gene (nomenclature committees)
- Names of protein products
  - Primary product (Swiss-Prot)
  - Isoforms (RefSeq)
- Names of associated conditions (multiple)
- Descriptions of pathways (submitters)
- Biological processes, cellular components and molecular functions (GO)
- HIV interactions (NIAID)
  http://www.ncbi.nlm.nih.gov/gene?term=hiv1interactions[Properties]
  http://www.ncbi.nlm.nih.gov/projects/RefSeq/HIVInteractions/
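The Gene query in the first URL above, `hiv1interactions[Properties]`, could also be issued through E-utilities. This is a hedged sketch: the ESearch endpoint and the `gene_esearch_url` helper are assumptions for illustration and do not appear in the slides. Only the query term is taken from the slide. The code builds the URL and does not send the request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the public E-utilities base URL (not given in the slides).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that runs `term` against the Gene database."""
    params = {"db": "gene", "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-encodes the square brackets in the field tag.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Same term as in the web URL on the slide.
url = gene_esearch_url("hiv1interactions[Properties]")
print(url)
```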

11 HUMAN MISMATCH REPAIR

12 RESTRICT TO THOSE REPORTED TO BE DISEASE-CAUSING

13 Summary, Bibliography, Interactions, Pathways, Gene Ontology, General protein information, Reference sequences, Locus-specific databases. Phrase found in: www.ncbi.nlm.nih.gov/gene/4292

14 Titles of pathways; descriptions of interactions

15

16 GENE PROTEIN

17 HOMOLOGENE

18 DISEASES AND PHENOTYPES MEDGEN: UMLS, HPO, OMIM, ORDO, GTR

19 WHY MEDGEN?
- A stable node of identifiers within NCBI for disease names, their clinical features, and pharmacological substances
- Built on the foundation of a subset of UMLS, with supplements from HPO, OMIM (between UMLS releases), and submissions to GTR and ClinVar
- Primarily automated, but some overview by M.D.s and genetic counselors on staff, and feedback from the community

20 TERMS FROM UMLS/OMIM/GTR/CLINVAR

21 HIERARCHIES: CURATED BY GTR STAFF Guided by OMIM’s clinical series and user feedback

22 HIERARCHIES: COMPUTED FROM NODES IN UMLS

23 Hierarchy from DNA Repair Deficiency Disorders

24 USING HPO FOR CLINICAL FEATURES
- Partial display
- Organized by top nodes of the ontology
- Each specific term supports a link to disorders manifesting that feature

25 CLINVAR: REPORTED VARIATION-PHENOTYPE RELATIONSHIP

26 Submitter archive (not curated)
- Variant
- Disease and/or phenotypes
- Interpretation
- Confidence

27 SUBSET OF A DETAILED RECORD
- Gene name and symbol
- Sequence ontology for molecular and functional consequences
- Diseases
- Identifiers and links
- Observed phenotypes (as distinct from those reported to be characteristic of the diagnostic term)
- Protein change from the variant

28 DATA SOURCES AND GROWTH

29 SUBMISSIONS FROM UNIPROT Summarize submissions by genes, diseases, and phenotypes

30 CURRENT STATUS: CLINGEN-RELATED
http://www.clinicalgenome.org/
- Diseases
- Genes
- Variants
- Predictions
- Conserved sequence
- Conserved domains
- Pathways

31 ‘PHENOTYPE’ AND CLINGEN/CLINVAR
- Working group on phenotype
- Make distinctions among
- Disease category (body system, metabolic perturbation, cancer)
- Diagnosis
- Characteristic features
- General or gene-specific
- Diseases targeted by drugs for which the response is genetically determined
- Observed phenotypes
- HPO
- PhenoDB
- Indications for testing
- Standardization
- One ontology or many?
- Relationship to OMIM

32 VARIATION AND CLINGEN/CLINVAR
- Sequence Ontology for variant location and effect
- Coordinate with PharmGKB for pharmacogenomics
- Description of haplotypes
- No discussion yet about authorities for pathways, conserved domains, post-translational modifications

33 CURRENT STATUS: NCBI
- Working with UMLS to improve representation of terms and relationships
  - Mapping concepts
  - Reporting relationships
- Supplement current UMLS with HPO, Orphanet (ORDO, in progress), and recent data from OMIM
- Working with Clinical Pharmacogenetics Implementation Consortium (CPIC) and PharmGKB
  - Representation of haplotypes/star alleles
  - Drug responses/Disease target
- Consumer of ontologies to standardize terminology, with definitions
  - Link to resource site
  - Provide attribution
  - Support term-specific queries

34 CURRENT STATUS: NCBI
- Queries currently term by term, not by node
- Some relationships based on links in Entrez
  - Gene and disease
  - Disease and clinical feature
  - Variation and gene
- Some relationships explicit
  - Genome->transcript->protein
  - Nucleotide change->protein change
- Some relationships reported as hierarchies
  - GTR
  - MedGen (MeSH)
  - ORDO (in progress)
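The difference between a term-by-term query and a node-based query can be illustrated with a toy hierarchy. This is a sketch under stated assumptions, not a description of how any NCBI resource is implemented: the term labels, record identifiers, and function names below are all hypothetical placeholders.

```python
# Hypothetical child -> parent edges of a small term hierarchy.
PARENT_OF = {
    "child-a": "parent",
    "child-b": "parent",
    "grandchild-a1": "child-a",
}

# Hypothetical records annotated directly with each term.
RECORDS = {
    "parent": ["r1"],
    "child-a": ["r2"],
    "grandchild-a1": ["r3"],
    "child-b": ["r4"],
}


def term_query(term, records):
    """Term-by-term: return only records annotated with exactly this term."""
    return list(records.get(term, []))


def node_query(term, parent_of, records):
    """Node-based: return records for the term and all of its descendants."""
    # Invert the child -> parent map so the hierarchy can be walked downward.
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)

    found, stack = [], [term]
    while stack:
        node = stack.pop()
        found.extend(records.get(node, []))
        stack.extend(children.get(node, []))
    return found


print(term_query("child-a", RECORDS))
print(sorted(node_query("child-a", PARENT_OF, RECORDS)))
```

A term-by-term query on `child-a` misses the record attached to its descendant; the node-based version picks it up by walking the hierarchy.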

35 CURRENT STATUS: NCBI
- Maintenance primarily automatic
- Some curatorial review by staff of ClinVar and NIH Genetic Testing Registry (GTR)
- Expect expanded review from the ClinGen group
- Data freely available by ftp or E-utilities
  ftp://ftp.ncbi.nih.gov/pub/clinvar/
  ftp://ftp.ncbi.nih.gov/gene/
  ftp://ftp.ncbi.nih.gov/pub/GTR/
  ftp://ftp.ncbi.nih.gov/pub/medgen/
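The FTP areas listed above could be browsed from a script. The following is a minimal sketch using Python's standard `ftplib`, assuming the server accepts anonymous login and that these paths are still current; the slides confirm neither. The helper names `split_ftp_url` and `list_area` are hypothetical. `list_area` opens a network connection only when called, so it is defined here but not invoked.

```python
from ftplib import FTP
from urllib.parse import urlsplit

# The four FTP locations given on the slide.
FTP_AREAS = {
    "clinvar": "ftp://ftp.ncbi.nih.gov/pub/clinvar/",
    "gene": "ftp://ftp.ncbi.nih.gov/gene/",
    "gtr": "ftp://ftp.ncbi.nih.gov/pub/GTR/",
    "medgen": "ftp://ftp.ncbi.nih.gov/pub/medgen/",
}


def split_ftp_url(url):
    """Split an ftp:// URL into (host, path)."""
    parts = urlsplit(url)
    return parts.hostname, parts.path


def list_area(name, timeout=30.0):
    """List the top-level entries of one FTP area (assumes anonymous login)."""
    host, path = split_ftp_url(FTP_AREAS[name])
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # anonymous login
        return ftp.nlst(path)


print(split_ftp_url(FTP_AREAS["medgen"]))
```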

36 ACKNOWLEDGEMENTS
- Slava Gorelenkov: MedGen
- Melissa Landrum: ClinVar
- Jennifer Lee: GTR, ClinVar
- Terence Murphy: Gene
- Lon Phan: dbSNP/dbVar
- Kim Pruitt: RefSeq
- Wendy Rubinstein: GTR, MedGen
- Ming Ward: dbSNP
and all their staff

