database of Genotype and Phenotype

Presentation transcript:

database of Genotype and Phenotype
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap
Kim Pruitt (for Matt Mailman), NCBI

Overview: Phenotype; Genotype; Genotype X Phenotype Association

Overview: Phenotype. Data tables: columns are phenotypes, rows are individuals. Documents (e.g., protocols, data collection forms): parts of documents are linked to variables. Data dictionary.
Notes: Question for Matt: is the data dictionary a glossary describing the variables? The data dictionary provides context information and is not made public as a standalone object; only parts of it, the variable descriptions, are made public. There is no format requirement: it currently arrives in a variety of formats, and format restrictions are expected in the future. Not all values are phenotypes (e.g., a date), so the data are called 'variables'.
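The table layout described on this slide (one row per individual, one column per variable, with a data dictionary describing each variable) can be sketched as follows. This is only an illustration: the variable names, subject IDs, and values are hypothetical and do not come from any dbGaP study.

```python
# Hypothetical data dictionary: one entry per variable (table column).
data_dictionary = {
    "age_at_exam": {"description": "Age at examination", "units": "years"},
    "exam_date": {"description": "Date of examination", "units": None},
    "amd_status": {"description": "Case or control status", "units": None},
}

# Hypothetical data table: one row per individual, one key per variable.
rows = [
    {"subject_id": "S001", "age_at_exam": 71, "exam_date": "2005-03-14", "amd_status": "case"},
    {"subject_id": "S002", "age_at_exam": 64, "exam_date": "2005-04-02", "amd_status": "control"},
    {"subject_id": "S003", "age_at_exam": 80, "exam_date": "2005-04-19", "amd_status": "case"},
]


def column(table, variable):
    """Return all values of one variable, i.e. one column of the table."""
    return [row[variable] for row in table]


def undocumented_variables(table, dictionary, id_field="subject_id"):
    """List table columns that have no entry in the data dictionary."""
    seen = set()
    for row in table:
        seen.update(row)
    return sorted(seen - set(dictionary) - {id_field})


ages = column(rows, "age_at_exam")
missing = undocumented_variables(rows, data_dictionary)
```

Note that `exam_date` is a variable but not a phenotype, which is the reason the notes give for speaking of 'variables'.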

Overview: Genotype. Genotype files directly from vendor. Intensity files (e.g., .CEL).
Notes: Oligo microarrays measure biallelic SNPs, calling the variant at a position as aa, ab, or bb. Illumina and Affymetrix platforms are the two sources of data at present. CEL files will be distributed, for authorized access only. Determining the intensity is mature technology, but calling the genotype is an area of research, so with the CEL files you can recalculate the genotypes yourself using newer methods.
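As a minimal sketch of what the aa/ab/bb calls look like once they are loaded, the snippet below tallies calls for one SNP and computes a call rate. The sample calls are made up, and representing a failed call as `None` is an assumption of this example, not a vendor file format.

```python
from collections import Counter

VALID_CALLS = ("aa", "ab", "bb")


def genotype_counts(calls):
    """Count aa, ab and bb calls for one SNP, ignoring no-calls."""
    counts = Counter(c for c in calls if c in VALID_CALLS)
    return counts["aa"], counts["ab"], counts["bb"]


def call_rate(calls):
    """Fraction of samples that received a valid genotype call."""
    called = sum(1 for c in calls if c in VALID_CALLS)
    return called / len(calls)


def b_allele_dosage(call):
    """Number of b alleles carried (0, 1 or 2), or None for a no-call."""
    return {"aa": 0, "ab": 1, "bb": 2}.get(call)


# Hypothetical calls for six samples at one SNP; None marks a no-call.
example_calls = ["aa", "ab", "bb", "ab", None, "aa"]
example_counts = genotype_counts(example_calls)
example_rate = call_rate(example_calls)
```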

Overview: Genotype X Phenotype Association. Various statistical models and methods. P-value or LOD score for each marker. Filters by P-value, HWE (Hardy-Weinberg equilibrium), and minor allele frequency. Map phenotypes onto genomic sequence.
Notes: Question for Matt: does the 'phenotype' protocol document also describe the association method and statistical model used?
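The three filters named on this slide can be illustrated with a short sketch. It assumes a one-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium, and the thresholds and marker names are hypothetical; the slides do not say which test or cutoffs dbGaP uses.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele, from genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic marker: nothing to test
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_filters(marker, max_p=1e-4, min_hwe_p=1e-3, min_maf=0.05):
    """Keep markers that are significant, in HWE, and not too rare."""
    counts = marker["counts"]
    return (
        marker["p_value"] <= max_p
        and hwe_p_value(*counts) >= min_hwe_p
        and minor_allele_frequency(*counts) >= min_maf
    )


# Hypothetical markers: association P-value plus (aa, ab, bb) counts.
markers = [
    {"name": "m1", "p_value": 1e-6, "counts": (25, 50, 25)},
    {"name": "m2", "p_value": 0.2, "counts": (25, 50, 25)},   # not significant
    {"name": "m3", "p_value": 1e-6, "counts": (50, 0, 50)},   # far from HWE
    {"name": "m4", "p_value": 1e-6, "counts": (98, 2, 0)},    # rare allele
]
kept = [m["name"] for m in markers if passes_filters(m)]
```

For small samples or rare alleles an exact HWE test is usually preferred over the chi-square approximation used here.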

Overview: Obvious expansion potential: more species; different types of association data (QTL). Critically important to archive all data: submit primary data to the appropriate public archive! Probe DB: primers, resequencing amplicons. dbSTS: STS markers. Maps: UniSTS; Map Viewer. GenBank: ESTs.

dbGaP Web Site: two levels of access, open and controlled.
Open access to non-sensitive data: study summaries and documents; measured variables and data elements; analysis reports; genome browser.
Controlled access provides oversight and accountability for use of sensitive datasets involving personal information: de-identified phenotypes and genotypes for individual subjects; pedigrees.

Browse Studies: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap
Callouts: link back to dbGaP homepage; instructions; description of dbGaP; link to study report; list of variables in study; list of documents in study; automated query to PubMed for genome-wide association study articles.
Notes: Question: what do you expect to have for "type of study"? What is a sub-study definition (longitudinal, etc.)? Currently all studies are whole genome association; this will be expanded to other types in the future and can be expanded to other organisms.

Browse Studies by Disease: expand/collapse; link to terms from MeSH vocabulary; link to study report.

Advanced Search: fields to be searched; add any number of search criteria.
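Combining any number of search criteria can be sketched as building a single query term. The example below targets NCBI's E-utilities `esearch` endpoint with `db=gap` (the database name in the URL on these slides). Whether the Advanced Search page shown here builds its queries this way is an assumption, and the search phrases are hypothetical. The code only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(criteria):
    """Combine any number of search criteria with AND."""
    return " AND ".join(f"({c})" for c in criteria)


def build_search_url(criteria, retmax=20):
    """Return an esearch URL for the 'gap' database (no request is made)."""
    params = {"db": "gap", "term": build_term(criteria), "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


# Hypothetical criteria, for illustration only.
example_url = build_search_url(["macular degeneration", "genome-wide association"])
```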

Study Report: citable unique stable identifier; genotype x phenotype association or linkage analyses; search this study; link to variable report; history; publications; attribution; access rules; links back to submitter website; criteria for inclusion/exclusion.

Variable Report: citable unique stable identifier; documents containing a section that has been linked to this variable; statistical summary of values for this variable; P-value is red if cases differ from controls.

Variable Report (continued): document name; section of document that has been linked to this variable; link to document.

Analysis Report: link back to report for the measured or derived variable that was analyzed; genome browser of analysis results.

Genome Browser of Analysis Results: the slider filters out results less significant than the threshold. 2 Mb bins are colored to represent the most significantly associated marker. Click on a bin of interest to zoom in and see the association in context with other objects mapped to the same genomic region.
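The binning described on this slide can be sketched as follows: markers are grouped into 2 Mb bins, each bin is represented by its most significant marker, and a threshold (the slider) hides the less significant bins. The marker names, positions, and P-values are hypothetical, and the browser's actual implementation is not described in the slides.

```python
BIN_SIZE = 2_000_000  # 2 Mb, as on the slide


def best_marker_per_bin(markers, bin_size=BIN_SIZE):
    """Map (chromosome, bin index) to the (name, p_value) with the lowest P-value."""
    best = {}
    for name, chrom, position, p_value in markers:
        key = (chrom, position // bin_size)
        if key not in best or p_value < best[key][1]:
            best[key] = (name, p_value)
    return best


def apply_threshold(bins, max_p):
    """Mimic the slider: drop bins whose best marker is less significant than max_p."""
    return {key: hit for key, hit in bins.items() if hit[1] <= max_p}


# Hypothetical markers: (name, chromosome, base position, P-value).
example_markers = [
    ("m1", "1", 500_000, 0.01),
    ("m2", "1", 1_900_000, 1e-5),
    ("m3", "1", 2_100_000, 0.3),
    ("m4", "2", 100, 1e-8),
]
bins = best_marker_per_bin(example_markers)
visible = apply_threshold(bins, max_p=1e-4)
```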

Genome Browser – Higher Resolution: collapse table; P-value of genotyped marker; scroll via boxes above; add maps. The CFH gene has been associated with AMD in several studies.

Coming Soon…
Studies:
Early 2007: Michael J. Fox Foundation Parkinson's Disease Study (LEAPS); NINDS Stroke and ALS.
Spring 2007: GAIN (Genetic Association Information Network); Framingham SHARe (first two generations); NIDDK GoKinD and EDIC.
Summer 2007: Framingham SHARe (third generation).
Late 2007 to early 2008: GEI (Genes and Environment Initiative).
Features:
Search analysis results by: gene; SNP or microsatellite marker; genomic region.
Filter analysis results by: P-value; HWE; minor allele frequency; call rate?
Download: public summaries; authorized access for individual-level data.
Other associations with phenotype (expression data, etc.).

Acknowledgements
Phenotype: Rinat Bagoutdinov, Luning Hao, Mas Kimura, Jimmy Jin, Natasha Popova, Stephanie Pretels, Karl Sirotkin, Jack Wang, Matt Mailman
Genotype: Mike Feolo, Lon Phan, David Shao, Ming Ward, Steve Sherry
XML: Kim Tryka, Laura Kelly, Jeff Beck
Authorized Access: Steve Sherry, Eugene Yaschenko, Vladimir Soussov, Misha Kimmelman, Don Preuss, Al Graeff, Jim Ostell
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap

Document HTML

Document PDF

Multiple maps can be displayed to show what is already known about a particular genomic region.