A Connected Digital Biomedical Research Enterprise with Big Data Belinda Seto, Ph.D. Deputy Director National Eye Institute.

Presentation transcript:


What is it?
- Digital research assets: data, workflows, publications, software
- To connect these assets:
  - Unique identifiers or tags
  - Annotation
  - Community-developed standards
  - Interfaces

Benefits
- Increase scientific productivity
- Enhance collaborations
- Foster creativity: new tools, algorithms, methods, modeling
- Enable new discoveries
- Improve interoperability
- Facilitate reproducibility

Gene Expression Data: Volume, Velocity, Variety
Source: Barrett T et al. Nucl. Acids Res. 2013;41:D991-D995. Published by Oxford University Press.

Gene Expression Omnibus
- A public repository (NLM) of microarray, next-generation sequencing, and functional genomic data
- Web-based interface and apps for query and data download

Myriad Data Types: Genomic, Other 'Omic, Imaging, Phenotypic, Clinical, Exposure

Making Big Data Functional
- Engender an interdisciplinary approach to data collection and analysis by integrating scientific, algorithmic, and computational work
- Drive functional data collection and analysis that have practical value in determining risk alleles

Integration of Data
- Opportunities: understanding biology across scales, from molecules to populations
- Challenges: need for access to primary and processed data, machine-readable metadata, and tools to reduce dimensionality

Integration of Disparate Data Types: Brain Images with Genomic Data

Brain measures versus epidemiological studies to find genetic variants that directly affect the brain
- Difficult: epidemiological studies may require 10,000-30,000 people, e.g., the Psychiatric Genetics Consortium studies
- Easier (?): gene variants (SNPs) may affect brain measures directly, and many brain measures relate to disease status

Finding Genetic Variants Influencing Brain Structure
[Figure: example sequences CTAGTCAGCGCT and CTAGTAAGCGCT differ at a single position, the SNP. Genotypes C/C, A/C, and A/A are tested for association with the phenotype, intracranial volume.]

Genome-Wide Association Studies (GWAS)
- Identify loci for phenotypes or diseases using genotyping arrays that cover the entire genome
- Study the association of polymorphisms with complex human traits
- Meta-analysis across multiple studies
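The single-SNP association test that a GWAS repeats across the genome can be sketched as follows. This is a minimal illustration, not the method used in the studies cited in this talk: it assumes an additive model (counting copies of the A allele), ordinary least squares, and invented toy values for intracranial volume. The function name `additive_association` and the `DOSAGE` table are hypothetical.

```python
from math import sqrt

# Assumed additive coding: number of copies of the A allele in each genotype.
DOSAGE = {"C/C": 0, "A/C": 1, "A/A": 2}

def additive_association(genotypes, phenotype):
    """Return the least-squares slope of phenotype on allele dosage and its t statistic."""
    x = [DOSAGE[g] for g in genotypes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(phenotype) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, phenotype))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares; the slope's standard error uses n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, phenotype))
    standard_error = sqrt(rss / (n - 2) / sxx)
    return slope, slope / standard_error

# Invented toy data (not from the talk): intracranial volume in cm^3.
genotypes = ["C/C", "C/C", "A/C", "A/C", "A/A", "A/A"]
volume = [1500.0, 1520.0, 1470.0, 1490.0, 1440.0, 1455.0]
slope, t_stat = additive_association(genotypes, volume)
print(f"change in volume per A allele: {slope:.2f} cm^3, t = {t_stat:.2f}")
```

In a real analysis the t statistic would be converted to a P-value and the model would adjust for covariates such as age and sex.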

Genome-wide Association Study
- One SNP: "candidate gene" approach, e.g., BDNF
- Genome-wide: screening 500,000 to 2,000,000 SNPs
- An NIH-funded database of genotypes and phenotypes enables searches to find where in the genome a variant is associated with a trait
[Figure: -log10(P-value) plotted against position along the genome; inset shows intracranial volume by genotype (C/C, A/C, A/A).]
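Screening this many SNPs means the per-SNP significance threshold has to be much stricter than 0.05. The sketch below assumes a Bonferroni correction, which the talk does not name, and shows where such a threshold would sit on the -log10(P-value) axis. Both function names are hypothetical.

```python
from math import log10

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def plot_height(p_value):
    """Height of a P-value on a -log10(P-value) axis."""
    return -log10(p_value)

# The range of SNP counts comes from the slide; the correction method is an assumption.
for n_snps in (500_000, 2_000_000):
    threshold = bonferroni_threshold(n_snps)
    print(f"{n_snps} SNPs: P < {threshold:.2e}, -log10(P) > {plot_height(threshold):.2f}")
```

Because SNPs in linkage disequilibrium are correlated, Bonferroni is conservative here; it serves only as a simple upper bound on how strict the threshold needs to be.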

Applications of GWAS
- Identify genetic variants that affect brain measures: volumetric, fiber integrity, connectivity
- Risk genes
- Early biomarkers of disease

What is a risk gene?
- A common genetic variant related to a brain measure, a disease, or a trait such as obesity, found by searching the genome
- 99.9% of DNA is the same for all people
- DNA variation causes changes in predisposition to disease and in brain structure
- One type of variation is a single nucleotide polymorphism (SNP): a single-letter change in the DNA code
[Figure: 23 pairs of chromosomes; a particular part of chromosome 5 contains many genes; within a gene there are exons, introns, and SNPs.]

GRIN2B Risk Allele
- Glutamate receptor, signaling pathway
- Genetic polymorphism of the GRIN2B gene
- Associated with reductions of brain white matter integrity
- Bipolar disorder
- Obsessive-compulsive disorder

GRIN2B genetic variant is associated with a 2.8% temporal lobe volume deficit
- GRIN2B is over-represented in AD
- Could be considered an Alzheimer's disease risk gene
- Needs replication
Source: Jason L. Stein, Xue Hua PhD, Jonathan H. Morra PhD, Suh Lee, April J. Ho, Alex D. Leow MD PhD, Arthur W. Toga PhD, Jae Hoon Sul, Hyun Min Kang, Eleazar Eskin PhD, Andrew J. Saykin PsyD, Li Shen PhD, Tatiana Foroud PhD, Nathan Pankratz, Matthew J. Huentelman PhD, David W. Craig PhD, Jill D. Gerber, April Allen, Jason J. Corneveaux, Dietrich A. Stephan, Jennifer Webster, Bryan M. DeChairo PhD, Steven G. Potkin MD, Clifford R. Jack Jr MD, Michael W. Weiner MD, Paul M. Thompson PhD, and the ADNI (2010). Genome-Wide Analysis Reveals Novel Genes Influencing Temporal Lobe Structure with Relevance to Neurodegeneration in Alzheimer's Disease, NeuroImage.

GRIN2B genetic variant associates with brain volume in the regions shown (figure not reproduced); 2.8% more temporal lobe atrophy
Source: Stein et al. (2010), NeuroImage.

Alzheimer's risk gene carriers (CLU-C) have lower fiber integrity even when young (N=398), 50 years before disease typically hits
[Figure: voxels where CLU allele C (at rs ) is associated with lower FA after adjusting for age, sex, and kinship in 398 young adults (68 T/T; 220 C/T; 110 C/C). FDR critical p = . The left hemisphere is shown on the right.]
Source: Braskie et al., Journal of Neuroscience, May

Effect is even stronger for carriers of a schizophrenia risk gene variant, trkA-T (N=391 people)
[Figure, panel a: p values indicate where NTRK1 allele T carriers (at rs6336) have lower FA after adjusting for age, sex, and kinship in 391 young adults (31 T+; 360 T-). FDR critical p = . Panel b: voxels that replicate in 2 independent halves of the sample (FDR-corrected). The left hemisphere is shown on the right.]
Source: Braskie et al., Journal of Neuroscience, May 2012
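The "FDR critical p" in these captions is the largest voxelwise P-value that still counts as significant once the false discovery rate is controlled. A minimal sketch follows, assuming the Benjamini-Hochberg procedure (the slides do not say which FDR method was used) and invented P-values. The function name `fdr_critical_p` is hypothetical.

```python
def fdr_critical_p(p_values, q=0.05):
    """Benjamini-Hochberg: largest sorted p(k) with p(k) <= (k / m) * q, or None if none qualify."""
    m = len(p_values)
    critical = None
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= k / m * q:
            critical = p
    return critical

# Invented voxelwise P-values, for illustration only.
voxel_p = [0.001, 0.008, 0.039, 0.041, 0.20, 0.60]
critical = fdr_critical_p(voxel_p)
significant = [p for p in voxel_p if critical is not None and p <= critical]
print("FDR critical p:", critical)
print("significant voxels:", significant)
```

Every voxel whose P-value is at or below the critical value is declared significant, which is how a single threshold can be reported for a whole brain map.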

Neural Fiber Integrity: Fractional Anisotropy (FA)
- Applied to diffusion tensor MRI
- FA = 0 means diffusion is isotropic: unrestricted, or equally restricted in all directions
- FA = 1 means diffusion is restricted to only one direction
- FA is sensitive to fiber density, axonal diameter, and myelination of white matter
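For reference, FA is computed at each voxel from the three eigenvalues of the diffusion tensor. The sketch below uses the standard definition, FA = sqrt(3/2) * sqrt(sum of (lambda_i - mean)^2) / sqrt(sum of lambda_i^2); the function name and the example eigenvalues are illustrative, not taken from the talk.

```python
from math import sqrt

def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy from the three eigenvalues of a diffusion tensor."""
    mean = (l1 + l2 + l3) / 3
    spread = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    magnitude = l1 ** 2 + l2 ** 2 + l3 ** 2
    if magnitude == 0:
        return 0.0  # no diffusion measured; treat as isotropic
    return sqrt(1.5 * spread / magnitude)

print(fractional_anisotropy(1.0, 1.0, 1.0))  # equal diffusion in all directions
print(fractional_anisotropy(1.0, 0.0, 0.0))  # diffusion along a single axis
print(fractional_anisotropy(1.7, 0.3, 0.2))  # illustrative elongated tensor
```

The first two calls reproduce the two limiting cases on the slide: equal eigenvalues give FA = 0, and a single nonzero eigenvalue gives FA = 1.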

SNPs can predict variance in brain integrity
- Gene categories shown: neurochemical genes, neurodevelopmental genes, neurodegenerative risk genes
- Genes shown: COMT, HFE, CLU, NTRK1, ErbB4, BDNF
- A significant fraction of variability in white matter structure of the corpus callosum (measured with DTI) is predictable from SNPs
Source: Kohannim O, et al. Predicting white matter integrity from multiple common genetic variants. Neuropsychopharmacology 2012, in press.

Big Data
- 26,000 whole-brain MR images
- More than 500,000 single nucleotide polymorphisms (SNPs)
- At each voxel of the entire brain, search the whole genome for associated genetic variants
- Select only the most associated SNP at each voxel, by analyzing P-values through an inverse beta transformation
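Keeping only the most associated SNP at each voxel biases the retained P-value downward, so it has to be corrected for the number of SNPs searched. One way to read the "inverse beta transformation" is sketched below, under the assumption that the tests are independent: the minimum of n uniform P-values follows a Beta(1, n) distribution, whose cumulative distribution function is 1 - (1 - p)^n. The function name and the `n_effective` parameter are hypothetical, and the exact procedure used in the study may differ.

```python
from math import expm1, log1p

def corrected_min_p(p_values, n_effective=None):
    """Correct the smallest P-value at a voxel for having searched many SNPs.

    Assumes independent tests, so the minimum P-value is Beta(1, n) distributed
    under the null; the corrected value is that distribution's CDF at the minimum.
    """
    n = n_effective if n_effective is not None else len(p_values)
    p_min = min(p_values)
    if p_min >= 1.0:
        return 1.0
    # 1 - (1 - p_min)^n, written to stay accurate when p_min is tiny.
    return -expm1(n * log1p(-p_min))

# Invented P-values for the SNPs tested at one voxel.
snp_p_at_voxel = [0.5, 0.9]
print(corrected_min_p(snp_p_at_voxel))
```

Because neighbouring SNPs are correlated, an effective number of independent tests smaller than the raw SNP count can be passed as `n_effective`.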

Genetic clustering boosts GWAS power
1. Many top hits now reach genome-wide significance (N=472) and replicate
2. Several SNPs affect multiple ROIs
3. Can form a network of SNPs that affect similar ROIs
4. It has a small-world, scale-free topology (for more, see Chiang et al., J. Neurosci., 2012)

Population-Level Data Integration: Electronic Medical Records, Genotypes and Phenotypes

eMERGE
- Goal: research to combine DNA biorepositories with electronic medical records (EMR) for large-scale association studies of genetics and phenotypes
- Goal: to incorporate genetic variants into the EMR for use in clinical care

Network Members

eMERGE Innovation
- Algorithms for electronic phenotyping of clinical conditions identified in EMR
- Discoveries of genetic variants in biorepository samples

Big Data to Knowledge