Published by Jerome Goodwin. Modified over 9 years ago.
1
Big Data Opportunities and Challenges in Human Disease Genetics & Genomics
Manolis Kellis
Broad Institute of MIT and Harvard
MIT Computer Science & Artificial Intelligence Laboratory
2
Big data Opportunities & Challenges in human disease genetics & genomics
The goal: Mechanistic basis of human disease
  Epigenomics: Enhancers, networks, regulators, motifs
  Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
  Effects are very small, huge number of hypotheses
  Much larger cohorts are needed, consent limitations
  Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
  Case study: Schizophrenia, Alzheimer’s
  Collaboration & sharing: personal & technological
3
Bridging the knowledge gap from genetics to disease
[Diagram: from genetic variant to disease]
Genetic variant: CATGACTG / CATGCCTG
Environment
Tissue / cell type: Retina, Heart, Cortex, Lung, Blood, Skin, Nerve
Chromatin states: Promoter, Enhancer, Insulator, Silencer
Circuitry: Control regions, Factors, Target genes
Other labels: Protein, miRNA, ncRNA, TIMP3
Intermediate effects: Lipids, Tension, Eye drusen, Metabolism, Drug response
Disease
Requires: systematic understanding of genome function
4
The most complete map of human gene regulation
2.3M regulatory elements across 127 tissue/cell types
High-resolution map of individual regulatory motifs
Circuitry: regulators → regions → motifs → target genes
5
Non-coding variants lie in tissue-specific regulatory regions
Yield new insights on relevant tissues and pathways
Enable linking non-coding elements to relevant target genes
Provide a mechanistic basis for developing therapeutics
6
Control regions harbor 1000s weak-effect disease SNPs
GWAS top hits only explain a small fraction of trait heritability
Functional enrichments well past genome-wide significance
7
Bayesian integration of weak effects → disease modules
[Figure legend: Disease gene; Disease SNP; Genetic association; Highly ranked SNP nearby; Poorly ranked SNP nearby]
For a type 1 diabetes dataset in dbGaP, our model also identifies relatively few SNPs and genes relevant to disease. Here, the model marks the MAZ regulator (a regulator of insulin expression) as relevant. MAZ is not near any significant SNP in the study, but it is important for connecting the disease modules.
MAZ: no direct association, but clusters with many T1D hits
MAZ is indeed a known regulator of insulin expression
8
Brain methylation changes in Alzheimer’s patients
MAP (Memory and Aging Project) + ROS (Religious Orders Study)
Dorsolateral PFC
Genotype (1M SNPs x 700 ind.)
Methylation (450k probes x 700 ind.)
Reference chromatin states
Variation in methylation patterns largely genotype driven
Global signature of repression in 1000s of regulatory regions: hypermethylation, enhancer states, brain regulator targets
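The "huge number of hypotheses" challenge from the outline can be made concrete with the dimensions on this slide. The sketch below is only an illustration: it assumes a plain Bonferroni correction, and the cis-window size (`CIS_SNPS_PER_PROBE`) is a hypothetical value, not something stated in the talk.

```python
import math

N_SNPS = 1_000_000        # genotyped SNPs (from the slide)
N_PROBES = 450_000        # methylation probes (from the slide)
CIS_SNPS_PER_PROBE = 100  # hypothetical: SNPs tested near each probe


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def describe(label, n_tests):
    """Summarize the number of tests and the resulting Bonferroni cutoff."""
    threshold = bonferroni_threshold(n_tests)
    return (f"{label}: {n_tests:.1e} tests, "
            f"p < {threshold:.1e} (-log10 p > {-math.log10(threshold):.1f})")


# Every SNP against every probe versus only nearby (cis) pairs.
all_pairs = N_SNPS * N_PROBES
cis_pairs = N_PROBES * CIS_SNPS_PER_PROBE

print(describe("All SNP-probe pairs", all_pairs))
print(describe("Cis pairs only (assumed window)", cis_pairs))
print(describe("One test per SNP (GWAS-style)", N_SNPS))
```

Under these assumptions, restricting the scan to cis pairs shrinks the test count by four orders of magnitude, which is one common way to keep weak effects detectable.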
9
Big data Opportunities & Challenges in human disease genetics & genomics
The goal: Mechanistic basis of human disease
  Epigenomics: Enhancers, networks, regulators, motifs
  Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
  Effects are very small, huge number of hypotheses
  Much larger cohorts are needed, consent limitations
  Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
  Case study: Schizophrenia, Alzheimer’s
  Collaboration & sharing: personal & technological
11
Scaling of QTL discovery power with sample size
Number of meQTLs continues to increase linearly
Weak-effect meQTLs: median R² < 0.1 after 400 indiv.
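Why weaker meQTLs only appear at larger cohort sizes can be sketched with a standard power calculation. The example below uses the Fisher z-approximation for a correlation test; the significance level `ALPHA` and the 80% power target are assumptions chosen for illustration, not values from the talk.

```python
import math
from statistics import NormalDist

ALPHA = 1e-8  # hypothetical per-test significance level after multiple-testing correction


def samples_needed(r2, alpha=ALPHA, power=0.8):
    """Approximate number of individuals needed to detect an association that
    explains a fraction r2 of the variance (two-sided test, Fisher z-approximation)."""
    norm = NormalDist()
    z_alpha = -norm.inv_cdf(alpha / 2)        # critical value for the test
    z_power = norm.inv_cdf(power)             # quantile for the desired power
    effect = math.atanh(math.sqrt(r2))        # Fisher z-transformed correlation
    return math.ceil(((z_alpha + z_power) / effect) ** 2 + 3)


for r2 in (0.2, 0.1, 0.05, 0.01):
    print(f"R^2 = {r2:<5} -> about {samples_needed(r2)} individuals")
```

The required sample size grows roughly in proportion to 1/R², so each step toward weaker effects demands a substantially larger cohort.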
12
Inflection point in complex trait GWAS
[Plot annotations]
Incl. replication (~100K)
Freeze May 2013 (~80K)
Freeze Jan (~70K)
WCPG Hamburg 2012 (~65K)
Incl. SWE + CLOZUK (~60K)
13
Schizophrenia GWAS: Number of significant loci
3,500 cases: 0 loci
10,000 cases: 5 loci
35,000 cases: 62 loci!
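One way to see why locus counts stay near zero and then climb quickly is to sum detection power over many weak-effect loci. This is a rough sketch under stated assumptions: the effect-size list is entirely hypothetical, the test statistic is approximated as normal with mean sqrt(N × variance explained) as for a quantitative trait, and it is not intended to reproduce the counts on the slide.

```python
import math
from statistics import NormalDist

GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def detection_power(n, variance_explained, alpha=GENOME_WIDE_ALPHA):
    """Approximate power to detect one locus in a sample of n individuals,
    treating the test statistic as N(sqrt(n * variance_explained), 1)."""
    norm = NormalDist()
    z_crit = -norm.inv_cdf(alpha / 2)
    shift = math.sqrt(n * variance_explained)
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


def expected_loci(n, effects):
    """Expected number of genome-wide significant loci: the sum of per-locus power."""
    return sum(detection_power(n, q) for q in effects)


# Hypothetical architecture: 200 loci, each explaining between 0.01% and 0.1%
# of trait variance (log-spaced). Purely illustrative.
effects = [10 ** (-4 + i / 199) for i in range(200)]

for n in (5_000, 10_000, 20_000, 40_000, 80_000):
    print(f"N = {n:>6}: expected loci = {expected_loci(n, effects):.1f}")
```

With effects this small, almost nothing passes the threshold until the sample is large enough for the strongest loci, after which each further increase in N brings many loci into range at once.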
14
Similar inflection point found in every complex trait!
Significantly associated regions (p < 5e-08)
[Table; some cells were lost in extraction, values listed in source order]
Columns: Adult height (per 5000/5000), Crohn’s (per 1000/1000), Schizophrenia (per 3000/3000)
  1x: 2, 1
  2x: 4
  3x: 7, 5, 6
  9x: 68, 51, 62
  18x: 180, -
Same story in:
  Type 1 diabetes
  Type 2 diabetes
  Serum cholesterol level
  Every common chronic disease
Larger samples lead to new biological insights
  Proof that Schizophrenia is a heritable, medical disorder
  Genetic architecture similar to non-brain diseases and traits
  Many genes → recognition of key pathways and processes
    Voltage-gated calcium channels (CACNA1C, CACNA1D, CACNA1I, CACNB2)
    Proteins interacting with FMRP, fragile X gene
    Neuron organization: Postsynaptic density, dendritic spine heads
    Enhancers: brain (angular gyrus, inferior temporal lobe), immune
Eric Lander!!
15
Big data Opportunities & Challenges in human disease genetics & genomics
The goal: Mechanistic basis of human disease
  Epigenomics: Enhancers, networks, regulators, motifs
  Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
  Effects are very small, huge number of hypotheses
  Much larger cohorts are needed, consent limitations
  Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
  Collaboration, consortia, sharing of datasets
  Case study: Schizophrenia, Alzheimer’s