Manolis Kellis Broad Institute of MIT and Harvard

Presentation transcript:

Big Data Opportunities and Challenges in Human Disease Genetics & Genomics Manolis Kellis Broad Institute of MIT and Harvard MIT Computer Science & Artificial Intelligence Laboratory

Big data: opportunities & challenges in human disease genetics & genomics
The goal: Mechanistic basis of human disease
  Epigenomics: Enhancers, networks, regulators, motifs
  Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
  Effects are very small, huge number of hypotheses
  Much larger cohorts are needed, consent limitations
  Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
  Case study: Schizophrenia, Alzheimer’s
  Collaboration & sharing: personal & technological

Bridging the knowledge gap from genetics to disease
[Figure labels: Genetic variant (CATGACTG / CATGCCTG); Environment; Control regions; Chromatin states (promoter, enhancer, insulator, silencer); Circuitry; Factors; Target genes (protein, miRNA, ncRNA); TIMP3; Tissue / cell type (retina, heart, cortex, lung, blood, skin, nerve); Intermediate effects (lipids, tension, eye drusen, metabolism, drug response); Disease]
Requires: systematic understanding of genome function

The most complete map of human gene regulation
  2.3M regulatory elements across 127 tissue/cell types
  High-resolution map of individual regulatory motifs
  Circuitry: regulators → regions → motifs → target genes

Non-coding variants lie in tissue-specific regulatory regions
  Yield new insights on relevant tissues and pathways
  Enable linking non-coding elements to relevant target genes
  Provide a mechanistic basis for developing therapeutics

Control regions harbor 1000s of weak-effect disease SNPs
  GWAS top hits explain only a small fraction of trait heritability
  Functional enrichments extend well past genome-wide significance

Bayesian integration of weak effects → disease modules
[Figure legend: poorly ranked SNP nearby; highly ranked SNP nearby; disease gene; genetic association; disease SNP]
For a type 1 diabetes dataset in dbGaP, our model also identifies relatively few SNPs and genes relevant to disease. Here, the model marks the MAZ regulator (a regulator of insulin expression) as relevant; MAZ is not near any significant SNP in the study but is important for connecting the disease modules.
  MAZ: no direct association, but clusters with many T1D hits
  MAZ is indeed a known regulator of insulin expression

Brain methylation changes in Alzheimer’s patients
  MAP (Memory and Aging Project) + ROS (Religious Orders Study)
  Dorsolateral PFC
  Genotype (1M SNPs x 700 ind.)
  Reference chromatin states
  Methylation (450k probes x 700 ind.)
  Variation in methylation patterns is largely genotype driven
  Global signature of repression in 1000s of regulatory regions: hypermethylation, enhancer states, brain regulator targets

Scaling of QTL discovery power with sample size
  Number of meQTLs continues to increase linearly
  Weak-effect meQTLs: median R2 < 0.1 after 400 individuals
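Why weak-effect QTLs only appear at larger sample sizes can be illustrated with a textbook power approximation. The sketch below is not the analysis behind the slide: it assumes a plain correlation test between genotype and methylation, uses the Fisher z approximation, and the function name `qtl_power` and the significance levels passed to it are illustrative assumptions.

```python
from math import atanh, sqrt
from statistics import NormalDist


def qtl_power(r2, n, alpha):
    """Approximate two-sided power to detect a QTL explaining a fraction r2
    of trait variance with n individuals at significance level alpha.

    Uses the Fisher z approximation for a correlation coefficient; this is
    an assumed, simplified test model, not the method used in the slides.
    """
    std_normal = NormalDist()
    # Expected test statistic under the alternative hypothesis.
    z_effect = atanh(sqrt(r2)) * sqrt(n - 3)
    # Critical value of the two-sided test.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value in either tail.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical stringent threshold standing in for multiple-testing correction.
    for n in (100, 200, 400, 700):
        print(n, round(qtl_power(r2=0.05, n=n, alpha=5e-8), 3))
```

Under these assumptions, power for a fixed small R2 rises steeply with the number of individuals, which is consistent with the count of detectable weak-effect meQTLs continuing to grow as the cohort grows.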

Inflection point in complex trait GWAS
[Figure labels, by data freeze and approximate sample size: Incl. SWE + CLOZUK (~60K); WCPG Hamburg 2012 (~65K); Freeze Jan. 2013 (~70K); Freeze May 2013 (~80K); Incl. replication (~100K)]

Schizophrenia GWAS: Number of significant loci
  3,500 cases → 0 loci
  10,000 cases → 5 loci
  35,000 cases → 62 loci!
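The inflection can be made concrete by computing how many new loci each additional 1,000 cases yielded between the sample sizes quoted above. This is a minimal arithmetic sketch on the three data points from the slide; the variable and function names are illustrative.

```python
# (cases, genome-wide significant loci) as quoted on the slide.
SCZ_GWAS = [(3_500, 0), (10_000, 5), (35_000, 62)]


def marginal_loci_per_1000_cases(points):
    """New loci gained per 1,000 additional cases between consecutive
    sample sizes in `points`."""
    rates = []
    for (cases_a, loci_a), (cases_b, loci_b) in zip(points, points[1:]):
        rates.append(1000 * (loci_b - loci_a) / (cases_b - cases_a))
    return rates


if __name__ == "__main__":
    for rate in marginal_loci_per_1000_cases(SCZ_GWAS):
        print(f"{rate:.2f} new loci per 1,000 additional cases")
```

The yield per added case is higher in the second interval than in the first, which is what the slide means by an inflection point rather than a linear return on sample size.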

Similar inflection point found in every complex trait!
Significantly associated regions (p < 5e-08)
  Columns: Adult height (per 5000/5000) | Crohn’s (per 1000/1000) | Schizophrenia (per 3000/3000)
  Rows give the sample-size multiple, with values in slide order (incomplete rows left as extracted):
  1x: 2, 1
  2x: 4
  3x: 7, 5, 6
  9x: 68, 51, 62
  18x: 180, -
Same story in:
  Type 1 diabetes
  Type 2 diabetes
  Serum cholesterol level
  Every common chronic disease

Larger samples lead to new biological insights
  Proof that schizophrenia is a heritable, medical disorder
  Genetic architecture similar to non-brain diseases and traits
  Many genes → recognition of key pathways and processes
    Voltage-gated calcium channels (CACNA1C, CACNA1D, CACNA1I, CACNB2)
    Proteins interacting with FMRP, fragile X gene
    Neuron organization: postsynaptic density, dendritic spine heads
    Enhancers: brain (angular gyrus, inferior temporal lobe), immune
Eric Lander!!
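The p < 5e-08 cutoff quoted above is the conventional genome-wide significance threshold. It is commonly motivated as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent common-variant tests; the slides do not state this derivation, so the test count below is an assumption, and the helper names are illustrative.

```python
import math

# Assumption: about one million independent tests, a common convention
# for common-variant GWAS. This number does not come from the slides.
ASSUMED_INDEPENDENT_TESTS = 1_000_000


def bonferroni_threshold(family_wise_alpha, n_independent_tests):
    """Per-test p-value threshold that bounds the family-wise error rate."""
    return family_wise_alpha / n_independent_tests


def is_genome_wide_significant(p_value, threshold):
    """True if p_value passes the given per-test threshold."""
    return p_value < threshold


GENOME_WIDE_THRESHOLD = bonferroni_threshold(0.05, ASSUMED_INDEPENDENT_TESTS)

if __name__ == "__main__":
    print(GENOME_WIDE_THRESHOLD)
    print(is_genome_wide_significant(1e-9, GENOME_WIDE_THRESHOLD))
    print(is_genome_wide_significant(1e-6, GENOME_WIDE_THRESHOLD))
```

A threshold this strict is one reason the challenges listed in the talk pair very small effects with a huge number of hypotheses: a weak-effect variant needs a large cohort before its p-value can fall below the cutoff.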

Big data: opportunities & challenges in human disease genetics & genomics
The goal: Mechanistic basis of human disease
  Epigenomics: Enhancers, networks, regulators, motifs
  Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
  Effects are very small, huge number of hypotheses
  Much larger cohorts are needed, consent limitations
  Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
  Collaboration, consortia, sharing of datasets
  Case study: Schizophrenia, Alzheimer’s