QTL Studies- Past, Present and Future David Evans.

Slides:



Advertisements
Similar presentations
What is an association study? Define linkage disequilibrium
Advertisements

Genome-wide Association Studies John S. Witte. Association Studies Hirschhorn & Daly, Nat Rev Genet 2005 Candidate Gene or GWAS.
Meta-analysis for GWAS BST775 Fall DEMO Replication Criteria for a successful GWAS P
Genetic Epidemiology Michèle Sale, Ph.D. Center for Public Health Genomics Tel:
Genetic Analysis in Human Disease
November 17, 2007Georgia Tech / ORNL Bioinformatics Conference1 Opportunities at the NIH Peter Good, Ph.D. NHGRI November 17, th Georgia Tech / ORNL.
Perspectives from Human Studies and Low Density Chip Jeffrey R. O’Connell University of Maryland School of Medicine October 28, 2008.
Association Mapping David Evans. Outline Definitions / Terminology What is (genetic) association? How do we test for association? When to use association.
Ingredients for a successful genome-wide association studies: A statistical view Scott Weiss and Christoph Lange Channing Laboratory Pulmonary and Critical.
What is Mendelian Randomisation? Frank Dudbridge.
Diabetes Genome Wide Association Alessandra C Goulart Ida Hatoum Stalo Karageorgi Mara Meyer EPI293 January 2008 Harvard School of Public Health Alessandra.
Estimating “Heritability” using Genetic Data David Evans University of Queensland.
Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls The Wellcome Trust Case Control Consortium, Nature, 2007.
Genetics for Imagers: How Geneticists Model Quantitative Phenotypes
Genetic Traits Quantitative (height, weight) Dichotomous (affected/unaffected) Factorial (blood group) Mendelian - controlled by single gene (cystic fibrosis)
Genome-Wide Association Studies Xiaole Shirley Liu Stat 115/215.
Genomewide Association Studies.  1. History –Linkage vs. Association –Power/Sample Size  2. Human Genetic Variation: SNPs  3. Direct vs. Indirect Association.
Common Disease Findings (case study on diabetes) GWAS Workshop Francis S. Collins, M.D., Ph.D. National Human Genome Research Institute May 1, 2007.
Haplotype Discovery and Modeling. Identification of genes Identify the Phenotype MapClone.
Design Considerations in Large- Scale Genetic Association Studies Michael Boehnke, Andrew Skol, Laura Scott, Cristen Willer, Gonçalo Abecasis, Anne Jackson,
Understanding Genetics of Schizophrenia
Chapter 7 Multifactorial Traits
Exploring the behavioral genetics of Trade and Cooperation Arcadi Navarro and Elodie Gazave July 5th 2007.
Genetic Analysis in Human Disease. Learning Objectives Describe the differences between a linkage analysis and an association analysis Identify potentially.
Rare and common variants: twenty arguments G.Gibson Homework 3 Mylène Champs Marine Flechet Mathieu Stifkens 1 Bioinformatics - GBIO K.Van Steen.
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
Molecular and Genetic Epidemiology Kathryn Penney, ScD January 5, 2012.
The medical relevance of genome variability Gabor T. Marth, D.Sc. Department of Biology, Boston College
Multifactorial Traits
Computational research for medical discovery at Boston College Biology Gabor T. Marth Boston College Department of Biology
A single-nucleotide polymorphism tagging set for human drug metabolism and transport Kourosh R Ahmadi, Mike E Weale, Zhengyu Y Xue, Nicole Soranzo, David.
The medical relevance of genome variability Gabor T. Marth, D.Sc. Department of Biology, Boston College Medical Genomics Course – Debrecen,
What host factors are at play? Paul de Bakker Division of Genetics, Brigham and Women’s Hospital Broad Institute of MIT and Harvard
A basic review of genetics Dr. Danny Chan Associate Professor Assistant Dean (Faculty of Medicine) Department of Biochemistry Department of Biochemistry.
Genome-Wide Association Study (GWAS)
Whole genome association studies Introduction and practical Boulder, March 2009.
Jianfeng Xu, M.D., Dr.PH Professor of Public Health and Cancer Biology Director, Program for Genetic and Molecular Epidemiology of Cancer Associate Director,
Risk Prediction of Complex Disease David Evans. Genetic Testing and Personalized Medicine Is this possible also in complex diseases? Predictive testing.
Genome wide association studies (A Brief Start)
The International Consortium. The International HapMap Project.
Admixture Mapping Controlled Crosses Are Often Used to Determine the Genetic Basis of Differences Between Populations. When controlled crosses are not.
Design and Analysis of Genome- wide Association Studies David Evans.
Insights Into Complex Disease Genetics (?) David M Evans.
Genome-Wides Association Studies (GWAS) Veryan Codd.
Genetic Analysis in Human Disease Kim R. Simpfendorfer, PhD Robert S.Boas Center for Genomics & Human Genetics The Feinstein Institute for Medical Research.
A Fine Mapping Theorem to Refine Results from Association Genetics Studies S.J. Schrodi, V.E. Garcia, C.M. Rowland Celera, Alameda, CA ABSTRACT Justification.
Power and Meta-Analysis Dr Geraldine M. Clarke Wellcome Trust Advanced Courses; Genomic Epidemiology in Africa, 21 st – 26 th June 2015 Africa Centre for.
Brendan Burke and Kyle Steffen. Important New Tool in Genomic Medicine GWAS is used to estimate disease risk and test SNPs( the most common type of genetic.
Allele-Specific Copy Number in Tumors
SNPs and complex traits: where is the hidden heritability?
Common variation, GWAS & PLINK
Complex disease and long-range regulation: Interpreting the GWAS using a Dual Colour Transgenesis Strategy in Zebrafish.
Genome Wide Association Studies using SNP
Marker heritability Biases, confounding factors, current methods, and best practices Luke Evans, Matthew Keller.
High level GWAS analysis
Introduction to Genetics and Genomics 3. Association Studies
Epidemiology 101 Epidemiology is the study of the distribution and determinants of health-related states in populations Study design is a key component.
Genome-wide Associations
Beyond GWAS Erik Fransen.
Topic 10.2 Inheritance.
Genetic Testing and the Prevention of Type 1 Diabetes
Chapter 7 Multifactorial Traits
Exercise: Effect of the IL6R gene on IL-6R concentration
Mendelian Randomization (Using genes to tell us about the environment)
Medical genomics BI420 Department of Biology, Boston College
Perspectives from Human Studies and Low Density Chip
Medical genomics BI420 Department of Biology, Boston College
An Expanded View of Complex Traits: From Polygenic to Omnigenic
Mendelian Randomization: Genes as Instrumental Variables
Presentation transcript:

QTL Studies- Past, Present and Future David Evans

Genetic studies of complex diseases have not met anticipated success Glazier et al, Science (2002) 298:

Korstanje & Pagan (2002) Nat Genet

BUT…Not much success in mapping complex diseases / traits ! Reasons for Failure? Complex Phenotype Common environment MarkerGene 1 Individual environment Polygenic background Gene 2 Gene 3 Linkage disequilibrium Mode of inheritance Linkage Association Weiss & Terwilliger (2000) Nat Genet

Type 1 diabetes and Insulin VNTR Bennett & Todd, Ann Rev Genet, 1996 Alzheimers and ApoE4 Roses, Nature 2000 LD Patterns and Allelic Association Pattern of LD unpredictable Extent of common genetic variation unknown

Genome-wide Association? Risch & Merikangas, Science 1996

Multiple Rare Variant Hypothesis? GWA assumes that common variants underlie common diseases

Enabling Genome-wide Association Studies HAPlotype MAP High throughput genotyping Large cohorts “ALSPAC”

Wellcome Trust Case Control Consortium

CASES 1.Type 1 Diabetes 2.Type 2 Diabetes 3.Crohn’s Disease 4.Coronary Heart Disease 5.Hypertension 6.Bipolar Disorder 7.Rheumatoid Arthritis 8.Malaria 9.Tuberculosis 10.Ankylosing Spondylitis 11.Grave’s Disease 12.Breast Cancer 13.Multiple Sclerosis CONTROLS 1. UK Controls A (1, BC) 2. UK Controls B (1,500 - NBS) 3. Gambian controls (2000) Wellcome Trust Case-Control Consortium Genome-Wide Association Across Major Human Diseases DESIGN Collaboration amongst 26 UK disease investigators 2000 cases each from 9 diseases 1000 cases from 4 diseases GENOTYPING Affymetrix 500k SNPs Illumina Human NS_12 SNP chip

Ankylosing Spondylitis Auto-immune arthritis resulting in fusion of vertebrae Often associated with psoriasis, IBD and uveitis Prevalence of 0.4% in Caucasians. More common in men. Ed Sullivan, Mike Atherton

Ankylosing Spondylitis GWAS WTCCC (2007) Nat Genet IL23-R ARTS-1

Breast ca Prostate ca Colorectal ca Crohn’s Dis IBD T1D T2D Obesity Triglycerides CAD Coeliac Dis Rheumatoid A Ankylosing S Alzheimer Dis Glaucoma Asthma FGFR2 SMAD7 TNRC9 MAPKI3 LSP1 2q358q24 NOD2 IL23R INS 9p21 FTO GCKR 9p21 IL2PTPN22 ARTS GALPLOXL1 ORMDL IL21 IL23R 6q251 2q36 12q24 18p11 IL2RA FTO CDKAL1 IGF2BP2 KIAAA 035D HHEX PPARG SLC30A8 TCF7L2 KCNJ11 5p13 ATG16L1 BSN IRGM PHOX2B NCF4 NKX2-3 FAM92B PTPN2 MSMB KLK3 SLC22A3 NUDT11 3des. 11q13 LMTK2 HNF1B CTBP2 JAZF1 8q24 WFS1 TCF2 PTPN22 CD25 ERBB3 CTLA4 IFIH1 10q21 Common genes = common etiology? Some in gene deserts Large relative risks does not = success Drug targets Successes…

What About Quantitative Traits? Quantitative genetics theory suggests that quantitative traits are the result of many variants of small effect The corollary is that very large sample sizes will be needed to detect these variants in UNSELECTED samples 1 Gene  3 Genotypes  3 Phenotypes 2 Genes  9 Genotypes  5 Phenotypes 3 Genes  27 Genotypes  7 Phenotypes 4 Genes  81 Genotypes  9 Phenotypes Central Limit Theorem  Normal Distribution Unselected samples

FTSO FTO FTO produces a moderate signal in WTCCC T2D scan But, no signal in an American T2D scan…? WTCCC T2D Scan -American cases and controls matched on BMI

FTO Replication is critical !!!

Twin, family and adoption studies suggest that, within a population, 90% of variation in height is due to genetic variation Dizygotic Twins Twins separated at birth Monozygotic Twins Borjeson, Acta Paed, 1976 Height- The Archetypal Polygenic Trait

GWA of Height Weedon et al. (in press) Nat Genet Large numbers are needed to detect QTLs !!! Collaboration is the name of the game !!! A Cases (WTCCC T2D) B Cases (DGI) C Cases (WTCCC HT) D Cases (WTCCC CAD) E Cases (EPIC) F Cases (WTCCC UKBS) Significant results Other loci?

A: 1,900 C: 7,200 E: 12,600F: 14,000 D: 9,100 B: 5,000 Weedon et al. (in press) Nat Genet Some real hits sit in the bottom of the distribution Some hits initially look interesting but then go away

Hedgehog signaling, cell cycle, and extra-cellular matrix genes over-represented Candidate geneMonogenicKnockout mouseDetails* ZBTB38--Transcription factor. CDK6-YesInvolved in the control of the cell cycle. HMGA2Yes Chromatin architectural factors GDF5Yes Involved in bone formation LCORL--May act as transcription activator LOC Not known EFEMP1Yes-Extra-cellular matrix C6orf106--Not known PTCH1Yes Hedgehog signalling SPAG17--Not known SOCS2-YesRegulates cytokine signal transduction HHIP--Hedgehog signaling ZNF678--Transcription factor DLEU7--Not known SCMH1-YesPolycomb protein ADAMTSL3--Extra-cellular matrix IHHYes Hedgehog signaling ANAPC13--Cell cycle ACANYes Extra-cellular matrix DYMYes-Not known Weedon et al. (in press) Nat Genet

The combined impact of the 20 SNPS with a P < 5 x The 20 SNPs explain only ~3% of the variation of height Lots more genes to find – but extremely large numbers needed Weedon et al. (in press) Nat Genet

Perola et al, Plos Genetics, 2007; data available at Weedon et al.; unpublished data Height Linkage Regions

Perola et al, Plos Genetics, 2007; data available at Weedon et al.; unpublished data

What’s Going On? BUT…what if linkage analysis and association analysis identify different types of loci? Areas identified by linkage don’t have significant assocation hits over them Linkage analysis lacks power? Loci identified by GWAs don’t have linkage peaks over them Type I error? Power?

What next? Initial Genome Wide Scans Functional Studies Other ethnic groups Fine mapping More genes CNVs Transcriptomics Animal models Mendelian Randomisation Genomic Profiling Epigenetics Genome-wide Sequencing

Distribution of MAFs in HapMap Genome-wide panels and HapMap biased towards common variants Common variants don’t tag rare variants well

FTO IL23R HLA-B27 TCF7L2 ARTS1 TCF2 PPARG WFS1??? SEMA5A “Rotten Fruit” “Low hanging fruit” “High hanging fruit” Complex Disease Tree

Methods of gene hunting Effect Size Frequency rare, monogenic (linkage) common, complex (association) ?

Evans et al. (2008) EJHG Genome-wide Sequencing Sequence individuals’ genomes Will identify rare variants But will we have enough power?

Genomic Profiling BUT…if we consider several predisposing genetic and environmental factors, can we predict disease? Predictive testing in the case of monogenic diseases has been used for years (1300+ tests available) (e.g. Phenylketonuria) The idea of using genetic information to inform diagnosis Not possible in complex diseases as effects of an individual variant is so small

(from Yang et al AJHG) (from Janssens et al AJHG) Genomic Profiling => Give up and go home?

Prevalence of B27+, ARTS1+,IL23R+ is 2.4% Prevalence of B27-, ARTS1-, IL23R- is 19% Ankylosing Spondylitis (Brown & Evans, in prep)

Using Genetics to Inform Classical Epidemiology

Observational Studies Obesity YesNo CHD Control50250 Odds of obesity in cases: 200/100 = 2 Odds of obesity in controls: 50/250 = 0.2 Odds Ratio: 2/0.2 = 10 Fanciful claims often made from observational studies In a case-control study, a group of diseased individuals are recruited (Cases); A group of individuals without disease are gathered (Controls); Both groups are then measured retrospectively on an exposure of interest; A test of association is performed Example: Obesity (Exposure) and Coronary Heart Disease (Outcome)

Classic limitations to “observational” science Confounding Reverse Causation Bias

Randomized Control Trials Individuals free from exposure and disease Treatment Group Control Group Randomization Measure Outcome Measure Outcome Randomization controls for confounding Reverse causation impossible Gold standard for assessing causality

Mendelian Randomization Fortunately nature has provided us with a natural randomized control trial ! RCTs not always ethical or possible Mendel’s law of independent assortment states that inheritance of a trait is independent (randomized) with respect to other traits Assessing the relationship between genotype, environmental risk factor and disease informs us on causality Therefore individuals are randomly assigned to three groups based on their genotype (AA, Aa, aa) independent of outcome

Genetic Variant Modifiable Environmental Exposure Disease Confounding Variables (FTO) (Obesity) (smoking, diet etc.) (Coronary Heart Disease) Mendelian Randomization If obesity causes CHD then the relationship between FTO and CHD should be estimated by the product of β FTO-Obesity and β Obesity-CHD β FTO-Obesity β Obesity-CHD

Genetic Variant Modifiable Environmental Exposure Disease Confounding Variables (FTO) (Obesity) (smoking, diet etc.) (Coronary Heart Disease) Mendelian Randomization If CHD causes obesity then β FTO-CHD should be zero. β FTO-Obesity β Obesity-CHD

Genetic Variant Modifiable Environmental Exposure Disease Confounding Variables (FTO) (Obesity) (smoking, diet etc.) (Coronary Heart Disease) Mendelian Randomization If the relationship between Obesity and CHD is purely correlational (i.e. due to confounding) then β FTO-CHD should be 0 β FTO-Obesity

Genetic Variant Modifiable Environmental Exposure Disease Confounding Variables (FTO) (Obesity) (smoking, diet etc.) (Coronary Heart Disease) Mendelian Randomization Genotype is NOT associated with confounders Genotype is associated with the environmental exposure of interest Genotype is only related to its outcome via its association with the modifiable environmental exposure

Mendelian Randomization Mendelian Randomization is a way of using a genetic variant(s) to make causal inferences about (modifiable) environmental risk factors for disease and health related outcomes Environmental exposures (e.g. Obesity) can be modified ! Genetic factors cannot (at least for the moment…) Still a relatively new approach that has problems (i.e. finding genetic proxies for environmental exposures- multiple instruments?) …but a LOT of scope for development…

Genetic Variant Modifiable Environmental Exposure Disease Confounding Variables (FTO) (Obesity) (smoking, diet etc.) (Coronary Heart Disease) Could SEM be used to enhance MR? Genetic Variant (MC4R) Genetic Variant (?)