Association mapping: finding genetic variants for common traits & diseases Manuel Ferreira Queensland Institute of Medical Research Brisbane Genetic Epidemiology.

Presentation transcript:

1 Association mapping: finding genetic variants for common traits & diseases. Manuel Ferreira, Genetic Epidemiology, Queensland Institute of Medical Research, Brisbane. WEHI Postgraduate seminar, 31 May 2010

2 Why? Understand disease aetiology. Predict disease risk / drug response (Personalized Medicine; Lancet 2010; 375: 1525–35).

3 Rare, monogenic traits Ng et al. Nature Genetics 2010; 42:

4 Common, complex traits

Genetics of common diseases: phenotypic modelling, linkage analysis, association analysis

6 Recent advances in assays/analysis of genetic variation. HapMap, 1000 Genomes. High-throughput genotyping & sequencing. Analytic methods: genome-wide association, imputation, stratification, CNVs, risk prediction.

7 HapMap project: 1. Goals. “The HapMap was designed to determine the frequencies and patterns of association among roughly 3 million common Single Nucleotide Polymorphisms (SNPs) in four populations, for use in genetic association studies.” [4] (Figure: genotype matrix of individuals by SNPs.)
[1] The International HapMap Consortium. Nature 2003; 426: 789.
[2] International HapMap Consortium. Nature 2005; 437:
[3] International HapMap Consortium. Nature 2007; 449: 851.
[4] Manolio et al. J Clin Invest 2008; 118:

8 HapMap project: 2. Strategy.
Samples: 30 trios, Yoruba in Ibadan, Nigeria (YRI); 30 trios, European descent in Utah (CEU); 45 unrelated Han Chinese from Beijing (CHB); 45 unrelated Japanese from Tokyo (JPT).
Genome-wide SNP discovery (dbSNP): 1.7 million, 9.2 million, ,7 million (6.5 million validated) 2009.
SNP selection: Phase 1: MAF>0.05, validated, non-synonymous SNPs prioritised (1.27 million total). Phases 2 and 3 expanded SNP (4 million) and population (11) coverage.
Genotyping: 7 genotyping platforms used/developed by 12 centres.

9 HapMap project: 3. Outcomes. “Systematic” catalogue of common human variation. Linkage disequilibrium (LD), or correlation between SNPs (tagging, fine-mapping, imputation). Designing and refining high-throughput genotyping platforms. Population genetics (selection, sub-structure, recombination & mutation).

10 HapMap SNPs and haplotypes for Gene A. Correlation (LD) between SNPs: D’ and r² (Haploview, Tagger). SNP tags, eg. SNP 1 ‘tags’ 4/10 variants. Genetic coverage: proportion of known SNPs tagged (Haploview). Fine-mapping: interesting SNPs to follow up. Cross-study comparisons.
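The two LD measures named on this slide can be illustrated with a short calculation. The sketch below assumes phased haplotype counts for two biallelic SNPs are already available; the function name `ld_stats` and the example counts are hypothetical and are not taken from the talk or from Haploview.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic SNPs from haplotype counts.

    n_AB is the number of haplotypes carrying allele A at SNP 1 and
    allele B at SNP 2, and so on for the other three haplotypes.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / n          # frequency of allele B at SNP 2
    if not (0 < p_A < 1 and 0 < p_B < 1):
        raise ValueError("both SNPs must be polymorphic")

    # D: observed AB haplotype frequency minus its expectation under independence
    d = n_AB / n - p_A * p_B

    # D' rescales |D| by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max

    # r^2 is the squared correlation between the two SNPs
    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r_squared


# Hypothetical counts: only three of the four haplotypes are observed
d, d_prime, r_squared = ld_stats(40, 10, 0, 50)
print(round(d_prime, 3), round(r_squared, 3))
```

With these made-up counts D’ is at its maximum while r² is clearly below 1, which is why r² rather than D’ is the usual criterion when deciding whether one SNP can serve as a tag for another.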

1000 Genomes project: Goal. “The 1000 Genomes Project aims to achieve a nearly complete catalog of common human genetic variants (defined as frequency 1% or higher) by generating high-quality sequence data for >85% of the genome for three sets of individuals (...)” 2,500 samples at 4x by 2011.

12 High-throughput genotyping & sequencing
Whole-genome genotyping (from $300 USD/sample):
Affymetrix 6.0 chip: >900,000 SNPs, CNV probes, 82% coverage of CEU HapMap, accuracy 99.90%
Illumina Human1M BeadChip: >1 million SNPs, CNV probes, 95% coverage of CEU HapMap, accuracy 99.94%
Whole-genome sequencing (from $10,000 USD/sample):
Illumina HiSeq: x coverage, 100 bp read length
Complete Genomics: 40x coverage, 35 bp read length

13 Recent advances in assays/analysis of genetic variation. HapMap, 1000 Genomes. High-throughput genotyping & sequencing. Analytic methods: genome-wide association, stratification, imputation, CNV, risk prediction. Examples: recent GWAS.

14 Analytic methods: 1. Genome-wide association. (Figure: genotype matrix of individuals by SNPs, with individuals split into cases and controls.)
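For a single SNP, the case-control comparison on this slide can be sketched as a basic allelic test: count the two alleles in cases and in controls, then compute an odds ratio and a 1-df chi-square statistic. This is a minimal illustration only, not the analysis pipeline used in the talk; the function name `allelic_test` and the allele counts are hypothetical.

```python
import math


def allelic_test(case_minor, case_major, control_minor, control_major):
    """Allelic association test for one SNP from a 2x2 table of allele counts.

    Returns (odds_ratio, chi_square, p_value); the chi-square has 1 degree
    of freedom and no continuity correction is applied.
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a 1-df chi-square distribution, via the complementary error function
    p_value = math.erfc(math.sqrt(chi_square / 2))
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, chi_square, p_value


# Hypothetical counts: 1,000 cases and 1,000 controls, i.e. 2,000 alleles per group
odds_ratio, chi_square, p_value = allelic_test(300, 1700, 240, 1760)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi_square:.2f}, P = {p_value:.4f}")
```

In a genome-wide scan the same test is repeated for every SNP, so the resulting P values have to be judged against a threshold that accounts for the very large number of tests.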

15 Analytic methods: study designs, association tests, software and pros.
Unrelated individuals: tests of between-individual effects; software: many (eg. PLINK); pros: more power / $ spent, easier to collect and analyse.
Families: tests of between + within family effects; software: Merlin, etc; pros: assess inheritance (CNVs), robust to population stratification.

16 Analytic methods: 2. Population stratification. Genetic matching of individuals from populations A and B.
Ind1  Ind2  % shared
A1    A2    100
A1    A3    50
A1    A4    25
A1    A5    10
A1    A6    8
A1    B1    5

17 Analytic methods: 3. Imputation of unmeasured genotypes. A reference panel (eg. HapMap) is combined with the genotyped dataset (individuals by SNPs) to give a genotyped + imputed dataset. Software: MACH, IMPUTE, BEAGLE; PLINK (Shaun Purcell, Doug Ruderfer).

18 Combine data from studies genotyped using different platforms

Example 1: Bipolar Disorder GWAS Ferreira et al (2008) Nature Genetics 40: ,690 SNPs >1,7 million SNPs

20 ANK3: Ankyrin G. Cases: 7.0%; controls: 5.3%; odds ratio = 1.45. Not related to sex, psychosis or age-of-onset. Replicated recently: Smith et al (2009) Mol Psychiatry 14: Scott et al (2009) Proc Natl Acad Sci USA 106: [Lee et al (2010) Mol Psychiatry Apr 13 – Han Chinese population]

Example 2: analysis of lymphocyte subsets. Ferreira et al. (2010) Am J Hum Genet 86: ,538 individuals | CD4+ T cell levels, CD8+ T cell levels, CD4:CD8 ratio.
MHC class I, rs , C: increased CD8+ T levels; improved host control of HIV (OR=0.32, P=10⁻⁹).
MHC class II, rs , A: increased CD4+ T levels; protective effect for type-1 diabetes (OR = 0.04, P= ); protective effect for rheumatoid arthritis (OR=0.60, P= ).

Analytic methods: 4. Structural variants. Genomic alterations involving a segment of DNA >1kb: quantitative (copy number variants: deletions, duplications, insertions), positional (translocations), orientational (inversions).

Detection of CNVs Non-polymorphic probes McCarroll et al 2008 Nat Genet 40: 1166

Detection of CNVs: use polymorphic probes from genotyping arrays to identify and genotype new, potentially rarer CNVs. Example: rs A/G
probe 1: ...AGCCCGAAATGTTTTCAGA...
probe 2: ...AGCCCGAAGTGTTTTCAGA...
(Figure: intensity of probe 1 plotted against intensity of probe 2, showing the AA, AG and GG genotype clusters.)

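Detection of CNVs: copy number for each allele, by individual.
Ind  Genotype (Mat/Pat)  Copy number for: A  G  Total
1    A/G                                   1  1  2
2    A/-                                   1  0  1
3    AA/
4    /G                                    0  1  1
5    -/-                                   0  0  0
6    AAA/G                                 3  1  4
(Pattern column: diagrams of the alleles carried on each chromosome.)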
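The bookkeeping in this table can be reproduced in a few lines: each genotype lists the alleles on the maternal and paternal chromosomes, with "-" marking a deleted copy, and the copy numbers are simply allele counts. The sketch below is illustrative only; the function names and the string encoding of genotypes are assumptions made for the example, not the format used by any CNV-calling software.

```python
def allele_copy_numbers(genotype, alleles=("A", "G")):
    """Count copies of each allele in a 'Mat/Pat' genotype string.

    Each side of the slash lists the alleles carried on one chromosome;
    '-' stands for a deleted copy and contributes nothing to the counts.
    """
    maternal, paternal = genotype.split("/")
    combined = maternal + paternal
    unexpected = set(combined) - set(alleles) - {"-"}
    if unexpected:
        raise ValueError(f"unexpected allele symbols: {sorted(unexpected)}")
    counts = {allele: combined.count(allele) for allele in alleles}
    counts["total"] = sum(counts[allele] for allele in alleles)
    return counts


def cnv_state(total_copy_number):
    """Label a total copy number relative to the usual diploid value of 2."""
    if total_copy_number < 2:
        return "deletion"
    if total_copy_number > 2:
        return "duplication"
    return "normal"


for genotype in ["A/G", "A/-", "-/-", "AAA/G"]:
    counts = allele_copy_numbers(genotype)
    print(genotype, counts, cnv_state(counts["total"]))
```

On a real array the copy numbers are not observed directly; they have to be inferred from the normalized probe intensities for each allele.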

Detection of CNVs: polymorphic probe in a CNV region. (Figure: normalized intensity of allele A plotted against normalized intensity of allele G, showing the A/A, A/G and G/G clusters, individuals with deletion(s), ie. total CN < 2, and individuals with duplication(s), ie. total CN > 2.)

Detection of CNVs: combine information across probes to identify new CNVs. For example, a 100kb deletion on chr. 2: cases 10/5,000, controls 1/5,000.
Software: Birdseye (Korn et al 2008 Nat Genet 40: 1253), for Affy 5.0, 6.0; PennCNV (Wang et al 2007 Genome Res 17: 1665), for Affymetrix and Illumina.
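Because carrier counts for a rare CNV are small, an exact test is a natural way to compare cases and controls. The sketch below implements a two-sided Fisher's exact test with the standard library only, and reads the flattened example on this slide as 10 carriers among 5,000 cases versus 1 among 5,000 controls, which is an assumption about the original layout. The function name is hypothetical, and the talk does not state which test was actually used.

```python
from math import comb


def fisher_exact_two_sided(case_carriers, n_cases, control_carriers, n_controls):
    """Two-sided Fisher's exact test for carrier counts in cases and controls.

    Returns (odds_ratio, p_value). The P value sums the hypergeometric
    probabilities of every table that is no more likely than the observed one.
    """
    carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    denominator = comb(total, carriers)

    def prob(k):
        # Probability that k of the carriers fall among the cases
        return comb(n_cases, k) * comb(n_controls, carriers - k) / denominator

    k_min = max(0, carriers - n_controls)
    k_max = min(carriers, n_cases)
    p_observed = prob(case_carriers)
    p_value = sum(
        prob(k) for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)   # small tolerance for float ties
    )

    case_non = n_cases - case_carriers
    control_non = n_controls - control_carriers
    if control_carriers == 0 or case_non == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (case_carriers * control_non) / (control_carriers * case_non)
    return odds_ratio, min(p_value, 1.0)


# Counts as read from the slide (assumed): 10/5,000 cases vs 1/5,000 controls
odds_ratio, p_value = fisher_exact_two_sided(10, 5000, 1, 5000)
print(f"OR = {odds_ratio:.1f}, P = {p_value:.4f}")
```

The odds ratio here is large but rests on only 11 carriers in total, which is why rare-CNV findings such as those in the examples that follow are usually confirmed in independent replication samples.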

Example 3: Autism whole-genome CNV analysis. Weiss et al. N Engl J Med 2008; 358: 667. CNV calling: COPPER, Birdseye, CNAT.
Sample                        16p11        Cases    Controls  P
Discovery [Affy 500K]         Del (600kb)  5/1,441  3/4,      x
                              Dup          7/1,441  2/4,234
Replication 1 (CHB)           Del          5/512    0/
[array-CGH]                   Dup          4/512    0/434
Replication 2 (deCODE)        Del          3/299    2/18,     x
[Illumina]                    Dup          0/299    5/18,834
Deletion frequency in Iceland: autism 1%; psychiatric disorder 0.1%; general population 0.01%.
Inheritance   del  dup
inherited     2    6
de novo       10   1
unknown       1    4

Example 4: SCZ whole-genome CNV analysis (Shaun Purcell). Two analyses: genome-wide burden and specific loci. (Figure: CNVs in cases and controls plotted by chromosome.)

Genome-wide burden of rare CNVs in SCZ (Shaun Purcell). 3,391 patients with SCZ, 3,181 controls. Filter for 100kb: 6,753 CNVs.
Cases have greater rate of CNVs than controls: 1.15-fold increase, P = 3×10⁻⁵.
Rate of genic CNVs in cases versus controls: 1.18-fold increase, P = 5×10⁻⁶.
Rate of non-genic CNVs in cases versus controls: 1.09-fold increase, P = 0.16.
Results invariant to obvious statistical controls: array type, genotyping plate, sample collection site, mean probe intensity.

31 Similar successes for other common diseases

32 Number of confirmed loci before Jan 2006 versus Jan 2006 to Jan 2008, eg. Crohn’s Disease (31 loci, ~10% variance). Altshuler, Daly & Lander. Science 2008; 322: 881. Manolio, Brooks & Collins. J Clin Invest : 1590.

33 Summary. Tremendous recent technological advances. Large-scale genetic association studies feasible. >150 disease loci unequivocally identified since 2006. These provide a solid base to build our knowledge about disease mechanisms. Hundreds of loci yet to be identified for most diseases.