Association mapping: finding genetic variants for common traits & diseases Manuel Ferreira Queensland Institute of Medical Research Brisbane Genetic Epidemiology.

1 Association mapping: finding genetic variants for common traits & diseases. Manuel Ferreira, Genetic Epidemiology, Queensland Institute of Medical Research, Brisbane. WEHI Postgraduate seminar, 31 May 2010

2 Why? Understand disease aetiology. Predict disease risk / drug response (personalized medicine). Lancet 2010; 375: 1525–35

3 Rare, monogenic traits. Ng et al. Nature Genetics 2010; 42: 30-35.

4 Common, complex traits

5 GENETICS OF COMMON DISEASES [Timeline, 1990 to 2015: phenotypic modelling, linkage analysis, association analysis]

6 Recent advances in assays and analysis. Genetic variation: HapMap, 1000 Genomes. High-throughput genotyping & sequencing. Analytic methods: genome-wide association, imputation, stratification, CNVs, risk prediction.

7 HapMap project 1. GOALS
“The HapMap was designed to determine the frequencies and patterns of association among roughly 3 million common Single Nucleotide Polymorphisms (SNPs) in four populations, for use in genetic association studies.” [4]
[Figure: genotype matrix of individuals by SNPs]
[1] The International HapMap Consortium. Nature 2003; 426: 789.
[2] The International HapMap Consortium. Nature 2005; 437: 1299.
[3] The International HapMap Consortium. Nature 2007; 449: 851.
[4] Manolio et al. J Clin Invest 2008; 118: 1590.

8 HapMap project 2. STRATEGY
Samples: 30 trios, Yoruba in Ibadan, Nigeria (YRI); 30 trios of European descent in Utah (CEU); 45 unrelated Han Chinese from Beijing (CHB); 45 unrelated Japanese from Tokyo (JPT).
Genome-wide SNP discovery (dbSNP): 1.7 million SNPs (2002); 9.2 million (2005); 14.7 million, of which 6.5 million validated (2009).
SNP selection: Phase 1: MAF>0.05, validated, non-synonymous SNPs prioritised (1.27 million total). Phases 2 and 3 expanded SNP (4 million) and population (11) coverage.
Genotyping: 7 genotyping platforms used/developed by 12 centres.
http://www.hapmap.org/

9 HapMap project 3. OUTCOMES
“Systematic” catalogue of common human variation.
Linkage disequilibrium (LD) or correlation between SNPs (tagging, fine-mapping, imputation).
Designing and refining high-throughput genotyping platforms.
Population genetics (selection, sub-structure, recombination & mutation).

10 Using HapMap SNPs [Figure: haplotypes for Gene A]
Correlation (LD) between SNPs: D’ and r² (Haploview, Tagger).
SNP tags, e.g. SNP 1 ‘tags’ 4/10 variants.
Genetic coverage: proportion of known SNPs tagged (Haploview).
Fine-mapping: interesting SNPs to follow up.
Cross-study comparisons.
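The two LD measures named on this slide, D’ and r², can be computed from haplotype and allele frequencies. The following is a minimal sketch using the standard textbook definitions; the function name `ld_stats` and the example frequencies are illustrative assumptions, not values from the talk or from Haploview.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A (SNP 1) and allele B (SNP 2)
    """
    # D is the excess of the observed haplotype frequency over independence.
    d = p_ab - p_a * p_b
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two alleles.
    var = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / var if var > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies: allele B (20%) only ever occurs with allele A (50%).
d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.5, p_b=0.2)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

In this hypothetical case D’ is at its maximum while r² is well below 1, which is why r² rather than D’ is usually the quantity of interest when asking how well one SNP tags another.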

11 1000 Genomes project GOAL
“The 1000 Genomes Project aims to achieve a nearly complete catalog of common human genetic variants (defined as frequency 1% or higher) by generating high-quality sequence data for >85% of the genome for three sets of 400-500 individuals (...)”
2,500 samples at 4x by 2011.
http://www.1000genomes.org/

12 High-throughput genotyping & sequencing
Whole-genome genotyping (from $300 USD/sample):
  Affymetrix 6.0 chip: >900,000 SNPs, CNV probes, 82% coverage of CEU HapMap, accuracy 99.90%
  Illumina Human1M BeadChip: >1 million SNPs, CNV probes, 95% coverage of CEU HapMap, accuracy 99.94%
Whole-genome sequencing (from $10,000 USD/sample):
  Illumina HiSeq 2000: 30x coverage, 100 bp read length
  Complete Genomics: 40x coverage, 35 bp read length

13 Recent advances in assays and analysis. Genetic variation: HapMap, 1000 Genomes. High-throughput genotyping & sequencing. Analytic methods: genome-wide association, stratification, imputation, CNV, risk prediction. Examples: recent GWAS.

14 Analytic methods 1. GENOME-WIDE ASSOCIATION [Figure: genotype matrix of individuals (cases, controls) by SNPs]

15 Analytic methods
Study designs      Unrelated individuals                                   Families
Association tests  Between-individual effects                              Between- + within-family effects
Software           Many (e.g. PLINK)                                       Merlin, etc.
Pros               More power per $ spent; easier to collect and analyse   Assess inheritance (CNVs); robust to population stratification
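For unrelated cases and controls, a between-individual test at a single SNP can be as simple as a chi-square test on allele counts. The sketch below is one common form of that test, written for illustration only; it is not taken from the talk and is not PLINK's implementation. The function name `allelic_test` and the counts are assumptions.

```python
import math


def allelic_test(case_minor, case_total, ctrl_minor, ctrl_total):
    """Chi-square test (1 df) and odds ratio on a 2x2 table of allele counts.

    case_minor / ctrl_minor: number of minor alleles seen in cases / controls
    case_total / ctrl_total: total alleles counted (2 x number of individuals)
    """
    a, b = case_minor, case_total - case_minor
    c, d = ctrl_minor, ctrl_total - ctrl_minor
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 1,000 cases and 1,000 controls (2,000 alleles each).
chi2, p, odds = allelic_test(case_minor=140, case_total=2000,
                             ctrl_minor=106, ctrl_total=2000)
print(f"chi2 = {chi2:.2f}, P = {p:.3g}, OR = {odds:.2f}")
```

A genome-wide scan repeats a test of this kind at every SNP, so the resulting P values need to be interpreted against a much stricter threshold than 0.05.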

16 Analytic methods 2. POPULATION STRATIFICATION
Genetic matching [Figure: individuals from groups A and B]
Ind1  Ind2  % shared
A1    A2    100
A1    A3    50
A1    A4    25
A1    A5    10
A1    A6    8
A1    B1    5
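A pairwise sharing value of this kind can be estimated directly from genotype data. The sketch below uses identity-by-state (IBS) allele sharing as the measure; that choice, the function name `ibs_sharing` and the toy genotypes are all assumptions made for illustration, since the slide does not say how "% shared" was computed.

```python
def ibs_sharing(g1, g2):
    """Mean proportion of alleles shared identical-by-state (IBS) across SNPs.

    Genotypes are coded as minor-allele counts (0, 1 or 2); None marks a missing call.
    """
    shared = 0
    compared = 0
    for x, y in zip(g1, g2):
        if x is None or y is None:
            continue  # skip SNPs where either individual is missing
        shared += 2 - abs(x - y)  # 2, 1 or 0 alleles shared at this SNP
        compared += 2
    return shared / compared if compared else float("nan")


# Hypothetical genotypes at five SNPs for three individuals.
a1 = [0, 1, 2, 1, 0]
a2 = [0, 1, 2, 1, 0]
b1 = [2, 0, 0, 2, None]
print(ibs_sharing(a1, a2), ibs_sharing(a1, b1))
```

Genetic matching could then pair each case with the controls that share the most with it, so that comparisons are made within genetically similar groups.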

17 Analytic methods 3. IMPUTATION OF UNMEASURED GENOTYPES
Reference panel (e.g. HapMap) combined with the genotyped dataset gives a genotyped + imputed dataset. [Figure: genotype matrices of individuals by SNPs]
Software: MACH, IMPUTE, BEAGLE; PLINK (Shaun Purcell, Doug Ruderfer).
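The idea behind imputation can be shown with a deliberately simplified toy: find the reference haplotype that best matches an individual at the typed SNPs, then copy its alleles at the untyped ones. MACH, IMPUTE and BEAGLE use probabilistic models rather than this nearest-match rule, so the sketch below is a stand-in for intuition only; the function name `impute_haplotype` and the data are assumptions.

```python
def impute_haplotype(target, reference_panel):
    """Fill missing alleles (None) in `target` from the closest reference haplotype.

    target: list of alleles (0/1) with None at untyped SNPs
    reference_panel: list of fully typed haplotypes of the same length
    """
    def mismatches(ref):
        # Count disagreements at SNPs that were actually genotyped in the target.
        return sum(1 for t, r in zip(target, ref) if t is not None and t != r)

    best = min(reference_panel, key=mismatches)
    # Keep observed alleles; copy the reference allele where the target is missing.
    return [r if t is None else t for t, r in zip(target, best)]


# Hypothetical reference panel of three haplotypes over four SNPs.
panel = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
# SNPs 2 and 4 were not on the genotyping array.
print(impute_haplotype([1, None, 0, None], panel))
```

Because the filled-in alleles come from LD patterns in the reference panel, studies typed on different arrays can be brought onto a common set of SNPs before they are combined.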

18 Combine data from studies genotyped using different platforms

19 Example 1: Bipolar Disorder GWAS. Ferreira et al (2008) Nature Genetics 40: 1056. 325,690 SNPs; >1.7 million SNPs.

20 ANK3: Ankyrin G
Cases: 7.0%. Controls: 5.3%. Odds ratio = 1.45.
Not related to sex, psychosis or age-of-onset.
Replicated recently: Smith et al (2009) Mol Psychiatry 14: 755-63. Scott et al (2009) Proc Natl Acad Sci USA 106: 7501-6. [Lee et al (2010) Mol Psychiatry Apr 13, Han Chinese population]

21 Example 2: analysis of lymphocyte subsets. Ferreira et al. (2010) Am J Hum Genet 86: 88-92
2,538 individuals | CD4+ T cell levels, CD8+ T cell levels, CD4:CD8 ratio
MHC class I, rs2524054, C: increased CD8+ T levels; improved host control of HIV (OR = 0.32, P = 10^-9).
MHC class II, rs9270986, A: increased CD4+ T levels; protective effect for type-1 diabetes (OR = 0.04, P = 10^-125); protective effect for rheumatoid arthritis (OR = 0.60, P = 10^-15).

22 Analytic methods 4. STRUCTURAL VARIANTS
Genomic alterations involving a segment of DNA >1kb.
Quantitative (copy number variants): deletions, duplications, insertions.
Positional (translocations).
Orientational (inversions).

23 Detection of CNVs. Non-polymorphic probes. McCarroll et al 2008 Nat Genet 40: 1166

24 Detection of CNVs
Use polymorphic probes from genotyping arrays to identify and genotype new, potentially rarer CNVs.
Example: rs1006737 A/G
  probe 1: ...AGCCCGAAATGTTTTCAGA...
  probe 2: ...AGCCCGAAGTGTTTTCAGA...
[Plot: intensity of probe 1 against intensity of probe 2, with genotype clusters AA, AG, GG]

25 Detection of CNVs
Ind  Genotype (Mat/Pat)  Copies of A  Copies of G  Total
1    A/G                 1            1            2
2    A/-                 1            0            1
3    AA/-                2            0            2
4    -/G                 0            1            1
5    -/-                 0            0            0
6    AAA/G               3            1            4
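The copy-number columns in this table follow directly from the Mat/Pat genotype strings, as the short sketch below shows. The function name `copy_numbers` is an assumption; the six genotypes are the ones listed on the slide.

```python
def copy_numbers(genotype):
    """Count copies of alleles A and G in a 'maternal/paternal' genotype string.

    A '-' stands for a deleted copy, so 'AA/-' means a maternal duplication
    of A together with a paternal deletion.
    """
    a = genotype.count("A")
    g = genotype.count("G")
    return a, g, a + g


for ind, genotype in enumerate(["A/G", "A/-", "AA/-", "-/G", "-/-", "AAA/G"], start=1):
    a, g, total = copy_numbers(genotype)
    print(ind, genotype, a, g, total)
```

Individual 3 illustrates why allele-specific counts matter: the total copy number is 2, as in a normal diploid genotype, yet there is no copy of G.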

26 Detection of CNVs. Polymorphic probe in CNV region. [Plot: normalized intensity of allele A against normalized intensity of allele G, with genotype clusters A/A, A/G, G/G; individuals with deletion(s), i.e. total CN < 2, and individuals with duplication(s), i.e. total CN > 2, are highlighted]

27 Detection of CNVs
Combine information across probes to identify new CNVs.
For example, a 100kb deletion on chr. 2: cases 10/5,000, controls 1/5,000.
Birdseye (Affy 5.0, 6.0): Korn et al 2008 Nat Genet 40: 1253
PennCNV (Affymetrix and Illumina): Wang et al 2007 Genome Res 17: 1665
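Once a CNV has been called, carrier counts such as 10/5,000 cases versus 1/5,000 controls can be compared with an exact test, which suits rare events better than a chi-square approximation. The sketch below is a one-sided Fisher's exact test; the slide does not state which test was used, so the choice of test and the function name `fisher_one_sided` are assumptions.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """One-sided Fisher's exact P value for an excess of carriers among cases.

    With the total number of carriers held fixed, this is the hypergeometric
    probability of seeing at least `case_carriers` of them among the cases.
    """
    carriers = case_carriers + ctrl_carriers
    total = n_cases + n_ctrls
    denominator = comb(total, carriers)
    upper = min(carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_ctrls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


# Carrier counts from the slide's example deletion.
p = fisher_one_sided(case_carriers=10, n_cases=5000, ctrl_carriers=1, n_ctrls=5000)
print(f"one-sided P = {p:.4f}")
```

The same function applies to any rare variant for which carriers can be counted in cases and controls.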

28 Example 3: Autism whole-genome CNV analysis. Weiss et al. N Engl J Med 2008; 358: 667
CNV calling: COPPER, Birdseye, CNAT
Sample                               16p11        Cases    Controls  P
Discovery [Affy 500K]                Del (600kb)  5/1,441  3/4,234   1.1 x 10^-4
                                     Dup          7/1,441  2/4,234
Replication 1 (CHB) [array-CGH]      Del          5/512    0/434     0.007
                                     Dup          4/512    0/434
Replication 2 (deCODE) [Illumina]    Del          3/299    2/18,834  4.2 x 10^-4
                                     Dup          0/299    5/18,834
Inheritance  Del  Dup
inherited    2    6
de novo      10   1
unknown      1    4
Deletion frequency in Iceland: autism 1%; psychiatric disorder 0.1%; general population 0.01%.

29 Example 4: SCZ whole-genome CNV analysis (Shaun Purcell). Genome-wide burden; specific loci. [Figure: CNVs in cases and controls plotted by chromosome]

30 Genome-wide burden of rare CNVs in SCZ (Shaun Purcell)
3,391 patients with SCZ, 3,181 controls. Filter for 100kb: 6,753 CNVs.
Cases have a greater rate of CNVs than controls: 1.15-fold increase, P = 3×10^-5.
Rate of genic CNVs in cases versus controls: 1.18-fold increase, P = 5×10^-6.
Rate of non-genic CNVs in cases versus controls: 1.09-fold increase, P = 0.16.
Results invariant to obvious statistical controls: array type, genotyping plate, sample collection site, mean probe intensity.
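A burden comparison of this kind asks whether cases carry more CNVs per person than controls, and a permutation test is one natural way to attach a P value to the fold increase. The sketch below shuffles case/control labels to build the null distribution; it is an illustration under that assumption rather than the analysis behind the slide, and the function name `burden_permutation_test` and the toy counts are hypothetical.

```python
import random


def burden_permutation_test(case_counts, control_counts, n_perm=10000, seed=0):
    """Fold increase in mean CNVs per person (cases / controls) with a
    one-sided permutation P value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    n_cases = len(case_counts)
    pooled = list(case_counts) + list(control_counts)

    def fold_increase(values):
        case_mean = sum(values[:n_cases]) / n_cases
        ctrl_mean = sum(values[n_cases:]) / (len(values) - n_cases)
        return case_mean / ctrl_mean if ctrl_mean > 0 else float("inf")

    observed = fold_increase(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign labels at random
        if fold_increase(pooled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical per-person CNV counts for a handful of cases and controls.
cases = [2, 1, 3, 0, 2, 1, 2, 1]
controls = [1, 0, 1, 2, 0, 1, 1, 0]
fold, p = burden_permutation_test(cases, controls, n_perm=2000)
print(f"fold increase = {fold:.2f}, permutation P = {p:.3f}")
```

Restricting the counts to genic or non-genic CNVs before calling the function would give the corresponding stratified comparisons.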

31 Similar successes for other common diseases

32 Crohn’s Disease (31 loci, ~10% variance). [Bar chart: N confirmed loci, before Jan 2006 versus Jan 2006 to Jan 2008]
http://www.genome.gov/gwastudies
Altshuler, Daly & Lander. Science 2008; 322: 881
Manolio, Brooks & Collins. J Clin Invest 2008; 118: 1590

33 Summary
Tremendous recent technological advances.
Large-scale genetic association studies feasible.
>150 disease loci unequivocally identified since 2006.
Provide a solid base to build our knowledge about disease mechanisms.
Hundreds of loci yet to be identified for most diseases.

