Conifer Translational Genomics Network Coordinated Agricultural Project www.pinegenome.org/ctgn Genomics in Tree Breeding and Forest Ecosystem Management.

Slides:



Advertisements
Similar presentations
15 The Genetic Basis of Complex Inheritance
Advertisements

Statistical methods for genetic association studies
Conifer Translational Genomics Network Coordinated Agricultural Project Genomics in Tree Breeding and Forest Ecosystem Management.
The genetic dissection of complex traits
Population structure.
What is an association study? Define linkage disequilibrium
Gene-by-Environment and Meta-Analysis Eleazar Eskin University of California, Los Angeles.
Association Tests for Rare Variants Using Sequence Data
Why this paper Causal genetic variants at loci contributing to complex phenotypes unknown Rat/mice model organisms in physiology and diseases Relevant.
Qualitative and Quantitative traits
Genetic research designs in the real world Vishwajit L Nimgaonkar MD, PhD University of Pittsburgh
ASSOCIATION MAPPING WITH TASSEL Presenter: VG SHOBHANA PhD Student CPMB.
METHODS FOR HAPLOTYPE RECONSTRUCTION
Multiple Comparisons Measures of LD Jess Paulus, ScD January 29, 2013.
Perspectives from Human Studies and Low Density Chip Jeffrey R. O’Connell University of Maryland School of Medicine October 28, 2008.
Genomics in Tree Breeding and Forest Ecosystem Management
Power and limitations of GWAS Aaron Lorenz Department of Agronomy and Horticulture.
Genome-wide association mapping Introduction to theory and methodology
Conifer Translational Genomics Network Coordinated Agricultural Project Genomics in Tree Breeding and Forest Ecosystem Management.
Objectives Cover some of the essential concepts for GWAS that have not yet been covered Hardy-Weinberg equilibrium Meta-analysis SNP Imputation Review.
Discovery of a rare arboreal forest-dwelling flying reptile (Pterosauria, Pterodactyloidea) from China Wang et al. PNAS Feb. 11, 2008.
Linkage Analysis: An Introduction Pak Sham Twin Workshop 2001.
Genomics in Tree Breeding and Forest Ecosystem Management
Association Mapping David Evans. Outline Definitions / Terminology What is (genetic) association? How do we test for association? When to use association.
MALD Mapping by Admixture Linkage Disequilibrium.
Plant of the day! Pebble plants, Lithops, dwarf xerophytes Aizoaceae
Association Modeling With iPlant
Lab 13: Association Genetics. Goals Use a Mixed Model to determine genetic associations. Understand the effect of population structure and kinship on.
Analysis of whole genome association studies in pedigreed populations
Abstract The goal of the Conifer Translational Genomics Network (CTGN) project is to provide tree breeders across the United States with new tools to enhance.
MSc GBE Course: Genes: from sequence to function Genome-wide Association Studies Sven Bergmann Department of Medical Genetics University of Lausanne Rue.
Give me your DNA and I tell you where you come from - and maybe more! Lausanne, Genopode 21 April 2010 Sven Bergmann University of Lausanne & Swiss Institute.
Quantitative Genetics
Module 7: Estimating Genetic Variances – Why estimate genetic variances? – Single factor mating designs PBG 650 Advanced Plant Breeding.
Introduction to BST775: Statistical Methods for Genetic Analysis I Course master: Degui Zhi, Ph.D. Assistant professor Section on Statistical Genetics.
Population Stratification
The Complexities of Data Analysis in Human Genetics Marylyn DeRiggi Ritchie, Ph.D. Center for Human Genetics Research Vanderbilt University Nashville,
A single-nucleotide polymorphism tagging set for human drug metabolism and transport Kourosh R Ahmadi, Mike E Weale, Zhengyu Y Xue, Nicole Soranzo, David.
Lecture 25: Association Genetics November 30, 2012.
1 Association Analysis of Rare Genetic Variants Qunyuan Zhang Division of Statistical Genomics Course M Computational Statistical Genetics.
Experimental Design and Data Structure Supplement to Lecture 8 Fall
Type 1 Error and Power Calculation for Association Analysis Pak Sham & Shaun Purcell Advanced Workshop Boulder, CO, 2005.
Jianfeng Xu, M.D., Dr.PH Professor of Public Health and Cancer Biology Director, Program for Genetic and Molecular Epidemiology of Cancer Associate Director,
INTRODUCTION TO ASSOCIATION MAPPING
Lab 13: Association Genetics December 5, Goals Use Mixed Models and General Linear Models to determine genetic associations. Understand the effect.
California Pacific Medical Center
February 20, 2002 UD, Newark, DE SNPs, Haplotypes, Alleles.
Separation of the largest eigenvalues in eigenanalysis of genotype data from discrete populations Katarzyna Bryc Postdoctoral Fellow, Reich Lab, Harvard.
C2BAT: Using the same data set for screening and testing. A testing strategy for genome-wide association studies in case/control design Matt McQueen, Jessica.
Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.
Strategies for Metabolomic Data Analysis Dmitry Grapov, PhD.
Genome-Wides Association Studies (GWAS) Veryan Codd.
Association Mapping in Families Gonçalo Abecasis University of Oxford.
Inferences on human demographic history using computational Population Genetic models Gabor T. Marth Department of Biology Boston College Chestnut Hill,
Power and Meta-Analysis Dr Geraldine M. Clarke Wellcome Trust Advanced Courses; Genomic Epidemiology in Africa, 21 st – 26 th June 2015 Africa Centre for.
Quantitative genetics
Common variation, GWAS & PLINK
Ø Novel approaches for linkage mapping in dairy cattle
upstream vs. ORF binding and gene expression?
Washington State University
Genome Wide Association Studies using SNP
Marker heritability Biases, confounding factors, current methods, and best practices Luke Evans, Matthew Keller.
Epidemiology 101 Epidemiology is the study of the distribution and determinants of health-related states in populations Study design is a key component.
Mapping Quantitative Trait Loci
Genome-wide Associations
Genome-wide Association Studies
Medical genomics BI420 Department of Biology, Boston College
Perspectives from Human Studies and Low Density Chip
Medical genomics BI420 Department of Biology, Boston College
Presentation transcript:

Conifer Translational Genomics Network Coordinated Agricultural Project Genomics in Tree Breeding and Forest Ecosystem Management Module 11 – Association Genetics Nicholas Wheeler & David Harry – Oregon State University

What is association genetics?  Association genetics is the process of identifying alleles that are disproportionately represented among individuals with different phenotypes. It is a population-based survey used to identify relationships between genetic markers and phenotypic traits –Two approaches for grouping individuals –By phenotype (e.g., healthy vs. disease) –By marker genotype (similar to approach used in QTL studies) –Two approaches for selecting markers for evaluation –Candidate gene –Whole genome

Association genetics: conceptual example

Comparing the approaches

Essential elements of association genetics  Appropriate populations –Detection –Verification  Good phenotypic data  Good genotypic data –SNPs: Number determined by experimental approach –Quality of SNP calls –Missing data  Appropriate analytical approach to detect significant associations

Flowchart of a gene association study Modified from Flint-Garcia et al. 2005

An association mapping population with known kinship Figures courtesy of CFGRP – University of Florida

Phenotyping: Precision, accuracy, and more Figures courtesy of Gary Peter – University of Florida

Genotyping: Potential genomic targets Nicholas Wheeler and David Harry, Oregon State University

Whole genome or candidate gene? Let’s look again at how this works Rafalski COPB 5:

Local distribution of SNPs and genes Jorgenson & Witte Nat Rev Genet 7:

Candidate genes for novel (your) species  Availability of candidate genes –Positional candidates –Functional studies –Model organisms –Genes identified in other forest trees

Candidate genes for association studies Wheeler et al Mol Breed 15: Figures courtesy of Kostya Krutovsky, Texas A&M University

Potential genotyping pitfalls  Quality of genotype data –Contract labs, automated base calls  Minor allele frequency –Use minimum threshold, e.g., MAF ≥ 0.05, or MAF ≥ 0.10 –Rare alleles can cause spurious associations due to small samples (recall that D’ is unstable with rare alleles)  Missing data !!! –Alternative methods for imputing missing data

Statistical tests for marker/trait associations  SNP by trait association testing is, at its core, a simple test of correlation/regression between traits  In reality such cases rarely exist and more sophisticated approaches are required. These may take the form of mixed models that account for potential covariates and other sources of variance  The principle covariates of concern are population structure and kinship or relatedness, both of which may result in LD between marker and QTN that is not predictive for the population as a whole

Causes of population structure  Geography –Adaptation to local conditions (selection)  Non-random mating – Isolation / bottlenecks (drift) – Assortative mating – Geographic isolation  Population admixture (migration)  Co-ancestry

Case-control and population structure Marchini et al Nat.Genet. 36:

Accommodating population structure  Avoid the problem by avoiding admixted populations or working with populations of very well defined co-ancestry  Use statistical tools to make appropriate adjustments

Detecting and accounting for population structure  Family based methods  Population based methods –Genomic control (GC) –Structured association (SA) –Multivariate  Mixed model analyses (test for association)

Family based approaches  Avoid unknown population structure by following marker-trait inheritance in families (known parent-offspring relationships)  Common approaches include –Transmission disequilibrium test (TDT) for binary traits –Quantitative transmission disequilibrium test (QTDT) for quantitative traits –Both methods build upon Mendelian inheritance of markers within families  Test procedure –Group individuals by phenotype –Look for markers with significant allele frequency differences between groups  For a binary trait such as disease, use families with affected offspring  Constraints –Family structures much be known (e.g., pedigree) –Limited samples

Population based: Genomic control  Because of shared ancestry, population structure should translate into an increased level of genetic similarity distributed throughout the genome of related individuals  By way of contrast, the expectation for a causal association would be a gene specific effect  Genomic control (GC) process –Neutral markers (e.g., SSRs) are used to estimate the overall level of genetic similarity within a sampled population –In turn, this proportional increase in similarity is used as an inflation factor, sometimes called, used to adjust significance probabilities (p- values) –For example –Typical values of are in the range of ~ P-value (adj) = P-value (unadj) /(1+ )

Structured association  The general idea behind structured association (SA) is that cryptic population history (or admixture) causes increased genetic similarity within groups  The challenge is to determine how many groups (K) are represented, and then to quantify group affinities for each individual  Correction factors are applied separately to each individual, based upon the inferred group affinities  SA is computationally demanding

Multivariate methods  Multivariate methods build upon co-variances among marker genotypes  Multivariate methods such as PCA offer several advantages over SA  Downstream analysis of SA and PCA data are similar

Mixed model approaches  Mixed models test for association by taking into account factors such as kinship and population structure, provided by other means  Provides good control of both Type 1 (false positive associations) and Type 2 (false negative associations) errors Bradbury et al (19): TASSEL: Software for association mapping of complex traits in diverse samples

Yu et al. 2006

Q-Q plots in GWAS Pearson & Manolio JAMA 299:

Significant associations for diabetes distributed across the human genome McCarthy et al Nat. Rev. Genet. 9:

Association genetics: Concluding comments  Advantages – Populations – Mapping precision – Scope of inference  Drawbacks – Resources required – Confounding effects – Repeatability

References cited in this module  Bradbury, P. Z. Zhang et al TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23(19): )  Flint-Garcia, S, A. Thuillet et al Maize association population: A high-resolution platform for quantitative tgrait locus dissection. Plant Journal 44:  Jorgenson, E. and J. Witte A gene-centric approach to genome- wide association studies. Nature Reviews Genetics 7:  Marchini, J., L. Cardon et al The effects of human population structure on large genetic association studies. Nature Genetics 36(5):  McCarthy, M., G Abecasis et al. (2008) Genome-wide assoicaiton studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics 9(5):

References cited in this module  Neale, D. and O. Savolainen Association genetics of complex traits in conifers. Trends in Plant Science 9(7):  Pearson, T. and T. Manolio How to interpret a genome-wide association study. Journal of the American Medical Association 299(11):  Rafalski, A Applications of single nubleotide polymorphisms in crop genetics. Current Opinion in Plant Biology 5(2):  Wheeler,N., K. Jermstad, et al Mapping of quantitative trait loci controlling adaptive traits in coastal Doublas-fir. IV. Cold-hardiness QTL verification and candidate gene mapping. Molecular Breeding 15(2):  Yu, J., G. Pressoir et al A unified mixed-model method for association mapping that accounts for multiple leveles of relatedness. Nature Genetics 38(2):

Thank You. Conifer Translational Genomics Network Coordinated Agricultural Project