Future Directions Pak Sham, HKU Boulder 2007. Genetics of Complex Traits Quantitative GeneticsGene Mapping Functional Genomics.

Slides:



Advertisements
Similar presentations
Statistical methods for genetic association studies
Advertisements

Analysis of imputed rare variants
What is an association study? Define linkage disequilibrium
Association Tests for Rare Variants Using Sequence Data
Genetic research designs in the real world Vishwajit L Nimgaonkar MD, PhD University of Pittsburgh
Meta-analysis for GWAS BST775 Fall DEMO Replication Criteria for a successful GWAS P
Genetic Analysis in Human Disease
Perspectives from Human Studies and Low Density Chip Jeffrey R. O’Connell University of Maryland School of Medicine October 28, 2008.
Discovery of a rare arboreal forest-dwelling flying reptile (Pterosauria, Pterodactyloidea) from China Wang et al. PNAS Feb. 11, 2008.
Basics of Linkage Analysis
. Parametric and Non-Parametric analysis of complex diseases Lecture #6 Based on: Chapter 25 & 26 in Terwilliger and Ott’s Handbook of Human Genetic Linkage.
Linkage Analysis: An Introduction Pak Sham Twin Workshop 2001.
Association Mapping David Evans. Outline Definitions / Terminology What is (genetic) association? How do we test for association? When to use association.
MALD Mapping by Admixture Linkage Disequilibrium.
Ingredients for a successful genome-wide association studies: A statistical view Scott Weiss and Christoph Lange Channing Laboratory Pulmonary and Critical.
More Powerful Genome-wide Association Methods for Case-control Data Robert C. Elston, PhD Case Western Reserve University Cleveland Ohio.
MSc GBE Course: Genes: from sequence to function Genome-wide Association Studies Sven Bergmann Department of Medical Genetics University of Lausanne Rue.
Genome-Wide Association Studies Xiaole Shirley Liu Stat 115/215.
Genomewide Association Studies.  1. History –Linkage vs. Association –Power/Sample Size  2. Human Genetic Variation: SNPs  3. Direct vs. Indirect Association.
Give me your DNA and I tell you where you come from - and maybe more! Lausanne, Genopode 21 April 2010 Sven Bergmann University of Lausanne & Swiss Institute.
Statistical Power Calculations Boulder, 2007 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Genetic Analysis in Human Disease. Learning Objectives Describe the differences between a linkage analysis and an association analysis Identify potentially.
Shaun Purcell & Pak Sham Advanced Workshop Boulder, CO, 2003
Linkage and LOD score Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Standardization of Pedigree Collection. Genetics of Alzheimer’s Disease Alzheimer’s Disease Gene 1 Gene 2 Environmental Factor 1 Environmental Factor.
Comments on Rare Variants Analyses Ryo Yamada Kyoto University 2012/08/27 Japan.
Introduction to BST775: Statistical Methods for Genetic Analysis I Course master: Degui Zhi, Ph.D. Assistant professor Section on Statistical Genetics.
Process of Genetic Epidemiology Migrant Studies Familial AggregationSegregation Association StudiesLinkage Analysis Fine Mapping Cloning Defining the Phenotype.
IAP workshop, Ghent, Sept. 18 th, 2008 Mixed model analysis to discover cis- regulatory haplotypes in A. Thaliana Fanghong Zhang*, Stijn Vansteelandt*,
The Complexities of Data Analysis in Human Genetics Marylyn DeRiggi Ritchie, Ph.D. Center for Human Genetics Research Vanderbilt University Nashville,
Calculation of IBD State Probabilities Gonçalo Abecasis University of Michigan.
CS177 Lecture 10 SNPs and Human Genetic Variation
Genome-Wide Association Study (GWAS)
BGRS 2006 SEARCH FOR MULTI-SNP DISEASE ASSOCIATION D. Brinza, A. Perelygin, M. Brinton and A. Zelikovsky Georgia State University, Atlanta, GA, USA 123.
Whole genome association studies Introduction and practical Boulder, March 2009.
Experimental Design and Data Structure Supplement to Lecture 8 Fall
Type 1 Error and Power Calculation for Association Analysis Pak Sham & Shaun Purcell Advanced Workshop Boulder, CO, 2005.
Jianfeng Xu, M.D., Dr.PH Professor of Public Health and Cancer Biology Director, Program for Genetic and Molecular Epidemiology of Cancer Associate Director,
Methods in genome wide association studies. Norú Moreno
Regression-Based Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Tutorial #10 by Ma’ayan Fishelson. Classical Method of Linkage Analysis The classical method was parametric linkage analysis  the Lod-score method. This.
Lecture 24: Quantitative Traits IV Date: 11/14/02  Sources of genetic variation additive dominance epistatic.
Linear Reduction Method for Tag SNPs Selection Jingwu He Alex Zelikovsky.
Errors in Genetic Data Gonçalo Abecasis. Errors in Genetic Data Pedigree Errors Genotyping Errors Phenotyping Errors.
Genome wide association studies (A Brief Start)
The International Consortium. The International HapMap Project.
C2BAT: Using the same data set for screening and testing. A testing strategy for genome-wide association studies in case/control design Matt McQueen, Jessica.
Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.
Fast test for multiple locus mapping By Yi Wen Nisha Rajagopal.
Multiple-Locus Genome-Wide Association Testing David Dean CSE280A.
Chapter 22 - Quantitative genetics: Traits with a continuous distribution of phenotypes are called continuous traits (e.g., height, weight, growth rate,
Powerful Regression-based Quantitative Trait Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Introduction to Genetic Theory
Association tests. Basics of association testing Consider the evolutionary history of individuals proximal to the disease carrying mutation.
An atlas of genetic influences on human blood metabolites Nature Genetics 2014 Jun;46(6)
Association Mapping in Families Gonçalo Abecasis University of Oxford.
Regression Models for Linkage: Merlin Regress
Common variation, GWAS & PLINK
upstream vs. ORF binding and gene expression?
Genome Wide Association Studies using SNP
Regression-based linkage analysis
Epidemiology 101 Epidemiology is the study of the distribution and determinants of health-related states in populations Study design is a key component.
Mapping Quantitative Trait Loci
Correlation for a pair of relatives
Pak Sham & Shaun Purcell Twin Workshop, March 2002
Lecture 9: QTL Mapping II: Outbred Populations
IBD Estimation in Pedigrees
Perspectives from Human Studies and Low Density Chip
Presentation transcript:

Future Directions Pak Sham, HKU Boulder 2007

Genetics of Complex Traits Quantitative GeneticsGene Mapping Functional Genomics

Gene Mapping – GWA Era The promise Detect all the “big” genetic players Understand how they interact with each other and with the environment Generation of detailed hypotheses regarding etiology The challenges How to get enough funding? How to get the most out of them? Study design Data analysis

GWA: Taking the Plunge Age-related Macular Degeneration 96 cases / 50 controls 100,000 SNPs Klein et al. (2005)

CARD15 IBD5 ATG16L1 IL23R conf From Dr Mark Daly946 cases, 977 controls

NOTCH-2 (1) TCF7L2 (3) Type 2 Diabetes Mellitus Genome-wide Association Results From Dr Mark Daly

Design of GWA Studies Reducing cost Shared pool of control subjects Split sample designs Two stage: GWA  replication Split-half: e.g. 250K (Sty / Nsp) in each half Maximizing information Choosing most extreme (genetically loaded) subjects Choosing most accurately and comprehensively phenotyped subjects

Mining GWA Data Aim To squeeze out all the information from the vast amount of genotype data Strategy Optimal genotype calls Thorough data cleaning Sample characterization (stratification) Apply multiple statistical methods Replication

WGA Analysis Flow Raw probe signalsCopy number variations Genotype calls Cleaned genotype data Expanded genotype data (with imputations) Sample stratification CNV –phenotype relationship Stratified single SNP association tests Other analyses

Deletion CNVs

CNV and Disease CNVs are common throughout the genome can influence gene function (increased / decreased levels) can influence disease susceptibility Charcot Marie-Tooth disease type A (PMP22) Early-onset Parkinson’s disease (alpha-synuclein) Susceptibility to HIV infection (CCL3L1)

Family-based Tests “DFAM” Test (implemented in PLINK) Break pedigree into nuclear families Consider count of minor allele (X) among affected offspring Calculate expectation and variance of (X) conditional on parental genotypes (if available) or sibship genotypes Calculate single test statistics

Combining Studies Imputation to establish common SNP set and then Combine data and use stratified association analysis Meta-analysis – combine odds ratios (inverse variance weighting)

Looking for Epistasis Epistatic components not detectable by single-locus association analyses Simple methods of epistasis analyses Test of homogeneity of odds ratios or means differences across genotypic strata Test of interaction in logistic or linear regression models Test for correlation between unlinked loci Test for difference in correlation between loci, in cases and controls Increases multiple testing: e.g. 500,000 SNPs leads to 124,999,750,000 possible pairs of SNPs

Epistasis: Is it worth doing? Marchini et al (2005) compared 4 analytic strategies using simulated data with epistasis (1) One single-locus analysis (2) Two single-locus analyses (3) All possible pairs of loci (4) Two-stage: pairs of loci with low p- values from single-locus analysis Results: Strategies (3) and (4) are often more powerful than (1) and (2) even after Bonferroni adjustment

Will GWA catch all? Certainly not! Alleles of small effects Rare variants “Residual” genetic variation

Using Functional Information A B C D E F G H I J Canonical correlation analysis Gene-gene interaction Gene-based tests: Haplotypic / Allelic Pathways –based tests

Population-based Linkage Linkage is possible only in families, BUT Every pair of individuals are related if traced back far enough Genetic relationship (overall genomic sharing) can be estimated from GWA data Local IBD sharing can be also estimated from GWA data Therefore IBD sharing can be correlated with phenotypic similarity in GWA data Likely to be useful for rare phenotypes with rare variants of moderately strong effects

Estimating Genome-wide IBD  (p i q i ) 2 4  p i q i (1-2p i q i )N-2  p i q i (2-3p i q i ) 10 2  p i q i N - 2  p i q i 200N IBD Expected number of SNPs with IBS = N : Number of SNPs; p, q : allele frequencies

Estimating Genome-wide IBD N 0 = Number of IBS0 SNPs N 1 = Number of IBS1 SNPs N 2 = Number of IBS2 SNPs f 0 = Proportion of genome IBD0 f 1 = Proportion of genome IBD1 f 2 = Proportion of genome IBD2

Estimating Genome-wide IBD The estimated genome-wide IBD proportions, obtained by solving the linear equations, are: Boundary conditions Small sample (rare allele) adjustment Inbreeding adjustment

HapMap Relationships B C A B C A P(IBD=0)P(IBD=1)P(IBD=2) A A C Two Yoruba Trios

Distribution of Genomic Sharing: Among CEPH Founders Average sharing ~ 1.6% (Average kinship ~ 0.8%)

Estimating Segmental Sharing Prune SNPs to reduce LD relationships Use allele frequencies to calculate likelihood of IBD given single SNP genotypes of the pair of individuals Use genome-wide IBD to estimate the least number of meioses that separate the two genomes Use number of meioses to calculate transition matrix of IBD states Use Hidden Markov Model to calculate “multipoint” IBD probabilities

Transition Matrix : Recombination fraction m : Number of meioses (  2) IBD(i) IBD(i+1)

Estimated Segmental Sharing YRI: NA19130, NA (Half aunt / Half niece) Overall: IBD0 = 0.76, IBD1 = 0.24, IBD2 = 0

Estimated Segmental Sharing YRI: NA18913, NA19240 (Grandparent / Grandchild) Overall: IBD0 = 0.44, IBD1 = 0.56, IBD2 = 0

Other Data Mining Methods The possibilities are endless! Neural networks CART MARS etc …….

Beyond GWA Incorporating measured genotypes into quantitative genetic- epidemiological analysis Functional genomic studies – gene expression profiles, cell biology, etc.

Summary The technology for GWA is reaching maturity GWA is already yielding novel susceptibility loci for complex diseases GWA are increasing in number and in size GWA data offer interesting analytical and computational challenges The results from GWA studies will revolutionize quantitative genetics and functional genomics

The End