A Statistical Method for Adjusting Covariates in Linkage Analysis With Sib Pairs Colin O. Wu, Gang Zheng, JingPing Lin, Eric Leifer and Dean Follmann Office.

Slides:



Advertisements
Similar presentations
15 The Genetic Basis of Complex Inheritance
Advertisements

VISG Polyploids Project Rod Ball & Phillip Wilcox Scion; Sammie Yilin Jia Plant and Food Research Gail Timmerman-Vaughan Plant and Food Research Nihal.
Genetic research designs in the real world Vishwajit L Nimgaonkar MD, PhD University of Pittsburgh
A Brief Overview of Affected Relative Pair analysis Jing Hua Zhao Institute of Psychiatry
Tutorial #2 by Ma’ayan Fishelson. Crossing Over Sometimes in meiosis, homologous chromosomes exchange parts in a process called crossing-over. New combinations.
Lecture 6 (chapter 5) Revised on 2/22/2008. Parametric Models for Covariance Structure We consider the General Linear Model for correlated data, but assume.
Basics of Linkage Analysis
. Parametric and Non-Parametric analysis of complex diseases Lecture #6 Based on: Chapter 25 & 26 in Terwilliger and Ott’s Handbook of Human Genetic Linkage.
Linkage Analysis: An Introduction Pak Sham Twin Workshop 2001.
Linkage analysis: basic principles Manuel Ferreira & Pak Sham Boulder Advanced Course 2005.
1 Associating Genomic Variations with Phenotypes Model comparison, rare variants, and analysis pipeline Qunyuan Zhang Division of Statistical Genomics.
Quantitative Genetics Theoretical justification Estimation of heritability –Family studies –Response to selection –Inbred strain comparisons Quantitative.
Statistical association of genotype and phenotype.
Journal Club Alcohol, Other Drugs, and Health: Current Evidence January–February 2009.
Admixture Mapping Qunyuan Zhang Division of Statistical Genomics GEMS Course M Computational Statistical Genetics Computational Statistical Genetics.
Estimating “Heritability” using Genetic Data David Evans University of Queensland.
EPIDEMIOLOGY AND BIOSTATISTICS DEPT Esimating Population Value with Hypothesis Testing.
Journal Club Alcohol, Other Drugs, and Health: Current Evidence July–August 2009.
Quantitative Genetics
Introduction to Logistic Regression. Simple linear regression Table 1 Age and systolic blood pressure (SBP) among 33 adult women.
A Longitudinal Study of Maternal Smoking During Pregnancy and Child Height Author 1 Author 2 Author 3.
Introduction to Multilevel Modeling Using SPSS
Linkage and LOD score Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Standardization of Pedigree Collection. Genetics of Alzheimer’s Disease Alzheimer’s Disease Gene 1 Gene 2 Environmental Factor 1 Environmental Factor.
Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University
Introduction to QTL analysis Peter Visscher University of Edinburgh
Statistics for clinicians Biostatistics course by Kevin E. Kip, Ph.D., FAHA Professor and Executive Director, Research Center University of South Florida,
1 Phenotypic Variation Variation of a trait can be separated into genetic and environmental components Genotypic variance  g 2 = variation in phenotype.
Lab 5 instruction.  a collection of statistical methods to compare several groups according to their means on a quantitative response variable  Two-Way.
Linkage in selected samples Manuel Ferreira QIMR Boulder Advanced Course 2005.
Whole genome approaches to quantitative genetics Leuven 2008.
Trait evolution Up until now, we focused on microevolution – the forces that change allele and genotype frequencies in a population This portion of the.
Presented by Alicia Naegle Twin Studies. Important Vocabulary Monozygotic Twins (MZ)- who are identical twins Dizygotic Twins (DZ)- who are twins that.
Experimental Design and Data Structure Supplement to Lecture 8 Fall
Quantitative Genetics. Continuous phenotypic variation within populations- not discrete characters Phenotypic variation due to both genetic and environmental.
Quantitative Genetics
Regression-Based Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Tutorial #10 by Ma’ayan Fishelson. Classical Method of Linkage Analysis The classical method was parametric linkage analysis  the Lod-score method. This.
Genetic pedigree analysis of spring Chinook salmon reintroduced above Foster Dam Melissa Evans, Kathleen O’Malley, Marc Johnson, Michael Banks, Dave Jacobson,
1 Multivariable Modeling. 2 nAdjustment by statistical model for the relationships of predictors to the outcome. nRepresents the frequency or magnitude.
Simulation Study for Longitudinal Data with Nonignorable Missing Data Rong Liu, PhD Candidate Dr. Ramakrishnan, Advisor Department of Biostatistics Virginia.
Errors in Genetic Data Gonçalo Abecasis. Errors in Genetic Data Pedigree Errors Genotyping Errors Phenotyping Errors.
C2BAT: Using the same data set for screening and testing. A testing strategy for genome-wide association studies in case/control design Matt McQueen, Jessica.
Vivia V. McCutcheon, Howard J. Edenburg, John R. Kramer, Kathleen K. Bucholz 9 th Annual Guze Symposium St. Louis, MO February 19, 2009 Gender Differences.
Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.
Lecture 22: Quantitative Traits II
Powerful Regression-based Quantitative Trait Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
David M. Evans Multivariate QTL Linkage Analysis Queensland Institute of Medical Research Brisbane Australia Twin Workshop Boulder 2003.
A simple method to localise pleiotropic QTL using univariate linkage analyses of correlated traits Manuel Ferreira Peter Visscher Nick Martin David Duffy.
Efficient calculation of empirical p- values for genome wide linkage through weighted mixtures Sarah E Medland, Eric J Schmitt, Bradley T Webb, Po-Hsiu.
When  is unknown  The sample standard deviation s provides an estimate of the population standard deviation .  Larger samples give more reliable estimates.
QTL Mapping Using Mx Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
Introduction to Biostatistics Lecture 1. Biostatistics Definition: – The application of statistics to biological sciences Is the science which deals with.
Pedigree Charts The family tree of genetics. What is a Pedigree?  A pedigree is a chart of the genetic history of family over several generations. 
Lecture 17: Model-Free Linkage Analysis Date: 10/17/02  IBD and IBS  IBD and linkage  Fully Informative Sib Pair Analysis  Sib Pair Analysis with Missing.
Bayesian Variable Selection in Semiparametric Regression Modeling with Applications to Genetic Mappping Fei Zou Department of Biostatistics University.
Extended Pedigrees HGEN619 class 2007.
Regression Models for Linkage: Merlin Regress
Identifying QTLs in experimental crosses
PBG 650 Advanced Plant Breeding
upstream vs. ORF binding and gene expression?
Can resemblance (e.g. correlations) between sib pairs, or DZ twins, be modeled as a function of DNA marker sharing at a particular chromosomal location?
The Genetic Basis of Complex Inheritance
Regression-based linkage analysis
Mapping Quantitative Trait Loci
gene mapping & pedigrees
Lecture 9: QTL Mapping II: Outbred Populations
Stephen C. Pratt, Mark J. Daly, Leonid Kruglyak 
William F. Forrest, Eleanor Feingold 
Presentation transcript:

A Statistical Method for Adjusting Covariates in Linkage Analysis With Sib Pairs Colin O. Wu, Gang Zheng, JingPing Lin, Eric Leifer and Dean Follmann Office of Biostatistics Research, DECA, NHLBI A Statistical Method for Adjusting Covariates in Linkage Analysis With Sib Pairs Colin O. Wu, Gang Zheng, JingPing Lin, Eric Leifer and Dean Follmann Office of Biostatistics Research, DECA, NHLBI Presenter’s Name: Colin O. Wu National Heart, Lung, and Blood Institute National Institutes of Health Department of Health and Human Services Presenter’s Name: Colin O. Wu National Heart, Lung, and Blood Institute National Institutes of Health Department of Health and Human Services

1. Framingham Heart Study (GAW13) 1.1 First Generation:  5209 subjects (2336 men & 2873 women);  29 to 62 years old when recruited;  1644 spouse pairs;  Continuously examined every 2 years since 1948 – medical history – physical exams – laboratory tests.

1.2 Second Generation (full dataset):  5124 of the original participants’ adult children & spouses of these adult children;  2616 subjects are offspring of original spouse pairs;  34 are stepchildren;  898 offspring are children with only one parent in the study;  1576 are spouses of the offspring;  Offspring cohort followed every 4 years;  Interval between Exams 1&2 is 8 years.

1.3 Second Generation (sib-pair subset)  482 multi-sib families – from 330 pedigrees;  Observed trait: systolic blood pressure;  Covariates: 1. age (in years), 2. gender (0=female, 1=male), 3. drinking (average daily alcohol consumption in ml).  Genotype data: 398 random markers with an average of 10cM apart.

2. Methods for Linkage Analysis 2.1 Methods based on identity by descent (IBD)  Association in pedigrees between phenotype and IBD sharing at loci linked to trait loci;  Linkage for qualitative traits – IBD sharing conditional on phenotypes: e.g. affected sib-pair methods (Hauser & Boehnke, 1998).  Linkage for quantitative trait loci (QTL) – phenotypes conditional on IBD sharing, e.g. Haseman & Elston (1972), Amos (1994); – extremely discordant sib-pairs, e.g. Risch & Zhang (1995, 1996).

2.2 The Haseman-Elston method

HE Model without Covariate Adjustment:

Linkage : Limitations:  Covariate effects are not included.  Genetic and environmental effects are additive.  Method may not have sufficient power.

2.3 HE Method with Linear Covariate Adjustment (SAGE SIBPAL)  Involve families with more than 2 sibs.  Can use other measures of trait difference – e.g. the mean-corrected cross-product.  Include covariate effects in linear regression – e.g. Elston, Buxbaum, Jacobs and Olson (2000) “Haseman and Elston Revisted”.

Linear generalization:

Linkage: Covariate effects: Limitation:

3. The Proposed Method 3.1. Modeling the covariates

 Cross-sectional data:

Regression models for covariates:

 Longitudinal data: (Repeated measurements over time)

Notation: Linear model (Verbeke & Molenberghs, 2000):

3.2 Covariate adjusted linkage detection  General Procedure: Select a regression model for the covariates. Estimate the covariate adjusted trait based on the above regression model. Apply the linkage procedures, such as the HE model or the variance-components model, using the estimated adjusted trait values and genotypic values.

3.3 Cross-sectional data Covariate adjusted HE model:

Estimation of adjusted trait values: Data from sib pairs are correlated  Existing estimation methods for independent data can not be directly applied. Two approaches: 1)Use methods for correlated data, such as GEE – treat each family as a subject, each member as a single observation. 2)Resample independent observations:

i.Randomly sample one member from each family. ii.Estimate the parameters and adjusted trait values using the re-sampled data and procedures for independent data, such as LSE, MLE, etc. iii.Repeat the previous steps many times and compute the estimates using the average of the estimates from the re-sampled data. This leads to consistent estimates when the sample size (number of families) is large (Hoffman et al., 2001).

Procedure for linear adjustment model:

3.5 Longitudinal data

Two sources of potential correlations in the estimation of adjusted trait values: i.Correlation within a sib  intra-subject correlation. ii.Correlation between sibs within a family  intra-family correlation.  Nested longitudinal data.  Methods for longitudinal estimation can not be directly applied (Morris, Vannucci, Brown and Carroll, 2003, JASA).

Resampling approach: i.Randomly select one sib from each family  Resampled data contain repeated measurements of independent sibs. ii.Estimate the covariate adjusted trait values from the above resampled data based on longitudinal estimation methods (GEE, MLE, REMLE, etc.). iii.Repeat the above steps many times and estimate the parameters using the averages of the estimates from the resampled data. iv.Fit the HE model using existing procedure.

4. Framingham Heart Study Features of the data:  Clustered data from families;  Repeated measurements;  Multi-sib families;  Continuous and categorical covariates. Variables:  Quantitative trait: SBP;  Covariates: age, gender (0=female, 1=male), drinking (average daily consumption).

5. Discussion  Advantages for covariate adjustment: small variation for the estimates; better interpretation for the model.  Directions of further research: Non-additive models, e.g. covariate-gene and covariate-environment interactions; Covariate adjustment with other measures of the trait difference; Methods of model selection; Models with general pedigrees.