GenABEL: an R package for Genome Wide Association Analysis Younghun Han Department of Epidemiology UT MD Anderson Cancer Center
Outline
- Introduction
- GenABEL: GWAA.data class
- Importing data to GenABEL
- Genetic data QC
- GWA association analysis
Introduction
GenABEL is an R package developed to facilitate genome-wide association analysis of binary and quantitative traits.
Features of GenABEL:
- Specific facilities for storage and manipulation of large data sets
- Quality control (QC)
- Maximum likelihood estimation of linear, logistic, and Cox regression on a genome-wide scale
- Specific functions to analyze and display the results
GenABEL: GWAA.data class
GWAA.data class
> library(GenABEL)
> load("lung2291.Rdata")
GWAA.data class
Importing data to GenABEL
Both phenotypic and genotypic data are needed.
- Example of a phenotype file
- Example of genotypic data (PLINK transposed files): TPED file and TFAM file
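The layout of the three input files can be sketched in base R. This is a minimal illustration, not the lung2291 data: every ID, SNP name, and value below is made up, the file names are hypothetical, and the phenotype file is assumed to need a header line with at least `id` and `sex` columns (sex assumed coded 1 = male, 0 = female). The TPED and TFAM column orders follow the standard PLINK transposed format.

```r
# Toy illustration of the three input files; all values are made up.
out_dir <- tempdir()

# Phenotype file: one header line, then one row per person.
pheno <- data.frame(
  id = c("p1", "p2", "p3"),
  sex = c(1, 0, 1),            # assumed coding: 1 = male, 0 = female
  case_control = c(1, 0, 0)
)
pheno_file <- file.path(out_dir, "pheno_toy.txt")
write.table(pheno, pheno_file, quote = FALSE, row.names = FALSE, sep = "\t")

# TFAM file (no header): family ID, individual ID, father ID, mother ID,
# sex (PLINK coding: 1 = male, 2 = female), phenotype (1 = control, 2 = case).
tfam <- data.frame(
  fid = pheno$id, iid = pheno$id, pat = 0, mat = 0,
  sex = c(1, 2, 1), phe = c(2, 1, 1)
)
tfam_file <- file.path(out_dir, "toy.tfam")
write.table(tfam, tfam_file, quote = FALSE, row.names = FALSE,
            col.names = FALSE, sep = " ")

# TPED file (no header): chromosome, SNP name, genetic distance, bp position,
# then two alleles per person in TFAM order ("0" marks a missing allele).
tped <- rbind(
  c(1, "rs0001", 0, 1000, "A", "A", "A", "G", "G", "G"),
  c(1, "rs0002", 0, 2000, "C", "T", "0", "0", "C", "C")
)
tped_file <- file.path(out_dir, "toy.tped")
write.table(tped, tped_file, quote = FALSE, row.names = FALSE,
            col.names = FALSE, sep = " ")

# Read the files back to check that their shapes agree with each other.
pheno_back <- read.table(pheno_file, header = TRUE)
tfam_back <- read.table(tfam_file, colClasses = "character")
tped_back <- read.table(tped_file, colClasses = "character")
```

Each TPED row carries 4 map columns plus 2 allele columns per person, so the column count of the TPED file should equal 4 plus twice the number of TFAM rows.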
Importing data to GenABEL
Convert the data to GenABEL raw format:
> convert.snp.affymetrix()
> convert.snp.illumina()
> convert.snp.mach()
> convert.snp.ped()
> convert.snp.text()
> convert.snp.tped()
Load the data into GenABEL:
> load.gwaa.data()
Example:
> convert.snp.tped(tped = "lung2291.tped", tfam = "lung2291.tfam", out = "lung2291.raw", strand = "u")
> lung2291 <- load.gwaa.data(phe = "pheno.txt", gen = "lung2291.raw", force = TRUE)
Genetic data QC
- summary.snp.data() : number of observed genotypes, allelic frequency, genotypic distribution, and P-value of the exact test for HWE
- check.trait() : summarizes phenotypic data, checks for outliers (using a False Discovery Rate framework), and plots the raw data
- check.marker() : the major genetic data QC function of GenABEL
- HWE.show() : shows HWE tables with chi-square and exact HWE P-values
- perid.summary() : call rate and heterozygosity per person
- ibs() : matrix of average IBS for a group of people
- hom() : average homozygosity (inbreeding) for a set of people, across multiple markers
Example of QC
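The quantities reported by these QC functions can be illustrated in base R on a toy genotype matrix. This is a sketch of the underlying arithmetic only, not GenABEL's implementation: the matrix, the helper name `hwe_chisq`, and the 0/1/2 coding are assumptions made for illustration, and the HWE check shown is the 1-d.f. chi-square test rather than the exact test.

```r
# Toy genotype matrix: rows = people, columns = SNPs,
# coded as the number of B alleles (0, 1, 2); NA = missing call.
geno <- matrix(
  c(0, 1, 2, 1, NA,
    1, 1, 0, 2, 1,
    0, NA, 1, 1, 2,
    2, 1, 1, 0, 0),
  nrow = 5,
  dimnames = list(paste0("p", 1:5), paste0("snp", 1:4))
)

# Per-person summaries: call rate and heterozygosity.
person_call_rate <- rowMeans(!is.na(geno))
person_het <- rowMeans(geno == 1, na.rm = TRUE)

# Per-SNP summaries: call rate and frequency of the B allele.
snp_call_rate <- colMeans(!is.na(geno))
snp_freq_b <- colMeans(geno, na.rm = TRUE) / 2

# Chi-square test for Hardy-Weinberg equilibrium on one SNP.
# (hypothetical helper; returns NaN for a monomorphic SNP)
hwe_chisq <- function(g) {
  g <- g[!is.na(g)]
  n <- length(g)
  obs <- c(AA = sum(g == 0), AB = sum(g == 1), BB = sum(g == 2))
  q <- (obs[["AB"]] + 2 * obs[["BB"]]) / (2 * n)  # B allele frequency
  p <- 1 - q
  expd <- n * c(p^2, 2 * p * q, q^2)              # expected counts under HWE
  chi2 <- sum((obs - expd)^2 / expd)
  c(chi2 = chi2, p = pchisq(chi2, df = 1, lower.tail = FALSE))
}

# Genotype counts 25/50/25 match HWE expectations exactly at q = 0.5.
hwe_balanced <- hwe_chisq(rep(0:2, times = c(25, 50, 25)))
```

With real data, people or SNPs whose call rate falls below a chosen threshold, or SNPs with a very small HWE P-value, would be candidates for exclusion; the thresholds themselves are a study-specific choice.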
GWA association analysis
Descriptives of the phenotypic and marker data:
> descriptives.trait(lung2291)
> descriptives.trait(lung2291, by=case_control)
> descriptives.marker(lung2291)
> descriptives.marker(lung2291, ids=(case_control==0))
Score test:
> an0 <- qtscore(case_control, data=lung2291, trait = "binomial")
> an1 <- qtscore(case_control~sex, data=lung2291, trait = "binomial")
Chi-square test for a binary trait:
> an2 <- ccfast("case_control", data=lung2291)
SNP association test using glm in R:
> scan.glm("case_control~sex+CRSNP", family=binomial, data=lung2291)
> scan.glm("case_control~sex*CRSNP", family=binomial, data=lung2291) # no G*E test
> scan.glm.2D("case_control~sex+CRSNP", family=binomial, data=lung2291) # 2-snp interaction scan
Note: the formula must contain the CRSNP variable, which is replaced in turn by each SNP being analyzed.
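The role of the CRSNP placeholder can be sketched in base R: the same formula is fitted once per SNP, with CRSNP standing in for that SNP's genotype. This is an illustration on simulated data, not GenABEL's code. The data frame, the genotype matrix, and the helper `snp_pvalue` are hypothetical, and the P-value shown is the Wald test from `glm`, which may differ from the statistic the package reports.

```r
set.seed(1)

# Simulated data (no real association is built in).
n <- 200
n_snps <- 5
pheno <- data.frame(
  sex = rbinom(n, 1, 0.5),
  case_control = rbinom(n, 1, 0.5)
)
# Genotypes coded as the number of B alleles, simulated with frequency 0.3.
geno <- matrix(
  rbinom(n * n_snps, 2, 0.3),
  nrow = n,
  dimnames = list(NULL, paste0("snp", seq_len(n_snps)))
)

# The formula names CRSNP; each SNP is substituted for it in turn.
scan_formula <- "case_control ~ sex + CRSNP"

# Fit the logistic model for one SNP and return the Wald P-value of CRSNP.
snp_pvalue <- function(snp) {
  d <- pheno
  d$CRSNP <- geno[, snp]
  fit <- glm(as.formula(scan_formula), family = binomial, data = d)
  summary(fit)$coefficients["CRSNP", "Pr(>|z|)"]
}

p_values <- vapply(colnames(geno), snp_pvalue, numeric(1))
```

Fitting a full `glm` per SNP is flexible but slow at genome-wide scale, which is why faster score tests are usually run first and model-based scans are kept for subsets of SNPs.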
GWA association analysis
GWAS results from qtscore, ccfast, scan.glm:
- P1df : P-values of the 1-d.f. (additive or allelic) test
- P2df : P-values of the 2-d.f. (genotypic) test for association
- Pc1df : P-values of the 1-d.f. test, with the statistic corrected for possible inflation
- effB : effect of the B allele (second allele from coding) in the allelic test (OR for ccfast)
- effAB : effect of the AB genotype
- effBB : effect of the BB genotype
- map : list of map positions of the SNPs
- chromosome : list of chromosomes the SNPs belong to
- idnames : list of people used in the analysis
- lambda : inflation factor estimate
- formula : formula/function call that was used to compute the P-values
- family : family of the link function
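The relation between P1df, lambda, and Pc1df can be sketched in base R. This assumes the common median-based estimator of the inflation factor and a simple division of each 1-d.f. statistic by lambda; the estimator GenABEL actually uses may differ. The chi-square values are constructed artificially so that the inflation is exactly 1.2.

```r
# Artificial 1-d.f. test statistics: null chi-square quantiles inflated by 1.2.
chi2 <- 1.2 * qchisq(ppoints(1001), df = 1)

# Median-based inflation factor: observed median over the expected
# median of a chi-square distribution with 1 d.f.
lambda <- median(chi2) / qchisq(0.5, df = 1)

# Uncorrected and inflation-corrected P-values.
P1df <- pchisq(chi2, df = 1, lower.tail = FALSE)
Pc1df <- pchisq(chi2 / lambda, df = 1, lower.tail = FALSE)
```

When lambda is above 1, dividing by it shrinks every statistic, so each corrected P-value is at least as large as its uncorrected counterpart.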
Example of descriptives.trait() and descriptives.marker()
Example of qtscore() and ccfast()
Example of scan.glm() and scan.glm.2D()
Thank you!!!