ESP6800 BP Analysis (recessive model)


Zhengzheng Tang and Danyu Lin
March 26, 2013

Variant QC
- Started with the “ESP6900 May02 release” vcf (1908614 variants; 6823 individuals)
- SVM filter: deleted variants that were flagged in the vcf
- Per-genotype depth filter: set genotype to missing if DP < 10
- Bi-allelic filter: deleted non-bi-allelic variants
  - after this step, 1908596 variants remain (18 variants deleted)
- Per-variant depth filter: filtered out all variants with an average read depth > 500
  - applied Paul’s exclusion list Fail_DP500 (284 variants listed)
  - after this step, 1908312 variants remain (284 variants deleted)
- HWE filter: deleted variants if the race-specific p-value < 5x10^-8
  - applied Paul’s race-specific exclusion lists Fail_hwe_AA (2779 variants listed) and Fail_hwe_EA (2592 variants listed)
  - after this step, 1904594 variants remain (3718 variants deleted)
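The per-genotype depth filter and the HWE filter above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which HWE test was used (a one-degree-of-freedom chi-square test is assumed here), nor whether a variant is dropped when it fails in either race group (assumed here). The function names are hypothetical.

```python
import math

def mask_low_depth(genotype, depth, min_dp=10):
    """Per-genotype depth filter: return None (missing) if DP < min_dp."""
    return genotype if depth >= min_dp else None

def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """HWE p-value from genotype counts (assumed 1-df chi-square test)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic: no departure from HWE can be measured
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_hwe(counts_by_race, threshold=5e-8):
    """Keep a variant only if no race-specific p-value is below threshold
    (assumption: failing in either group removes the variant)."""
    return all(hwe_chisq_pvalue(*c) >= threshold
               for c in counts_by_race.values())
```

For example, `passes_hwe({"AA": (25, 50, 25), "EA": (500, 0, 500)})` rejects the variant because the second group has no heterozygotes at all.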

Sample QC I
- Started with 6823 subjects
- Dropped 536 subjects in certain cohorts/phenotype groups
  - ESP_COHORT: PAH (80), CF (431)
  - ESP_PHENOTYPE: BP (1), Blind (22), EOMI_Case_Drop (2), SSC PAH (40), SSC no PAH (35), Other (5)
- Dropped 34 subjects whose self-reported race is not missing and is not AA or EA
- Dropped 52 subjects based on QC information contained in the phenotype file
  - intentional duplicates (22), high missing rates (1), sex mismatch (13), high homozygosity (1), poor concordance (5), no data to check concordance (3), unresolved ID (4), self-reported race is missing (30), PCA outliers (17), race mismatch (13)
- Dropped 16 subjects due to duplications
  - for each duplicated pair, dropped the subject with the lower genotype call rate

Sample QC II
- Dropped 155 subjects due to relatedness
  - Only 1st- and 2nd-degree relatedness was considered
  - For each related pair:
    - If one subject in the pair is missing SSR and the other is not, drop the one with missing SSR.
    - If both or neither are missing SSR, drop the one with the lower genotype call rate.
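The pair-wise rule above amounts to a small decision function. The sketch below assumes each subject is a dictionary with hypothetical keys `id`, `ssr` (None when missing) and `call_rate`. The slides do not say what happens when call rates are tied, so the tie-break here (drop the second subject) is an arbitrary assumption.

```python
def subject_to_drop(a, b):
    """Return the id of the subject to drop from a related pair.

    a, b: dicts with hypothetical keys "id", "ssr" (None if missing)
    and "call_rate" (genotype call rate between 0 and 1).
    """
    a_missing = a["ssr"] is None
    b_missing = b["ssr"] is None
    # Exactly one subject is missing SSR: drop that one.
    if a_missing != b_missing:
        return a["id"] if a_missing else b["id"]
    # Both or neither missing SSR: drop the lower call rate.
    # Tie-break (assumption, not stated in the slides): drop b.
    return a["id"] if a["call_rate"] < b["call_rate"] else b["id"]
```

Calling it on `{"id": "s1", "ssr": None, "call_rate": 0.99}` and `{"id": "s2", "ssr": 1.2, "call_rate": 0.95}` drops `s1`, since missing SSR takes precedence over call rate.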

Data
- Phenotype groups (studies):
  Study: LDL, BMI, BP, EOMI, Stroke, DPR
  AA, EA: 286 318 6120 262 502 351 579 82 404 226 667
  AA w/BP, EA w/BP: 227 269 520 260 464 36 196 188 447
- Genetic variants:
  - Nonsynonymous T1/T5/VT burden scores
  - LOF T5
  - Single variants
- Annotation: SeattleSeq
- MAFs: extracted from vcf file
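As a rough illustration of a fixed-threshold burden score, the sketch below sums a subject's minor-allele counts over the variants in a gene whose MAF falls below a cutoff (1% for T1, 5% for T5). Several points are assumptions rather than facts from the slides: the strict `<` comparison, the additive allele count, and the absence of missing genotypes. The slides do not state how burden scores were coded under the recessive model, and the VT (variable threshold) test is not shown.

```python
def burden_score(genotypes, mafs, maf_threshold):
    """Fixed-threshold burden score for one subject in one gene.

    genotypes: minor-allele counts (0, 1 or 2), one per variant
    mafs: minor allele frequency of each variant, in the same order
    maf_threshold: 0.01 for T1, 0.05 for T5 (strict "<" is an assumption)
    """
    return sum(g for g, maf in zip(genotypes, mafs) if maf < maf_threshold)

# Hypothetical gene with four variants.
genotypes = [0, 1, 2, 1]
mafs = [0.004, 0.02, 0.03, 0.2]
t1 = burden_score(genotypes, mafs, 0.01)
t5 = burden_score(genotypes, mafs, 0.05)
```

Only the first variant qualifies for T1 here, while the first three qualify for T5; the common variant (MAF 0.2) never contributes.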

Data Processing
Phenotypes, Genotypes, Genes, SSR
- Remove variants with call rates < 90%.
- Impute missing values by the expected number of minor alleles.
- Remove variants with MAC < 2 for single-variant analysis.
Genes
- Remove genes with MAC < 2 for rare-variant analysis.
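The variant-level steps above can be sketched as follows. Genotypes are minor-allele counts with None for missing calls, and the list is assumed to be non-empty. The expected number of minor alleles is taken to be the mean of the observed counts (that is, twice the estimated MAF). Computing MAC from observed calls only, and before imputation, is an assumption, since the slides do not give the order of operations. All function names are hypothetical.

```python
def call_rate(genos):
    """Fraction of non-missing genotype calls for one variant."""
    return sum(g is not None for g in genos) / len(genos)

def minor_allele_count(genos):
    """MAC over observed calls only (assumption)."""
    return sum(g for g in genos if g is not None)

def impute_expected(genos):
    """Replace missing calls by the mean observed count (2 * estimated MAF)."""
    observed = [g for g in genos if g is not None]
    expected = sum(observed) / len(observed)
    return [expected if g is None else g for g in genos]

def process_variant(genos, min_call_rate=0.90, min_mac=2):
    """Return imputed genotypes, or None if the variant is filtered out."""
    if call_rate(genos) < min_call_rate:
        return None
    if minor_allele_count(genos) < min_mac:
        return None
    return impute_expected(genos)
```

A variant with one missing call out of ten sits exactly at the 90% call rate and is kept, because the filter removes only call rates strictly below 90%.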

Analysis
- Perform race-specific analyses and meta-analyze the results.
- Total number of genes/variants analyzed: 5658 (Nonsyn T1); 8116 (Nonsyn T5); 139 (LOF T5); 210077 (Single variant).
- Covariates: pc1-2, sex, age, age squared, target, cohorts

Analysis
- Methods:
  - For each study, calculated the score statistic based on the full likelihood (MLE).
  - Performed meta-analysis of the score statistics from the six studies.
  - The MLE approach properly adjusts for trait-dependent sampling. It has the highest power among all valid tests and provides unbiased estimates of genetic effects.
- Genetic models: Recessive model
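The slides do not give formulas, so the following is only a sketch of two standard ingredients. First, a recessive coding of a single variant, where only subjects carrying two copies of the minor allele are counted. Second, a common fixed-effects way of combining per-study score statistics: sum the scores U_k and their variances V_k, then form Z = (sum of U_k) / sqrt(sum of V_k). Whether the authors combined the studies in exactly this way is an assumption, and the function names are hypothetical.

```python
import math

def recessive_code(minor_allele_count):
    """Recessive coding: 1 if the subject carries two minor alleles, else 0."""
    return 1 if minor_allele_count == 2 else 0

def meta_analyze_scores(scores, variances):
    """Combine per-study score statistics (assumed fixed-effects scheme).

    scores: score statistics U_k, one per study
    variances: their variances V_k (the sum must be positive)
    Returns the combined Z statistic and its two-sided normal p-value.
    """
    u = sum(scores)
    v = sum(variances)
    z = u / math.sqrt(v)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical inputs from three studies.
z, p = meta_analyze_scores([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
```

Summing scores and variances weights each study by its information, which is why larger studies dominate the combined statistic.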

Nonsyn T1: combined

Nonsyn T1: AA

Nonsyn T1: EA

Nonsyn T1: Top 1-25 genes

Nonsyn T1: Top 26-50 genes

Nonsyn T5: combined

Nonsyn T5: AA

Nonsyn T5: EA

Nonsyn T5: Top 1-25 genes

Nonsyn T5: Top 26-50 genes

Nonsyn VT: combined

Nonsyn VT: AA

Nonsyn VT: EA

Nonsyn VT: Top 1-25 genes

Nonsyn VT: Top 26-50 genes

LOF T5: combined

LOF T5: AA

LOF T5: EA

LOF T5: Top 1-25 genes

LOF T5: Top 26-50 genes

Single Variant: combined

Single Variant: AA

Single Variant: EA

Single Variant: combined Top 1-25 SNPs

Single Variant: combined Top 26-50 SNPs

Single Variant: AA Top 1-25 SNPs

Single Variant: AA Top 26-50 SNPs

Single Variant: EA Top 1-25 SNPs

Single Variant: EA Top 26-50 SNPs