Disease Models and Association Statistics
Nicolas Widman, CS 224 - Computational Genetics

Introduction
Certain SNPs within genes may be associated with a disease phenotype.
The statistical model used in class only considers inheritance of a single copy of a SNP location: the Single Chromosome Model.
Expand the statistic to a diploid model and take into account different expression patterns of a SNP.

Basic Statistic - Haploid Model
γ: Relative risk
p_A: Probability of disease-associated allele
F: Disease prevalence (for this project, F is assumed to be very small)
+/-: Disease state
Derivation of case (p^+) and control (p^-) frequencies:
  P(A) = p_A
  p_A^+ = P(A|+)
  p_A^- = P(A|-)
  F = P(+)
  P(A|+) = P(+|A) P(A) / P(+)
  P(+|A) = γ P(+|¬A)

Derivation - Continued
  P(+) = F = p_A P(+|A) + (1 - p_A) P(+|¬A)
  P(+) = F = p_A P(+|A) + (1 - p_A) P(+|A) / γ
  P(+) = F = P(+|A) (p_A + (1 - p_A)/γ) = P(+|A) (p_A(γ - 1) + 1) / γ
  P(+|A) = γ F / (p_A(γ - 1) + 1)
  P(A|+) = P(+|A) P(A) / P(+) = P(+|A) p_A / F = γ p_A / (p_A(γ - 1) + 1)
  P(-|A) = 1 - P(+|A) = 1 - γ F / (p_A(γ - 1) + 1)
  P(A|-) = P(-|A) P(A) / P(-)
If F is small, then 1 - F ≈ 1 and P(-|A) ≈ 1, so P(A|-) ≈ P(A) = p_A.
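The derivation above can be checked numerically. The following is a minimal Python sketch, not code from the presentation; the function name `case_control_frequencies` and its parameter names are illustrative assumptions. It evaluates P(A|+) and the exact P(A|-) so the low-prevalence approximation P(A|-) ≈ p_A can be compared against the exact value.

```python
def case_control_frequencies(gamma, p_a, prevalence=0.0):
    """Return (P(A|+), P(A|-)) for the haploid model.

    gamma      -- relative risk, P(+|A) / P(+|not A)   (assumed name)
    p_a        -- population frequency of allele A      (assumed name)
    prevalence -- disease prevalence F = P(+), must be < 1
    """
    denom = p_a * (gamma - 1.0) + 1.0
    # P(A|+) = gamma * p_A / (p_A * (gamma - 1) + 1); F cancels out.
    p_case = gamma * p_a / denom
    # P(+|A) = gamma * F / (p_A * (gamma - 1) + 1)
    p_disease_given_a = gamma * prevalence / denom
    # P(A|-) = P(-|A) * P(A) / P(-); equals p_A exactly when F = 0.
    p_control = (1.0 - p_disease_given_a) * p_a / (1.0 - prevalence)
    return p_case, p_control


if __name__ == "__main__":
    print(case_control_frequencies(2.0, 0.1))        # F = 0
    print(case_control_frequencies(2.0, 0.1, 0.01))  # F = 1%
```

With a prevalence of 1% the exact control frequency stays within about 0.001 of p_A, which is the approximation the slides rely on.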

Haploid Model
The relative risk formula (from the preceding derivation): p_A^+ = γ p_A / (p_A(γ - 1) + 1)
Association power: [formula not preserved in the transcript]
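The power formula from this slide did not survive in the transcript, so the sketch below is an assumption rather than the author's statistic: it uses the standard normal approximation for a two-sided test of a difference between two allele frequencies. The function name `association_power`, the `alpha` default of 0.05, and the treatment of `n_cases` and `n_controls` as counts of observed alleles are all hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def association_power(gamma, p_a, n_cases, n_controls, alpha=0.05):
    """Approximate power to detect an association (assumed formula).

    gamma  -- relative risk
    p_a    -- allele frequency, strictly between 0 and 1
    n_*    -- number of alleles observed in cases / controls
    alpha  -- two-sided significance threshold
    """
    # Case frequency from the relative risk formula; the control
    # frequency uses the low-prevalence approximation p_A^- = p_A.
    p_case = gamma * p_a / (p_a * (gamma - 1.0) + 1.0)
    p_control = p_a
    # Pooled frequency and standard error of the frequency difference.
    p_bar = (n_cases * p_case + n_controls * p_control) / (n_cases + n_controls)
    se = sqrt(p_bar * (1.0 - p_bar) * (1.0 / n_cases + 1.0 / n_controls))
    ncp = (p_case - p_control) / se  # non-centrality parameter
    std_normal = NormalDist()
    z = std_normal.inv_cdf(alpha / 2.0)  # negative critical value
    # Probability that the test statistic falls in either rejection tail.
    return std_normal.cdf(z + ncp) + 1.0 - std_normal.cdf(-z + ncp)


if __name__ == "__main__":
    for n in (200, 1000):
        print(n, round(association_power(1.5, 0.3, n, n), 3))
```

When gamma equals 1 the non-centrality parameter is zero and the function returns alpha, which is a quick sanity check on the formula.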

Assumptions
Low disease prevalence, F ≈ 0: allows p_A^- ≈ p_A
Uses the Hardy-Weinberg Principle (A: major allele, a: minor allele):
  P(AA) = P(A)^2
  P(Aa) = 2*P(A)*(1-P(A))
  P(aa) = (1-P(A))^2
Uses a balanced case-control study
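The Hardy-Weinberg genotype frequencies listed above can be written as a short Python helper. This is an illustrative sketch; the function name `hardy_weinberg` and the dictionary layout are assumptions, not part of the presentation.

```python
def hardy_weinberg(p_major):
    """Genotype frequencies under Hardy-Weinberg equilibrium.

    p_major -- frequency P(A) of the major allele A
    """
    q = 1.0 - p_major  # frequency of the minor allele a
    return {
        "AA": p_major ** 2,      # homozygous major
        "Aa": 2.0 * p_major * q, # heterozygous (Aa and aA combined)
        "aa": q ** 2,            # homozygous minor
    }


if __name__ == "__main__":
    print(hardy_weinberg(0.7))
```

The three frequencies always sum to 1, since (P(A) + (1 - P(A)))^2 = 1.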

Diploid Disease Models
When inheriting two copies of a SNP site, there are three common relationships between major and minor SNPs:
  Dominant: particular phenotype requires one major allele
  Recessive: particular phenotype requires both minor alleles
  Additive: particular phenotype varies based on whether there are one or two major alleles

Diploid Disease Models
  AA: Homozygous major
  Aa, aA: Heterozygous
  aa: Homozygous minor

Modifying the Calculation for Relative Risk
The previous relative risk formula only considered the haploid case of having a SNP or not having a SNP.
Approach: create a virtual SNP which replaces p_A in the formula.

Virtual SNPs
Use the Hardy-Weinberg Principle to calculate a new p_A, the virtual SNP, using the characteristics of diploid disease models.
  Recessive: p_A = p_d*p_d
  Dominant: p_A = p_d*p_d + 2*p_d*(1-p_d)
  Additive: p_A = p_d*p_d + c*p_d*(1-p_d)
p_d: Probability of disease-associated allele.
In the calculations used to determine the association power, c was set to sqrt(2).
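The three virtual SNP formulas above translate directly into code. The sketch below is illustrative only; the function name `virtual_snp_frequency` and the string labels for the models are assumptions, while the default c = sqrt(2) follows the value stated on the slide.

```python
from math import sqrt


def virtual_snp_frequency(p_d, model, c=sqrt(2)):
    """Virtual SNP frequency p_A for a diploid disease model.

    p_d   -- probability of the disease-associated allele
    model -- "recessive", "dominant", or "additive" (assumed labels)
    c     -- heterozygote weight for the additive model
    """
    homozygous = p_d * p_d            # both copies carry the allele
    heterozygous = p_d * (1.0 - p_d)  # one ordering of a mixed pair
    if model == "recessive":
        return homozygous
    if model == "dominant":
        return homozygous + 2.0 * heterozygous
    if model == "additive":
        return homozygous + c * heterozygous
    raise ValueError(f"unknown model: {model!r}")


if __name__ == "__main__":
    for name in ("recessive", "additive", "dominant"):
        print(name, round(virtual_snp_frequency(0.2, name), 4))
```

The returned value takes the place of p_A in the haploid relative risk formula, so for any p_d strictly between 0 and 1 the recessive model yields the smallest virtual frequency and the dominant model the largest.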

Diploid Disease Models: γ = 1.5

Diploid Disease Models: γ = 2

Diploid Disease Models: γ = 3

Results
Achieving significant association power with low relative risk SNPs (γ = 1.5):
  A minimum of 200 cases and 200 controls is required to reach 80% power within the strongest p_d intervals for each type of SNP.
  At a sample size of 1000 cases and 1000 controls, dominant and additive SNPs show very significant power for almost all SNP probabilities below 50%.
  It is difficult to obtain significant association for low probability recessive SNPs regardless of sample size.

Results
SNP probability ranges for greatest association power:
  Dominant: [range not preserved]
  Recessive: [range not preserved]
  Additive: [range not preserved]
Higher relative risk SNPs require fewer cases and controls to achieve the same power.
As γ approaches 1, the association power to detect a recessive allele with probability p is the same as the power to detect a dominant allele with probability 1-p.
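The symmetry between recessive and dominant alleles stated above can be motivated with a short worked derivation. It is a sketch based on the virtual SNP formulas and the low-prevalence approximation, writing p_A for the recessive virtual SNP and p_A' for the dominant one (the primed symbol is introduced here for illustration). It shows that the case-control frequency difference becomes symmetric as γ approaches 1; it assumes the power depends on the allele frequencies only through that difference and a variance term of the form p(1-p).

```latex
\begin{align*}
\text{Recessive, allele probability } p:\quad
  p_A &= p^2 \\
\text{Dominant, allele probability } 1-p:\quad
  p_A' &= (1-p)^2 + 2(1-p)\,p = 1 - p^2 = 1 - p_A \\[4pt]
p_A^{+} - p_A^{-}
  &\approx \frac{\gamma\, p_A}{p_A(\gamma-1)+1} - p_A
   = \frac{(\gamma-1)\, p_A (1-p_A)}{p_A(\gamma-1)+1} \\[4pt]
\text{As } \gamma \to 1:\quad
p_A^{+} - p_A^{-}
  &\approx (\gamma-1)\, p_A (1-p_A)
\end{align*}
```

The limiting expression is unchanged when p_A is replaced by 1 - p_A, and so is a variance term p_A(1 - p_A), which is why the two cases give matching power in the limit.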

Results
Diseases with higher relative risk have their range of highest association power skewed toward lower probability SNPs.
Challenges in obtaining high association power:
  Low probability recessive SNPs
  Low relative risk diseases, especially with small sample sizes
  High probability dominant SNPs; however, these are unlikely because of natural selection and because the majority of the population would be affected by such diseases.