Single nucleotide polymorphisms
Usman Roshan

SNPs
DNA sequence variations that occur when a single nucleotide is altered.
A variant must be present in at least 1% of the population to count as a SNP.
SNPs occur every 100 to 300 bases along the 3 billion-base human genome.
Many have no effect on cell function, but some could affect disease risk and drug response.

Toy example

SNPs on the chromosome

Perl exercise
Determining SNPs from a pairwise genome alignment:
–Can we solve this problem with a Perl script?
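The slides pose this as a Perl exercise. As a rough illustration of the idea, here is a minimal sketch in Python instead. It assumes the two genomes are already aligned into equal-length strings with "-" marking gaps; the function name `find_snps` and the toy sequences are made up for illustration. A mismatched column is only a candidate SNP, since the 1% population frequency requirement cannot be checked from two genomes.

```python
def find_snps(seq_a, seq_b, gap="-"):
    """Return (position, base_in_a, base_in_b) for every mismatched column.

    Assumes seq_a and seq_b are rows of a pairwise alignment (equal length).
    Columns containing a gap are skipped: they are indels, not substitutions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    snps = []
    for pos, (x, y) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if x != y and x != gap and y != gap:
            snps.append((pos, x, y))
    return snps


# Toy alignment: one substitution at column 2, one gap at column 6.
print(find_snps("ACGTAC-T", "ACATACGT"))  # [(2, 'G', 'A')]
```

Positions are 0-based alignment columns, not genome coordinates; mapping back to a reference coordinate would need the gap counts as well.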

Bi-allelic SNPs
Most SNPs have one of two nucleotides at a given position.
For example:
–A/G denotes the varying nucleotide as either A or G. We call each of these an allele.
–Most SNPs have two alleles (bi-allelic).

Perl exercise
Determining SNP type from a multiple genome alignment.
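One possible reading of this exercise, sketched in Python rather than Perl: for each column of a multiple alignment, collect the distinct nucleotides and report the column as a SNP of type such as A/G when more than one is present. The function name `snp_types`, the gap symbol, and the toy genomes are assumptions made for illustration.

```python
def snp_types(alignment, gap="-"):
    """Map each variable column to its allele string, e.g. {2: 'A/G'}.

    `alignment` is a list of equal-length strings, one per genome.
    Gap characters are ignored when collecting alleles.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have equal length")
    result = {}
    for pos in range(length):
        alleles = sorted({row[pos].upper() for row in alignment} - {gap})
        if len(alleles) >= 2:  # more than one nucleotide seen at this column
            result[pos] = "/".join(alleles)
    return result


genomes = ["ACGTA", "ACATA", "ACGTT", "ACATA"]
print(snp_types(genomes))  # {2: 'A/G', 4: 'A/T'}
```

A column with exactly two alleles corresponds to a bi-allelic SNP; columns with three or four alleles would show up as, for example, `A/C/G`.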

SNP genotype
We inherit two copies of each chromosome (one from each parent).
For a given SNP, the genotype defines the type of alleles we carry.
Example: for the SNP A/G, one’s genotype may be
–AA if both copies of the chromosome have A
–GG if both copies of the chromosome have G
–AG or GA if one copy has A and the other has G
The first two cases are called homozygous and the latter two heterozygous.

SNP genotyping

Perl exercise
SNP encoding:
–Convert SNP genotype from a character sequence to a numeric one
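The slides do not say which numeric encoding is intended. A common choice, assumed here, is to count the copies of one chosen allele, so each genotype becomes 0, 1, or 2. The sketch below is in Python rather than Perl, and the names `encode_genotype` and `encode_genotypes` are hypothetical.

```python
def encode_genotype(genotype, counted_allele):
    """Encode a two-letter genotype as the number of copies of counted_allele.

    Assumed encoding (not stated in the slides): for counted allele G,
    AA -> 0, AG or GA -> 1, GG -> 2.
    """
    if len(genotype) != 2:
        raise ValueError("expected a two-letter genotype such as 'AG'")
    return genotype.upper().count(counted_allele.upper())


def encode_genotypes(genotypes, counted_allele):
    """Encode a list of genotypes for one SNP."""
    return [encode_genotype(g, counted_allele) for g in genotypes]


print(encode_genotypes(["AA", "AG", "GA", "GG"], "G"))  # [0, 1, 1, 2]
```

With this encoding, AG and GA map to the same value, which matches the slides' treatment of both as heterozygous.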

Real SNPs
SNP consortium: snp.cshl.org
SNPedia:

Application of SNPs: association with disease
Experimental design to detect cancer-associated SNPs:
–Pick random humans with and without cancer (say breast cancer)
–Perform SNP genotyping
–Look for associated SNPs
–Also called a genome-wide association study

Case-control example
Study of 100 people:
–Case: 50 subjects with cancer
–Control: 50 subjects without cancer
Count the number of dominant and recessive alleles and form a contingency table:

          #Recessive alleles   #Dominant alleles
Case      10                   40
Control   2                    48

Perl exercise
Contingency table:
–Compute contingency table given case and control SNP genotype data
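A minimal Python sketch of this exercise (the slides ask for Perl). It assumes genotypes are two-letter strings for a single SNP and that the caller already knows which allele is the recessive one; the names `allele_counts` and `contingency_table` and the toy genotypes are made up for illustration.

```python
def allele_counts(genotypes, recessive_allele):
    """Return (recessive_count, dominant_count) over all alleles in genotypes."""
    recessive = sum(g.upper().count(recessive_allele.upper()) for g in genotypes)
    total = sum(len(g) for g in genotypes)
    return recessive, total - recessive


def contingency_table(case_genotypes, control_genotypes, recessive_allele):
    """Build [[case_rec, case_dom], [control_rec, control_dom]]."""
    return [
        list(allele_counts(case_genotypes, recessive_allele)),
        list(allele_counts(control_genotypes, recessive_allele)),
    ]


case = ["GG", "AG", "AA", "AG"]      # toy data, 4 subjects = 8 alleles
control = ["AA", "AA", "AG", "AA"]
print(contingency_table(case, control, "G"))  # [[4, 4], [1, 7]]
```

Rows are case and control; columns are recessive and dominant allele counts, in the same order as the table in the case-control example.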

Odds ratio
Odds of recessive in cancer = a/b = e
Odds of recessive in no-cancer = c/d = f
Odds ratio of recessive in cancer vs no-cancer = e/f

            #Recessive alleles   #Dominant alleles
Cancer      a                    b
No cancer   c                    d
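The definition above translates directly into a few lines of Python. This is only a sketch: the function name `odds_ratio` is made up, and the numbers come from the case-control contingency table earlier in the slides (10 and 40 for cases, 2 and 48 for controls).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a, b: recessive and dominant allele counts in the cancer group
    c, d: recessive and dominant allele counts in the no-cancer group
    """
    odds_cancer = a / b       # e in the slide
    odds_no_cancer = c / d    # f in the slide
    return odds_cancer / odds_no_cancer


# (10/40) / (2/48) is 6, up to floating-point rounding.
print(round(odds_ratio(10, 40, 2, 48), 3))
```

The sketch does not guard against zero counts; a table with b, c, or d equal to zero would need separate handling.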

Risk ratio (Relative risk)
Probability of recessive in cancer = a/(a+b) = e
Probability of recessive in no-cancer = c/(c+d) = f
Risk ratio of recessive in cancer vs no-cancer = e/f

            #Recessive alleles   #Dominant alleles
Cancer      a                    b
No cancer   c                    d
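The risk ratio can be sketched the same way in Python. The function name `risk_ratio` is hypothetical, and the input is the case-control table from earlier in the slides (10 and 40 for cases, 2 and 48 for controls).

```python
def risk_ratio(a, b, c, d):
    """Risk ratio for the 2x2 table [[a, b], [c, d]].

    a, b: recessive and dominant allele counts in the cancer group
    c, d: recessive and dominant allele counts in the no-cancer group
    """
    p_cancer = a / (a + b)       # e in the slide
    p_no_cancer = c / (c + d)    # f in the slide
    return p_cancer / p_no_cancer


# (10/50) / (2/50) is 5, up to floating-point rounding.
print(round(risk_ratio(10, 40, 2, 48), 3))
```

For the same table the odds ratio is (10/40)/(2/48) = 6, slightly larger than the risk ratio of 5.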

Odds ratio vs Risk ratio
The risk ratio has a natural interpretation since it is based on probabilities.
In a case-control model we cannot calculate the probability of cancer given the recessive allele, because subjects are chosen based on disease status and not allele type.
The odds ratio shows up in logistic regression models.

Example

          #Recessive alleles   #Dominant alleles
Case      15                   35
Control   2                    48

Odds of recessive in case = 15/35
Odds of recessive in control = 2/48
Odds ratio of recessive in case vs control = (15/35)/(2/48) = 10.3
Risk of recessive in case = 15/50
Risk of recessive in control = 2/50
Risk ratio of recessive in case vs control = (15/50)/(2/50) = 15/2 = 7.5

Odds ratios in genome-wide association studies
A higher odds ratio means a stronger association.
Therefore SNPs with the highest odds ratios should be used as predictors or risk estimators of disease.
The odds ratio is generally higher than the risk ratio.
The two are similar when the probabilities involved are small.

Statistical test of association (P-values)
P-value = probability of the observed data (or more extreme data) under the null hypothesis
Example:
–Suppose we are given a series of coin tosses
–We suspect that a biased coin produced the tosses
–We can ask the following question: if the coin were fair, how probable would tosses at least this extreme be?
–If this probability is very small, then the observed tosses would be unlikely under a fair coin, which is evidence against the fair coin
–In this example the null hypothesis is the fair coin and the alternative hypothesis is the biased coin
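The coin-toss example can be made concrete with an exact binomial calculation. The Python sketch below is one way to do it, not necessarily the test the slides go on to use. It assumes a two-sided test in which "at least as extreme" means any head count that is no more probable under a fair coin than the observed one; the function name `fair_coin_p_value` is made up.

```python
from math import comb


def fair_coin_p_value(heads, tosses):
    """Two-sided exact p-value for `heads` out of `tosses` under a fair coin.

    Sums the probabilities of all head counts that are no more likely
    than the observed count when P(heads) = 0.5 (the null hypothesis).
    """
    probs = [comb(tosses, k) * 0.5 ** tosses for k in range(tosses + 1)]
    observed = probs[heads]
    # Small relative tolerance so ties are not lost to rounding.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


# 9 heads in 10 tosses: outcomes 0, 1, 9, 10 heads are as or more extreme,
# giving (1 + 10 + 10 + 1) / 1024, about 0.0215.
print(fair_coin_p_value(9, 10))
```

A p-value of about 0.02 says that a fair coin would rarely produce a result this lopsided, so the fair-coin null hypothesis looks doubtful. 5 heads in 10 tosses gives a p-value of 1.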