SNP Applications statwww.epfl.ch/davison/teaching/Microarrays/snp.ppt.



Human Genome and SNPs
Now that the human genome is (mostly) sequenced, attention is turning to the evaluation of variation
Alterations in DNA involving a single base pair are called single nucleotide polymorphisms, or SNPs
Map of ~1.4 million SNPs (Feb 2001)
It is estimated that ~60,000 SNPs occur within exons; 85% of exons are within 5 kb of the nearest SNP

SNP Initiatives
Industrial: Genset, Incyte, Celera, CuraGen
Academic-industry consortium
Governmental: US, Japan
Non-industrial scale academic programs

Goals of SNP Initiatives
Immediate goals:
Detection/identification of … the hundreds of thousands of SNPs estimated to be present in the human genome
Interest also in other organisms, e.g. potatoes(!)
Establishment of SNP database(s)

Longer term goals: Areas of SNP Application
Gene discovery and mapping
Association-based candidate polymorphism testing
Diagnostics/risk profiling
Response prediction
Homogeneity testing/study design
Gene function identification
…etc.
See Schork, Fallin, Lanchbury 2000

Polymorphism
Technical definition: the most common variant (allele) occurs with less than 99% frequency in the population
Also used as a general term for variation
Many types of DNA polymorphisms, including RFLPs, VNTRs, microsatellites
‘Highly polymorphic’ = many variants
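The technical definition above can be sketched as a simple check. This is only an illustration: the function name, the dictionary layout and the example frequencies are assumptions, not part of the slides.

```python
def is_polymorphic(allele_freqs, threshold=0.99):
    """Return True if the most common allele is below the threshold frequency.

    allele_freqs maps allele label -> population frequency (hypothetical layout).
    The 0.99 default follows the 'less than 99%' definition on the slide.
    """
    total = sum(allele_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return max(allele_freqs.values()) < threshold


# Illustrative (made-up) frequencies
print(is_polymorphic({"A": 0.70, "G": 0.30}))    # common variant well below 99%
print(is_polymorphic({"A": 0.995, "G": 0.005}))  # common variant above 99%
```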

Use of Polymorphism in Gene Mapping
1980s: RFLP marker maps
1990s: microsatellite marker maps

SNPs in Genetic Analysis
Abundance: lots
Position: throughout genome
Haplotype patterns: groups of SNPs may provide exploitable diversity
Rapid and efficient to genotype
Increased stability over other types of mutation
Recombination patterns: e.g. ‘hot spots’

Gene Discovery and Mapping
Linkage analysis: within-family associations between marker and putative trait loci
Linkage disequilibrium (LD): across-family associations

One locus: Founder genotype probabilities
Founder: individual whose parents are not in the pedigree
Usually obtain genotype probabilities assuming Hardy-Weinberg equilibrium (HWE):
Say P(D) = p, P(d) = 1-p; then P(DD) = p^2, P(Dd) = 2p(1-p), P(dd) = (1-p)^2
Genotypes of founder couples are treated as independent:
P(Father Dd and Mother DD) = 2p(1-p) × p^2 = 2p^3(1-p)
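A minimal sketch of the founder calculation, assuming HWE and independent founders as stated above. The function names are illustrative only.

```python
def hwe_genotype_probs(p):
    """Hardy-Weinberg genotype probabilities for allele frequency P(D) = p."""
    q = 1.0 - p
    return {"DD": p * p, "Dd": 2.0 * p * q, "dd": q * q}


def founder_couple_prob(p, father, mother):
    """Joint probability of two founder genotypes, treated as independent."""
    probs = hwe_genotype_probs(p)
    return probs[father] * probs[mother]


p = 0.01  # example allele frequency
print(hwe_genotype_probs(p))
print(founder_couple_prob(p, "Dd", "DD"))  # equals 2p(1-p) * p^2
```

The couple probability is just the product of the two single-genotype probabilities, which is what independence of founders means here.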

One locus: Transmission probabilities (I)
Offspring get their genes according to Mendel’s rules, independently for different offspring
Pedigree: parents 1 (Dd) and 2 (Dd); offspring 3 (dd)
P(3 dd | 1 Dd & 2 Dd) = ½ × ½

One locus: Transmission probabilities (II)
Pedigree: parents 1 (Dd) and 2 (Dd); offspring 3 (dd), 4 (Dd), 5 (DD)
P(3 dd & 4 Dd & 5 DD | 1 Dd & 2 Dd) = (½ × ½) × (2 × ½ × ½) × (½ × ½)
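The transmission probabilities can be sketched by enumerating the four equally likely allele pairs a couple can pass on. Genotypes are written as two-character strings such as "Dd"; the function names are illustrative assumptions.

```python
from itertools import product


def transmission_probs(father, mother):
    """P(child genotype | parental genotypes) at one locus under Mendel's rules."""
    probs = {}
    # Each parent transmits either of its two alleles with probability 1/2.
    for a, b in product(father, mother):
        child = "".join(sorted(a + b))  # 'D' sorts before 'd', so "Dd" is canonical
        probs[child] = probs.get(child, 0.0) + 0.25
    return probs


def sibship_prob(father, mother, children):
    """Joint probability of the offspring genotypes, independent across offspring."""
    table = transmission_probs(father, mother)
    prob = 1.0
    for genotype in children:
        prob *= table.get(genotype, 0.0)
    return prob


print(transmission_probs("Dd", "Dd"))
print(sibship_prob("Dd", "Dd", ["dd", "Dd", "DD"]))
```

For the Dd × Dd couple this gives ¼, ½ and ¼ for dd, Dd and DD, matching the three factors in the product on the slide.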

One locus: Penetrance
Usual to assume that the chance of having a particular phenotype (being affected with a disease, say) depends only on the genotype at one locus
Complete penetrance: P(affected|DD) = 1
Incomplete penetrance: P(affected|DD) = p (<1)

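One locus: putting it all together
Assume: P(Aff|dd) = .1, P(Aff|Dd) = .3, P(Aff|DD) = .8, P(D) = .01
Pedigree: parents 1 (Dd) and 2 (Dd); offspring 3 (dd), 4 (Dd), 5 (DD)
The penetrance factors below correspond to individuals 1, 3 and 4 being unaffected and 2 and 5 being affected
P(pedigree) = (2 × .01 × .99 × .7) × (2 × .01 × .99 × .3) × (½ × ½ × .9) × (2 × ½ × ½ × .7) × (½ × ½ × .8)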
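The product above can be reproduced with a short sketch that multiplies founder, transmission and penetrance terms. The affection statuses are not stated in the transcript; they are inferred here from the penetrance factors in the product (.7 and .9 for unaffected, .3 and .8 for affected), so treat them as an assumption. All names are illustrative.

```python
from itertools import product

PENETRANCE = {"dd": 0.1, "Dd": 0.3, "DD": 0.8}  # P(Aff | genotype), from the slide
P_D = 0.01                                      # P(D), from the slide


def hwe(genotype, p=P_D):
    """Founder genotype probability under Hardy-Weinberg equilibrium."""
    n_d = genotype.count("D")
    return {2: p * p, 1: 2 * p * (1 - p), 0: (1 - p) ** 2}[n_d]


def transmit(child, father, mother):
    """P(child genotype | parents) under Mendel's rules."""
    pairs = ["".join(sorted(a + b)) for a, b in product(father, mother)]
    return pairs.count(child) / 4.0


def pheno(genotype, affected):
    """P(observed phenotype | genotype)."""
    return PENETRANCE[genotype] if affected else 1.0 - PENETRANCE[genotype]


def pedigree_probability(founders, children):
    """founders and children are lists of (genotype, affected) pairs."""
    (father, _), (mother, _) = founders
    prob = 1.0
    for genotype, affected in founders:
        prob *= hwe(genotype) * pheno(genotype, affected)
    for genotype, affected in children:
        prob *= transmit(genotype, father, mother) * pheno(genotype, affected)
    return prob


# Statuses inferred from the factors in the slide's product (assumption).
founders = [("Dd", False), ("Dd", True)]
children = [("dd", False), ("Dd", False), ("DD", True)]
print(pedigree_probability(founders, children))
```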

Crossing over and Recombination

Two loci: Linkage and Recombination
Pedigree: parents 1 (Dd TT) and 2 (Dd tt); offspring 3 (Dd Tt)
3 produces gametes in proportions:
         T            t
D     (1-θ)/2        θ/2        ½
d       θ/2        (1-θ)/2      ½

Recombination Fraction
θ = ½: independent assortment (Mendel)
θ < ½: linked loci
θ = 0: tightly linked loci (no recombination)
In 3, if the loci are linked then D-T and d-t are parental haplotypes, D-t and d-T are recombinant haplotypes
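A small sketch of the gamete proportions for the double heterozygote, assuming the phase described above (D-T and d-t parental, D-t and d-T recombinant). The function name is an illustrative choice.

```python
def gamete_proportions(theta):
    """Gamete proportions for a Dd Tt individual with parental haplotypes D-T and d-t.

    theta is the recombination fraction between the two loci (0 <= theta <= 1/2).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return {
        "D-T": (1.0 - theta) / 2.0,  # parental
        "d-t": (1.0 - theta) / 2.0,  # parental
        "D-t": theta / 2.0,          # recombinant
        "d-T": theta / 2.0,          # recombinant
    }


for theta in (0.0, 0.1, 0.5):
    print(theta, gamete_proportions(theta))
```

At θ = ½ all four gametes have proportion ¼ (independent assortment); at θ = 0 only the parental haplotypes appear.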

LOD-score Linkage Analysis
LOD(θ*) = log10 of the odds ratio L, where L = P(data | θ*) / P(data | θ = ½)
LOD(θ*) measures the relative strength of the data for θ = θ* rather than θ = ½
Can compute LOD(θ) at several values of θ
Can find the value of θ maximizing the LOD
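As a sketch of the LOD calculation, consider the simplest case, which is an assumption not spelled out on the slide: n fully informative, phase-known meioses of which k are recombinant, so that P(data | θ) is proportional to θ^k (1-θ)^(n-k). The counts and function names below are illustrative.

```python
import math


def _log10_term(count, prob):
    """count * log10(prob), with the convention 0 * log10(0) = 0."""
    if count == 0:
        return 0.0
    if prob == 0.0:
        return -math.inf
    return count * math.log10(prob)


def lod(theta, n_recomb, n_meioses):
    """LOD(theta) = log10[ P(data | theta) / P(data | 1/2) ] for phase-known meioses."""
    k, n = n_recomb, n_meioses
    log_l_theta = _log10_term(k, theta) + _log10_term(n - k, 1.0 - theta)
    log_l_half = n * math.log10(0.5)
    return log_l_theta - log_l_half


def max_lod(n_recomb, n_meioses, steps=50):
    """Evaluate the LOD on a grid over [0, 1/2] and return (best theta, best LOD)."""
    grid = [i / (2.0 * steps) for i in range(steps + 1)]
    return max(((t, lod(t, n_recomb, n_meioses)) for t in grid), key=lambda x: x[1])


# Hypothetical data: 2 recombinants among 10 informative meioses
theta_hat, best = max_lod(2, 10)
print(theta_hat, round(best, 3))
```

In this simple setting the maximizing value is the observed recombinant fraction k/n (capped at ½); general pedigrees need the full likelihood rather than this binomial form.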

IBD Allele Sharing

Allele-sharing Methods
Based on the number (or proportion) of alleles shared identical by descent (IBD) by related individuals
Can be done either assuming (likelihood-based) or not assuming (nonparametric) a genetic mode of inheritance for a trait

Errors
Genotyping errors can result in false positive or false negative findings
Data checking/cleaning is necessary (although there are approaches which model error)
Must be especially careful with SNP genotypes, because errors often pass simple Mendelian checks
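To illustrate why SNP errors often pass simple Mendelian checks, here is a sketch of a trio consistency check for a biallelic marker with alleles M and m. It is a simplified illustration: it treats every wrong genotype call as equally likely, which is an assumption, and the function names are hypothetical.

```python
from itertools import product

GENOTYPES = ["MM", "Mm", "mm"]


def mendel_consistent(father, mother, child):
    """True if the child's genotype can arise from the parents' genotypes."""
    possible = {"".join(sorted(a + b)) for a, b in product(father, mother)}
    return "".join(sorted(child)) in possible


def undetected_error_fraction(father, mother, true_child):
    """Fraction of wrong child calls that still pass the Mendelian check."""
    wrong = [g for g in GENOTYPES if g != true_child]
    passed = [g for g in wrong if mendel_consistent(father, mother, g)]
    return len(passed) / len(wrong)


# Both parents heterozygous: every child genotype is Mendelian-consistent.
print(undetected_error_fraction("Mm", "Mm", "Mm"))
# Opposite homozygous parents: only Mm is possible, so any miscall is caught.
print(undetected_error_fraction("MM", "mm", "Mm"))
```

With only two alleles, many parental combinations allow most or all offspring genotypes, so a miscalled genotype frequently looks legitimate.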

Disease-Marker Association
A marker locus is associated with a disease if the distribution of genotypes at the marker locus in disease-affected individuals differs from the distribution in the general population
A specific allele may be positively associated (over-represented in affecteds) or negatively associated (under-represented)

Examples: Alzheimer’s
Alzheimer’s disease and ApoE
            E4 present   E4 absent
Patients        58           33
Controls        16           55
The E4 allele appears to be positively associated with Alzheimer’s disease:
Odds ratio = (58/16)/(33/55) ≈ 6
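The odds ratio for the ApoE table can be checked with a few lines. The function and argument names are illustrative; the counts are those given above.

```python
def odds_ratio(case_present, case_absent, control_present, control_absent):
    """Odds ratio for a 2x2 table: (a/c) / (b/d), equivalently (a*d) / (b*c)."""
    return (case_present / control_present) / (case_absent / control_absent)


# Patients: 58 with E4, 33 without; controls: 16 with E4, 55 without
print(round(odds_ratio(58, 33, 16, 55), 2))
```

A value above 1 indicates that E4 is over-represented among patients, i.e. a positive association.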

Examples: HLA
Disease                         Allele   RR (relative risk)
Ankylosing spondylitis          B27      87
Myasthenia gravis               B8       4.1
Systemic lupus erythematosus             2.1
Hemochromatosis                 A3       8.2
(and many more…)

Linkage Disequilibrium
Diagram: marker locus (alleles M, m) is connected by LD to the disease locus (alleles D, d), which is connected by penetrance to disease

Linkage Disequilibrium
Concept of the ‘historical recombinant’
Explanations for observed association between marker and disease:
Marker locus may be a disease susceptibility locus
Marker locus may be linked to a disease susceptibility locus
Spurious result due, e.g., to admixture, population stratification, heterogeneity

Linkage and LD
Mutation occurs: allele D is created
Nearby marker: allele M was nearby
D and M are subsequently transmitted together

Candidate Polymorphism Testing
Linkage and LD assume markers have an indirect association with the trait
Large SNP collections may allow testing for direct, physiologically relevant associations with the trait

Diagnostics/Risk Profiling
Identified SNP associations can potentially be used to develop diagnostic tools
Applicability will require large-scale studies, since most diseases of interest now are influenced by many genetic and nongenetic factors

Response Prediction
Related to diagnosis/risk assessment
Strategy: stratify populations to improve effectiveness of interventions
Pharmaceutical companies are especially interested in this:
Aim to identify those likely to respond
Predict toxicity reactions in susceptible individuals
Response to any kind of substance; creation of ‘functional foods’

Homogeneity Testing
Test to protect against false inferences about the relationship between endpoints (e.g. disease) and risk factors
Assess generalizability of results
Can assess the homogeneity of the genetic background of study participants using a panel of randomly distributed SNPs

Gene Function Identification
Alternative to other experimental procedures (e.g. knock-outs, which cannot be used in humans)
Studies to compare individuals with and without naturally occurring disease-predisposing genetic profiles

Haplotype Variation
The large databases already available (and increasing in size) should allow characterization of haplotype variation across the genome in different populations
Can help population geneticists trace evolution and reveal connections between populations/ethnic groups