Single Nucleotide Polymorphism Copy Number Variations and SNP Array Xiaole Shirley Liu and Jun Liu.

Presentation transcript:


2 Outline Definition and motivation SNP distribution and characteristics –Allele frequency, LD, population stratification SNP discovery (unknown) and genotyping (known) –CNV detection

3 Polymorphism Polymorphism: sites/genes with “common” variation, where the less common allele has frequency ≥1%; otherwise the variant is called rare and not polymorphic. First discovered (early 1980s): restriction fragment length polymorphism. Some definitions: –Locus: position on a chromosome where a sequence or gene is located –Allele: alternative form of DNA at a locus

4 Polymorphism Single Nucleotide Polymorphism –Occasionally short (1-3 bp) indels are considered SNPs too –Arise from a DNA-replication mistake in an individual germ-line cell, then transmitted –~90% of human genetic variation Copy number variations –May or may not be genetic

5 Why Should We Care Disease gene discovery –Association studies: certain SNPs confer susceptibility to diabetes –Chromosome aberrations: duplication / deletion might cause cancer Personalized Medicine –Drug only effective if you carry a particular allele


8 SNP Distribution Most common type of variation, about 1 SNP per few hundred bp –Balance between mutation introduction rate and polymorphism loss rate –Most mutations are lost within a few generations 2/3 are C/T differences In non-coding regions, often fewer SNPs in more conserved regions In coding regions, often more synonymous than non-synonymous SNPs

9 SNP Characteristics: Allele Frequency Distribution Most alleles are rare (minor allele frequency < 10%)

10 Mode of inheritance

11 SNP Characteristics: Allele Frequency Distribution Nucleotide diversity –Average fraction of nucleotides that differ between a pair of randomly chosen alleles
AACCG GCTTA GCCGA GTTAT
AAGCG GCTTA GCCGA GATAT
AACCG GCTAA GCCGA GTTAT
AAGCG GCTTA GCCGA GTTAT
AACCG GCTTA GCCGA GATAT
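
The definition above can be checked directly on the five example sequences. The following is a minimal sketch: the function name `nucleotide_diversity` is made up for illustration, and the sequences are the slide's, with the spacing removed.

```python
from itertools import combinations

# The five example sequences from the slide, with the spaces removed.
sequences = [
    "AACCGGCTTAGCCGAGTTAT",
    "AAGCGGCTTAGCCGAGATAT",
    "AACCGGCTAAGCCGAGTTAT",
    "AAGCGGCTTAGCCGAGTTAT",
    "AACCGGCTTAGCCGAGATAT",
]

def nucleotide_diversity(seqs):
    """Mean fraction of differing sites over all unordered pairs of sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions in every pair, then average per pair and per site.
    total_diff = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diff / (len(pairs) * length)

pi = nucleotide_diversity(sequences)
print(round(pi, 3))  # 16 differences / (10 pairs * 20 sites) = 0.08
```

Three sites vary (positions 3, 9 and 17), and together they account for all 16 pairwise differences.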

12 SNP Characteristics: Hardy-Weinberg equilibrium (HWE)
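
As a reminder of what HWE states: under random mating, a biallelic locus with allele frequencies p and q = 1 - p has expected genotype frequencies p², 2pq and q². Below is a minimal sketch of the usual chi-square goodness-of-fit check. The genotype counts are hypothetical and `hwe_chi_square` is an illustrative name, not something from the slides.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2

# Hypothetical counts of AA, AB and BB individuals (an assumption for illustration).
p, expected, chi2 = hwe_chi_square(400, 400, 200)
print(p, expected, chi2)
```

Because the allele frequency is estimated from the same data, the statistic is compared against a chi-square distribution with 1 degree of freedom (critical value 3.84 at the 5% level).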

13 SNP Characteristics: Linkage Disequilibrium Equilibrium vs. disequilibrium LD: if alleles occur together more often than can be accounted for by chance, this indicates that the two alleles are physically close on the DNA –In mammals, LD is often lost at ~100 kb –In fly, LD often decays within a few hundred bases

14 SNP Characteristics: Linkage Disequilibrium Statistical Significance of LD –Chi-square test with 1 df –e_ij = n_i. × n_.j / n_T
        B1     B2     Total
A1      n_11   n_12   n_1.
A2      n_21   n_22   n_2.
Total   n_.1   n_.2   n_T
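
The test can be sketched in a few lines. This is a minimal illustration using the expected counts e_ij defined above and the standard Pearson statistic, the sum over cells of (n_ij - e_ij)² / e_ij. The haplotype counts are hypothetical, and `ld_chi_square` is an illustrative name.

```python
def ld_chi_square(n11, n12, n21, n22):
    """Pearson chi-square statistic for a 2x2 table of haplotype counts."""
    observed = [[n11, n12], [n21, n22]]
    row = [n11 + n12, n21 + n22]      # n_1. and n_2.
    col = [n11 + n21, n12 + n22]      # n_.1 and n_.2
    total = sum(row)                  # n_T
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            e = row[i] * col[j] / total   # e_ij = n_i. * n_.j / n_T
            chi2 += (observed[i][j] - e) ** 2 / e
    return chi2

# Hypothetical counts for haplotypes A1B1, A1B2, A2B1, A2B2.
chi2 = ld_chi_square(40, 10, 10, 40)
print(chi2)  # every e_ij is 25, so chi2 = 4 * 15**2 / 25 = 36.0
```

With 1 degree of freedom, a value above 3.84 rejects independence of the two loci at the 5% level.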

15 SNP Characteristics: Linkage Disequilibrium Three ways to calculate LD [Figure: observed vs. expected haplotype frequencies]
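
The transcript does not preserve which three measures the slide showed. The sketch below assumes the three most commonly used ones, D, D′ and r², all built from the gap between the observed haplotype frequency and the frequency expected under independence. The counts are hypothetical and `ld_measures` is an illustrative name.

```python
def ld_measures(n11, n12, n21, n22):
    """Return (D, |D'|, r^2) from counts of haplotypes A1B1, A1B2, A2B1, A2B2.

    Assumes both loci are polymorphic in the sample (no allele frequency is 0 or 1).
    """
    n = n11 + n12 + n21 + n22
    p_ab = n11 / n                 # observed frequency of haplotype A1B1
    p_a = (n11 + n12) / n          # frequency of allele A1
    p_b = (n11 + n21) / n          # frequency of allele B1
    d = p_ab - p_a * p_b           # observed minus expected
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

d, d_prime, r2 = ld_measures(40, 10, 10, 40)
print(d, d_prime, r2)
```

For a 2x2 haplotype table, the Pearson chi-square statistic equals n_T × r², which ties r² to the significance test.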

16 SNP Characteristics: Linkage Disequilibrium Haplotype block: a cluster of linked SNPs. Haplotype boundary: sequence shows strong LD within blocks and no LD between blocks; the boundaries reflect recombination hotspots. Haplotype size distribution

17 SNP Characteristics: Linkage Disequilibrium Can see haplotype block: a cluster of linked SNPs

18 SNP Characteristics: Linkage Disequilibrium [C/T] [A/G] T X C [A/C] [T/A] –Possible haplotypes: 2^4 = 16 –In reality, a few common haplotypes explain 90% of the variation Tagging SNPs: –SNPs that capture most of the variation in haplotypes –Remove redundancy
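
The counting argument and the idea of redundancy can be sketched as follows. The four biallelic sites are the slide's. The observed haplotypes and the helper `tag_snps` are hypothetical, and treating only perfectly correlated SNPs as redundant is a simplification of real tag-SNP selection.

```python
from itertools import product

# The four biallelic SNPs from the slide: [C/T] [A/G] [A/C] [T/A].
snps = [("C", "T"), ("A", "G"), ("A", "C"), ("T", "A")]
possible = ["".join(h) for h in product(*snps)]
print(len(possible))  # 2**4 = 16

# Hypothetical sample in which only three of the 16 haplotypes occur.
observed = ["CAAT", "CAAT", "TGAT", "TGCA", "CAAT", "TGCA"]

def tag_snps(haplotypes):
    """Keep one SNP index per group of perfectly correlated SNPs."""
    tags = []
    seen = set()
    for idx in range(len(haplotypes[0])):
        column = [h[idx] for h in haplotypes]
        # Two SNPs are redundant if they split the haplotypes into the same two groups.
        pattern = tuple(allele == column[0] for allele in column)
        if pattern not in seen:
            seen.add(pattern)
            tags.append(idx)
    return tags

print(tag_snps(observed))  # [0, 2]: SNP 1 duplicates SNP 0, SNP 3 duplicates SNP 2
```

Here genotyping two of the four SNPs is enough to distinguish all the observed haplotypes.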

19 SNP Characteristics: Population Stratification Population stratification: individuals are sampled from two genetically different populations; the stratification may be environmental, cultural, or genetic. It can give spurious results in case-control association studies – the example of “chopstick genes”
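
A small numeric sketch shows how the spurious signal arises. All counts are hypothetical: in each population the allele and the trait (chopstick use) are independent, but both are common in one population and rare in the other.

```python
def odds_ratio(case_carrier, case_non, ctrl_carrier, ctrl_non):
    """Odds ratio of carrying the allele in cases versus controls."""
    return (case_carrier * ctrl_non) / (case_non * ctrl_carrier)

# (case carriers, case non-carriers, control carriers, control non-carriers)
# Population A: 90% carry the allele and 90% use chopsticks, independently.
pop_a = (810, 90, 90, 10)
# Population B: 10% carry the allele and 10% use chopsticks, independently.
pop_b = (10, 90, 90, 810)
# Pooling the two populations into a single case-control study.
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

print(odds_ratio(*pop_a))    # 1.0, no association within population A
print(odds_ratio(*pop_b))    # 1.0, no association within population B
print(odds_ratio(*pooled))   # about 20.75, a strong but spurious association
```

The pooled association reflects only the difference between the populations, which is why association studies match or adjust for ancestry.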

20 Using genetic variation to study populations

21 SNP Discovery Methods Sequencing individuals for differences: too costly. First check whether big regions have SNPs –Basic idea: denature and re-anneal two samples, detect heteroduplex –Can pool samples (e.g. 10 Africans with 10 Caucasians) to speed screening. Resequence to verify. dbSNP: 12M RefSNPs, 6M validated

22 SNP Genotyping For a known locus TT C/A AG, does this individual have CC, AA or AC? Many methods. Hybridization-based methods –Dynamic allele-specific hybridization –Molecular beacons –SNP-array chip (simultaneously genotype thousands of SNPs) Enzyme-based methods –RFLP –PCR-based methods –Flap endonuclease –Primer extension –Oligonucleotide ligation assay Other methods (based on physical properties of DNA)

23 SNP Array One SNP at a time or genome-wide (SNP array)

24 40 Probes Used Per SNP Allele call –AA, BB, AB Signal –Theoretically 1A+1B, 2A, 2B –But could have 1A+3B (amplified!)

25 SNP Chip for LOH Loss of Heterozygosity: tumor suppressor gene inactivation by allelic loss in cancers [Figure panels: Normal, First genetic hit, Cancer; LOH]

26 Making LOH Calls Compare the cancer and normal SNP profile of the same individual
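
The comparison can be sketched per SNP. This is a minimal illustration that assumes genotypes have already been called as "AA", "AB" or "BB" in both samples. The function name, the labels it returns and the example calls are hypothetical.

```python
def loh_call(normal, tumor):
    """Classify one SNP by comparing the normal and tumor genotype calls."""
    if normal != "AB":
        # A SNP that is homozygous in the normal sample cannot reveal allelic loss.
        return "non-informative"
    if tumor in ("AA", "BB"):
        return "LOH"
    if tumor == "AB":
        return "retention"
    return "no call"

# Hypothetical paired calls along a chromosome: (normal, tumor).
pairs = [("AB", "AB"), ("AA", "AA"), ("AB", "AA"), ("AB", "BB"), ("BB", "BB")]
calls = [loh_call(n, t) for n, t in pairs]
print(calls)
# ['retention', 'non-informative', 'LOH', 'LOH', 'non-informative']
```

Only SNPs that are heterozygous in the normal sample are informative, so LOH regions are usually inferred from runs of consecutive calls rather than from single SNPs.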

27 SNP Array for CNV Collect normal / diseased samples on SNP arrays Probe normalization, background subtraction Use HMM to infer CNV
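
The HMM step can be sketched as a Viterbi decoding over three copy-number states. This is an illustration rather than the method behind the slide: the states, the Gaussian emission model on log2(sample/normal) intensity ratios, the parameter values and the input data are all assumptions.

```python
import math

STATES = ["loss", "normal", "gain"]
# Assumed mean log2 ratios for 1, 2 and 3 copies relative to a diploid reference.
MEANS = {"loss": math.log2(1 / 2), "normal": 0.0, "gain": math.log2(3 / 2)}
SIGMA = 0.2   # assumed noise standard deviation
STAY = 0.95   # assumed probability of keeping the same state at the next probe

def log_emission(x, state):
    """Log density of a Gaussian centred on the state's mean log2 ratio."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2 * math.pi))

def log_transition(prev, cur):
    if prev == cur:
        return math.log(STAY)
    return math.log((1 - STAY) / (len(STATES) - 1))

def viterbi(log_ratios):
    """Most likely state sequence for a series of log2 ratios along a chromosome."""
    score = {s: math.log(1 / len(STATES)) + log_emission(log_ratios[0], s)
             for s in STATES}
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (score[best_prev] + log_transition(best_prev, cur)
                              + log_emission(x, cur))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical normalized log2 ratios for 16 consecutive probes.
log_ratios = [0.05, -0.1, 0.0, 0.08, 0.6, 0.55, 0.7, 0.5,
              -0.05, 0.1, 0.0, -0.08, -0.9, -1.1, -1.0, -0.95]
path = viterbi(log_ratios)
print(path)
```

The high self-transition probability makes the decoder prefer contiguous segments, so a single noisy probe does not produce a separate CNV call.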

28 Integrate CNV with Expression to Identify oncogene MITF in melanoma

29 Summary SNP and CNV SNP distribution and characteristics –Allele frequency (minor allele > 1%) –LD: linkage ~ physical proximity –Population stratification SNP discovery: heteroduplex SNP genotyping –SNP array –CNV detection: HMM

30 Acknowledgement Stefano Monti Tim Niu Kenneth Kidd, Judith Kidd and Glenys Thomson Joel Hirschhorn Greg Gibson & Spencer Muse