Published by Philippa Phillips. Modified over 9 years ago.
1
Analyzing DNA using Microarray and Next Generation Sequencing (1)
Background
SNP Array
  Basic design
  Applications: CNV, LOH, GWAS
Deep sequencing
  Alignment and Assembly
  Applications: structural changes, GWAS
2
The chromosome
3
SNP
Variations in DNA sequence.
Single Nucleotide Polymorphism (SNP) --- a single letter change in the DNA.
Common SNPs occur every few hundred bases.
Each form is called an "allele". Almost all SNPs have only two alleles.
Allele frequencies are often different between ethnic groups.
Image: http://upload.wikimedia.org/wikipedia/commons/thumb/2/2e/Dna-SNP.svg/180px-Dna-SNP.svg.png
4
Correlations between SNPs
Why measure the SNP alleles?
DNA changes in two ways during evolution:
  Point mutation: creates SNPs.
  Recombination: happens in large segments.
Alleles of adjacent SNPs are highly dependent.
Haplotype: a group of alleles linked closely enough to be inherited mostly as a unit.
Image: http://www.evolutionpages.com/images/crossing_over.gif
5
Why SNP?
Figure 1: This diagram shows two ancestral chromosomes being scrambled through recombination over many generations to yield different descendant chromosomes. If a genetic variant marked by the A on the ancestral chromosome increases the risk of a particular disease, the two individuals in the current generation who inherit that part of the ancestral chromosome will be at increased risk. Adjacent to the variant marked by the A are many SNPs that can be used to identify the location of the variant.
Source: http://www.hapmap.org/originhaplotype.html.en
6
Why SNP?
Figure 1. Schematic model of trait aetiology. The phenotype under study, Ph, is influenced by diverse genetic, environmental and cultural factors (with interactions indicated in simplified form). Genetic factors may include many loci of small or large effect, GPi, and polygenic background. Marker genotypes, Gx, are near to (and hopefully correlated with) genetic factor, Gp, that affects the phenotype. Genetic epidemiology tries to correlate Gx with Ph to localize Gp. Above the diagram, the horizontal lines represent different copies of a chromosome; vertical hash marks show marker loci in and around the gene, Gp, affecting the trait. The red Pi are the chromosomal locations of aetiologically relevant variants, relative to Ph.
Figure labels: SNPs; the gene deciding the phenotype.
Source: Nature Genetics 26, 151 - 157 (2000)
7
SNP array The SNP array Affymetrix.com
8
SNP array
The SNP array (source: Affymetrix.com)
40 probes per SNP (20 for the forward strand and 20 for the reverse strand).
PM/MM (perfect match / mismatch) strategy.
Data summary (generating AA/AB/BB calls) omitted here.
9
SNP array
Genotype calls:
  Association analysis
  Linkage analysis
  Loss of Heterozygosity
Signal strength:
  Copy number aberration
10
CNA --- Background
Copy Number Aberration (CNA):
  A form of chromosomal aberration.
  Deviation from the regular 2 copies for some segments of the chromosomes.
  One of the key characteristics of cancer.
CNA in cancer:
  Reduces the copy number of tumor-suppressor genes.
  Increases the copy number of oncogenes.
  Possibly related to metastasis.
11
CNA --- the statistician’s task High density arrays allow us to identify “focused CNA”: copy number change in small DNA segments. With the high per-probeset noise, how to achieve high sensitivity AND specificity?
12
CNA --- maximizing sensitivity/specificity
Two approaches that complement each other:
Reducing noise at the single probeset level:
  Based on dose-response (Huang et al., 2006)
  Based on sequence properties (Nannya et al., 2005)
Segmentation methods:
  Smoothing
  Hidden Markov Model-based methods
  Circular Binary Segmentation
  ...
13
HMM data segmentation Fridlyand et al. Journal of Multivariate Analysis, June 2004, V. 90, pp. 132-153 Amplified Normal Deleted
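The segmentation idea can be sketched as a three-state HMM (deleted, normal, amplified) decoded with the Viterbi algorithm over per-probe log2 ratios. This is a minimal illustration, not the method of Fridlyand et al.: the state means, the noise standard deviation `sd`, and the stay probability `stay` are all assumed values chosen for the example.

```python
import math

STATES = ("deleted", "normal", "amplified")
# Assumed log2-ratio means for 1, 2 and 3 copies; illustrative values only.
MEANS = {"deleted": -1.0, "normal": 0.0, "amplified": 0.58}


def log_gauss(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_segment(log_ratios, sd=0.3, stay=0.99):
    """Return the most likely state for each probe (Viterbi path)."""
    if not log_ratios:
        return []
    switch = (1.0 - stay) / (len(STATES) - 1)
    log_trans = {
        (a, b): math.log(stay if a == b else switch)
        for a in STATES
        for b in STATES
    }
    # Uniform start probabilities (an assumption).
    score = {
        s: math.log(1.0 / len(STATES)) + log_gauss(log_ratios[0], MEANS[s], sd)
        for s in STATES
    }
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best previous state for landing in state s at this probe.
            prev = max(STATES, key=lambda p: score[p] + log_trans[(p, s)])
            pointers[s] = prev
            new_score[s] = score[prev] + log_trans[(prev, s)] + log_gauss(x, MEANS[s], sd)
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With a high stay probability, a single noisy probe is not enough to open a new segment, while a run of shifted probes is. That is how segmentation trades per-probe noise against sensitivity.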
14
Forward-backward fragment assembling
15
Some examples:
Top: model cell line, 3-copy segment in chromosome 9
Bottom: cancer sample
16
LOH
Loss of Heterozygosity (LOH) happens in segments of DNA.
Source: Keith W. Brown and Karim T.A. Malik, 2001, Expert Reviews in Molecular Medicine
17
LOH
On a SNP array, LOH will yield identical calls (AA or BB, rather than AB) for a number of consecutive SNPs.
Source: Discov Med. 2011 Jul;12(62):25-32.
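A minimal sketch of how runs of consecutive homozygous calls could be flagged as candidate LOH regions. The function name and the `min_run` threshold are hypothetical. Long homozygous runs can also occur without LOH, so in practice a comparison against a matched normal sample would likely be needed.

```python
def find_loh_runs(calls, min_run=5):
    """Return (start, end) index pairs, end exclusive, for runs of
    consecutive homozygous calls (AA or BB) at least min_run SNPs long."""
    runs = []
    start = None
    for i, call in enumerate(calls):
        if call in ("AA", "BB"):
            if start is None:
                start = i  # a homozygous run begins here
        else:
            # A heterozygous (or missing) call ends the current run.
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(calls) - start >= min_run:
        runs.append((start, len(calls)))
    return runs


calls = ["AB", "AA", "BB", "AA", "AA", "BB", "AB", "AA", "AB"]
print(find_loh_runs(calls))  # [(1, 6)]
```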
18
GWAS
Image: © Pasieka, Science Photo Library
http://www.mpg.de/10680/Modern_psychiatry
19
GWAS
20
Nature Genetics 41, 986 - 990 (2009) GWAS Genome-wide association study identifies variants in the ABO locus associated with susceptibility to pancreatic cancer
21
DNA sequencing
22
Background
26
Alignment and Assembly
When a reference genome is available --- Alignment
  Can rely on the existing reference genome as a blueprint.
  Align the short reads onto the reference genome.
  Need a few-fold coverage to cover most regions.
Sequencing a whole new genome --- Assembly
  Overlaps between reads are required to construct the genome.
  Because the reads are short, ~30-fold coverage is needed.
  At 3G of data per run, a new genome of roughly human size needs 30 runs.
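The coverage arithmetic can be checked with a short sketch. Two assumptions go beyond the slide: the genome size is taken as about 3 Gb for a human-sized genome, and reads are assumed to land uniformly at random, so the expected fraction of bases covered at least once at coverage c is 1 - exp(-c) (the Lander-Waterman approximation). The function names are hypothetical.

```python
import math


def expected_fraction_covered(coverage):
    """Expected fraction of the genome hit by at least one read,
    assuming uniformly random read placement (Poisson approximation)."""
    return 1.0 - math.exp(-coverage)


def runs_needed(genome_size_bp, fold_coverage, bases_per_run):
    """Number of sequencing runs needed to reach the target coverage."""
    total_bases = genome_size_bp * fold_coverage
    return (total_bases + bases_per_run - 1) // bases_per_run  # ceiling division


for c in (1, 3, 5):
    print(c, round(expected_fraction_covered(c), 4))

# 3 Gb genome (assumed), 30-fold coverage, 3 Gb of data per run.
print(runs_needed(3_000_000_000, 30, 3_000_000_000))  # 30
```

Under this model, about 95% of bases are expected to be covered at 3-fold coverage, which is consistent with "a few-fold coverage" sufficing for alignment.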
27
Hash table-based alignment. Similar to BLAST in principle. (1) Find potential locations: (2) Local alignment.
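The two steps can be sketched as follows. This is a simplified illustration, not the implementation of any particular aligner: step (1) uses a k-mer hash table to vote for candidate start positions, and step (2) is replaced by an ungapped mismatch count instead of a true local alignment. The function names and the parameters `k` and `max_mismatches` are assumptions.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Hash table mapping each k-mer to the positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_starts(read, index, k):
    """Step 1: each matching k-mer votes for the read start it implies."""
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if start >= 0:
                votes[start] += 1
    return votes


def best_hit(read, reference, index, k, max_mismatches=2):
    """Step 2 (simplified): score each candidate by ungapped mismatches.
    Returns (start, mismatches) or None if no candidate is good enough."""
    best = None
    for start, _ in candidate_starts(read, index, k).most_common():
        window = reference[start:start + len(read)]
        if len(window) < len(read):
            continue  # read would run off the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


reference = "ACGTTGCATGCCGATAGCTAGGCTA"  # made-up sequence
index = build_index(reference, k=4)
print(best_hit("TGCCGTTAGC", reference, index, k=4))  # (8, 1)
```

Only positions supported by at least one exact k-mer match are examined, which is what keeps the expensive alignment step off most of the genome.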
28
Alignment and Assembly From read to graph:
29
Alignment and Assembly
30
de Bruijn graph assembly Red: read error.
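A minimal sketch of building the graph from reads: each k-mer becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix, and edge counts record how often the k-mer was seen. Treating rarely seen edges as likely read errors is an assumed heuristic for illustration, and `min_count` is a hypothetical parameter.

```python
from collections import Counter


def de_bruijn_edges(reads, k):
    """Count k-mers; each k-mer is an edge (prefix, suffix) of (k-1)-mers."""
    edges = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges


def drop_rare_edges(edges, min_count=2):
    """Keep only edges seen at least min_count times."""
    return {edge: n for edge, n in edges.items() if n >= min_count}


# Made-up reads; the last one carries a single-base error (T -> A).
reads = ["ACGTAC", "CGTACG", "ACGTAC", "ACGAAC"]
edges = de_bruijn_edges(reads, k=4)
print(edges[("ACG", "CGT")])  # 2, supported by two reads
print(edges[("ACG", "CGA")])  # 1, created by the erroneous read
print(sorted(drop_rare_edges(edges)))
```

A single read error creates a short side branch of low-count edges. In this toy example the count filter also drops the correct but thinly covered edge TAC -> ACG, which is one reason assembly needs deep coverage.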
31
Alignment and Assembly de Bruijn graph assembly
32
Alignment and Assembly de Bruijn graph assembly
33
Whole genome/exome/transcriptome sequencing
34
Genomics
Whole genome sequencing detects all variants (SNP alleles, rare variants, mutations).
These could be associated with disease:
  Rare variants (burden testing by collapsing by gene)
  De novo mutations (need family tree)
  Rare Mendelian disorders
  Structural variants in cancer
35
Structural changes
Identification of translocations from discordant paired-end reads.
Source: Cancer Genetics 206 (2014) 432-440
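One way to sketch the idea: collect read pairs whose two mates map to different chromosomes, bin the mapped positions, and report bin pairs supported by several read pairs. The input tuple layout, `bin_size`, and `min_support` are assumptions for illustration. Same-chromosome discordance (unexpected orientation or insert size) signals other kinds of structural change and is not handled here.

```python
from collections import Counter


def translocation_candidates(pairs, bin_size=1000, min_support=3):
    """pairs: iterable of (chrom1, pos1, chrom2, pos2) for mapped mates.
    Returns {((chromA, binA), (chromB, binB)): supporting_pair_count}."""
    clusters = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        if chrom1 == chrom2:
            continue  # concordant by chromosome; ignored in this sketch
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        # Sort so that mate order does not matter.
        clusters[tuple(sorted((a, b)))] += 1
    return {key: n for key, n in clusters.items() if n >= min_support}


# Made-up coordinates on hypothetical chromosomes chrA/chrB/chrC.
pairs = [
    ("chrA", 50_100, "chrB", 7_200),
    ("chrA", 50_300, "chrB", 7_500),
    ("chrB", 7_700, "chrA", 50_900),
    ("chrA", 100, "chrA", 400),
    ("chrB", 5_000, "chrC", 9_000),
]
print(translocation_candidates(pairs))
# {(('chrA', 50), ('chrB', 7)): 3}
```

Requiring several supporting pairs helps separate a real breakpoint from isolated mapping errors, such as the single chrB/chrC pair above.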
36
Structural changes
CNV by depth of coverage
Source: Cancer Genetics 206 (2014) 432-440
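A minimal sketch of the depth-of-coverage idea, assuming read counts have already been tallied in fixed genomic windows for a test sample and a reference sample. The function name, the library-size normalization, and the `pseudocount` are illustrative assumptions, not a specific published pipeline.

```python
import math


def log2_ratios(sample_counts, reference_counts, pseudocount=1.0):
    """Per-window log2 ratio of normalized read depth (sample / reference).
    Values well above 0 suggest a gain, well below 0 a loss."""
    sample_total = sum(sample_counts)
    reference_total = sum(reference_counts)
    ratios = []
    for s, r in zip(sample_counts, reference_counts):
        # Normalize by library size so sequencing depth cancels out;
        # the pseudocount avoids taking log of zero in empty windows.
        s_norm = (s + pseudocount) / sample_total
        r_norm = (r + pseudocount) / reference_total
        ratios.append(math.log2(s_norm / r_norm))
    return ratios


# Made-up counts: the third window has doubled depth in the sample.
sample = [100, 100, 200, 100]
reference = [100, 100, 100, 100]
print([round(x, 2) for x in log2_ratios(sample, reference)])
```

The resulting per-window ratios are noisy in the same way as array probe intensities, so they would typically be passed to a segmentation step before calling copy number changes.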
37
Structural changes
Source: Cancer Genetics 206 (2014) 432-440
38
http://www.geneious.com/features/sequence-analysis-annotation-prediction Genotype calling
39
Medical Genomics Nature Reviews Genetics 11, 415 Example: Extreme-case sequencing to find rare variants associated with a disease.
40
GWAS