Next Generation Sequencing

Slides:



Advertisements
Similar presentations
Lecture 41 Prof Duncan Shaw. Genetic Variation Already know that genes have different alleles - how do these arise? Process of mutation - an alteration/change.
Advertisements

Manish Anand Nihar Sheth Jim Costello Univ. of Indiana
Evolution of genomes.
Single Nucleotide Polymorphism And Association Studies Stat 115 Dec 12, 2006.
SNP Applications statwww.epfl.ch/davison/teaching/Microarrays/snp.ppt.
Single Nucleotide Polymorphism Copy Number Variations and SNP Array Xiaole Shirley Liu and Jun Liu.
Fatchiyah, PhD Dept Biology UB Fatchiyah.lecture.ub.ac.id
Outline to SNP bioinformatics lecture
CS177 Lecture 9 SNPs and Human Genetic Variation Tom Madej
Something related to genetics? Dr. Lars Eijssen. Bioinformatics to understand studies in genomics – São Paulo – June Image:
Polymorphisms – SNP, InDel, Transposon BMI/IBGP 730 Victor Jin, Ph.D. (Slides from Dr. Kun Huang) Department of Biomedical Informatics Ohio State University.
Mutation and DNA Mutation = change(s) in the nucleotide/base sequence of DNA; may occur due to errors in DNA replication or due to the impacts of chemicals.
Section 1: Mutation and Genetic Change
Haplotype Discovery and Modeling. Identification of genes Identify the Phenotype MapClone.
Introduction Basic Genetic Mechanisms Eukaryotic Gene Regulation The Human Genome Project Test 1 Genome I - Genes Genome II – Repetitive DNA Genome III.
- any detectable change in DNA sequence eg. errors in DNA replication/repair - inherited ones of interest in evolutionary studies Deleterious - will be.
Analyzing DNA Differences PHAR 308 March 2009 Dr. Tim Bloom.
Genetics-multistep tumorigenesis genomic integrity & cancer Sections from Weinberg’s ‘the biology of Cancer’ Cancer genetics and genomics Selected.
Doug Brutlag 2011 Genomics & Medicine Doug Brutlag Professor Emeritus of Biochemistry &
SNPs Daniel Fernandez Alejandro Quiroz Zárate. A SNP is defined as a single base change in a DNA sequence that occurs in a significant proportion (more.
The Biology and Genetic Base of Cancer. 2 (Mutation)
CS177 Lecture 10 SNPs and Human Genetic Variation
SNP Haplotypes as Diagnostic Markers Shrish Tiwari CCMB, Hyderabad.
National Taiwan University Department of Computer Science and Information Engineering Pattern Identification in a Haplotype Block * Kun-Mao Chao Department.
Polymorphism Haixu Tang School of Informatics. Genome variations underlie phenotypic differences cause inherited diseases.
Julia N. Chapman, Alia Kamal, Archith Ramkumar, Owen L. Astrachan Duke University, Genome Revolution Focus, Department of Computer Science Sources
Lecture 6. Functional Genomics: DNA microarrays and re-sequencing individual genomes by hybridization.
February 20, 2002 UD, Newark, DE SNPs, Haplotypes, Alleles.
The International Consortium. The International HapMap Project.
In The Name of GOD Genetic Polymorphism M.Dianatpour MLD,PHD.
Single nucleotide polymorphisms and Large scale variation
Linkage Disequilibrium and Recent Studies of Haplotypes and SNPs
Microbial Genetics.  In bacteria genetic transfer (recombination) can happen three ways:  Transformation  Transduction  Conjugation  The result is.
Genes in ActionSection 1 Section 1: Mutation and Genetic Change Preview Bellringer Key Ideas Mutation: The Basis of Genetic Change Several Kinds of Mutations.
Notes: Human Genome (Right side page)
The Haplotype Blocks Problems Wu Ling-Yun
1 Finding disease genes: A challenge for Medicine, Mathematics and Computer Science Andrew Collins, Professor of Genetic Epidemiology and Bioinformatics.
MEDICAL GENETICS.
Interpreting exomes and genomes: a beginner’s guide
Single Nucleotide Polymorphisms (SNPs
Genomic Analysis: GWAS
SNP Detection Congtam Pham 2/24/04 Dr. Marth’s Class.
Common variation, GWAS & PLINK
Nucleotide variation in the human genome
Molecular mechanism of mutation
Section 1: Mutation and Genetic Change
Consideration for Planning a Candidate Gene Association Study With TagSNPs Shehnaz K. Hussain, PhD, ScM Epidemiology 243: Molecular.
Genetic Variation Genetic Variation in Populations
Chapter 14 GENETIC VARIATION.
RFLP analysis RFLP= Restriction fragment length polymorphism
School of Pharmacy, University of Nizwa
Introduction to bioinformatics lecture 11 SNP by Ms.Shumaila Azam
Gene Hunting: Design and statistics
What makes a mutant?.
PLANT BIOTECHNOLOGY & GENETIC ENGINEERING (3 CREDIT HOURS)
Heredity Lesson 8.
Genome-wide Associations
By Michael Fraczek and Caden Boyer
Sex Chromosomes So far, what do you know about sex chromosomes? In addition to their role in determining sex, the sex chromosomes, especially X chromosomes,
“TaqMan genotyping Assay’’
Sexual reproduction creates unique combinations of genes.
School of Pharmacy, University of Nizwa
Medical genomics BI420 Department of Biology, Boston College
Medical genomics BI420 Department of Biology, Boston College
The Content of the Genome
Haplotypes When the presence of two or more polymorphisms on a single chromosome is statistically correlated in a population, this is a haplotype Example.
BF528 - Whole Genome Sequencing and Genomic Variation
Reminder The AP Exam registration is open in Naviance. The Exam is on Monday, May 13. I’ll let you know when the next test/homework will be.
SNPs and CNPs By: David Wendel.
Analysis of protein-coding genetic variation in 60,706 humans
Presentation transcript:

Next Generation Sequencing SNP/Haplotype Cancer Genomes Complete Genomics Inc Allen.M.Chen@gmail.com UC Berkeley, 2011/10/19 (Wed) @ Cyagen, GuangZhou

Part 1 SNPs and Haplotypes

SNP Concepts SNPs – what are they? Why are SNPs important? SNPs and linkage disequilibrium Common SNPs / haplotype blocks SNPs / Haplotype block – navigation Building complex traits and ‘the myth of race?’, the role of SNPs and haplotypes

SNPs - Introduction Single Nucleotide Polymorphism Occurs once in human evolution Estimate of 1 bp in 600 – 1000 bp Occur mostly in introns (2/3) Regulatory regions, leading to cancer Often lethal when in exons (1/3) Leading to a fatal amino acid substitution

What are Single Nucleotide Polymorphisms (SNPs)? ATGGTAAGCCTGAGCTGACTTAGCGT-AT ATGGTAAACCTGAGTTGACTTAGCGTCAT    SNP SNP indel SNPs result from replication errors and DNA damage They are a ‘polymorphic’ bit state at a nucleoside address

Single nucleotide polymorphisms (SNPs) * Most common genetic variation in human genome. Occur every ~1000 base pairs on average in genome * ~4 millions in database for humans

What (exactly) is a SNP ? A SNP is defined as a single base change in a DNA sequence that occurs in a significant proportion (more than 1 percent) of a large population. Occurs exactly once in human evolution Bi-allele: A/G, translate to 0/1 Polymorphism: frequency 0.1%-49%

SNPs / Polymorphisms A Single Nucleotide Polymorphism is a source of variance in a genome. A SNP ("snip") is a single base mutation in DNA. SNPs are ‘conserved’ across the genome, often in patterns called ‘haplotype blocks’ SNPs are the most simple form and most common source of genetic polymorphism in the human genome (90% of all human DNA polymorphisms are associated with SNPs).

Why are ‘SNP’ Polymorphisms Useful? It’s sometimes possible to correlate a SNP with a particular trait or disease This is known as association genetics Susceptibility to disease may also be described as an ‘unfortunate trait’ Traits are also ‘larger’ than genes SNPs in (regulatory) introgenic regions may be as important as (coding) exons

Disease resistant population Disease susceptible population Genotype all individuals for thousands of SNPs ATGATTATAG ATGTTTATAG geneX Resistant people all have an ‘A’ at position 4 in geneX, while susceptible people have a ‘T’ (A/T are the SNPs)

SNP Applications in Medicine Gene discovery and allele mapping Association-based (drug) candidate polymorphism testing of a trait pool Diagnostics / risk profiling Drug response prediction Homogeneity testing / study design Gene function identification

Genetic Polymorphism Genetic Polymorphism: A difference in DNA sequence among individuals, groups, or populations. One or more SNPs. Genetic Mutation: A change in the nucleotide sequence of a DNA molecule. Genetic mutations are one type of genetic polymorphism (but less than 1%). Polymorphism is common, mutation rare Polymorphism is the ‘stuff of variation’

Variations in Genomes SNP variations Polymorphisms Deletions Insertions Translocations Single Nucleotide Polymorphisms (SNPs), Haplotypes, Linkage Disequilibrium, and the Human Genome

Coding Region SNPs Types of coding region SNPs: Synonymous: the substitution causes no amino acid change to the protein it produces. This is also called a silent mutation. Non-Synonymous: the substitution results in an alteration of the encoded amino acid. A missense mutation changes the protein by causing a change of codon. A nonsense mutation results in a misplaced termination. One half of all coding sequence SNPs result in non-synonymous codon changes. (but half do)

SNPs and Protein Structure Single Nucleotide Polymorphisms (SNPs), Haplotypes, Linkage Disequilibrium, and the Human Genome

SIFT (Sorting Intolerant From Tolerant) Coding Changes CYP4F2:cytochrome P450, subfamily IVF, polypeptide 2 Trp (W)  Gly (G) Predicted to be tolerated Val (V)  Gly (G) Predicted not to be tolerated SNP Discovery and Genotyping Workshop, reference Ng and Henikoff, Gen. Res. 2002

Types of SNPs There are two types of nucleotide base substitutions resulting in SNPs: Transition: substitution between purines (A, G) or between pyrimidines (C, T). Constitute two thirds of all SNPs. Transversion: substitution between a purine and a pyrimidine. Ratio: 2.15

How Common are SNPs? Of the roughly 5 to 6 million SNPS Half appear in 50% of the population One quarter in 25% of the population One eighth in 12.5% of the population The pattern of inheritance of SNPs appears to follow the pattern of inheritance of haplotypes (linkage)

*The International SNP Map Working Group, Nature 409:928 - 933 (2001) Sequence Variation in Humans Population size: 6x109 (diploid) Mutation rate: 2x10–8 per bp per generation Expected “hits”: 240 (humans with a SNP per bp)  Every variant compatible with life exists in the population BUT most are vanishingly rare Compare 2 haploid genomes: 1 SNP per 1331 bp* *The International SNP Map Working Group, Nature 409:928 - 933 (2001) SNP Discovery and Genotyping Workshop

SNP projects Mine them from existing genome resources; NCBI dbSNP website Targeted SNP discovery in candidate genes HapMap study to define global SNP population Berkeley PGA - http://pga.lbl.gov/ CardioGenomics - http://www.cardiogenomics.org/ InnateImmunity - http://innateimmunity.net/ SeattleSNPs - http://pga.mbt.washington.edu/

Sequence-Based Detection and Genotyping of SNPs Jim Sloan, Tushar Bhangle (PolyPhred) Matthew Stephens, Paul Scheet (Quality Scores for SNPs) Phil Green, Brent Ewing, David Gordon (Phred, Phrap, Consed) SNP Discovery and Genotyping Workshop

SNPs and Variation In human beings, 99.9 percent of bases are same Remaining 0.1 percent makes a person unique. Different attributes / characteristics / traits how a person looks, diseases he or she develops. These variations can be: Harmless (change in phenotype) Harmful (diabetes, cancer, heart disease, Huntington's disease, and hemophilia ) Latent (variations found in coding and regulatory regions, are not harmful on their own, and the change in each gene only becomes apparent under certain conditions e.g. susceptibility to lung cancer)

Why Create SNP Profiles? Genome of each individual contains a distinct SNP pattern (haplotype block). People can be grouped based on their SNP profiles (association studies). SNP profiles may be important for identifying response to drug therapy. Correlations might emerge between certain SNP profiles and specific responses to treatment (good and bad).

Populations Based on SNPs Single Nucleotide Polymorphisms (SNPs), Haplotypes, Linkage Disequilibrium, and the Human Genome

SNPs and Haplotypes SNP: Single Nucleotide Polymorphism Haplotype: A set of closely linked genetic markers present on one chromosome which tend to be inherited together (not easily separable by recombination). G A C Set of SNP polymorphisms: a SNP haplotype

From SNP to Haplotype Polymorphism / Haplotype Correlation of characters states among polymorphic sites (across the genome). SNP patterns = ‘blocks’ Insufficient passage of time to randomize character states by meiotic recombination – patterns conserved Haplotypes are ‘blocks’ of associated SNPs Haplotypes may be ‘too recent’… Not enough time for recombination to merge SNPs Or SNP recombination may be a more difficult process

Phenotype vs Haplotype SNP SNP Phenotype Black eye Brown eye Blue eye GATATTCGTACGGA-T GATGTTCGTACTGAAT GATATTCGTACGGAAT AG- 2/6 GTA 3/6 AGA 1/6 Haplotypes 1 2 3 4 5 6 SNP Simple to measure & understand Haplotype have the advantage in the appropriate circumstances of carrying more information about the genotype-phenotype link than do the underlying SNPs. DNA Sequence

SNPs / Linkage Disequilibrium Generally speaking… SNPs should be inherited independently Following Mendelian inheritance Many SNPs appear to be co-inherited Creating ‘hot spots’ in the human genome Blocks of SNPs – ‘SNP haplotypes’ We don’t why or how, but they exist

Haplotypes: the ‘Nature’ of SNPs Visualizing haplotype blocks Haplotypes and SNP distribution Haplotypes as genomic ‘barcodes’ HapMap Project: Global Haplotype study (2005) 1000 Genome (2010) Haplotypes and traits, mapping alleles The myth of race – insights on variation

did not pass the threshold Seven unique Haplotypes in 20 Independent Samples of Chromosomes 21 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Haplotype Structure one block 26 SNPs spanning 19kb - major allele - minor allele Missing data… did not pass the threshold for data quality Seven unique Haplotypes in 20 chromosomes 1 2 3 4 5 6 7 SNP # 1 SNP # 2 The four most common patterns include 16 of the 20 chromosomes

The International HapMap Project, Nature 2003

Visualizing Haplotype Blocks http://acg.media.mit.edu/people/fry/haplotypes/

SNP HapMap Project http://www.HapMap.org/ Sequence genomes of a large number of people (n >10,000 ethnically diverse) Compare the base sequences to discover SNPs, their location, and their frequency. Generate a single map of the human genome containing all possible SNPs => SNP maps

Haplotypes in the Genome Defined as a pattern of SNPs that appears as an ‘associated block’ on one or more chromosomes There are (estimated*) to be roughly 200K to 300K haplotypes in genome Of these, most can be defined by the identity of 3 SNPs, and always < 10 * Cox et.al. Perlegen Haplotyping of chromosome 21

Golden Path / Human Variation 3 billion base pairs 6 to 10 million SNPs 200K – 300K interrelated haplotype ‘blocks’ in the human population Each block is about 7.8 K bp (this is an average) Each block contains roughly 10 SNPs (of which 1 to 3 define the haplotype)

MAKING SNPs MAKE SENSE http://learn.genetics.utah.edu/content/health/pharma/snips/

Part 2: Cancer Genome Genome Structure Changes: Loss of heterozygosity (LOH) Copy Number Variation (CNV) Structure Variation (SV);

Genetic evolution model Cancer is caused by deregulation of growth controlling molecular pathways. © C. J. Cornelisse

Chromosome Alterations Translocations Aneuploidy ( 非整倍体 ) Chromosomal amplifications Chromosomal deletions (loss of tumor suppressor via a point mutation followed by a deletion)

Loss-of-Heterozygosity (LOH) If a marker (SNP, micro-satellite) has heterozygous genotype in the normal sample but has homozygous genotype in the tumor sample from the same patient. Indicates chromosomal alteration; often related to tumor-suppressor genes

Loss of Heterozygosity Paradigm for Tumor Suppressor Gene Inactivation by Allelic Loss in Cancers Normal First genetic hit Cancer T T T OR T T T X X X X A B A B A A A Loss of Heterozygosity (LOH) © M.Meyerson

Six ways of losing the remaining good copy of a tumor suppressor gene (Rb)

Two mechanisms of LOH

Compare normal and tumor samples Normal Sample Tumor Sample Laser-Capture Microdissection Genomic DNA Purification Genomic DNA Purification Sequencing, Data Analysis

Making LOH calls

Part 3: Complete Genomics Inc. Next-generation DNA sequencing has recently empowered scientists to identify genetic variations associated with human disease at higher resolution and greater sensitivity than previously possible.

To Get the Whole Picture, Sequence the Whole Genome Two approaches are commonly employed -- exome sequencing and whole genome sequencing. Exome sequencing targets protein-coding regions comprising approximately 1% of the human genome, while whole genome sequencing uses an unbiased approach to investigate the majority of the human genome, including a comprehensive survey of coding and non-coding regions.

Exome vs Whole Genome