Motivations to study human genetic variation


• Reconstruct the evolution and history of our species.
• Understand the genetics of diseases, especially the more common complex ones such as diabetes, cancer, and cardiovascular and neurodegenerative disease.
• Allow pharmaceutical treatments to be tailored to individuals (for example, to avoid adverse reactions with a genetic basis).

Haplotype Map of the Human Genome
Goals:
• Define patterns of genetic variation across the human genome
• Guide selection of SNPs to efficiently “tag” common variants
• Public release of all data (assays, genotypes)
Phase I: 1.3 M markers in 269 people
Phase II: +2.8 M markers in 270 people

HapMap Project
• The HapMap Project tests linkage between SNPs in various sub-populations.
• For a group of linked SNPs, recombination may be rare over tens of thousands of bases.
• A few "tag SNPs" can be used to identify genotypes for groups of linked SNPs.
• This makes it possible to survey the whole genome with fewer markers (1/3 to 1/10 as many).

Haplotype
• Linkage is common in the human population, particularly in genetically isolated sub-populations.
• A group of alleles for neighboring genes on a segment of a chromosome is very often inherited together.
• Such a combination of linked alleles is known as a haplotype.
• When alleles at linked loci occur together in a population more often than expected by chance, this is called linkage disequilibrium.
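As a rough illustration of linkage disequilibrium, the sketch below computes two common summary statistics, D and r², for a pair of biallelic SNPs from a list of phased haplotypes. The function name, the input format (one string per chromosome), and the toy data are assumptions made for this example; they are not taken from the slides or from any HapMap tool.

```python
def pairwise_ld(haplotypes, i, j):
    """Return (D, r_squared) for the SNPs at string positions i and j.

    haplotypes: list of equal-length strings, one per phased chromosome
    (a hypothetical input format chosen for this sketch).
    """
    n = len(haplotypes)
    # Use the alleles seen on the first chromosome as the reference alleles.
    a, b = haplotypes[0][i], haplotypes[0][j]
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    if p_a == 1 or p_b == 1:
        raise ValueError("both SNPs must be polymorphic in the sample")
    # D: observed haplotype frequency minus the frequency expected
    # if the two alleles were independent.
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# Toy data: the two SNPs are perfectly correlated in the first sample
# and independent in the second.
linked = ["AT", "AT", "GC", "GC"]
unlinked = ["AT", "AC", "GT", "GC"]
print(pairwise_ld(linked, 0, 1))
print(pairwise_ld(unlinked, 0, 1))
```

An r² close to 1 means one SNP predicts the other almost perfectly, which is what allows a single tag SNP to stand in for its neighbors.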

Haplotypes (example)
[Figure: a chromosome region with only the SNPs shown, for three haplotypes; the two tag SNPs are in color.]
• The two SNPs in color are sufficient to identify (tag) each of the three haplotypes.
• For example, if a chromosome has alleles A and T at these two tag SNPs, then it has the first haplotype.
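The tagging idea can be sketched in a few lines of code. The three haplotype sequences and the tag positions below are invented for illustration (the figure's actual sequences are not recoverable from this transcript); they are chosen so that, as in the slide's example, alleles A and T at the two tag SNPs point to the first haplotype.

```python
# Hypothetical haplotypes over six SNPs; not the ones from the original figure.
HAPLOTYPES = {
    "hap1": "ACTGAT",
    "hap2": "GCTAGC",
    "hap3": "ACCGAC",
}
TAG_POSITIONS = (0, 2)  # assumed positions of the two tag SNPs


def build_tag_index(haplotypes, tag_positions):
    """Map each combination of tag-SNP alleles to the haplotype it identifies."""
    index = {}
    for name, seq in haplotypes.items():
        key = tuple(seq[p] for p in tag_positions)
        if key in index:
            # Two haplotypes share these tag alleles, so the tags are insufficient.
            raise ValueError(f"tags do not distinguish {index[key]} from {name}")
        index[key] = name
    return index


def identify_haplotype(index, tag_alleles):
    """Return the haplotype name for the observed tag alleles, or None if unknown."""
    return index.get(tuple(tag_alleles))


index = build_tag_index(HAPLOTYPES, TAG_POSITIONS)
print(identify_haplotype(index, ("A", "T")))
```

With these made-up sequences neither tag SNP alone separates all three haplotypes, but the pair does, so genotyping two SNPs is enough to infer the alleles at the other four.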

HapMap Samples
• 90 Yoruba individuals (30 parent-parent-offspring trios) from Ibadan, Nigeria (YRI)
• 90 individuals (30 trios) of European descent from Utah (CEU)
• 45 Han Chinese individuals from Beijing (CHB)
• 45 Japanese individuals from Tokyo (JPT)

Make Genetic Profiles
• Scan these populations with a large number of SNP markers.
• Find markers linked to drug response phenotypes.
• It is interesting, but not necessary, to identify the exact genes involved.
• This approach can work with “associated populations” and does not require detailed information on disease in the family history (pedigree).

The SNP database today (March 2010): 105,098,087
• The 1000 Genomes Project submitted 17.3M SNPs
• The 2008 SNP submissions for the James Watson genome totaled 3,542,364
• The 2008 SNP submissions for the J. Craig Venter genome totaled 4,018,050
• The 2008 SNP submissions for the individual Chinese genome totaled 5,077,954
• The 2008 SNP submissions for the individual Korean genome totaled 1,750,224
Derived from dbSNP release 130 http://www.ncbi.nlm.nih.gov/SNP/

SNPs aren’t everything: Introducing Copy Number Variations
Redon et al. Nature 2006

Copy Number Variation Dataset
• Genome Structural Variation Consortium
• Array-CGH using a whole-genome tile path array
• Median clone size ~170 kb
• All 270 HapMap individuals
• Measures amount of DNA, not RNA
• Comparison between two samples: ‘Test’ sample vs ‘Reference’ sample

Array-CGH technology

Typical Analysis Procedure
• Values are typically normalized so that the mean log2 value for the entire array (or an individual chromosome) is 0.
• Analysis consists of identifying segments where the test and reference samples have unequal copy number.

Log2 ratio = log2(test / reference)
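The normalization and log2 ratio described on these slides can be sketched as follows. This is a simplified per-clone illustration, not the consortium's pipeline: the function names, the intensity values, and the 0.3 call threshold are all assumptions made for the example, and real analyses usually segment runs of adjacent clones rather than calling each clone on its own.

```python
import math


def log2_ratios(test, reference):
    """Per-clone log2(test / reference) from two lists of positive intensities."""
    if len(test) != len(reference):
        raise ValueError("test and reference must have the same number of clones")
    return [math.log2(t / r) for t, r in zip(test, reference)]


def center_on_zero(values):
    """Shift values so that their mean is 0 (array-wide normalization)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def call_clones(values, threshold=0.3):
    """Label each clone as gain, loss, or equal; the threshold is hypothetical."""
    calls = []
    for v in values:
        if v >= threshold:
            calls.append("gain")
        elif v <= -threshold:
            calls.append("loss")
        else:
            calls.append("equal")
    return calls


# Made-up intensities: the test channel is uniformly twice as bright,
# which centering removes, leaving one gained and one lost clone.
test_signal = [200, 200, 400, 200, 100, 200]
reference_signal = [100, 100, 100, 100, 100, 100]
normalized = center_on_zero(log2_ratios(test_signal, reference_signal))
print(call_clones(normalized))
```

Centering matters because a global difference in labeling or scanning intensity between the two samples would otherwise look like a genome-wide copy number change.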

1,447 CNVRs from 270 HapMap samples

Structural Variation Project
• More than 10% of the genome sequence
Nature 447: 161-165, 2007

Copy Number Variations are ubiquitous in the human genome
• The number of genome structural variants (>1 kb) that distinguish genomes of different individuals is at least on the order of 600–900 per individual.
J.O. Korbel et al., Science 318 (2007), pp. 420–426

HapMap 3
• Merged the results from Affymetrix and Illumina chips
• Genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations
• Sequenced ten 100-kilobase regions in 692 of these individuals

• Centre d’Etude du Polymorphisme Humain collected in Utah, USA, with ancestry from northern and western Europe (CEU)
• Han Chinese in Beijing, China (CHB)
• Japanese in Tokyo, Japan (JPT)
• Yoruba in Ibadan, Nigeria (YRI)
• African ancestry in the southwestern USA (ASW)
• Chinese in metropolitan Denver, Colorado, USA (CHD)
• Gujarati Indians in Houston, Texas, USA (GIH)
• Luhya in Webuye, Kenya (LWK)
• Maasai in Kinyawa, Kenya (MKK)
• Mexican ancestry in Los Angeles, California, USA (MXL)
• Tuscans in Italy (Toscani in Italia, TSI)
CEU, ASW, MXL, MKK, and YRI

Computational detection of structural genomic variation
Direct comparison of genomes through sequence alignments
Advantages:
• All types of genomic variation can be identified, including balanced variants (inversions or translocations)
• There is no limit on resolution, and breakpoints can be defined at the nucleotide level
Problems:
• Generates many false positives due to sequence misassembly and gaps

Out of Africa
• Modern humans arose in Africa and replaced other human species across the globe.
(Scientific American, August 1999)

Out of Africa again and again Itai Yanai, 2003 Templeton, A. Nature 416 (2002): 45 - 51

• The Human Genome Project cost ~USD 3,000,000,000
• Illumina now offers a complete genome sequence from USD 50,000
• Complete Genomics will soon offer a complete genome sequence from USD 5,000
• There are now an estimated 50 complete human genome sequences

• Craig Venter, Sanger, ~$70 million
• James Watson, 454, ~$1 million
• African (HapMap), Illumina & SOLiD, $100,000
• Five Africans, Penn State University
• Chinese, Illumina
• Two Koreans
• Prof. Quake, Stanford: Nature genetics paper, $50,000, 1 week, Helicos
• Stanford team: clinical annotation of genome from “patient Zero”

The 10-gen data set