Human Population Genomics

Presentation transcript:

Human Population Genomics

How soon will we all be sequenced? Cost, killer apps, roadblocks? [Figure: cost and applications over time; 2013? 2018?]

The Hominid Lineage

Human population migrations
Out of Africa, Replacement
– Single mother of all humans (Eve) ~190,000 years ago
– Single father of all humans (Adam) ~340,000 years ago
– Humans left Africa ~50,000 years ago and replaced others (e.g., Neanderthals)
Multiregional Evolution
– Generally debunked; however,
– ~5% of the genome in Europeans and Asians is Neanderthal or Denisovan

Coalescence Y-chromosome coalescence

Why humans are so similar Out of Africa Oppenheimer S Phil. Trans. R. Soc. B 2012;367:

Some Key Definitions
Mary: AGCCCGTACG
John: AGCCCGTACG
Josh: AGCCCGTACG
Kate: AGCCCGTACG
Pete: AGCCCGTACG
Anne: AGCCCGTACG
Mimi: AGCCCGTACG
Mike: AGCCCTTACG
Olga: AGCCCTTACG
Tony: AGCCCTTACG
Alleles: G, T
Major Allele: G
Minor Allele: T
Genotypes (Mom/Dad): G/G G/T G/G T/T T/G
Recombinations: At least 1/chromosome; on average ~1/100 Mb
Linkage Disequilibrium: The degree of correlation between two SNP locations
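The definitions above can be checked directly on the ten example sequences. The following is a minimal sketch; the function name `polymorphic_sites` is an illustrative choice and not part of the original slides.

```python
from collections import Counter

# The ten aligned sequences from the slide.
SAMPLES = {
    "Mary": "AGCCCGTACG", "John": "AGCCCGTACG", "Josh": "AGCCCGTACG",
    "Kate": "AGCCCGTACG", "Pete": "AGCCCGTACG", "Anne": "AGCCCGTACG",
    "Mimi": "AGCCCGTACG", "Mike": "AGCCCTTACG", "Olga": "AGCCCTTACG",
    "Tony": "AGCCCTTACG",
}

def polymorphic_sites(sequences):
    """Return {position: Counter of alleles} for columns with more than one allele."""
    sites = {}
    for pos, column in enumerate(zip(*sequences)):
        counts = Counter(column)
        if len(counts) > 1:
            sites[pos] = counts
    return sites

sites = polymorphic_sites(SAMPLES.values())
for pos, counts in sites.items():
    # most_common() sorts alleles by count, so the first is the major allele.
    (major, n_major), (minor, n_minor) = counts.most_common(2)
    total = sum(counts.values())
    print(f"site {pos}: major={major} minor={minor} minor allele frequency={n_minor / total}")
```

With these sequences there is a single variable site (0-based position 5), where G is the major allele and T the minor allele.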

Human Genome Variation
– SNP: TGCTGAGA vs. TGCCGAGA
– Novel Sequence: TGCTCGGAGA vs. TGC GAGA
– Microdeletion: TGC--AGA vs. TGCCGAGA
– Inversion
– Mobile Element or Pseudogene Insertion
– Translocation
– Tandem Duplication
– Transposition
– Large Deletion
– Novel Sequence at Breakpoint

The Fall in Heterozygosity
$F_{ST} = \frac{H - H_{POP}}{H}$
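As a worked illustration of F_ST as a relative fall in heterozygosity, here is a small sketch for one biallelic SNP. It assumes (these are assumptions, not stated on the slide) that H is the expected heterozygosity of the pooled population, that H_POP is the average expected heterozygosity within subpopulations, and that all subpopulations have equal size. The function names are illustrative.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def fst(subpop_freqs):
    """F_ST = (H - H_POP) / H for equally sized subpopulations (assumption)."""
    p_bar = sum(subpop_freqs) / len(subpop_freqs)
    h_total = expected_heterozygosity(p_bar)          # H: pooled population
    h_pop = sum(expected_heterozygosity(p) for p in subpop_freqs) / len(subpop_freqs)  # H_POP
    if h_total == 0.0:
        return 0.0  # site is monomorphic overall; no differentiation to measure
    return (h_total - h_pop) / h_total

print(fst([0.5, 0.5]))  # identical frequencies: no differentiation
print(fst([0.2, 0.8]))  # diverged frequencies: heterozygosity falls within subpopulations
print(fst([0.0, 1.0]))  # fixed differences: complete differentiation
```

The three calls cover the range of the statistic: 0 when subpopulations share the same allele frequency, 1 when they are fixed for different alleles, and an intermediate value otherwise.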

The Neanderthal Genome
DNA from the bones of three different Neanderthals was sequenced and compared with the genomes of five modern humans from different areas of the world.
Figure 1, R. E. Green et al., Science 328, (2010)

Neanderthal Genome

Denisovan – Another human relative

Denisovan/Human Comparison

Aboriginal Australian

Benefits of Admixture

Out of Africa Revisited Ann Gibbons Science 28 January 2011: “Human uniqueness?”

The HapMap Project
Code  Population                                 Samples
ASW   African ancestry in Southwest USA          90
CEU   Northern and Western Europeans (Utah)      180
CHB   Han Chinese in Beijing, China              90
CHD   Chinese in Metropolitan Denver             100
GIH   Gujarati Indians in Houston, Texas         100
JPT   Japanese in Tokyo, Japan                   91
LWK   Luhya in Webuye, Kenya                     100
MXL   Mexican ancestry in Los Angeles            90
MKK   Maasai in Kinyawa, Kenya                   180
TSI   Toscani in Italia                          100
YRI   Yoruba in Ibadan, Nigeria                  100
Genotyping: Probe a limited number (~1M) of known highly variable positions of the human genome

Linkage Disequilibrium & Haplotype Blocks
Linkage Disequilibrium (LD): $D = P(A \text{ and } G) - p_A p_G$
Minor alleles at the two SNPs: A, G (frequencies $p_A$, $p_G$)
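The LD coefficient D can be computed from phased haplotypes as in the sketch below. The eight haplotypes are hypothetical toy data, and the r² statistic (D² divided by the product of the four allele frequencies) is an additional, commonly used normalization that does not appear on the slide.

```python
def linkage_disequilibrium(haplotypes, allele1="A", allele2="G"):
    """Return (D, r_squared) for two SNPs.

    haplotypes: list of (allele at SNP 1, allele at SNP 2) tuples, one per chromosome.
    D = P(allele1 and allele2) - p_allele1 * p_allele2
    """
    n = len(haplotypes)
    p_a = sum(1 for a, _ in haplotypes if a == allele1) / n
    p_g = sum(1 for _, g in haplotypes if g == allele2) / n
    p_ag = sum(1 for a, g in haplotypes if a == allele1 and g == allele2) / n
    d = p_ag - p_a * p_g
    denom = p_a * (1 - p_a) * p_g * (1 - p_g)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared

# Hypothetical toy sample: A tends to travel with G, and C with T.
toy = [("A", "G")] * 3 + [("A", "T")] + [("C", "G")] + [("C", "T")] * 3
d, r2 = linkage_disequilibrium(toy)
print(d, r2)
```

When the two SNPs are independent, P(A and G) equals p_A p_G and D is zero; the toy sample gives a positive D because A and G co-occur more often than expected.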

Population Sequencing – 1000 Genomes Project

Association Studies
Sample genotypes: A/G G/G A/G G/G A/A A/G A/A A/G A/A
Genotype  Control  Disease
AA        0        4
AG        3        3
GG        4        0
p-value: computed from this table of genotype counts
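One way to obtain a p-value from the genotype counts above is a chi-square test of independence. This is a sketch under assumptions: the slide does not say which test was used, the counts are read as AA 0/4, AG 3/3, GG 4/0 (control/disease), and with counts this small the chi-square approximation is rough (an exact test would usually be preferred). For a 3x2 table there are 2 degrees of freedom, for which the chi-square survival function is exactly exp(-x/2), so only the standard library is needed.

```python
import math

def chi_square_3x2(table):
    """Chi-square test of independence for a 3x2 table (rows: genotypes, columns: groups)."""
    if len(table) != 3 or any(len(row) != 2 for row in table):
        raise ValueError("expected a 3x2 table; the p-value formula assumes 2 degrees of freedom")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    p_value = math.exp(-chi2 / 2.0)  # survival function of chi-square with 2 df
    return chi2, p_value

# Rows: AA, AG, GG; columns: control, disease (counts as read from the slide).
counts = [[0, 4], [3, 3], [4, 0]]
chi2, p_value = chi_square_3x2(counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

In a genome-wide study the same test is repeated at each SNP, so the resulting p-values need a multiple-testing correction before any is called significant.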

Wellcome Trust Case Control Consortium. Nature 447, (7 June 2007); Nature 464, (1 April 2010). Many associations with small effect sizes (odds ratios < 1.5)

Heritability & Environment Bienvenu OJ, Davydow DS, & Kendler KS (2011). Psychological medicine, 41 (1), PMID:

Disease Clustering
RA vs. ATD
RA vs. MS
– No recorded co-occurrence of RA and MS
Table columns: SNP - Allele; Gene Symbol; Genetic Variation Score (GVS) for RA (NARAC), RA, AS, T1D, ATD, MS (IMSGC), MS
SNP - Allele, Gene Symbol:
rs - C, ZSCAN
rs - A, CDSN
rs - G, HLA-DMB
rs - T, TAP
rs - G, VARS
rs - C, CDSN
rs - A, NOTCH
rs - G, BTNL
rs - T, TRIM

Global Ancestry Inference Nature November 6; 456(7218): 98–101.

Ancestry Painting Danish French Spanish Mexican ALLOY: A factorial HMM for ancestry painting

Modeling population haplotypes – VLMC Browning, 2006

Phasing Browning & Browning, 2007

Identity By Descent

IBD detection (figure labels: IBD = F, IBD = T)
FastIBD: sample haplotypes for each individual, check for IBD (Browning & Browning 2011)
Parente (Rodriguez et al. 2013)
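To make the idea of checking haplotypes for IBD concrete, here is a deliberately naive sketch that reports long runs of identical alleles between two phased haplotypes. It is not FastIBD or Parente: those methods use probabilistic haplotype models, whereas this toy only finds identity by state, which is merely suggestive of IBD when the shared run is long. The function name, the threshold, and the example haplotypes are all hypothetical.

```python
def shared_segments(hap1, hap2, min_len):
    """Return [start, end) index pairs of runs where hap1 and hap2 match for >= min_len markers."""
    n = min(len(hap1), len(hap2))
    segments = []
    start = None
    for i in range(n):
        if hap1[i] == hap2[i]:
            if start is None:
                start = i              # a new matching run begins here
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None               # the mismatch ends the current run
    if start is not None and n - start >= min_len:
        segments.append((start, n))    # run extends to the end of the haplotypes
    return segments

# Hypothetical haplotypes encoded as 0/1 strings (one character per SNP).
h1 = "0101100110"
h2 = "0101101111"
segs = shared_segments(h1, h2, min_len=4)
print(segs)
```

Real IBD detectors also have to tolerate genotyping errors and phasing switches, which is why they score candidate segments probabilistically instead of requiring exact matches.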

Fixation, Positive & Negative Selection Neutral Drift Positive Selection Negative Selection How can we detect negative selection? How can we detect positive selection?

Ka/Ks ratio: Ratio of the nonsynonymous to the synonymous substitution rate. A ratio above 1 indicates positive selection.
Detects very old, persistent, strong positive selection for a protein that keeps adapting
Examples: immune response, spermatogenesis
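A small sketch of how a Ka/Ks ratio is formed once substitutions and sites have been counted. The counting step itself (classifying sites and differences as synonymous or nonsynonymous, for example with the Nei-Gojobori method) is assumed to have been done already and is not shown; the Jukes-Cantor correction for multiple hits is one common choice, not something specified on the slide. The function names and input numbers are hypothetical.

```python
import math

def jukes_cantor(p):
    """Correct an observed proportion of differing sites p for multiple substitutions."""
    if not 0.0 <= p < 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ka/Ks from pre-computed difference and site counts (assumed inputs)."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)  # nonsynonymous substitutions per nonsynonymous site
    ks = jukes_cantor(syn_diffs / syn_sites)        # synonymous substitutions per synonymous site
    if ks == 0.0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka / ks

# Hypothetical counts for two aligned coding sequences.
print(ka_ks(nonsyn_diffs=30, nonsyn_sites=300, syn_diffs=5, syn_sites=100))   # > 1: suggests positive selection
print(ka_ks(nonsyn_diffs=3, nonsyn_sites=300, syn_diffs=20, syn_sites=100))   # < 1: suggests purifying selection
```

Normalizing by the number of sites matters because a typical coding sequence has roughly three times as many nonsynonymous sites as synonymous ones, so raw difference counts are not comparable.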

How can we detect positive selection?

Positive Selection in Human Lineage

Mutations and LD Slide Credits: Marc Schaub

Extended Haplotype Homozygosity
[Figure: chromosomes A–J (rows) genotyped at SNPs S1–S6 (columns), with a core region C]
3 core haplotypes: ch0 = 101, ch1 = 111, ch2 = 100
Given a core haplotype (101) and a SNP (S6), EHH is the conditional probability that two randomly chosen chromosomes are homozygous from the core to S6, given that they both carry core haplotype 101.
Slide Credits: Marc Schaub
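The EHH definition translates into a short computation: among chromosomes carrying the chosen core haplotype, count the pairs that are identical over the whole stretch from the core to the target SNP, and divide by the number of all such pairs. The sketch below assumes haplotypes are 0/1 strings and that the core occupies a contiguous index range; the function name and the toy chromosomes are hypothetical and do not reproduce the figure on the slide.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core_start, core_end, core_haplotype, target):
    """EHH for one core haplotype at SNP index `target`.

    haplotypes: list of equal-length allele strings, one per chromosome.
    core_start, core_end: the core occupies indices [core_start, core_end).
    """
    carriers = [h for h in haplotypes if h[core_start:core_end] == core_haplotype]
    if len(carriers) < 2:
        return 0.0  # fewer than two carriers: no pairs to compare
    # Region spanning the core and everything up to and including the target SNP.
    lo = min(core_start, target)
    hi = max(core_end, target + 1)
    groups = Counter(h[lo:hi] for h in carriers)
    identical_pairs = sum(comb(count, 2) for count in groups.values())
    return identical_pairs / comb(len(carriers), 2)

# Hypothetical chromosomes; the first three positions form the core.
toy = ["101000", "101000", "101001", "111000", "100110"]
print(ehh(toy, 0, 3, "101", target=3))  # all carriers still identical next to the core
print(ehh(toy, 0, 3, "101", target=5))  # homozygosity decays with distance from the core
```

EHH starts at 1 at the core and decays with distance as mutation and recombination break up the haplotype; an unusually slow decay for a common core haplotype is the signal of recent positive selection.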

Application: Malaria
Study of genes known to be implicated in resistance to malaria.
– Infectious disease caused by protozoan parasites of the genus Plasmodium
– Frequent in tropical and subtropical regions
– Transmitted by the Anopheles mosquito
Image source: wikipedia.org Slide Credits: Marc Schaub

Image source: NIH Application: Malaria Slide Credits: Marc Schaub

Application: Malaria Image source: CDC - R/Malaria/malaria_risk_2003.gif Slide Credits: Marc Schaub

Source: Sabeti et al. Nature Results: G6PD Slide Credits: Marc Schaub

Results: G6PD Source: Sabeti et al. Nature Slide Credits: Marc Schaub

Results: TNFSF5 Source: Sabeti et al. Nature Slide Credits: Marc Schaub

Malaria and Sickle-cell Anemia
Allison (1954): Sickle-cell anemia is limited to the region in Africa in which malaria is endemic.
Figure panels: Distribution of malaria; Distribution of sickle-cell anemia
Image source: wikipedia.org Slide Credits: Marc Schaub

Malaria and Sickle-cell Anemia
Hypothesis: the mutation causing sickle-cell anemia was positively selected because it confers resistance to malaria.
Currat (2002) and Ohashi (2004) identify the mutations in African and Asian populations, respectively.
Slide Credits: Marc Schaub

Malaria and Sickle-cell Anemia
Single point mutation in the coding region of the Hemoglobin-B gene (glu → val).
Heterozygote advantage:
– Resistance to malaria
– Slight anemia
Image source: wikipedia.org Slide Credits: Marc Schaub
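Heterozygote advantage explains why a harmful allele can persist: under the standard one-locus selection model, if the two homozygotes have fitness 1 - s and 1 - t relative to the heterozygote, the sickle allele settles at frequency s / (s + t). The sketch below iterates that model; the selection coefficients are hypothetical values chosen for illustration, not estimates from the slides or the cited papers.

```python
def next_frequency(q, w_aa, w_as, w_ss):
    """One generation of selection; q is the frequency of the S (sickle) allele."""
    p = 1.0 - q
    mean_fitness = p * p * w_aa + 2.0 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / mean_fitness

def simulate(q0, s, t, generations):
    """Iterate selection with fitnesses AA = 1 - s, AS = 1, SS = 1 - t."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, 1.0 - s, 1.0, 1.0 - t)
    return q

# Hypothetical coefficients: AA pays a malaria cost s, SS pays an anemia cost t.
s, t = 0.1, 0.8
q_final = simulate(q0=0.01, s=s, t=t, generations=2000)
q_predicted = s / (s + t)  # equilibrium frequency under heterozygote advantage
print(q_final, q_predicted)
```

Starting from a rare allele, the frequency rises and then stabilizes at the predicted balance point instead of going to fixation, which matches the observation that the sickle allele stays at intermediate frequency where malaria is endemic.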

Lactose Intolerance Source: Ingram and Swallow. Population Genetics of Lactase Persistence and Lactose Intolerance. Encyclopedia of Life Sciences Slide Credits: Marc Schaub

LCT, 5’ LCT, 3’ Source: Bersaglieri et al. Am. J. Hum. Genet Slide Credits: Marc Schaub Lactose Intolerance

Source: Catherine Janet Ellen Ingram and Dallas Mary Swallow. Population Genetics of Lactase Persistence and Lactose Intolerance. Encyclopedia of Life Sciences Slide Credits: Marc Schaub

Finding the Causal Marker
-13910*T is associated with persistent lactose tolerance. Is this mutation causal?
– Does not account for tolerance in sub-Saharan populations (Mulcare 2004).
– Additional SNPs in an enhancer within 100bp are associated with lactose tolerance.
– Several independent causes for lactose tolerance (reviewed in Ingram 2009).
Slide Credits: Marc Schaub

Figure panels: Lactase persistence (literature); Predicted lactase persistence; -13910*T distribution
Source: Ingram et al. Lactose digestion and the evolutionary genetics of lactase persistence. Hum Genet Jan;124(6): Slide Credits: Marc Schaub

Long Haplotypes – iHS test
Less time: fewer mutations, fewer recombinations

Positive Selection in Human Lineage

Immune System & Archaic Admixture

Orthology and Paralogy
[Figure: gene tree with leaves HB Human, WB Worm, HA1 Human, HA2 Human, Yeast, WA Worm]
Orthologs: Derived by speciation
Paralogs: Everything else

Orthology, Paralogy, Inparalogs, Outparalogs