Introduction: Human Population Genomics


1 Introduction: Human Population Genomics

2 How soon will we all be sequenced?
Cost, killer apps, roadblocks?
[Figure: cost and applications plotted against time; 2013? 2018?]

3 The Hominid Lineage

4 Human population migrations
Out of Africa, Replacement
– Single mother of all humans (Eve) ~150,000 years ago
– Single father of all humans (Adam) ~70,000 years ago
– Humans out of Africa ~50,000 years ago replaced others (e.g., Neanderthals)
Multiregional Evolution
– Generally debunked; however,
– ~5% of the human genome in Europeans and Asians is of Neanderthal or Denisovan origin

5 Coalescence Y-chromosome coalescence

6 Why humans are so similar
A small population that interbred reduced the genetic variation
Out of Africa ~50,000 years ago

7 Migration of Humans

8 http://info.med.yale.edu/genetics/kkidd/point.html

9 Migration of Humans http://info.med.yale.edu/genetics/kkidd/point.html

10 Some Key Definitions
Mary: AGCCCGTACG
John: AGCCCGTACG
Josh: AGCCCGTACG
Kate: AGCCCGTACG
Pete: AGCCCGTACG
Anne: AGCCCGTACG
Mimi: AGCCCGTACG
Mike: AGCCCTTACG
Olga: AGCCCTTACG
Tony: AGCCCTTACG
Alleles: G, T
Major Allele: G
Minor Allele: T
Genotypes (Mom/Dad chromosomes): G/G, G/T, G/G, T/T, T/G
Recombinations: at least 1 per chromosome; on average ~1 per 100 Mb
Linkage Disequilibrium: the degree of correlation between two SNP locations
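The definitions above can be made concrete with a short sketch. It reuses the ten sequences from the slide, treats each one as a single haploid sequence (an assumption; the slide's G/G, G/T genotypes are a separate diploid illustration), and uses illustrative function names that are not from the lecture.

```python
from collections import Counter

# Sequences copied from the slide; each is treated as one haploid sequence.
sequences = {
    "Mary": "AGCCCGTACG", "John": "AGCCCGTACG", "Josh": "AGCCCGTACG",
    "Kate": "AGCCCGTACG", "Pete": "AGCCCGTACG", "Anne": "AGCCCGTACG",
    "Mimi": "AGCCCGTACG", "Mike": "AGCCCTTACG", "Olga": "AGCCCTTACG",
    "Tony": "AGCCCTTACG",
}

def variable_sites(seqs):
    """Return the 0-based positions where more than one base is observed."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]

def allele_counts(seqs, pos):
    """Count how often each base occurs at one position."""
    return Counter(s[pos] for s in seqs)

seqs = list(sequences.values())
site = variable_sites(seqs)[0]           # the single SNP in this toy sample
counts = allele_counts(seqs, site)
(major, major_n), (minor, minor_n) = counts.most_common(2)
minor_allele_frequency = minor_n / len(seqs)

print(f"SNP at position {site}: major={major}, minor={minor}, "
      f"MAF={minor_allele_frequency:.2f}")
```

The major allele is simply the most common base at the site, and the minor allele frequency is the share of sequences carrying the other base.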

11 Human Genome Variation
SNP: TGCTGAGA / TGCCGAGA
Novel Sequence: TGCTCGGAGA / TGC---GAGA
Inversion
Mobile Element or Pseudogene Insertion
Translocation
Tandem Duplication
Microdeletion: TGC--AGA / TGCCGAGA
Transposition
Large Deletion
Novel Sequence at Breakpoint: TGC

12 The Fall in Heterozygosity
$F_{ST} = \frac{H - H_{POP}}{H}$
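A minimal sketch of the F_ST calculation follows. The slide does not define its symbols, so this assumes the usual reading: H is the expected heterozygosity of the pooled population and H_POP is the average expected heterozygosity within subpopulations. It also assumes a biallelic locus and equally sized subpopulations; the allele frequencies are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)

def fst(subpop_freqs):
    """F_ST = (H - H_POP) / H for equally sized subpopulations.

    subpop_freqs: frequency of one allele in each subpopulation.
    H uses the mean allele frequency (assumed pooled population);
    H_POP is the mean within-subpopulation heterozygosity.
    """
    p_bar = sum(subpop_freqs) / len(subpop_freqs)
    h_total = heterozygosity(p_bar)
    h_pop = sum(heterozygosity(p) for p in subpop_freqs) / len(subpop_freqs)
    if h_total == 0.0:
        return 0.0  # no variation anywhere, so no differentiation to measure
    return (h_total - h_pop) / h_total

# Hypothetical allele frequencies in two subpopulations.
print(fst([0.5, 0.5]))  # identical subpopulations
print(fst([0.2, 0.8]))  # diverged subpopulations
print(fst([0.0, 1.0]))  # fixed for different alleles
```

Identical subpopulations give a value of zero, and subpopulations fixed for different alleles give one, which is the sense in which F_ST measures the fall in heterozygosity caused by population structure.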

13 The HapMap Project
Code  Population                                   Samples
ASW   African ancestry in Southwest USA            90
CEU   Northern and Western Europeans (Utah)        180
CHB   Han Chinese in Beijing, China                90
CHD   Chinese in Metropolitan Denver               100
GIH   Gujarati Indians in Houston, Texas           100
JPT   Japanese in Tokyo, Japan                     91
LWK   Luhya in Webuye, Kenya                       100
MXL   Mexican ancestry in Los Angeles              90
MKK   Maasai in Kinyawa, Kenya                     180
TSI   Toscani in Italia                            100
YRI   Yoruba in Ibadan, Nigeria                    100
Genotyping: probe a limited number (~1M) of known highly variable positions of the human genome

14 Linkage Disequilibrium & Haplotype Blocks
Linkage Disequilibrium (LD): $D = P(A \text{ and } G) - p_A p_G$
Minor alleles: A at the first site (frequency $p_A$), G at the second site (frequency $p_G$)
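The LD coefficient D can be computed directly from phased haplotypes. The sketch below assumes each haplotype is a pair of alleles, one per SNP, and uses made-up haplotypes for illustration; the function name is hypothetical.

```python
def linkage_disequilibrium(haplotypes, allele1="A", allele2="G"):
    """D = P(allele1 and allele2) - p_allele1 * p_allele2.

    haplotypes: list of (allele at site 1, allele at site 2) tuples.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele1 for h in haplotypes) / n
    p_g = sum(h[1] == allele2 for h in haplotypes) / n
    p_ag = sum(h[0] == allele1 and h[1] == allele2 for h in haplotypes) / n
    return p_ag - p_a * p_g

# Hypothetical sample in which A and G always travel together.
linked = [("A", "G")] * 3 + [("C", "T")] * 7
# Hypothetical sample in which the two sites are independent.
unlinked = [("A", "G"), ("A", "T"), ("C", "G"), ("C", "T")]

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(unlinked))
```

When the two minor alleles always occur on the same haplotype, D is positive; when the sites are statistically independent, the joint frequency equals the product of the marginals and D is zero.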

15 Population Sequencing – 1000 Genomes Project
The 1000 Genomes Project Consortium et al. Nature 467, 1061-1073 (2010) doi:10.1038/nature09534

16 Association Studies
Individual genotypes shown: A/G G/G A/G G/G A/A A/G A/A A/G A/A
Genotype  Control  Disease
AA        0        4
AG        3        3
GG        4        0
Result: p-value
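One common way to turn such a genotype table into a p-value is a chi-square test of independence, sketched below. The slide does not say which test it has in mind, so this is an assumption; the counts are read from the slide's table (AA 0/4, AG 3/3, GG 4/0), and the statistic is the same whichever column is taken as the controls. With counts this small an exact test would normally be preferred, so treat this only as an illustration of the idea.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df2(stat):
    """Upper-tail p-value of a chi-square distribution with 2 degrees of freedom.

    For df = 2 the survival function has the closed form exp(-x / 2),
    which is exactly the case for a 3 x 2 genotype table.
    """
    return math.exp(-stat / 2.0)

# Rows: AA, AG, GG. Columns: control, disease (as read from the slide).
genotype_table = [[0, 4], [3, 3], [4, 0]]
stat = chi_square_statistic(genotype_table)
p_value = p_value_df2(stat)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```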

17 Wellcome Trust Case Control
Nature 447, 661-678 (7 June 2007)
Nature 464, 713-720 (1 April 2010)
Many associations of small effect sizes (odds ratios < 1.5)

18 Disease Clustering
Disease                       Genotyping
Multiple Sclerosis (MS)       Illumina chip, 15K non-synonymous SNPs
Ankylosing Spondylitis (AS)   Illumina chip, 15K non-synonymous SNPs
Autoimmune Thyroid (ATD)      Illumina chip, 15K non-synonymous SNPs
Breast Cancer (BC)            Illumina chip, 15K non-synonymous SNPs
Rheumatoid Arthritis (RA)     Affy 500K array
Bipolar Disorder (BD)         Affy 500K array
Crohn's Disease (CD)          Affy 500K array
Coronary Artery (CAD)         Affy 500K array
Hypertension (HT)             Affy 500K array
Type 1 Diabetes (T1D)         Affy 500K array
Type 2 Diabetes (T2D)         Affy 500K array
Compute disease-disease correlations
Randomization to determine significance
Use results as a distance metric for clustering diseases
PLoS Genet 5(12): e1000792. doi:10.1371/journal.pgen.1000792. 2009.
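The three analysis steps on this slide (correlate, randomize, convert to a distance) can be sketched as follows. This is not the published method: it assumes each disease is summarized by a vector of per-SNP scores, uses Pearson correlation, tests significance by shuffling one vector, and takes 1 - r as the distance. The score vectors are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: how often a shuffled y correlates as strongly with x."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-SNP scores for two diseases (same SNP order in both).
disease_a = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0, 0.5, -0.5]
disease_b = [-2.5, -2.0, -1.5, 1.0, 2.5, 3.0, -0.5, 0.0]

r = pearson(disease_a, disease_b)
p = permutation_p_value(disease_a, disease_b)
distance = 1.0 - r  # one possible correlation-to-distance conversion
print(f"r = {r:.3f}, permutation p = {p:.4f}, distance = {distance:.3f}")
```

A matrix of such pairwise distances could then be passed to any hierarchical clustering routine to group the diseases.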

19 Disease Clustering
RA vs. ATD
RA vs. MS
– No recorded co-occurrence of RA and MS
Genetic Variation Score (GVS)
SNP         Allele  Gene      RA (NARAC)  RA      AS      T1D     ATD    MS (IMSGC)  MS
rs11752919  C       ZSCAN23   -3.48       -3.21   -9.39   1.10    0.70   3.25        2.99
rs151719    G       HLA-DMB   -6.71       -4.77   -1.08   -13.63  0.34   8.58        17.76
rs10484565  T       TAP2      25.52       8.37    1.34    15.74   -1.36  -0.56       -0.30
rs1264303   G       VARS2     11.51       7.36    18.76   0.89    -1.76  -1.85       -1.75
rs1265048   C       CDSN      6.59        2.97    50.13   6.34    -0.85  -2.39       -4.16
rs2071286   A       NOTCH4    5.30        0.78    6.42    4.04    -0.03  -1.89       -2.45
rs2076530   G       BTNL2     67.49       56.46   14.06   13.58   -6.41  -9.50       -18.52
rs757262    T       TRIM40    14.58       9.11    6.27    1.56    -0.79  -2.05       -7.34
rs3130981   A       CDSN      -0.46, -9.47, -4.94, 0.33, 10.00, 13.41 (only six values survive in the source; column alignment unknown)

20 Heritability & Environment Bienvenu OJ, Davydow DS, & Kendler KS (2011). Psychological medicine, 41 (1), 33-40 PMID:

21 Ancestry Inference Danish French Spanish Mexican

22 Global Ancestry Inference Nature. 2008 November 6; 456(7218): 98–101.

23 Ancestry Painting Danish French Spanish Mexican

24 Ancestry Painting – Haplotype-based HAPAA, HAPMIX HAPAA: Genome Res. 2008. 18: 676-682 HAPMIX: PLoS Genet 5(6): e1000519, 2009

25 Fixation, Positive & Negative Selection Neutral Drift Positive Selection Negative Selection How can we detect negative selection? How can we detect positive selection?
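The contrast between neutral drift, positive selection, and negative selection can be illustrated with a small Wright-Fisher simulation. This is a generic textbook model rather than anything taken from the slides; the population size, starting frequency, and selection coefficients below are arbitrary assumptions.

```python
import random

def wright_fisher(pop_size, s, p0, rng, max_gen=10_000):
    """Simulate one allele until it is lost (0.0) or fixed (1.0).

    pop_size: number of diploid individuals (2 * pop_size gene copies).
    s: selection coefficient; the focal allele has relative fitness 1 + s.
    p0: starting frequency of the focal allele.
    """
    n_copies = 2 * pop_size
    p = p0
    for _ in range(max_gen):
        if p in (0.0, 1.0):
            break
        # Selection shifts the expected frequency before sampling.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Drift: binomial sampling of the next generation's gene copies.
        count = sum(rng.random() < p_sel for _ in range(n_copies))
        p = count / n_copies
    return p

def fixation_fraction(s, runs=100, pop_size=50, p0=0.1, seed=0):
    """Fraction of independent runs in which the focal allele reaches fixation."""
    rng = random.Random(seed)
    fixed = sum(wright_fisher(pop_size, s, p0, rng) == 1.0 for _ in range(runs))
    return fixed / runs

for label, s in [("negative", -0.2), ("neutral", 0.0), ("positive", 0.2)]:
    print(f"{label:8s} selection (s={s:+.1f}): "
          f"fixed in {fixation_fraction(s):.0%} of runs")
```

Under drift alone an allele fixes with probability roughly equal to its starting frequency; positive selection raises that probability and negative selection lowers it, which is what the detection methods in the following slides try to recognize from sequence data.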

26 Conservation and Human SNPs
Conserved non-coding sequences (CNSs) have fewer SNPs
SNPs in CNSs have shifted allele frequency spectra
[Figure: neutral regions vs. CNSs]

27 How can we detect positive selection?
Ka/Ks ratio: ratio of nonsynonymous to synonymous substitutions
Very old, persistent, strong positive selection for a protein that keeps adapting
Examples: immune response, spermatogenesis
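A simplified version of the Ka/Ks idea is sketched below for two aligned coding sequences. It counts synonymous and nonsynonymous sites in the style of the Nei-Gojobori method, but with stated simplifications: it reports uncorrected proportions (pN and pS) rather than corrected Ka and Ks, it skips codon pairs that differ at more than one position, and it counts mutations to stop codons as nonsynonymous. The example sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    sites = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                sites += 1.0 / 3.0  # one of three possible changes at this site
    return sites

def pn_ps(seq1, seq2):
    """Return (pN, pS): nonsynonymous and synonymous differences per site."""
    syn_sites = nonsyn_sites = 0.0
    syn_diffs = nonsyn_diffs = 0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s = (synonymous_sites(c1) + synonymous_sites(c2)) / 2.0
        syn_sites += s
        nonsyn_sites += 3.0 - s
        # Simplification: only codons differing at exactly one base are scored.
        if sum(a != b for a, b in zip(c1, c2)) == 1:
            if CODON_TABLE[c1] == CODON_TABLE[c2]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    return nonsyn_diffs / nonsyn_sites, syn_diffs / syn_sites

# Hypothetical aligned coding sequences (Phe-Gly vs. Phe-Gly, one silent change).
pn, ps = pn_ps("TTTGGG", "TTCGGG")
print(f"pN = {pn:.3f}, pS = {ps:.3f}")
```

A ratio pN/pS well above 1 across a gene is the signature of the persistent positive selection described on this slide, while values well below 1 indicate purifying (negative) selection.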

28 How can we detect positive selection?

29 Long Haplotypes – iHS test
Less time:
– Fewer mutations
– Fewer recombinations

