Introduction: Human Population Genomics


Presentation transcript:

Introduction: Human Population Genomics ACGTTTGACTGAGGAGTTTACGGGAGCAAAGCGGCGTCATTGCTATTCGTATCTGTTTAG

How soon will we all be sequenced?
Cost, killer apps, roadblocks?
(Figure: cost and applications over time; 2013? 2018?)

The Hominid Lineage

Human population migrations
Out of Africa, Replacement
– Single mother of all humans (Eve) ~150,000 yr
– Single father of all humans (Adam) ~70,000 yr
– Humans out of Africa ~50,000 years ago replaced others (e.g., Neanderthals)
Multiregional Evolution
– Generally debunked; however,
– ~5% of the human genome in Europeans and Asians is Neanderthal or Denisovan

Coalescence Y-chromosome coalescence

Why humans are so similar
A small population that interbred reduced the genetic variation
Out of Africa ~50,000 years ago

Migration of Humans

Migration of Humans

Some Key Definitions
Mary: AGCCCGTACG
John: AGCCCGTACG
Josh: AGCCCGTACG
Kate: AGCCCGTACG
Pete: AGCCCGTACG
Anne: AGCCCGTACG
Mimi: AGCCCGTACG
Mike: AGCCCTTACG
Olga: AGCCCTTACG
Tony: AGCCCTTACG
Alleles: G, T
Major Allele: G
Minor Allele: T
Example genotypes (figure labels: Mom, Dad): G/G G/T G/G T/T T/G
Recombinations: At least 1/chromosome; on average ~1/100 Mb
Linkage Disequilibrium: The degree of correlation between two SNP locations
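The definitions above can be checked directly on the ten example sequences. The following is a minimal sketch, assuming each listed sequence is one haplotype; the helper names `variable_sites` and `allele_counts` are made up for illustration.

```python
from collections import Counter

# The ten example sequences from the slide, treated as one haplotype each.
sequences = {
    "Mary": "AGCCCGTACG", "John": "AGCCCGTACG", "Josh": "AGCCCGTACG",
    "Kate": "AGCCCGTACG", "Pete": "AGCCCGTACG", "Anne": "AGCCCGTACG",
    "Mimi": "AGCCCGTACG", "Mike": "AGCCCTTACG", "Olga": "AGCCCTTACG",
    "Tony": "AGCCCTTACG",
}


def variable_sites(seqs):
    """Return the 0-based positions where more than one base is observed."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]


def allele_counts(seqs, pos):
    """Count the bases observed at 0-based position `pos`."""
    return Counter(s[pos] for s in seqs)


seqs = list(sequences.values())
sites = variable_sites(seqs)
counts = allele_counts(seqs, sites[0])

# The most common base is the major allele, the other one the minor allele.
(major, major_n), (minor, minor_n) = counts.most_common(2)
minor_allele_frequency = minor_n / len(seqs)

print(f"SNP at position {sites[0]}: major={major}, minor={minor}, "
      f"MAF={minor_allele_frequency:.2f}")
```

Under these assumptions the single variable site carries G in seven sequences and T in three, which matches the major and minor alleles named on the slide.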

Human Genome Variation
– SNP: TGCTGAGA / TGCCGAGA
– Novel Sequence: TGCTCGGAGA / TGC GAGA
– Inversion
– Mobile Element or Pseudogene Insertion
– Translocation
– Tandem Duplication
– Microdeletion: TGC - - AGA / TGCCGAGA
– Transposition
– Large Deletion
– Novel Sequence at Breakpoint

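The Fall in Heterozygosity
$F_{ST} = \frac{H - H_{POP}}{H}$
Here H is the heterozygosity of the pooled population and H_POP is the average heterozygosity within the individual populations.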
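A worked example may help with the fall in heterozygosity. This is a minimal sketch, assuming F_ST = (H - H_POP) / H at a single biallelic site, with heterozygosity taken as the expected value 2p(1 - p), H computed from the pooled allele frequency, and H_POP as the size-weighted average over populations. The function names and the example frequencies are illustrative only.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs, sizes=None):
    """F_ST = (H - H_POP) / H for one site.

    freqs: allele frequency of the same allele in each population.
    sizes: optional relative population sizes (equal weights if omitted).
    """
    if sizes is None:
        sizes = [1] * len(freqs)
    total = sum(sizes)
    weights = [n / total for n in sizes]

    # H: heterozygosity of the pooled population.
    pooled_p = sum(w * p for w, p in zip(weights, freqs))
    h_total = heterozygosity(pooled_p)

    # H_POP: average heterozygosity within the populations.
    h_pop = sum(w * heterozygosity(p) for w, p in zip(weights, freqs))

    if h_total == 0.0:
        return 0.0  # site is not variable anywhere
    return (h_total - h_pop) / h_total


print(fst([0.5, 0.5]))  # identical populations
print(fst([0.2, 0.8]))  # diverged populations
print(fst([0.0, 1.0]))  # fixed for different alleles
```

Populations with identical frequencies give 0, and populations fixed for different alleles give 1, which is the range the statistic is meant to cover.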

The HapMap Project
Code  Population                                Samples
ASW   African ancestry in Southwest USA         90
CEU   Northern and Western Europeans (Utah)     180
CHB   Han Chinese in Beijing, China             90
CHD   Chinese in Metropolitan Denver            100
GIH   Gujarati Indians in Houston, Texas        100
JPT   Japanese in Tokyo, Japan                  91
LWK   Luhya in Webuye, Kenya                    100
MXL   Mexican ancestry in Los Angeles           90
MKK   Maasai in Kinyawa, Kenya                  180
TSI   Toscani in Italia                         100
YRI   Yoruba in Ibadan, Nigeria                 100
Genotyping: Probe a limited number (~1M) of known highly variable positions of the human genome

Linkage Disequilibrium & Haplotype Blocks
Linkage Disequilibrium (LD): $D = P(A \text{ and } G) - p_A p_G$
Minor alleles: A (frequency p_A) at one SNP, G (frequency p_G) at the other
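The LD coefficient can be computed from a list of phased two-SNP haplotypes. The following is a minimal sketch, assuming D = P(A and G) - p_A p_G as on the slide; the r² normalisation is a standard companion measure that is not on the slide, and the haplotype counts are made up for illustration.

```python
def linkage_disequilibrium(haplotypes, allele1="A", allele2="G"):
    """Return (D, r2) for two SNPs given phased haplotypes like "AG".

    Each haplotype is a 2-character string: the allele at SNP 1, then SNP 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele1 for h in haplotypes) / n
    p_g = sum(h[1] == allele2 for h in haplotypes) / n
    p_ag = sum(h == allele1 + allele2 for h in haplotypes) / n

    # D: how much more often A and G travel together than expected by chance.
    d = p_ag - p_a * p_g

    # r2: squared correlation between the two SNPs (0 = independent, 1 = perfect).
    denom = p_a * (1 - p_a) * p_g * (1 - p_g)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Hypothetical sample: A and G mostly co-occur on the same chromosome.
haplotypes = ["AG"] * 4 + ["CT"] * 4 + ["AT"] + ["CG"]
d, r2 = linkage_disequilibrium(haplotypes)
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

With p_A = p_G = 0.5 and P(A and G) = 0.4, the sample gives D = 0.15, so the two alleles occur together more often than independence would predict.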

Population Sequencing – 1000 Genomes Project
The 1000 Genomes Project Consortium et al. Nature 467, (2010) doi: /nature09534

Association Studies
Individual genotypes (figure): A/G G/G A/G G/G A/A A/G A/A A/G A/A
Genotype  Control  Disease
AA        0        4
AG        3        3
GG        4        0
p-value
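The slide does not say which test produces the p-value. As one possible illustration, the sketch below converts the genotype counts (AA, AG, GG: 0, 3, 4 in controls and 4, 3, 0 in cases) into allele counts and applies a two-sided Fisher exact test, implemented with the standard library only. The choice of test and the function names are assumptions, not the method used in the lecture.

```python
from math import comb

# Genotype counts read from the slide's table.
genotype_counts = {
    "control": {"AA": 0, "AG": 3, "GG": 4},
    "disease": {"AA": 4, "AG": 3, "GG": 0},
}


def allele_counts(genos):
    """Convert genotype counts into (number of A alleles, number of G alleles)."""
    a = 2 * genos["AA"] + genos["AG"]
    g = 2 * genos["GG"] + genos["AG"]
    return a, g


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum all tables that are at most as likely as the observed one.
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)


control_a, control_g = allele_counts(genotype_counts["control"])
disease_a, disease_g = allele_counts(genotype_counts["disease"])
p_value = fisher_exact_two_sided(control_a, control_g, disease_a, disease_g)
print(f"allele table: [[{control_a}, {control_g}], [{disease_a}, {disease_g}]]")
print(f"p-value = {p_value:.4f}")
```

A genome-wide study repeats a test like this for every SNP, so the significance threshold has to account for the number of tests performed.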

Wellcome Trust Case Control
Nature 447, (7 June 2007)
Nature 464, (1 April 2010)
Many associations of small effect sizes (<1.5)

Disease Clustering
Disease                        Genotyping
Multiple Sclerosis (MS)        Illumina chip, 15K non-synonymous SNPs
Ankylosing Spondylitis (AS)    Illumina chip, 15K non-synonymous SNPs
Autoimmune Thyroid (ATD)       Illumina chip, 15K non-synonymous SNPs
Breast Cancer (BC)             Illumina chip, 15K non-synonymous SNPs
Rheumatoid Arthritis (RA)      Affy 500K array
Bipolar Disorder (BD)          Affy 500K array
Crohn's Disease (CD)           Affy 500K array
Coronary Artery (CAD)          Affy 500K array
Hypertension (HT)              Affy 500K array
Type 1 Diabetes (T1D)          Affy 500K array
Type 2 Diabetes (T2D)          Affy 500K array
– Randomization to determine significance
– Use results as a distance metric for clustering diseases
– Compute disease-disease correlations
PLoS Genet 5(12): e doi: /journal.pgen
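The three steps on this slide (correlations, randomization, distances) can be sketched on toy data. The example below assumes each disease is summarised by a vector of per-SNP scores, uses Pearson correlation as the disease-disease similarity, 1 - r as the distance, and label shuffling as the randomization. The scores are invented, and the cited study's actual statistic may differ.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def distance(x, y):
    """Correlation distance: 0 for identical profiles, 2 for opposite ones."""
    return 1.0 - pearson(x, y)


def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Fraction of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-SNP scores for three diseases (not real data).
scores = {
    "RA":  [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],
    "T1D": [0.8, 0.2, 0.9, 0.1, 0.6, 0.4],
    "BD":  [0.2, 0.7, 0.1, 0.9, 0.3, 0.8],
}

names = list(scores)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r = pearson(scores[a], scores[b])
        p = permutation_p_value(scores[a], scores[b])
        print(f"{a} vs {b}: r={r:.2f}, distance={1 - r:.2f}, p={p:.3f}")
```

The resulting pairwise distances could then be passed to any hierarchical clustering routine to group diseases with similar genetic profiles.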

Disease Clustering
RA vs. ATD
RA vs. MS
– No recorded co-occurrence of RA and MS
Table: Genetic Variation Score (GVS) by SNP allele and gene symbol
Columns: RA (NARAC), RA, AS, T1D, ATD, MS (IMSGC), MS
Rows (SNP, allele, gene symbol as transcribed): rs C ZSCAN; rs A CDSN; rs G HLA-DMB; rs T TAP; rs G VARS; rs C CDSN; rs A NOTCH; rs G BTNL; rs T TRIM

Ancestry Inference Danish French Spanish Mexican

Global Ancestry Inference

Fixation, Positive & Negative Selection
Neutral Drift
Positive Selection
Negative Selection
How can we detect negative selection?
How can we detect positive selection?

Conservation and Human SNPs
CNSs have fewer SNPs
SNPs have shifted allele frequency spectra
(Figure panels: Neutral, CNS)

How can we detect positive selection?
Ka/Ks ratio: Ratio of nonsynonymous to synonymous substitutions
Very old, persistent, strong positive selection for a protein that keeps adapting
Examples: immune response, spermatogenesis
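The distinction behind Ka/Ks can be illustrated by classifying codon differences between two aligned coding sequences. The sketch below only counts nonsynonymous and synonymous differences using the standard genetic code; a proper Ka/Ks estimate also normalises by the number of nonsynonymous and synonymous sites and corrects for multiple hits, which this example does not do. The sequences are invented.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(seq):
    """Split a coding sequence into consecutive 3-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def count_differences(seq1, seq2):
    """Return (nonsynonymous, synonymous) codon differences between two
    aligned, gap-free coding sequences of equal length."""
    nonsyn = syn = 0
    for c1, c2 in zip(codons(seq1), codons(seq2)):
        if c1 == c2:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1      # DNA changed, protein did not
        else:
            nonsyn += 1   # DNA change altered the amino acid
    return nonsyn, syn


# Hypothetical aligned sequences: Met-Ala-Lys versus Met-Ala-Arg.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAGA"
nonsyn, syn = count_differences(seq_a, seq_b)
print(f"nonsynonymous differences: {nonsyn}, synonymous differences: {syn}")
```

A gene under persistent positive selection tends to accumulate an excess of the first kind of change relative to the second.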

How can we detect positive selection?

Long Haplotypes – iHS test
Less time:
– Fewer mutations
– Fewer recombinations
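The long-haplotype idea can be made concrete with extended haplotype homozygosity (EHH), the quantity that the iHS statistic integrates and compares between the two alleles at a core SNP. The following is a minimal sketch of EHH only, on invented 0/1 haplotypes; it does not compute iHS itself.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, end):
    """Extended haplotype homozygosity between columns `core` and `end`.

    Returns the probability that two haplotypes drawn at random (without
    replacement) are identical at every site in the interval.
    """
    lo, hi = sorted((core, end))
    n = len(haplotypes)
    if n < 2:
        return 0.0
    groups = Counter(h[lo:hi + 1] for h in haplotypes)
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


# Hypothetical phased haplotypes; column 0 is the core SNP (1 = derived allele).
haplotypes = ["1000", "1000", "1000", "1001", "0110", "0011", "0101", "0000"]
derived = [h for h in haplotypes if h[0] == "1"]
ancestral = [h for h in haplotypes if h[0] == "0"]

for end in range(1, 4):
    print(f"distance {end}: "
          f"EHH derived={ehh(derived, 0, end):.2f}, "
          f"EHH ancestral={ehh(ancestral, 0, end):.2f}")
```

In this toy data the derived-allele haplotypes stay identical much further from the core SNP than the ancestral ones, which is the pattern expected when an allele has risen in frequency too quickly for mutation and recombination to break up its background.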