Molecular & Genetic Epi 217 Association Studies: Indirect John Witte.


Homework, Question 4: Haplotypes

ID    MTHFR_C677T  MTHFR_A1298C  Haplotypes?
959   CC           AA            C-A / C-A
1044  CC           AC            C-A / C-C
147   CT           AA            C-A / T-A
123   CT           AC            C-A / T-C or C-C / T-A

• Genotypes 677TT and 1298CC are never observed together. This suggests that the T-C haplotype is rare or absent, which makes C-C / T-A the most probable haplotype pair for double heterozygotes, and points to potential selection or chance.
• Rare variants are not necessarily lethal, especially those associated with late-onset diseases.
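The ambiguity in the last row of the homework table can be checked by brute force. The sketch below is only an illustration, not part of the original homework: the function name `possible_haplotype_pairs` is hypothetical, and it simply enumerates every phasing consistent with a set of unphased genotypes.

```python
from itertools import product


def possible_haplotype_pairs(genotypes):
    """Enumerate unordered haplotype pairs consistent with unphased genotypes.

    genotypes: list of two-letter strings, one per SNP, e.g. ["CT", "AC"].
    Returns a sorted list of (haplotype, haplotype) tuples.
    """
    pairs = set()
    # For each SNP, choose which of the two alleles sits on the first chromosome.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "-".join(g[c] for g, c in zip(genotypes, choice))
        hap2 = "-".join(g[1 - c] for g, c in zip(genotypes, choice))
        # Sort within the pair so that hap1/hap2 order does not create duplicates.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


# Genotypes (C677T, A1298C) taken from the homework table rows.
for genos in (["CC", "AA"], ["CC", "AC"], ["CT", "AA"], ["CT", "AC"]):
    print(genos, possible_haplotype_pairs(genos))
```

Only the double heterozygote (CT, AC) yields more than one pair, which is why phase has to be inferred for that row and is unambiguous for the others.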

3 SNPs in the TAS2R38 Gene
[Figure: three variable amino-acid positions (P/A, A/V, V/I) and the eight possible haplotypes: PAV, PAI, PVV, PVI, AAV, AAI, AVV, AVI]

TAS2R38: 3 SNPs Form Haplotypes
• PAV: Taster
• AVI: Non-taster

TAS2R38 Haplotype Function

TAS2R38 Genotyping Results

ID  Taster  rs  rs  rs  Haplotypes  Amino Acid
10  0       CT  AG  CG  CGG*/TAC    PAV/AVI
12  1       CT  AG  CG  CGG*/TAC    PAV/AVI
?   ?       CC  GG  ?   CGG/CGG     PAV/PAV
19  1       CT  AG  CG  CGG*/TAC    PAV/AVI
20  1       CT  AG  CG  CGG*/TAC    PAV/AVI
22  .       TT  AA  CC  TAC/TAC     AVI/AVI
24  1       CC  GG  ?   CGG/CGG     PAV/PAV
26  .       CT  AG  CG  CGG*/TAC    PAV/AVI
28  1       CT  AG  CG  CGG*/TAC    PAV/AVI
29  1       CC  GG  CG  CGG/CGC     PAV/PAI
30  0       TT  AA  CC  TAC/TAC     AVI/AVI
31  1       CC  GG  ?   CGG/CGG     PAV/PAV

Too many MTHFR SNPs. Solution: Tag SNP Selection
• SNPs are correlated (aka linkage disequilibrium)
• Carlson et al. (2004) AJHG 74:106
[Figure: six SNPs (1: A/T, 2: G/A, 3: G/C, 4: T/C, 5: G/C, 6: A/C) with groups of SNPs in high r²]
• Pairwise tagging: SNP 1, SNP 3, SNP 6 (3 tags in total)
• Test for association: SNP 1, SNP 3, SNP 6
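Pairwise tagging can be sketched as a greedy search over an r² matrix. The code below is a simplified illustration in the spirit of that approach, not the published algorithm from Carlson et al.; the function name, the 0.8 threshold, and the r² values are all hypothetical, and the matrix was built so that the result mirrors the slide (SNPs 1, 3, and 6).

```python
def greedy_pairwise_tags(r2, threshold=0.8):
    """Greedily pick tag SNPs so every SNP has r2 >= threshold with some tag.

    r2: square, symmetric list of lists with 1.0 on the diagonal.
    Returns the sorted 0-based indices of the chosen tags.
    """
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        # Choose the untagged SNP that captures the most untagged SNPs;
        # iterating in sorted order makes ties resolve to the lowest index.
        best = max(
            sorted(untagged),
            key=lambda i: sum(1 for j in untagged if r2[i][j] >= threshold),
        )
        tags.append(best)
        untagged -= {j for j in untagged if r2[best][j] >= threshold}
    return sorted(tags)


# Hypothetical LD structure: SNPs {1,2} and {3,4,5} form high-r2 groups, SNP 6 stands alone.
groups = [[0, 1], [2, 3, 4], [5]]
r2 = [[0.1] * 6 for _ in range(6)]
for group in groups:
    for i in group:
        for j in group:
            r2[i][j] = 1.0 if i == j else 0.9

tags = greedy_pairwise_tags(r2)
print("Tag SNPs (1-based):", [t + 1 for t in tags])
```

Only the tags then need to be genotyped and tested for association; the remaining SNPs are covered indirectly through LD.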

Coverage: Measurement Error in TagSNPs

Common Measures of Coverage
• Threshold measures
– e.g., 73% of SNPs in the complete set are in LD with at least one SNP in the genotyping set at r² > 0.8
• Average measures
– e.g., average maximum r² = 0.84
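Both coverage measures can be computed from the same quantity: for each SNP in the complete set, its maximum r² with any genotyped SNP. The sketch below assumes that input is already available; the function name and the example values are hypothetical.

```python
def coverage_measures(max_r2, threshold=0.8):
    """Return (threshold coverage, average maximum r2).

    max_r2: for each SNP in the complete set, the largest r2 it has
            with any SNP in the genotyping set.
    """
    if not max_r2:
        raise ValueError("max_r2 must contain at least one SNP")
    # Threshold measure: share of SNPs captured at r2 > threshold.
    fraction_covered = sum(1 for v in max_r2 if v > threshold) / len(max_r2)
    # Average measure: mean of the per-SNP maximum r2.
    average_max_r2 = sum(max_r2) / len(max_r2)
    return fraction_covered, average_max_r2


# Hypothetical per-SNP maximum r2 values for a four-SNP complete set.
fraction, average = coverage_measures([1.0, 0.9, 0.85, 0.6])
print(f"covered at r2 > 0.8: {fraction:.0%}, average max r2: {average:.2f}")
```

The two measures can disagree: a panel may have a high average maximum r² while still leaving a sizeable share of SNPs below the threshold.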

Coverage and Sample Size
• Sample size required for direct association: n
• Sample size required for indirect association: n* = n / r²
• For r² = 0.8, the increase is 25%
• For r² = 0.5, the increase is 100%
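The two percentages follow directly from n* = n / r². A minimal sketch of the arithmetic, with a hypothetical function name and a hypothetical direct-association sample size of 1,000:

```python
def indirect_sample_size(n_direct, r2):
    """Sample size needed when testing a tag SNP instead of the causal SNP.

    Applies n* = n / r2, where r2 is the LD between tag and causal SNP.
    """
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return n_direct / r2


n = 1000  # hypothetical sample size for direct association
for r2 in (0.8, 0.5):
    n_star = indirect_sample_size(n, r2)
    increase = (n_star / n - 1) * 100
    print(f"r2 = {r2}: n* = {n_star:.0f} (increase of {increase:.0f}%)")
```

The relative increase, 1 / r² - 1, does not depend on n, so the same percentages apply to any study size.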

Tag SNPs Database Resources

HapMap
• Re-sequencing to discover millions of additional SNPs; deposited to dbSNP
• SNPs from dbSNP were genotyped
• Looked for 1 SNP every 5 kb
• SNP validation
– Polymorphic
– Frequency
• Haplotype and linkage disequilibrium estimation
– LD tagging SNPs

HapMap Phase III Populations
• ASW: African ancestry in Southwest USA
• CEU: Utah residents with Northern and Western European ancestry from the CEPH collection
• CHB: Han Chinese in Beijing, China
• CHD: Chinese in Metropolitan Denver, Colorado
• GIH: Gujarati Indians in Houston, Texas
• JPT: Japanese in Tokyo, Japan
• LWK: Luhya in Webuye, Kenya
• MEX: Mexican ancestry in Los Angeles, California
• MKK: Maasai in Kinyawa, Kenya
• TSI: Toscani in Italia
• YRI: Yoruba in Ibadan, Nigeria

Tag SNPs: HapMap

Tag SNPs: HapMap & Haploview

Tag SNPs: HapMap Summary
• Identified 33 common MTHFR SNPs (MAF > 5%) among Caucasians
• Forced in 3 potentially functional / previously associated SNPs
• Identified tags based on pairwise tagging
• 15 tag SNPs could capture all 33 MTHFR SNPs (mean r² = 0.97)
• Note: the number of SNPs required varies from gene to gene and from population to population

1000 Genomes Project

Genome-wide Association Studies (GWAS)

One- and Two-Stage GWA Designs
• One-Stage Design: all samples (1, 2, 3, ..., N) are genotyped on all SNPs (1, 2, 3, ..., M)
• Two-Stage Design: in Stage 1, a proportion of the samples is genotyped on all M SNPs; in Stage 2, the remaining samples are genotyped on a proportion of the markers

[Figure: SNPs-by-samples diagrams comparing the One-Stage Design with the Two-Stage Design (Stage 1, Stage 2) under replication-based analysis and under joint analysis]

Multistage Designs
• Joint analysis has more power than replication-based analysis
• The p-value threshold in Stage 1 must be liberal
• Multistage designs lower cost; they do not gain power
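The cost saving of a two-stage design can be illustrated by counting genotypes. This is a rough sketch under stated assumptions: a fraction `pi_samples` of the N samples is genotyped on all M markers in Stage 1, and the remaining samples are genotyped on a fraction `pi_markers` of the markers in Stage 2. The parameter names and the example numbers are hypothetical, and per-genotype costs that differ between stages are ignored.

```python
def genotypes_one_stage(n_samples, n_markers):
    """Total genotypes when every sample is typed on every marker."""
    return n_samples * n_markers


def genotypes_two_stage(n_samples, n_markers, pi_samples, pi_markers):
    """Total genotypes in a two-stage design (assumed layout, see lead-in)."""
    if not (0 < pi_samples <= 1 and 0 <= pi_markers <= 1):
        raise ValueError("pi_samples must be in (0, 1] and pi_markers in [0, 1]")
    stage1 = pi_samples * n_samples * n_markers
    stage2 = (1 - pi_samples) * n_samples * pi_markers * n_markers
    return stage1 + stage2


# Hypothetical study: 2,000 samples, 500,000 markers,
# 30% of samples in Stage 1, 5% of markers carried into Stage 2.
one = genotypes_one_stage(2000, 500_000)
two = genotypes_two_stage(2000, 500_000, pi_samples=0.3, pi_markers=0.05)
print(f"two-stage design needs {two / one:.1%} of the one-stage genotypes")
```

The saving comes entirely from typing fewer markers on the Stage 2 samples, which is why the Stage 1 threshold has to be liberal enough to carry true signals forward.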

Complex diseases: Many causes = many causal pathways!
[Figure: causal diagram with nodes for genetic susceptibility, diet, physical activity, obesity, diabetes, hypertension, hyperlipidemia, atherosclerosis, vulnerable plaques, and MI]

Pathways
• Many websites / companies provide ‘dynamic’ graphic models of molecular and biochemical pathways.
• Example: BioCarta:
• May be interested in potential joint and/or interaction effects of multiple genes in one pathway.

Interactions
• “The interdependent operation of two or more causes to produce or prevent an effect”
• “Differences in the effects of one or more factors according to the level of the remaining factor(s)”
Last, 2001

[Table: risk by two-locus genotype; columns AA, Aa, aa; rows BB, Bb, bb]
BB: At risk / No risk
Bb: At risk / No risk
bb: No risk

Why look for interactions?
• Improve detection of genetic (& environmental) risks
• Understand etiology / biology
• New hypotheses?
• Diagnostics
• Prevention and interventions

Dilution of effects
[Figure: OR for Gene A, with Drinker?, Micronutrient X, Environmental exposure Y, and Other gene Z as modifying factors]
• Within particular subgroups, the effect of a gene may be quite high or low

Statistical vs. Biological Interactions
• Not identical.
• One hypothesizes biological interaction
• But ‘tests’ for statistical interaction
• Does statistical evidence support our biological hypothesis?

Multiplicative vs. Additive Interactions
[Figure: three 2×2 tables of genotype (g, G) by exposure (e, E)]
• Multiplicative “effect” (ORs, RRs)
• Multiplicative interaction (ORs, RRs): departure of the interaction ratio from 1 is a multiplicative interaction
• Additive “effect”: RER = (OR(E,G) - 1) / ((OR(E,g) - 1) + (OR(e,G) - 1)) = (2.4 - 1) / ((2.0 - 1) + (1.4 - 1)) = 1.0
• RER = relative excess risk
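The two scales can be compared numerically. The RER function below follows the formula on the slide. The multiplicative ratio is written in its usual form, OR(E,G) / (OR(E,g) × OR(e,G)), which is an assumption here because the slide's own expression did not survive extraction; the function names are hypothetical.

```python
import math


def multiplicative_ratio(or_both, or_e_only, or_g_only):
    """Joint OR divided by the product of the single-factor ORs.

    A value of 1 means no interaction on the multiplicative scale.
    """
    return or_both / (or_e_only * or_g_only)


def relative_excess_risk(or_both, or_e_only, or_g_only):
    """RER as defined on the slide: joint excess over summed single excesses.

    A value of 1 means no interaction on the additive scale.
    """
    denominator = (or_e_only - 1) + (or_g_only - 1)
    if denominator == 0:
        raise ValueError("RER is undefined when both single-factor ORs equal 1")
    return (or_both - 1) / denominator


# Slide values: OR(E,g) = 2.0, OR(e,G) = 1.4, OR(E,G) = 2.4.
rer = relative_excess_risk(2.4, 2.0, 1.4)
ratio = multiplicative_ratio(2.4, 2.0, 1.4)
print(f"RER = {rer:.2f}, multiplicative ratio = {ratio:.2f}")
print("additive (RER close to 1):", math.isclose(rer, 1.0))
```

With the slide's ORs the joint effect is exactly additive (RER = 1) yet below the multiplicative expectation of 2.0 × 1.4 = 2.8, so the same data show no interaction on one scale and interaction on the other.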

Two possible causal pathways: additive and multiplicative interaction for colorectal cancer
Brennan, P. Carcinogenesis :
• Additive interaction: G1 and E5 are independent risk factors
• Multiplicative interaction: G2 and E2 work through the same pathway
• If factors are not known to act independently, use multiplicative.

Analysis of Multiple Genes
[Figure: models in order of increasing complexity: Joint / Additive, then Multiplicative]

More Complex Modeling
• Multifactor-dimensionality reduction
– (Moore & Williams, Ann Med 2002)
• Logic regression
– (Kooperberg & Ruczinski, Genetic Epi 2005)
• Multi-loci analysis
– (Marchini, Donnelly, Cardon, Nat Genet 2005)
• Bayesian epistasis association mapping
– (Zhang & Liu, Nat Genet 2007)