Picking SNPs Application to Association Studies Dana Crawford, PhD SeattleSNPs PGA University of Washington March 20, 2006.

Outline of Tutorial: concepts of tagSNPs; LD and haplotype definitions; haplotype blocks and definitions; tools to identify tagSNPs.

Why Do We Need tagSNPs? Whole genome: 15,000,000 SNPs, of which 6,000,000 have a minor allele frequency (MAF) > 5%. Too many SNPs to genotype! Example: E2F2. Average gene: 26.5 kb, 130 SNPs, 44 SNPs with MAF ≥ 5%.

SNPs Are Correlated (aka linkage disequilibrium): “the nonindependence of alleles at different sites” (Pritchard and Przeworski 2001). The genotype at one site can predict the genotype at another site. A proportion of sites are correlated.

Measuring Pair-wise SNP Correlations. SNP correlation is described by linkage disequilibrium (LD). Pair-wise measures of LD: $D'$ and $r^2$. $D = p_{AB} - p_A p_B$; $D' = D / D_{\max}$ (reflects recombination); $r^2 = D^2 / [f(A_1) f(A_2) f(B_1) f(B_2)]$ (reflects power).
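The pair-wise measures above can be sketched in a few lines of Python. This is a minimal illustration, not code from the tutorial: the function name `ld_stats` and its argument names are hypothetical, and the choice of $D_{\max}$ follows the usual Lewontin normalization, which the slide does not spell out. Here `p_a` and `p_b` play the role of $f(A_1)$ and $f(B_1)$, so $f(A_2) = 1 - p_a$ and $f(B_2) = 1 - p_b$.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    (All names are illustrative assumptions.)
    """
    d = p_ab - p_a * p_b

    # D_max depends on the sign of D (assumed Lewontin normalization).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 divides D^2 by the product of all four allele frequencies.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Allele B (frequency 0.2) occurs only on chromosomes carrying allele A (frequency 0.5).
d, d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.5, p_b=0.2)
print(d, d_prime, r2)
```

In this example $D'$ comes out at 1 while $r^2$ is only about 0.25, which shows why the two statistics answer different questions.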

LD Statistics: Practical Uses. $r^2$ is related to power: the sample size needed scales as $1/r^2$, so 1,000 cases and 1,000 controls at $r^2 = 1.0$ correspond to 1,250 cases and 1,250 controls at $r^2 = 0.80$. $D'$ is related to recombination history: $D' = 1$ indicates no recombination; $D' < 1$ indicates historical recombination.
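The $1/r^2$ rule can be checked with a short calculation. The sketch below is illustrative only; `required_sample_size` is a hypothetical helper name, and it assumes the simple scaling shown on the slide (sample size for the tagged SNP equals the direct-genotyping sample size divided by $r^2$).

```python
import math


def required_sample_size(n_direct, r2):
    """Scale a sample size by 1/r^2 (assumed simple rule from the slide).

    n_direct: sample size needed if the causal SNP were genotyped directly
    r2:       r^2 between the genotyped tagSNP and the causal SNP
    """
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    # Subtract a tiny tolerance so floating-point noise does not add a sample.
    return math.ceil(n_direct / r2 - 1e-9)


# 1,000 cases at r^2 = 1.0 become 1,250 cases at r^2 = 0.80.
print(required_sample_size(1000, 1.0), required_sample_size(1000, 0.80))
```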

Where to Find Population LD Statistics. For your gene or region of interest, search: HapMap (www.hapmap.org); Perlegen (genome.perlegen.com); Environmental Genome Project (egp.gs.washington.edu); SeattleSNPs PGA (pga.gs.washington.edu).

Visualizing Pair-wise LD

USF Visualizing Pair-wise LD

[Figure panels: SeattleSNPs + Perlegen; SeattleSNPs]

Visualizing Pair-wise LD: Beyond the Gene

SeattleSNPs Visualizing Pair-wise LD: Beyond the Gene

Multi-SNP Correlations (aka Haplotypes) “…a unique combination of genetic markers present in a chromosome.” pg 57 in Hartl & Clark, 1997

Constructing Haplotypes. Experimental approaches: collect pedigrees; somatic cell hybrids (human-rodent hybrid); allele-specific PCR. [Figure: pedigree genotypes C/T, A/G; C/C, A/G; T/T, G/G; C/T, A/A; C/C, A/G, and an individual with genotype C/T at SNP 1 and A/G at SNP 2.]

Constructing Haplotypes. Examples of haplotype inference software and methods: EM algorithm; Haploview; Arlequin; PHASE v2.1; HAPLOTYPER.
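To make the EM idea concrete, here is a minimal two-SNP sketch. It is not the implementation used by any of the packages listed above; the function name, the 0/1/2 genotype coding, and the fixed iteration count are all assumptions made for illustration. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible phasings in proportion to the current haplotype frequency estimates.

```python
from collections import Counter


def em_two_snp_haplotypes(genotypes, n_iter=100):
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2), where g is the number of copies (0, 1, 2)
               of the allele coded 1 at each SNP (hypothetical coding).
    Returns a dict mapping haplotype (a1, a2) to its estimated frequency.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    n_chrom = 2 * len(genotypes)

    base = Counter()      # haplotype counts from phase-known individuals
    n_double_het = 0      # individuals heterozygous at both SNPs
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1
            continue
        # At most one heterozygous site, so the two haplotypes are determined.
        base[(1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)] += 1
        base[(1 if g1 == 2 else 0, 1 if g2 == 2 else 0)] += 1

    freq = {h: 0.25 for h in haps}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11 rather than 01/10.
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        counts = {h: float(base[h]) for h in haps}
        counts[(0, 0)] += w * n_double_het
        counts[(1, 1)] += w * n_double_het
        counts[(0, 1)] += (1 - w) * n_double_het
        counts[(1, 0)] += (1 - w) * n_double_het
        # M-step: re-estimate frequencies from the expected counts.
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq


sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_two_snp_haplotypes(sample))
```

With this toy sample, every phase-known chromosome is either 00 or 11, so the estimate assigns the double heterozygotes almost entirely to the 00/11 phasing.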

Haplotypes in SeattleSNPs. More than 250 genes in the inflammation response re-sequenced. 2 populations: European- and African-descent. PHASE v2.0 results posted on the website. Interactive tool (VH1) to visualize and sort haplotypes.

Haplotypes in SeattleSNPs

Using LD and Haplotypes to Pick tagSNPs. $r^2$ is related to power: the sample size needed scales as $1/r^2$ (1,000 cases and 1,000 controls at $r^2 = 1.0$; 1,250 cases and 1,250 controls at $r^2 = 0.80$). Example: LDSelect in GVS. $D'$ is related to recombination history ($D' = 1$: no recombination; $D' < 1$: historical recombination). Example: haplotype “blocks”.

Using LD and Haplotypes to Pick tagSNPs. $r^2$ is related to power: the sample size needed scales as $1/r^2$ (1,000 cases and 1,000 controls at $r^2 = 1.0$; 1,250 cases and 1,250 controls at $r^2 = 0.80$). Example: LDSelect. Workflow: discovery genotype data, then pair-wise LD, then pick tagSNPs.

LDSelect: Using LD to Pick tagSNPs. LDSelect uses SNP discovery data (not haplotypes), finds all correlated SNPs to minimize the total number, and maintains the genetic diversity of the locus. Carlson et al. AJHG (2004).
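The general idea of grouping correlated SNPs and genotyping one per group can be sketched as a greedy binning loop. This is a simplified illustration in the spirit of $r^2$-based tagging, not the LDSelect program itself: the function name `greedy_ld_bins`, the dictionary layout for pair-wise $r^2$, and the 0.8 threshold are all assumptions.

```python
def greedy_ld_bins(snps, r2, threshold=0.8):
    """Greedily group SNPs into bins of correlated markers.

    snps:      list of SNP identifiers
    r2:        dict mapping frozenset({snp_a, snp_b}) to their pair-wise r^2
               (missing pairs are treated as uncorrelated)
    threshold: minimum r^2 for one SNP to stand in for another (assumed value)
    Returns a list of (tag_snp, bin_members) tuples.
    """
    def linked(a, b):
        return r2.get(frozenset((a, b)), 0.0) >= threshold

    remaining = list(snps)
    bins = []
    while remaining:
        # Choose the SNP that exceeds the threshold with the most remaining SNPs.
        tag = max(
            remaining,
            key=lambda s: sum(linked(s, t) for t in remaining if t != s),
        )
        members = [t for t in remaining if t == tag or linked(tag, t)]
        bins.append((tag, members))
        remaining = [t for t in remaining if t not in members]
    return bins


pairwise = {
    frozenset(("s1", "s2")): 0.90,
    frozenset(("s2", "s3")): 0.85,
    frozenset(("s1", "s3")): 0.50,
    frozenset(("s4", "s5")): 0.95,
}
for tag, members in greedy_ld_bins(["s1", "s2", "s3", "s4", "s5"], pairwise):
    print(tag, members)
```

In the toy data, five SNPs collapse into two bins, so two genotyped tagSNPs stand in for all five. A SNP that correlates with nothing ends up in a bin of its own, which is how this kind of approach keeps rare patterns represented.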

TagSNPs Are Population Specific. [Figure panels: CRP in European-Americans; CRP in African-Americans]

SNP Selection Using GVS

22 SNPs (>5% MAF), 7 tagSNPs

SNP Selection: tagSNP Data

Side Note: Categorizing tagSNPs. SNP context: nonrepetitive > repetitive. Location of SNP: coding > noncoding. Function: nonsynonymous > synonymous.
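When several SNPs in a bin could serve as the tagSNP, the preferences above can be applied as a sort key. The sketch below is hypothetical: the `Snp` record, its field names, and the decision to apply the three criteria in the order listed on the slide are assumptions, not something the slide specifies.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    name: str
    repetitive: bool      # SNP lies in repetitive sequence
    coding: bool          # SNP lies in coding sequence
    nonsynonymous: bool   # SNP changes the encoded amino acid


def priority_key(snp):
    # False sorts before True, so preferred categories come first:
    # nonrepetitive, then coding, then nonsynonymous (assumed ordering).
    return (snp.repetitive, not snp.coding, not snp.nonsynonymous)


def rank_candidates(snps):
    """Return candidate tagSNPs ordered from most to least preferred."""
    return sorted(snps, key=priority_key)


candidates = [
    Snp("snpA", repetitive=True, coding=False, nonsynonymous=False),
    Snp("snpB", repetitive=False, coding=True, nonsynonymous=False),
    Snp("snpC", repetitive=False, coding=True, nonsynonymous=True),
    Snp("snpD", repetitive=False, coding=False, nonsynonymous=False),
]
print([s.name for s in rank_candidates(candidates)])
```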

Categorizing tagSNPs

Haplotypes in Genetic Association Studies. Two main approaches with haplotypes: (1) haplotypes, then pick tagSNPs, then genotype samples; (2) pick tagSNPs, then infer haplotypes, then test for association.

Haplotypes in Genetic Association Studies. Two main approaches with haplotypes: (1) haplotypes, then pick tagSNPs, then genotype samples; (2) pick tagSNPs, then infer haplotypes, then test for association. Factors listed: recombination; natural selection; population history; population demography; haplotype block definition.

Haplotype “Blocks”: strong LD; few haplotypes; represent most chromosomes. Daly et al. Nat. Genet. (2001).

Block Definitions. $D'$ [Gabriel et al. Science (2002)]. Daly et al. Nat. Genet. (2001).

Block Definitions. Four-gamete test on the possible two-SNP haplotypes AB, ab, Ab, aB: fewer than 4 haplotypes observed, $D' = 1$: block; all 4 haplotypes observed, $D' < 1$: boundary.
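The four-gamete test reduces to counting distinct two-SNP haplotypes. The sketch below assumes phased, biallelic haplotypes given as strings or tuples; the function name and argument layout are illustrative.

```python
def four_gamete_test(haplotypes, i, j):
    """Return True if SNPs i and j are compatible with a block.

    haplotypes: iterable of phased haplotypes (strings or tuples of alleles)
    i, j:       positions of the two SNPs within each haplotype
    Fewer than four distinct two-SNP haplotypes means no evidence of
    historical recombination between the two sites (assumes biallelic SNPs).
    """
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) < 4


print(four_gamete_test(["AB", "ab", "Ab"], 0, 1))        # three gametes seen
print(four_gamete_test(["AB", "ab", "Ab", "aB"], 0, 1))  # all four gametes seen
```

The first call reports a block-compatible pair; the second sees all four gametes and so marks a boundary under this definition.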

Haplotype Blocks and tagSNPs. Identifying blocks and tagSNPs: manually; algorithms (Haploview).

Haplotype Blocks and tagSNPs. IL1B: 19 SNPs (MAF >5%); 4 “common” haplotypes; tagSNPs.

HapMap Data and Haploview

Import HapMap Data into Haploview

May not be minimal set

Minimal set of tagSNPs based on $r^2$

Note: HapMap is not complete variation data

Variation data, LD, and tagSNPs for ABCE1 in European-Americans. [Figure labels: HapMap; 7 SNPs; 35 SNPs; SeattleSNPs; 4 tagSNPs]

Where to Find Tagging Software: HaploBlockFinder, LDSelect, SNPtagger, TagIT, tagSNPs, Haploview.

Haplotypes, TagSNPs, and Caveats. Haplotypes are inferred. Block-like structure is assumed by some software. Block definitions differ. Block boundaries are sensitive to marker density. Genotype savings may not be great (recombination).

Common Errors in Association Studies [Bell and Cardon (2001)]: small sample size; subgroup analysis and multiple testing; random error; poorly matched control group; failure to attempt study replication; failure to detect LD with adjacent loci; overinterpreting results and positive publication bias; unwarranted ‘candidate gene’ declaration after identifying association in an arbitrary genetic region. Follow-up examples noted on the slide: e.g., second case/control study; gene expression studies.

Summary (Picking SNPs: Application to Association Studies). Resources are available for pair-wise LD and haplotypes. Software for tagSNP selection is available. Be aware of the limitations of the approach you choose. Replication is required by several journals.

SeattleSNPs Genotyping Service. Free genotyping (BeadArray). Emphasis on young investigators. Research related to heart, lung, blood, or sleep disorders. Moderate to large population samples. Apply at pga.gs.washington.edu. Due date: TBA.