INTRODUCTION Genome-wide association studies are now feasible. Measuring allele frequencies of pools of cases and controls, instead of between individuals,

Slides:



Advertisements
Similar presentations
Attaching statistical weight to DNA test results 1.Single source samples 2.Relatives 3.Substructure 4.Error rates 5.Mixtures/allelic drop out 6.Database.
Advertisements

Regulation of Consumer Tests in California AAAS Meeting June 1-2, 2009 Beatrice OKeefe Acting Chief, Laboratory Field Services California Department of.
Original Figures for "Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring"
Chapter 2 Exploring Data with Graphs and Numerical Summaries
Which Phenotypes Can be Predicted from a Genome Wide Scan of Single Nucleotide Polymorphisms (SNPs): Ethnicity vs. Breast Cancer Mohsen Hajiloo, Russell.
Single Nucleotide Polymorphism And Association Studies Stat 115 Dec 12, 2006.
Chapter 11- Confidence Intervals for Univariate Data Math 22 Introductory Statistics.
Objectives Cover some of the essential concepts for GWAS that have not yet been covered Hardy-Weinberg equilibrium Meta-analysis SNP Imputation Review.
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Section 7.3 Estimating a Population mean µ (σ known) Objective Find the confidence.
Ingredients for a successful genome-wide association studies: A statistical view Scott Weiss and Christoph Lange Channing Laboratory Pulmonary and Critical.
Development, Implementation and Testing of a DNA Microarray Test Suite Ehsanul Haque Mentors: Dr. Cecilie Boysen Dr. Jim Breaux ViaLogy Corp.
Candidate Gene Resource Steering Committee Meeting July 25, 2006.
8 Statistical Intervals for a Single Sample CHAPTER OUTLINE
Edpsy 511 Homework 1: Due 2/6.
Statistics 800: Quantitative Business Analysis for Decision Making Measures of Locations and Variability.
Resolving membership in a study in shared aggregate genetics data David W. Craig, Ph.D. Investigator & Associate Director Neurogenomics Division
8-1 Introduction In the previous chapter we illustrated how a parameter can be estimated from sample data. However, it is important to understand how.
KinSNP Software for homozygosity mapping of disease genes using SNP microarrays El-Ad David Amir 1, Ofer Bartal 1, Yoni Sheinin 2, Ruti Parvari 2 and Vered.
Andrew Singleton Molecular Genetics Section Laboratory of Neurogenetics National Institute on Aging Andrew Singleton, Chief of the.
Design Considerations in Large- Scale Genetic Association Studies Michael Boehnke, Andrew Skol, Laura Scott, Cristen Willer, Gonçalo Abecasis, Anne Jackson,
1 Formal Evaluation Techniques Chapter 7. 2 test set error rates, confusion matrices, lift charts Focusing on formal evaluation methods for supervised.
Things that I think are important Chapter 1 Bar graphs, histograms Outliers Mean, median, mode, quartiles of data Variance and standard deviation of.
Factors to Consider in Selecting a Genotyping Platform Elizabeth Pugh June 22, 2007.
1 Introduction to Estimation Chapter Concepts of Estimation The objective of estimation is to determine the value of a population parameter on the.
Probe-Level Data Normalisation: RMA and GC-RMA Sam Robson Images courtesy of Neil Ward, European Application Engineer, Agilent Technologies.
Measures of Variability Objective: Students should know what a variance and standard deviation are and for what type of data they typically used.
Instrumentation (cont.) February 28 Note: Measurement Plan Due Next Week.
Figure S1. Quantile-quantile plot in –log10 scale for the individual studies The red line represents concordance of observed and expected values. The shaded.
Scenario 6 Distinguishing different types of leukemia to target treatment.
A Genome-wide association study of Copy number variation in schizophrenia Andrés Ingason CNS Division, deCODE Genetics. Research Institute of Biological.
Development and Application of SNP markers in Genome of shrimp (Fenneropenaeus chinensis) Jianyong Zhang Marine Biology.
Is there a difference in how males and female students perceive their weight?
© 2010 by The Samuel Roberts Noble Foundation, Inc. 1 The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA 2 National Center.
10cM - Linkage Mapping Set v2 ABI Median intermarker distance: 4.7 Mb Mean intermarker distance: 5.6 Mb Mean genetic gap distance: 8.9 cM Average Heterozygosity.
Educational Research: Competencies for Analysis and Application, 9 th edition. Gay, Mills, & Airasian © 2009 Pearson Education, Inc. All rights reserved.
Methods in genome wide association studies. Norú Moreno
Copy Number Variation Eleanor Feingold University of Pittsburgh March 2012.
Genotype Calling Jackson Pang Digvijay Singh Electrical Engineering, UCLA.
Other genomic arrays: Methylation, chIP on chip… UBio Training Courses.
What’s with all those numbers?.  What are Statistics?

Edpsy 511 Exploratory Data Analysis Homework 1: Due 9/19.
Exercise 1 DNA identification. To which population an individual belongs? Two populations of lab-mice have been accidentally put in a same cage. Your.
Medical Statistics (full English class) Ji-Qian Fang School of Public Health Sun Yat-Sen University.
Schematic of the single variant polymorphism (SNP) genotyping assay.
Date of download: 7/2/2016 Copyright © 2016 American Medical Association. All rights reserved. From: How to Interpret a Genome-wide Association Study JAMA.
Jordan S. Orange, MD, PhD, Joseph T
Advanced Quantitative Techniques
ESTIMATION.
Global Variation in Copy Number in the Human Genome
Statistical Methods For Engineers
Jordan S. Orange, MD, PhD, Joseph T
Detection, Imputation, and Association Analysis of Small Deletions and Null Alleles on Oligonucleotide Arrays  Lude Franke, Carolien G.F. de Kovel, Yurii.
Canine hip dysplasia is predictable by genotyping
Identification of Paralogs in RADseq data
Volume 8, Issue 4, Pages (April 2017)
Phenotypically Concordant and Discordant Monozygotic Twins Display Different DNA Copy-Number-Variation Profiles  Carl E.G. Bruder, Arkadiusz Piotrowski,
Exercise: Effect of the IL6R gene on IL-6R concentration
Thomas Willems, Melissa Gymrek, G
SNP Arrays in Heterogeneous Tissue: Highly Accurate Collection of Both Germline and Somatic Genetic Information from Unpaired Single Tumor Samples  Guillaume.
Determining Which Method to use
Simultaneous Genotype Calling and Haplotype Phasing Improves Genotype Accuracy and Reduces False-Positive Associations for Genome-wide Association Studies 
Volume 8, Issue 4, Pages (April 2017)
Volume 42, Issue 6, Pages (June 2011)
Figure 2 Distribution of DEPDC5 variants in patients and controls
Evaluating the Effects of Imputation on the Power, Coverage, and Cost Efficiency of Genome-wide SNP Platforms  Carl A. Anderson, Fredrik H. Pettersson,
Genotype-Imputation Accuracy across Worldwide Human Populations
Jung-Ying Tzeng, Daowen Zhang  The American Journal of Human Genetics 
Presentation transcript:

INTRODUCTION Genome-wide association studies are now feasible. Measuring allele frequencies of pools of cases and controls, instead of between individuals, would greatly decrease costs and facilitate discovery. DNA pooling has successfully been used on the 10K Affymetrix microarray platform, but no groups have shown that it works on the Illumina platform. We tested pooling on the Infinium chips Human-1 (~109K SNPs) and HumanHap550 (~555K SNPs), which have ~23K overlap. METHODS Individuals and Pool Construction Coriell panel NDPT008 (neurologically normal Caucasian males and females ages 55-84, n=88) Individual genotyping in Hardy lab, call rates >95% Six equimolar pools manually prepared and genotyped by McMahon lab Concentration assayed by PicoGreen (serial dilutions from 200 to 50 to 10ng and reconcentration to 50ng, variance: 10% of mean) and verified by nanodrop Allele Frequency Calculations, Statistical Analysis MAF 88 calculated from 88 individual Hardy lab chips RAF raw calculated by Illumina’s BeadStudio SNP-specific correction factors k and AA avg /BB avg derived separately for each platform: –109K: 242 individuals genotyped in McMahon lab –550K: 120 HapMap individuals genotyped by Illumina DNA Pooling for Whole-Genome Association Studies with Illumina Infinium Assays Amber E. Baum 1, Nirmala Akula 1, Ben Klemens 3, Imer Cardona 1, Winston Corona 1, Andrew Singleton 2, John Hardy 2, Sevilla Detera-Wadleigh 1, Francis J. McMahon 1 1 Genetic Basis of Mood and Anxiety Disorders Unit, National Institute of Mental Health, and 2 Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, 3 Brookings Institution Scatter plots comparing raw and normalized pool frequencies to known population allele frequency. Upper panels: font- scaled Pearson’s r, with red stars indicating significance (all p < ). Lower panels: loess-fitted scatterplots. RESULTS Correlation Matrices RAF k – BB avg AA avg AA avg = avg(RAF raw ) over AA loci Normalizes homozygotes: AA to 1, BB to 0 RAF n k-correction: Hoogendoorn et al, Hum Genet (2000) 107: normalization: Craig et al, BMC Genomics (2005) 6:138 RAF k RAF raw X raw X raw + kY raw k SNP = avg(X raw /Y raw ) over AB loci Corrects for deviation of heterozygote from 50% A X raw X raw + Y raw raw image data, no correction Formulae for raw and normalized relative allele signals. Parameters are calculated individually for each SNP Histograms of (MAF 88 - MAF pool ) for raw and normalized frequencies, overlaid with quartile plots. Black lines illustrate the +/-1 SD interval and the red line marks 0. Difference Distributions Pool A Pool B Pool C SUMMARY AND RECOMMENDATIONS  Simple normalization methods applied to the HumanHap550 chip produce data highly correlated with data from individual genotyping.  The HumanHap550 showed greater precision and accuracy than Human-1 (population vs normalized r=0.90, median differences (raw), (k-corrected), and (normalized).  Additional improvements may be possible with chip-specific normalization methods proposed by Illumina (personal communication). We recommend: 1. Use of more than one DNA quantification method. The greatest source of variance in the pooling process is DNA quality and pool construction. 2. Use of test datasets for verification of accuracy of pooling technique. 3. Conceptualization of pooling/whole-genome association as one of several tools to prioritize SNPs for individual genotyping Research supported by the Intramural Research Programs of the National Institutes of Mental Health and Aging, NIH, DHHS assay chip variance. Correlations between all pools were highly significant with r= Pooling schematic. Three separate pools were made to assay the variance introduced by the pooling process; one pool was sampled with two chips to