SNP chips Advanced Microarray Analysis Mark Reimers, Dept Biostatistics, VCU, Fall 2008.


1 SNP chips Advanced Microarray Analysis Mark Reimers, Dept Biostatistics, VCU, Fall 2008

2 Affy SNP chips

3 SNP Chip Probe Design
Ten 25-mers overlapping the SNP
Alleles A & B
Sense and Anti-sense
–or PM and MM (old)

4 RMA for SNP chips
Initial Affy software wasn’t very accurate
Rabbee & Speed (2006) proposed RLMM, an RMA-like method using:
–Quantile normalization
–Two variables (A & B signals)
–Discriminant analysis
Much better than Affy software
Variant (BRLMM) adopted by Affy
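The quantile normalization step listed above can be sketched as follows. This is a minimal pure-Python illustration, not the RLMM code: the function name `quantile_normalize` and the toy intensities are hypothetical, and ties are broken by position rather than averaged.

```python
def quantile_normalize(chips):
    """Force every chip (a list of intensities) to share one distribution.

    chips: list of equal-length lists, one list per chip.
    Assumption: ties are broken by position; a fuller version would
    average the reference values over tied ranks.
    """
    n = len(chips[0])
    # Sort each chip, then average across chips at each rank to get
    # the common reference distribution.
    sorted_chips = [sorted(chip) for chip in chips]
    reference = [sum(col) / len(chips) for col in zip(*sorted_chips)]

    normalized = []
    for chip in chips:
        # order[r] is the index of the r-th smallest value on this chip.
        order = sorted(range(n), key=lambda i: chip[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data (made up): two chips, three probes each.
chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(chips)
```

After normalization every chip holds the same set of values; only the order of the values, which carries the per-probe information, differs between chips.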

5 Discriminating SNPs
Estimate a common covariance for the clusters on ‘training’ set (HapMap) data
Separate clusters by Mahalanobis metric
Use pre-defined clusters & metric to tell apart alleles on new data
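A minimal sketch of this discrimination step, assuming each sample at a SNP is a point (A signal, B signal) and that the training set supplies labelled clusters for AA, AB and BB. The function names and the numbers are hypothetical; the actual RLMM estimation is more involved.

```python
def fit_clusters(training):
    """Return cluster centers and one pooled (common) 2x2 covariance.

    training: dict mapping genotype label -> list of (a, b) points.
    The covariance is returned as (sxx, sxy, syy).
    """
    sxx = sxy = syy = 0.0
    dof = 0
    centers = {}
    for label, pts in training.items():
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        centers[label] = (mx, my)
        for x, y in pts:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
        dof += len(pts) - 1
    return centers, (sxx / dof, sxy / dof, syy / dof)


def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance d' * inverse(cov) * d in 2-D."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def call_genotype(point, centers, cov):
    """Assign a new point to the nearest cluster in the Mahalanobis metric."""
    return min(centers, key=lambda g: mahalanobis_sq(point, centers[g], cov))


# Made-up training points standing in for HapMap data at one SNP.
training = {
    "AA": [(10.0, 6.0), (10.4, 6.2), (9.8, 5.9), (10.2, 6.3)],
    "AB": [(8.5, 8.4), (8.8, 8.7), (8.3, 8.2), (8.6, 8.9)],
    "BB": [(6.1, 10.1), (6.3, 10.3), (5.9, 9.8), (6.2, 10.4)],
}
centers, cov = fit_clusters(training)
new_call = call_genotype((8.4, 8.6), centers, cov)
```

Because the centers and covariance are fixed in advance, a new chip is called correctly only if its signals follow the same distribution as the training data.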

6 Success Rate
90% (MPAM) to 98% (CRLMM) called at comparable accuracy on HapMap data
–Cross-validation estimate
BUT new chips don’t have the same distributions as the ‘training’ set

7 CRLMM - a heroic solution
RLMM couldn’t be extended across labs
Still problems with several hundred SNPs
CRLMM addresses both these issues by careful normalization
Achieves accuracy of 99.85% on hets; 99.95% on homozygotes
Most complicated statistical calculation in BioC!

8 CRLMM Overview
1. Normalize intensity on each chip separately by regressing on sequence predictors and fragment length
2. Summarize $\theta_{A,+}$, $\theta_{B,+}$, $\theta_{A,-}$, $\theta_{B,-}$ by median polish: $M_+ = \theta_{A,+} - \theta_{B,+}$; $M_- = \theta_{A,-} - \theta_{B,-}$
3. Model log ratio bias on each chip by a fitted bias function
4. Estimate log ratio bias using E-M
–Where $Z_i$ indexes which SNP state is likely
–k = 1,2,3 for AA, AB, BB

9 Normalization – Step 1
Regress (PM) intensity on sequence predictors and fragment length
[Figure: $h_b(t)$ for all four bases on two chips]
[Figure: $g(L)$ and 95% CI on one chip]

10 Normalization – Step 1
Too many $h_b(t)$’s
–Impose constraint: $h_b(t)$ is a cubic spline with 5 df on [1,25]
Forces neighboring values of h to be close
Allows variation in smoothness (unlike loess)
Subtract fitted values from signal
BUT: bias still present
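One way to write the Step 1 regression described above, as a sketch only: the log2 scale, the indicator notation, the probe index $i$ and the error term $\varepsilon_i$ are assumptions, not taken from the slides. Here $b_{i,t}$ is the base at position $t$ of probe $i$ and $L_i$ is the length of the fragment containing it.

```latex
\log_2 \mathrm{PM}_i
  = \sum_{t=1}^{25} \sum_{b \in \{A,C,G,T\}} h_b(t)\,\mathbf{1}[\,b_{i,t} = b\,]
  + g(L_i) + \varepsilon_i
```

Left unconstrained, this has a separate coefficient $h_b(t)$ for every base at every one of the 25 positions. Requiring each $h_b$ to be a cubic spline with 5 df reduces that to 5 coefficients per base.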

11 Step 2 – Summarization
Median Polish
–Tukey’s exploratory method for arrays of numbers
–Iterative method
Subtract medians of each row and each column (and accumulate) until medians converge
Robust
Fast
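The iteration described above can be sketched in a few lines of Python. This is an illustrative version of Tukey's median polish, not the Bioconductor implementation; the function name, the stopping tolerance and the toy table are assumptions.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-6):
    """Decompose table[i][j] into overall + row[i] + col[j] + residual[i][j]."""
    rows, cols = len(table), len(table[0])
    resid = [row[:] for row in table]
    overall = 0.0
    row_eff = [0.0] * rows
    col_eff = [0.0] * cols
    for _ in range(max_iter):
        # Subtract row medians and accumulate them in the row effects.
        row_med = [median(r) for r in resid]
        for i in range(rows):
            row_eff[i] += row_med[i]
            for j in range(cols):
                resid[i][j] -= row_med[i]
        # Move the median of the column effects into the overall term.
        shift = median(col_eff)
        overall += shift
        col_eff = [c - shift for c in col_eff]
        # Subtract column medians and accumulate them in the column effects.
        col_med = [median(resid[i][j] for i in range(rows)) for j in range(cols)]
        for j in range(cols):
            col_eff[j] += col_med[j]
            for i in range(rows):
                resid[i][j] -= col_med[j]
        # Move the median of the row effects into the overall term.
        shift = median(row_eff)
        overall += shift
        row_eff = [r - shift for r in row_eff]
        # Stop once the medians being removed are negligible.
        if max(abs(v) for v in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid


# Made-up table: an additive pattern plus one outlier in the last cell.
table = [[10.0, 12.0, 8.0], [11.0, 13.0, 9.0], [9.0, 11.0, 30.0]]
overall, row_eff, col_eff, resid = median_polish(table)
```

Because medians are used instead of means, a single outlying cell tends to end up in the residuals rather than distorting the row and column effects.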

12 Step 3 – Ratio Normalization
Fit bias function
–one term of its form reflects the allele bases
But what is k? Estimate by E-M
[Figure: $f_L(L)$ for one chip]

13 E-M Algorithm
Systematic way to ‘guess and improve’
Start with putative assignments to classes
–i.e. guess k based on overall separations
Estimate bias for each k: $f_{i,k}$
Use residuals from fit to classify again
Repeat until convergence!
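The ‘guess and improve’ loop can be illustrated with a deliberately simplified sketch. Assumptions: the data are one log ratio M per SNP on a single chip, the fitted bias function for each class k is replaced by a constant class mean, and the three classes share one variance. CRLMM itself fits a much richer model; the function name and data below are hypothetical.

```python
import math


def em_genotype_calls(m_values, n_iter=50):
    """E-M for a 3-class Gaussian mixture on log ratios (shared variance)."""
    labels = ["AA", "AB", "BB"]
    # Initial guess from the overall separation: AA high, AB near 0, BB low.
    means = [max(m_values), 0.0, min(m_values)]
    weights = [1.0 / 3] * 3
    var = 1.0
    n = len(m_values)
    for _ in range(n_iter):
        # E-step: probability that each SNP belongs to each class.
        resp = []
        for m in m_values:
            logd = [math.log(w) - (m - mu) ** 2 / (2 * var)
                    for w, mu in zip(weights, means)]
            top = max(logd)  # subtract the maximum to avoid underflow
            dens = [math.exp(v - top) for v in logd]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate class means, weights and the shared variance.
        for k in range(3):
            nk = sum(r[k] for r in resp)
            if nk > 1e-12:
                means[k] = sum(r[k] * m for r, m in zip(resp, m_values)) / nk
            weights[k] = max(nk / n, 1e-12)
        var = sum(r[k] * (m - means[k]) ** 2
                  for r, m in zip(resp, m_values) for k in range(3)) / n
        var = max(var, 1e-6)
    # Final classification: the most probable class for each SNP.
    calls = [labels[max(range(3), key=lambda k: r[k])] for r in resp]
    return means, calls


# Made-up log ratios for twelve SNPs on one chip.
m_values = [2.1, 1.9, 2.0, 2.2, 0.1, -0.1, 0.0, 0.2, -2.0, -1.8, -2.1, -1.9]
means, calls = em_genotype_calls(m_values)
```

The E-step plays the role of ‘classify again’ and the M-step the role of ‘estimate bias for each k’; a fixed number of iterations stands in for a proper convergence check.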

14 Final Step: Calling
Aim: separation in two-dimensional log-ratio space
Accuracy > 99.85% on all HapMap calls

