Published by Kenzie Dyment. Modified over 9 years ago.
1
Randa Stringer. Supervisor: Dr. Guillaume Paré. A review of quality control and pre-processing measures for the Illumina 450K BeadChip
2
Steps for Review: Sample Quality; Probe Quality; Background correction; Normalization; Cellular composition; Batch effects
3
Array Design: > 485,000 CpG sites. Covers 99% of RefSeq genes, with an average of 17 sites per gene, distributed across promoter, 5’ UTR, first exon, gene body, and 3’ UTR. Covers 96% of known CpG islands.
4
Sample Quality. Reported vs. predicted sex: use DNA methylation to predict sex with the minfi getSex function; if yMed - xMed is less than the cutoff, the sample is predicted female, otherwise male. Sample detection cut-offs: threshold on the proportion of failed probes in a sample (usually < 0.05 or 0.01).
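The two sample-level checks on this slide can be sketched in plain Python. This is an illustration of the rule as stated on the slide, not minfi's code: the function names, the cutoff of -2, and the detection thresholds are all assumptions chosen for the example.

```python
import math
from statistics import median


def predict_sex(x_intensities, y_intensities, cutoff=-2.0):
    """Predict sex from total (methylated + unmethylated) probe intensities.

    x_intensities / y_intensities: intensities of probes on chrX / chrY.
    cutoff: assumed value for illustration; tune it on your own data.
    """
    x_med = median(math.log2(v) for v in x_intensities)
    y_med = median(math.log2(v) for v in y_intensities)
    # Rule from the slide: yMed - xMed below the cutoff means female.
    return "F" if (y_med - x_med) < cutoff else "M"


def sample_fails(detection_pvalues, p_cutoff=0.01, max_failed_fraction=0.05):
    """Flag a sample when too many of its probes fail detection.

    Both thresholds are assumed example values.
    """
    failed = sum(1 for p in detection_pvalues if p > p_cutoff)
    return failed / len(detection_pvalues) > max_failed_fraction
```

A sample whose predicted sex disagrees with the reported sex, or whose failed-probe fraction exceeds the threshold, would be a candidate for removal.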
5
Probe Quality: probe detection cut-offs; bead count (> 3); remove probes on sex chromosomes; probes containing SNPs (MAF > 1%); cross-reactive probes.
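The probe-level filters listed above could be combined into a single predicate, roughly as follows. The `Probe` record, its field names, and the default thresholds are hypothetical; in practice the SNP and cross-reactivity flags would come from published annotation lists.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    probe_id: str          # hypothetical identifier
    chrom: str             # e.g. "chr1", "chrX"
    detection_p: float     # worst detection p-value across samples
    min_bead_count: int    # lowest bead count across samples
    snp_maf: float         # MAF of an overlapping SNP, 0.0 if none
    cross_reactive: bool   # flagged as cross-reactive in an annotation list


def keep_probe(p, p_cutoff=0.01, min_beads=3, maf_cutoff=0.01):
    """Return True if the probe passes every filter (thresholds are assumed)."""
    if p.detection_p > p_cutoff:
        return False
    if p.min_bead_count < min_beads:
        return False
    if p.chrom in ("chrX", "chrY"):
        return False
    if p.snp_maf > maf_cutoff:
        return False
    if p.cross_reactive:
        return False
    return True


def filter_probes(probes):
    """Keep only the probes that pass all quality filters."""
    return [p for p in probes if keep_probe(p)]
```

Keeping each criterion as its own early return makes it easy to log how many probes each filter removes.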
6
Background Correction: background subtraction method, available in GenomeStudio. Background calculated from negative control probes is subtracted from all probes, separately for each channel (red vs. green) (GenomeStudio Methylation Module v1.8 User Guide).
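A minimal sketch of per-channel background subtraction follows. It assumes the background is the mean of the negative control intensities and that corrected values are floored at zero; both choices are assumptions for illustration and may differ from what GenomeStudio does internally.

```python
from statistics import mean


def subtract_background(intensities, negative_controls):
    """Subtract the negative-control background from one channel.

    Using the mean as the background estimate and flooring at zero
    are assumptions made for this example.
    """
    background = mean(negative_controls)
    return [max(v - background, 0.0) for v in intensities]


def correct_sample(red, green, neg_red, neg_green):
    """Correct the red and green channels independently."""
    return (
        subtract_background(red, neg_red),
        subtract_background(green, neg_green),
    )
```

Each channel uses only its own negative controls, which is the "separately for each channel" point on the slide.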
7
Normalization. Goal: reduce non-biological variation. Equalizes probe intensity and signal distributions across arrays and between colour channels. New challenges with DNA methylation vs. gene expression techniques: systematic/technical variation; novel probe design.
8
Normalization for Illumina 450K. Problem: 2-type probe design. Infinium I probe: 2 different probes per CpG. Infinium II probe: single base extension at CpG. (Maksimovic et al. Genome Biology 2012)
9
CpG Content: Infinium II ≤ 3, Infinium I ≥ 3. Compressed β value distribution in InfII. Solution: scale Infinium II probes to InfI probes. (Maksimovic et al. Genome Biology 2012)
10
Normalization to Internal Controls (Illumina GenomeStudio): probe intensity is multiplied by a constant normalization factor (NF); NF is calculated as the average of controls in a reference sample (GenomeStudio Methylation Module v1.8 User Guide). Doesn’t account for the InfI vs. InfII probe issues.
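One way to read the normalization-factor idea is sketched below: each sample is scaled so that the average of its normalization controls matches the average in a chosen reference sample. Dividing by the sample's own control average is an assumption added here to make the scaling concrete, and the function name is hypothetical.

```python
from statistics import mean


def normalize_to_controls(sample_intensities, sample_controls, reference_controls):
    """Scale one channel of one sample to the reference sample's control level.

    The factor is (reference control mean) / (sample control mean); this
    exact form is an assumption for illustration.
    """
    factor = mean(reference_controls) / mean(sample_controls)
    return [v * factor for v in sample_intensities]
```

Because a single constant is applied to every probe, the Infinium I and II distributions move together, which is why this approach leaves the probe-type difference untouched.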
11
Peak-Based Correction (PBC): uses peak summits to correct β values. Convert β to M values; determine peaks for I and II probes with kernel density estimation; rescale M values by peak summits; rescale these corrected M values to the I range and convert back to β values. Figure panels: Raw, PBC. (Dedeurwaerder et al. Epigenomics 2011)
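The steps on this slide can be sketched in pure Python as below. This is a simplified illustration, not the published implementation: the fixed kernel bandwidth, the evaluation grid, and the function names are assumptions, β values must lie strictly between 0 and 1, and the sketch does not guard against a peak summit landing exactly at M = 0.

```python
import math


def beta_to_m(beta):
    """M value is the log2 ratio of methylated to unmethylated signal."""
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m):
    return 2.0 ** m / (2.0 ** m + 1.0)


def kde_peak(values, lo, hi, bandwidth=0.3, steps=200):
    """Grid point in [lo, hi] with the highest Gaussian kernel density."""
    best_x, best_density = lo, -1.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        density = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        if density > best_density:
            best_x, best_density = x, density
    return best_x


def peak_summits(m_values, m_range=8.0):
    """Summits of the unmethylated (M < 0) and methylated (M >= 0) peaks."""
    unmeth = kde_peak([m for m in m_values if m < 0], -m_range, 0.0)
    meth = kde_peak([m for m in m_values if m >= 0], 0.0, m_range)
    return unmeth, meth


def pbc(beta_type1, beta_type2):
    """Rescale Infinium II beta values onto the Infinium I peak positions."""
    m1 = [beta_to_m(b) for b in beta_type1]
    m2 = [beta_to_m(b) for b in beta_type2]
    unmeth1, meth1 = peak_summits(m1)
    unmeth2, meth2 = peak_summits(m2)
    corrected = []
    for m in m2:
        if m < 0:
            # Divide by the type II summit, then stretch to the type I summit.
            corrected.append(m_to_beta(m / abs(unmeth2) * abs(unmeth1)))
        else:
            corrected.append(m_to_beta(m / meth2 * meth1))
    return corrected
```

For example, if Infinium I probes peak near β = 0.1 and 0.9 while Infinium II probes peak near 0.2 and 0.8, the corrected Infinium II values are pulled out toward 0.1 and 0.9.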
12
Subset Quantile Normalization (SQN): modeled after SQN methods in expression. Probes are separated and those with poor detection removed; ‘anchors’ (RQs) are chosen from InfI probes; target quantiles are estimated for InfI and II; InfI and II are normalized to their RQs; the dataset is rebuilt. (Touleimat and Tost, Epigenomics, 2012)
13
SQN Cont’d. Figure panels: no normalization; unique RQs; RQs by ‘relation to CpG’; RQs by ‘relation to gene sequence’. (Maksimovic et al. Genome Biology 2012)
14
Subset Within-Array Normalization (SWAN): allows InfI and InfII probes to be normalized together. A subset of N InfI and InfII probes is chosen based on underlying CpG content; methylated and unmethylated channels are treated separately; the mean intensity for each of the 3N is calculated; InfI and II probes are adjusted separately by linear interpolation. (Maksimovic et al. Genome Biology 2012)
15
Beta-MIxture Quantile normalization (BMIQ): novel normalization method. Fit a 3-state (U/H/M) beta mixture model to InfI and InfII probes separately; transform InfII U and M probes using the inverse of the cumulative beta distribution estimated from the respective InfI probes; for H probes, perform a dilation transformation to fit the data into the gap. (Teschendorff et al. Bioinformatics 2012)
16
START Data. Figure panels: Raw Data, SWAN Normalized.
17
Cellular Composition (adapted from Correa-Rocha et al. Pediatric Research 2012)
18
Estimations by Houseman Houseman et al. BMC Bioinformatics 2012
19
Batch Effects: can be assessed using principal component analysis or variations on singular value decomposition (e.g. sva). The ComBat method uses a parametric or non-parametric empirical Bayes framework to adjust for a known source of batch effects.
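To make "adjusting for a known batch" concrete, here is a deliberately simpler stand-in: per-batch mean centring of one probe's values. It is not ComBat. It has no empirical Bayes shrinkage, no variance adjustment, and no protection of biological covariates, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_batches(values, batches):
    """Shift each batch so its mean equals the overall mean.

    values: one probe's values (e.g. M values) across samples.
    batches: the known batch label of each sample, in the same order.
    """
    grand_mean = mean(values)
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_means = {batch: mean(vs) for batch, vs in by_batch.items()}
    return [
        value - batch_means[batch] + grand_mean
        for value, batch in zip(values, batches)
    ]
```

Mean centring removes only additive shifts between batches; an empirical Bayes method such as ComBat additionally pools information across probes to stabilise the per-batch estimates.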
20
Singular Value Decomposition (START)
21
Questions & Discussion