Presentation is loading. Please wait.

Presentation is loading. Please wait.

Copy-number estimation using Robust Multichip Analysis - Supplementary materials for the aroma.affymetrix lab session Henrik Bengtsson & Terry Speed Dept.

Similar presentations


Presentation on theme: "Copy-number estimation using Robust Multichip Analysis - Supplementary materials for the aroma.affymetrix lab session Henrik Bengtsson & Terry Speed Dept."— Presentation transcript:

1 Copy-number estimation using Robust Multichip Analysis - Supplementary materials for the aroma.affymetrix lab session Henrik Bengtsson & Terry Speed Dept of Statistics, UC Berkeley August 7, 2007 BioC 2007

2 Affymetrix chips

3 Generic Affymetrix chip
> 1 million identical 25bp sequences * * 1.28 cm 1.28 cm 6.5 million probes/ chip Feature size: 100µm to 18µm to 11µm and now 5µm. Soon: 1µm, 0.8µm, with a huge increase in number of probes.

4 Abbreviated generic assay description
Start with target gDNA (genomic DNA) or mRNA. Obtain labeled single-stranded target DNA fragments for hybridization to the probes on the chip. After hybridization, washing, staining and scanning we get a digital image. This is summarized across pixels to probe-level intensities before we begin. They are our raw data.

5 Affymetrix probe terminology
Target DNA: ...CGTAGCCATCGGTAAGTACTCAATGATAG... ||||||||||||||||||||||||| Perfect match (PM): ATCGGTAGCCATTCATGAGTTACTA Mis-match (MM): ATCGGTAGCCATACATGAGTTACTA 25 nucleotides * other PMs Other DNA Other seq. * PM Target seq. X * * * MM

6 Affymetrix SNP chips (Mapping 10K, 100K, 500K)

7 Single Nucleotide Polymorphism (SNP)
Definition: A sequence variation such that two chromosomes may differ by a single nucleotide (A, T, C, or G). Allele A: A ...CGTAGCCATCGGTA/GTACTCAATGATAG... Allele B: G A person is either AA, AB, or BB at this SNP.

8 Probes for SNPs AA AB BB * * * * PMA: ATCGGTAGCCATTCATGAGTTACTA
Allele A: ...CGTAGCCATCGGTAAGTACTCAATGATAG... Allele B: ...CGTAGCCATCGGTAGGTACTCAATGATAG... PMB: ATCGGTAGCCATCCATGAGTTACTA (Also MMs, but not in the newer chips, so we will not use these!) * * PMA >> PMB AA * PMA ¼ PMB AB * PMA << PMB BB

9 Copy-number analysis with SNP arrays

10 Copy-number estimation using Robust Multichip Analysis (CRMA)
Preprocessing (probe signals) allelic crosstalk (or quantile) Total CN PM = PMA + PMB Summarization (SNP signals  ) log-additive PM only Post-processing fragment-length (GC-content) Raw total CNs R = Reference Mij = log2(ij /Rj) chip i, probe j

11 Copy-number estimation using Robust Multichip Analysis (CRMA)
Cross-hybridization: Allele A: TCGGTAAGTACTC Allele B: TCGGTATGTACTC CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) AA * PMA >> PMB * PMA ¼ PMB AB * PMA << PMB BB

12 Copy-number estimation using Robust Multichip Analysis (CRMA)
AA TT AT CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) PMT PMA offset +

13 Copy-number estimation using Robust Multichip Analysis (CRMA)
Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) PMT PMA

14 Copy-number estimation using Robust Multichip Analysis (CRMA)
Crosstalk calibration corrects for differences in distributions too CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) log2 PM

15 Copy-number estimation using Robust Multichip Analysis (CRMA)
Crosstalk calibration corrects for differences in distributions too CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) log2 PM

16 Copy-number estimation using Robust Multichip Analysis (CRMA)
AA CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) * PM = PMA + PMB AB * * PM = PMA + PMB * BB * * PM = PMA + PMB

17 Copy-number estimation using Robust Multichip Analysis (CRMA)
Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) The log-additive model: log2(PMijk) = log2ij + log2jk + ijk sample i, SNP j, probe k. Fit using robust linear models (rlm)

18 Copy-number estimation using Robust Multichip Analysis (CRMA)
Longer fragments ) less amplified by PCR ) weaker SNP signals  CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) 100K

19 Copy-number estimation using Robust Multichip Analysis (CRMA)
Longer fragments ) less amplified by PCR ) weaker SNP signals  CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) 500K

20 Copy-number estimation using Robust Multichip Analysis (CRMA)
Normalize to get same fragment-length effect for all hybridizations CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj)

21 Copy-number estimation using Robust Multichip Analysis (CRMA)
Normalize to get same fragment-length effect for all hybridizations CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj)

22 Copy-number estimation using Robust Multichip Analysis (CRMA)
Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj)


Download ppt "Copy-number estimation using Robust Multichip Analysis - Supplementary materials for the aroma.affymetrix lab session Henrik Bengtsson & Terry Speed Dept."

Similar presentations


Ads by Google