Copy-number estimation using Robust Multichip Analysis - Supplementary materials for the aroma.affymetrix lab session Henrik Bengtsson & Terry Speed Dept.

Slides:



Advertisements
Similar presentations
Lecture 2 Strachan and Read Chapter 13
Advertisements

Low-Level Copy Number Analysis CRMA v2 preprocessing Henrik Bengtsson Post doc, Department of Statistics, University of California, Berkeley, USA CEIT.
Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.
Modeling sequence dependence of microarray probe signals Li Zhang Department of Biostatistics and Applied Mathematics MD Anderson Cancer Center.
AFFYMETRIX SNP chips Karin Dahlman-Wright Department of Biosciences and BEA Karolinska Institutet.
Introduction to DNA Microarray Technology Steen Knudsen, April 2005.
Getting the numbers comparable
DNA microarray and array data analysis
Probe Level Analysis of AffymetrixTM Data
DNA Microarray Bioinformatics - #27612 Normalization and Statistical Analysis.
SNP chips Advanced Microarray Analysis Mark Reimers, Dept Biostatistics, VCU, Fall 2008.
Central Dogma 2 Transcription mRNA Information stored In Gene (DNA) Translation Protein Transcription Reverse Transcription SELF-REPAIRING ARABIDOPSIS,
Data analytical issues with high-density oligonucleotide arrays A model for gene expression analysis and data quality assessment.
Microarray Data Analysis Data quality assessment and normalization for affymetrix chips.
Microarray Data Analysis Data quality assessment and normalization for affymetrix chips.
Genome-wide Copy Number Analysis Qunyuan Zhang,Ph.D. Division of Statistical Genomics Department of Genetics & Center for Genome Sciences Washington University.
Gene expression array and SNP array
Microarray Preprocessing
Copy-number estimation on the latest generation of high-density oligonucleotide microarrays Henrik Bengtsson (work with Terry Speed) Dept of Statistics,
Introduction to DNA Microarray Technology Steen Knudsen Uma Chandran.
Data Type 1: Microarrays
Probe-Level Data Normalisation: RMA and GC-RMA Sam Robson Images courtesy of Neil Ward, European Application Engineer, Agilent Technologies.
DNA Copy Number Analysis Qunyuan Zhang,Ph.D. Division of Statistical Genomics Department of Genetics & Center for Genome Sciences Washington University.
A Single-Array Preprocessing Method for Estimating Full-Resolution Raw Copy Numbers from all Affymetrix Genotyping Arrays Henrik Bengtsson (MSc CS, PhD.
1 Estimating chromosomal copy number InCoB2007, Hong Kong, 30 August, 2007.
CZ5225: Modeling and Simulation in Biology Lecture 10: Copy Number Variations Prof. Chen Yu Zong Tel:
Estimating chromosomal copy numbers from Affymetrix SNP & CN chips Henrik Bengtsson & Terry Speed Dept of Statistics, UC Berkeley September 13, 2007 "Statistics.
Introduction to DNA microarray technologies Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor short course Summer 2002.
Lo w -Level Analysis of Affymetrix Data Mark Reimers National Cancer Institute Bethesda Maryland.
Low-Level Copy Number Analysis Part 1 - Background Henrik Bengtsson Post doc, Department of Statistics, University of California, Berkeley, USA CEIT Workshop.
Intro to Microarray Analysis Courtesy of Professor Dan Nettleton Iowa State University (with some edits)
Summarization of Oligonucleotide Expression Arrays BIOS Winter 2010.
GeneChip® Probe Arrays
1 Global expression analysis Monday 10/1: Intro* 1 page Project Overview Due Intro to R lab Wednesday 10/3: Stats & FDR - * read the paper! Monday 10/8:
Using a Single Nucleotide Polymorphism to Predict Bitter Tasting Ability Lab Overview.
Analysis of protein-DNA interactions with tiling microarrays
Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein might be more direct, but is currently.
Using a Single Nucleotide Polymorphism to Predict Bitter Tasting Ability Lab Overview.
Statistical Analyses of High Density Oligonucleotide Arrays Rafael A. Irizarry Department of Biostatistics, JHU (joint work with Bridget Hobbs and Terry.
Oigonucleotide (Affyx) Array Basics Joseph Nevins Holly Dressman Mike West Duke University.
Transcriptome What is it - genome wide transcript abundance How do you obtain it - Arrays + MPSS What do you do with it when you have it - ?
Gene expression  Introduction to gene expression arrays Microarray Data pre-processing  Introduction to RNA-seq Deep sequencing applications RNA-seq.
Copy Number Analysis in the Cancer Genome Using SNP Arrays Qunyuan Zhang, Aldi Kraja Division of Statistical Genomics Department of Genetics & Center for.
Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data Rafael A. Irizarry Department of Biostatistics, JHU (joint.
Introduction to Oligonucleotide Microarray Technology
DNA Fingerprinting Maryam Ahmed Khan February 14, 2001.
동물 분자 유전체 연구의 최신 동향 National Institute of Animal Science Animal Genomics & Bioinformatics 정호영
Arrays How do they work ? What are they ?. WT Dwarf Transgenic Other species Arrays are inverted Northerns: Extract target RNA YFG Label probe + hybridise.
Simple-Sequence Length Polymorphisms
Using ArrayStar with a public dataset
Hybridization.
Expression and Methylation: QC and Pre-Processing
Bio 211 d16 DNA!.
Microarray Technology and Applications
Example of a common SNP in dogs
Single Nucleotide Polymorphisms
Microarrays 1/31/2018.
Forensic Biology by Richard Li
Jianbin Wang, H. Christina Fan, Barry Behr, Stephen R. Quake  Cell 
Genomic alterations in breast cancer cell line MDA-MB-231.
“TaqMan genotyping Assay’’
Getting the numbers comparable
Eric Samorodnitsky, Jharna Datta, Benjamin M
How can we use science and technology to solve world food crisis. M
Catarina D. Campbell, Nick Sampas, Anya Tsalenko, Peter H
Density Density ß values ß values
9-3 DNA Typing with Tandem Repeats
Data Type 1: Microarrays
Lecture 3 From Images to Data
Pre-processing AFFY data
Presentation transcript:

Copy-number estimation using Robust Multichip Analysis - Supplementary materials for the aroma.affymetrix lab session Henrik Bengtsson & Terry Speed Dept of Statistics, UC Berkeley August 7, 2007 BioC 2007

Affymetrix chips

Generic Affymetrix chip > 1 million identical 25bp sequences * * 1.28 cm 1.28 cm 6.5 million probes/ chip Feature size: 100µm to 18µm to 11µm and now 5µm. Soon: 1µm, 0.8µm, with a huge increase in number of probes.

Abbreviated generic assay description Start with target gDNA (genomic DNA) or mRNA. Obtain labeled single-stranded target DNA fragments for hybridization to the probes on the chip. After hybridization, washing, staining and scanning we get a digital image. This is summarized across pixels to probe-level intensities before we begin. They are our raw data.

Affymetrix probe terminology Target DNA: ...CGTAGCCATCGGTAAGTACTCAATGATAG... ||||||||||||||||||||||||| Perfect match (PM): ATCGGTAGCCATTCATGAGTTACTA Mis-match (MM): ATCGGTAGCCATACATGAGTTACTA 25 nucleotides * other PMs Other DNA Other seq. * PM Target seq. X * * * MM

Affymetrix SNP chips (Mapping 10K, 100K, 500K)

Single Nucleotide Polymorphism (SNP) Definition: A sequence variation such that two chromosomes may differ by a single nucleotide (A, T, C, or G). Allele A: A ...CGTAGCCATCGGTA/GTACTCAATGATAG... Allele B: G A person is either AA, AB, or BB at this SNP.

Probes for SNPs AA AB BB * * * * PMA: ATCGGTAGCCATTCATGAGTTACTA Allele A: ...CGTAGCCATCGGTAAGTACTCAATGATAG... Allele B: ...CGTAGCCATCGGTAGGTACTCAATGATAG... PMB: ATCGGTAGCCATCCATGAGTTACTA (Also MMs, but not in the newer chips, so we will not use these!) * * PMA >> PMB AA * PMA ¼ PMB AB * PMA << PMB BB

Copy-number analysis with SNP arrays

Copy-number estimation using Robust Multichip Analysis (CRMA) Preprocessing (probe signals) allelic crosstalk (or quantile) Total CN PM = PMA + PMB Summarization (SNP signals  ) log-additive PM only Post-processing fragment-length (GC-content) Raw total CNs R = Reference Mij = log2(ij /Rj) chip i, probe j

Copy-number estimation using Robust Multichip Analysis (CRMA) Cross-hybridization: Allele A: TCGGTAAGTACTC Allele B: TCGGTATGTACTC CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) AA * PMA >> PMB * PMA ¼ PMB AB * PMA << PMB BB

Copy-number estimation using Robust Multichip Analysis (CRMA) AA TT AT CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) PMT PMA offset +

Copy-number estimation using Robust Multichip Analysis (CRMA) Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) PMT PMA

Copy-number estimation using Robust Multichip Analysis (CRMA) Crosstalk calibration corrects for differences in distributions too CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) log2 PM

Copy-number estimation using Robust Multichip Analysis (CRMA) Crosstalk calibration corrects for differences in distributions too CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) log2 PM

Copy-number estimation using Robust Multichip Analysis (CRMA) AA CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) * PM = PMA + PMB AB * * PM = PMA + PMB * BB * * PM = PMA + PMB

Copy-number estimation using Robust Multichip Analysis (CRMA) Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) The log-additive model: log2(PMijk) = log2ij + log2jk + ijk sample i, SNP j, probe k. Fit using robust linear models (rlm)

Copy-number estimation using Robust Multichip Analysis (CRMA) Longer fragments ) less amplified by PCR ) weaker SNP signals  CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) 100K

Copy-number estimation using Robust Multichip Analysis (CRMA) Longer fragments ) less amplified by PCR ) weaker SNP signals  CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj) 500K

Copy-number estimation using Robust Multichip Analysis (CRMA) Normalize to get same fragment-length effect for all hybridizations CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj)

Copy-number estimation using Robust Multichip Analysis (CRMA) Normalize to get same fragment-length effect for all hybridizations CRMA Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj)

Copy-number estimation using Robust Multichip Analysis (CRMA) Preprocessing (probe signals) allelic crosstalk (quantile) Total CNs PM=PMA+PMB Summarization (SNP signals ) log-additive (PM-only) Post-processing fragment-length (GC-content) Raw total CNs Mij = log2(ij/Rj)