Functional Genomics I - Microarrays

Presentation transcript:

Functional Genomics I - Microarrays

• Transcriptomics
• Proteomics
• Metabolomics
• Genomics
• SNP (Single Nucleotide Polymorphisms)
• CNV (Copy Number Variation, CGH)
• Epigenomics

• Technology that provides measurements of thousands of molecules in the same experiment at reasonable price and precision
• Generally the size of a typical microscope slide (75 x 25 mm (3" x 1") and about 1.0 mm thick)

Biological Question → Experimental Design → Microarray Experiment → Pre-processing (Image Analysis, Background, Normalization, Summarization, Transformation) → Differential Expression, Clustering, Prediction, … → Biology: Verification and Interpretation

Google Images

Molecular Cell Biology [Lodish, Berk, Matsudaira, Kaiser, Krieger, Scott, Zipursky, Darnell] (5th Ed) Gene Expression

[Figure: PCR gel with bp ladder (100bp and 200bp marks) and lanes RWPE-1, DU-145, PC…; QPCR of mRNA, Gene X, with standards of 10^6, 10^5, 10^4, 10^3, 10^2 and 10 copies]

Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

Affymetrix Images – 1 dye two-dyes

Affymetrix Spotted Arrays Inkjet arrays Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

Dr. Hugo Barrera Microarrays Course EMBO-INER 2005, Mexico City

PROCESS: mRNA Extraction (and amplification) → Labelling → Hybridization → Scanning → Image Analysis & Data Processing → Statistical Analysis
PRODUCT: mRNA/cDNA → Labeled mRNA → Digital Image → Microarray Data → Selected Genes
TWO-DYES (REFERENCE: Healthy/Control; TEST: Disease/Treatment): Gene: A 1-1, B 1-0, C 3-3, D 0-3, E 3-0, F 0-1, G 1-1, H 2-0, I 2-2, J 0-0, K 3-0, L 2-1. Selected: Gene D, Gene E, Gene K
ONE-DYE (TEST Sample): Gene: A 1, B 1, C 1, D 0, E 4, F 1, G 1, H 2, I 2, J 0, K 5, L 2. Selected: Gene D, Gene E, Gene K, Gene J

Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

Dr. Hugo Barrera, Microarrays Course EMBO-INER 2005, Mexico City; Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003. [Figure labels: … µm Laser, 10 µm Laser]

Microarray - Pre-Processing
Purpose: Pre-processing (Image Analysis, Background, Normalization, Summarization, Transformation)
Input: Scanned Image File
Output: Data File (unique "global relative" measure of expression for every gene with minimal experimental error)

TECHNOLOGIES. DNA Probes: Oligos ~20 40nt; Target (cDNA, PCR products, etc.). Copies per gene: Usually 1; Usually 3. Organization: Sectors (print-tip); n x m probesets. Probeset: m probesets (~100), n probesets (~100). Sectors: y sectors (~=3), x sectors (~=3), each sector i x j spots (18x20). Controls: Empty spots; landing lights, perfect match probes (pm), mismatch probes (mm).

TECHNOLOGIES: RAW DATA → Image Analysis → Pre-processing
Two dyes: 10,000 genes * 2 dyes * 3 copies/gene * ~40 pixels/gene = 2,400,000 values → only 10,000 values
Affymetrix: 10,000 genes * 20 oligos * 2 (pm, mm) * ~36 pixels/gene = 14,400,000 values → only 10,000 values
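The raw-data sizes on this slide are plain products of the listed factors. A minimal sketch of that arithmetic follows; the function name `raw_value_count` is hypothetical and the factors are the approximate figures quoted on the slide.

```python
def raw_value_count(genes, *factors):
    """Multiply the number of genes by each per-gene factor."""
    total = genes
    for factor in factors:
        total *= factor
    return total

# Two-dye array: 2 dyes, 3 copies per gene, ~40 pixels per spot.
two_dye_values = raw_value_count(10_000, 2, 3, 40)

# Affymetrix array: 20 oligos, PM and MM probes, ~36 pixels per probe.
affymetrix_values = raw_value_count(10_000, 20, 2, 36)

print(two_dye_values, affymetrix_values)
```

Image analysis and pre-processing reduce either total to a single value per gene.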

Addressing: Estimate location of spot centers. Segmentation: Classify pixels as foreground or background. Extraction: For each spot on the array and each dye, obtain foreground intensities, background intensities, and quality measures. Addressing: done by the Affymetrix GeneChip software.

Addressing: Estimate location of spot centers. Segmentation: Classify pixels as foreground or background. Extraction: For each spot on the array and each dye, obtain foreground intensities, background intensities, and quality measures. Addressing (by grid, GenePix).

Addressing: Estimate location of spot centers. Segmentation: Classify pixels as foreground or background. Extraction: For each spot on the array and each dye, obtain foreground intensities, background intensities, and quality measures. Segmentation: circular feature or irregular feature shape. Finally, compute the average.
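As an illustration of circular-feature segmentation, the sketch below classifies pixels inside a circle around the spot center as foreground and all others as background, then averages each group. The function name, the tiny 5x5 image, and the fixed radius are assumptions made for the example; real software estimates the center and radius during addressing.

```python
from statistics import mean

def segment_circular_spot(image, center_row, center_col, radius):
    """Split pixels into foreground (inside the circle) and background,
    then return the average intensity of each group."""
    foreground, background = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
            (foreground if inside else background).append(value)
    return mean(foreground), mean(background)

# Hypothetical 5x5 scan: a bright spot of radius 1 on a dim background.
image = [
    [10, 10, 10, 10, 10],
    [10, 10, 100, 10, 10],
    [10, 100, 100, 100, 10],
    [10, 10, 100, 10, 10],
    [10, 10, 10, 10, 10],
]
foreground, background = segment_circular_spot(image, 2, 2, 1)
print(foreground, background)
```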

Background Reduction Extraction: Determining Background

Image Analysis
2-Color Results (GenePix): .gpr file, "results" for one array; 10,000 genes ~ 30,000 values (.gal files: 1 file for a "list" of array)
Affymetrix Results: .cel file, "results" for one array (raw, no background reduced); 10,000 genes ~ 400,000 values

Segmentation (Spot detection) → Background Estimation → Value
Value = Spot Intensity – Spot Background
[Table: genes (Gene 1, Gene 2, Gene 3, …, Gene k, …, Gene N) by samples]
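The formula Value = Spot Intensity – Spot Background can be sketched per spot as below. The `floor` argument is an assumption added here so that later log transforms stay defined when the background exceeds the signal; it is not part of the slide.

```python
def background_corrected(spot_intensity, spot_background, floor=1.0):
    """Subtract background from each spot, clipping at a small positive floor."""
    return [max(s - b, floor) for s, b in zip(spot_intensity, spot_background)]

# Hypothetical intensities for three spots.
spot_intensity = [1200.0, 340.0, 95.0]
spot_background = [100.0, 120.0, 110.0]
values = background_corrected(spot_intensity, spot_background)
print(values)
```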

[Table: genes (Gene 1, Gene 2, Gene 3, …, Gene k, …, Gene N) by samples, with G and R columns per sample (G=Sample 1, R=Sample 1), transformed with Log 2]

[Table: genes (Gene 1, Gene 2, Gene 3, …, Gene k, …, Gene N) by samples (log 2 scale), with R and G columns per sample (G=Sample 1, R=Sample 1). 1 value? Figure: MA-Plot with axes A and M]
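To reduce the R and G columns to the two MA-plot coordinates, the definitions given later in this presentation, A = log2(R*G)/2 and M = log2(R/G), can be applied spot by spot. This is a minimal sketch with invented intensities; `ma_values` is a hypothetical helper name.

```python
import math

def ma_values(red, green):
    """Return (M, A) lists: M = log2(R/G), A = log2(R*G)/2."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a

# Hypothetical background-corrected intensities for three spots.
red = [400.0, 100.0, 50.0]
green = [100.0, 100.0, 200.0]
m, a = ma_values(red, green)
print(m)  # log ratios: positive means more red than green
print(a)  # average log intensities
```

M is the single per-gene value usually carried forward for two-dye data, while A is kept for intensity-dependent normalization.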

Normalization – 2 dyes, "within" array (2 color technologies). Assumption: the majority of genes do not change. [Figure: MA-plot with axes A and M]

Normalization – 2 dyes, "within" array (2 color technologies). Assumption: the majority of genes do not change. [Figure: MA-plot before and after normalization]

Normalization – 2 dyes, "within" array, spatial (2 color technologies). [Figure panels: before normalization; after global loess normalization; after loess normalization by sector (print-tip)]
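The slides normalize with loess. As a simpler stand-in that rests on the same assumption (most genes do not change, so M should be centered on zero), the sketch below subtracts the median M either globally or separately for each print-tip sector. It is not loess and does not correct intensity-dependent curvature; names and data are hypothetical.

```python
from collections import defaultdict
from statistics import median

def median_center(m_values, sectors=None):
    """Subtract the median M from every spot, globally or per sector."""
    if sectors is None:
        sectors = [0] * len(m_values)  # one group: global normalization
    by_sector = defaultdict(list)
    for m, s in zip(m_values, sectors):
        by_sector[s].append(m)
    centers = {s: median(vals) for s, vals in by_sector.items()}
    return [m - centers[s] for m, s in zip(m_values, sectors)]

# Hypothetical log ratios: sector 1 is biased upward, sector 2 downward.
m_values = [0.5, 0.7, 0.6, -0.2, 0.0, -0.1]
sectors = [1, 1, 1, 2, 2, 2]
global_norm = median_center(m_values)
sector_norm = median_center(m_values, sectors)
print(global_norm)
print(sector_norm)
```

Global centering leaves the difference between the two sectors in place, whereas per-sector centering removes it, which is the motivation for normalizing by print-tip.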

Gene 1 Gene 2 Gene 3. Gene k. Gene N Sample Log 2

Between-slides Normalization – 1 or 2 dyes. Methods: quantile, MAD (median absolute deviation) scale, qspline, invariantset, loess. [Figure: before normalization and after normalization]
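Of the between-slides methods listed, quantile normalization is the easiest to sketch: every array is forced to share the same distribution, taken here as the mean of the sorted arrays. This minimal version assumes arrays of equal length with no missing values and breaks ties by position, which dedicated packages handle more carefully.

```python
def quantile_normalize(arrays):
    """Replace each value by the mean of the values holding the same rank
    across all arrays, so every array ends up with the same distribution."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices, lowest value first
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Two hypothetical arrays measuring the same three genes.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
print(normalized)
```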

Summarization – Affymetrix (oligonucleotide-dependent technologies). Summarization = "Average"(Intensities) over the PM and MM probes of a probeset. Usual methods: tukey-biweight, av-diff, median-polish. The "summarization" equivalent in two-dye technologies is the average of gene replicates within the slide.
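Median polish, one of the usual methods named on this slide, fits an additive probe effect and array effect by alternately sweeping out row and column medians. The sketch below assumes a small matrix of log-scale PM intensities for one probeset (rows are probes, columns are arrays) and returns one summarized value per array; the function name and data are hypothetical.

```python
from statistics import median

def median_polish_summary(matrix, iterations=10):
    """matrix[probe][array] on the log scale; returns one value per array."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    resid = [row[:] for row in matrix]
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    overall = 0.0
    for _ in range(iterations):
        # Sweep out probe (row) medians.
        for i in range(n_rows):
            d = median(resid[i])
            row_eff[i] += d
            resid[i] = [v - d for v in resid[i]]
        d = median(col_eff)
        overall += d
        col_eff = [c - d for c in col_eff]
        # Sweep out array (column) medians.
        for j in range(n_cols):
            d = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += d
            for i in range(n_rows):
                resid[i][j] -= d
        d = median(row_eff)
        overall += d
        row_eff = [r - d for r in row_eff]
    return [overall + c for c in col_eff]

# Three probes with effects 0, +1, -1 measured on two arrays (levels 8 and 10).
matrix = [[8.0, 10.0], [9.0, 11.0], [7.0, 9.0]]
summary = median_polish_summary(matrix)
print(summary)
```

Because medians are used instead of means, a single aberrant probe has little influence on the summarized value.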

Problems:
• Some spots may be defective in the printing process
• Some spots could not be detected
• Some spots may be damaged during the assay
• Artefacts may be present (bubbles, etc.)
Options:
• Use replicated spots as averages
• Remove unrecoverable genes
• Remove problematic spots in all arrays
• Infer values using computational methods (warning)
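Two of the options above, averaging replicated spots and removing unrecoverable genes, can be sketched together. Missing spots are represented as `None`, which is an assumption of this example rather than a file-format convention.

```python
from statistics import mean

def average_replicates(spots):
    """Average the usable replicate spots of each gene; None if none survive."""
    summary = {}
    for gene, values in spots.items():
        usable = [v for v in values if v is not None]
        summary[gene] = mean(usable) if usable else None
    return summary

# Hypothetical log ratios from three replicate spots per gene.
spots = {
    "A": [1.0, 1.2, None],     # one damaged spot, still recoverable
    "B": [None, None, None],   # unrecoverable gene
    "C": [3.0, 3.0, 3.0],
}
summary = average_replicates(spots)
recoverable = {g: v for g, v in summary.items() if v is not None}
print(recoverable)
```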

• More than 10,000 genes
• Too much data increases computation time and analysis complexity
• Remove: genes that do not change significantly; undefined genes; low expression
• Keep: large signal-to-noise ratio; large statistical significance; large variability; large expression
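A simple filter in the spirit of this slide drops low-expression genes and then keeps the most variable ones. The thresholds (`min_mean`, `top_n`) and the data below are arbitrary choices for illustration only.

```python
from statistics import mean, variance

def filter_genes(expression, min_mean, top_n):
    """Drop genes with low mean expression, then keep the top_n by variance."""
    expressed = {g: v for g, v in expression.items() if mean(v) >= min_mean}
    ranked = sorted(expressed, key=lambda g: variance(expressed[g]), reverse=True)
    return ranked[:top_n]

# Hypothetical log2 expression values across four samples.
expression = {
    "gene1": [8.0, 8.1, 7.9, 8.0],  # expressed but flat
    "gene2": [6.0, 9.0, 5.5, 9.5],  # expressed and variable
    "gene3": [1.0, 4.0, 0.5, 3.5],  # variable but low expression
    "gene4": [7.0, 8.0, 7.2, 8.1],  # expressed, moderately variable
}
kept = filter_genes(expression, min_mean=5.0, top_n=2)
print(kept)
```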

Image Analysis Background Subtraction Normalization Summarization Transformation Data Processing Background Detection & Subtraction a) Filtering Microarray Image Scanning Spot Detection Intensity Value Affymetrix Two-dyes b) Image Analysis and Background Subtraction c) Transformation Between Within d) A = log2(R*G)/2, M = log2(R/G) Normalization

Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007

Gene 1 Gene 2 Gene 3. Gene N Class A Samples Class B Samples Normal Tissue, Cancer A, Untreated, Reference, … Tumour Tissue, Cancer B, Treated, Strains, … ….

• Differential Expression
• Unsupervised Classification
• Biomarker detection
• Identifying genes related to survival times
• Regression Analysis
• Gene Copy Number and Comparative Genomic Hybridization
• Epigenetics and Methylation
• Genetic Polymorphisms and SNPs
• Chromatin Immuno-Precipitation On-Chip
• Pathogen Detection
• …

Differential Expression. Gene Selection: Positive vs. Negative, comparing Expression Level in Samples A and Samples B (panels labelled µ=d). p-value → FDR → q-Value. [Table: genes (Gene 1, Gene 2, Gene 3, …, Gene N) by Class A Samples (Normal Tissue, Cancer A, Untreated, Reference, …) and Class B Samples (Tumour Tissue, Cancer B, Treated, Strains, …)]
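The step from per-gene p-values to FDR-controlled values can be sketched with the Benjamini-Hochberg procedure, which is one common choice; the slide does not say which FDR method was used. The adjusted values are often reported loosely as q-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical p-values for four genes.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
selected = [i for i, q in enumerate(q_values) if q <= 0.05]
print(q_values, selected)
```

With a 5% FDR threshold only the first gene is selected here, although three genes have raw p-values below 0.05.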

Biomarker Detection / Biomarker Discovery. Gene Selection: Positive vs. Negative, comparing Expression Level in Samples Class A and Samples Class B (panels labelled µ=d). [Table: genes (Gene 1, Gene 2, Gene 3, …, Gene N) by Class A Samples (Normal Tissue, Cancer A, Untreated, Reference, …) and Class B Samples (Tumour Tissue, Cancer B, Treated, Strains, …)]

A C G B H E D I K M L Samples Co-Expressed Genes Unsupervised Sample Classification a B Low High Expression b

Genes Associated to Survival Times and Risk. Gene Selection: Positive vs. Negative. [Figures: Kaplan-Meier plots with axes Time and Hazard] [Table: genes (Gene 1, Gene 2, Gene 3, …, Gene N) by Class A Samples (Normal Tissue, Cancer A, Untreated, Reference, …) and Class B Samples (Tumour Tissue, Cancer B, Treated, Strains, …)]
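A Kaplan-Meier curve, as plotted on this slide, multiplies the fraction of at-risk samples that survive each event time. The sketch below is a minimal estimator under the assumption that `events[i]` is `True` for an observed event and `False` for a censored sample; names and data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up times (months) for one group of samples.
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
print(curve)
```

For gene selection, samples would typically be split by high versus low expression of a gene and the two resulting curves compared.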

Regression: Gene Association to outcome. Gene Selection: Positive (Slope ≠ 0) vs. Negative (Slope = 0). [Figures: Dependent Variable plotted against Gene Expression] [Table: genes (Gene 1, Gene 2, Gene 3, …, Gene N) by Class A Samples (Normal Tissue, Cancer A, Untreated, Reference, …) and Class B Samples (Tumour Tissue, Cancer B, Treated, Strains, …)]
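Testing whether the slope differs from zero can be sketched with ordinary least squares: fit the line, then divide the slope by its standard error to get a t statistic. This minimal version assumes a single gene, at least three samples, and residuals that are not all zero; converting t into a p-value needs a t distribution with n - 2 degrees of freedom, which is left out here.

```python
import math

def slope_t_statistic(expression, outcome):
    """Return (slope, t statistic) for outcome regressed on expression."""
    n = len(expression)
    mx = sum(expression) / n
    my = sum(outcome) / n
    sxx = sum((x - mx) ** 2 for x in expression)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expression, outcome))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(expression, outcome))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return slope, slope / se

# Hypothetical gene expression and a continuous outcome for five samples.
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
outcome = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, t_stat = slope_t_statistic(expression, outcome)
print(slope, t_stat)
```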

[Figure: SNP genotyping on arrays for SNP 1, SNP 2 and SNP 3, panels a, b and c, with stages Labelling, Hybridisation and Detection. Labels: products of 1nt primer extension (in solution) followed by capture; extension with ddNTPs (one labelled) on transcribed RNA + reverse transcriptase; extension (1nt) with labelled ddNTPs on PCR products + DNA polymerase]

Chromatin Immuno-Precipitation (ChIP-on-Chip): Precipitation of Antibody-TF-DNA complex; Fusion of Tag sequence into TF gene; Labelling of precipitated DNA; Microarray Hybridisation; Incubation DNA-Tagged TF. [Figure labels: Transcription Factor, Tag, Antibody against tag peptide]

(1) ACGGCTAGTCACAAC... (2) GCTAGTCACAACCCA... (3) GCTAGTCCGGCACAG Sample SpottedHybridized (1)(2)(3)

Placenta 1, Placenta 2 and Reference Pool → mRNA Extraction → Labelling (Green, Red) → Microarray Hybridization (by duplicates) → Scanning & Data Processing (Image Analysis; Within Normalization (per array); Between Normalization (all arrays)) → Detection of Differentially Expressed Genes (t-test, H0: µ = 0; p-values correction: False Discovery Rate) → Validation and Analysis (Comparison With Known Tissue Specific Genes; (controls)) (Dr. Hugo Barrera)
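The t-test with H0: µ = 0 used in this workflow asks whether a gene's mean log ratio against the reference differs from zero. A minimal sketch of the one-sample t statistic follows; the log ratios are invented, and obtaining the p-value (then the False Discovery Rate correction) would require a t distribution with n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev

def one_sample_t(log_ratios):
    """t statistic for H0: mean log ratio equals zero."""
    n = len(log_ratios)
    return mean(log_ratios) / (stdev(log_ratios) / math.sqrt(n))

# Hypothetical log2(Placenta/Reference) ratios of one gene on four arrays.
log_ratios = [1.0, 1.2, 0.8, 1.0]
t_stat = one_sample_t(log_ratios)
print(t_stat)
```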

[Figure panels a, b, c, d: Placenta/Reference and Control/Control]

[Figure: (a) Microarray Experiment, Ratio (log 2), Placenta. Arrays: Placenta 1 Replicate 1, Placenta 1 Replicate 2, Placenta 2 Replicate 1, Placenta 2 Replicate 2. (b) T1dbase, T1 score (0 to 1), by tissue: Lung, Thalamus, Amygdala, Spinal Cord, Testis, Kidney, Liver, Pituitary, Thyroid, Cerebellum, Hypothalamus, Caudate Nucleus, Exocrine Pancreas, Lymph Node, Frontal Cortex, Stomach, Breast, Bone Marrow, Pancreatic Islets, Uterus, Ovary, Skin, Heart, Skeletal Muscle, Prostate, Thymus, Salivary Gland, Trachea, Placenta]

Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007
