Published by Noel Basil Rich. Modified over 9 years ago.
1
Functional Genomics I - Microarrays
2
vtrevino@itesm.mx
Transcriptomics, Proteomics, Metabolomics, Genomics, SNP (Single Nucleotide Polymorphisms), CNV (Copy Number Variation, CGH), Epigenomics
3
Technology that provides measurements of thousands of molecules in the same experiment, at reasonable price and precision.
Generally the size of a typical microscope slide (75 x 25 mm, or 3" x 1", and about 1.0 mm thick).
4
Biological Question -> Experimental Design -> Microarray Experiment -> Pre-processing -> Differential Expression, Clustering, Prediction, … -> Biology: Verification and Interpretation
Pre-processing: Image Analysis, Background, Normalization, Summarization, Transformation
5
[Figure. Source: Google Images]
6
Gene Expression
Molecular Cell Biology (5th Ed), Lodish, Berk, Matsudaira, Kaiser, Krieger, Scott, Zipursky, Darnell
7
PCR and QPCR of mRNA, Gene X
[Figure: PCR gel with a 100 bp ladder (100 bp and 200 bp bands marked) and -/+ lanes for RWPE-1, DU-145 and PC-3; QPCR curves for 10^7, 10^6, 10^5, 10^4, 10^3, 10^2 and 10 copies. Image: http://www.bio168.com/mag/1B8B368B092A/20-3.jpg]
8
Microarray Bioinformatics, Dov Stekel, Cambridge, 2003
9
http://www.well.ox.ac.uk/genomics/facilitites/Microarray/Welcome.shtml
10
Microarray Bioinformatics, Dov Stekel, Cambridge, 2003
11
www.niaid.nih.gov/dir/services/rtb/microarray/overview.asp
http://metherall.genetics.utah.edu/Protocols/Microarray-Spotting.html
http://www.lbl.gov/Science-Articles/Archive/cardiac-hyper-genes.html
http://www.nrc-cnrc.gc.ca/multimedia/picture/life/nrc-bri_micro-array_e.html
http://learn.genetics.utah.edu/units/biotech/microarray/genechip.jpg
Microarray Bioinformatics, Dov Stekel, Cambridge, 2003
13
Images: Affymetrix (1 dye) and two-dye arrays
14
Affymetrix, Spotted Arrays, Inkjet Arrays
Microarray Bioinformatics, Dov Stekel, Cambridge, 2003
15
Dr. Hugo Barrera, Microarrays Course EMBO-INER 2005, Mexico City
16
PROCESS: mRNA Extraction (and amplification) -> Labelling -> Hybridization -> Scanning -> Image Analysis & Data Processing -> Statistical Analysis
PRODUCT: mRNA/cDNA -> Labeled mRNA -> Digital Image -> Microarray Data -> Selected Genes
TWO-DYES: REFERENCE (Healthy/Control) and TEST (Disease/Treatment)
Gene: A 1-1, B 1-0, C 3-3, D 0-3, E 3-0, F 0-1, G 1-1, H 2-0, I 2-2, J 0-0, K 3-0, L 2-1
Selected Genes: Gene D 0.001, Gene E 0.005, Gene K 0.001
ONE-DYE: TEST (Sample)
Gene: A 1, B 1, C 1, D 0, E 4, F 1, G 1, H 2, I 2, J 0, K 5, L 2
Selected Genes: Gene D 0.001, Gene E 0.005, Gene K 0.001, Gene J 0.003
17
Microarray Bioinformatics, Dov Stekel, Cambridge, 2003
18
5 µm Laser, 10 µm Laser
Dr. Hugo Barrera, Microarrays Course EMBO-INER 2005, Mexico City; Microarray Bioinformatics, Dov Stekel, Cambridge, 2003
19
Microarray - Pre-Processing Purpose
Pre-processing: Image Analysis, Background, Normalization, Summarization, Transformation
Input: Scanned Image File
Output: Data File (a unique "global relative" measure of expression for every gene, with minimal experimental error)
20
TECHNOLOGIES
DNA Probes: Oligos (~20, 40 nt); Target (cDNA, PCR products, etc.)
Copies per gene: Usually 1; Usually 3
Organization: Sectors (print-tip), x sectors (~3) by y sectors (~3), each sector with i x j spots (18 x 20); n x m probesets, n probesets (~100) by m probesets (~100)
Controls: Empty spots, landing lights, perfect match probes (pm), mismatch probes (mm)
21
TECHNOLOGIES: RAW DATA
Two dyes: 10,000 genes * 2 dyes * 3 copies/gene * ~40 pixels/gene = 2,400,000 values, reduced to only 10,000 values.
Affymetrix: 10,000 genes * 20 oligos * 2 (pm, mm) * ~36 pixels/gene = 14,400,000 values, reduced to only 10,000 values.
The reduction is done by Image Analysis and Pre-processing.
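The raw-data counts on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and the factors are the approximate figures quoted on the slide.

```python
# Raw values produced by scanning, before any reduction (figures from the slide).
GENES = 10_000

# Two-dye array: dyes x copies per gene x ~40 pixels.
two_dye_values = GENES * 2 * 3 * 40

# Affymetrix array: oligos per gene x (pm, mm) x ~36 pixels.
affymetrix_values = GENES * 20 * 2 * 36

print(two_dye_values)      # 2400000
print(affymetrix_values)   # 14400000

# Raw values that must be reduced to a single number per gene.
print(two_dye_values // GENES, affymetrix_values // GENES)  # 240 1440
```

In other words, pre-processing has to condense roughly 240 (two dyes) or 1,440 (Affymetrix) raw values into one expression value per gene.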
22
Addressing: estimate the location of spot centers.
Segmentation: classify pixels as foreground or background.
Extraction: for each spot on the array and each dye, obtain foreground intensities, background intensities and quality measures.
Addressing: done by the Affymetrix GeneChip software.
23
Addressing (by grid, GenePix)
24
Segmentation: circular feature or irregular feature shape. Finally, compute the average.
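A minimal sketch of circular-feature segmentation followed by averaging, assuming the spot center and radius are already known from the addressing step. The function name, the 5 x 5 pixel patch and the use of the median for the background are assumptions made for illustration; they are not taken from GenePix or the Affymetrix software.

```python
import statistics

def segment_circular(image, center_row, center_col, radius):
    """Split pixels into foreground (inside the circle) and background."""
    foreground, background = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
            (foreground if inside else background).append(value)
    return foreground, background

# Hypothetical 5x5 patch around one spot: bright center, dim surroundings.
patch = [
    [10, 10, 10, 10, 10],
    [10, 10, 90, 10, 10],
    [10, 90, 100, 90, 10],
    [10, 10, 90, 10, 10],
    [10, 10, 10, 10, 10],
]

fg, bg = segment_circular(patch, center_row=2, center_col=2, radius=1)
spot_intensity = statistics.mean(fg)      # average of foreground pixels
spot_background = statistics.median(bg)   # assumption: median of the rest
print(spot_intensity, spot_background)    # 92 10
```

An irregular-shape method would replace the circle test with a per-pixel classification, but the final averaging step stays the same.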
25
Background Reduction Extraction: Determining Background
26
Image Analysis
2-Color Results (GenePix): .gpr file, the "results" for one array; 10,000 genes ~ 30,000 values (.gal file: 1 file for a "list" of arrays).
Affymetrix Results: .cel file, the "results" for one array (raw, no background reduction); 10,000 genes ~ 400,000 values.
27
Segmentation (Spot detection) -> Background Estimation -> Value
Value = Spot Intensity - Spot Background
Gene      Sample 1   Sample 1
Gene 1    100        98
Gene 2    209        4209
Gene 3    -7         2
...
Gene k    9882       9711
...
Gene N    2298       28
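The background subtraction can be sketched as below. The intensity and background pairs are hypothetical, chosen only so that the results reproduce the first three values in the table; note that a spot dimmer than its local background yields a negative value.

```python
def background_corrected(spot_intensity, spot_background):
    """Value = Spot Intensity - Spot Background (the result can be negative)."""
    return spot_intensity - spot_background

# Hypothetical (intensity, background) pairs; not data from the slide.
spots = {"Gene 1": (150, 50), "Gene 2": (300, 91), "Gene 3": (40, 47)}

values = {gene: background_corrected(i, b) for gene, (i, b) in spots.items()}
print(values)  # {'Gene 1': 100, 'Gene 2': 209, 'Gene 3': -7}
```

Negative values matter later, because the log 2 transformation is undefined for them.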
28
Transformation: Log 2 of both channels (G = Sample 1, R = Sample 1) of the same gene table (Gene 1 to Gene N).
29
MA-Plot (log 2 scale): the two channels (G = Sample 1, R = Sample 1) of the same gene table. 1 value?
[Figure: scatter plot of R vs. G and the corresponding MA-Plot, M vs. A]
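A small sketch of the M and A values behind an MA-Plot, using the definitions M = log2(R/G) and A = log2(R*G)/2 given later in the deck. Which table column is G and which is R is an assumption here (first column G, second column R), and returning None for non-positive values is just one possible way to handle spots such as the -7 in the table.

```python
import math

def ma_values(r, g):
    """Return (M, A) for one spot, or None if a channel is not positive."""
    if r <= 0 or g <= 0:
        return None  # log2 is undefined for zero or negative values
    m = math.log2(r / g)       # log ratio
    a = math.log2(r * g) / 2   # average log intensity
    return m, a

# Values from the slide's table; the G/R column assignment is assumed.
green = {"Gene 1": 100, "Gene 2": 209, "Gene 3": -7, "Gene k": 9882, "Gene N": 2298}
red = {"Gene 1": 98, "Gene 2": 4209, "Gene 3": 2, "Gene k": 9711, "Gene N": 28}

for gene in green:
    print(gene, ma_values(red[gene], green[gene]))
```

M condenses the two channels into the single value per gene that the slide asks for, while A keeps track of overall intensity.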
30
Normalization, 2 dyes: "Within" (2-color technologies)
Assumption: the majority of genes show no change.
[Figure: MA-Plot, M vs. A]
31
Normalization, 2 dyes: "Within" (2-color technologies)
Assumption: the majority of genes show no change.
[Figure: MA-Plots Before and After normalization]
32
Normalization, 2 dyes: "Within", Spatial (2-color technologies)
[Figure: Before Normalization; After loess Global Normalization; After loess by Sector (print-tip) Normalization]
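The idea of within-array normalization, globally or by sector, can be sketched with median centering. This is a deliberately simpler stand-in for the loess fits named on the slides: it removes a constant bias in M but not an intensity-dependent one. The function name and the data are hypothetical.

```python
import statistics
from collections import defaultdict

def median_center(m_values, sectors=None):
    """Shift M values so the median is 0, overall or within each sector."""
    if sectors is None:
        sectors = [0] * len(m_values)  # treat the whole array as one group
    by_sector = defaultdict(list)
    for m, s in zip(m_values, sectors):
        by_sector[s].append(m)
    medians = {s: statistics.median(v) for s, v in by_sector.items()}
    return [m - medians[s] for m, s in zip(m_values, sectors)]

# Hypothetical log ratios: print-tip sector "a" is biased upward by about 1.
m = [1.0, 1.2, 0.8, 0.1, -0.1, 0.0]
sector = ["a", "a", "a", "b", "b", "b"]

global_norm = median_center(m)             # one shift for the whole array
by_sector_norm = median_center(m, sector)  # one shift per print-tip sector
print(global_norm)
print(by_sector_norm)
```

Both versions rely on the stated assumption that most genes do not change, so the typical M should be 0; the by-sector version additionally removes the spatial bias of sector "a".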
33
Transformation: Log 2
Gene      Sample 1
Gene 1    100
Gene 2    209
Gene 3    -7
...
Gene k    9882
...
Gene N    2298
34
Between-slides Normalization, 1 or 2 dyes
Methods: quantile; MAD (median absolute deviation) scale; qspline; invariantset; loess
[Figure: Before normalization and After normalization]
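Of the between-slides methods listed, quantile normalization is the easiest to sketch: every array is forced to share the same distribution, namely the mean of the sorted values across arrays. The function name and the data are hypothetical, and ties are broken by position rather than averaged, which is a simplification.

```python
def quantile_normalize(arrays):
    """Give every array the same distribution (mean of the sorted values)."""
    n_genes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_genes)
    ]
    normalized = []
    for a in arrays:
        # Gene indices ordered by value within this array.
        order = sorted(range(n_genes), key=lambda i: a[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = reference[rank]
        normalized.append(out)
    return normalized

# Hypothetical log2 intensities: two arrays, four genes each.
array_1 = [5.0, 2.0, 3.0, 4.0]
array_2 = [4.0, 1.0, 4.5, 2.0]

norm_1, norm_2 = quantile_normalize([array_1, array_2])
print(norm_1)  # [4.75, 1.5, 2.5, 4.0]
print(norm_2)  # [4.0, 1.5, 4.75, 2.5]
```

After normalization the two arrays contain exactly the same set of values, only assigned to different genes according to each gene's rank within its array.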
35
Summarization: Affymetrix (oligonucleotide-dependent technologies)
Summarization = "Average"(Intensities)
Usual methods: tukey-biweight, av-diff, median-polish
[Figure: PM and MM probes]
The "summarization" equivalent in two-dye technologies is the average of gene replicates within the slide.
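As an illustration of one of the listed methods, here is a minimal median-polish sketch for a single probeset. It assumes a probes x arrays matrix of log 2 PM intensities and reports overall + column effect as the expression value of each array; the function names and the data are hypothetical, and this is not the code of any particular package.

```python
import statistics

def median_polish(matrix, n_iter=10):
    """Tukey median polish of a probes x arrays matrix (rows x columns).

    Returns (overall, row_effects, column_effects, residuals).
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep row (probe) medians out of the residuals.
        for i in range(n_rows):
            med = statistics.median(resid[i])
            resid[i] = [v - med for v in resid[i]]
            row_eff[i] += med
        shift = statistics.median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep column (array) medians out of the residuals.
        for j in range(n_cols):
            med = statistics.median([resid[i][j] for i in range(n_rows)])
            for i in range(n_rows):
                resid[i][j] -= med
            col_eff[j] += med
        shift = statistics.median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
    return overall, row_eff, col_eff, resid

def summarize(matrix):
    """One expression value per array: overall level plus array effect."""
    overall, _, col_eff, _ = median_polish(matrix)
    return [overall + c for c in col_eff]

# Hypothetical log2 PM intensities: 3 probes (rows) x 2 arrays (columns).
log2_pm = [[8.0, 10.0], [9.0, 11.0], [7.0, 9.0]]
expression = summarize(log2_pm)
print(expression)  # [8.0, 10.0]
```

The probe effects (0, +1, -1 in this example) absorb the fact that some oligos hybridize better than others, so the per-array value is robust to individual probes.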
36
Some spots may be defective from the printing process.
Some spots could not be detected.
Some spots may be damaged during the assay.
Artefacts may be present (bubbles, etc.).
Use replicated spots as averages.
Remove unrecoverable genes.
Remove problematic spots in all arrays.
Infer values using computational methods (warning).
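The first two remedies, averaging replicated spots and removing unrecoverable genes, can be sketched as follows. Flagged spots are represented as None; the function name and the data are hypothetical.

```python
import statistics

def average_replicates(replicates):
    """Mean of the usable replicate spots; None if no spot is usable."""
    usable = [v for v in replicates if v is not None]
    return statistics.mean(usable) if usable else None

# Hypothetical log2 values for three replicate spots per gene (None = flagged).
spots = {
    "Gene 1": [7.0, 7.5, None],     # one damaged spot: average the other two
    "Gene 2": [None, None, None],   # unrecoverable: no usable spot
    "Gene 3": [5.0, 5.0, 5.0],
}

averaged = {gene: average_replicates(v) for gene, v in spots.items()}
kept = {gene: v for gene, v in averaged.items() if v is not None}
print(kept)  # {'Gene 1': 7.25, 'Gene 3': 5.0}
```

Inferring missing values computationally is the alternative for genes like Gene 2, which the slide flags with a warning because the imputed numbers are not measurements.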
37
More than 10,000 genes: too much data increases computation time and analysis complexity.
Remove: genes that do not change significantly, undefined genes, low expression.
Keep: large signal-to-noise ratio, large statistical significance, large variability, large expression.
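A minimal filtering sketch that keeps genes with both large expression and large variability. The thresholds, the function name and the data are hypothetical; in practice the cut-offs depend on the platform and the experiment.

```python
import statistics

def filter_genes(expression, min_mean=6.0, min_sd=0.5):
    """Keep genes whose mean and standard deviation pass both thresholds."""
    kept = {}
    for gene, values in expression.items():
        expressed = statistics.mean(values) >= min_mean
        variable = statistics.stdev(values) >= min_sd
        if expressed and variable:
            kept[gene] = values
    return kept

# Hypothetical log2 expression across four samples.
expression = {
    "Gene 1": [8.0, 8.1, 8.0, 8.1],    # expressed but nearly constant
    "Gene 2": [3.0, 5.0, 3.0, 5.0],    # variable but low expression
    "Gene 3": [7.0, 10.0, 7.0, 10.0],  # expressed and variable
}

selected = filter_genes(expression)
print(list(selected))  # ['Gene 3']
```

Filtering by statistical significance or signal-to-noise ratio follows the same pattern with a different per-gene score.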
38
Data Processing: Image Analysis, Background Subtraction, Transformation, Normalization, Summarization, Filtering
a) Microarray Image Scanning (Affymetrix, Two-dyes)
b) Image Analysis and Background Subtraction: Spot Detection, Background Detection & Subtraction, Intensity Value
c) Transformation: A = log2(R*G)/2, M = log2(R/G)
d) Normalization: Within, Between
40
Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007
41
Data matrix: genes in rows (Gene 1, Gene 2, Gene 3, ..., Gene N), samples in columns.
Class A Samples: Normal Tissue, Cancer A, Untreated, Reference, …
Class B Samples: Tumour Tissue, Cancer B, Treated, Strains, …
42
Differential Expression
Unsupervised Classification
Biomarker Detection
Identifying genes related to survival times
Regression Analysis
Gene Copy Number and Comparative Genomic Hybridization
Epigenetics and Methylation
Genetic Polymorphisms and SNPs
Chromatin Immuno-Precipitation On-Chip
Pathogen Detection
……
43
Differential Expression
Gene Selection: Positive vs. Negative
[Figure: Expression Level of Samples A and Samples B for a positive and a negative gene, with the group means µ marked; data matrix of Gene 1 to Gene N by Class A Samples and Class B Samples]
Statistics: p-value, FDR, q-Value
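Because thousands of genes are tested at once, the p-values are usually corrected for multiple testing. The sketch below implements the Benjamini-Hochberg procedure, one common way of controlling the FDR; the slide does not say which procedure was used, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
print([round(v, 4) for v in q])  # [0.02, 0.04, 0.04, 0.02]
```

Genes whose adjusted value falls below the chosen FDR level (for example 0.05) are selected as differentially expressed.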
44
Biomarker Detection (Biomarker Discovery)
Gene Selection: Positive vs. Negative
[Figure: Expression Level of Samples Class A and Samples Class B for a positive and a negative gene, with the group means µ marked; data matrix of Gene 1 to Gene N by Class A Samples and Class B Samples]
45
Co-Expressed Genes and Unsupervised Sample Classification
[Figure: (a) genes A, C, G, B, H, E, D, I, K, M, L across Samples, colored from Low to High Expression; (b) samples 1 to 9]
46
Genes Associated with Survival Times and Risk
Gene Selection: Positive vs. Negative
[Figure: Kaplan-Meier Plots for a positive and a negative gene; axes: Time and Hazard (0.0 to 1.0); + marks on the curves; data matrix of Gene 1 to Gene N by Class A Samples and Class B Samples]
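The curve in a Kaplan-Meier plot can be computed with the product-limit estimator, which gives the estimated probability of surviving beyond each event time. The sketch below is a minimal version with hypothetical follow-up data; the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up times; 0 marks a censored patient.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
print([(t, round(s, 2)) for t, s in curve])  # [(1, 0.8), (2, 0.6), (3, 0.3)]
```

To relate a gene to survival, the samples are typically split by the gene's expression (for example high vs. low) and one curve is drawn per group; a gene is "positive" when the curves separate.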
47
Regression: Gene Association to Outcome
Gene Selection: Positive (Slope ≠ 0) vs. Negative (Slope = 0)
[Figure: Dependent Variable vs. Gene Expression for a positive and a negative gene; data matrix of Gene 1 to Gene N by Class A Samples and Class B Samples]
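The slope criterion can be illustrated with an ordinary least-squares fit of the dependent variable on the expression of one gene. The data below are hypothetical; a real analysis would also test whether the slope differs significantly from 0.

```python
def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical gene expression and two possible outcomes.
gene_expression = [1.0, 2.0, 3.0, 4.0]
associated_outcome = [2.0, 4.0, 6.0, 8.0]   # changes with expression
unrelated_outcome = [5.0, 5.0, 5.0, 5.0]    # does not change

print(slope(gene_expression, associated_outcome))  # 2.0
print(slope(gene_expression, unrelated_outcome))   # 0.0
```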
49
SNP genotyping on arrays: Hybridisation, Labelling, Detection
[Figure: three panels (a, b, c) for SNP 1, SNP 2, SNP 3 with genotypes AA, CG, CC. Panel labels: Capture of products of 1 nt primer extension (in solution); Extension with transcribed RNA + reverse transcriptase and ddNTPs (one labelled); Extension (1 nt) with PCR products + DNA polymerase and labelled ddNTPs]
50
Chromatin Immuno-Precipitation (ChIP-on-Chip)
Steps: Fusion of Tag sequence into TF gene; Incubation of DNA with the tagged TF; Precipitation of Antibody-TF-DNA complex; Labelling of precipitated DNA; Microarray Hybridisation
[Figure labels: Transcription Factor, Tag, Antibody against tag peptide]
51
Spotted sequences: (1) ACGGCTAGTCACAAC..., (2) GCTAGTCACAACCCA..., (3) GCTAGTCCGGCACAG...
[Figure: Sample hybridized to the spotted sequences (1), (2), (3)]
52
Example experiment (Dr. Hugo Barrera)
mRNA Extraction: Placenta 1, Placenta 2, Reference Pool
Labelling: Green, Red
Microarray Hybridization (by duplicates)
Scanning & Data Processing: Image Analysis, Within Normalization (per array), Between Normalization (all arrays)
Detection of Differentially Expressed Genes: t-test, H0: µ = 0; p-values correction: False Discovery Rate
Validation and Analysis: Comparison With Known Tissue Specific Genes (controls)
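The test H0: µ = 0 on the log ratios of one gene can be sketched as a one-sample t statistic. The four log ratios below are hypothetical (for instance two placentas hybridized in duplicate); turning t into a p-value requires the t distribution with n - 1 degrees of freedom, which is not part of the Python standard library.

```python
import math
import statistics

def one_sample_t(log_ratios):
    """t statistic for H0: mean log ratio = 0."""
    n = len(log_ratios)
    standard_error = statistics.stdev(log_ratios) / math.sqrt(n)
    return statistics.mean(log_ratios) / standard_error

# Hypothetical log2(Placenta/Reference) ratios for one gene on four arrays.
placenta_vs_reference = [1.0, 2.0, 3.0, 2.0]

t = one_sample_t(placenta_vs_reference)
print(round(t, 3))  # 4.899
```

The statistic is computed for every gene, and the resulting p-values are then corrected with a False Discovery Rate procedure.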
53
[Figure: panels a, b, c, d; Placenta/Reference and Control/Control]
54
[Figure: (a) Microarray Experiment, Ratio (log 2) from -6 to 10; arrays: Placenta 1 Replicate 1, Placenta 1 Replicate 2, Placenta 2 Replicate 1, Placenta 2 Replicate 2. (b) T1dbase, T1 score from 0 to 1; tissues: Lung, Thalamus, Amygdala, Spinal Cord, Testis, Kidney, Liver, Pituitary, Thyroid, Cerebellum, Hypothalamus, Caudate Nucleus, Exocrine Pancreas, Lymph Node, Frontal Cortex, Stomach, Breast, Bone Marrow, Pancreatic Islets, Uterus, Ovary, Skin, Heart, Skeletal Muscle, Prostate, Thymus, Salivary Gland, Trachea, Placenta]
55
Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007
56
Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007
57
Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007