Functional Genomics I - Microarrays


1 Functional Genomics I - Microarrays

2 vtrevino@itesm.mx
- Transcriptomics
- Proteomics
- Metabolomics
- Genomics
- SNP (Single Nucleotide Polymorphisms)
- CNV (Copy Number Variation, CGH)
- Epigenomics

3
- Technology that provides measurements of thousands of molecules in the same experiment, at reasonable price and precision
- Generally the size of a typical microscope slide: 75 x 25 mm (3" x 1") and about 1.0 mm thick

4 Workflow: Biological Question -> Experimental Design -> Microarray Experiment -> Pre-processing (Image Analysis, Background, Normalization, Summarization, Transformation) -> Differential Expression, Clustering, Prediction, … -> Biology: Verification and Interpretation

5 Google Images

6 Gene Expression. Source: Molecular Cell Biology [Lodish, Berk, Matsudaira, Kayser, Kreiger, Scott, Zipursky, Danell] (5th Ed)

7 [Figure: PCR and QPCR. PCR gel for the mRNA of Gene X in RWPE-1, DU-145 and PC-3 cells (- and + lanes, 100 bp ladder with 100 bp and 200 bp marks); QPCR curves for 10^7, 10^6, 10^5, 10^4, 10^3, 10^2 and 10 copies. Image source: http://www.bio168.com/mag/1B8B368B092A/20-3.jpg]

8 Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

9 http://www.well.ox.ac.uk/genomics/facilitites/Microarray/Welcome.shtml

10 Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

11 Image sources:
- www.niaid.nih.gov/dir/services/rtb/microarray/overview.asp
- http://metherall.genetics.utah.edu/Protocols/Microarray-Spotting.html
- http://www.lbl.gov/Science-Articles/Archive/cardiac-hyper-genes.html
- http://www.nrc-cnrc.gc.ca/multimedia/picture/life/nrc-bri_micro-array_e.html
- http://learn.genetics.utah.edu/units/biotech/microarray/genechip.jpg
- Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

12

13 Images: Affymetrix (1 dye); two-dyes

14 Affymetrix; Spotted Arrays; Inkjet arrays. Source: Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

15 Dr. Hugo Barrera, Microarrays Course EMBO-INER 2005, Mexico City

16 PROCESS and PRODUCT
Samples: Healthy/Control (REFERENCE) and Disease/Treatment (TEST)
PROCESS: mRNA Extraction (and amplification) -> Labelling -> Hybridization -> Scanning -> Image Analysis & Data Processing -> Statistical Analysis
PRODUCT: mRNA/cDNA -> Labeled mRNA -> Digital Image -> Microarray Data -> Selected Genes
TWO-DYES (values shown under REFERENCE and TEST): Gene A 1-1, B 1-0, C 3-3, D 0-3, E 3-0, F 0-1, G 1-1, H 2-0, I 2-2, J 0-0, K 3-0, L 2-1. Selected genes: Gene D 0.001, Gene E 0.005, Gene K 0.001
ONE-DYE (values shown under TEST, one sample): Gene A 1, B 1, C 1, D 0, E 4, F 1, G 1, H 2, I 2, J 0, K 5, L 2. Selected genes: Gene D 0.001, Gene E 0.005, Gene K 0.001, Gene J 0.003

17 Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

18 5 µm laser; 10 µm laser. Sources: Dr. Hugo Barrera, Microarrays Course EMBO-INER 2005, Mexico City; Microarrays Bioinformatics, Dov Stekel, Cambridge, 2003

19 Microarray - Pre-Processing
Steps: Image Analysis, Background, Normalization, Summarization, Transformation
Input: scanned image file
Purpose / Output: data file (a unique "global relative" measure of expression for every gene, with minimal experimental error)

20 TECHNOLOGIES
DNA probes: oligos (~20-40 nt); target (cDNA, PCR products, etc.)
Copies per gene: usually 1; usually 3
Organization: sectors (print-tip), y sectors (~3) by x sectors (~3), each sector with i x j spots (18x20); probesets, n probesets (~100) by m probesets (~100)
Controls: empty spots; landing lights; perfect match probes (pm); mismatch probes (mm)

21 TECHNOLOGIES: raw data before image analysis and pre-processing
Two dyes: 10,000 genes * 2 dyes * 3 copies/gene * ~40 pixels/gene = 2,400,000 values, where only 10,000 values are wanted
Oligos (pm, mm): 10,000 genes * 20 oligos * 2 (pm, mm) * ~36 pixels/gene = 14,400,000 values, where only 10,000 values are wanted
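The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the counts are the ones quoted on the slide, and the variable names are illustrative.

```python
# Raw pixel-level values per array versus the single value per gene that
# pre-processing is expected to deliver. All counts come from the slide.
genes = 10_000

# Two-dye array: 2 dyes, 3 copies per gene, ~40 pixels each
two_dye_raw = genes * 2 * 3 * 40

# Oligo array: 20 oligos per gene, pm and mm, ~36 pixels each
oligo_raw = genes * 20 * 2 * 36

print(f"two-dye raw values: {two_dye_raw:,}")
print(f"oligo raw values:   {oligo_raw:,}")
print(f"values wanted:      {genes:,}")
```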

22 Image analysis steps:
- Addressing: estimate location of spot centers.
- Segmentation: classify pixels as foreground or background.
- Extraction: for each spot on the array and each dye, obtain foreground intensities, background intensities and quality measures.
Addressing: done by GeneChip Affymetrix software

23 Addressing (by grid, GenePix)

24 Segmentation: circular feature; irregular feature shape. Finally compute the average.
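A minimal sketch of circular-feature segmentation and extraction follows. It assumes that addressing has already supplied the spot center and radius. The pixel window, the function name `segment_circular` and the use of the median for the background are illustrative assumptions, not the behaviour of GenePix or any other package.

```python
from statistics import mean, median

def segment_circular(pixels, center_row, center_col, radius):
    """Split a pixel window into foreground (inside the circle) and background."""
    foreground, background = [], []
    for r, row in enumerate(pixels):
        for c, value in enumerate(row):
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
            (foreground if inside else background).append(value)
    return foreground, background

# Hypothetical 5 x 5 window around one spot (bright center, dim surroundings)
window = [
    [10, 11, 10, 12, 10],
    [11, 90, 95, 92, 10],
    [10, 94, 99, 96, 11],
    [12, 91, 97, 93, 10],
    [10, 10, 11, 10, 12],
]

fg, bg = segment_circular(window, center_row=2, center_col=2, radius=1.5)
spot_intensity = mean(fg)      # "finally compute the average" of the foreground
spot_background = median(bg)   # assumption: median of the surrounding pixels
print(spot_intensity, spot_background)
```

An irregular feature shape would need a different rule for `inside` (for example a threshold on pixel intensity), but the extraction step stays the same.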

25 Background Reduction Extraction: Determining Background

26 Image Analysis results
2-Color Results (GenePix): .gpr file, "results" for one array; 10,000 genes ~ 30,000 values (.gal files: 1 file for a "list" of arrays)
Affymetrix Results: .cel file, "results" for one array (raw, not background reduced); 10,000 genes ~ 400,000 values

27 Segmentation (Spot detection) -> Background Estimation -> Value
Value = Spot Intensity - Spot Background

          Sample 1   Sample 1
Gene 1         100         98
Gene 2         209       4209
Gene 3          -7          2
...
Gene k        9882       9711
...
Gene N        2298         28

28 Log 2 transformation (G = Sample 1, R = Sample 1). Values per gene: Gene 1: 100, 98; Gene 2: 209, 4209; Gene 3: -7, 2; ...; Gene k: 9882, 9711; ...; Gene N: 2298, 28
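The background subtraction and log 2 transformation can be sketched as below. Note that background subtraction can produce values at or below zero (Gene 3 shows -7), which have no logarithm. Replacing them with a small floor is one common workaround and is an assumption here, not something the slides prescribe. Function names are illustrative.

```python
import math

def background_subtract(spot_intensity, spot_background):
    """Value = Spot Intensity - Spot Background."""
    return spot_intensity - spot_background

def log2_transform(values, floor=1.0):
    """Log 2 of each value; values below `floor` are raised to it first (assumption)."""
    return [math.log2(max(v, floor)) for v in values]

# Background-subtracted values for Gene 1, 2, 3, k, N from the slide (first column)
values = [100, 209, -7, 9882, 2298]
log_values = log2_transform(values)
for v, lv in zip(values, log_values):
    print(f"{v:>6} -> {lv:.2f}")
```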

29 MA-Plot (log 2 scale): can R and G be combined into 1 value? Plot axes: A and M (G = Sample 1, R = Sample 1). Values per gene: Gene 1: 100, 98; Gene 2: 209, 4209; Gene 3: -7, 2; ...; Gene k: 9882, 9711; ...; Gene N: 2298, 28
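The two MA-plot coordinates follow from the definitions given on the summary slide of this deck, M = log2(R/G) and A = log2(R*G)/2. The sketch below assumes that the first column of the table is the green channel and the second the red channel, and it skips Gene 3 because its negative value cannot be log-transformed.

```python
import math

def ma_values(red, green):
    """Return (M, A): log ratio and average log intensity of one spot."""
    m = math.log2(red / green)
    a = math.log2(red * green) / 2
    return m, a

# (R, G) pairs for Gene 1, 2, k, N; the channel assignment is an assumption
pairs = [(98, 100), (4209, 209), (9711, 9882), (28, 2298)]
for red, green in pairs:
    m, a = ma_values(red, green)
    print(f"R={red:>5} G={green:>5}  M={m:+.2f}  A={a:.2f}")
```

M close to 0 means that both channels are similar; A places the spot on the overall intensity scale.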

30 "Within" normalization, 2 dyes (2-color technologies). Assumption: the majority of genes do not change. Plot axes: A, M

31 "Within" normalization, 2 dyes (2-color technologies). Assumption: the majority of genes do not change. Panels: Before, After

32 "Within" spatial normalization, 2 dyes (2-color technologies). Panels: before normalization; after loess global normalization; after loess normalization by sector (print-tip)
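The idea behind global and per-sector normalization can be illustrated with a much simpler stand-in than loess: shifting the M values so that their median is zero, either over the whole array or within each print-tip sector. This is median centering, not loess (loess additionally fits an intensity-dependent curve of M against A), and the data and names below are hypothetical.

```python
from collections import defaultdict
from statistics import median

def center_global(m_values):
    """Subtract the array-wide median M (assumes most genes do not change)."""
    shift = median(m_values)
    return [m - shift for m in m_values]

def center_by_sector(m_values, sectors):
    """Subtract the median M of each print-tip sector from its own spots."""
    by_sector = defaultdict(list)
    for m, s in zip(m_values, sectors):
        by_sector[s].append(m)
    shift = {s: median(v) for s, v in by_sector.items()}
    return [m - shift[s] for m, s in zip(m_values, sectors)]

# Hypothetical M values: sector 1 is biased upwards, sector 2 downwards
m_values = [0.5, 1.0, 1.5, -1.0, -0.5, 0.0]
sectors = [1, 1, 1, 2, 2, 2]

print(center_global(m_values))
print(center_by_sector(m_values, sectors))
```

In this toy case the global shift leaves the sector bias in place, while the per-sector shift removes it, which is the point of the print-tip panel.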

33 Log 2 of one sample (Sample 1). Gene 1: 100; Gene 2: 209; Gene 3: -7; ...; Gene k: 9882; ...; Gene N: 2298

34 Between-slides normalization, 1 or 2 dyes. Panels: before normalization, after normalization. Methods: quantile, MAD (median absolute deviation) scale, qspline, invariantset, loess
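Of the methods listed, quantile normalization is the easiest to show in a few lines: every array is forced to share the same distribution, namely the mean of the sorted values across arrays. The sketch below ignores ties and uses hypothetical data; it is not the implementation of any particular package.

```python
def quantile_normalize(arrays):
    """arrays: one list of values per array, all of equal length."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank
    reference = [sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # gene indices by rank
        out = [0.0] * n
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        normalized.append(out)
    return normalized

# Two hypothetical arrays with three genes each
arrays = [[5, 2, 3], [4, 1, 9]]
print(quantile_normalize(arrays))
```

After normalization both arrays contain exactly the same set of values; only the gene-to-rank assignment differs.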

35 Summarization, Affymetrix (oligonucleotide-dependent technologies). Summarization = "Average"(Intensities) over the PM and MM probes. Usual methods: tukey-biweight, av-diff, median-polish. The "summarization" equivalent in two-dye technologies is the average of gene replicates within the slide.
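As an illustration of one of the listed methods, here is a minimal median-polish sketch for a single probeset. It assumes a matrix of log 2 PM intensities with probes in rows and arrays in columns, and fits value = overall + probe effect + array effect by alternating median sweeps. The data are hypothetical and the function is not taken from any package.

```python
from statistics import median

def median_polish(matrix, max_iter=10, tol=1e-9):
    """Return (overall, row_effects, col_effects) of an additive median fit."""
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep out row (probe) medians
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            resid[i] = [v - row_med[i] for v in resid[i]]
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep out column (array) medians
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff

# Hypothetical probeset: 3 probes (rows) measured on 2 arrays (columns)
log2_pm = [[7.0, 9.0], [8.0, 10.0], [9.0, 11.0]]
overall, probe_eff, array_eff = median_polish(log2_pm)
expression = [overall + a for a in array_eff]  # one summary value per array
print(expression)
```

The per-array summary is the overall level plus the array effect, so probe-specific affinities (the row effects) no longer distort the expression value.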

36
- Some spots may be defective in the printing process
- Some spots could not be detected
- Some spots may be damaged during the assay
- Artefacts may be present (bubbles, etc.)
- Use replicated spots as averages
- Remove unrecoverable genes
- Remove problematic spots in all arrays
- Infer values using computational methods (warning)

37
- More than 10,000 genes
- Too much data increases computation time and analysis complexity
- Remove:
  - Genes that do not change significantly
  - Undefined genes
  - Low expression
- Keep:
  - Large signal-to-noise ratio
  - Large statistical significance
  - Large variability
  - Large expression

38 Data Processing summary
Steps: Image Analysis, Background Subtraction, Normalization, Summarization, Transformation, Filtering
Figure panels: a) Microarray image scanning; b) Image analysis and background subtraction (spot detection, intensity value, background detection & subtraction; Affymetrix, two-dyes); c) Transformation; d) Normalization (between, within)
Formulas: A = log2(R*G)/2, M = log2(R/G)

39

40 Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007

41 Data matrix: genes (Gene 1, Gene 2, Gene 3, ..., Gene N) by samples
Class A samples: Normal Tissue, Cancer A, Untreated, Reference, …
Class B samples: Tumour Tissue, Cancer B, Treated, Strains, …

42
- Differential Expression
- Unsupervised Classification
- Biomarker Detection
- Identifying genes related to survival times
- Regression Analysis
- Gene Copy Number and Comparative Genomic Hybridization
- Epigenetics and Methylation
- Genetic Polymorphisms and SNPs
- Chromatin Immuno-Precipitation On-Chip
- Pathogen Detection
- …

43 Differential Expression
Gene selection: positive and negative examples, each comparing the expression level in Samples A and Samples B (panel labels: µ=d)
Data matrix as in slide 41 (genes by Class A and Class B samples)
Statistics: p-value, FDR, q-value
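With thousands of genes tested at once, raw p-values are usually corrected for multiple testing. The slide names FDR without giving a method. The sketch below uses the Benjamini-Hochberg procedure as one common choice (an assumption), applied to hypothetical p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never decrease as the raw p-value grows
    for rank in range(n, 0, -1):
        gene = order[rank - 1]
        running_min = min(running_min, p_values[gene] * n / rank)
        adjusted[gene] = running_min
    return adjusted

# Hypothetical per-gene p-values
p_values = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(p_values)
selected = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, selected)
```

Here only the first gene stays below 0.05 after correction, although three genes had raw p-values under 0.05.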

44 Biomarker Detection (Biomarker Discovery)
Gene selection: positive and negative examples, each comparing the expression level in Samples Class A and Samples Class B (panel labels: µ=d)
Data matrix as in slide 41 (genes by Class A and Class B samples)

45 Unsupervised Sample Classification. Figure labels: genes A, C, G, B, H, E, D, I, K, M, L; samples 1-9; co-expressed genes; expression scale from Low to High; panels a, b

46 Genes Associated to Survival Times and Risk
Gene selection: positive and negative examples, each shown as a Kaplan-Meier plot (axes: Time and Hazard, 0.0 to 1.0; + marks on the curves)
Data matrix as in slide 41 (genes by Class A and Class B samples)
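A Kaplan-Meier curve is built from follow-up times and an indicator of whether the event was observed or the sample was censored. The sketch below computes the survival estimate at each event time for one group of samples. How the groups are formed (for example high versus low expression of a gene) is left out, and the data are hypothetical.

```python
def kaplan_meier(times, events):
    """times: follow-up times; events: 1 = event observed, 0 = censored.
    Returns [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        leaving = sum(1 for tt, _ in data if tt == t)           # events + censored at t
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve

# Hypothetical group of four samples; the second one is censored at time 2
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
print(curve)
```

A gene would count as a positive example when the curves of its two expression groups separate clearly.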

47 Regression: Gene Association to Outcome
Gene selection: positive (slope ≠ 0) and negative (slope = 0) examples, plotting Dependent Variable against Gene Expression
Data matrix as in slide 41 (genes by Class A and Class B samples)

48

49 [Figure: SNP genotyping on arrays (Labelling, Hybridisation, Detection) for SNP 1, SNP 2, SNP 3 with genotypes AA, CG, CC; panels a, b, c. Labels: Extension (1 nt) + labelled ddNTPs, PCR products + DNA polymerase; Extension, ddNTPs (one labelled), transcribed RNA + reverse transcriptase; Capture of products of 1 nt primer extension (in solution)]

50 Chromatin Immuno-Precipitation (ChIP-on-Chip). Figure labels: Fusion of tag sequence into TF gene; Tagged TF (Transcription Factor, Tag); DNA; Antibody against tag peptide; Incubation; Precipitation of Antibody-TF-DNA complex; Labelling of precipitated DNA; Microarray Hybridisation

51 [Figure: sequences (1) ACGGCTAGTCACAAC..., (2) GCTAGTCACAACCCA..., (3) GCTAGTCCGGCACAG... Labels: Sample; Spotted; Hybridized; spots (1), (2), (3)]

52 Example experiment (Dr. Hugo Barrera)
Samples: Placenta 1, Placenta 2, Reference Pool (dyes: Green, Red)
Workflow: mRNA Extraction -> Labelling -> Microarray Hybridization (by duplicates) -> Scanning & Data Processing -> Detection of Differentially Expressed Genes -> Validation and Analysis
Scanning & Data Processing: Image Analysis; Within Normalization (per array); Between Normalization (all arrays)
Detection: t-test, H0: µ = 0; p-value correction: False Discovery Rate
Validation: comparison with known tissue-specific genes (controls)
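For a design like this one, each gene has one log 2 ratio (placenta versus reference) per hybridization, and the test asks whether the mean log ratio differs from zero. The sketch below computes only the one-sample t statistic. Turning it into a p-value needs the t distribution, which is not in the Python standard library, so that step is left out. The four log ratios are hypothetical.

```python
import math
from statistics import mean, stdev

def one_sample_t(log_ratios, mu0=0.0):
    """t statistic for H0: mean log ratio = mu0 (n - 1 degrees of freedom)."""
    n = len(log_ratios)
    standard_error = stdev(log_ratios) / math.sqrt(n)
    return (mean(log_ratios) - mu0) / standard_error

# Hypothetical log 2 ratios of one gene on four arrays
# (two placentas, each hybridized in duplicate)
log_ratios = [1.0, 2.0, 3.0, 2.0]
t_stat = one_sample_t(log_ratios)
print(f"t = {t_stat:.3f} with {len(log_ratios) - 1} degrees of freedom")
```

The resulting p-values, one per gene, are what the False Discovery Rate correction is then applied to.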

53 Panels a, b, c, d: Placenta/Reference; Control/Control

54 [Figure: (a) Microarray experiment, ratio (log 2) from -6 to 10, for arrays Placenta 1 Replicate 1, Placenta 1 Replicate 2, Placenta 2 Replicate 1, Placenta 2 Replicate 2. (b) T1dbase, T1 score from 0 to 1, across tissues: Lung, Thalamus, Amygdala, Spinal Cord, Testis, Kidney, Liver, Pituitary, Thyroid, Cerebellum, Hypothalamus, Caudate Nucleus, Exocrine Pancreas, Lymph Node, Frontal Cortex, Stomach, Breast, Bone Marrow, Pancreatic Islets, Uterus, Ovary, Skin, Heart, Skeletal Muscle, Prostate, Thymus, Salivary Gland, Trachea, Placenta]

55 Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007

56 Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007

57 Microarray Technology Through Applications, F. Falciani, Taylor & Francis 2007

58

59

