Published by Hortense York. Modified over 6 years ago.
1
Microarray Experiment Design and Data Interpretation
Susan Hester, Ph.D., Environmental Carcinogenesis Division, Toxicogenomic Core Facility, US EPA
2
Presentation Outline
Traditional biology versus genomics
Basics of genomics
Data mining goals and approaches using parallel analyses, with some examples
Interpreting changes in gene expression to identify altered molecular pathways
Evaluating pathway alterations in concert with traditional toxicology data for greater understanding of mode of action
3
Traditional Biology
Measure one tree at a time
Measure one element in samples
4
“Omic” Biology
Measure forests (groups of trees)
Measure tens of thousands of elements in 2 to 4 samples
5
Genomic research is a data-rich technology
Microarrays are also called chips or arrays
They take advantage of the natural property of DNA to pair with its complementary strand
One strand is built into the array and is then used as a probe for the complementary strand in the biological sample
Binding confirms the presence of the corresponding mRNA or cDNA in the sample
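The pairing rule that the probe relies on can be sketched in a few lines. This is only an illustration of perfect Watson-Crick complementarity; the probe sequence and the function names are hypothetical, and real hybridization also depends on conditions that this sketch ignores.

```python
# Watson-Crick pairing rules for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would base-pair with `seq`, read 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def hybridizes(probe: str, target: str) -> bool:
    """True only if `target` is the perfect complement of `probe`."""
    return target.upper() == reverse_complement(probe)


probe = "ATGGCC"  # hypothetical probe sequence fixed on the array
print(reverse_complement(probe))      # the strand this probe would capture
print(hybridizes(probe, "GGCCAT"))    # complementary target
print(hybridizes(probe, "ATGGCC"))    # identical strand, not complementary
```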
6
Genomic Profiling: Find “Significantly Changed Genes”
From: all probesets (a typical experiment is ~1M data points)
To: a much smaller number of “meaningful genes”
7
Finding genes in samples: 1st step
Figure labels: 1 GeneChip, cell location; 1 GeneChip, apply sample
8
2nd step
Tagged DNA fragments that base-pair will glow
Figure labels: shine light; final image; text file with gene intensities
9
Experimental Design
Use adequate controls
Sample collection
Choose time points and doses
Hybridization schemes: 1 or 2 colors
10
Data Quality and Data Mining
RNA quality
Scans
Summary statistics
11
RNA quality: Agilent 2100 Bioanalyzer
Measures RNA quality and quantity
Uses a small sample size and takes minutes
Figure labels (Agilent gel image): good-quality RNA; degraded RNA
12
QC Assessment of Scanned Slide
Showing good dynamic range of signal intensity
Low background signal
Figure labels: poor scan; good scan
13
Summary Statistics for each array
Raw gene intensity distribution for each array
After normalization, the distributions show reduced variance
Figure labels: max, median, min, Grp
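A minimal sketch of per-array summary statistics and one simple normalization follows. The slide does not say which normalization was used; median centering on the log2 scale is an assumption chosen here because it is easy to follow. The array names and intensity values are made up.

```python
import math
import statistics


def summarize(intensities):
    """Return (min, median, max) of the log2 intensities for one array."""
    logs = [math.log2(x) for x in intensities]
    return min(logs), statistics.median(logs), max(logs)


def median_center(arrays):
    """Shift each array's log2 intensities so that its median becomes 0."""
    centered = {}
    for name, values in arrays.items():
        logs = [math.log2(x) for x in values]
        med = statistics.median(logs)
        centered[name] = [v - med for v in logs]
    return centered


# Hypothetical raw intensities: the second array is uniformly twice as bright.
arrays = {
    "ctrl_1": [64, 128, 256, 512, 1024],
    "treated_1": [128, 256, 512, 1024, 2048],
}

for name, values in arrays.items():
    print(name, summarize(values))

# After centering, the overall brightness difference between arrays is gone.
print(median_center(arrays))
```

Centering removes array-wide shifts only; it does not change the spread within an array.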
14
Example of within-group outliers
Example of 2 array outliers (high and low median values)
Figure axis: Arrays
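One way to flag arrays with unusually high or low medians is sketched below. The rule (distance from the group median, measured in median absolute deviations) and the cutoff of 3 are assumptions for illustration, not the method used in the presentation. The array names and values are made up.

```python
import statistics


def flag_outlier_arrays(medians, n_mads=3.0):
    """Return names of arrays whose median lies far from the group's median.

    `medians` maps array name -> median log2 intensity of that array.
    Distance is measured in median absolute deviations (MAD).
    """
    center = statistics.median(medians.values())
    mad = statistics.median(abs(m - center) for m in medians.values())
    if mad == 0:
        return []  # no spread at all, so nothing can be called an outlier
    return sorted(
        name for name, m in medians.items() if abs(m - center) > n_mads * mad
    )


# Hypothetical per-array medians: a6 is high and a7 is low.
medians = {
    "a1": 8.0, "a2": 8.1, "a3": 7.9, "a4": 8.2,
    "a5": 7.8, "a6": 10.5, "a7": 5.6,
}
print(flag_outlier_arrays(medians))
```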
15
Goals of Data Mining
Reduce the large dataset by first excluding “unchanging genes”
Early microarray papers used a simple “fold change” to find differences
Most analyses now rely on statistical tests, supervised or unsupervised, to identify changed genes
Find the genes that distinguish the various biological classes (“significant genes”)
16
General Approach: From many genes to a few
28,000 rat genes; 34,000 mouse genes
Normalize data to compare across arrays
Analysis begins here: supervised (prior knowledge) and unsupervised (no prior knowledge)
T test, ANOVA, etc.
PCA, KNN, clustering
Genes: now associate with gene name, using databases to assign gene function
Characterize genes into pathways
Explore pathways by combining into networks
17
Array Image Inspection Confirms the Induction of Many Genes
1 uM As uM As
18
Statistical filter shows more significant genes at higher doses
Figure labels: 1 uM As; 50 uM As
Filter: genes with a fold change > 1.5 and a significant p-value (p < 0.05)
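The two-part filter (fold change > 1.5 and p < 0.05) can be sketched as follows. The thresholds come from the slide; everything else is an assumption. An exact permutation test stands in for whatever statistical test was actually used. Down-regulation by 1.5-fold is counted the same as up-regulation, and the gene names and intensities are made up.

```python
import math
from itertools import combinations
from statistics import mean


def fold_change(control, treated):
    """Ratio of mean treated intensity to mean control intensity."""
    return mean(treated) / mean(control)


def permutation_p(control, treated):
    """Exact two-sided permutation p-value for a difference in mean log2 intensity."""
    logs = [math.log2(x) for x in control + treated]
    n, m = len(control), len(treated)
    total = sum(logs)
    observed = abs(mean(logs[n:]) - mean(logs[:n]))
    hits = count = 0
    # Try every way of relabeling n of the values as "control".
    for idx in combinations(range(n + m), n):
        s = sum(logs[i] for i in idx)
        diff = abs((total - s) / m - s / n)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point noise
            hits += 1
    return hits / count


def is_changed(control, treated, min_fold=1.5, alpha=0.05):
    """Apply both filters: magnitude of change and statistical significance."""
    fc = fold_change(control, treated)
    magnitude = max(fc, 1 / fc)  # assumption: treat down-regulation like up-regulation
    return magnitude > min_fold and permutation_p(control, treated) < alpha


# Hypothetical intensities, 4 control and 4 treated arrays per gene.
genes = {
    "gene_A": ([100, 110, 90, 105], [300, 320, 280, 310]),
    "gene_B": ([100, 110, 90, 105], [102, 95, 108, 99]),
}
for name, (control, treated) in genes.items():
    print(name, round(fold_change(control, treated), 2), is_changed(control, treated))
```

With 3 arrays per group, an exact two-sided permutation test cannot go below p = 0.1, so this sketch uses 4 per group.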
19
Many Views of the Data
Table of filtered genes
Principal Component Analysis (PCA)
Venn diagrams: gene level
Correlate transcription with functional assays
Map genes to pathways
Venn diagram: pathway level
20
Table view: Significantly Altered Genes by Chemical, Day and Dose
in rat liver
21
Principal Component Analysis
Identifies dose-response, if present
Assess experiment: worth analyzing?
Identify outliers (bad chips)
Find samples with similar expression patterns
What it does: uses all samples and genes; using statistics, reduces and plots the data
What it looks like: helps visualize data in 2 or 3 planes (3D)
What it tells: groups samples or genes with similar profiles; differentiates treatment or exposure groups
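A minimal sketch of the idea behind PCA follows, using only the standard library. It finds the first principal axis by power iteration and projects each sample onto it. In practice a numerical library would be used. The sample names and log2 values are made up, and the sign of a principal component is arbitrary.

```python
import math


def center_columns(rows):
    """Subtract each gene's (column's) mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def first_component(rows, iterations=200):
    """Approximate the first principal axis by power iteration on X^T X.

    Returns (axis, scores): a unit vector over genes and one score per sample.
    """
    x = center_columns(rows)
    n_genes = len(x[0])
    axis = [1.0 / math.sqrt(n_genes)] * n_genes  # arbitrary starting direction
    for _ in range(iterations):
        scores = [sum(a * v for a, v in zip(axis, row)) for row in x]
        new = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(n_genes)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            break  # data have no variance along the current direction
        axis = [v / norm for v in new]
    scores = [sum(a * v for a, v in zip(axis, row)) for row in x]
    return axis, scores


# Hypothetical log2 expression: rows are samples, columns are three genes.
names = ["control_1", "control_2", "low_dose", "high_dose"]
samples = [
    [8.0, 5.0, 7.0],
    [8.1, 5.1, 7.1],
    [9.0, 5.0, 6.5],
    [10.0, 5.1, 6.0],
]
axis, scores = first_component(samples)
for name, score in zip(names, scores):
    print(name, round(score, 2))
```

In this toy data the two controls land close together and the dosed samples spread out along the first component in dose order. That is the kind of grouping and dose-response pattern the slide describes.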
22
Principal Component Analysis Rat Liver
23
Numbers of Common and Unique Genes Over Time (High Dose)-rat liver
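The counts behind a three-set Venn diagram are plain set operations. In this sketch the time points and gene identifiers are hypothetical placeholders, not data from the presentation.

```python
def venn_counts(a, b, c):
    """Return the size of each of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b_only": len((a & b) - c),
        "a_c_only": len((a & c) - b),
        "b_c_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Hypothetical lists of significantly changed genes at three time points.
day_a = {"gene_1", "gene_2", "gene_3"}
day_b = {"gene_1", "gene_3", "gene_4"}
day_c = {"gene_1", "gene_4", "gene_5"}

counts = venn_counts(day_a, day_b, day_c)
print(counts)
```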
24
Dose response corresponds to functional assays
Better description of dose response by genomics
25
Mapping genes to pathways
Columns: Pathway | Process | p-Value | # of genes expressed | # of genes in pathway | %
Transcription of Retinoid-Target genes | Cell signaling/Regulation of transcription | 7.56E-09 | 68 | 125 | 54
Regulation activity of EIF2 | Cell signaling/Translation regulation | 5.86E-05 | 31 | 56 | 55
IGF-R signaling | Growth and differentiation | 4.57E-06 | 40 | 72 | (not recovered)
AKT signaling | (not recovered) | 9.50E-06 | 33 | 57 | 58
PTEN pathway | (not recovered) | 3.65E-05 | (not recovered) | (not recovered) | (not recovered)
Tryptophan metabolism | Metabolic maps/Amino acid metabolism | 3.99E-05 | 17 | 24 | 71
Cholesterol Biosynthesis | Metabolic maps/Steroid metabolism | 6.25E-06 | 16 | 22 | 82
GTP-XTP metabolism | Metabolic maps/Nucleotide metabolism | 4.58E-07 | 34 | 63 | (not recovered)
CTP/UTP metabolism | (not recovered) | 1.32E-05 | 60 (column unclear)
ATP/ITP metabolism | (not recovered) | 1.49E-05 | 36 | 65 | (not recovered)
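The slide does not say how the pathway p-values were computed. A common choice for this kind of table is a hypergeometric (over-representation) test, and the sketch below assumes that test purely for illustration. The function names and the small worked numbers are hypothetical. The only values taken from the table are 68 and 125, used to show that the % column matches expressed genes divided by genes in the pathway for the first row.

```python
from math import comb


def percent_covered(n_expressed, n_in_pathway):
    """Percentage of a pathway's genes that appear in the expressed list."""
    return round(100 * n_expressed / n_in_pathway)


def hypergeom_p(n_total, n_changed, n_in_pathway, n_overlap):
    """Probability of seeing at least `n_overlap` pathway genes by chance.

    Model: `n_changed` genes are drawn at random, without replacement, from
    `n_total` genes, of which `n_in_pathway` belong to the pathway.
    """
    upper = min(n_changed, n_in_pathway)
    favorable = sum(
        comb(n_in_pathway, k) * comb(n_total - n_in_pathway, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return favorable / comb(n_total, n_changed)


# First table row: 68 of 125 pathway genes.
print(percent_covered(68, 125))

# Tiny made-up example: 10 genes total, 3 changed, pathway of 4, overlap of 2.
print(hypergeom_p(10, 3, 4, 2))
```

Reproducing the table's p-values would also need the total number of genes considered and the size of the changed-gene list, neither of which is given on the slide.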
26
Pathway Venn
Unique and common pathways over time
27
Pathway and network visualizations
Figure labels: cellular, molecular, network, metabolic, transcription
28
Example of a molecular pathway with gene intensity values added
Oxidative phosphorylation pathway
Legend: red = gene induced; green = gene repressed; rainbow = mixed
Labeled components: ATPase, oxidoreductase, NADH dehydrogenase, succinate dehydrogenase complex, cytochrome c oxidase subunit
29
Cellular pathway
Compartments: extracellular, cytoplasmic, nuclear
Note c-Jun, JNK1, ERK1 repression*
Expression legend: green = decreased; red = increased; rainbow = mixed
30
Gene Network: One Transcription factor:
31
Network objects mapped to cellular localization
32
Conclusions
Steps for a successful microarray experiment:
Experiment design: focus your research question
Data quality assessment
Supervised and unsupervised analyses
Integrating gene expression results with other phenotypic endpoints