Published by Maximillian Chapman. Modified over 8 years ago.
1
A Short Overview of Microarrays Tex Thompson Spring 2005
2
Raw Data ● Microarray data at its most raw consists of a scanned image of the spots, plus information on what each spot represents (spot intensities and metadata). ● Genes may be spotted in replicate. ● Affymetrix chips use perfect match/mismatch (PM/MM) probe pairs to guard against non-specific hybridization.
3
Normalizing Data ● Normalization of microarray data is the process of removing array-specific bias in order to make results between arrays comparable. ● Intensity data relevant to a single gene needs to be combined and normalized in order to define “expression levels” for each gene. ● The basic idea is that the expression level is proportional to the number of mRNA transcripts of that gene within the tissue of interest.
4
RMA Normalization ● Each array is assumed to have a common amount of “background noise.” ● Normalization is performed by quantile normalization, such that the intensities across each chip are adjusted to produce identical distributions. ● A statistician (or Google) could tell you much more about this.
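The quantile normalization step described above can be sketched in a few lines. This is a minimal illustration, not the RMA implementation itself: it covers only the quantile step (RMA also includes background correction and probe summarization), the function name `quantile_normalize` is made up for this example, and ties are broken by position rather than averaged.

```python
def quantile_normalize(arrays):
    """Force every array (chip) to share the same intensity distribution.

    arrays: list of equal-length lists, one list of intensities per chip.
    Hypothetical helper for illustration; ties are broken by position.
    """
    n_probes = len(arrays[0])
    n_arrays = len(arrays)

    # Sort each chip, then average across chips at every rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / n_arrays
        for rank in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        # order[rank] is the index of the probe holding that rank on this chip.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(chips)
print(result)
```

After normalization each chip contains exactly the same set of values, only assigned to different probes, which is what "identical distributions" means here.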
5
Diagram of Microarray Analysis
[Figure: diagram with the stages "mRNA", "Raw Data", and "Normalized Data", followed by a stage marked "????????????"]
6
What Sorts of Questions Can We Ask? ● What are the most highly/lowly expressed genes in a sample of interest? ● What are the differentially expressed genes across two (or more) samples of interest? ● What sets of genes are always upregulated or downregulated as a set? ● What do you think?
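As one possible way to approach the differential expression question, a per-gene Welch t-statistic compares the mean expression of a gene in two groups of samples. The sketch below is an assumption about how one might do it, not the method used in the course; the function name `welch_t` and the example values are invented.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch t-statistic for one gene measured in two groups of samples.

    Hypothetical helper; each group needs at least two values and the
    combined variance must be non-zero.
    """
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Invented expression levels for a single gene in two sample groups.
t = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(round(t, 4))
```

A large absolute value suggests the gene differs between the groups. With thousands of genes tested at once, some correction for multiple testing would also be needed.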
7
Clustering ● Clustering is the process of assembling N objects into K “clusters” based on a set of measured characteristics. ● For example, a common clustering application is grouping individual samples into clusters based on their gene expression. ● Alternatively, clustering can be used to group together individual genes that have similar expression patterns.
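To make the idea of assembling N objects into K clusters concrete, here is a minimal K-means sketch. It is a simplification under stated assumptions: the function name `kmeans` is invented, the initial centroids are simply the first K points (real implementations usually pick them at random), and the two-gene expression profiles are made-up numbers.

```python
import math


def kmeans(points, k, iterations=20):
    """Assign each point to one of k clusters (illustrative sketch).

    points: list of equal-length numeric lists, e.g. one profile per sample.
    Assumption: the first k points serve as the starting centroids.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids


# Invented profiles: four samples measured on two genes.
samples = [[1.0, 1.0], [9.0, 9.0], [1.2, 0.8], [8.8, 9.1]]
labels, centroids = kmeans(samples, k=2)
print(labels)
```

To cluster genes instead of samples, the same function can be called with one profile per gene, where each profile holds that gene's expression across the samples.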
8
Prediction ● Prediction is the process of creating an algorithm for taking an unknown sample and putting it in a known classification scheme. ● For example, a predictor might measure the gene expression levels of an unknown tissue sample and match it to the most probable classification. ● This protocol is very common in studies of different types of cancer.
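A K-nearest-neighbor classifier is one simple way to match an unknown sample to its most probable class. The sketch below is illustrative only: `knn_predict`, the class labels "A" and "B", and the two-gene expression profiles are all invented for the example.

```python
import math
from collections import Counter


def knn_predict(train_profiles, train_labels, unknown, k=3):
    """Predict a class by majority vote among the k closest known samples.

    Hypothetical helper; distance is plain Euclidean distance between
    expression profiles.
    """
    ranked = sorted(
        zip(train_profiles, train_labels),
        key=lambda pair: math.dist(pair[0], unknown),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented training data: six samples with known classes.
profiles = [
    [1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
    [9.0, 9.0], [9.1, 8.8], [8.7, 9.2],
]
classes = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(profiles, classes, unknown=[8.5, 9.0])
print(prediction)
```

Choosing an odd k avoids tied votes when there are only two classes.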
9
Algorithms Of Interest ● Principal Component Analysis (PCA) ● Self-Organizing Maps (SOM) ● Support Vector Machines (SVM) ● Linear Discriminant Analysis (LDA) ● K-Means Clustering ● KNN Classifiers ● Differential Expression Statistics ● Assumptions of RMA Normalization
10
Looking At The Data ● Each array falls into one of four types: – Young – Middle-aged – Old, Mild Presbycusis – Old, Severe Presbycusis
11
Looking At The Data
Probe set ID   X13_Frisina_S2_M430A.CEL   X1_b_Frisina_S2_M430A.CEL
1415670_at     10.0073897626035           10.4616952671666
1415671_at     12.1960225217605           13.1951229785856
1415672_at     13.9737085433580           13.7746451795089
1415673_at     9.62027371983307           10.9092694066664
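As a starting point for working with data in this shape, the following sketch reads the four rows shown above and compares the two arrays probe by probe. The `probe_id` header and the variable names are assumptions added for the example. If the values are log2-scale expression levels, as RMA output normally is, the difference between two arrays is a log2 fold change.

```python
# The four rows from the slide; "probe_id" is an assumed header name.
TABLE = """\
probe_id X13_Frisina_S2_M430A.CEL X1_b_Frisina_S2_M430A.CEL
1415670_at 10.0073897626035 10.4616952671666
1415671_at 12.1960225217605 13.1951229785856
1415672_at 13.9737085433580 13.7746451795089
1415673_at 9.62027371983307 10.9092694066664
"""

lines = TABLE.strip().splitlines()
header = lines[0].split()
array_names = header[1:]

# Map each probe set ID to its expression value on each array.
expression = {}
for line in lines[1:]:
    probe_id, *values = line.split()
    expression[probe_id] = dict(zip(array_names, map(float, values)))

first, second = array_names
# Second array minus first array for every probe set.
differences = {
    probe_id: levels[second] - levels[first]
    for probe_id, levels in expression.items()
}

# Probe set with the largest absolute difference between the two arrays.
largest = max(differences, key=lambda p: abs(differences[p]))
print(largest, differences[largest])
```

With only one array per condition this shows which probe sets differ most, but it says nothing about statistical significance; that requires replicates.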
12
Go To Work! I'll be available for questions until 9:30am and via e-mail (tex@bioinformatics.rit.edu). These slides will be made available on the course website.