Gene expression informatics – it’s all in your mine
Douglas E. Bassett Jr, Michael B. Eisen & Mark S. Boguski
Nature Genetics Supplement, 1999
Review of how it works
  From "Sharing Gene Expression Data: An Array of Options" by Daniel Geschwind
Steps and Data
  Lab Information
  Image Analysis
    Turn the image (spots) into a numerical representation (usually ratios)
    Problem: no standard procedure or software, which leads to variation
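As a rough illustration of the image analysis step, the sketch below turns per-spot intensities from two channels into log2 ratios. The function name `log_ratios`, the optional background values, and the sample intensities are all hypothetical. The slides do not specify background subtraction or any particular software, so this is one plausible procedure rather than a standard one.

```python
import math


def log_ratios(red, green, background_red=0.0, background_green=0.0):
    """Convert per-spot channel intensities into log2(red / green) ratios.

    Background subtraction is an assumption made for illustration; a spot
    whose corrected intensity is not positive is reported as None.
    """
    ratios = []
    for r, g in zip(red, green):
        r_corr = r - background_red
        g_corr = g - background_green
        if r_corr <= 0 or g_corr <= 0:
            ratios.append(None)  # spot unusable after background subtraction
        else:
            ratios.append(math.log2(r_corr / g_corr))
    return ratios


# Hypothetical intensities for three spots (red and green channels).
red = [1200.0, 400.0, 150.0]
green = [300.0, 400.0, 600.0]
print(log_ratios(red, green))
```

Changing the background values changes the ratios, and can drop spots entirely. That is one small way in which the lack of a standard procedure produces variation.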
Steps and Data (continued)
  Normalization
    Raw numbers are normalized to reduce systematic biases (dye absorption rate, mRNA quantities, etc.)
    Problem: information is lost
    Different normalization methods produce different results
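To make the last point concrete, here is a minimal sketch that applies two simple normalization schemes to the same made-up data. Neither scheme is named in the slides: median centering of log ratios and total-intensity scaling are chosen only as familiar examples. The function names and intensities are hypothetical.

```python
import math
import statistics


def median_center(log_ratios):
    """Subtract the median log ratio so the typical spot sits at 0."""
    m = statistics.median(log_ratios)
    return [x - m for x in log_ratios]


def total_intensity_scale(red, green):
    """Rescale red so both channels have equal total intensity,
    then return log2(red / green) for each spot."""
    k = sum(green) / sum(red)
    return [math.log2((r * k) / g) for r, g in zip(red, green)]


# Hypothetical intensities for four spots.
red = [800.0, 1600.0, 400.0, 6400.0]
green = [400.0, 400.0, 400.0, 400.0]

raw = [math.log2(r / g) for r, g in zip(red, green)]
by_median = median_center(raw)
by_total = total_intensity_scale(red, green)

for a, b in zip(by_median, by_total):
    print(f"median-centered: {a:+.2f}   total-intensity: {b:+.2f}")
```

Both schemes shift every log ratio by a constant, but by different constants, so the same spot ends up with a different normalized value depending on the method chosen.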
Data Integration and Warehousing
  Goal: the GenBank model
    Objective, natural measure of gene expression
    Highly curated database
    Controlled by rich ontologies
    Synergistically hyperlinked to important references
Is this goal possible with expression data?
  No natural, objective unit of measure (like A, G, C, T for sequence)
  As many entries as cell types × conditions
  Lacking standards (some standards have been worked on since 1999)
  Problems with normalization, especially across experiments