Data visualization in the post-genomics era
Carol Morita
Genentech, Inc.
Pre-Genomics: assembling the pieces
Genome project initiated
GenBank
Where we are today
Organism               Size (bp)       # genes
E. coli (bacteria)     4.67 million    3,237
Arabidopsis (plant)    100 million     25,000
C. elegans (worm)      97 million      19,099
Drosophila (fly)       136 million     13,061
Mouse                  3 billion       ~40,000
Human                  3 billion       ~40,000
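The sizes and gene counts listed above can be turned into a rough gene density (genes per million base pairs). The following is a minimal sketch: the figures are copied from the slide, while the names `GENOMES` and `genes_per_megabase` are introduced here for illustration, and the ~40,000 counts for mouse and human are treated as exactly 40,000.

```python
# Genome sizes (bp) and gene counts copied from the slide above.
GENOMES = {
    "E. coli": (4.67e6, 3237),
    "Arabidopsis": (100e6, 25000),
    "C. elegans": (97e6, 19099),
    "Drosophila": (136e6, 13061),
    "Mouse": (3e9, 40000),  # slide gives ~40,000; treated as exact here
    "Human": (3e9, 40000),  # slide gives ~40,000; treated as exact here
}


def genes_per_megabase(size_bp, n_genes):
    """Return the number of genes per million base pairs."""
    return n_genes / (size_bp / 1e6)


density = {name: genes_per_megabase(*v) for name, v in GENOMES.items()}

# Print organisms from most to least gene-dense.
for name, d in sorted(density.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {d:8.1f} genes per Mb")
```

On these numbers the bacterial genome is far more densely packed with genes than the mammalian ones, so gene count does not scale with genome size.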
American view of the genome
Entrez Genome Browser
National Center for Biotechnology Information, National Institutes of Health
European view of the genome
Ensembl Genome Browser
European Molecular Biology Laboratory
What the genomes of model organisms tell us
Maturation: 10 days, 9 weeks, 20-25 years
Genome: 165 million bp, 3 billion bp
Genes: 13,600, ~40,000
Almost every human gene has a counterpart in the mouse, and some blocks of DNA are proving impossible to tell apart
If we are so similar genetically, why are we so different?
Human genes mapped onto mouse chromosomes
Proteomics: the real work begins
Definition: Description and functional characterization of the full complement of an organism’s proteins
What’s at play…
– Multiple proteins can be derived from one gene
– Protein interactions can be complex and are poorly understood
– ‘Plasticity’ of the genome
– Spatial and temporal regulation
Increased diversity due to alternative splicing
gene A
Alternative splicing
Plays an important role in:
– expanding protein diversity
– generating proteins with subtle or opposing functional roles
– enabling an organism to respond to environmental pressures
More than 35% of human genes undergo alternative splicing; the true figure is probably higher
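To make the point about protein diversity concrete, here is a toy sketch of how one gene can yield several transcripts. It assumes a deliberately simplified model in which each optional (cassette) exon is either included or skipped independently; real splicing is more constrained. The exon names and the function `splice_isoforms` are hypothetical.

```python
from itertools import combinations


def splice_isoforms(exons, optional):
    """Enumerate isoforms by including or skipping each optional exon.

    exons    -- exon names in genomic order
    optional -- set of exon names that may be skipped (toy assumption)
    """
    skippable = [e for e in exons if e in optional]
    isoforms = []
    # Try every subset of skippable exons, from none skipped to all skipped.
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            isoforms.append(tuple(e for e in exons if e not in skipped))
    return isoforms


# Hypothetical gene with two constitutive exons (E1, E4) and two cassette exons.
gene_a = ["E1", "E2", "E3", "E4"]
isoforms = splice_isoforms(gene_a, optional={"E2", "E3"})
for iso in isoforms:
    print("-".join(iso))
```

Under this model a gene with n independent cassette exons can produce up to 2^n distinct transcripts, which is why even a modest number of genes can encode a much larger set of proteins.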
Complexity due to protein interactions
Death Receptor Signaling pathway
DNA Microarrays
Microarray chips may contain 50,000 known DNA fragments on a single slide
Visualizing microarray data
Source: Silicon Genetics: GeneSpring
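A common way to colour microarray data is by the log2 ratio of a sample intensity to a reference intensity. The sketch below illustrates that general idea only; it is not GeneSpring's method, and the function names, the red/green/black colour scheme, the two-fold threshold, and the intensity values are all assumptions made up for illustration.

```python
import math


def log2_ratio(sample, reference):
    """Log2 fold change of a sample intensity relative to a reference."""
    return math.log2(sample / reference)


def colour(ratio, threshold=1.0):
    """Map a log2 ratio to a display colour (assumed red/green scheme)."""
    if ratio >= threshold:
        return "red"    # at least two-fold up
    if ratio <= -threshold:
        return "green"  # at least two-fold down
    return "black"      # little or no change


# Invented intensities for three hypothetical spots: (sample, reference).
spots = {"spot_1": (800.0, 200.0), "spot_2": (100.0, 400.0), "spot_3": (310.0, 300.0)}
for name, (sample, reference) in spots.items():
    ratio = log2_ratio(sample, reference)
    print(f"{name}: log2 ratio {ratio:+.2f} -> {colour(ratio)}")
```

Working on the log scale makes a two-fold increase and a two-fold decrease symmetric about zero, which is what lets a single colour scale represent both.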
Limitations of DNA microarrays
– ‘Snapshots’ of the DNA activity in a cell -- prefer movies!
– Many important biological events cannot be detected because transcription of DNA is not involved
– Protein array technology is still in its infancy
The curse of dimensionality
Source: Klausner, 2002, Cancer Cell 1, p
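One manifestation of the curse of dimensionality can be sketched numerically: as the number of measured variables grows, the nearest and farthest points from any query become almost equally distant, so distance-based comparisons lose their meaning. The example below uses uniformly random points, not real expression data; `relative_contrast` and its parameters are hypothetical names chosen for this sketch.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest distance from a random query.

    Points are drawn uniformly from the unit hypercube of dimension `dim`.
    A value near zero means all points look about equally far away.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 2))
```

The contrast shrinks as the dimension rises. A microarray experiment with tens of thousands of genes and only a handful of samples sits at the high-dimensional end of this range.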