Gene Expression Analysis BI420 – Introduction to Bioinformatics Department of Biology, Boston College
Why study gene expression? Which genes are active at different developmental stages? In cells of different tissues? At different time points in the same cell? In cells under different environmental conditions? Which genes differ in activity between normal and cancerous cells?
Challenges In Measuring Expression: Calling Differential Expression Is the difference in expression between the test and the control greater than the uncertainty in the measurement?
Gene expression naturally bounces around a lot
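One way to make the question on the previous slide concrete is to divide the difference between the test and control means by the standard error of that difference. The sketch below is only illustrative: the expression values are hypothetical, and Welch's t statistic is an assumed choice, since the slides do not name a specific test.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(test, control):
    """Difference in means divided by its standard error (Welch's t statistic)."""
    diff = mean(test) - mean(control)
    # variance() is the sample variance; each term is one group's squared standard error.
    se = sqrt(variance(test) / len(test) + variance(control) / len(control))
    return diff / se

# Hypothetical expression measurements, three replicates per condition.
control = [10.0, 12.0, 11.0]
noisy_test = [9.0, 20.0, 13.0]    # mean shift of 3, large spread between replicates
stable_test = [13.0, 14.0, 15.0]  # same mean shift of 3, little spread

t_noisy = welch_t(noisy_test, control)
t_stable = welch_t(stable_test, control)
print(f"noisy: t = {t_noisy:.2f}, stable: t = {t_stable:.2f}")
```

Both test groups differ from the control by the same amount on average, but only the stable one stands out clearly from the replicate-to-replicate noise.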
Methods For Measuring Gene Expression (all genes in the cell) Microarrays (older, cheaper) Sequence-based measurement: RNA-Seq (replacing microarrays)
Expression microarray movie DNA microarray chip animation: http://www.bio.davidson.edu/Courses/genomics/chip/chip.html
What are expression microarrays?
Expression microarrays – “physical appearance”
cDNA preparation
Expression assay
Chip readout – absolute expression and ratio
Chip readout – relative transcription
Chip readout – example
Time course experiments Experiment: measuring gene expression as oxygen gets depleted in yeast grown in a closed container
Time course data
Data analysis – normalization: balance the fluorescent intensities of the two dyes; adjust for differences in experimental conditions
Normalization
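Balancing the two dye channels can be sketched as a simple rescaling. The example below assumes global median scaling, which is one common approach and not necessarily the method used in this course; the spot intensities and the function name `balance_channels` are hypothetical.

```python
from statistics import median

def balance_channels(red, green):
    """Scale the green channel so its median matches the red channel's median."""
    factor = median(red) / median(green)
    return [g * factor for g in green], factor

# Hypothetical raw intensities for the same five spots in each dye channel.
red = [200.0, 400.0, 800.0, 1600.0, 3200.0]
green = [100.0, 210.0, 400.0, 790.0, 1500.0]

green_scaled, factor = balance_channels(red, green)
# After scaling, red/green ratios are centered near 1 instead of near 2.
ratios = [r / g for r, g in zip(red, green_scaled)]
print(factor, ratios)
```

Without this step, every gene would appear roughly two-fold higher in the red channel simply because that dye gives a stronger signal in this made-up data.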
Log2 transformation A doubling and a halving of expression now have the same magnitude (+1 and -1)
Clustering – intro Why: if the expression pattern of gene B is similar to that of gene A, they may be involved in the same or a related pathway. How: re-order the expression vectors in the data set so that similar patterns are together.
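A short illustration of why the log2 scale is convenient, using hypothetical test/control ratios:

```python
from math import log2

# Hypothetical test/control expression ratios.
ratios = {"doubled": 2.0, "unchanged": 1.0, "halved": 0.5}

log_ratios = {name: log2(r) for name, r in ratios.items()}
print(log_ratios)  # {'doubled': 1.0, 'unchanged': 0.0, 'halved': -1.0}
```

On the raw scale, a doubling (2.0) lies twice as far from 1 as a halving (0.5); on the log2 scale they are symmetric around 0.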
Clustering – numerical
Clustering – visual
Hierarchical clustering: pair-wise similarity
Hierarchical clustering: cluster construction
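The two hierarchical clustering steps above can be sketched as follows. This is a minimal illustration under stated assumptions: Pearson correlation as the pair-wise similarity and average linkage for merging clusters are choices made here, not taken from the slides, and the expression profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def hierarchical_clusters(profiles):
    """Repeatedly merge the two most similar clusters; return the merge order."""
    clusters = [(name,) for name in profiles]
    merges = []

    def similarity(c1, c2):
        # Average linkage: mean pair-wise correlation between cluster members.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda p: similarity(*p))
        merges.append((c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical expression profiles over four time points.
profiles = {
    "gene A": [1.0, 2.0, 3.0, 4.0],
    "gene B": [1.1, 2.1, 2.9, 4.2],
    "gene C": [4.0, 3.0, 2.0, 1.0],
    "gene D": [3.9, 3.2, 1.8, 1.1],
}

merges = hierarchical_clusters(profiles)
for left, right in merges:
    print(left, "+", right)
```

Genes A and B (both rising) are merged first, then C and D (both falling), and the two groups are joined only in the final step.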
Clustering – large example
Application of microarrays: classification of cancers
RNA-Seq
Measuring Gene Expression With RNA-Seq Align reads (e.g. ACCCAATTTTCTGAAAATATCCGTGTCTTCCAG) to the genome Count the number of reads that align uniquely within the regions of annotated genes (Shotgun, Cap-Trap, SAGE) [Figure: reads aligned to the genome within two annotated gene regions]
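The counting step can be sketched in a few lines. Everything in this example is hypothetical: the gene coordinates, the alignment records, and the rule that a read must lie entirely inside a gene to be counted (real pipelines differ in how they treat partial overlaps).

```python
# Hypothetical annotation: gene name -> (start, end), 1-based inclusive coordinates.
genes = {"gene1": (100, 500), "gene2": (800, 1200)}

# Hypothetical alignments: (read start, read end, number of genome locations matched).
alignments = [
    (120, 152, 1),
    (450, 482, 1),
    (490, 522, 1),  # runs past the end of gene1: not counted
    (900, 932, 1),
    (950, 982, 3),  # aligns to three locations: not unique, not counted
    (600, 632, 1),  # falls between the two genes
]

def count_reads(genes, alignments):
    """Count uniquely aligned reads that fall entirely within each gene."""
    counts = {name: 0 for name in genes}
    for start, end, n_locations in alignments:
        if n_locations != 1:
            continue  # skip reads that do not align uniquely
        for name, (g_start, g_end) in genes.items():
            if g_start <= start and end <= g_end:
                counts[name] += 1
    return counts

counts = count_reads(genes, alignments)
print(counts)  # {'gene1': 2, 'gene2': 1}
```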
You get something like this
         Rep1  Rep2  Rep3
Gene A      5     3    12
Gene B     16    25    35
Gene C     10    15     3
Gene D      3    15     8
Gene E   1504  1005  1030
* Skewed distribution
* Genes bounce around in replicates
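Before replicates are compared, raw counts are usually put on a common scale. The sketch below applies counts-per-million scaling to the replicate counts shown above; this particular normalization is an assumption made for illustration, not necessarily the one used in the course.

```python
# Read counts per gene in three replicates (from the slide above).
counts = {
    "Gene A": [5, 3, 12],
    "Gene B": [16, 25, 35],
    "Gene C": [10, 15, 3],
    "Gene D": [3, 15, 8],
    "Gene E": [1504, 1005, 1030],
}

# Total number of counted reads in each replicate (column sums).
totals = [sum(column) for column in zip(*counts.values())]

# Counts per million: each count divided by its replicate total, times one million.
cpm = {
    gene: [c / t * 1e6 for c, t in zip(row, totals)]
    for gene, row in counts.items()
}

for gene, values in cpm.items():
    print(gene, [round(v) for v in values])
```

Gene E accounts for almost all reads in every replicate, which illustrates the skewed distribution, and the low-count genes still vary several-fold between replicates after scaling.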
Gene Ontology Enrichment Ontology: an explicit formal specification of how to represent objects, concepts and other entities. Gene Ontology: a structured vocabulary used to describe aspects of the cell; arranged in a hierarchy; maintained by curators.
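Enrichment asks whether a Gene Ontology term appears in a gene list more often than chance would predict. The sketch below uses a one-sided hypergeometric test, a commonly used choice that is assumed here because the slides do not specify a test; the function name and all numbers are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, selected, overlap):
    """Probability of at least `overlap` annotated genes among `selected` genes
    drawn without replacement from `population` genes, `annotated` of which
    carry the term."""
    upper = min(annotated, selected)
    hits = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return hits / comb(population, selected)

# Hypothetical example: 20 genes measured, 5 carry the term,
# 4 are called differentially expressed, and 3 of those 4 carry the term.
p = enrichment_p_value(population=20, annotated=5, selected=4, overlap=3)
print(f"p = {p:.4f}")
```

A small p-value suggests the term is over-represented in the selected genes. In practice many terms are tested at once, so a multiple-testing correction would also be needed.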
Matlab example: Analyzing Gene Expression Profiles
Matlab example: Gene Ontology Enrichment in Microarray Data
Typical Steps in a Gene Expression Experiment: Isolate RNA from several biological replicates (see slides 3-4). Microarray: label; hybridize to microarray; image microarray; normalize data. RNA-Seq: sequence RNA; align to genome; count aligned reads; normalize data between samples. Analysis (independent of measurement technique): call differential gene expression; clustering; Gene Ontology enrichment.