Proteome and Gene Expression Analysis Chapters 15 & 16
The Goals Functional Genomics: –To know when, where and how much genes are expressed. –To know when, where, what kind and how much of each protein is present. Systems Biology: –To understand the transcriptional and translational regulation of RNA and proteins in the cell.
Genes and Proteins First, we’ll talk about how to find out what genes are being transcribed in the cell. –This is often referred to (somewhat misleadingly) as gene “expression”. Second, we’ll look at measuring the levels of proteins in the cell. –The real “expression” of protein coding genes… Third, we’ll talk about how we process and analyze the raw data using bioinformatics.
Review: Gene Arrays Put a bunch of different, short single-stranded DNA sequences at predefined positions on a substrate. Let the unknown mixture of tagged DNA or RNA molecules hybridize to the DNAs. Measure the amount of hybridized material.
Getting the Data
Getting Protein Expression Data To be able to understand protein expression, we need the concentrations of all proteins (the “proteome”) in different cell and tissue types under varying conditions. Large scale identification of proteins is much more limited than for RNA. –Nothing really equivalent to RNA expression microarrays or high-throughput sequencing exists yet. Relatively low-throughput technologies are all that we have right now.
Measuring Protein Expression In order to measure all the types of protein in a cell we must –Extract the proteins –Purify the proteins –Identify the individual proteins How do we accomplish purification and identification of proteins?
The Technologies: Protein Expression Low-throughput –2D Gel Electrophoresis + Mass Spectrometry –Liquid chromatography + Mass Spectrometry Protein microarrays –Limited in application at this point –Can be used for things other than protein expression like protein-protein interactions
Extracting the Proteins First, the proteins are extracted from the cells using lysis. –This involves a detergent that destroys the membranes of the cell.
Separating the Proteins: 2D Gel Electrophoresis First step: pI/pH –Proteins are introduced to a gel with an immobilized pH gradient. –A voltage is applied. –Proteins migrate until the pH causes them to lose their net charge (isoelectric point) and then stop. Second step: mass –First gel transferred to second gel –SDS (detergent) breaks structure and gives the proteins a charge proportional to their mass, so they separate by mass.
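The first-dimension separation can be illustrated with a rough estimate of where a protein would stop in the pH gradient. The sketch below computes the net charge of a sequence as a function of pH and finds the pH at which it is zero by bisection. The pKa values are approximate, illustrative assumptions (not a reference table), and the function names are made up for this example; real proteins deviate because of folding and modifications.

```python
# Approximate pKa values (illustrative assumptions, not a reference table).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence, ph):
    """Estimated net charge of a protein sequence at the given pH."""
    # Fraction of the N-terminus that is protonated (+1).
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    # Fraction of the C-terminus that is deprotonated (-1).
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH where the net charge is zero, by bisection.

    Net charge decreases monotonically with pH, so bisection on [0, 14] works.
    """
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at a higher pH
        else:
            high = mid  # negative (or zero): the pI lies at a lower pH
    return (low + high) / 2.0
```

Under these assumptions, a lysine-rich sequence ends up at the basic end of the gradient and an aspartate-rich one at the acidic end, which is the behavior the first gel dimension exploits.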
Using the 2D Gel Staining makes the spots containing the individual (we hope) proteins visible. –The gel is photographed. –Protein level (concentration) can be estimated by image processing. Individual, stained spots can be cut out for evaluation by Mass Spectrometry.
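As a minimal sketch of how image processing might estimate protein level from a gel photograph: threshold the image, group neighboring above-threshold pixels into spots, and sum the intensity above background in each spot. The sketch assumes the image is a list of rows in which higher values mean more stain, and that a single global threshold separates spots from background; the function name `quantify_spots` is hypothetical. Real gel software uses more robust segmentation and background models.

```python
from collections import deque


def quantify_spots(image, threshold):
    """Return (pixel_count, integrated_intensity) for each connected spot.

    Pixels with intensity above `threshold` are treated as stained; pixels
    touching horizontally or vertically belong to the same spot.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    spots = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from this unvisited spot pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels, volume = 0, 0.0
            while queue:
                y, x = queue.popleft()
                pixels += 1
                volume += image[y][x] - threshold  # intensity above background
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            spots.append((pixels, volume))
    return spots
```

The integrated intensity (sometimes called spot volume) serves as a relative measure of protein amount; very small spots, such as dust specks, could be filtered out by pixel count.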
“Two channel” 2D Gels Low signal-to-noise is a problem with protein gels, as it is with RNA expression arrays. –A similar trick of putting two cell lysates (samples) on one gel can help. –Registration problems and sample-dependent effects are thereby minimized. However, 1-channel gels allow comparing more than two samples…
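With two samples on one gel, each spot yields a pair of intensities, and a common way to compare them is a log ratio. The sketch below is one possible treatment, not a prescribed method: it assumes matched spot intensities from the two channels, adds a small pseudocount to avoid taking the log of zero, and centers the ratios on their median on the assumption that most proteins do not change between samples.

```python
import math
from statistics import median


def log2_ratios(channel_a, channel_b, pseudocount=1.0):
    """Median-centered log2 ratios of matched spot intensities (A over B)."""
    raw = [
        math.log2((a + pseudocount) / (b + pseudocount))
        for a, b in zip(channel_a, channel_b)
    ]
    center = median(raw)  # assumes the typical spot is unchanged
    return [value - center for value in raw]
```

A value of +1 then means a spot is about twice as intense in sample A as the typical spot, and -1 means about half.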
Protein Identification Using Mass Spectrometry
Steps of Mass Spectrometry Digest: –Sample (spot) is digested with a proteolytic enzyme Spectrum: –Peaks correspond to the mass-to-charge ratio of peptide fragments –These provide a fingerprint Identify: –Compare fingerprint to theoretical fingerprints –Post-translational modifications change peptide masses and complicate the matching.
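The digest, spectrum and identify steps can be sketched in code as follows. This is a simplified illustration under several assumptions that are not in the slides: the enzyme is trypsin (cleaving after K or R, except before P), observed peaks have already been converted to neutral peptide masses, and missed cleavages and post-translational modifications are ignored. The function names and the tiny database in the usage note are hypothetical.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def fingerprint_score(observed_masses, protein, tolerance=0.5):
    """Count observed masses that match a theoretical peptide within tolerance."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed_masses
    )


def identify(observed_masses, database, tolerance=0.5):
    """Return the name of the database protein with the most matching peaks."""
    return max(
        database,
        key=lambda name: fingerprint_score(observed_masses, database[name], tolerance),
    )
```

For example, with `database = {"protA": "AKRPGKLM", "protB": "GGGKAAAR"}`, observed masses corresponding to the peptides `AK` and `LM` match only `protA`. A modified peptide would have a shifted mass and fail to match, which is the difficulty the slide mentions.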
Spectrum: Protein Fingerprint
Tricks: Protein “chips” If you had an antibody to every possible protein and could put it on a chip, and you could label the proteins in your sample, you would have something equivalent to an RNA expression microarray. –Getting reliable antibodies is difficult and expensive. –Arrays with 500 to 2000 proteins are available commercially; Clontech, Eurogentec, Arrayit etc.
Not part of this subject, but cool…
Protein Arrays for Measuring Protein-Protein Interactions You can synthesize proteins from DNA directly on a substrate. –Nano-well approach –“Printing” approach: DAPA (DNA to Protein Array) These can be used for measuring binding between proteins, but not for identification of proteins.
Next time: Analyzing Gene and Protein Expression Data Gene expression clustering Protein Expression Clustering