Published by Claribel Lester. Modified over 9 years ago.
1
Analysis of Gene Networks and Signaling Pathways Based on Gene Expression and Proteome Data Marek Kimmel Rice University Houston, TX, USA kimmel@rice.edu
2
Outline
- Basics: gene expression vs. protein abundance.
- Perceptron analysis of gene networks.
- Proteomic analysis of FGF-2 signaling in breast cancer.
3
Now that we have the sequence of the Human Genome – What Next?
4
Clinical Sciences, Basic Sciences, Molecular Medicine, Structural Biology, Genomics, Proteomics, Bioinformatics
5
BCM-HGSC Genes make up only 3% of the genome 30,000
6
Measuring Gene Expression: Oligonucleotide Gene Microarrays (Affymetrix GeneChips™)
- A Probe Pair consists of a Perfect Match (PM) and a Mismatch (MM).
- There are typically 20 Probe Pairs in a Probe Set.
- A Probe Set usually corresponds to a single gene.
- The Affymetrix 95A human GeneChip contains 12,626 Probe Sets.
- Thus, there are roughly 500,000 Probe Cells on a GeneChip (12,626 × 20 × 2 ≈ 505,000).
7
Oligonucleotide Gene Microarrays Each probe is 25 nucleotides long Affymetrix GeneChips ™
8
mRNA Preparation
DNA Sequence for IL-8 (5’ to 3’), shown split into 25-nucleotide pieces:
GAATTCAGTAACCCAGGCATTATTT|TATCCTCAAGTCTTAGGTTGGTTGG|AGAAAGATAACAAAAAGAAACATGA|
TTGTGCAGAAACAGACAAACCTTTT|TGGAAAGCATTTGAAAATGGCATTC|CCCCTCCACAGTGTGTTCACAGTGT|
GGGCAAATTCACTGCTCTGTCGTAC|TTTCTGAAAATGAAGAACTGTTACA|CCAAGGTGAATTATTTATAAATTAT|
GTACTTGCCCAGAAGCGAACAGACT|TTTACTATCATAAGAACCCTTCCTT|GGTGTGCTCTTTATCTACAGAATCC|
AAGACCTTTCAAGAAAGGTCTTGGA|TTCTTTTCTTCAGGACACTAGGACA|TAAAGCCACCTTTTTATGATTTGTT|
GAAATTTCTCACTCCATCCCTTTTG|CTGATGATCATGGGTCCTCAGAGGT|CAGACTTGGTGTCCTTGGATAAAGA|
GCATGAAGCAACAGTGGCTGAACCA|GAGTTGGAACCCAGATGCTCTTTCC|ACTAAGCATACAACTTTCCATTAGA|
TAACACCTCCCTCCCACCCCAACCA|AGCAGCTCCAGTGCACCACTTTCTG|GAGCATAAACATACCTTAACTTTAC|
AACTTGAGTGGCCTTGAATACTGTT|CCTATCTGGAATGTGCTGTTCTCTT
- Chop into short pieces suitable for hybridizing to 25mers on GeneChip.
- Attach chromophore, then inject onto the GeneChip.
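The chopping step on this slide can be illustrated with a minimal sketch. It assumes fixed, non-overlapping 25-nucleotide windows, as drawn on the slide; the function name `chop_into_kmers` is hypothetical, and real sample fragmentation does not produce such regular pieces.

```python
def chop_into_kmers(sequence: str, k: int = 25) -> list[str]:
    """Split a sequence into consecutive, non-overlapping pieces of length k.

    Whitespace is ignored; the final piece is kept even if it is shorter than k.
    """
    seq = "".join(sequence.split()).upper()
    return [seq[i:i + k] for i in range(0, len(seq), k)]


# First 75 nucleotides of the IL-8 sequence shown on the slide.
il8_start = (
    "GAATTCAGTAACCCAGGCATTATTT"
    "TATCCTCAAGTCTTAGGTTGGTTGG"
    "AGAAAGATAACAAAAAGAAACATGA"
)

pieces = chop_into_kmers(il8_start)
for piece in pieces:
    print(piece, len(piece))
```

Each printed piece has the same length as a probe on the chip, which is what makes it suitable for hybridization to a 25mer.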
9
Affymetrix Hybridization
[Figure: a PM probe cell and an MM probe cell, each holding many copies of one 25mer.]
- PM probe: AGTCGGATTAAGCGCTATACGGTTC
- MM probes (middle base changed): AGTCGGATTAAGAGCTATACGGTTC, AGTCGGATTAAGTGCTATACGGTTC, AGTCGGATTAAGGGCTATACGGTTC
- Labeled target fragments: TCAGCCTAATTCGCGATATGCCAAG
10
[Figure: the same PM and MM probe cells with target fragments bound.]
- PM probe: AGTCGGATTAAGCGCTATACGGTTC
- MM probes (middle base changed): AGTCGGATTAAGAGCTATACGGTTC, AGTCGGATTAAGTGCTATACGGTTC, AGTCGGATTAAGGGCTATACGGTTC
- Target: TCAGCCTAATTCGCGATATGCCAAG
- Match: the PM probe forms a duplex with the complementary strand.
- Mismatch! The MM probe does not pair at the middle base (marked X).
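The match/mismatch distinction can be checked directly on the sequences from the slide. This is a minimal sketch: it compares probe and target position by position, as they are drawn on the slide, and ignores strand orientation and hybridization chemistry. The helper names are hypothetical.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def count_mismatches(probe: str, target: str) -> int:
    """Count positions where the target base does not pair with the probe base."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    return sum(1 for p, t in zip(probe, target) if COMPLEMENT[p] != t)


pm_probe = "AGTCGGATTAAGCGCTATACGGTTC"  # Perfect Match probe from the slide
mm_probe = "AGTCGGATTAAGAGCTATACGGTTC"  # one of the Mismatch probes from the slide
target = "TCAGCCTAATTCGCGATATGCCAAG"    # labeled target fragment from the slide

print(count_mismatches(pm_probe, target))  # 0: full duplex
print(count_mismatches(mm_probe, target))  # 1: the middle base does not pair
```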
11
Probe Cell Intensities
Average Difference = Σ(PM – MM) / (number of Pairs in Average)
Example value shown on the slide: 1,662
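A minimal sketch of the Average Difference calculation, assuming every probe pair supplied is included in the average. The slide's "Pairs in Average" wording suggests that only a subset of pairs may be counted in practice; how that subset is chosen is not described here, so it is left out. The intensities below are made-up illustration values.

```python
def average_difference(pm: list[float], mm: list[float]) -> float:
    """Mean of (PM - MM) over the probe pairs of one probe set."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("need equally long, non-empty PM and MM lists")
    return sum(p - m for p, m in zip(pm, mm)) / len(pm)


# Hypothetical intensities for a probe set with four probe pairs.
pm_intensities = [1200.0, 950.0, 1800.0, 400.0]
mm_intensities = [300.0, 250.0, 500.0, 380.0]

avg_diff = average_difference(pm_intensities, mm_intensities)
print(avg_diff)  # (900 + 700 + 1300 + 20) / 4 = 730.0
```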
12
Measuring Gene Expression “Spotted DNA Microarrays” Each spot is the cDNA for a specific gene. RNA from the experimental sample is labeled with Cy5 red fluorescent dye. RNA from the reference sample is labeled with Cy3 green fluorescent dye. Fluorescent intensity ratios (Cy5/Cy3) are measured. http://www.microarrays.org/software.html http://rana.lbl.gov/ http://www.bioinfo.utmb.edu/
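The Cy5/Cy3 intensity ratio for a spot can be sketched as below. Taking the base-2 logarithm is a common convention, assumed here rather than stated on the slide, that makes two-fold up and two-fold down symmetric around zero. The function name and intensities are hypothetical.

```python
import math


def log2_ratio(cy5: float, cy3: float) -> float:
    """log2 of the experimental (Cy5, red) over reference (Cy3, green) intensity."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


print(log2_ratio(2000.0, 500.0))  # 2.0: four-fold higher in the experimental sample
print(log2_ratio(500.0, 2000.0))  # -2.0: four-fold lower
print(log2_ratio(800.0, 800.0))   # 0.0: unchanged
```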
13
Where Do We Get the Data? (cDNA Gene Microarray)
Disease, pathogens, drugs, etc. → mRNA expressed in response to stimulus → mRNA collected and hybridized onto microarray → microarray analyzed for spot intensities → gene co-expression patterns
14
Method
- Get mRNA samples from multiple conditions.
- Hybridize to DNA microarrays.
- Measure intensities.
- Cluster.
- Analyze results.
- Design new experiment.
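The "Cluster" step is not specified further on the slide. As a stand-in, the sketch below groups genes whose expression profiles across conditions are highly correlated, using a simple greedy rule. The algorithm choice, the 0.9 threshold, and the gene names and values are all assumptions for illustration.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equally long profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def cluster_by_correlation(profiles: dict[str, list[float]],
                           threshold: float = 0.9) -> list[list[str]]:
    """Greedy clustering: a gene joins the first cluster whose seed gene
    (its first member) it correlates with at or above the threshold;
    otherwise it starts a new cluster."""
    clusters: list[list[str]] = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            if pearson(profile, profiles[cluster[0]]) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])
    return clusters


# Hypothetical intensities for four genes measured under four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.2],
}

clusters = cluster_by_correlation(profiles)
print(clusters)
```

Genes that rise together across conditions end up in one group and genes that fall together in another, which is the kind of co-expression pattern the later analysis steps work with.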
15
Discrimination between samples
- Green is “down”.
- Red is “up”.
- We can differentiate clearly between tumor and normal tissue.
- Can we find differences between progressing and non-progressing tumors?
16
Problematic quality of data Note the large dynamic range. And the very large number of data points. And the limited information content.
17
Proteomics
- Is to protein expression what genomics is to gene expression.
- Due to variations like post-translational modifications, there are many more proteins than genes.
18
Proteomics
- Holds new promise for the future understanding of complex biological systems.
- Post-translational modifications include:
  - Phosphorylation
  - Glycosylation
  - Oxidation
- Many challenges remain, e.g. isolating, identifying, characterizing, and quantifying small amounts of a very large number of varieties of proteins.
- Currently, we primarily use 2D gels and mass spectrometry.
19
Protein Separation Using 2D Gel Electrophoresis
- Protein analysis uses a diseased or treated sample and a control sample. 2D gel electrophoresis is performed for each sample to separate proteins based on their molecular weight and charge.
- Black marks on the gel images indicate a protein or cluster of proteins and are referred to as "features."
- The x-axis is the isoelectric point (pI), which is analogous to pH, while the y-axis is molecular weight (Mw) or size.
http://www.incyte.com/proteomics/tour/separation.shtml
20
Protein Separation
21
Protein Analysis
- Gels are fixed and stained with a fluorescent dye, then scanned.
- Expression levels are measured based on the size of each feature on the gel.
- Provides information about those proteins which are up- and down-regulated, including how their abundance changed.
http://www.incyte.com/proteomics/tour/analysis.shtml
22
Protein Analysis http://www.incyte.com/proteomics/tour/analysis.shtml
23
Protein Characterization
- Proteins are excised from the gel and treated with a succession of enzymes that cut amino acid chains into short polypeptides about 5-10 amino acids in length.
- The polypeptide fragments for each protein are then separated by capillary electrophoresis and analyzed using rapid-throughput mass spectrometry.
- At this point, we know the amino acid sequence of the polypeptide fragments, their mass, as well as post-translational modifications that occurred such as glycosylation and phosphorylation.
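The enzymatic cutting step can be sketched in silico. The slide does not name the enzymes; the sketch assumes trypsin (mentioned in the Figure 2 caption later in this deck) and the usual textbook rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The protein sequence is a made-up example.

```python
def trypsin_digest(protein: str) -> list[str]:
    """Cut a protein after every K or R that is not followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing peptide that does not end in K or R.
        peptides.append(protein[start:])
    return peptides


# Hypothetical sequence in single-letter amino acid code.
peptides = trypsin_digest("MKWVTFISLLRPGAKDEHR")
print(peptides)  # ['MK', 'WVTFISLLRPGAK', 'DEHR']
```

The R followed by P is not cut, so the middle peptide spans it; the resulting fragments are what would go on to separation and mass spectrometry.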
24
Protein Characterization
25
Systems Biology
- Consolidates genomics and proteomics differential expression data into a systematic description of pathways:
  - Signaling pathways.
  - Inflammatory response pathways.
  - Metabolic pathways.
  - Etc.
- Potential for understanding the interrelationships between genes, proteins, and disease and identifying potential therapeutic targets.
26
Gene Expression vs. Protein Abundance
- What exactly are we measuring?
- What is the relationship between “level of gene expression” and “abundance of proteins”?
27
Dogma of Molecular Biology
28
Balance equations In the steady state, for a given gene i
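A hedged sketch of the kind of balance this slide refers to, assuming first-order elimination of the product (the assumption questioned on the next slide). The symbols $m_i$ (expression level of gene $i$), $p_i$ (abundance of its protein), $k_i$ (synthesis rate per unit of expression) and $\delta_i$ (elimination rate) are illustrative choices, not taken from the slide:

```latex
\frac{dp_i}{dt} = k_i\, m_i - \delta_i\, p_i = 0
\quad\Longrightarrow\quad
p_i = \frac{k_i}{\delta_i}\, m_i
```

Under these assumptions abundance is proportional to expression, but the proportionality constant $k_i/\delta_i$ differs from gene to gene.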
29
Complicating Factors
- For any gene, product (protein) abundance is not necessarily proportional to the relative expression level, even under “steady state”.
- Products do not follow first-order elimination kinetics. Instead they enter into complicated interactions with each other and with external factors.
30
Application: Identification of Gene Networks
General ideas:
- Level of expression of a gene affects levels of expression of other genes.
- Only three levels possible: Normal (0), Over-expression (1), Under-expression (-1).
- Data: Arrays of perturbed expression levels in a set of genes.
- Model: Perceptron (simplest neural net).
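Reducing measured expression to the three levels above can be sketched as follows. The two-fold cutoff and the use of a sample/reference ratio are assumptions for illustration; the slide does not say how the levels are assigned.

```python
import math


def quantize_expression(ratio: float, fold_threshold: float = 2.0) -> int:
    """Map a sample/reference expression ratio to 1 (over), -1 (under) or 0 (normal)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    log_ratio = math.log2(ratio)
    cutoff = math.log2(fold_threshold)
    if log_ratio >= cutoff:
        return 1   # over-expression
    if log_ratio <= -cutoff:
        return -1  # under-expression
    return 0       # normal


levels = [quantize_expression(r) for r in [4.0, 1.2, 0.25]]
print(levels)  # [1, 0, -1]
```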
31
Reference Kim et al. (2000) “General nonlinear framework for the analysis of gene interaction via multivariate expression arrays” Journal of Biomedical Optics 5, 411– 424
33
Data table
Perceptron function: g(.) is sigmoidal, X’s and Y quantized to 3 levels
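A minimal sketch of a three-level perceptron in the spirit of this slide: a weighted sum of quantized predictor levels X is passed through a sigmoidal g and the result is quantized to -1, 0 or 1. The use of tanh for g, the 0.5 cutoff, the bias term and the coefficient values are assumptions; the exact form used by Kim et al. is not given in this transcript.

```python
import math


def quantize(value: float, cutoff: float = 0.5) -> int:
    """Map a value in (-1, 1) to one of the three levels -1, 0, 1."""
    if value >= cutoff:
        return 1
    if value <= -cutoff:
        return -1
    return 0


def perceptron_predict(x: list[int], a: list[float], bias: float = 0.0) -> int:
    """Predict the level of a target gene Y from predictor gene levels x."""
    z = bias + sum(ai * xi for ai, xi in zip(a, x))
    return quantize(math.tanh(z))  # tanh plays the role of the sigmoidal g


# Hypothetical coefficients: gene 1 activates the target, gene 2 represses it.
a = [1.0, -1.0]

print(perceptron_predict([1, -1], a))  # 1: activator up, repressor down
print(perceptron_predict([1, 1], a))   # 0: the two effects cancel
print(perceptron_predict([-1, 1], a))  # -1: activator down, repressor up
```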
34
Training: Estimating coefficients a so that the coefficient of determination is maximized.
Of all possible dependencies, only those with a coefficient of determination above a threshold are retained.
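The selection criterion can be sketched as below. The formula is an assumption, since the symbol and definition were lost from the slide: one common definition compares the error of the best predictor that ignores the X's (here, always guessing the most frequent level of Y) with the error of the trained predictor, and reports the relative reduction. Misclassification rate is used as the error measure, and the 0.7 threshold and data are hypothetical.

```python
from collections import Counter


def coefficient_of_determination(y_true: list[int], y_pred: list[int]) -> float:
    """Relative error reduction of the predictor over the best constant guess."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("need equally long, non-empty lists")
    # Error of the best constant predictor: always guess the most common level.
    most_common_count = Counter(y_true).most_common(1)[0][1]
    baseline_error = (n - most_common_count) / n
    if baseline_error == 0:
        return 0.0  # nothing left to explain
    model_error = sum(1 for t, p in zip(y_true, y_pred) if t != p) / n
    return (baseline_error - model_error) / baseline_error


# Hypothetical observed and predicted levels of a target gene in eight arrays.
observed = [1, 1, 0, -1, 0, 1, -1, 0]
predicted = [1, 1, 0, -1, 0, 1, 0, 0]

theta = coefficient_of_determination(observed, predicted)
print(theta)         # baseline error 5/8, model error 1/8
print(theta >= 0.7)  # keep this dependency only if it clears the threshold
```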
36
Application: FGF-2 Signaling Pathways and Breast Cancer
General ideas:
- Use 2-D protein gels and mass spectrometry to measure abundance changes of proteins in cancer cells, relative to normal cells.
- Use perturbed systems to draw conclusions on some specific signaling pathways.
- Example: Signaling pathways of one of the fibroblast growth factors (FGF-2) in breast cancer.
37
Reference Hondermarck et al. (2001) “Proteomics of breast cancer for marker discovery and signal pathway profiling” Proteomics 1, 1216–1232
38
Figure 2. Silver stained 2-DE profile of MCF-7 breast cancer cells. The major proteins were determined by MALDI-TOF and MS/MS after trypsin digestion.
39
Figure 3 MALDI-TOF and MS/MS spectra obtained for HSP70. (A) MALDI-TOF and (B) MS/MS analysis of peak m/z 1488.5 was performed. The letters labeling the peaks are the single letter code for the amino acids identified by MS/MS. Database searching allowed the identification of HSP70.
40
Figure 5. 2-D patterns showing the down-regulation of 14-3-3 sigma (indicated by an arrow) in seven representative breast tumor samples (C–I).
41
Design of experiments
- Previously depicted: “abundance proteomics”, no clues as to how things work.
- “Functional proteomics”:
  - Use perturbations of the hypothetical causal factor.
  - Measure not simply abundance but characteristics indicating, e.g., synthesis rates and activation.
43
Figure 7. Changes of protein synthesis induced by FGF-2 stimulation in MCF-7 breast cancer cells. 35S-labeled proteins from unstimulated (A, C) or stimulated (B, D) MCF-7 cells were separated by 2-DE and 2-D gels were subjected to autoradiography.
45
Credits
- Bruce Luxon (UTMB, Galveston, TX)
- George Weinstock (BCM, Houston, TX)
- Guy de Maupassant [“three major virtues of a French writer: clarity, clarity, and clarity”]