Published by Kathryn West. Modified over 6 years ago.
1
Proteomics technologies and protein-protein interaction
Lars Kiemer, Center for Biological Sequence Analysis, The Technical University of Denmark. Advanced bioinformatics, November 2005.
2
Outlining the problem
Around 30% of human proteins still have no annotated function. Even when the function is known, we often know nothing about the big picture (regulation? multiple functions? pathogenesis? mutations? splice variants?). In fact, individual proteins are about as interesting as bricks in a wall; what we want to know about is the system.
3
Example: signal transduction cascade
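[Figure: signal transduction cascade drawn across three compartments (extracellular, cytoplasm, nucleus). Labelled components: NCAM, CB1, FGFR, Frs2, Grb2, Shc, Sos, Ras, Raf, bRaf, Rap1, MEK, MAPK, PLC, DAGL, Ca2+, PKC, PKA, CaMKII, Fyn, Fak, CREB, C-Fos, GAP43.]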
4
Example: signal transduction cascade
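[Figure: the same cascade with additional detail, drawn across the extracellular, cytoplasm and nucleus compartments. Labelled components: NCAM, CB1, FGFR, 2-AG, DAG, PIP2, IP3, cAMP, Frs2, Grb2, Shc, Sos, Ras, Raf, bRaf, Rap1, MEK, MAPK, PLC, DAGL, Ca2+, PKC, PKA, CaMKII, Fyn, Fak, CREB, C-Fos, GAP43, Transcription.]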
5
Obtaining data
High-throughput data can provide information about interactions with other proteins, protein abundance in different tissues, transcriptional regulation, etc. High-throughput experimental techniques produce data sets so large that manual curation is not possible. These data sets often contain false positives, but combining several such data sets increases confidence.
6
Protein interactions reveal a lot!
Hints of the function of a protein are revealed when its interaction partners are known. Guilt by association! Complexes in which none of the interaction partners have known functions are even more interesting.
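The guilt-by-association idea can be sketched as a majority vote over the annotated interaction partners of an uncharacterised protein. This is only an illustration: the protein names, the annotations and the `predict_function` helper are hypothetical and are not taken from the presentation.

```python
from collections import Counter

# Hypothetical toy network: protein -> set of interaction partners.
interactions = {
    "P1": {"P2", "P3", "P4"},
    "P5": {"P6"},
}

# Hypothetical functional annotations; P1, P5 and P6 are uncharacterised.
annotations = {"P2": "DNA repair", "P3": "DNA repair", "P4": "translation"}


def predict_function(protein, interactions, annotations):
    """Return the most frequent function among annotated partners, or None."""
    votes = Counter(
        annotations[partner]
        for partner in interactions.get(protein, ())
        if partner in annotations
    )
    if not votes:
        return None  # no annotated partner, so no hint is available
    return votes.most_common(1)[0][0]


print(predict_function("P1", interactions, annotations))
print(predict_function("P5", interactions, annotations))
```

A vote like this only gives a hint; a protein whose partners are all unannotated gets no prediction, which matches the remark that such complexes are the most interesting ones to follow up.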
7
Yeast-two-hybrid screening
Has been widely used.
Detects only binary interactions.
High false-positive rate.
Proteins must be able to enter the nucleus.
8
Affinity purification
Large-scale.
Can be done on any preparation of cells.
Often whole complexes are purified, so the order of binding is not obtained.
An extra step is needed to identify the purified proteins.
9
Mass spectrometer
3 principal components:
Ion source: converts the analyte into gas-phase ions.
Mass analyzer(s): separate gas-phase ions by m/z (figure labels: Q1, q2, TOF).
Detector: ions are detected as they discharge on the detector.
10
Mass spectrometry in short
Extremely sensitive.
Mass precision of one atom.
In principle, detection of a single, relatively short peptide allows for unambiguous identification.
Some proteins are difficult to chop up with proteases.
Some peptides are very difficult to ionize.
Because the method is so sensitive, contamination is difficult to avoid.
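To make the protease and m/z points concrete, here is a minimal sketch of an in-silico digest followed by a mass calculation. It assumes trypsin as the protease and a simplified cleavage rule (cut after K or R unless the next residue is P); the slides do not name a protease, and the function names and test sequence are hypothetical. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-terminus)
PROTON = 1.007276  # mass of each charge-carrying proton


def tryptic_digest(sequence):
    """Assumed rule: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mass_to_charge(peptide, charge):
    """m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


for pep in tryptic_digest("AKRPGKC"):  # hypothetical sequence
    print(pep, round(peptide_mass(pep), 4), round(mass_to_charge(pep, 2), 4))
```

A protein with few K or R residues yields few, long peptides under this rule, which is one reason some proteins are hard to cover.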
11
Protein interaction databases: Spoke/Matrix
Affinity pulldown: bait and prey. Figure panels: Spoke, Matrix, Truth?

The organisation level of medical research has, through the centuries, moved from the scale of anatomy and physiology to the scale of molecular interactions. Functional genomics is a discipline in medical research that tries to decipher how genetic information is coordinated in time and space, in such a manner that sequence-based information generates function. Or, to rephrase: what is it, on a molecular, biological and genomic level, that separates individuals affected by a given disease from healthy individuals? If answered, this question can lead to the discovery of biological and pathway information and, in the best case, to a cure for the particular disorder studied. The key to answering it is finding the genes in which mutations lead to a particular disease. Once found, a number of methods can be applied to investigate the genes experimentally in an attempt to link each gene to a biological function in time and space. For these reasons, finding disease genes has always been a field of interest, and after the publication of the complete human genomic sequence in 2001 (Woods, Young et al. 1999; Venter, Adams et al. 2001), the molecular dissection of human diseases has moved into hyperdrive. This can be seen in the catalog of human genes and genetic disorders identified over the last 3-4 years in the Online Mendelian Inheritance in Man database.

This project concerns disease gene finding in diseases exhibiting genetic heterogeneity. We have developed a new method to pinpoint disease genes in genomic intervals known to link to a particular heterogeneous disorder. The method revolves around large-scale protein-protein interaction queries, followed by sequence alignments against the linkage intervals. Any interesting findings have been reported to experimental groups and, where possible, collaborations have been established in order to verify our predictions experimentally.

BBS5 IDENTIFICATION
Comparative genomics identifies a flagellar and basal body proteome that includes the BBS5 human disease gene. Li JB, Gerdes JM, Haycraft CJ, Fan Y, Teslovich TM, May-Simera H, Li H, Blacque OE, Li L, Leitch CC, Lewis RA, Green JS, Parfrey PS, Leroux MR, Davidson WS, Beales PL, Guay-Woodford LM, Yoder BK, Stormo GD, Katsanis N, Dutcher SK. Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA.
Cilia and flagella are microtubule-based structures nucleated by modified centrioles termed basal bodies. These biochemically complex organelles have more than 250 and 150 polypeptides, respectively. To identify the proteins involved in ciliary and basal body biogenesis and function, we undertook a comparative genomics approach that subtracted the nonflagellated proteome of Arabidopsis from the shared proteome of the ciliated/flagellated organisms Chlamydomonas and human. We identified 688 genes that are present exclusively in organisms with flagella and basal bodies and validated these data through a series of in silico, in vitro, and in vivo studies. We then applied this resource to the study of human ciliation disorders and have identified BBS5, a novel gene for Bardet-Biedl syndrome. We show that this novel protein localizes to basal bodies in mouse and C. elegans, is under the regulatory control of daf-19, and is necessary for the generation of both cilia and flagella.
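The two ways of turning one affinity pulldown into binary interaction pairs can be sketched as follows. The spoke model records only bait-prey pairs, while the matrix model records every pair among the purified proteins. The function names and protein identifiers are hypothetical.

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: the bait is paired with each prey, and nothing else."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_pairs(bait, preys):
    """Matrix model: every protein in the purification is paired with every other."""
    members = {bait, *preys}
    return {frozenset(pair) for pair in combinations(sorted(members), 2)}


# Hypothetical pulldown: bait A co-purifies with B, C and D.
spoke = spoke_pairs("A", ["B", "C", "D"])
matrix = matrix_pairs("A", ["B", "C", "D"])
print(len(spoke), len(matrix))
```

For a bait with n preys the spoke model gives n pairs and the matrix model n(n+1)/2, so the matrix expansion grows quickly and is likely to include more pairs that are not in direct contact; the true set of contacts usually lies somewhere between the two.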
12
Protein interaction databases: Overlap
Articles represented in the databases (June 2005).
Database: unique article references; interaction pairs in unique references (representation)
DIP: 1.353; 5.403 (binary?)
MINT: 1.406; 5.430 (spoke)
IntAct: 355; 6.836 (spoke)
GRID: 1.232 (binary?)
BIND* (protein part): 5.733 (spoke/matrix)
HPRD: 6.989 (matrix)
*Approx. 10% of protein-protein interactions in BIND are database imports.
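One way to quantify overlap between databases is to compare the sets of article references each one cites. The sketch below uses made-up reference identifiers, not the real contents of any database, and the `pairwise_overlap` helper is hypothetical.

```python
from itertools import combinations

# Hypothetical article identifiers cited by each database (not real data).
references = {
    "DIP": {101, 102, 103},
    "MINT": {102, 103, 104},
    "IntAct": {105},
}


def pairwise_overlap(references):
    """Number of shared article references for every pair of databases."""
    return {
        (a, b): len(references[a] & references[b])
        for a, b in combinations(sorted(references), 2)
    }


overlap = pairwise_overlap(references)
unique_total = len(set().union(*references.values()))
print(overlap)
print("unique articles across all databases:", unique_total)
```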
13
Species bias in available data
A few select organisms are very well-studied, while others are not. The BIND database, species distribution (Alfarano et al., NAR, 2005):
14
Trans-organism protein interaction network
Orthologs? Orthologous genes are direct descendants of a gene in a common ancestor. Species shown: S. cerevisiae, D. melanogaster, H. sapiens (O'Brien K, Remm et al. 2005).
15
Trans-organism protein interaction network
[Figure labels: H. sapiens (mosaic); D. melanogaster (experimental); C. elegans (experimental); S. cerevisiae (experimental).]
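Transferring interactions between species through orthologs can be sketched as below: an interaction observed in a model organism is mapped onto every pair of human orthologs of the two partners. The gene identifiers, the ortholog table and the `transfer_interactions` helper are hypothetical, and the presentation's actual procedure may differ.

```python
def transfer_interactions(source_pairs, orthologs):
    """Map observed interactions onto the orthologs of both partners.

    source_pairs: iterable of (gene_a, gene_b) observed in the source organism.
    orthologs: dict mapping a source gene to a set of target-organism genes.
    """
    inferred = set()
    for gene_a, gene_b in source_pairs:
        for target_a in orthologs.get(gene_a, ()):
            for target_b in orthologs.get(gene_b, ()):
                if target_a != target_b:  # skip self-pairs
                    inferred.add(frozenset((target_a, target_b)))
    return inferred


# Hypothetical yeast interactions and yeast-to-human ortholog assignments.
yeast_pairs = [("yA", "yB"), ("yB", "yC")]
yeast_to_human = {"yA": {"hA"}, "yB": {"hB1", "hB2"}}  # yC has no ortholog

human_pairs = transfer_interactions(yeast_pairs, yeast_to_human)
print(sorted(sorted(pair) for pair in human_pairs))
```

Pairs in which either partner lacks an ortholog are dropped, so coverage of the inferred network depends directly on the quality of the ortholog assignments.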
16
Repetition of experiments adds credibility
Light blue connection: 1 experiment. Darker blue connection: more than 1 experiment, 1 organism. Purple connection: more than 1 experiment, more than 1 organism.
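The colour scheme amounts to a small classification rule over the evidence for each connection. A minimal sketch, assuming each observation is recorded as an (experiment, organism) pair; the data layout and the `classify_connection` name are assumptions.

```python
def classify_connection(observations):
    """Classify a connection by how often and where it was observed.

    observations: list of (experiment_id, organism) tuples for one protein pair.
    """
    experiments = {experiment for experiment, _ in observations}
    organisms = {organism for _, organism in observations}
    if len(experiments) > 1 and len(organisms) > 1:
        return "purple"      # repeated, in more than one organism
    if len(experiments) > 1:
        return "dark blue"   # repeated, within a single organism
    return "light blue"      # seen in a single experiment


print(classify_connection([("exp1", "yeast")]))
print(classify_connection([("exp1", "yeast"), ("exp2", "yeast")]))
print(classify_connection([("exp1", "yeast"), ("exp2", "fly")]))
```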
17
Adding co-expression data
Red connector: co-expression across 79 different tissues with a correlation coefficient above 0.7. Grey nodes: no expression data available. Su et al. profiled mRNA abundance in 79 human tissues. We mapped these genes to nodes in the protein-protein interaction network and coloured the connections between pairs of nodes whose expression profiles were correlated with a coefficient above 0.7.
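The co-expression filter can be sketched as a Pearson correlation between the tissue profiles of the two proteins in each connection, keeping pairs above 0.7. The slide does not say which correlation measure was used, so Pearson is an assumption here, and the short profiles below are invented (real profiles would have one value per tissue).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpressed_edges(edges, profiles, threshold=0.7):
    """Keep edges whose two nodes both have profiles correlated above threshold."""
    kept = []
    for a, b in edges:
        if a not in profiles or b not in profiles:
            continue  # no expression data (a grey node)
        if pearson(profiles[a], profiles[b]) > threshold:
            kept.append((a, b))
    return kept


# Hypothetical expression profiles over four tissues.
profiles = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8.5], "C": [4, 3, 2, 1]}
edges = [("A", "B"), ("A", "C"), ("A", "D")]
print(coexpressed_edges(edges, profiles))
```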
18
Nucleolus dynamics
Nodes are coloured according to the level of protein in the nucleolus following transcriptional inhibition (Andersen et al., Nature, 2005). Colour legend, relative level of protein in the nucleolus after inhibition of transcription: decreased, unchanged, increased.
19
Adding up to make high quality associations
Integration of various data sources builds up confidence
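One simple way to see why integration builds confidence is to combine per-source reliabilities under an independence assumption: if each source is right with probability p_i, the chance that at least one supporting observation is correct is 1 - prod(1 - p_i). This is an illustrative scoring rule, not the scheme used in the presentation, and the reliabilities below are invented.

```python
def combined_confidence(reliabilities):
    """Probability that at least one independent observation is correct."""
    probability_all_wrong = 1.0
    for p in reliabilities:
        probability_all_wrong *= (1.0 - p)
    return 1.0 - probability_all_wrong


# Hypothetical reliabilities for three kinds of evidence on one protein pair.
single = combined_confidence([0.5])
two = combined_confidence([0.5, 0.5])
three = combined_confidence([0.5, 0.5, 0.6])
print(single, two, round(three, 3))
```

The score can only increase as evidence is added, but the independence assumption is strong: data sets derived from the same experiments should not be counted twice.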
20
Upon integration comes enlightenment
21
Upon integration comes enlightenment
22
Identifying functional complexes
Ribosome (predominantly 60S) DNA repair SMARCA complex TFIID Arp2/3
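As a rough stand-in for complex identification, candidate complexes can be read off a high-confidence network as its connected components. Real analyses typically use finer clustering methods; the sketch below only shows the basic idea, with hypothetical protein names.

```python
from collections import defaultdict, deque


def connected_components(pairs):
    """Group proteins into connected components of the interaction graph."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), set()
        while queue:  # breadth-first search from the start node
            node = queue.popleft()
            members.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(members)
    return components


# Hypothetical high-confidence interactions.
pairs = [("A", "B"), ("B", "C"), ("D", "E")]
print(connected_components(pairs))
```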
23
Summary
Protein-protein interactions can reveal hints about the function of a protein (guilt by association). Information about protein interactions is obtained with different technologies, each with its own advantages and weaknesses. Due to the high degree of systemic conservation, interactions can be inferred from interactions observed in other species. Data are always error-prone. Repeated observations build up confidence. Integrating different types of data can further build up confidence.