Biology 224 Dr. Tom Peavy Sept 27 & 29 <Images from Bioinformatics and Functional Genomics by Jonathan Pevsner> Protein Structure & Analysis- part 2
Gene Ontology (GO) Consortium
The Gene Ontology Consortium
An ontology is a description of concepts. The GO Consortium compiles a dynamic, controlled vocabulary of terms related to gene products. There are three organizing principles:
-- Molecular function
-- Biological process
-- Cellular component
Page 241 GO terms are assigned to Entrez Gene entries
Example: gene product cytochrome c
GO entry terms:
-- molecular function = electron transporter activity
-- biological process = oxidative phosphorylation and induction of cell death
-- cellular component = mitochondrial matrix and mitochondrial inner membrane
GO consortium
-- No centralized GO database. Instead, curators of organism-specific databases assign GO terms to gene products for each organism.
-- AmiGO is the searchable portion of the GO
-- Gene symbol, name, UniProt accession numbers, and text searches can be used to find GO entries
The Gene Ontology Consortium: Evidence Codes
IC   Inferred by curator
IDA  Inferred from direct assay
IEA  Inferred from electronic annotation
IEP  Inferred from expression pattern
IGI  Inferred from genetic interaction
IMP  Inferred from mutant phenotype
IPI  Inferred from physical interaction
ISS  Inferred from sequence or structural similarity
NAS  Non-traceable author statement
ND   No biological data
TAS  Traceable author statement
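One way to picture a GO annotation is as a small record: a gene product, one of the three organizing principles, a term, and an evidence code. The sketch below is illustrative only. The `Annotation` class and the `experimentally_supported` helper are hypothetical names, and the evidence codes attached to the cytochrome c terms are made up for the example (the slides do not give them). Treating IDA, IEP, IGI, IMP and IPI as the experimental codes follows the usual GO grouping.

```python
from dataclasses import dataclass

# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "IC": "Inferred by curator",
    "IDA": "Inferred from direct assay",
    "IEA": "Inferred from electronic annotation",
    "IEP": "Inferred from expression pattern",
    "IGI": "Inferred from genetic interaction",
    "IMP": "Inferred from mutant phenotype",
    "IPI": "Inferred from physical interaction",
    "ISS": "Inferred from sequence or structural similarity",
    "NAS": "Non-traceable author statement",
    "ND": "No biological data",
    "TAS": "Traceable author statement",
}

# Codes that rest on a wet-lab experiment (the usual GO grouping).
EXPERIMENTAL = {"IDA", "IEP", "IGI", "IMP", "IPI"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record type for one GO annotation."""
    gene_product: str
    aspect: str      # molecular function / biological process / cellular component
    term: str
    evidence: str    # one of the keys of EVIDENCE_CODES

    def __post_init__(self):
        # Reject codes that are not in the controlled list.
        if self.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence}")


def experimentally_supported(annotations):
    """Keep only annotations backed by an experimental evidence code."""
    return [a for a in annotations if a.evidence in EXPERIMENTAL]


# Evidence codes below are invented for illustration, not taken from GO.
annotations = [
    Annotation("cytochrome c", "molecular function",
               "electron transporter activity", "IDA"),
    Annotation("cytochrome c", "biological process",
               "oxidative phosphorylation", "IEA"),
    Annotation("cytochrome c", "cellular component",
               "mitochondrial inner membrane", "TAS"),
]

for a in experimentally_supported(annotations):
    print(a.term, "-", EVIDENCE_CODES[a.evidence])
```

Filtering on the evidence code is a common first step when deciding how much weight to give an annotation, since IEA entries are assigned without curator review.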
Page 231 ProDom entry for HIV-1 pol shows many related proteins
Physical properties of proteins Many websites are available for the analysis of individual proteins. ExPASy and ISREC are two excellent resources. The accuracy of these programs is variable. Predictions based on primary amino acid sequence (such as molecular weight prediction) are likely to be more trustworthy. For many other properties (such as posttranslational modification of proteins by specific sugars), experimental evidence may be required rather than prediction algorithms. Page 236
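Molecular weight is a good example of a property that follows directly from the primary sequence: add up the residue masses and add one water for the free termini. The sketch below is a minimal illustration, not the method used by any particular ExPASy tool. The function name `molecular_weight` is hypothetical, the table holds standard average residue masses in daltons, and the calculation ignores posttranslational modifications, disulfide bonds and signal peptide cleavage.

```python
# Average residue masses (Da): amino acid mass minus one water.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # H at the N-terminus plus OH at the C-terminus


def molecular_weight(sequence):
    """Predict the average molecular weight (Da) of an unmodified protein."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER


# Free glycine comes out at about 75.07 Da.
print(round(molecular_weight("G"), 2))
```

A prediction like this is only as good as the assumption that the mature protein matches the translated sequence, which is why experimentally measured masses can differ from it.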
Page 235 Access a variety of protein analysis programs from the top right of the ExPASy home page
Page 244
Proteomics: High throughput protein analysis
Proteomics is the study of the entire collection of proteins encoded by a genome.
The “proteome” refers to all the proteins in a cell and/or all the proteins in an organism.
Large-scale protein analysis:
-- 2D protein gels
-- Yeast two-hybrid
-- Rosetta Stone approach
-- Pathways
Page 247
Two-dimensional protein gels
-- First dimension: isoelectric focusing
-- Second dimension: SDS-PAGE
Page 248
Two-dimensional protein gels
First dimension: isoelectric focusing
-- Electrophorese ampholytes to establish a pH gradient
-- Can use a pre-made strip
-- Proteins migrate to their isoelectric point (pI), then stop (net charge is zero)
-- Range of pI typically 4-9 (5-8 most common)
Page 248
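The statement that a protein stops where its net charge is zero can be turned into a rough pI estimate: sum the Henderson-Hasselbalch charge of every ionizable group at a given pH, then search for the pH where the total crosses zero. The sketch below is illustrative only. The function names are hypothetical, and the pKa values are one textbook-style set chosen for the example (published tables differ, and different tools give slightly different pI values as a result).

```python
# Illustrative pKa values (an assumption; published sets vary).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 2.0


def net_charge(sequence, ph):
    """Approximate net charge of an unmodified polypeptide at a given pH."""
    # Basic groups carry +1 when protonated.
    positive = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    # Acidic groups carry -1 when deprotonated.
    negative = 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for aa in sequence.upper():
        if aa in POSITIVE_PKA:
            positive += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            negative += 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH where net charge is zero by bisection.

    Net charge falls steadily as pH rises, so there is a single crossing.
    """
    while high - low > tolerance:
        middle = (low + high) / 2
        if net_charge(sequence, middle) > 0:
            low = middle   # still positive: the pI lies at higher pH
        else:
            high = middle  # negative or zero: the pI lies at lower pH
    return (low + high) / 2


# A lysine-rich peptide focuses at a higher pH than an aspartate-rich one.
print(isoelectric_point("KKKKG") > isoelectric_point("DDDDG"))
```

On a real gel the observed position can differ from such an estimate, for example when posttranslational modifications add or remove charged groups.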
Two-dimensional protein gels
Second dimension: SDS-PAGE
-- Electrophorese proteins through an acrylamide matrix
-- Proteins are charged and migrate through an electric field
-- Conditions are denaturing (SDS) and reducing (2-mercaptoethanol)
-- Can resolve hundreds to thousands of proteins
Page 248
Proteins identified on 2D gels (IEF/SDS-PAGE)
Direct protein microsequencing by Edman degradation
-- done at many core facilities (e.g. UC Davis)
-- typically need 5 picomoles
-- often get 10 to 20 amino acids sequenced
Protein mass analysis by MALDI-TOF
-- done at core facilities
-- often detect posttranslational modifications
-- matrix-assisted laser desorption/ionization time-of-flight mass spectrometry
Page 250-1
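Mass analysis can reveal a posttranslational modification because the modified protein is heavier than the sequence alone predicts, by an amount characteristic of the added group. The sketch below shows that reasoning in its simplest form and is not a description of any facility's software. The function name `candidate_ptms`, the short list of modifications, and the 0.5 Da tolerance are all assumptions made for the example; the mass shifts are rounded average values.

```python
# Approximate average mass added by a single modification (Da).
PTM_MASS_SHIFTS = {
    "phosphorylation": 79.98,
    "acetylation": 42.04,
    "methylation": 14.03,
}


def candidate_ptms(predicted_mass, observed_mass, tolerance=0.5):
    """List single modifications whose mass shift explains the difference."""
    delta = observed_mass - predicted_mass
    return [
        name
        for name, shift in PTM_MASS_SHIFTS.items()
        if abs(delta - shift) <= tolerance
    ]


# A protein observed 80 Da heavier than predicted is consistent with
# one phosphate group.
print(candidate_ptms(10000.0, 10080.0))
```

A match on mass alone is only a hint: different combinations of modifications can add up to similar shifts, so the assignment usually needs confirmation by other means.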
Page 252
Evaluation of 2D gels (IEF/SDS-PAGE)
Advantages:
-- Visualize hundreds to thousands of proteins
-- Improved identification of protein spots
Disadvantages:
-- Limited number of samples can be processed
-- Mostly abundant proteins visualized
-- Technically difficult
Page 251