Proteins and Protein Function Charles Yan Spring 2006.


1 Proteins and Protein Function Charles Yan Spring 2006

2 Amino Acids
General structure of an amino acid.
20 standard amino acids, each with a different R group.

3 Amino Acids
Table 1. 20 standard amino acids

Amino acid      3-letter code   1-letter code
Alanine         Ala             A
Arginine        Arg             R
Asparagine      Asn             N
Aspartate       Asp             D
Cysteine        Cys             C
Glutamine       Gln             Q
Glutamate       Glu             E
Glycine         Gly             G
Histidine       His             H
Isoleucine      Ile             I

4 Amino Acids
Table 1. 20 standard amino acids (Cont.)

Amino acid      3-letter code   1-letter code
Leucine         Leu             L
Lysine          Lys             K
Methionine      Met             M
Phenylalanine   Phe             F
Proline         Pro             P
Serine          Ser             S
Threonine       Thr             T
Tryptophan      Trp             W
Tyrosine        Tyr             Y
Valine          Val             V

5 Amino Acids
Amino Acid Abbreviations (IUPAC)

Amino acid                         3-letter code   1-letter code
Asparagine (N) or aspartate (D)    Asx             B
Glutamine (Q) or glutamate (E)     Glx             Z
Any amino acid                     Xaa             X

Authority: IUPAC-IUB Joint Commission on Biochemical Nomenclature.
Reference: IUPAC-IUB Joint Commission on Biochemical Nomenclature. Nomenclature and Symbolism for Amino Acids and Peptides. Eur. J. Biochem. 138:9-37 (1984).
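The codes in Table 1 and the IUPAC ambiguity codes can be captured in a small lookup table. The Python sketch below is illustrative only: the names `THREE_TO_ONE` and `to_one_letter` are hypothetical and do not come from the slides.

```python
# Three-letter to one-letter codes for the 20 standard amino acids (Table 1)
# plus the IUPAC ambiguity codes Asx, Glx, and Xaa.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Asx": "B", "Glx": "Z", "Xaa": "X",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes (any letter case) to a one-letter string."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


print(to_one_letter(["Val", "Ser", "Gln", "Leu", "Leu", "Lys"]))  # VSQLLK
```

An unknown code raises `KeyError`, which is usually preferable to silently emitting a wrong letter.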

6 Proteins
Two separate amino acids can be linked together by a peptide bond.
A chain of amino acids linked by peptide bonds is called a polypeptide.
A protein is made up of one or more polypeptide chains.
For simplicity, in this course, a protein is a chain of amino acids linked by peptide bonds, e.g.
VSQLLKQRVRYAPYLSKVRRAEELLPLFKHGQYIGWSGFTGVGAPKVI
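Under the simplified definition used in this course, a protein can be handled as a plain string of one-letter codes. The sketch below, with hypothetical helper names `is_standard_protein` and `composition`, checks that a string uses only the 20 standard amino acids and counts how often each residue occurs.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def is_standard_protein(seq):
    """Return True if seq is non-empty and contains only standard residues."""
    return len(seq) > 0 and set(seq.upper()) <= STANDARD


def composition(seq):
    """Count the occurrences of each residue in seq."""
    return Counter(seq.upper())


# Example sequence from the slide.
seq = "VSQLLKQRVRYAPYLSKVRRAEELLPLFKHGQYIGWSGFTGVGAPKVI"
print(is_standard_protein(seq))
print(len(seq))
print(composition(seq).most_common(3))
```

Ambiguity codes such as B, Z, and X are rejected here on purpose; a looser check would add them to `STANDARD`.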

7 Protein Database
UniProt (Universal Protein Resource) (http://www.pir.uniprot.org/) is the world's most comprehensive catalog of information on proteins.
It is a collaboration between:
  Swiss Institute of Bioinformatics (SIB)
  Department of Bioinformatics and Structural Biology of the Geneva University
  European Bioinformatics Institute (EBI)
  Georgetown University Medical Center's Protein Information Resource (PIR)
It includes three components: UniProtKB, UniRef, and UniParc.

8 Protein Database
UniProt Knowledgebase (UniProtKB): the central access point for extensive curated protein information.
  UniProtKB/Swiss-Prot: a manually annotated protein sequence database that provides a high level of annotation, a minimal level of redundancy, and a high level of integration with other databases. Release 48.7 of 20-Dec-2005: 204,086 entries.
  UniProtKB/TrEMBL: a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot. Release 31.7 of 20-Dec-2005: 2,506,886 entries.
UniProt Reference Clusters (UniRef): databases that combine closely related sequences into a single record to speed up searches.
UniProt Archive (UniParc): a comprehensive repository reflecting the history of all protein sequences.

9 Protein Database

10 Protein Database

11 Protein Database

12 Protein Database

13 Protein Database

14

15 Gene Ontology
Protein synthesis
Translation
Goal: find all the proteins that are involved in protein synthesis.

16 Gene Ontology
Volkswagen Golf
Golf
"I like golf." "Me too!"

17 Gene Ontology
Ontology n. the branch of metaphysics dealing with the nature of being. (The New Oxford American Dictionary, edited by Elizabeth J. Jewell and Frank Abate, Oxford University Press, 2001, p. 1197.)
Metaphysics n. the branch of philosophy that deals with the first principles of things, including abstract concepts such as being, knowing, substance, cause, identity, time, and space. (The New Oxford American Dictionary, edited by Elizabeth J. Jewell and Frank Abate, Oxford University Press, 2001, p. 1074.)

18 Gene Ontology
The Gene Ontology (GO) (http://www.geneontology.org/) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases.
The project began in 1998 as a collaboration between three model organism databases: FlyBase (Drosophila), the Saccharomyces Genome Database (SGD), and the Mouse Genome Database (MGD).
Since then, the GO Consortium has grown to include many databases, including several of the world's major repositories for plant, animal, and microbial genomes.

19 Gene Ontology
Develop structured, controlled vocabularies (ontologies) that describe gene products.
Make associations between the ontologies and the genes and gene products in the collaborating databases.
Develop tools that facilitate the creation, maintenance, and use of ontologies.
The use of GO terms facilitates uniform queries across databases.

20 Gene Ontology
The three components of GO are molecular function, biological process, and cellular component.
GO terms are organized in structures called directed acyclic graphs (DAGs), which differ from hierarchies in that a child (more specialized) term can have many parent (less specialized) terms.
Example: the term hexose biosynthesis has two parents, monosaccharide biosynthesis and hexose metabolism.
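The difference between a DAG and a strict hierarchy can be sketched in a few lines of Python. The term names for hexose biosynthesis and its two parents follow the slide; the shared top-level term "monosaccharide metabolism", the `PARENTS` dictionary, and the `ancestors` function are assumptions added for illustration and are not GO's own data format or API.

```python
# Each term maps to its list of parent (less specialized) terms.
# A term with more than one parent is what makes this a DAG, not a tree.
PARENTS = {
    "hexose biosynthesis": ["monosaccharide biosynthesis", "hexose metabolism"],
    "monosaccharide biosynthesis": ["monosaccharide metabolism"],  # assumed for illustration
    "hexose metabolism": ["monosaccharide metabolism"],            # assumed for illustration
    "monosaccharide metabolism": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward from term."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("hexose biosynthesis")))
```

The `seen` set matters because two paths lead to the same ancestor; without it, shared ancestors would be visited once per path.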

21 Gene Ontology
The controlled vocabularies are structured so that you can query them at different levels.
GO browser: AmiGO (http://www.godatabase.org/cgi-bin/amigo/go.cgi)
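Querying at different levels means that a search for a general term also returns gene products annotated to its more specialized terms. The sketch below illustrates the idea with toy data: the protein identifiers, the `CHILDREN` and `ANNOTATIONS` dictionaries, and the function names are all hypothetical, and this is not how AmiGO is implemented.

```python
# Each term maps to its child (more specialized) terms. Toy data only.
CHILDREN = {
    "monosaccharide metabolism": ["hexose metabolism", "monosaccharide biosynthesis"],
    "hexose metabolism": ["hexose biosynthesis"],
    "monosaccharide biosynthesis": ["hexose biosynthesis"],
    "hexose biosynthesis": [],
}

# Proteins annotated directly to each term. Identifiers are made up.
ANNOTATIONS = {
    "monosaccharide metabolism": set(),
    "hexose metabolism": {"protein_B"},
    "monosaccharide biosynthesis": {"protein_C"},
    "hexose biosynthesis": {"protein_A"},
}


def descendants(term):
    """Return all terms below term in the graph."""
    found = set()
    stack = list(CHILDREN.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(CHILDREN.get(current, []))
    return found


def proteins_for(term):
    """Return proteins annotated to term or to any of its descendants."""
    result = set(ANNOTATIONS.get(term, set()))
    for child in descendants(term):
        result |= ANNOTATIONS.get(child, set())
    return result


print(sorted(proteins_for("hexose biosynthesis")))
print(sorted(proteins_for("monosaccharide metabolism")))
```

A query at the most specific level returns only the directly annotated protein, while a query at the top level returns all three.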

22

23 Protein function
Three steps to get a set of proteins that have a certain function:
1. Search for the GO term (http://www.godatabase.org/cgi-bin/amigo/go.cgi).
2. Search for the proteins that belong to that GO term (http://www.pir.uniprot.org/search/textSearch.shtml).
3. Save the sequences in FASTA format.

24 Search for the GO term

25 Search for the proteins that belong to a certain GO term

26 Save sequences in FASTA format
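In FASTA format, each record is a header line that begins with ">" followed by one or more lines of sequence. The sketch below shows a minimal way to produce such text; the function name `to_fasta`, the 60-character line width, and the header text are assumptions for illustration, not values taken from the slides or from UniProt.

```python
def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text.

    Each record becomes a '>' header line followed by the sequence,
    split into lines of at most `width` characters.
    """
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in lines)


# Hypothetical header; the sequence is the example from the Proteins slide.
records = [
    ("example_protein hypothetical header",
     "VSQLLKQRVRYAPYLSKVRRAEELLPLFKHGQYIGWSGFTGVGAPKVI"),
]
print(to_fasta(records), end="")
```

The returned string can be written to a file with the usual `open(..., "w")` call to save the sequences.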

