Published by Aubrie Carr. Modified over 9 years ago.
1
Computational Analysis of Proteins
Dr. K. Sivakumar
Department of Chemistry, SCSVMV University
chemshiva@gmail.com
Chemistry – Our Life, Our Future
National Workshop on Modern Techniques in Analytical Chemistry
www.kanchiuniv.ac.in/DrKSivakumar_chemistry.html
2
AMINO ACIDS: THE BUILDING BLOCKS OF PROTEINS
General structure of an amino acid (figure)
Triple and single letter codes of amino acids:

Amino acid       Triple letter code   Single letter code
Alanine          Ala                  A
Cysteine         Cys                  C
Aspartic acid    Asp                  D
Glutamic acid    Glu                  E
Phenylalanine    Phe                  F
Glycine          Gly                  G
Histidine        His                  H
Isoleucine       Ile                  I
Lysine           Lys                  K
Leucine          Leu                  L
Methionine       Met                  M
Asparagine       Asn                  N
Proline          Pro                  P
Glutamine        Gln                  Q
Arginine         Arg                  R
Serine           Ser                  S
Threonine        Thr                  T
Valine           Val                  V
Tryptophan       Trp                  W
Tyrosine         Tyr                  Y
3
PROTEIN SEQUENCING (order of amino acids in proteins)
Protein sequence (output of a protein sequencer): MALSFTVGQLIFLFWTMRITEASPD
M = Methionine, A = Alanine, L = Leucine, S = Serine, F = Phenylalanine
Protein sequencing: determining the order of amino acids in a protein.
Methods: mass spectrometry, Edman degradation, ...
The order of amino acids in a protein determines the properties of the protein.
Proteins are sequenced by microbiologists and biotechnologists for various purposes.
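The single letter codes can be expanded back to triple letter codes mechanically. The following is a minimal sketch, not part of the original slides; the names `ONE_TO_THREE` and `to_three_letter` are made up for illustration, and the mapping is the standard 20-residue code table listed on slide 2.

```python
# Standard single letter -> triple letter codes for the 20 amino acids.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(sequence):
    """Expand a single letter protein sequence into triple letter codes.

    Raises KeyError for any letter that is not one of the 20 standard codes.
    """
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())


# First five residues of the example sequence on this slide.
print(to_three_letter("MALSF"))  # Met-Ala-Leu-Ser-Phe
```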
4
Refer to “GENOME” by Sujatha for simple explanations of the sequencing process.
www.writersujatha.com
5
Various levels of protein structure
6
Analogy between methane and a protein, each shown at three levels: primary structure, secondary structure, tertiary structure.
Methane: C stands for carbon, a single atom.
Protein: M stands for methionine, a group of atoms.
7
Protein sequences are continuously submitted by sequencing centers and updated in protein databases. To date, more than 10 lakh (1 million) proteins have been sequenced and made publicly available through protein databases. For example:
Protein sequence databases (shown as logos on the slide), no. of sequences:
524,420
1,365,912
13,593,921
8
Sequence growth in protein sequence databases
Ref: SwissProt, Feb 2011
Ref: GenomeNet, Feb 2011
9
Protein sequence databases, no. of sequences:
524,420 (~5 lakh)
1,365,912 (>10 lakh)
13,593,921 (~1 crore)
The ONLY protein structure database, no. of structures:
70,947 (as of 01 Feb 2011)
Ref: K. Sivakumar, Advanced BioTech, V (9), 20-27 (2007)
10
PDB contains 70,947 structures determined by X-ray, NMR and electron microscopy (EM):
X-ray: ~60,500
NMR: ~8,700
EM: ~350
11
Most sequenced proteins lack a descriptive, documented physico-chemical and STRUCTURAL characterization, because experimental methods (X-ray, NMR, EM) are:
- Trial-and-error based
- Time consuming
- Expensive
Computational methods:
- Minimize the number of experimental trials
- Reduce the cost of experimental investigation
- Help experimental analysis to be more focused
Ref: K. Sivakumar, S. Balaji, Ganga Radhakrishnan, Journal of Theoretical and Computational Chemistry, 6 (1), 127-140 (2007).
12
Need for computational analysis
- More than 10 lakh (1 million) sequences are available in public databases.
- Sequences are highly valuable resources, because a huge amount of structural, functional and evolutionary information is locked up in them.
- By contrast, the number of unique protein structures is very small; this represents a huge information deficit.
- So we need to construct 3D models by COMPUTATIONAL METHODS.
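The size of this deficit can be put into numbers using the counts quoted on the earlier slides (70,947 structures; 524,420, 1,365,912 and 13,593,921 sequences). The sketch below is only a rough illustration: the function name `coverage_percent` is made up here, and the ratio ignores the facts that the databases overlap and that not every PDB entry is a unique protein.

```python
def coverage_percent(n_structures, n_sequences):
    """Number of structures as a percentage of the number of sequences."""
    return 100.0 * n_structures / n_sequences


PDB_STRUCTURES = 70_947  # count quoted on the slides, as of 01 Feb 2011
SEQUENCE_DB_SIZES = [524_420, 1_365_912, 13_593_921]  # counts quoted on the slides

for size in SEQUENCE_DB_SIZES:
    pct = coverage_percent(PDB_STRUCTURES, size)
    print(f"{size:>12,} sequences: structures amount to {pct:.2f}% of that number")
```

Even against the smallest of the three sequence databases the structure count is a small fraction, which is the gap that model building is meant to narrow.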
13
3D structure can be modelled by:
- Homology modeling
- Threading
- Ab initio
14
Homology Modeling: principle
The procedure is repeated with other suitable templates.
Ref: K. Sivakumar, Advanced BioTech, IV (11), 18-23 (2006)
15
Predicting Protein Structure: Comparative Modeling (formerly, homology modeling)
Target sequence (1alc; structure unknown, shown as "?"):
KQFTKCELSQNLYDIDGYGRIALPELICTMF
HTSGYDTQAIVENDESTEYGLFQISNALWCK
SSQSPQSRNICDITCDKFLDDDITDDIMCAK
KILDIKGIDYWIAHKALCTEKLEQWLCEKE
Template sequence (8lyz; template structure known):
KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFES
NFNTQATNRNTDGSTDYGILQINSRWWCNDGRTPGS
RNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAW
VAWRNRCKGTDVQAWIRGCRL
The target and the template share a similar sequence (they are homologous), so the template structure is used as a template to model the target.
16
What is Homology Modeling?
It predicts the three-dimensional structure of a given protein sequence (TARGET) based on an alignment to one or more known protein structures (TEMPLATES).
If similarity between the TARGET sequence and the TEMPLATE sequence is detected, structural similarity can be assumed.
In general, at least about 30% sequence identity is required for generating useful models.
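To make the 30% identity rule of thumb concrete, here is a minimal sketch of how percent identity could be computed from a pairwise alignment. It is an illustration only: it uses a plain Needleman-Wunsch global alignment with made-up scores (match +1, mismatch -1, gap -2), whereas servers such as BLAST use substitution matrices and local alignment, so their numbers will differ. Identity is defined here, as an assumption, as identical columns divided by the alignment length.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of two sequences; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# First line of the target and template sequences from the comparative modeling slide.
target = "KQFTKCELSQNLYDIDGYGRIALPELICTMF"
template = "KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFES"
aln_target, aln_template = needleman_wunsch(target, template)
print(aln_target)
print(aln_template)
print(f"identity: {percent_identity(aln_target, aln_template):.1f}%")
```

The simple scoring scheme keeps the example short; for real template selection the identity reported by the BLAST or modeling server should be used instead.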
17
Homology Modeling: get the protein sequence from a sequence database
http://expasy.org/sprot/
18
Click to get protein details
19
Click to get protein sequence
20
Protein sequence in FASTA format. Save it in Notepad for further use.
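A FASTA file is plain text: a header line starting with `>` followed by one or more lines of sequence. The sketch below shows one way such a saved file could be read back; the function name `parse_fasta` and the header text are hypothetical, and the sequence is the one shown on the "Sequence data" slide later in this deck.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical header; sequence lines as they would wrap in a saved file.
example = """>example_protein (hypothetical header)
MNRVDLSLFIPDSLTAETGDLKIKTYKVVLIAR
AASIFGVKRIVIYHDDADGEARFIRDILTYMDT
PQYLRRKVFPIMRELKHVGILPPLRTPHHPTG
"""

for name, seq in parse_fasta(example).items():
    print(name, len(seq), seq[:10])
```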
21
Using the Protein BLAST server to find similar STRUCTURES
http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins
Paste the sequence in FASTA format, choose PDB, and click to search for similar structures in PDB.
22
Graphical summary of the blastp suite: BLAST search of O70456 vs. PDB
23
List of similar structures (blastp suite)
24
Detailed summary of the blastp suite
25
Method 1: EsyPred3D server. Submit the sequence and the PDB ID: paste the sequence only, type the PDB ID, and click to submit.
26
Get the built structure by email in your Inbox
27
Download the attached *.pdb file and save it
28
Open and visualize the *.pdb file in RasMol
29
Open and visualize the *.pdb file in RasMol
30
Method 2: SWISS-MODEL server. Click for modeling.
31
Submit the sequence only, in FASTA format (without a PDB ID). The similarity search (BlastP) will be done by the SWISS-MODEL server. Paste the sequence and click to submit.
32
Get the built structure by email in your Inbox
33
Follow the links in the email, then click to download the 3D structure
34
Open and visualize the *.pdb file in RasMol
35
Structure retrieval from the protein 3D structure database (PDB)
36
Structure retrieval from PDB: click the PDB ID for protein details. 491 sequences in SwissProt for « Keratin »
37
Structure retrieval from PDB: click to download the structure
38
Structure retrieval from PDB: save the file and note its location
39
Open and visualize the *.pdb file in RasMol: structure of 3EUU
40
Sequence data:
MNRVDLSLFIPDSLTAETGDLKIKTYKVVLIAR
AASIFGVKRIVIYHDDADGEARFIRDILTYMDT
PQYLRRKVFPIMRELKHVGILPPLRTPHHPTG
Structural data (in Notepad)
Structural data (in RasMol)
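The structural data seen in Notepad is a fixed-column text format: each `ATOM` record carries an atom name, residue name, chain, residue number and x, y, z coordinates at fixed character positions. The sketch below reads the sequence back out of such records by looking at the CA (alpha carbon) atom of each residue. It is an illustration under stated assumptions: the helper `make_atom_line` exists only to build a small sample with correct column alignment, and the coordinates in the sample are made up.

```python
THREE_TO_ONE = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y",
}


def make_atom_line(serial, name, res, chain, resseq, x, y, z):
    """Helper for this example only: format one ATOM record in PDB columns."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atoms(pdb_text):
    """Return a list of dicts for every ATOM record in PDB-format text."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atoms.append({
            "name": line[12:16].strip(),     # columns 13-16: atom name
            "res": line[17:20].strip(),      # columns 18-20: residue name
            "chain": line[21],               # column 22: chain identifier
            "resseq": int(line[22:26]),      # columns 23-26: residue number
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def ca_sequence(pdb_text):
    """Single letter sequence read from the CA atom of each residue."""
    return "".join(THREE_TO_ONE.get(atom["res"], "X")
                   for atom in parse_atoms(pdb_text) if atom["name"] == "CA")


# Made-up coordinates for the first three residues (M, N, R) of the slide's sequence.
sample = "\n".join([
    make_atom_line(1, " N  ", "MET", "A", 1, 11.104, 13.207, 2.100),
    make_atom_line(2, " CA ", "MET", "A", 1, 12.560, 13.329, 2.283),
    make_atom_line(3, " N  ", "ASN", "A", 2, 13.407, 12.244, 1.958),
    make_atom_line(4, " CA ", "ASN", "A", 2, 14.846, 12.245, 2.226),
    make_atom_line(5, " N  ", "ARG", "A", 3, 15.512, 11.101, 1.890),
    make_atom_line(6, " CA ", "ARG", "A", 3, 16.950, 10.987, 2.054),
])

print(ca_sequence(sample))
```

Running `ca_sequence` on a downloaded *.pdb file is a quick way to check that a built model really contains the residues of the submitted sequence.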
41
Built model validation by ProQ server: click to upload the structure
42
Built model validation by ProQ server: click and upload the structure
43
Built model validation by ProQ server: submit after uploading
44
Built model validation by ProQ server: result
45
Built model validation by Ramachandran plot: click and upload the structure
46
Built model validation by Ramachandran plot: submit after uploading
47
Built model validation by Ramachandran plot: RESULTS
(Photo: G. N. Ramachandran)
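A Ramachandran plot places each residue at its backbone dihedral angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). The sketch below shows how one such dihedral angle could be computed from four atom coordinates; it is a generic geometric formula, not the code used by any validation server, and the function names are made up for illustration. The sign of the result depends on the convention used, so signs should be checked against the tool being compared with.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Backbone phi angle: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Backbone psi angle: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)


# Simple geometric checks with made-up points (not protein coordinates).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # eclipsed arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # fully extended arrangement
```

Servers such as RamPage then report what percentage of the residues fall in the favoured regions of the (phi, psi) plane.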
48
3D structure modeling and validation
Ref: K. Sivakumar, S. Balaji, Ganga Radhakrishnan, Journal of Chemical Sciences, 119 (5), 571-579 (2007)
49
Disulphide bridges in the 3D structure of Q01758
Backbone of Q01758 (rainbow smelt fish); 10 cysteines shown as ball and stick; 10 sulphur atoms in cysteines and 5 SS bonds (dotted lines)
50
Disulphide bridges in the 3D structure of P05140
Ribbon model of P05140 (sea raven); 10 cysteines shown as ball and stick; 10 sulphur atoms in cysteines and 5 SS bonds (dotted lines)
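Disulphide bridges like these can be located in a model by measuring distances between the sulphur (SG) atoms of cysteine residues: a bonded S-S pair sits about 2.05 Å apart. The sketch below is an illustration only; the 2.5 Å cutoff is an assumption chosen to leave some tolerance, the function name `find_disulfides` is made up, and the coordinates are invented rather than taken from Q01758 or P05140.

```python
from itertools import combinations
from math import dist


def find_disulfides(sg_atoms, cutoff=2.5):
    """Return (residue_a, residue_b, distance) for SG pairs within the cutoff.

    sg_atoms is a list of (residue_number, (x, y, z)) for cysteine SG atoms.
    The cutoff in angstroms is an assumed tolerance, not a fixed standard.
    """
    bonds = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sg_atoms, 2):
        d = dist(xyz_a, xyz_b)
        if d <= cutoff:
            bonds.append((res_a, res_b, round(d, 2)))
    return bonds


# Invented SG coordinates for four cysteines forming two bridges.
sg_atoms = [
    (10, (0.0, 0.0, 0.0)),
    (25, (2.0, 0.0, 0.3)),
    (40, (10.0, 10.0, 10.0)),
    (60, (10.0, 12.0, 10.4)),
]

for res_a, res_b, d in find_disulfides(sg_atoms):
    print(f"Cys{res_a} - Cys{res_b}: {d} A")
```

In a real model the SG coordinates would come from the `ATOM` records of the CYS residues in the *.pdb file.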
51
Secondary structure prediction from the modeled 3D structures of Q01758 and P05140
Legend: beta strand, α-helices, coil
52
Finding cavities in the built model using the CASTp server: click for calculation
53
Finding cavities in the built model using the CASTp server: click, upload and submit the structure
54
Finding cavities in the built model using the CASTp server: RESULTS
55
For literature
59
EXERCISE
Download the sequence file for any one of the following proteins from Swissprot / Protein Information Resource / Protein Research Foundation:
- Antifreeze
- Vascular Endothelial growth factor protein
- Keratin
Generate at least 3 homology models using the EsyPred server or the SWISS-MODEL server (i.e., using different PDB structures).
Visualize the structure using the RasMol tool.
Compare and evaluate the modelled 3D structures using the RamPage, ProQ and Combinatorial Extension servers.
Report table columns: target sequence code; template (PDB) codes; RamPage (percentage of residues in favoured region); ProQ (LG score, MaxSub).
60
EXERCISE (continued)
Generate the report as an MS Word file and submit it to chemscsvmv@gmail.com.
Repeat the exercise for other protein sequences of your choice.
61
Thank you all!
(Figure label: P05140)