Protein Structure Prediction
PDB – Protein Data Bank
SCOP – Protein structure classification
CATH – Protein structure classification
GenTHREADER – 3D structure prediction
Swiss-Model – 3D structure prediction
ModBase – A database of 3D structure predictions
How are structures solved experimentally?
X-ray crystallography: Diffraction patterns are recorded from X-ray beams hitting a crystallized array of molecules.
NMR: In nuclear magnetic resonance, magnetic nuclei absorb and re-emit electromagnetic radiation at frequencies that depend on their properties.
Cryo-EM: In cryo-electron microscopy, molecules are frozen in a thin layer of ice. The microscope records many low-resolution projections of the molecule, which are then combined computationally.
Many other methods exist.
Embedding: from distances to shape
Distance table between cities (row and column labels): Haifa, Tel Aviv, Jerusalem, Eilat, Ashdod
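The idea of recovering a shape from pairwise distances can be sketched for the smallest interesting case, three points in a plane. This is a minimal illustration, not the method used for real structures: the function name `embed_three_points` and the distances are made up for the example (they are not real city distances), and the result is only defined up to rotation, translation and reflection.

```python
import math

def embed_three_points(d_ab, d_ac, d_bc):
    """Place three points in the plane so their pairwise distances match."""
    a = (0.0, 0.0)   # fix A at the origin
    b = (d_ab, 0.0)  # put B on the x-axis at distance d_ab from A
    # The law of cosines gives the x-coordinate of C; Pythagoras gives y.
    x = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab)
    y = math.sqrt(max(d_ac ** 2 - x ** 2, 0.0))
    return a, b, (x, y)

# Hypothetical distances (a 3-4-5 right triangle), chosen only for illustration.
a, b, c = embed_three_points(3.0, 4.0, 5.0)
print(a, b, c)
```

With more points the same principle applies, but a general method such as multidimensional scaling is normally used instead of placing points one by one.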
PDB: Curation of solved structures
PDB file; Accession number; Java-based visualization tools; Structural classification
PDB provides the atomic coordinates of the structure, which can be viewed with different visualization tools.
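To make "atomic coordinates" concrete, here is a minimal sketch of reading the fixed-width ATOM/HETATM records of a PDB file. The helper names `parse_atom_line` and `read_atoms` are hypothetical, and the example record is invented for illustration rather than taken from a real entry. Only a few of the record's fields are extracted.

```python
def parse_atom_line(line):
    """Read selected fixed-width fields of one ATOM/HETATM record."""
    return {
        "serial": int(line[6:11]),       # columns 7-11: atom serial number
        "name": line[12:16].strip(),     # columns 13-16: atom name
        "res_name": line[17:20].strip(), # columns 18-20: residue name
        "chain": line[21],               # column 22: chain identifier
        "res_seq": int(line[22:26]),     # columns 23-26: residue number
        "x": float(line[30:38]),         # columns 31-38: x coordinate
        "y": float(line[38:46]),         # columns 39-46: y coordinate
        "z": float(line[46:54]),         # columns 47-54: z coordinate
    }

def read_atoms(lines):
    """Keep only coordinate records and parse each of them."""
    return [parse_atom_line(l) for l in lines if l.startswith(("ATOM  ", "HETATM"))]

# Invented record, assembled field by field so the column layout is visible.
example = "".join([
    "ATOM  ",    # record name
    "    1",     # serial
    " ",         # unused column
    " CA ",      # atom name
    " ",         # alternate location indicator
    "ALA",       # residue name
    " ",         # unused column
    "A",         # chain
    "   1",      # residue number
    "    ",      # insertion code and unused columns
    "  10.000",  # x
    "  -2.500",  # y
    "   3.250",  # z
])
atom = parse_atom_line(example)
print(atom)
```

For real work a maintained parser is a better choice; this sketch only shows that the coordinates sit at fixed positions in each line.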
SCOP: Structural Classification of Proteins
–Based on known protein structures
–Manually created by visual inspection
–Hierarchical database structure: Class, Fold, Superfamily, Family, Protein and Species
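The hierarchy above can be pictured as a tree in which every node has one parent and any number of children. The sketch below is only an illustration of that data structure: the `Node` class and the placeholder names are assumptions made for the example and do not reproduce SCOP's actual data model or any real classification.

```python
LEVELS = ["Class", "Fold", "Superfamily", "Family", "Protein", "Species"]

class Node:
    """One entry in the hierarchy, linked to its parent and children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def lineage(self):
        """Return the names from the root down to this node."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]

    def level(self):
        """Name of the hierarchy level, derived from the depth in the tree."""
        return LEVELS[len(self.lineage()) - 1]

# Build one placeholder path from Class down to Species.
node = None
for level_name in LEVELS:
    node = Node("example " + level_name.lower(), parent=node)

print(node.level(), node.lineage())
```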
Parents of node; Children of node; Node
CATH: Protein Structure Classification by Class, Architecture, Topology and Homology
Class: The secondary structure composition: mainly-alpha, mainly-beta and alpha-beta.
Architecture: The overall shape of the domain structure, determined by the orientations of the secondary structures, e.g. barrel or 3-layer sandwich.
Topology: Structures are grouped into fold groups at this level, depending on both the overall shape and the connectivity of the secondary structures.
Homologous Superfamily: Evolutionarily conserved structures.
Prediction: Comparative Modeling
Various methods:
–Homology modeling
–Protein threading
–Side-chain geometry prediction
The accuracy of a comparative model is related to the sequence identity on which it is based:
>50% sequence identity = high accuracy
30%-50% sequence identity = 90% modeled
<30% sequence identity = low accuracy (many errors)
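The identity thresholds above can be turned into a small lookup. This is a sketch under stated assumptions: identity is computed here over aligned positions where neither sequence has a gap (other definitions use a different denominator), the function names are hypothetical, and the example sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, gap-free positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def expected_accuracy(identity):
    """Map a percent identity to the rule-of-thumb bands from the slide."""
    if identity > 50:
        return "high accuracy"
    if identity >= 30:
        return "90% modeled"
    return "low accuracy (many errors)"

# Invented aligned sequences, for illustration only.
identity = percent_identity("AAAA", "AAAT")
print(identity, expected_accuracy(identity))
```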
SWISS-MODEL An automated protein homology modeling server.
SWISS-MODEL
The SWISS-MODEL algorithm can be divided into three steps:
1. Search for suitable templates: the server finds all similarities between the query sequence and sequences of known structure. It uses the BLASTP2 program with the ExNRL-3D database (a derivative of the PDB database, prepared specifically for SWISS-MODEL). These partial results are returned as a SwissModel TraceLog file.
2. Check sequence identity with the target: all templates with sequence identity above 25% are selected.
3. Create the model using the ProModII program. The result is returned as a SwissModel-Model file.
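Step 2, the identity cutoff, can be sketched as a simple filter over search hits. This is not SWISS-MODEL's code: the `Hit` type, the `select_templates` function, the template names and the identity values are all hypothetical, and only the 25% threshold comes from the description above.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    template_id: str  # placeholder identifier, not a real PDB entry
    identity: float   # percent sequence identity to the query

def select_templates(hits, min_identity=25.0):
    """Keep hits above the identity cutoff, best match first."""
    chosen = [h for h in hits if h.identity > min_identity]
    return sorted(chosen, key=lambda h: h.identity, reverse=True)

# Invented search results, for illustration only.
hits = [
    Hit("template_A", 62.0),
    Hit("template_B", 24.0),
    Hit("template_C", 31.5),
]
selected = select_templates(hits)
print([h.template_id for h in selected])
```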
SWISS-MODEL: the PDB file of the model can be downloaded and loaded into Jmol.
Single Structure Homology Modeling
Swiss-Model file: Structures used for the homology model; Query
ModBase A Homology Model Database
GenTHREADER
An automated protein threading server.
Input sequence; Type of analysis (PSIPRED, MEMSAT, GenTHREADER)
Output
The output sequences show only some sequence homology but a high level of secondary structure conservation.
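The contrast between sequence similarity and secondary structure conservation can be illustrated by scoring both with the same measure. The sequences and secondary structure strings below are invented (H for helix, C for coil), and `percent_match` is a hypothetical helper, so the numbers say nothing about real GenTHREADER output.

```python
def percent_match(aligned_a, aligned_b, gap="-"):
    """Percent of identical symbols over aligned, gap-free positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned strings must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

# Invented query and hit: residues differ a lot, secondary structure barely.
seq_query = "MKTAYIAKQR"
seq_hit = "LRSAFVGKER"
ss_query = "CHHHHHHHHC"
ss_hit = "CHHHHHHHCC"

sequence_identity = percent_match(seq_query, seq_hit)
structure_agreement = percent_match(ss_query, ss_hit)
print(sequence_identity, structure_agreement)
```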
Ab initio modeling
Based on physical (chemical) properties of amino acids
–Leading contender in the field
Foldit
–Crowd-sourcing software
–Designed as a game where the goal is to optimize a structure
–Dozens of published papers referencing it
Exercise
In this exercise we will analyze two structures of the protein lysozyme. The sequences of these proteins differ slightly.
1. Download PyMOL (after registering).
2. Load the two structures 1LYD.pdb and 1L35.pdb.
3. Use the Cartoon option to visualize the structures.
4. Align the structures using the command: align /1lyd,/1l35
Analyze the differences between the structures. What is the RMSD (root mean square deviation, a measure of the distance between the structures)?
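The RMSD asked for in the exercise is the square root of the mean squared distance between paired atoms. The sketch below computes it for two coordinate lists that are assumed to be already superposed and in matching order. It does not perform the superposition itself, and the coordinates are invented rather than taken from 1LYD or 1L35.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Invented coordinates: the second set is the first shifted by 1 along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(set_a, set_b)
print(value)
```

Identical structures give an RMSD of 0, and larger values mean the paired atoms lie further apart on average.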
Results: Show Cartoon; Hide lines