EMBL-EBI MSDpisa a web service for studying Protein Interfaces, Surfaces and Assemblies Eugene Krissinel

What PISA is about
More than 80% of protein structures are solved by means of X-ray diffraction on crystals. An X-ray diffraction experiment produces atomic coordinates of the crystal's Asymmetric Unit (ASU). In general, neither the ASU nor the Unit Cell has any relation to the Biological Unit, that is, the stable protein complex which acts as a unit in physiological processes. Is there a way to infer the Biological Unit from protein crystallography data?
[Figure labels: PDB file; Unit Cell = all space symmetry group mates of ASU; Crystal = translated Unit Cell]

In (very) simple words …
[Diagram: 1. in vivo (no image or bad image); 2. crystallisation; 3. in crystal (good image but no associations)]

At first glance …
… the solution is as simple as 1-2:
1. Evaluate all protein contacts (interfaces) in the crystal.
2. Leave only the strongest ("biologically relevant") ones.
What you get then has a chance of being a stable protein complex.
Small technical problem: how to discriminate between "real" (biologically relevant) and "superficial" (inter-assembly, or crystal packing) interfaces?

Real and superficial protein interfaces
Interface area is the most often used discrimination criterion. A cut-off at 900 Å² gives about an 80% success rate in discriminating between monomers and dimers.
But big proteins would always be sticky if this criterion were true …

Real and superficial protein interfaces
Free energy gain of interface formation. A cut-off at -8 kcal/mol gives about an 82% success rate in discriminating between monomers and dimers.
But can an energy measure be uniform for all weights and shapes?
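The two cut-offs quoted so far can be written as simple threshold tests. The sketch below is only an illustration: the `Interface` class, the function names and the example numbers are hypothetical, and it assumes that a more negative free energy gain means a stronger interface.

```python
from dataclasses import dataclass

AREA_CUTOFF_A2 = 900.0       # interface area cut-off quoted on the slide, in Å^2
ENERGY_CUTOFF_KCAL = -8.0    # free energy gain cut-off quoted on the slide, in kcal/mol


@dataclass
class Interface:
    """Hypothetical container for two interface properties."""
    area_a2: float       # interface area, Å^2
    dg_kcal_mol: float   # free energy gain of interface formation, kcal/mol


def is_dimer_by_area(iface: Interface) -> bool:
    # Interfaces at or above the area cut-off are called dimeric.
    return iface.area_a2 >= AREA_CUTOFF_A2


def is_dimer_by_energy(iface: Interface) -> bool:
    # Assumption: more negative gain = stronger, so test "at or below" the cut-off.
    return iface.dg_kcal_mol <= ENERGY_CUTOFF_KCAL


# Made-up example: a large but energetically weak contact.
large_weak = Interface(area_a2=1200.0, dg_kcal_mol=-3.0)
print(is_dimer_by_area(large_weak), is_dimer_by_energy(large_weak))
```

The made-up interface passes one test and fails the other, which is the kind of disagreement that makes a single-parameter criterion unreliable.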

Real and superficial protein interfaces
P-value of hydrophobic patches: a measure of the probability for the interface to be more hydrophobic than found. A cut-off at 0.2 gives about a 60% success rate in discriminating between monomers and dimers.

Real and superficial protein interfaces
Packing edge factor: a measure showing how closely the mass packing edge matches the actual interface. A cut-off at 0.3 gives about a 60% success rate in discriminating between monomers and dimers.
[Figure labels: interface; packing edge]

Real and superficial protein interfaces
- No ultimate discriminating parameter for the identification of biologically relevant protein interfaces may be proposed at present, even for dimeric complexes. Jones, S. & Thornton, J.M. (1996) Principles of protein-protein interactions, Proc. Natl. Acad. Sci. USA, 93.
- Formation of N>2-meric complexes is most probably a corporate process involving a set of interfaces. Therefore the significance of an interface should not be detached from the context of the protein complex.

Making assemblies from significant interfaces
Despite the failure to find an ultimate measure of interface biological relevance, two approaches were developed that use scoring of individual interfaces:
- PQS, MSD-EBI (Kim Henrick), Trends in Biochem. Sci. (1998) 23, 358. Method: recursive splitting of the largest complexes as allowed by crystal symmetry. The termination criterion is derived from the individual statistical scores of crystal contacts. The results are not curated.
- PITA, Thornton group, EBI (Hannes Ponstingl), J. Appl. Cryst. (2003) 36, 1116. Method: progressive build-up by addition of monomeric chains that suit the selection criteria. The results are partly curated.

Chemical stability of protein complexes
- It is not the properties of individual interfaces but rather the chemical stability of the protein complex in general that really matters.
- Protein chains will most likely associate into the largest complexes that are still stable.
- A protein complex is stable if its free energy of dissociation is positive: ΔG_diss > 0.
How to calculate ΔG_diss?
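The formula image for the stability condition did not survive in this transcript. One form consistent with the symbols used on the following slides (ΔG_int as a function of the interfaces, ΔS as a function of the complex) is sketched below. It is an assumption, not a quotation from the slide: ΔG_int is taken as the binding energy of the subunits (negative when binding is favourable) and ΔS as the entropy change on dissociation.

```latex
\Delta G_{\mathrm{diss}} \;=\; -\,\Delta G_{\mathrm{int}} \;-\; T\,\Delta S \;>\; 0
```

Under this convention a complex is stable only when the binding energy outweighs the entropy gained by releasing the subunits.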

Protein affinity
ΔG_int is a function of protein interfaces. Terms labelled in the formula:
- solvation energy of the protein complex
- solvation energies of the dissociated subunits
- free energy of H-bond formation, and the number of H-bonds between dissociated subunits
- free energy of salt bridge formation, and the number of salt bridges between dissociated subunits
Choice of dissociation subunits: dissociation into stable subunits with minimum [formula image]
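The terms listed on this slide suggest how the binding energy might be assembled. The function below is a minimal sketch under stated assumptions: the function name, the argument names and the sign convention (each H-bond and salt bridge lowers ΔG_int) are not from the slide, and the default energies are the fitted values quoted later in the deck (0.51 and 0.21 kcal/mol).

```python
from typing import Sequence


def binding_energy(dg_solv_complex: float,
                   dg_solv_subunits: Sequence[float],
                   n_hbonds: int,
                   n_salt_bridges: int,
                   e_hb: float = 0.51,
                   e_sb: float = 0.21) -> float:
    """Hypothetical ΔG_int in kcal/mol.

    Assumed form: solvation energy of the complex minus the summed solvation
    energies of the dissociated subunits, minus the contributions of H-bonds
    and salt bridges formed between those subunits.
    """
    return (dg_solv_complex
            - sum(dg_solv_subunits)
            - n_hbonds * e_hb
            - n_salt_bridges * e_sb)


# Made-up numbers for a dimer with 10 H-bonds and 2 salt bridges.
print(binding_energy(-110.0, [-50.0, -50.0], n_hbonds=10, n_salt_bridges=2))
```

With this convention a more negative result means tighter binding.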

Solvation free energy
Terms labelled in the formula:
- atomic solvation parameters
- atom's accessible surface area
- atom's accessible surface area in the reference (unfolded) state
[Figure labels: protein; solvent]
Eisenberg, D. & McLachlan, A.D. (1986) Nature 319.
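The three labelled quantities suggest a sum over atoms of a solvation parameter times the change in accessible surface area relative to the unfolded reference state. The sketch below assumes exactly that form. The parameter table is illustrative only and should be replaced with values from the cited paper; the function and variable names are hypothetical.

```python
from typing import Iterable, Tuple

# Illustrative atomic solvation parameters in kcal/(mol·Å^2).
# Placeholder values: take real ones from the cited reference.
ATOMIC_SOLVATION_PARAMS = {"C": 0.016, "N": -0.006, "O": -0.006}


def solvation_energy(atoms: Iterable[Tuple[str, float, float]]) -> float:
    """Assumed form: sum over atoms of param * (ASA - ASA_reference).

    Each atom is (element, asa, asa_ref), with both areas in Å^2.
    """
    total = 0.0
    for element, asa, asa_ref in atoms:
        total += ATOMIC_SOLVATION_PARAMS[element] * (asa - asa_ref)
    return total


# Made-up two-atom example: a buried carbon and an exposed oxygen.
print(solvation_energy([("C", 10.0, 30.0), ("O", 20.0, 10.0)]))
```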

Entropy of macromolecules in solutions
Three contributions: translational entropy, rotational entropy and side-chain entropy.
Quantities labelled in the formulas: mass, tensor of inertia, symmetry number, solvent-accessible surface area. c_t, c_r and F are semiempirical parameters.
Murray C.W. and Verdonk M.L. (2002) J. Comput.-Aided Mol. Design 16.

Entropy of dissociation
ΔS is a function of the protein complex.
Quantities labelled in the formula: a fitted parameter; the mass of the i-th subunit; the k-th principal moment of inertia of the i-th subunit.

How to identify an assembly in a crystal?
We now know (or we think that we know) how to evaluate the chemical stability of protein complexes. Given a 3D arrangement of protein chains, we can now say whether this arrangement has a chance of being a stable assembly, or biological unit.
But how do we get potential assemblies in the first place?

How to catch a Desert Lion?
The Desert Lion method: catch all lions and keep the one living in the desert.

Enumerating assemblies in a crystal
- The crystal is represented as a periodic graph with monomeric chains as vertices and interfaces as edges.
- Each set of assemblies is identified by its engaged interface types.
- All assemblies may be enumerated by a backtracking scheme engaging all possible combinations of different interface types.
Example: crystal with 3 interface types.
[Table: assembly set versus engaged interface types; rows run from "only monomers" through several dimers to "all crystal"]
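The idea that each set of assemblies is identified by its engaged interface types can be illustrated on a small finite graph. The sketch below is a deliberate simplification: it ignores crystal periodicity and induced interfaces, and it enumerates all combinations exhaustively rather than by backtracking. The function names and the four-chain example are hypothetical.

```python
from itertools import combinations


def components(n_chains, edges):
    """Connected components (assemblies) via a small union-find."""
    parent = list(range(n_chains))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = {}
    for chain in range(n_chains):
        groups.setdefault(find(chain), []).append(chain)
    return sorted(groups.values())


def enumerate_assembly_sets(n_chains, interfaces):
    """Map each combination of engaged interface types to its assemblies.

    interfaces: list of (chain_a, chain_b, interface_type).
    """
    types = sorted({t for _, _, t in interfaces})
    result = {}
    for k in range(len(types) + 1):
        for engaged in combinations(types, k):
            edges = [(a, b) for a, b, t in interfaces if t in engaged]
            result[engaged] = components(n_chains, edges)
    return result


# Made-up example: 4 chains, 3 interface types.
example = [(0, 1, 1), (2, 3, 1), (1, 2, 2), (0, 3, 3)]
for engaged, assemblies in enumerate_assembly_sets(4, example).items():
    print(engaged, assemblies)
```

Engaging no interface type leaves only monomers, and engaging enough of them joins every chain into one assembly, which mirrors the two extremes of the example table.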

Clever backtracking
The number of different interface types may reach a hundred. The algorithm is not going to complete the backtracking of combinations unless it is clever enough to
- check geometry and engage induced interfaces as soon as they emerge [Figure labels: engaged interfaces; induced interface]
- check geometry and terminate backtracking if the assembly contains two identical chains in parallel orientations (otherwise the assembly will be infinite due to translation symmetry in the crystal)
- see the future and terminate backtracking if there are no stable assemblies down the current branch of the recursion tree (based on the observation that the entropy of dissociation of unstable assemblies only increases down the recursion tree)
… only then does the algorithm complete, in 0.1 secs to 1.5 hours depending on the structure …
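The effect of terminating a branch early can be shown with a generic backtracking skeleton. This is a sketch, not the PISA algorithm: the `prune` predicate is a hypothetical stand-in for the geometry and stability checks listed above, and it is assumed to be monotone (once a combination is rejected, any extension of it would be rejected too).

```python
def backtrack(types, prune, engaged=(), start=0, out=None):
    """Enumerate combinations of interface types, skipping pruned branches.

    types:  sequence of interface type labels
    prune:  callable taking a tuple of engaged types; True means "stop here"
    """
    if out is None:
        out = []
    out.append(engaged)
    for i in range(start, len(types)):
        candidate = engaged + (types[i],)
        if prune(candidate):
            # Abandon this branch: no extension of `candidate` is visited.
            continue
        backtrack(types, prune, candidate, i + 1, out)
    return out


# Made-up rule: any combination engaging interface type 2 is rejected.
visited = backtrack((1, 2, 3), prune=lambda e: 2 in e)
print(visited)  # [(), (1,), (1, 3), (3,)]
```

Without pruning the same call would visit all 8 combinations; with a hundred interface types the unpruned tree is far too large to walk, which is why the early exits matter.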

Detection of Biological Units in Crystals: Method Summary
1. Build the periodic graph of the crystal.
2. Enumerate all possibly stable assemblies.
3. Evaluate the assemblies for chemical stability.
4. Leave only sets of stable assemblies in the list and rank them by their chances of being a biological unit:
- larger assemblies take preference
- single-assembly solutions take preference
- otherwise, assemblies with higher ΔG_diss take preference
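Step 4 can be sketched as a filter followed by a sort. The sketch assumes that the three preferences apply in the order listed and that "stable" means ΔG_diss > 0. The `AssemblySet` class, its fields and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AssemblySet:
    """Hypothetical record for one candidate solution."""
    name: str
    assembly_sizes: List[int]   # chains in each distinct assembly of the set
    dg_diss: float              # lowest ΔG_diss among its assemblies, kcal/mol


def rank(sets: List[AssemblySet]) -> List[AssemblySet]:
    # Keep only stable sets (assumed criterion: ΔG_diss > 0).
    stable = [s for s in sets if s.dg_diss > 0]
    # Sort key, in the assumed order of precedence:
    #   1. larger assemblies first
    #   2. single-assembly solutions before mixed ones
    #   3. higher ΔG_diss first
    return sorted(
        stable,
        key=lambda s: (-max(s.assembly_sizes),
                       len(s.assembly_sizes) != 1,
                       -s.dg_diss),
    )


# Made-up candidates.
candidates = [
    AssemblySet("A", [4], 12.0),
    AssemblySet("B", [2], 30.0),
    AssemblySet("C", [4], 20.0),
    AssemblySet("D", [6], -3.0),     # unstable, filtered out
    AssemblySet("E", [4, 1], 25.0),  # tetramer plus a leftover monomer
]
print([s.name for s in rank(candidates)])  # ['C', 'A', 'E', 'B']
```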

Are we anywhere close?
Assembly classification on the benchmark set of 218 structures (homomers and 20 heteromers) published in Ponstingl, H., Kabir, T. and Thornton, J. (2003) Automatic inference of protein quaternary structures from crystals. J. Appl. Cryst. 36.
Fitted parameters:
1. Free energy of an H-bond: 0.51 kcal/mol
2. Free energy of a salt bridge: 0.21 kcal/mol
3. Constant entropy term: 11.7 kcal/mol
4. Surface entropy factor: 0.57·10⁻³ kcal/(mol·Å²)
Classification error in ΔG_diss: ± 5 kcal/mol

A better method?
Percent of successful classifications, as measured on the same benchmark set of 218 PDB entries:
- PQS server: 78% (not optimised on the benchmark set, but manually curated)
- PITA software: 84% (optimised with 18 parameters, system overfit(?))
- Present study: 90% (optimised with 4 parameters, system underfit)

What is beyond the benchmark set?
Classification results obtained for 366 recent depositions into the PDB (homomers and 45 heteromers), in reference to manual classification in MSD-EBI.
Classification error in ΔG_diss: ± 5 kcal/mol

Is it ever going to be 100%?
Nobody should be that naive, because:
- theoretical models for protein affinity and entropy change upon protein complexation are primitive
- coordinate (experimental) data is of limited accuracy
- there is no feasible way to take conformations in the crystal into account
- experimental data on multimeric states is very limited and not always reliable, so calibration of parameters is difficult
- protein assemblies may exist in some environments and dissociate in others, so a definite answer is simply not there

Web-server PISA
A new MSD-EBI tool for working with Protein Interfaces, Surfaces and Assemblies

Conclusions
- Stable protein complexes, which are likely to be biological units, may be calculated from protein crystallography data at an 80-90% success rate.
- The biological relevance of a particular protein interface cannot be reliably inferred from the interface properties alone. Instead, the significance of an interface should be concluded from the analysis of the relevant protein assemblies.
Acknowledgement. This work has been supported by research grant No. 721/B19544 from the Biotechnology and Biological Sciences Research Council (BBSRC) UK.