Macromolecular Complexes in Crystals and Solutions CCP4 Study Weekend, Nottingham, UK, 7-8 Jan 2010. Research Complex at Harwell Eugene Krissinel CCP4,

Presentation transcript:

Macromolecular Complexes in Crystals and Solutions
Eugene Krissinel, CCP4, STFC Research Complex at Harwell, Didcot, United Kingdom
CCP4 Study Weekend, Nottingham, UK, 7-8 January 2010
References: E. Krissinel (2010) J. Comp. Chem. 31; E. Krissinel and K. Henrick (2007) J. Mol. Biol. 372


Structural Biology From Crystals
Why do we want to know the structure of a macromolecule? For many reasons, but probably first of all to find out how it interacts with other molecules.
Macromolecular crystals present us with models of biological structures and their interactions.
If you want to know how A interacts with B, crystallize them together! (a crystallographer's sweet dream)

Structural Biology From Crystals
A decamer? Or a dimer?
Crystals present us with both real and artifactual interactions, which may be difficult to differentiate.
Often used techniques:
- Theoretical: sharp eye and scientific authority
- Experimental: complementing studies (EM, NMR, scattering)
- Bioinformatical: homology and interface similarity analysis
- Computational: energy estimates and modelling
- Rules of thumb: e.g. manifestation in different crystal forms
PISA software infers significant interactions and macromolecular assemblies from crystals by evaluating their Gibbs free energy.


Detection of Biological Units in Crystals: PISA Summary
1. Enumerate all possible assemblies in crystal packing, subject to crystal properties: space symmetry group, geometry and composition of the asymmetric unit. This is achieved with graph theory techniques, by representing a crystal as an infinite periodic graph of connected macromolecules.
2. Evaluate assemblies for chemical stability.
3. Leave only sets of stable assemblies in the list and rank them by their chances of being a biological unit:
   - larger assemblies take preference;
   - single-assembly solutions take preference;
   - otherwise, assemblies with higher ΔG_diss take preference.
E. Krissinel and K. Henrick (2007) J. Mol. Biol. 372
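Step 3 of the summary can be illustrated with a minimal sketch. It is not PISA's implementation: the `AssemblySet` type, the stability test (every dissociation free energy above zero) and the order in which the three preference rules are applied are all assumptions made for illustration, and the energies in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AssemblySet:
    """Hypothetical container: one candidate set of assemblies for a crystal."""
    sizes: List[int]       # number of chains in each assembly of the set
    dg_diss: List[float]   # dissociation free energy of each assembly, kcal/mol


def is_stable(candidate: AssemblySet) -> bool:
    # Assumption: an assembly counts as stable when its dissociation
    # free energy is positive.
    return all(g > 0.0 for g in candidate.dg_diss)


def rank_key(candidate: AssemblySet) -> Tuple[int, int, float]:
    # Assumed precedence: larger assemblies first, then sets made of a
    # single assembly, then higher dissociation free energy.
    return (-max(candidate.sizes), len(candidate.sizes), -min(candidate.dg_diss))


def rank_assembly_sets(candidates: List[AssemblySet]) -> List[AssemblySet]:
    """Drop unstable sets and order the rest from most to least preferred."""
    stable = [c for c in candidates if is_stable(c)]
    return sorted(stable, key=rank_key)
```

For example, `rank_assembly_sets([AssemblySet([2], [15.0]), AssemblySet([6], [10.0]), AssemblySet([12], [-3.0])])` drops the unstable dodecamer and places the hexamer ahead of the dimer.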

Classification of protein assemblies
Assembly classification on the benchmark set of 218 protein structures (homomers and 22 heteromers) published in Ponstingl, H., Kabir, T. and Thornton, J. (2003) Automatic inference of protein quaternary structures from crystals. J. Appl. Cryst. 36.
[Table: assembly classification matrix. Rows: 1mer, 2mer, 3mer, 4mer, 6mer, Total. Columns: 1mer, 2mer, 3mer, 4mer, 6mer, Other, Sum, Correct (%).]
Classification error in free energy: ± 5 kcal/mol

Classification of protein-DNA complexes
Assembly classification on the benchmark set of 212 protein-DNA complexes published in Luscombe, N.M., Austin, S.E., Berman H.M. and Thornton, J.M. (2000) An overview of the structures of protein-DNA complexes. Genome Biol. 1.
[Table: assembly classification matrix. Rows: 2mer, 3mer, 4mer, 5mer, 6mer, 10mer, Total. Columns: 2mer, 3mer, 4mer, 5mer, 6mer, 10mer, Other, Sum, Correct (%). Total: 212 structures, 93% correct.]
Classification error in free energy: ± 5 kcal/mol

Free energy distribution of misclassifications

Example of misclassification: 1QEX
Bacteriophage T4 gene product 9 (gp9), the trigger of tail contraction and the long tail fibers connector
Predicted: homohexamer; dissociates into 2 trimers, 106 kcal/mol
Biological unit: homotrimer; dissociates into 3 monomers, 90 kcal/mol

Example of misclassification: 1QEX
Rossmann M.G., Mesyanzhinov V.V., Arisaka F. and Leiman P.G. (2004) The bacteriophage T4 DNA injection machine. Curr. Opinion Struct. Biol. 14.

Example of misclassification: 1QEX
Structures compared: 1QEX hexamer, 1QEX trimer, 1S2E trimer
Annotations: correct mainchain tracing; classed correctly; wrong mainchain tracing!

Example of misclassification: 1D3U
TATA-binding protein / transcription factor
Predicted: octamer; dissociates into 2 tetramers, 20 kcal/mol
Functional unit: tetramer

Example of misclassification: 1CRX
Cre recombinase / DNA complex reaction intermediate
Predicted: dodecamer; dissociates into 2 hexamers, 28 kcal/mol
Functional unit: trimer

Example of misclassification: 1CRX
Guo F., Gopaul D.N. and van Duyne G.D. (1997) Structure of Cre recombinase complexed with DNA in a site-specific recombination synapse. Nature 389:40-46.

Example of misclassification: 1TON
Tonin
Predicted: dimer; dissociates at 37 kcal/mol
Biological unit: monomer
The apparent dimerization is an artefact due to the presence of Zn2+ ions added to the buffer to aid crystallization. Removing Zn from the file lowers the dissociation free energy to 3 kcal/mol.
Fujinaga M., James M.N.G. (1987) Rat submaxillary gland serine protease, tonin structure solution and refinement at 1.8 Å resolution. J. Mol. Biol. 195.

Example of misclassification: 1YWK
Predicted: homohexameric, ΔG_diss 4.4 kcal/mol, dissociating into 3 dimers
Believed to be: monomeric
6 units in ASU
Structural homologue 1XRU: RMSD 0.9 Å, Seq.Id 50%, homohexameric with ΔG_diss 9.3 kcal/mol

Choice of ASU


Why does it work?
- 90% success rate achieved on the benchmark set
- Feedback from PDB and MSD curators suggests that 90%-95% of PISA classifications agree with intuitive and common-sense considerations
- Mandatory processing tool at wwPDB since 2007
- Average of 3 citations per week
- User feedback is encouraging
The problem with PISA is that, apparently, it works well.
Two possible reasons for PISA to work well:
- Energy models and calculations are quite accurate (obviously wrong).
- PISA relies heavily on the geometry of interactions given by the crystal structure (probably correct). PISA does not dock structures; rather, it uses nature's dockings, assuming that they are correct. In essence, it exploits a combination of chemistry and crystal informatics.

If this is all about crystal informatics, then...
Apparently, PISA gives a reasonably good solution for the crystal environment. But what is the relation between natural and crystallized structures?
- Do crystals always (or most probably) give the correct geometry of interactions?
- Do crystals always give correct (i.e. natural) structures and complexes?
- Can crystals misrepresent structures and interactions? If yes, how may such a case be identified?

Distortion and Re-assembly
A crystal optimizes the energy of the whole system, therefore it may sacrifice biologically relevant interactions in favour of unspecific contacts.
Distortion: probably, distortions are always there.
Re-assembly: there is a chance of re-assembly if the interaction is weak.

Docking experiment
Objectives:
- to find out whether PISA models can give the geometry of interactions
- to identify conditions for complex distortion and re-assembly
Data set: 4065 protein dimers identified by PISA; redundancy decreased by removing structures with high structure and sequence similarity
Rigid-body docking = rotation + translation
Idea: attempt to reproduce crystal dimers
- geometry is optimized by the crystal, so no conformation modelling is required
- if there are no re-assembly effects and PISA energies are good, all dimers should be found by docking
- any docking failures should be due to energy errors, or crystal effects, or both
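The phrase "rigid-body docking = rotation + translation" can be made concrete with a small sketch. It shows only the move applied to one partner's coordinates, not a docking search or an energy function; the function names are illustrative, and the rotation is restricted to the z axis to keep the example short.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float, float]
Matrix = Tuple[Vec, Vec, Vec]


def rotation_z(angle_rad: float) -> Matrix:
    """Rotation matrix for a turn of angle_rad about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def rigid_transform(coords: List[Vec], rot: Matrix, shift: Vec) -> List[Vec]:
    """Apply x' = R x + t to every atom; internal geometry is untouched."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + t
            for row, t in zip(rot, shift)
        ))
    return moved


def distance(a: Vec, b: Vec) -> float:
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
```

Because the transform is rigid, all interatomic distances within the moved partner stay the same, which is why no conformation modelling is involved: only the six rotational and translational degrees of freedom are explored.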

Docking results
4065 protein pairs docked
2520 came back to the significant crystal interface
1545 arrived at an interface not found in the crystal
38% failures
E. Krissinel (2010) J. Comp. Chem. 31

Fail rate of docking
The plot shows the probability that the docking algorithm fails as a function of the free energy of dimer dissociation. The probabilities were calculated using equipopulated bins. Overall, 38% failures.
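A minimal sketch of how a failure probability can be estimated in equipopulated bins: the dimers are sorted by dissociation free energy and cut into bins holding (nearly) equal numbers of dimers, and the failure fraction is computed per bin. The function name and the input layout are assumptions for illustration; the binning details used for the actual plot are not given in the slides.

```python
from typing import List, Tuple


def fail_rate_by_bin(
    energies: List[float], failed: List[bool], n_bins: int
) -> List[Tuple[float, float]]:
    """Return (mean energy, failure fraction) for each equipopulated bin."""
    # Sort dimer indices by energy so that each bin covers a contiguous range.
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    n = len(order)
    result = []
    for b in range(n_bins):
        # Integer arithmetic spreads any remainder evenly over the bins.
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue
        mean_energy = sum(energies[i] for i in idx) / len(idx)
        rate = sum(1 for i in idx if failed[i]) / len(idx)
        result.append((mean_energy, rate))
    return result
```

Equal bin populations give every point on the curve a comparable statistical weight, unlike equal-width bins, which can leave the high-energy tail nearly empty. The overall figure follows directly from the counts reported for the experiment: 1545 failures out of 4065 docked pairs is about 38%.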

Why may it fail? Thermodynamics of docking
All docking positions (dimers) are possible, but with different occurrence probabilities, both in solvent and in the crystal.
E. Krissinel (2010) J. Comp. Chem. 31

Crystal Misrepresentation Hypothesis: perfect docking, imperfect crystals
Docking always finds the highest-energy dimer.
But crystallization may capture any dimer i with probability P_i.
Then the probability for docking to fail (that is, to disagree with the crystal) is the probability that the crystal has captured any dimer other than the highest-energy one.
E. Krissinel (2010) J. Comp. Chem. 31
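The hypothesis can be sketched numerically under one loudly stated assumption: that the capture probabilities P_i follow Boltzmann weights in the dissociation free energy. The slides do not give the form of P_i, so this is an illustration of the reasoning rather than the formula from the cited paper; the temperature and function names are likewise illustrative.

```python
import math
from typing import List

# Gas constant in kcal/(mol K) times an assumed temperature of 298.15 K.
RT = 0.001987 * 298.15


def capture_probabilities(dg_diss: List[float]) -> List[float]:
    """Assumed Boltzmann weights: a higher dissociation energy is more probable."""
    top = max(dg_diss)
    # Subtracting the maximum keeps the exponentials from overflowing.
    weights = [math.exp((g - top) / RT) for g in dg_diss]
    total = sum(weights)
    return [w / total for w in weights]


def docking_fail_probability(dg_diss: List[float]) -> float:
    """Probability that the crystal holds a dimer other than the best one."""
    probs = capture_probabilities(dg_diss)
    best = max(range(len(dg_diss)), key=lambda i: dg_diss[i])
    return 1.0 - probs[best]
```

Under this assumption, two alternative dimers of equal energy give a failure probability of one half, while a 10 kcal/mol gap makes failure negligible. This matches the qualitative claim that weak complexes are the ones most likely to be misrepresented.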

Why may it fail? Another look: imperfect docking, perfect crystals
The crystal always captures the highest-energy dimer, but due to the finite accuracy of calculations, another dimer may appear as the best docking solution.
The resulting expression involves the error function; the math is complicated.
E. Krissinel (2010) J. Comp. Chem. 31
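To see why an error function appears, consider a deliberately simplified case that is not the derivation from the cited paper: only two candidate dimers, whose calculated energies each carry an independent Gaussian error of standard deviation sigma. The wrong dimer wins when the error in the energy difference exceeds the true gap.

```python
import math


def misorder_probability(gap: float, sigma: float) -> float:
    """Chance that calculation ranks the weaker of two dimers first.

    gap   -- true energy advantage of the correct dimer, kcal/mol (>= 0)
    sigma -- standard deviation of the error in each calculated energy
    """
    if sigma == 0.0:
        # Exact energies: a wrong ranking is only possible for a true tie.
        return 0.5 if gap == 0.0 else 0.0
    # The difference of two independent errors has deviation sigma * sqrt(2),
    # so P(difference < -gap) = 0.5 * erfc(gap / (2 * sigma)).
    return 0.5 * math.erfc(gap / (2.0 * sigma))
```

In this two-dimer picture the failure probability is one half for degenerate dimers and decays as the gap grows relative to the energy error. A realistic treatment has to account for all competing docking positions at once, which is where the mathematics becomes involved.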

Misrepresentation effects and docking errors
Plot legend: docking results; pure crystal misrepresentation effect (0 kcal/mol error substituted); effect of both crystal misrepresentation and energy errors (2.3 kcal/mol fitted).
E. Krissinel (2010) J. Comp. Chem. 31

Conclusions
- Chemical-thermodynamical models for protein complex stability allow one to recover biological units from protein crystallography data at an 80-90% success rate.
- A considerable part of the misclassifications is due to the difference between experimental and native environments and to artificial interactions induced by crystal packing.
- Crystals are likely to misrepresent weak macromolecular complexes.
- Protein interface and assembly analysis software (PISA) is available, please use it.

Acknowledgements
Kim Henrick, European Bioinformatics Institute: general introduction and PQS expertise
Mark Shenderovich, Structural Bioinformatics Inc.: helpful discussion
Hannes Ponstingl, Sanger Centre: sharing the expertise and benchmark data
Sergei Strelkov, University of Leuven: mystery of bacteriophage T4
MSD & PDB teams, EBI & Rutgers: everyday use of PISA, examples, verification and feedback
CCP4, Daresbury-York-Oxford-Cambridge: encouragement and publicity
~5000 PISA users, worldwide: using PISA and feedback
Biotechnology and Biological Sciences Research Council (BBSRC), UK: research grant No. 721/B19544
