Seminar series 2: Protein structure validation

Presentation transcript:

Seminar series 2 Protein structure validation

In the past lies the present; in the now, what is to come. The past: Linus Pauling, ‘inventor’ of helix and strand. Inventor of bioinformatics?! Worked on proteins.

The history of bioinformatics is proteins. The future of bioinformatics is proteins. Only the present is a bit confused…

Structure validation: everything that can go wrong, will go wrong, especially with things as complicated as protein structures.

What is real?

ATOM 1 N LEU
ATOM 2 CA LEU
ATOM 3 C LEU
ATOM 4 O LEU
ATOM 5 CB LEU
ATOM 6 CG LEU
ATOM 7 CD1 LEU
ATOM 8 CD2 LEU
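The ATOM records on this slide are abbreviated; in a real PDB file each record is a fixed-column line that also carries chain, residue number and coordinates. The following is a minimal sketch of reading such a line, assuming the standard PDB column layout. The `Atom` type, the `parse_atom_line` helper and the coordinates in the example record are hypothetical and only serve as illustration.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-column PDB ATOM/HETATM record (0-based slices)."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Hypothetical record: the coordinates are invented for illustration.
example = (
    f"{'ATOM':<6s}"   # columns 1-6, record name
    f"{2:5d}"         # columns 7-11, atom serial number
    f"{'':1s}"        # column 12, blank
    f"{' CA ':4s}"    # columns 13-16, atom name
    f"{'':1s}"        # column 17, alternate location
    f"{'LEU':3s}"     # columns 18-20, residue name
    f"{'':1s}"        # column 21, blank
    f"{'A':1s}"       # column 22, chain identifier
    f"{1:4d}"         # columns 23-26, residue number
    f"{'':4s}"        # columns 27-30, insertion code and blanks
    f"{11.0:8.3f}{22.5:8.3f}{-3.25:8.3f}"  # columns 31-54, x, y, z
)
atom = parse_atom_line(example)
print(atom)
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in real files can touch each other without a separating space.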

X-ray

‘FFT-inv’

X-ray R-factor: $\text{Error} = \sum w \cdot (\text{obs} - \text{calc})^2$; $\text{R-factor} = \sum w \cdot |\text{obs} - \text{calc}|$
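The two sums on the R-factor slide can be sketched in a few lines of Python. The function names, the unit default weights and the amplitudes in the example are assumptions made for illustration. The third function shows the conventional crystallographic R-factor, which additionally normalises the summed absolute differences by the summed observed amplitudes.

```python
def weighted_squared_error(obs, calc, weights=None):
    """Sum of w * (obs - calc)^2, the 'Error' term on the slide."""
    if weights is None:
        weights = [1.0] * len(obs)  # assumption: unit weights by default
    return sum(w * (o - c) ** 2 for w, o, c in zip(weights, obs, calc))


def weighted_absolute_error(obs, calc, weights=None):
    """Sum of w * |obs - calc|, the unnormalised 'R-factor' sum on the slide."""
    if weights is None:
        weights = [1.0] * len(obs)
    return sum(w * abs(o - c) for w, o, c in zip(weights, obs, calc))


def r_factor(f_obs, f_calc):
    """Conventional R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical structure-factor amplitudes, invented for illustration.
f_obs = [10.0, 20.0, 30.0]
f_calc = [9.0, 22.0, 30.0]
print(weighted_squared_error(f_obs, f_calc))
print(weighted_absolute_error(f_obs, f_calc))
print(r_factor(f_obs, f_calc))
```

Refinement programs typically minimise a squared-error target of the first kind, while the normalised R-factor is the number that gets reported.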

X-ray resolution

NMR data collection

NMR data: NMR data consist mainly of short inter-atomic distances, which we call NOEs. Most NOEs are between close neighbours in the sequence; those hold little information. The ‘good’ NOEs are between atoms far apart in the sequence, and normally there are few of those. NOEs are known with low precision. E.g. NOEs are binned , , and

NMR Q-factor: $\text{Error} = \sum (\text{NOE violations})^2 + \text{energy term}$
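The NMR error term can be sketched as follows, assuming each NOE is stored as an upper distance bound between two atoms and that a violation is the amount by which the model distance exceeds that bound. The function names, the exponent parameter `power`, and all coordinates, bounds and energy values are hypothetical; whether violations enter linearly or squared differs between programs, so the exponent is left as a parameter.

```python
import math


def noe_violations(coords, restraints):
    """Return one violation (in the units of coords) per restraint.

    coords: dict mapping atom label -> (x, y, z)
    restraints: list of (label_a, label_b, upper_bound)
    """
    violations = []
    for a, b, upper in restraints:
        distance = math.dist(coords[a], coords[b])
        violations.append(max(0.0, distance - upper))
    return violations


def nmr_error(violations, energy_term, power=2):
    """Sum of violations raised to `power`, plus an energy term."""
    return sum(v ** power for v in violations) + energy_term


# Hypothetical atoms and restraints, invented for illustration.
coords = {"a": (0.0, 0.0, 0.0), "b": (3.0, 4.0, 0.0), "c": (0.0, 0.0, 2.0)}
restraints = [("a", "b", 4.0), ("a", "c", 5.0)]
violations = noe_violations(coords, restraints)
print(violations, nmr_error(violations, energy_term=2.5))
```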

NMR versus X-ray

 | NMR | X-ray
‘Error’ | 1-2 Å | Å
Mobility | yes | not really
Crystal artefacts | no | yes
Material needed | 20 mg | 1 mg
Cost of hardware | 4 M Euro | near infinite (share)
Drug design | no | almost

Better to combine them and use the best of both worlds.

Why? Why does a sane (?) human being spend fourteen years searching for millions of errors in the PDB?

Because: Everything we know about proteins comes from PDB files. If a template is wrong the model will be wrong. Errors become less dangerous when you know about them.

What do we check? Administrative errors. Crystal-specific errors. NMR-specific errors. Really wrong things. Improbable things. Things worth looking at. Ad hoc things.

1FCC

Smile or cry? A: 5RXN, 1.2; B: 7GPB, 2.9; C: 1DLP, 3.3; D: 1BIW, 2.5

X-ray specific

Further…
4 The SCALE matrix gives a left handed axis system
26 Scale matrix represents wrong crystal class
4 Negated value in scale matrix
11 Value in first row of scale matrix mistyped
10 Value in second row of scale matrix mistyped
6 Value in third row of scale matrix mistyped
88 Determinant of MTRIX is incorrect
195 Warning: New symmetry found
62 Warning: MTRIX is not a pure rotation matrix
165 Warning: Duplicate atoms encountered.
57 Error: Threonine nomenclature problem
324 Error: Weights outside the range
709 Error: Weights outside the range
520 Error: Decreasing residue numbers
362 Error: Water clusters without contacts
10973 Warning: Water molecules need moving

Further, further…
1599 Error: B-factor over-refinement
901 Error: Atoms too close to symmetry axes
21090 Error: Abnormally short interatomic distances
169 Note: No Van der Waals overlaps
9100 Warning: Unusual bond lengths
8214 Warning: Possible cell scaling problem
18458 Warning: Unusual bond angles
2515 Error: Ramachandran Z-score very low
15408 Warning: Omega angles too tightly restrained
4987 Error: Side chain planarity problems
780 Warning: Inside/Outside residue distribution
12684 Warning: Backbone oxygen evaluation
18612 Error: HIS, ASN, GLN side chain flips
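A check such as "abnormally short interatomic distances" can be illustrated with a simple pairwise distance scan. This is only a sketch under stated assumptions: the `short_contacts` function, the 2.2 Å cutoff and the example atoms are hypothetical, and a real validation program would use element-dependent radii, exclude covalently bonded atoms systematically, and also consider symmetry-related copies.

```python
import math
from itertools import combinations


def short_contacts(atoms, cutoff=2.2, skip=frozenset()):
    """Return (label_a, label_b, distance) for pairs closer than cutoff.

    atoms: list of (label, (x, y, z)) tuples
    skip: set of frozenset({label_a, label_b}) pairs to ignore,
          for example covalently bonded atoms
    """
    hits = []
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2):
        if frozenset((label_a, label_b)) in skip:
            continue
        distance = math.dist(xyz_a, xyz_b)
        if distance < cutoff:
            hits.append((label_a, label_b, distance))
    return hits


# Hypothetical atoms, invented for illustration.
atoms = [
    ("A", (0.0, 0.0, 0.0)),
    ("B", (1.0, 0.0, 0.0)),
    ("C", (10.0, 0.0, 0.0)),
]
print(short_contacts(atoms))
```

The all-pairs loop is quadratic in the number of atoms; for whole structures a spatial grid or neighbour list would be the usual replacement.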

Little things hurt big

How bad is bad?

Errors or discoveries? Buried histidine: a warning for a buried histidine triggered biochemical follow-up and a new mechanism for the KH-module of Vigilin (A. Pastore, 1VIG).

Contact Probability

DACA

Contact probability box

Using contact probability

His, Asn, Gln ‘flips’

Where are the protons?

Hydrogen bond network

15% should be flipped

Your best check:

How difficult can it be? 1CBQ, 2.2 Å

How difficult can it be?

Progress: A. Chirality; B. Bond length; C. Planarity; D. Bond angle

Progress: E. Water island; F. Bond angle; G. Atom on axis; H. Chain name

Progress: Chi-1 vs Chi-2; Ramachandran. Structures at 1.8–2.0 Å
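Chi-1, Chi-2 and the phi/psi angles of a Ramachandran plot are all dihedral (torsion) angles defined by four consecutive atoms, as is the omega angle of the peptide bond. The sketch below computes such an angle from four points using the usual atan2 formulation, with cis at 0° and trans at 180°. The helper names and the example coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points p0-p1-p2-p3."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates, invented for illustration.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, quarter)
```

With backbone atoms as input (for instance C, N, CA, C for phi), the same function yields the values that are plotted against each other in a Ramachandran or Chi-1/Chi-2 plot.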

Conclusions Everything that could go wrong has gone wrong. Errors are on a ‘sliding scale’. Error detection can detect a lot, but surely not everything (yet).

Acknowledgements: Rob Hooft, Elmar Krieger, Sander Nabuurs, Chris Spronk, Robbie Joosten, Maarten Hekkelman