1
Seminar series 2 Protein structure validation
2
In the past lies the present; in the now, what will be. The past: Linus Pauling, ‘inventor’ of the helix and the strand. Inventor of bioinformatics?! Worked on proteins.
3
The history of bioinformatics is proteins. The future of bioinformatics is proteins. Only the present is a bit confused…
4
Structure validation Everything that can go wrong, will go wrong, especially with things as complicated as protein structures.
5
What is real?
6
ATOM  1  N    LEU  1  -15.159  11.595  27.068  1.00  18.46
ATOM  2  CA   LEU  1  -14.294  10.672  26.323  1.00   9.92
ATOM  3  C    LEU  1  -14.694   9.210  26.499  1.00  12.20
ATOM  4  O    LEU  1  -14.350   8.577  27.502  1.00  13.43
ATOM  5  CB   LEU  1  -12.829  10.836  26.772  1.00  13.48
ATOM  6  CG   LEU  1  -11.745  10.348  25.834  1.00  15.93
ATOM  7  CD1  LEU  1  -11.895  11.027  24.495  1.00  13.12
ATOM  8  CD2  LEU  1  -10.378  10.636  26.402  1.00  15.12
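As a rough illustration of what these records contain, the sketch below parses the whitespace-separated form shown on the slide into named fields and measures one bond length. The `Atom` type and `parse_atom_line` function are hypothetical names introduced here. Real PDB files are fixed-column and carry extra fields such as the chain identifier, so this is not a general PDB parser.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_line(line: str) -> Atom:
    """Parse one whitespace-separated ATOM record as printed on the slide."""
    fields = line.split()
    if len(fields) != 10 or fields[0] != "ATOM":
        raise ValueError(f"not a simple ATOM record: {line!r}")
    return Atom(
        int(fields[1]), fields[2], fields[3], int(fields[4]),
        float(fields[5]), float(fields[6]), float(fields[7]),
        float(fields[8]), float(fields[9]),
    )


def distance(a: Atom, b: Atom) -> float:
    """Euclidean distance between two atoms, in the units of the file (Å)."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


# First three records from the slide.
RECORDS = """\
ATOM 1 N LEU 1 -15.159 11.595 27.068 1.00 18.46
ATOM 2 CA LEU 1 -14.294 10.672 26.323 1.00 9.92
ATOM 3 C LEU 1 -14.694 9.210 26.499 1.00 12.20
"""

atoms = [parse_atom_line(line) for line in RECORDS.splitlines()]
n_ca = distance(atoms[0], atoms[1])  # N to CA bond length of the leucine
```

The N to CA distance computed this way comes out at about 1.47 Å, a normal value for that bond, which is the kind of geometric sanity check that validation builds on.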
7
X-ray
9
‘FFT-inv’ FFT-inv
10
X-ray R-factor
Error = Σ w·(obs − calc)²
R-factor = Σ w·|obs − calc|
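The two sums on this slide can be sketched in a few lines. The function names and the toy numbers below are illustrative assumptions, not data from the talk. The third function is the conventional, normalized crystallographic R-factor, Σ|obs − calc| / Σ|obs|, added for comparison because the slide gives only the unnormalized weighted sum.

```python
def weighted_error(obs, calc, weights):
    """Least-squares target as on the slide: sum of w * (obs - calc)^2."""
    return sum(w * (o - c) ** 2 for o, c, w in zip(obs, calc, weights))


def weighted_r(obs, calc, weights):
    """Slide form of the R-factor: sum of w * |obs - calc| (unnormalized)."""
    return sum(w * abs(o - c) for o, c, w in zip(obs, calc, weights))


def conventional_r(obs, calc):
    """Conventional R-factor: sum |obs - calc| divided by sum |obs|."""
    return sum(abs(o - c) for o, c in zip(obs, calc)) / sum(abs(o) for o in obs)


# Toy amplitudes, purely illustrative.
obs = [100.0, 50.0, 25.0]
calc = [90.0, 55.0, 25.0]
weights = [1.0, 1.0, 1.0]

err = weighted_error(obs, calc, weights)
r_slide = weighted_r(obs, calc, weights)
r_conv = conventional_r(obs, calc)
```

The squared form penalizes large discrepancies more heavily, while the absolute-value form is what is usually quoted as a quality figure.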
11
X-ray resolution
12
NMR data collection
13
NMR data NMR data consists mainly of short inter-atomic distances. We call these NOEs. Most NOEs are between close neighbours in the sequence. Those hold little information. The ‘good’ NOEs are between atoms far apart in the sequence. There are normally few of those. NOEs are known with low precision. E.g. NOEs are binned into 2.5-4.0, 4.0-5.5, and 5.5-7.0 Å.
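A minimal sketch of the binning described here, using the three distance ranges from the slide. The half-open boundary handling, the `min_separation=5` threshold for calling a restraint long-range, and both function names are assumptions made for illustration.

```python
# Distance bins from the slide, in Å.
NOE_BINS = [(2.5, 4.0), (4.0, 5.5), (5.5, 7.0)]


def noe_bin(distance):
    """Return the (lower, upper) bin containing the distance, or None.

    Bins are treated as half-open [lower, upper); this is an assumption.
    """
    for lower, upper in NOE_BINS:
        if lower <= distance < upper:
            return (lower, upper)
    return None


def is_long_range(res_i, res_j, min_separation=5):
    """True if the two residues are far apart in the sequence.

    The threshold of 5 residues is an assumed, illustrative cutoff.
    """
    return abs(res_i - res_j) >= min_separation


strong = noe_bin(3.1)
edge = noe_bin(4.0)
too_far = noe_bin(8.0)
```

Binning throws away precision on purpose: a restraint only says that a distance falls somewhere inside a range, which is why many restraints are needed to pin down a fold.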
14
NMR Q-factor
Error = Σ (NOE-violations)² + Energy term
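One plausible reading of this target function is a sum of squared NOE violations plus an energy term. The sketch below follows that reading. Treating a violation as the amount by which a model distance exceeds the restraint's upper bound, and passing the energy in as a plain number, are assumptions made for illustration.

```python
def noe_violation(distance, upper_bound):
    """Amount by which a model distance exceeds its NOE upper bound (Å)."""
    return max(0.0, distance - upper_bound)


def nmr_error(distances, upper_bounds, energy=0.0):
    """Sum of squared NOE violations plus a (precomputed) energy term."""
    violations = (noe_violation(d, u) for d, u in zip(distances, upper_bounds))
    return sum(v ** 2 for v in violations) + energy


# Two restraints: the first is satisfied, the second is violated by 0.5 Å.
total = nmr_error([3.0, 6.0], [4.0, 5.5], energy=1.0)
```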
15
NMR versus X-ray
                     NMR          X-ray
‘Error’              1-2 Å        0.1-0.5 Å
Mobility             yes          not really
Crystal artefacts    no           yes
Material needed      20 mg        1 mg
Cost of hardware     4 M Euro     near infinite (share)
Drug design          no           almost
Better combine and use the best of both worlds.
16
Why? Why does a sane (?) human being spend fourteen years searching for millions of errors in the PDB?
17
Because: Everything we know about proteins comes from PDB files. If a template is wrong the model will be wrong. Errors become less dangerous when you know about them.
18
What do we check? Administrative errors. Crystal-specific errors. NMR-specific errors. Really wrong things. Improbable things. Things worth looking at. Ad hoc things.
19
1FCC
20
Smile or cry? A: 5RXN (1.2 Å); B: 7GPB (2.9 Å); C: 1DLP (3.3 Å); D: 1BIW (2.5 Å)
21
X-ray specific
22
Further…
    4  The SCALE matrix gives a left handed axis system
   26  Scale matrix represents wrong crystal class
    4  Negated value in scale matrix
   11  Value in first row of scale matrix mistyped
   10  Value in second row of scale matrix mistyped
    6  Value in third row of scale matrix mistyped
   88  Determinant of MTRIX is incorrect
  195  Warning: New symmetry found
   62  Warning: MTRIX is not a pure rotation matrix
  165  Warning: Duplicate atoms encountered.
   57  Error: Threonine nomenclature problem
  324  Error: Weights outside the 0.0 -- 1.0 range
  709  Error: Weights outside the 0.0 -- 1.0 range
  520  Error: Decreasing residue numbers
  362  Error: Water clusters without contacts
10973  Warning: Water molecules need moving
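Several of these administrative checks are simple enough to sketch directly. The three functions below are hypothetical illustrations, not the code of any validation program, and reading ‘weights’ as atomic occupancies is an assumption.

```python
def weights_out_of_range(occupancies):
    """Indices of atoms whose weight lies outside the 0.0 -- 1.0 range."""
    return [i for i, occ in enumerate(occupancies) if not 0.0 <= occ <= 1.0]


def decreasing_residue_numbers(res_numbers):
    """Indices at which the residue number drops below the previous one."""
    return [
        i for i in range(1, len(res_numbers))
        if res_numbers[i] < res_numbers[i - 1]
    ]


def duplicate_atoms(atom_keys):
    """Atom keys, e.g. (residue number, atom name), seen more than once."""
    seen = set()
    duplicates = []
    for key in atom_keys:
        if key in seen:
            duplicates.append(key)
        else:
            seen.add(key)
    return duplicates


bad_weights = weights_out_of_range([1.0, 0.5, 1.2, -0.1])
drops = decreasing_residue_numbers([1, 2, 3, 2, 4])
dups = duplicate_atoms([(1, "N"), (1, "CA"), (1, "N")])
```

Checks like these need no knowledge of chemistry at all, which is why they catch bookkeeping mistakes rather than modelling mistakes.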
23
Further, further…
 1599  Error: B-factor over-refinement
  901  Error: Atoms too close to symmetry axes
21090  Error: Abnormally short interatomic distances
  169  Note: No Van der Waals overlaps
 9100  Warning: Unusual bond lengths
 8214  Warning: Possible cell scaling problem
18458  Warning: Unusual bond angles
 2515  Error: Ramachandran Z-score very low
15408  Warning: Omega angles too tightly restrained
 4987  Error: Side chain planarity problems
  780  Warning: Inside/Outside residue distribution
12684  Warning: Backbone oxygen evaluation
18612  Error: HIS, ASN, GLN side chain flips
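The most frequent error in this list, abnormally short interatomic distances, can be sketched as a pairwise distance scan. The fixed 2.2 Å cutoff, the rule of skipping atoms in the same or neighbouring residues as a crude stand-in for excluding bonded pairs, and the toy coordinates are all assumptions made for illustration. A real check would use per-element radii and the actual bonding pattern.

```python
import math
from itertools import combinations


def short_contacts(atoms, min_distance=2.2):
    """Find atom pairs that are closer than min_distance (Å).

    atoms: list of (label, residue_number, (x, y, z)) tuples.
    Pairs in the same or in adjacent residues are skipped, since those
    may be covalently bonded (a simplification).
    """
    hits = []
    for (label_a, res_a, xyz_a), (label_b, res_b, xyz_b) in combinations(atoms, 2):
        if abs(res_a - res_b) <= 1:
            continue
        d = math.dist(xyz_a, xyz_b)
        if d < min_distance:
            hits.append((label_a, label_b, d))
    return hits


# Toy coordinates, purely illustrative.
toy_atoms = [
    ("A", 1, (0.0, 0.0, 0.0)),
    ("B", 5, (1.5, 0.0, 0.0)),   # 1.5 Å from A: flagged
    ("C", 9, (10.0, 0.0, 0.0)),  # far from everything
    ("D", 2, (0.5, 0.0, 0.0)),   # close to A, but in the adjacent residue
]

bumps = short_contacts(toy_atoms)
```

The scan is quadratic in the number of atoms, which is acceptable for a sketch; production code would use a spatial grid or neighbour list.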
24
Little things hurt big
25
How bad is bad?
26
Errors or discoveries? Buried histidine. A warning about a buried histidine triggered biochemical follow-up and a new mechanism for the KH module of Vigilin (A. Pastore, 1VIG).
27
Contact Probability
29
DACA
34
Contact probability box
35
Using contact probability
36
His, Asn, Gln ‘flips’
37
Where are the protons?
38
Hydrogen bond network
39
15% should be flipped
40
Your best check:
41
How difficult can it be? 1CBQ, 2.2 Å
42
How difficult can it be?
43
Progress. A: Chirality; B: Bond length; C: Planarity; D: Bond angle
44
Progress. E: Water island; F: Bond angle; G: Atom on axis; H: Chain name
45
Progress. Chi-1 vs Chi-2; Ramachandran. Structures at 1.8 – 2.0 Å
46
Conclusions Everything that could go wrong has gone wrong. Errors are on a ‘sliding scale’. Error detection can detect a lot, but surely not everything (yet).
47
Acknowledgements: Rob Hooft, Elmar Krieger, Sander Nabuurs, Chris Spronk, Robbie Joosten, Maarten Hekkelman