Seminar series 2: Protein structure validation


1 Seminar series 2 Protein structure validation

2 “In the past lies the present; in the now, what will be.” The past: Linus Pauling, ‘inventor’ of the helix and the strand. Inventor of bioinformatics?! Worked on proteins.

3 The history of bioinformatics is proteins. The future of bioinformatics is proteins. Only the present is a bit confused…

4 Structure validation Everything that can go wrong, will go wrong, especially with things as complicated as protein structures.

5 What is real?

6
ATOM   1  N    LEU  1   -15.159  11.595  27.068  1.00  18.46
ATOM   2  CA   LEU  1   -14.294  10.672  26.323  1.00   9.92
ATOM   3  C    LEU  1   -14.694   9.210  26.499  1.00  12.20
ATOM   4  O    LEU  1   -14.350   8.577  27.502  1.00  13.43
ATOM   5  CB   LEU  1   -12.829  10.836  26.772  1.00  13.48
ATOM   6  CG   LEU  1   -11.745  10.348  25.834  1.00  15.93
ATOM   7  CD1  LEU  1   -11.895  11.027  24.495  1.00  13.12
ATOM   8  CD2  LEU  1   -10.378  10.636  26.402  1.00  15.12
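The records on this slide can be read programmatically. The following is a minimal sketch, not the parser of any particular package: it assumes the simplified, whitespace-separated layout shown on the slide (real PDB files use fixed columns, so splitting on whitespace is not safe in general), and it assumes the last two numbers are occupancy and B-factor. The names `parse_atoms` and `distance` are made up for illustration.

```python
import math

# The eight records from the slide. Assumed field order: record type, serial,
# atom name, residue name, residue number, x, y, z, occupancy, B-factor.
RECORDS = """\
ATOM 1 N LEU 1 -15.159 11.595 27.068 1.00 18.46
ATOM 2 CA LEU 1 -14.294 10.672 26.323 1.00 9.92
ATOM 3 C LEU 1 -14.694 9.210 26.499 1.00 12.20
ATOM 4 O LEU 1 -14.350 8.577 27.502 1.00 13.43
ATOM 5 CB LEU 1 -12.829 10.836 26.772 1.00 13.48
ATOM 6 CG LEU 1 -11.745 10.348 25.834 1.00 15.93
ATOM 7 CD1 LEU 1 -11.895 11.027 24.495 1.00 13.12
ATOM 8 CD2 LEU 1 -10.378 10.636 26.402 1.00 15.12
"""


def parse_atoms(text):
    """Return a dict mapping atom name to its parsed fields."""
    atoms = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not a ten-field ATOM record.
        if len(fields) != 10 or fields[0] != "ATOM":
            continue
        atoms[fields[2]] = {
            "residue": fields[3],
            "resnum": int(fields[4]),
            "xyz": tuple(float(v) for v in fields[5:8]),
            "occupancy": float(fields[8]),
            "b_factor": float(fields[9]),
        }
    return atoms


def distance(atom_a, atom_b):
    """Euclidean distance in Å between two parsed atoms."""
    return math.dist(atom_a["xyz"], atom_b["xyz"])


atoms = parse_atoms(RECORDS)
print(round(distance(atoms["N"], atoms["CA"]), 3))  # N-CA bond length in Å
```

Distances derived this way are the raw material for most of the geometric checks listed later in the talk.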

7 X-ray

8

9 ‘FFT-inv’

10 X-ray R-factor
Error = Σ w·(obs − calc)²
R-factor = Σ w·|obs − calc|
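As a rough illustration of the two sums on this slide, here is a small sketch. The function names and the toy numbers are invented for the example. The slide writes the R-factor without a denominator; the form usually quoted in crystallography divides the summed absolute differences by the summed observed amplitudes, so that variant is included as well.

```python
def weighted_squared_error(obs, calc, weights):
    """Sum of w * (obs - calc)^2, the 'Error' on the slide."""
    return sum(w * (o - c) ** 2 for o, c, w in zip(obs, calc, weights))


def weighted_abs_residual(obs, calc, weights):
    """Sum of w * |obs - calc|, the R-factor as written on the slide."""
    return sum(w * abs(o - c) for o, c, w in zip(obs, calc, weights))


def normalised_r_factor(obs, calc):
    """Conventional form: sum |obs - calc| divided by sum |obs|."""
    return sum(abs(o - c) for o, c in zip(obs, calc)) / sum(abs(o) for o in obs)


# Toy amplitudes, made up purely for illustration.
obs = [10.0, 20.0, 30.0]
calc = [9.0, 22.0, 30.0]
weights = [1.0, 1.0, 1.0]

print(weighted_squared_error(obs, calc, weights))
print(weighted_abs_residual(obs, calc, weights))
print(normalised_r_factor(obs, calc))
```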

11 X-ray resolution

12 NMR data collection

13 NMR data. NMR data consist mainly of short interatomic distances. We call these NOEs. Most NOEs are between close neighbours in the sequence. Those hold little information. The ‘good’ NOEs are between atoms far apart in the sequence. There are normally few of those. NOEs are known with low precision, e.g. NOEs are binned as 2.5-4.0, 4.0-5.5, and 5.5-7.0 Å.
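The binning described on this slide can be sketched as follows. The bin edges are the ones on the slide, taken here to be in Å; treating each bin as half-open, and the cut-off of five residues for calling an NOE "far apart in the sequence", are assumptions made for this example only.

```python
# Bin edges from the slide, assumed to be in Å.
NOE_BINS = [(2.5, 4.0), (4.0, 5.5), (5.5, 7.0)]


def noe_bin(distance):
    """Return the (lower, upper) bin containing distance, or None.

    Bins are treated as half-open [lower, upper); this is an assumption.
    """
    for lower, upper in NOE_BINS:
        if lower <= distance < upper:
            return (lower, upper)
    return None


def is_long_range(res_i, res_j, min_separation=5):
    """True if two residues are far apart in the sequence.

    min_separation=5 is a hypothetical cut-off, not taken from the talk.
    """
    return abs(res_i - res_j) >= min_separation


print(noe_bin(3.2))          # falls in the shortest bin
print(is_long_range(4, 5))   # sequential neighbours: little information
print(is_long_range(4, 60))  # one of the 'good' NOEs
```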

14 NMR Q-factor
Error = Σ (NOE violations)² + energy term
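A possible reading of this error term, as a sketch only: a violation is taken to be the amount by which a model distance exceeds the upper bound of its NOE bin, and the violations are raised to a power before summing. Whether the slide intends squared violations is not certain, so the exponent is left as a parameter. All names and numbers are hypothetical.

```python
def noe_violation(model_distance, upper_bound):
    """Amount (Å) by which the model distance exceeds the NOE upper bound."""
    return max(0.0, model_distance - upper_bound)


def nmr_error(model_distances, upper_bounds, energy_term=0.0, power=2):
    """Sum of violations raised to `power`, plus an energy term.

    power=2 is an assumption about what the slide means.
    """
    total = sum(
        noe_violation(d, u) ** power
        for d, u in zip(model_distances, upper_bounds)
    )
    return total + energy_term


# Made-up distances and bounds: only the second restraint is violated (by 0.5 Å).
model = [3.5, 6.0, 5.0]
bounds = [4.0, 5.5, 5.5]
print(nmr_error(model, bounds))
print(nmr_error(model, bounds, energy_term=1.0))
```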

15 NMR versus X-ray
                     NMR         X-ray
‘Error’              1-2 Å       0.1-0.5 Å
Mobility             yes         not really
Crystal artefacts    no          yes
Material needed      20 mg       1 mg
Cost of hardware     4 M Euro    near infinite (share)
Drug design          no          almost
Better combine and use the best of both worlds.

16 Why? Why does a sane (?) human being spend fourteen years searching for millions of errors in the PDB?

17 Because: Everything we know about proteins comes from PDB files. If a template is wrong the model will be wrong. Errors become less dangerous when you know about them.

18 What do we check? Administrative errors. Crystal-specific errors. NMR-specific errors. Really wrong things. Improbable things. Things worth looking at. Ad hoc things.

19 1FCC

20 Smile or cry? A: 5RXN, 1.2 Å. B: 7GPB, 2.9 Å. C: 1DLP, 3.3 Å. D: 1BIW, 2.5 Å.

21 X-ray specific

22 Further…
4  The SCALE matrix gives a left handed axis system
26  Scale matrix represents wrong crystal class
4  Negated value in scale matrix
11  Value in first row of scale matrix mistyped
10  Value in second row of scale matrix mistyped
6  Value in third row of scale matrix mistyped
88  Determinant of MTRIX is incorrect
195  Warning: New symmetry found
62  Warning: MTRIX is not a pure rotation matrix
165  Warning: Duplicate atoms encountered.
57  Error: Threonine nomenclature problem
324  Error: Weights outside the 0.0 -- 1.0 range
709  Error: Weights outside the 0.0 -- 1.0 range
520  Error: Decreasing residue numbers
362  Error: Water clusters without contacts
10973  Warning: Water molecules need moving
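Several of the checks on this slide reduce to simple arithmetic. The sketch below shows one plausible way to test for them; it is not the code behind the counts above. It assumes that a negative determinant is what marks a left-handed axis system, that a pure rotation is an orthonormal 3x3 matrix with determinant +1, and that "weights" means the occupancy column. Function names and tolerances are invented.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def is_left_handed(m):
    """A negative determinant inverts the handedness of the axis system."""
    return det3(m) < 0


def is_pure_rotation(m, tol=1e-3):
    """True if rows are orthonormal and the determinant is +1 (within tol)."""
    for i in range(3):
        for j in range(3):
            dot = sum(m[i][k] * m[j][k] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return abs(det3(m) - 1.0) <= tol


def weights_out_of_range(occupancies):
    """Indices of occupancies outside the 0.0 -- 1.0 range."""
    return [i for i, w in enumerate(occupancies) if not 0.0 <= w <= 1.0]


def decreasing_residue_numbers(resnums):
    """Consecutive (previous, next) pairs where the residue number drops."""
    return [(a, b) for a, b in zip(resnums, resnums[1:]) if b < a]


rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
mirror = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
print(is_pure_rotation(rot_z_90), is_left_handed(mirror))
print(weights_out_of_range([1.0, 1.2, -0.1, 0.5]))
print(decreasing_residue_numbers([1, 2, 3, 2, 5]))
```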

23 Further, further…
1599  Error: B-factor over-refinement
901  Error: Atoms too close to symmetry axes
21090  Error: Abnormally short interatomic distances
169  Note: No Van der Waals overlaps
9100  Warning: Unusual bond lengths
8214  Warning: Possible cell scaling problem
18458  Warning: Unusual bond angles
2515  Error: Ramachandran Z-score very low
15408  Warning: Omega angles too tightly restrained
4987  Error: Side chain planarity problems
780  Warning: Inside/Outside residue distribution
12684  Warning: Backbone oxygen evaluation
18612  Error: HIS, ASN, GLN side chain flips
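Many of these warnings come down to asking how far an observed value lies from an expected one, in units of the expected spread. A hedged sketch for bond lengths follows. The target means and standard deviations are illustrative assumptions for backbone bonds, not values taken from the talk, and the cut-off of 4 standard deviations is likewise only an example.

```python
import math

# Assumed target (mean, standard deviation) in Å; illustrative values only.
TARGETS = {
    ("N", "CA"): (1.458, 0.019),
    ("CA", "C"): (1.525, 0.021),
    ("C", "O"): (1.231, 0.020),
}


def bond_z(bond, length):
    """Z-score of an observed bond length against its assumed target."""
    mean, sd = TARGETS[bond]
    return (length - mean) / sd


def rms_z(z_scores):
    """Root-mean-square Z; values far below 1 hint at over-tight restraints."""
    return math.sqrt(sum(z * z for z in z_scores) / len(z_scores))


def unusual_bonds(observed, cutoff=4.0):
    """Bonds whose |Z| exceeds the (assumed) cutoff."""
    return [bond for bond, length in observed.items()
            if abs(bond_z(bond, length)) > cutoff]


# Hypothetical observed lengths in Å; the C-O bond is deliberately too long.
observed = {("N", "CA"): 1.468, ("CA", "C"): 1.526, ("C", "O"): 1.350}
print(unusual_bonds(observed))
print(round(rms_z([bond_z(b, d) for b, d in observed.items()]), 2))
```

The same pattern applies to bond angles and torsion angles, given suitable target distributions.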

24 Little things hurt big

25 How bad is bad?

26 Errors or discoveries? Buried histidine. A warning for a buried histidine triggered biochemical follow-up and a new mechanism for the KH module of Vigilin (A. Pastore, 1VIG).

27 Contact Probability

28

29 DACA

30

31

32

33

34 Contact probability box

35 Using contact probability

36 His, Asn, Gln ‘flips’

37 Where are the protons?

38 Hydrogen bond network

39 15% should be flipped
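What a ‘flip’ does to the coordinates can be approximated by exchanging the positions of the side-chain atoms that are hard to tell apart in electron density. The sketch below shows only that exchange, under the assumption that a 180° flip is well approximated by swapping atom positions. It does not decide whether a flip is needed; that requires evaluating the hydrogen bond network, which is not attempted here. The function name and coordinates are hypothetical.

```python
# Atom pairs whose positions are exchanged by an (approximate) 180° flip.
FLIP_SWAPS = {
    "ASN": [("OD1", "ND2")],
    "GLN": [("OE1", "NE2")],
    "HIS": [("ND1", "CD2"), ("CE1", "NE2")],
}


def flip_side_chain(resname, coords):
    """Return a copy of coords with the flip pairs of resname exchanged.

    coords maps atom name to an (x, y, z) tuple. Residue types other than
    His, Asn and Gln are returned unchanged.
    """
    flipped = dict(coords)
    for atom_a, atom_b in FLIP_SWAPS.get(resname, []):
        flipped[atom_a], flipped[atom_b] = coords[atom_b], coords[atom_a]
    return flipped


# Made-up coordinates for the terminal atoms of an asparagine side chain.
asn = {"CG": (0.0, 0.0, 0.0), "OD1": (1.2, 0.0, 0.0), "ND2": (-0.7, 1.1, 0.0)}
print(flip_side_chain("ASN", asn))
```

In a real check, both orientations would be scored against the surrounding hydrogen bond donors and acceptors and the better one kept.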

40 Your best check:

41 How difficult can it be? 1CBQ, 2.2 Å

42 How difficult can it be?

43 Progress. A: Chirality. B: Bond length. C: Planarity. D: Bond angle.

44 Progress. E: Water island. F: Bond angle. G: Atom on axis. H: Chain name.

45 Progress. Chi-1 vs Chi-2. Ramachandran. Structures at 1.8-2.0 Å.

46 Conclusions Everything that could go wrong has gone wrong. Errors are on a ‘sliding scale’. Error detection can detect a lot, but surely not everything (yet).

47 Acknowledgements: Rob Hooft, Elmar Krieger, Sander Nabuurs, Chris Spronk, Robbie Joosten, Maarten Hekkelman

