Direct Methods in Protein Crystallography
H.F. Fan & Y.X. Gu
Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, P.R. China
The phase problem & direct methods
Sayre’s equation & tangent formula
Use of direct methods in protein crystallography
Direct-method SAD/SIR phasing
Direct-method aided model completion
The Phase Problem
The Point of View from Direct Methods: Phases are not missing but just hidden in the magnitudes!
What is a direct method? It derives phases directly from the magnitudes.
Why is it possible?
Each reflection is accompanied by an unknown phase, but yields two simultaneous equations. Hence in theory, a diffraction data set of 3n reflections can be used to solve a structure with n independent atoms (assuming 3 parameters per atom). That is to say, the phases may, at least in theory, be derived from a large enough set of magnitudes given the known quantities of atomic scattering factors.
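The counting argument can be written out explicitly. This is a sketch under the usual assumptions: N measured reflections, n independent atoms at unknown positions r_j, and known scattering factors f_j (the symbols N, f_j and r_j are introduced here for illustration). Each measured magnitude gives a real and an imaginary equation:

$$
|F_{\mathbf h}|\cos\varphi_{\mathbf h} = \sum_{j=1}^{n} f_j\cos(2\pi\,\mathbf h\cdot\mathbf r_j),
\qquad
|F_{\mathbf h}|\sin\varphi_{\mathbf h} = \sum_{j=1}^{n} f_j\sin(2\pi\,\mathbf h\cdot\mathbf r_j)
$$

so N reflections supply 2N equations for N unknown phases plus 3n unknown coordinates:

$$
2N \ge N + 3n \quad\Longrightarrow\quad N \ge 3n
$$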
Sayre’s Equation
Conditions for the Sayre Equation to be valid:
1. Positivity
2. Atomicity
3. Equal-atom structure
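The role of the three conditions can be checked numerically. In its standard textbook form (the slide's own notation is not preserved here) Sayre's equation reads $F_{\mathbf h} = (\theta_{\mathbf h}/V)\sum_{\mathbf h'} F_{\mathbf h'}F_{\mathbf h-\mathbf h'}$, where $\theta_{\mathbf h}$ is the ratio of the transform of one atom to that of the squared atom. The following is a minimal sketch on a hypothetical one-dimensional cell with Gaussian atoms (V = 1 in fractional units); all names and parameter values are illustrative assumptions.

```python
import numpy as np

def periodic_gaussian(x, centre, width):
    # Gaussian "atom" on a periodic unit cell (fractional coordinates).
    d = (x - centre + 0.5) % 1.0 - 0.5
    return np.exp(-0.5 * (d / width) ** 2)

def sayre_residual(weights=(1.0, 1.0, 1.0), positions=(0.13, 0.41, 0.72),
                   width=0.012, n_grid=256, h_max=20):
    """Relative mismatch between F_h and theta_h * sum_h' F_h' F_(h-h')."""
    x = np.arange(n_grid) / n_grid
    rho = np.zeros(n_grid)
    for w, p in zip(weights, positions):
        rho += w * periodic_gaussian(x, p, width)   # positive, atomic density
    F = np.fft.fft(rho) / n_grid                     # structure factors
    atom = periodic_gaussian(x, 0.0, width)
    # theta_h: transform of one atom over transform of the squared atom.
    theta = (np.fft.fft(atom)[: h_max + 1]
             / np.fft.fft(atom ** 2)[: h_max + 1]).real
    k = np.arange(n_grid)
    # Self-convolution of the structure factors (transform of rho squared).
    conv = np.array([np.sum(F[k] * F[(h - k) % n_grid])
                     for h in range(h_max + 1)])
    lhs = F[: h_max + 1]
    rhs = theta * conv
    return float(np.max(np.abs(lhs - rhs)) / np.max(np.abs(lhs)))
```

With equal, well-resolved atoms the residual is at rounding level; giving the atoms unequal weights (for example `weights=(1, 2, 3)`) violates the equal-atom condition and the equation no longer holds.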
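The tangent formula

$$
\tan\beta_{\mathbf h} =
\frac{\sum_{\mathbf h'} \kappa_{\mathbf h,\mathbf h'}\sin(\varphi_{\mathbf h'}+\varphi_{\mathbf h-\mathbf h'})}
     {\sum_{\mathbf h'} \kappa_{\mathbf h,\mathbf h'}\cos(\varphi_{\mathbf h'}+\varphi_{\mathbf h-\mathbf h'})}
$$

where $\beta_{\mathbf h}$ is the estimate of the phase $\varphi_{\mathbf h}$ and

$$
\alpha_{\mathbf h}\sin\beta_{\mathbf h} = \sum_{\mathbf h'} \kappa_{\mathbf h,\mathbf h'}\sin(\varphi_{\mathbf h'}+\varphi_{\mathbf h-\mathbf h'}),
\qquad
\alpha_{\mathbf h}\cos\beta_{\mathbf h} = \sum_{\mathbf h'} \kappa_{\mathbf h,\mathbf h'}\cos(\varphi_{\mathbf h'}+\varphi_{\mathbf h-\mathbf h'})
$$

$$
\kappa_{\mathbf h,\mathbf h'} = 2\,\sigma_3\,\sigma_2^{-3/2}\,\left|E_{\mathbf h}E_{\mathbf h'}E_{\mathbf h-\mathbf h'}\right|
$$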
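As a minimal sketch of how the tangent formula is evaluated for one reflection h, the function below sums the weighted sine and cosine contributions of the contributing pairs (h', h − h') and returns the phase estimate together with its reliability. The function name and the input layout (a list of tuples) are assumptions made for illustration only.

```python
import math

def tangent_formula(terms):
    """Estimate the phase of reflection h from its triplet contributions.

    terms: iterable of (kappa, phi_h_prime, phi_h_minus_h_prime),
           with kappa >= 0 and phases in radians.
    Returns (beta, alpha): the phase estimate in [0, 2*pi) and the
    magnitude of the resultant, which measures its reliability.
    """
    terms = list(terms)  # allow generators; the data are traversed twice
    s = sum(k * math.sin(p1 + p2) for k, p1, p2 in terms)
    c = sum(k * math.cos(p1 + p2) for k, p1, p2 in terms)
    beta = math.atan2(s, c) % (2.0 * math.pi)
    alpha = math.hypot(s, c)
    return beta, alpha
```

Using `atan2` rather than a plain arctangent of the ratio keeps the estimate in the correct quadrant; contributions that agree reinforce each other (large alpha), while contradictory ones cancel (alpha near zero).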
Use of direct methods in Protein Crystallography
Locating heavy atoms
Ab initio phasing of protein diffraction data at 1.2Å or higher resolution: SnB, SHELXD, ACORN
Direct-method aided SAD/SIR phasing and structure-model completion: OASIS
Direct methods breaking the SAD/SIR phase ambiguity
Phase information available in SAD
Bimodal distribution from SAD: two peaks placed symmetrically about the phase of F″
Cochran distribution: peaked anywhere from 0 to 2π
Sim distribution: peaked at the phase of the known partial structure
Two different kinds of initial SAD phases: P+-modified phases and Sim-modified phases (figure: combinations of the bimodal, Sim and Cochran phase distributions)
The P+ formula
Reducing the phase problem to a sign problem (+ or −)
Breaking the SAD/SIR phase ambiguity by the Cochran distribution incorporated with partial-structure information
Acta Cryst. A40, (1984); Acta Cryst. A40, (1984); Acta Cryst. A41, (1985)
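Once the sign probability P+ is known for a reflection, the two-valued phase choice can be condensed into a single phase and weight. The sketch below assumes that the phase doublet is written as φ = φ_ref ± |Δφ| and takes the centroid of the resulting two-point distribution; this is a generic illustration of the step, not the OASIS implementation, and the names `phi_ref`, `abs_dphi` and `p_plus` are hypothetical.

```python
import cmath
import math

def best_phase(phi_ref, abs_dphi, p_plus):
    """Centroid of the two-valued distribution phi = phi_ref +/- |dphi|.

    phi_ref : reference phase in radians
    abs_dphi: magnitude |dphi| of the phase difference in radians
    p_plus  : probability that the sign of dphi is positive (0..1)
    Returns (phase, m): centroid phase in [0, 2*pi) and figure of merit m.
    """
    centroid = (p_plus * cmath.exp(1j * abs_dphi)
                + (1.0 - p_plus) * cmath.exp(-1j * abs_dphi))
    m = abs(centroid)  # equals sqrt(cos^2 + (2*P+ - 1)^2 * sin^2)
    phase = (phi_ref + cmath.phase(centroid)) % (2.0 * math.pi)
    return phase, m
```

A probability near 1 or 0 selects one member of the doublet with weight close to 1, whereas P+ = 0.5 leaves the ambiguity unresolved and returns the reference phase with the reduced weight |cos Δφ|.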
Comparison of 4 typical reflections from the protein histone methyltransferase SET 7/9
Comparison of cumulative phase errors in descending order of F obs: histone methyltransferase SET 7/9
Plot axes: errors of P+-modified phases (°) and errors of Sim-modified phases (°) against number of reflections
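A curve of this kind can be computed as sketched below. The sketch assumes an unweighted running mean of the absolute phase error against reference phases from the final model; the weighting actually used for the plot is not stated, and the function and argument names are hypothetical.

```python
import numpy as np

def cumulative_phase_error(f_obs, phases, ref_phases):
    """Running mean of absolute phase errors, strongest reflections first.

    f_obs      : observed structure-factor magnitudes
    phases     : phases under test, in degrees
    ref_phases : reference phases (e.g. from the final model), in degrees
    Returns an array whose k-th entry is the mean error of the k+1
    reflections with the largest f_obs.
    """
    f_obs = np.asarray(f_obs, dtype=float)
    diff = np.asarray(phases, dtype=float) - np.asarray(ref_phases, dtype=float)
    err = np.abs((diff + 180.0) % 360.0 - 180.0)   # wrap into [0, 180]
    order = np.argsort(-f_obs)                      # descending F obs
    return np.cumsum(err[order]) / np.arange(1, len(order) + 1)
```

Wrapping the difference into [0°, 180°] matters because phases are periodic: 350° and 10° differ by 20°, not 340°.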
Direct-method phasing of the 2Å experimental SAD data of the protein aPP
Avian Pancreatic Polypeptide
Space group: C2
Unit cell: a = 34.18, b = 32.92, c = 28.44Å; β = …°
Protein atoms in ASU: 301
Resolution limit: 2.0Å
Anomalous scatterer: Hg (in centric arrangement)
Wavelength: 1.542Å (Cu-Kα); f″ = …
Locating heavy atoms & SAD phasing: direct methods, Acta Cryst. A46, 935 (1990)
Data courtesy of Professor Tom Blundell
Further developments
Direct-method SAD/SIR phasing combined with density modification: OASIS + DM, OASIS + RESOLVE, SOLVE/RESOLVE + OASIS
Direct-methods aided dual-space structure-model completion: ARP/wARP + OASIS, PHENIX + OASIS
TTHA1634 from Thermus thermophilus HB8
Data courtesy of Professor Nobuhisa Watanabe, Department of Biotechnology and Biomaterial Chemistry, Nagoya University, Japan
Space group: P…
Unit cell: a = …, b = …, c = … Å
Number of residues in the AU: 1206
Resolution limit: 2.1Å
Multiplicity: 29.2
Anomalous scatterer: S (22)
X-ray wavelength: λ = 1.542Å (Cu-Kα)
Bijvoet ratio: ⟨|ΔF|⟩/⟨F⟩ = 0.55%
Phasing method: a single run of OASIS + DM (Cowtan)
Model building: ARP/wARP
ARP/wARP found 1178 of the total 1206 residues, all docked into the sequence.
Ribbon model plotted by PyMOL
Dual-space fragment extension
Partial structure → Reciprocal-space fragment extension (OASIS + DM) → Real-space fragment extension (RESOLVE BUILD and/or ARP/wARP) → Partial model → OK? Yes: End. No: the partial model is fed back as the partial structure.
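The control flow of this iteration can be sketched as a simple loop. The three callables stand in for the external programs named in the flowchart (OASIS + DM for the reciprocal-space step, RESOLVE BUILD or ARP/wARP for the real-space step) and for the acceptance test; they are hypothetical placeholders, not real program interfaces, and the cycle limit is an assumption added so that the loop always terminates.

```python
def dual_space_extension(partial_structure, reciprocal_step, real_step,
                         is_ok, max_cycles=10):
    """Alternate reciprocal-space and real-space fragment extension.

    partial_structure: starting partial model (any object)
    reciprocal_step  : callable(model) -> phases  (placeholder for OASIS + DM)
    real_step        : callable(phases) -> model  (placeholder for model building)
    is_ok            : callable(model) -> bool    (acceptance test)
    Returns (model, cycles_used).
    """
    model = partial_structure
    for cycle in range(1, max_cycles + 1):
        phases = reciprocal_step(model)   # phase improvement from the current model
        model = real_step(phases)         # rebuild and extend the model
        if is_ok(model):
            return model, cycle
    return model, max_cycles
```

Each cycle feeds the newest partial model back into the phasing step, so better phases and a more complete model reinforce each other.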
Glucose isomerase, S-SAD, Cu-Kα: 17% (cycle 0), 97% (cycle 6)
Alanine racemase, Se, S-SAD, Cr-Kα: 52% (cycle 0), 97% (cycle 4)
Xylanase, S-SAD, synchrotron λ = 1.49Å: 25% (cycle 0), 99% (cycle 6)
Lysozyme, S-SAD, Cr-Kα: 52% (cycle 0), 98% (cycle 6)
Azurin, Cu-SAD, synchrotron λ = 0.97Å: 42% (cycle 0), 95% (cycle 3)
Ribbon models plotted by PyMOL
Data courtesy of Professor N. Watanabe, Professor S. Hasnain, Dr. Z. Dauter and Dr. C. Yang
Direct-method aided MR-model completion
Dual-space fragment extension without SAD/SIR information
MR model → Model completion by ARP/wARP or PHENIX → OK? Yes: End. No: Partial structure → Phase improvement by OASIS → Density modification by DM → Model completion by ARP/wARP or PHENIX
(Figure: relation between φ″ and φ model for reflections with P+ > 0.5 and for reflections with P+ < 0.5)
MR-model completion of 1UJZ
Space group: I222
Unit cell: a = 62.88, b = 74.55, c = …
Number of residues in AU: 215
Resolution limit: 2.1Å
1UJZ
MR model: 46 residues, 13 with side chains
Panels: ARP/wARP-DM iteration and ARP/wARP-OASIS-DM iteration (cycles 1, 2, 3 and 7 shown)
Cycle …: … residues, all with side chains
Final model: 215 residues
Ribbon models plotted by PyMOL
1UJZ Phase Statistics
Table columns: range of phase error in degrees; number of reflections and % of P+ > ½ for each of cycle 1, cycle 3, cycle 5 and cycle 7
MR-model completion of an originally unknown protein
Space group: P…
Unit cell: a = 71.81, b = 81.40, c = 108.95Å
Number of residues in AU: 728
Solvent content: 0.37
Resolution limit: 2.5Å
Starting model: R-factor 0.34, R-free 0.44, No. of residues 479, with side chains 479
After phenix.autobuild: R-factor 0.33, R-free 0.40, No. of residues 503, with side chains 503
After 4 cycles of oasis-phenix: R-factor 0.24, R-free 0.30, No. of residues 597, with side chains 588
What’s the low resolution limit for direct methods?
SAD phasing at different resolutions
TTHA1634 Cu-Kα data, ⟨|ΔF|⟩/⟨F⟩ ~ 0.55%
2.1Å: very good
3.0Å: good
3.5Å: marginally traceable
4.0Å: still informative
Maps at 1σ phased by a single run of OASIS + DM (Cowtan), plotted by PyMOL
Combining SOLVE/RESOLVE and OASIS + DM in dealing with low-resolution SIR/SAD data
R-phycoerythrin
SIR data from the native and the p-chloromercuriphenyl sulphonic acid derivative
Space group: R3
Unit cell: a = b = 189.8, c = 60.0Å; γ = 120°
Number of residues in the ASU: 668
Resolution limit: 2.8Å
Replacing atoms: Hg
X-rays: Cu-Kα, λ = 1.542Å
J. Mol. Biol. (1996); Chinese Physics 16, (2007)
Map shown: SOLVE/RESOLVE & OASIS + DM, plotted by PyMOL
Tom70p
Space group: P2₁
Unit cell: a = 44.89, b = 168.8, c = 83.4Å; β = …°
Number of residues: 1086
Resolution limit: 3.3Å
Multiplicity: 3.3
Anomalous scatterer: Se (24)
X-rays: Synchrotron, λ = … Å, Δf″ = 6.5
Bijvoet ratio: ⟨|ΔF|⟩/⟨F⟩ = 4.3%
Nature Structural & Molecular Biology 13, (2006); Chinese Physics B 17, 1-9 (2008)
Maps compared: SOLVE/RESOLVE versus SOLVE/RESOLVE & OASIS + DM, plotted by PyMOL
OASIS-2006
Institute of Physics, Chinese Academy of Sciences, Beijing, P.R. China
Acknowledgements
Professor Zhengjiong Lin, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
Drs Y. He 1, D.Q. Yao 1, J.W. Wang 1, S. Huang 1, J.R. Chen 1, Q. Chen 2, H. Li 3, Prof. T. Jiang 3, Mr. T. Zhang 1, Mr. L.J. Wu 1 & Prof. C.D. Zheng 1
1 Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, China
2 National Laboratory of Protein Engineering and Plant Genetic Engineering, Peking University, Beijing, China
3 Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
The project is supported by the Chinese Academy of Sciences and the 973 Project (Grant No 2002CB713801) of the Ministry of Science and Technology of China.
Thank you!