Experimental phasing in Crank2 Pavol Skubak and Navraj Pannu Biophysical Structural Chemistry, Leiden University, The Netherlands http://www.bfsc.leidenuniv.nl/software/crank/

Slides:



Advertisements
Similar presentations
Phasing Goal is to calculate phases using isomorphous and anomalous differences from PCMBS and GdCl3 derivatives --MIRAS. How many phasing triangles will.
Advertisements

Twinning and other pathologies Andrey Lebedev University of York.
Haifu Fan & Yuanxin Gu Beijing National Laboratory for Condensed Matter Physics Institute of Physics, Chinese Academy of Sciences P.R. China Haifu Fan.
Automated phase improvement and model building with Parrot and Buccaneer Kevin Cowtan
Protein x-ray crystallography
Introduction to protein x-ray crystallography. Electromagnetic waves E- electromagnetic field strength A- amplitude  - angular velocity - frequency.
Overview of the Phase Problem
M.I.R.(A.S.) S.M. Prince U.M.I.S.T.. The only generally applicable way of solving macromolecular crystal structure No reliance on homologous structure.
M.I.R.(A.S.) S.M. Prince U.M.I.S.T.. The only generally applicable way of solving macromolecular crystal structure No reliance on homologous structure.
Bob Sweet Bill Furey Considerations in Collection of Anomalous Data.
Small Molecule Example – YLID Unit Cell Contents and Z Value
Experimental Phasing stuff. Centric reflections |F P | |F PH | FHFH Isomorphous replacement F P + F H = F PH FPFP F PH FHFH.
A Brief Description of the Crystallographic Experiment
Direct Methods and Many Site Se-Met MAD Problems using BnP Direct Methods and Many Site Se-Met MAD Problems using BnP W. Furey.
Don't fffear the buccaneer Kevin Cowtan, York. ● Map simulation ⇨ A tool for building robust statistical methods ● 'Pirate' ⇨ A new statistical phase improvement.
In honor of Professor B.C. Wang receiving the 2008 Patterson Award In honor of Professor B.C. Wang receiving the 2008 Patterson Award Direct Methods and.
In Macromolecular Crystallography Use of anomalous signal in phasing
Automated protein structure solution for weak SAD data Pavol Skubak and Navraj Pannu Automated protein structure solution for weak SAD data Pavol Skubak.
Direct Methods By Fan Hai-fu, Institute of Physics, Beijing Direct Methods By Fan Hai-fu, Institute of Physics, Beijing
Phasing based on anomalous diffraction Zbigniew Dauter.
Radiation-damage- induced phasing with anomalous scattering Peter Zwart Physical biosciences division Lawrence Berkeley National Laboratories Not long.
28 Mar 06Automation1 Overview of developments within CCP4 Generation 1 ccp4i tasks Generation 2 isolated scripts / web service Generation 3 integrated.
Progress report on Crank: Experimental phasing Biophysical Structural Chemistry Leiden University, The Netherlands.
Patterson Space and Heavy Atom Isomorphous Replacement
Crank and Databases Steven Ness Leiden University The Netherlands.
Phasing Today’s goal is to calculate phases (  p ) for proteinase K using PCMBS and EuCl 3 (MIRAS method). What experimental data do we need? 1) from.
1. Diffraction intensity 2. Patterson map Lecture
Zhang, T., He, Y., Wang, J.W., Wu, L.J., Zheng, C.D., Hao, Q., Gu, Y.X. and Fan, H.F. (2012) Institute of Physics, Chinese Academy of Sciences Beijing,
THE PHASE PROBLEM Electron Density
Macromolecular Crystallography and Structural Genomics – Recent Trends Prof. D. Velmurugan Department of Crystallography and Biophysics University of Madras.
OASIS What is it? How it works? What next? What is it? How it works? What next?
Data Harvesting: automatic extraction of information necessary for the deposition of structures from protein crystallography Martyn Winn CCP4, Daresbury.
Methods in Chemistry III – Part 1 Modul M.Che.1101 WS 2010/11 – 8 Modern Methods of Inorganic Chemistry Mi 10:15-12:00, Hörsaal II George Sheldrick
Computational Crystallography InitiativePhysical Biosciences Division First Aid & Pathology Data quality assessment in PHENIX Peter Zwart.
Direct Use of Phase Information in Refmac Abingdon, University of Leiden P. Skubák.
Pattersons The “third space” of crystallography. The “phase problem”
Progress report on Crank: Model building Biophysical Structural Chemistry Leiden University, The Netherlands.
Atomic structure model
Fitting EM maps into X-ray Data Alexei Vagin York Structural Biology Laboratory University of York.
Anomalous Differences Bijvoet differences (hkl) vs (-h-k-l) Dispersive Differences 1 (hkl) vs 2 (hkl) From merged (hkl)’s.
Absolute Configuration Types of space groups Non-centrosymmetric Determining Absolute Configuration.
Before Beginning – Must copy over the p4p file – Enter../xl.p4p. – Enter../xl.hkl. – Do ls to see the files are there – Since the.p4p file has been created.
Phasing in Macromolecular Crystallography
H.F. Fan 1, Y.X. Gu 1, F. Jiang 1,2 & B.D. Sha 3 1 Institute of Physics, CAS, Beijing, China 2 Tsinghua University, Beijing, China 3 University of Alabama.
Today: compute the experimental electron density map of proteinase K Fourier synthesis  (xyz)=  |F hkl | cos2  (hx+ky+lz -  hkl ) hkl.
Amyloid Precursor Protein (APP)
Stony Brook Integrative Structural Biology Organization
“New” methods By Paul Ellis.
Score maps improve clarity of density maps
OASIS-2004 A direct-method program for
Istituto di Cristallografia, CNR,
Model Building and Refinement for CHEM 645
Solving Crystal Structures
CCP4 6.1 and beyond: Tools for Macromolecular Crystallography
Complete automation in CCP4 What do we need and how to achieve it?
Solving THE PHASE PROBLEM – A very brief (but not very rigorous) introduction Gordon Leonard ESRF Structural Biology Group l CCP4-DLS Workshop December.
Phasing Today’s goal is to calculate phases (ap) for proteinase K using MIRAS method (PCMBS and GdCl3). What experimental data do we need? 1) from native.
Reduce the need for human intervention in protein model building
CCP4 from a user perspective
Introduction to Isomorphous Replacement and Anomalous Scattering Methods Measure native intensities Prepare isomorphous heavy atom derivatives Measure.
Nobel Laureates of X Ray Crystallography
S. Takeda, A. Yamashita, K. Maeda, Y. Maeda
r(xyz)=S |Fhkl| cos2p(hx+ky+lz -ahkl)
Provide quick feedback to data collection experiments.
Determining protein structures
By Fan Hai-fu, Institute of Physics, Beijing
Volume 54, Issue 4, Pages (May 2007)
Presentation transcript:

Experimental phasing in Crank2 Pavol Skubak and Navraj Pannu Biophysical Structural Chemistry, Leiden University, The Netherlands http://www.bfsc.leidenuniv.nl/software/crank/

X-ray structure solution pipeline Data collection Data processing Experimental phasing Molecular Replacement Density Modification Model building Refinement Rebuilding Validation

Crank2 for experimental phasing Crank2 is suitable for SAD, MAD and SIRAS. Crank2 has been shown to be particularly powerful weak anomalous signal to low resolution SAD datasets (4.5 Angstroms)

Single isomorphous replacement A “native” data set from a crystal that just contains your macromolecule. A “derivative” data set A macromolecular crystal that has been soaked with a heavy atom.

Phase problem Acentric Centric |Fp(h)| |Fp(h)| |Fp(h)| |Fp(h)| is the observed structure factor amplitude. Argand diagram

Phase information from (error-free) isomorphous replacement Acentric Centric Two solutions Single solution |FP(h)| |FPH(h)| FH(h) |FPH(h)| |FP(h)| |FP(h)| |FPH(h)| FH(h) |FPH(h)|: “derivative” observed amplitude FH(h): heavy atom structure factor FPH(h) = FH(h) + FP(h) (vector addition)

What are some of the errors in isomorphous replacement? Sometimes heavy atoms induce changes in the unit cell or changes in the macromolecule (non-isomorphism). The scale between native and derivative data sets. Errors in the heavy atom positions. Errors in the measurement of the amplitudes. But, heavy atoms could be easier than growing Se-met protein!

Effect of changing the wavelength: Anomalous scattering F+(h) =  (fj + f’j + i fj) exp(2i h . xj) F-(h) =  (fj + f’j - i fj) exp(2i h . xj)

Single-wavelength Anomalous Diffraction Solving structures using Friedel pairs collected at one wavelength from a crystal that contains an anomalous substructure.

Solving phases: using the anomalous signal Anomalous scattering arises due to a phase-shift of X-rays diffracted by tightly bound electrons The strength of the anomalous effect depends on the wavelength of the X-rays Anomalous scattering causes very small changes in intensity between F(+) and F(-) F+(h) =  (fj + fj + i fj) exp(2i h . xj) F-(h) =  (fj +fj' - i fj) exp(2i h . xj)

Simultaneously combining experimental phasing steps to improve structure solution Traditionally structure solution is divided into distinct steps: Substructure detection Obtain initial phases (Density) modify the initial experimental map Build and refine the model By combining these steps, we can improve the process.

Experimental data, anomalous substructure Traditional structure solution Step-wise Information is propagated via ‘phase probabilities’ Experimental phasing Phases, phase probabilities Traditional approach Density modification, phase combination Phases, phase probabilities Model building and refinement Model

Phase probabilities The phase distribution can be approximated via 4 “Hendrickson-Lattmann” coefficients, A, B, C, D. We rely on programs to estimate these coefficients. ‘Even better than the real thing?’

Density modification 2. Phase weighting: FFT |F|, φ ρ(x) φ=f(φexp,φmod) Modify ρ |Fmod|, φmod ρmod(x) FFT-1 Reciprocal space Real space

HL propagation and the independence assumption Use the experimental data and anomalous substructure directly! Do not need to assume independence or rely on HL coefficients. Need multivariate distributions at each step that take into account correlations between the model and data.

Experimental data, anomalous substructure Step-wise multivariate structure solution Still step-wise Information is propagated via the data and model(s). Experimental phasing Phases Traditional approach Density modification, phase combination Phases Model building and refinement Model

What are step-wise multivariate functions? The multivariate functions consider all the correlations between the data and the model together. Earlier methods neglected correlations and took into account, for example, “DANO” (DeltaF) instead of F+ and F-. By taking into account of F+ and F-, we can explicitly consider the measurement error we determine.

Combined structure solution Simultaneously use information from experimental phasing, density modification and model refinement Experimental data, anomalous substructure Combined experimental phasing, phase combination and model refinement Phases Electron density Model Density modification Model building

The “combined” multivariate function The combined multivariate function consider all the correlations between the data and the substructure, (partially-built) model and density modified phases.

Tests of > 140 real SAD data sets Resolution range of data sets is 0.94 to 3.88 Angstroms Types of anomalous scatterers: selenium, sulfur, chloride, iodide, bromide, calcium, zinc (and others). We compare with the step wise multivariate approach (current CRANK) versus the combined approach.

Model building results on over 140 real SAD data sets (using parrot and buccaneer)

Summary of large scale test The average fraction of the model built increased from 60% to 74% with the new approach. If we exclude data sets built to 85% by the current approach or where the substructure was not found, 45 data sets remain and the average fraction of the model built increased from 28 to 77%.

3.88 Angstrom RNA polymerase II 3.88 Angstrom SAD data with signal from zinc. Authors could not solve the structure with SAD data alone, but with a partial model, multi-crystal MAD and manual building. > 80% can be built with SAD data alone with the new algorithm automatically to an R-free of 37.6%

Related RNA polymerases complex from Cramer et al. 3.3 Angstrom data with signal from zinc. Could not solve the structure with anomalous data alone. With the new method, a majority can be built automatically in minutes.

Crank1 vs Crank2 Crank1 and Crank2 are suitable for S/MAD and S/MIRAS experiments and implements multivariate functions. Crank1 and Crank2 are available in ccp4 6.5 and Crank2 is available in ccp4 7.0 with gui ccp4i, ccp4i2 and ccp4online

Important parameters in substructure detection The number of cycles run. The number of atoms to search for. Should be within 10-20% of actual number A first guess uses a probabilistic Matthew’s coefficient The resolution cut-off: For MAD, look at signed anomalous difference correlation. For SAD, a first guess is 0.5 + high resolution limit.

Is my map good enough? Statistics from substructure phasing: Look at FOM. Refined occupancies. Statistics from density modification: Compare the “contrast” from hand and enantiomorph (output of solomon or shelxe). Does it look like a protein? (model visualization) For Crank2, look to see if R-comb < 40%.

Bias reduction in density modification Density modified map is obtained from experimental map leading to artificially high correlations between the observed and modified amplitudes. β correction is applied to the Luzzati error parameter to reduce bias of modified data.

FOM and phase error after DM with/without bias reduction

Combining molecular replacement and anomalous scattering (MR-SAD) When do you use “MR-SAD”? If you have a partial model (usually obtained by molecular replacement) and anomalous diffraction. Two approaches implemented in Crank2 Use only the substructure model only Use the substructure and partial model

The two pipelines

MR-SAD data sets Final PDB code Reso-lution (Å) Anomalous Scatterer   Final PDB code Reso-lution (Å) Anomalous Scatterer Number of residues Correct MR Residues (%) Incorrect MR Residues (%) RMSD correct residues Dataset 1 NA 3.6 Se 800 42.5 23.5 1.6 Dataset 2 3.2 378 60.8 12.9 1.7 GPCR ECR-Mb 5kvm 3.0 I 459 49.7 11.7 1.5 AAA-ATPase 4d80 1776 75.2 22.1 F1-ATPase 2w6f S, P 3587 46.7 2.3 0.9 SecYEG-SecA 3din 4.5 2886 47.9 40.7

MR-SAD results: R-free values   Molecular replacement solution Substructure-only pipeline Rebuild pipeline Dataset 1 49.8 32.6 29.8 Dataset 2 53.7 28.9 32.3 GPCR ECR-Mb 48.6 39.1 38.4 AAA-ATPase 47.5 39.0 40.9 F1-ATPase 46.5 34.8 33.8 SecYEG-SecA 51.8 39.9 39.6

References Crank Ness et al (2004) Structure 12, 1753-1761. Pannu et al (2011) Acta Cryst D67, 331-337. Combined approach and Crank2 Skubak and Pannu (2013) Nature Communications 4: 2777. Using data directly in refinement Skubak et al (2004) Acta Cryst D60, 2196-2201. Skubak et al (2009) Acta Cryst D65, 1051-1061. Multivariate phase combination Waterreus et al (2010) Acta Cryst D66, 783-788.

Cyttron Acknowledgements All dataset contributors Garib Murshudov, Kevin Cowtan, George Sheldrick, Victor Lamzin http://www.bfsc.leidenuniv.nl/software/crank/ Cyttron