Developments in xia2 Graeme Winter CCP4 Dev Meeting 2008
What is xia2?
  Automated, robust data reduction and analysis
  Thorough: takes additional steps that many users wouldn’t bother with
  In: images from e.g. a synchrotron beamline
  Out: measurements for downstream phasing via e.g. HAPPy, MrBUMP, Phenix…
Recent changes
  Inclusion in CCP4 6.1
  Many command line options
  Integrated with AutoRickshaw (EMBL Hamburg)
  Robust lattice determination
  Support for Q270, Pilatus
  Zero input option
3-month plans
  BioXHit ends in June => so does xia2 development
  Include robust system to decide resolution limits etc. (next slides)
  Finish release to go with release version of CCP4 6.1
Chef Let’s cook them books!
What is chef?
  A tool to help you use the best of the reflections you have
  Uses unmerged intensities
  Uses robust statistics to decide:
    d*_min for different functions (resolution)
    D_max for different functions (dose)
  Additional program “doser” to add dose information to unmerged MTZ files
In
  MTZ files from scala with “output unmerged” set
  DOSE / TIME information for doser:
    BATCH 1 DOSE 2.5 TIME 2.5
    BATCH 2 DOSE 7.5 TIME 8.2
    …
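The per-batch lines above are tedious to type by hand for a full sweep. A minimal sketch of generating them follows. The line layout (`BATCH n DOSE d TIME t`) is taken from the slide; the function name, the constant exposure time per image, and the choice of mid-exposure values are assumptions made for illustration, not part of doser itself.

```python
def doser_input_lines(first_batch, n_batches, exposure_time, dose_rate=1.0):
    """Return one 'BATCH n DOSE d TIME t' line per batch.

    Assumptions (hypothetical, for illustration only):
    - every image has the same exposure time;
    - TIME is the elapsed time at the middle of each exposure;
    - DOSE is TIME multiplied by a constant dose_rate.
    """
    lines = []
    for i in range(n_batches):
        time = (i + 0.5) * exposure_time
        dose = time * dose_rate
        lines.append(f"BATCH {first_batch + i} DOSE {dose:.1f} TIME {time:.1f}")
    return lines


if __name__ == "__main__":
    # Three 5-second images starting at batch 1.
    print("\n".join(doser_input_lines(1, 3, 5.0)))
```

In real data TIME and DOSE need not coincide, as the second batch on the slide shows (DOSE 7.5, TIME 8.2), so measured timestamps should replace the constant-exposure assumption where they are available.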
Running
doser hklin TS03_12287_chef_INFL.mtz hklout infl.mtz < doser.in
doser hklin TS03_12287_chef_LREM.mtz hklout lrem.mtz < doser.in
doser hklin TS03_12287_chef_PEAK.mtz hklout peak.mtz < doser.in
chef hklin1 infl.mtz hklin2 lrem.mtz hklin3 peak.mtz << eof
isigma 2.0
resolution 1.65
range width 30 max 1500
print comp rd rdcu
anomalous on
labin BASE=DOSE
eof
Output
  Resolution vs. dose
  Completeness vs. dose for each data set
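To make the "completeness vs. dose" output concrete, here is a minimal sketch of how such a curve could be computed from unmerged observations. It is not chef's implementation: the data layout (a dict from unique reflection index to a list of `(dose, intensity)` pairs), the function name, and the `n_possible` argument are all assumptions chosen for illustration.

```python
def completeness_vs_dose(observations, n_possible, dose_limits):
    """Fraction of possible unique reflections measured at least once
    with dose <= limit, for each limit in dose_limits.

    observations: {hkl: [(dose, intensity), ...]} (hypothetical layout)
    n_possible:   number of unique reflections expected to the resolution limit
    """
    # Dose at which each unique reflection is first observed.
    first_dose = {
        hkl: min(dose for dose, _ in measurements)
        for hkl, measurements in observations.items()
        if measurements
    }
    curve = []
    for limit in dose_limits:
        n_measured = sum(1 for dose in first_dose.values() if dose <= limit)
        curve.append((limit, n_measured / n_possible))
    return curve
```

Evaluating this separately for each wavelength gives one curve per data set, which is how the slide describes the output.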
Methods
  Based on “new” cumulative-pairwise R factor
  R_CP: inspired by R_d in Diederichs (2006)
And R_CP means..?
  How well do the measurements up to dose D agree?
  Closely related to I/σ
  Reasonably robust, as it does not depend on sigma estimates or means
  Gets bigger when systematic variation contributes to spread
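The properties listed above can be illustrated with a small sketch of a cumulative pairwise R factor. The exact formula from the slide did not survive, so the form used here is an assumption: for every unique reflection, all pairs of observations recorded at dose ≤ D are compared, and the summed absolute differences are divided by the summed pair averages. The data layout and function name are hypothetical.

```python
from itertools import combinations


def r_cp(observations, dose_limit):
    """Cumulative pairwise R factor for observations with dose <= dose_limit.

    observations: {hkl: [(dose, intensity), ...]} (hypothetical layout)
    Assumed form (not confirmed by the source):
        sum |I_i - I_j| / sum 0.5 * (I_i + I_j)
    over all pairs i < j of the same unique reflection within the dose limit.
    Returns None when no pair of observations is available.
    """
    numerator = 0.0
    denominator = 0.0
    for measurements in observations.values():
        kept = [intensity for dose, intensity in measurements if dose <= dose_limit]
        for i1, i2 in combinations(kept, 2):
            numerator += abs(i1 - i2)
            denominator += 0.5 * (i1 + i2)
    if denominator == 0.0:
        return None
    return numerator / denominator
```

Written this way the statistic uses neither sigma estimates nor a merged mean intensity, and it grows once damaged (systematically changed) observations enter the dose window, which matches the behaviour described on the slide.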
Requirements
  Radiation damaged MAD data: what do I want for:
    Substructure determination: big anomalous / dispersive signal
    Phase calculation: well measured ΔF
    Phase extension & improvement: good F
    Refinement: good F
  85% limit: R_CP < R(I/σ) + S(I/σ, N_m, N_u)
Example
  JCSG TB0541: heavily radiation damaged…
  3 wavelength MAD: INFL + LREM, PEAK
  Massive signal
  P43212, 90 degrees * 3 => plenty of data
  Chef says “use data to 1.65 Å, D=~600s”
Before (INFL)
  For TS03/12287/INFL
  Table rows: High resolution limit, Low resolution limit, Completeness, Multiplicity, I/sigma, Rmerge, Rmeas(I), Rmeas(I+/-), Rpim(I), Rpim(I+/-), Wilson B factor, Anomalous completeness, Anomalous multiplicity, Anomalous correlation
After (INFL – first 60 degrees)
  For TEST001/12287/LREM
  Table rows: High resolution limit, Low resolution limit, Completeness, Multiplicity, I/sigma, Rmerge, Rmeas(I), Rmeas(I+/-), Rpim(I), Rpim(I+/-), Wilson B factor, Anomalous completeness, Anomalous multiplicity, Anomalous correlation
Why improvement?
  Limit radiation damage => σF more meaningful
  Limit damage => ΔF better
  Without systematic damage get higher resolution for given I/σ
However…
  Pipe MTZ through scaleit / solve / cad / resolve / ARP/wARP and get very similar results (a slight improvement, though)
  This is most interesting, because it means that 55% of the “data” did not add to the quality of the result
Plans
  Currently writing this up for J. Appl. Cryst.
  Chef will be included in CCP4 6.1
  Next: include this as part of xia2 (makes 0.3.0)
  Extend chef to make decisions about anomalous / dispersive differences