Crystal structure solution via PED data: the BEA algorithm Carmelo Giacovazzo Università di Bari Istituto di Cristallografia, IC-CNR

Slides:



Advertisements
Similar presentations
Intensities Learning Outcomes By the end of this section you should: understand the factors that contribute to diffraction know and be able to use the.
Advertisements

Estimation of Means and Proportions
PLATON, New Options Ton Spek, National Single Crystal Structure Facility, Utrecht, The Netherlands. Delft, Sept. 18, 2006.
Kjell Simonsson 1 Low Cycle Fatigue (LCF) analysis (last updated )
Erice 2011 Electron Crystallography – June Introduction to Sir2011 Giovanni Luca Cascarano CNR- Istituto di Cristallografia – Bari - Italy.
Errors in Chemical Analyses: Assessing the Quality of Results
X-Ray Crystallography
FTP Biostatistics II Model parameter estimations: Confronting models with measurements.
The General Linear Model Or, What the Hell’s Going on During Estimation?
Fast Bayesian Matching Pursuit Presenter: Changchun Zhang ECE / CMR Tennessee Technological University November 12, 2010 Reading Group (Authors: Philip.
Structure of thin films by electron diffraction János L. Lábár.
Small Molecule Example – YLID Unit Cell Contents and Z Value
CHE (Structural Inorganic Chemistry) X-ray Diffraction & Crystallography lecture 3 Dr Rob Jackson LJ1.16,
Computing Protein Structures from Electron Density Maps: The Missing Loop Problem I. Lotan, H. van den Bedem, A. Beacon and J.C. Latombe.
A Brief Description of the Crystallographic Experiment
Evaluating Hypotheses
Qualitative, quantitative analysis and “standardless” analysis NON DESTRUCTIVE CHEMICAL ANALYSIS Notes by: Dr Ivan Gržetić, professor University of Belgrade.
The goal of Data Reduction From a series of diffraction images (films), obtain a file containing the intensity ( I ) and standard deviation (  ( I ))
Lehrstuhl für Informatik 2 Gabriella Kókai: Maschine Learning 1 Evaluating Hypotheses.
Features Direct Methods for Image Processing in HREM of Solving Aperiodic Structures Searching Algorithm for Finding Modulation waves in 4D Fourier.
Copyright © Cengage Learning. All rights reserved. CHAPTER 11 ANALYSIS OF ALGORITHM EFFICIENCY ANALYSIS OF ALGORITHM EFFICIENCY.
Least-Squares Regression
RMTD 404 Lecture 8. 2 Power Recall what you learned about statistical errors in Chapter 4: Type I Error: Finding a difference when there is no true difference.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
In serial femtosecond crystallography (SFX) with hard X-ray free-electron laser as light source, a set of three-dimensional single-crystal diffraction.
INTRODUCTION When two or more instruments sound the same portion of atmosphere and observe the same species either in different spectral regions or with.
X-Ray Diffraction Dr. T. Ramlochan March 2010.
Ionic Conductors: Characterisation of Defect Structure Lecture 15 Total scattering analysis Dr. I. Abrahams Queen Mary University of London Lectures co-financed.
Chem Patterson Methods In 1935, Patterson showed that the unknown phase information in the equation for electron density:  (xyz) = 1/V ∑ h ∑ k.
Chem Structure Factors Until now, we have only typically considered reflections arising from planes in a hypothetical lattice containing one atom.
Phasing Today’s goal is to calculate phases (  p ) for proteinase K using PCMBS and EuCl 3 (MIRAS method). What experimental data do we need? 1) from.
1. Diffraction intensity 2. Patterson map Lecture
Chapter 3: Structures via Diffraction Goals – Define basic ideas of diffraction (using x-ray, electrons, or neutrons, which, although they are particles,
1 Chapter 9 Hypothesis Testing. 2 Chapter Outline  Developing Null and Alternative Hypothesis  Type I and Type II Errors  Population Mean: Known 
What’s a modulated structure ?
Theory of kinematical di ffraction: a practical recap by Carmelo Giacovazzo Istituto di Cristallografia, CNR Dipartimento Geomineralogico, Bari University.
Linear Systems – Iterative methods
Copyright © Cengage Learning. All rights reserved. 12 Analysis of Variance.
Copyright ©2011 by Pearson Education, Inc. publishing as Pearson [imprint] Introductory Circuit Analysis, 12/e Boylestad Chapter 9 Network Theorems.
Protein Structure Determination Lecture 4 -- Bragg’s Law and the Fourier Transform.
Quantum Two 1. 2 Angular Momentum and Rotations 3.
The Structure and Dynamics of Solids
Atomic structure model
Anomalous Differences Bijvoet differences (hkl) vs (-h-k-l) Dispersive Differences 1 (hkl) vs 2 (hkl) From merged (hkl)’s.
Absolute Configuration Types of space groups Non-centrosymmetric Determining Absolute Configuration.
Sampling Design and Analysis MTH 494 Lecture-21 Ossam Chohan Assistant Professor CIIT Abbottabad.
Methods in Chemistry III – Part 1 Modul M.Che.1101 WS 2010/11 – 9 Modern Methods of Inorganic Chemistry Mi 10:15-12:00, Hörsaal II George Sheldrick
Learning Objectives After this section, you should be able to: The Practice of Statistics, 5 th Edition1 DESCRIBE the shape, center, and spread of the.
Copyright © Cengage Learning. All rights reserved. 15 Distribution-Free Procedures.
Fourier transform from r to k: Ã(k) =  A(r) e  i k r d 3 r Inverse FT from k to r: A(k) = (2  )  3  Ã(k) e +i k r d 3 k X-rays scatter off the charge.
Today: compute the experimental electron density map of proteinase K Fourier synthesis  (xyz)=  |F hkl | cos2  (hx+ky+lz -  hkl ) hkl.
Hypothesis Tests. An Hypothesis is a guess about a situation that can be tested, and the test outcome can be either true or false. –The Null Hypothesis.
Crystallography : How do you do? From Diffraction to structure…. Normally one would use a microscope to view very small objects. If we use a light microscope.
Chapter 9 Estimation using a single sample. What is statistics? -is the science which deals with 1.Collection of data 2.Presentation of data 3.Analysis.
Virtual University of Pakistan
CHAPTER 9 Testing a Claim
DTC Quantitative Methods Bivariate Analysis: t-tests and Analysis of Variance (ANOVA) Thursday 20th February 2014  
Diffraction in TEM Janez Košir
Istituto di Cristallografia, CNR,
Break and Noise Variance
Rietveld method. method for refinement of crystal structures
Phasing Today’s goal is to calculate phases (ap) for proteinase K using MIRAS method (PCMBS and GdCl3). What experimental data do we need? 1) from native.
Adjustment of Temperature Trends In Landstations After Homogenization ATTILAH Uriah Heat Unavoidably Remaining Inaccuracies After Homogenization Heedfully.
Toshiro Oda, Keiichi Namba, Yuichiro Maéda  Biophysical Journal 
Axel T Brünger, Paul D Adams, Luke M Rice  Structure 
Chapter 8: Estimating with Confidence
Experience-Dependent Asymmetric Shape of Hippocampal Receptive Fields
Distribution-Free Procedures
Objectives 6.1 Estimating with confidence Statistical confidence
Objectives 6.1 Estimating with confidence Statistical confidence
Presentation transcript:

Crystal structure solution via PED data: the BEA algorithm Carmelo Giacovazzo Università di Bari Istituto di Cristallografia, IC-CNR workshop Facets of Electron Crystallography Berlin 7-9 july 2010

PED-Direct Methods Recently introduced PED techniques reduce the number of reflections which are simultaneously excited and therefore allow to describe the scattering by few beam approximations. Most of the statistical features of PED amplitudes are still unknown, with particular attention to the effects they produce on the efficiency of the phasing procedures, specifically on Direct Methods approaches, which so far are the most popular phasing techniques.

A relatively large number of test cases is used. In our tests we use the amplitudes kindly provided by some friends ( Gemmi, Mugnaioli, Kolb) for solving the structures: no attempt for improving them by techniques correcting for dynamical effects. As overall result, a number of recipes ( among which the BEA algorithm) are obtained, which make the crystal structure solution via PED data more straightforward.

PED-ADT techniques Although PED techniques curtail the problem of dynamical diffraction, it is difficult to collect fully 3-dimensional data by them. Usually 2-dimensional reflections from few well oriented zone axis are collected. More recently the ADT technique has been developed which, in combination with PED, allows much larger completeness values. Five of our twelve test data were collected by combining PED with ADT.

Test structures CodeSGASUCOLLRESNR 0.9 / NR RES COMP (0.9/RES) aker P mO 14 Mg 2 Al 6 Si 4 Ca 4 P / / 23 Anatase I 4 1 /a m dO 8 Ti 4 P / / 22 Barite P n m aO 16 S 4 Ba 4 P-A / / 82 Charo P 2 1 /mCa 24 K 14 Na 10 Si 72 O 186 P-A / / 97 gann P 6 3 m cGa 2 N 2 P / / 32 mayenite I -4 3 dO 88 Al 28 Ca 24 P / / 44 mullite P b a mAl 4.56 Si 1.44 O 9.72 P-A / / 86 natrolite F d d 2Na 2 (Al 2 Si 3 O 10 ) (H 2 O) 2 P-A / / 99 Sno2 P 4 2 /m n mO 4 Sn 2 P / / 24 Srtio3 P m -3 mO 3 Ti 1 Sr 1 P / / 43 srtio3_s P b n mO 12 Ca 0.48 Sr 3.52 Ti 4 P / / 6 zn8sb7 P -1Zn 32 Sb 28 P-A / / 54

Data resolution ED reveals the electrostatic potential, which is the sum of nuclear and of electronic potentials: viceversa, the electrostatic potential around each atom influences the atomic scattering. The region substantially determining ED scattering amplitudes is wider than for XD: therefore the atomic scattering curve for ED more rapidly decreases with s than the corresponding X-ray curves..

Data Resolution In Fig. we show carbon and sulphur atomic scattering curves for e- and X-ray scattering: ED curves have been rescaled so that their maximum values are equal to Z.

Carbon and sulphur atomic scattering curves versus RES for XD (blue lines) and, after rescaling, for ED (red lines)

The more rapid decay of the electron scattering may suggest that ED data should be detectable only at resolution smaller than for X-rays. This effect however is contrasted by the stronger electron scattering (1000 times about) than X-rays, but is strengthened by the smaller size of the crystal samples submitted to electron diffraction. As a first conclusion, PED data can really attain quite high resolution as effect of three sample parameters (electrostatic potentials, strength of the scattering, size of the crystal) and of the shorter wavelength (much shorter than for X-rays): their data quality versus the resolution, however, has still to be assessed. c

Data Resolution In conclusion, the shorter electron wavelength combined with PED techniques allows to collect data at very high resolution, but their measurement errors are expected to be particularly large. However we will show below that such reflections may be useful in the phasing process, and contribute to make the structural model more complete.

Data completeness COMP=data completeness=number of measured / number of measurable reflections, at RES and at 0.9 Å resolution. NR= corresponding number of symmetry independent reflections. COMP is quite unsatisfactory when the nominal RES value is smaller than 0.4 Å: in particular data collected by combining PED and ADT show lower resolution but better data completeness, so providing a better basis for accurate crystal structure refinement.

Test structures CodeSGASUCOLLRESNR 0.9 / NR RES COMP (0.9/RES) aker P mO 14 Mg 2 Al 6 Si 4 Ca 4 P / / 23 Anatase I 4 1 /a m dO 8 Ti 4 P / / 22 Barite P n m aO 16 S 4 Ba 4 P-A / / 82 Charo P 2 1 /mCa 24 K 14 Na 10 Si 72 O 186 P-A / / 97 gann P 6 3 m cGa 2 N 2 P / / 32 mayenite I -4 3 dO 88 Al 28 Ca 24 P / / 44 mullite P b a mAl 4.56 Si 1.44 O 9.72 P-A / / 86 natrolite F d d 2Na 2 (Al 2 Si 3 O 10 ) (H 2 O) 2 P-A / / 99 Sno2 P 4 2 /m n mO 4 Sn 2 P / / 24 Srtio3 P m -3 mO 3 Ti 1 Sr 1 P / / 43 srtio3_s P b n mO 12 Ca 0.48 Sr 3.52 Ti 4 P / / 6 zn8sb7 P -1Zn 32 Sb 28 P-A / / 54

Data Quality Indications on the data quality may be obtained by the internal R-value The sum is over symmetry equivalent reflections. Since X-Ray structures are known the kinematical data quality may be estimated via The sum is over the measured reflections

CodeR int (∞-0.9)R int (0.9-RES)RESID RES RESID 0.9 aker anatase barite charo gann mayenite mullite natrolite sno srtio srtio3_s zn8sb

Direct Methods (DM) 200 trials were run for each test structure by using Sir2008 (Version 3.0), included in the package Il Milione. DM section is automatically followed by a direct space procedure which, via EDM techniques and diagonal LSQ, tries to improve DM models. RESID is used as a FOM for recognizing the correct solution. All the following results are referred to the best solution found within the five trials with the lowest values of RESID.

Protocols We used three different protocols. Protocol 1. All the measured data up to RES are used in the phasing process. Protocol 2. Only data up to 0.7 Å resolution were used in the phasing process Protocol 3. Only data up to 0.9 Å resolution were used in the phasing process. The results are in the figure

Model quality The analysis of the results suggests that: the structure models obtained by Prot.1 are more complete than those obtained by Prot.2, which are more complete than those obtained by Prot. 3. This behavior may be related to the recent use of the extrapolation techniques in protein structure determination, where phases and amplitudes of non-measured (because out of the measured reciprocal space) reflections are estimated via probabilistic approaches and used during the phase determination process.

Model quality The RESID values available at the end of the automatic phasing process ( where peaks are labeled, and thermal factors are still isotropic) are shown in the next Fig.. Obviously they are larger for Prot.1 than for Prot.2 than for Prot.3. Most of the RESID values are far from the values obtained by Sir2008 for X-ray data (usually between 0.08 and 0.12).

About the BEA algorithm If PED techniques are used, the dynamical effects are no more dominant but still present. Two questions arise: 1) Is the average ( over symmetry equivalent reflections) a good representative of the correct intensity? A: yes and no 2)In absence of a theoretical formulation establishing which of the equivalent reflections is less affected by dynamical scattering, can one use a practical criterion for selecting the best unique reflection?

The BEA algorithm BEA suggests to choose as unique reflection that one which better agrees with the current structural model. The effect is certainly cosmetic (the final RESID value may be much smaller than that obtained by using the average of the equivalent amplitudes). But it may also be substantial: i.e., the crystal structure solution becomes more straightforward and the final structural model may be more complete. We call this algorithm BEA (best equivalent amplitude).

The BEA algorithm BEA has some similarity with a criterion used in powder crystallography during the phasing process. When a structural model is not available the experimental diffraction profile is decomposed according to LeBail or to Pawley algorithms. If a structural model is available, the experimental diffraction profile is partitioned in a way proportional to the calculated structure factors of the overlapping reflections.

The BEA procedure When Direct Methods are run, the average amplitude of the symmetry equivalent reflections is used (no structural model is available at this stage). As soon as a model is available, the best equivalent reflection may be recognized and the corresponding amplitude is used as coefficient of the observed Fourier maps and as observed amplitude in the diagonal least-squares automatically performed by Sir2008. Obviously the best reflection changes with the structural model.

In the ED case BEA recognizes that uncorrected dynamical effects still affect the experimental data (even in the PED case), that such effects are not corrected by a posteriori techniques, and that the merging of the symmetry equivalent reflections may lead to averaged values which are not good representatives of the true intensities.

Check BEA To check BEA we used the following three protocols. Prot. 4. All the measured data up to RES are used in the phasing process (as in Protocol 1), and BEA is used. Prot. 5. Only data up to 0.7 Å resolution were used in the phasing process (as in Protocol 2), and BEA is applied. Prot. 6. Only data up to 0.9 Å resolution were used (as in Protocol 3) and BEA is applied.

Compare the results attained via Protocols 4,5,6 with those obtained via Protocols 1,2,3 respectively (compare protocol i and i+3, because they imply the same data resolution). We observe that the use of BEA: A)usually leads to more complete structural models. B)may strongly improve the RESID values, particularly when the nominal data resolution is high, and makes them very close to the X-ray values usually obtained at this stage of refinement.

Efficiency of Prots. 1 and 4 Prot. 1- Sir2008 is able to find a complete solution in all the cases except for anatase (one O and one Ti atom in the asymmetric: Ti was missed), barite (4 atoms over 5 well located: one O was missed), mullite (one O missed), srtio3_s ( Ti and Ca-Sr atoms were missed). Prot 4- All the atomic positions of the test structures were found, but for mullite, for which one O with chemical occupancy equal to 0.14 was missed.

Advantages of BEA It usually leads to more complete structural models. It may strongly improve the RESID values, particularly when the nominal data resolution is high, and makes them very close to the values usually obtained for X-ray data at this stage of refinement. More extended applications are needed to extrapolate the usefulness of BEA for the final refinement stages.