Solving Structures from Powder Data in Direct Space - State of the Art - Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR.

Slides:



Advertisements
Similar presentations
Testing Relational Database
Advertisements

Introduction to Monte Carlo Markov chain (MCMC) methods
Inpainting Assigment – Tips and Hints Outline how to design a good test plan selection of dimensions to test along selection of values for each dimension.
An introduction to the Rietveld method Angus P. Wilkinson School of Chemistry and Biochemistry Georgia Institute of Technology.
Data Mining Methodology 1. Why have a Methodology  Don’t want to learn things that aren’t true May not represent any underlying reality ○ Spurious correlation.
Determination of Protein Structure. Methods for Determining Structures X-ray crystallography – uses an X-ray diffraction pattern and electron density.
Cloud Computing Resource provisioning Keke Chen. Outline  For Web applications statistical Learning and automatic control for datacenters  For data.
INTRODUCTION Massive inorganic crystal structure predictions were recently Performed, justifying the creation of new databases. Among them, the PCOD [1]
Tutorial Moshe Goldstein Fritz Haber Research Center, Hebrew U. February 13, 2012.
Practice of analysis and interpretation of X-ray diffraction data
All Hands Meeting, 2006 Title: Grid Workflow Scheduling in WOSE (Workflow Optimisation Services for e- Science Applications) Authors: Yash Patel, Andrew.
Chem Single Crystals For single crystals, we see the individual reciprocal lattice points projected onto the detector and we can determine the values.
Structure of thin films by electron diffraction János L. Lábár.
Applications and integration with experimental data Checking your results Validating your results Structure determination from powder data calculations.
Small Molecule Example – YLID Unit Cell Contents and Z Value
A New Analytical Method for Computing Solvent-Accessible Surface Area of Macromolecules.
Gizem ALAGÖZ. Simulation optimization has received considerable attention from both simulation researchers and practitioners. Both continuous and discrete.
The Comparison of the Software Cost Estimating Methods
A Brief Description of the Crystallographic Experiment
Two Examples of Docking Algorithms With thanks to Maria Teresa Gil Lucientes.
CS 290C: Formal Models for Web Software Lecture 10: Language Based Modeling and Analysis of Navigation Errors Instructor: Tevfik Bultan.
3. Crystals What defines a crystal? Atoms, lattice points, symmetry, space groups Diffraction B-factors R-factors Resolution Refinement Modeling!
An Agent-Oriented Approach to the Integration of Information Sources Michael Christoffel Institute for Program Structures and Data Organization, University.
Improved results for a memory allocation problem Rob van Stee University of Karlsruhe Germany Leah Epstein University of Haifa Israel WADS 2007 WAOA 2007.
Phase Identification by X-ray Diffraction
Avalanche Internet Data Management System. Presentation plan 1. The problem to be solved 2. Description of the software needed 3. The solution 4. Avalanche.
Introduction to Macromolecular X-ray Crystallography Biochem 300 Borden Lacy Print and online resources: Introduction to Macromolecular X-ray Crystallography,
Process Flowsheet Generation & Design Through a Group Contribution Approach Lo ï c d ’ Anterroches CAPEC Friday Morning Seminar, Spring 2005.
Modeling and simulation of systems Simulation optimization and example of its usage in flexible production system control.
Free energies and phase transitions. Condition for phase coexistence in a one-component system:
INTRODUCTION The COD was created in March 2003 and was built on the PDB model of open access on the Internet. It is intended that this database [1] consists.
CONCLUSIONS COD server is technically in position to store and serve all structures that are currently solved. COD deposition procedure ensures syntactic.
Monte Carlo Simulation of Liquid Water Daniel Shoemaker Reza Toghraee MSE 485/PHYS Spring 2006.
SPANISH CRYPTOGRAPHY DAYS (SCD 2011) A Search Algorithm Based on Syndrome Computation to Get Efficient Shortened Cyclic Codes Correcting either Random.
Usability Issues Documentation J. Apostolakis for Geant4 16 January 2009.
McMaille – Sous le Capot (Under the Bonnet) A.Le Bail Université du Maine Laboratoire des Oxydes et Fluorures CNRS – UMR 6010 FRANCE
EVALUATING PAPERS KMS quality- Impact on Competitive Advantage Proceedings of the 41 st Hawaii International Conference on System Sciences
High Resolution Models using Monte Carlo Measurement Uncertainty Research Group Marco Wolf, ETH Zürich Martin Müller, ETH Zürich Dr. Matthias Rösslein,
1 Lesson 8: Basic Monte Carlo integration We begin the 2 nd phase of our course: Study of general mathematics of MC We begin the 2 nd phase of our course:
Disclosure risk when responding to queries with deterministic guarantees Krish Muralidhar University of Kentucky Rathindra Sarathy Oklahoma State University.
Ionic Conductors: Characterisation of Defect Structure Lecture 15 Total scattering analysis Dr. I. Abrahams Queen Mary University of London Lectures co-financed.
Crystallographic Databases I590 Spring 2005 Based in part on slides from John C. Huffman.
1 5 Normalization. 2 5 Database Design Give some body of data to be represented in a database, how do we decide on a suitable logical structure for that.
Evolving Virtual Creatures & Evolving 3D Morphology and Behavior by Competition Papers by Karl Sims Presented by Sarah Waziruddin.
© A. Kwasinski, 2014 ECE 2795 Microgrid Concepts and Distributed Generation Technologies Spring 2015 Week #7.
1. Diffraction intensity 2. Patterson map Lecture
The PLATON/TwinRotMat Tool for Twinning Detection Ton Spek National Single Crystal Service Facility, Utrecht University, The Netherlands. Delft, 29-Sept-2008.
Molecular Specification Anan Wu Typical Gaussian Input Molecular specification This input section mainly specifies the nuclear positions.
MODELING MATTER AT NANOSCALES 3. Empirical classical PES and typical procedures of optimization Classical potentials.
Kanpur Genetic Algorithms Laboratory IIT Kanpur 25, July 2006 (11:00 AM) Multi-Objective Dynamic Optimization using Evolutionary Algorithms by Udaya Bhaskara.
Human Centric Computing (COMP106) Assignment 2 PROPOSAL 23.
INTRODUCTION The results from a third structure determination by powder diffractometry (SDPD) round-robin are summarized. From the 175 potential participants.
An Introduction to Monte Carlo Methods in Statistical Physics Kristen A. Fichthorn The Pennsylvania State University University Park, PA
COD (CRYSTALLOGRAPHY OPEN DATABASE) and PCOD (PREDICTED) COD Advisory Board : Daniel Chateigner (France), XiaoLong Chen (China), Marco E. Ciriotti (Italy),
Materials Process Design and Control Laboratory ON THE DEVELOPMENT OF WEIGHTED MANY- BODY EXPANSIONS USING AB-INITIO CALCULATIONS FOR PREDICTING STABLE.
Metaheuristics for the New Millennium Bruce L. Golden RH Smith School of Business University of Maryland by Presented at the University of Iowa, March.
Lecture 3 Patterson functions. Patterson functions The Patterson function is the auto-correlation function of the electron density ρ(x) of the structure.
Multi-cellular paradigm The molecular level can support self- replication (and self- repair). But we also need cells that can be designed to fit the specific.
Sect. 4.5: Cayley-Klein Parameters 3 independent quantities are needed to specify a rigid body orientation. Most often, we choose them to be the Euler.
Advanced Higher Computing Science
Crystallography images
The Rietveld Method Armel Le Bail
McMaille v3: indexing via Monte Carlo search
Monte Carlo indexing with McMaille
Rietveld method. method for refinement of crystal structures
PREDICTED CORNER SHARING TITANIUM SILICATES
Structure Solution in Direct Space University of St. Andrews, Scotland
Université du Maine – France
Getting Cell Parameters from Powder Diffraction Data
The PLATON/TwinRotMat Tool for Twinning Detection
Presentation transcript:

Solving Structures from Powder Data in Direct Space - State of the Art - Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR 6010, Avenue O. Messiaen, Le Mans Cedex 9, France.

CONTENT Introduction Computer Programs DASH EAGER ENDEAVOUR ESPOIR FOX OCTOPUS POWDERSOLVE PSSP SAFE SA (Simulated Annealing) TOPAS Conclusions / Advertisements / Tests with ESPOIR

Introduction The final SDPD step is always the Rietveld method application. Going to this last step needs at least an approximate model to be improved by the Rietveld refinement and eventually completed by further Fourier difference synthesis. How can be obtained this starting approximate (or sometimes complete) model by a direct space approach is the only question considered here. Not to mention that before that structure solution step, you must have yet recorded a powder pattern (so you must have a sample), established that the structure is unknown, indexed the powder pattern, proposed a space group, and you must possess some chemical knowledge of the sample.

The SDPD maze… From the book: Structure Determination from Powder Diffraction Data IUCr Monographs on Crystallography 13 Oxford Science Publications (2002)

Chemical Information Chemical knowledge is indispensable to the application of the direct space methods since they consist in placing atoms, either independent or as a whole molecule, at some positions in the cell, generally wrong positions at the beginning of the process, and moving them by translations (as well as rotations for a molecule or polyhedron) up to obtain a satisfying fit to the powder pattern or to a mathematical representation of that pattern. Going from wrong atomic positions to the final roughly correct ones is made by a process called global optimization which can be realized by different but finally similar procedures: Monte Carlo (MC), Monte Carlo with simulated annealing (SA) or/and with parallel tempering (PT), genetic algorithm (GA). These processes present a similarity in the use of random number sequences: atoms and molecules realize a random walk.

Some Definitions Sometimes the "direct space methods" (not to be confused with the direct methods) are called "global optimization methods" or "model building methods", and even sometimes "real space methods". "Direct space" was the definition retained in the pioneering papers. "Direct space" as opposed to "reciprocal space" has an adequate crystallographic structural sense, and should be preferred to "real space", which, opposed to "imaginary" would call to mind both parts of the diffusion factors. "Global optimization" has a large sense and designates the task of finding the absolutely best set of parameters in order to optimize an objective function, a task not at all limited to crystallography.

Direct Space Pioneering Papers Already an old story M. W. Deem and J. M. Newsam, Nature 342 (1989) M. W. Deem and J. M. Newsam, J. Am. Chem. Soc. 114 (1992) J.W. Newsam, M.W. Deem & C.M. Freeman, Accuracy in Powder Diffraction II, NIST Special Publication 846 (1992) L.A. Solovyov & S.D. Kirik, Mat. Sci. Forum (1993) K.D.M. Harris, M. Tremayne, P. Lightfoot & P.G. Bruce, J. Am. Chem. Soc. 116 (1994) D. Ramprasad, G.P. Pez, B.H. Toby, T.J. Markley & R.M. Pearlstein, J. Amer. Chem. Soc. 117 (1995)

For instance : J.W. Newsam, M.W. Deem & C.M. Freeman, Accuracy in Powder Diffraction II, NIST Special Publication 846 (1992)

Computer Programs - Nowadays Selection of programs applying direct space methods for the structure solution from powder diffraction data (CH 2 CH 2 O) 6 :LiAsF 6 ProgramAccessGO DataExample DoF DASHCSAPCapsaicin 16 EAGER AGAWPPh 2 P(O)(CH 2 ) 7 P(O)Ph 2 18 ENDEAVOURCSA IAg 2 PdO 2 45 ESPOIROMC LGormanite54 FOXOSAWPAl 2 (CH 3 PO 3 ) 3 24 OCTOPUSAMCWPRed Fluorescein7 POWDERSOLVECMCWPDocetaxel29 PSSPOSALMalaria Pigment Beta Haematin 14 SAFEASAWPC 32 N 3 O 6 H SAASAWP(CH 2 CH 2 O) 6 :LiAsF 6 79 TOPASCSAWPCaffeine Anhydrous93 Access : C = Commercial with academic prices, O = Open access, A = contact the authors GO = Global Optimization : MC = Monte Carlo, SA = MC+Simulated Annealing, GA = Genetic Algorithm Data : P = Pawley, L = Le Bail, I = Integrated intensities, WP = Whole Pattern DoF = degrees of freedom corresponding to the example

Degrees of Freedom (DoF) Irrespectively to the number of atoms, a molecule can be located easily in a cell, as a rigid body, corresponding to 3 positional and 3 orientational degrees of freedom (DoF), by checking the fit quality on, say, the first 50 peaks of the diffraction pattern. But the number of DoFs will increase by one for every added free torsion angle, and more complications arise if several independent molecules have to be located altogether or/and if water molecules or chlorine/sulphur/etc atoms are involved. For inorganic compounds, in principle an atom in general position corresponds to 3 DoFs (the three xyz atomic coordinates), however, chemistry may say if some polyhedra are to be expected, then an octahedron for instance, instead of corresponding to 7x3=21 DoFs when described by the atomic coordinates, can be translated and rotated as a whole polyhedron, corresponding to only 6 DoFs.

Flexibility Most of these computer programs are also able to start from a complete set of independent atoms, at random at the beginning, and then will try to find their positions, moving them while matching to the data. Combinations of (several) molecules (or polyhedra) together with independent atoms are of course possible. Limitation The main difficulty may come finally at the Rietveld refinement stage, if the powder pattern quality becomes too low compared to the number of parameters, then it will be necessary to apply some constraints and/or restraints. And it may be difficult to complete the structure…

These computer programs are obtaining more and more success, surpassing in number the solutions by traditional approaches (Patterson or Direct methods as applied in computer programs like SHELXS - etc - or adapted to powder data in EXPO). Nevertheless, the number of SDPD per year remains quite small (close to 100, to be compare to from single crystal data). Cumulated histogram of the total number of published SDPD. Picture from the SDPD Database: iniref.html

The following details about the direct space computer programs were gathered and presented by Yuri G. Andreev at the EPDIC-8 congress (Uppsala, Suède, 2002), obtained from the authors themselves. Things have not changed a lot after two years. Note that EXPO2004/5 adds also now the direct space approach to its traditional way to solve structures (direct methods especially adapted to powder data), and even can mix the two approaches. To this list of programs may be added a few others which have special abilities for zeolites (ZEFSA-II, FOCUS, GRINSP). Comments

SA Correlated integrated intensities Chem. Commun. 931 (1998) Capsaicin - most complex structure in terms of number of variables Chem. Commun. 931 (1998) 10 torsions and 6 external DoF. Academics receive a 95% discount DASH W.I.F. David and K. Shankland Rutherford Appleton Laboratory, further developed by J. Cole and J. van de Streek CCDC, UK Telmisartan forms A and B - fairly typical structure J. Pharm. Sci. 89, 1465 (2000) 7 torsions and 6 external DoF.

EAGER K.D.M. Harris, R.L. Johnston, D. Albesa Jové, M.H. Chao, E.Y. Cheung, S. Habershon, B.M. Kariuki, O.J. Lanning, E. Tedesco, G.W. Turner University of Birmingham, UK Genetic Algorithm Full profile Acta Cryst. A, 54, 632 (1998) Heptamethylene-1,7-bis(diphenylphosphane oxide) Ph 2 P(O)(CH 2 ) 7 P(O)Ph 2 - typical structure. B.M. Kariuki, P. Calcagno, K.D.M. Harris, D. Philp, R.L. Johnston. Angew. Chem. Int. Ed. 38, 831 (1999). 35 non-H atoms in the a.u. 18 DoF including 12 torsion angles. Under active development

ENDEAVOUR K. Brandenburg and H. Putz, Crystal Impact, Bonn, Germany Combined global optimization of R-factor and potential energy using SA Integrated intensities J.Appl.Cryst. 32, 864 (1999) Ag 2 NiO 2 - typical structure Schreyer and Jansen, Sol. State Sci. 3(1-2), 25, (2001). 15 atoms in the a.u. of P1. 45 DoF. Available from Crystal Impact at reduced price for academic users.

ESPOIR A. Le Bail, Universite du Maine, France Reverse Monte Carlo and pseudo SA Integrated intensities or full profile on a pseudo powder pattern regenerated from extracted |F obs | Mat. Sci. Forum , 65 (2001). Souzalite/Gormanite Le Bail, Stephens and Hubert, European J. Mineralogy 15 (2003) atoms in the a.u. of P-1. Fe at 0,0,0 54 DoF. Free and Open - all available : executable as well as Fortran and Visual C++ source code (GPL - GNU Public Licence). Web site:

FOX V. Favre-Nicolin and R. Cerny, University of Geneva, Switzerland (Free Objects for Xtallography) Parallel Tempering or SA. Automatic correction of special positions and of sharing of atoms between polyhedra, without any a priori knowledge; multi-pattern Full profile, integrated intensities, partial integrated intensities J.Appl.Cryst. 35 (2002) 734. Aluminum methylphosphonate Al 2 (CH 3 PO 3 ) 3 - most complex structure Edgar et al. Chem. Commun. 808, (2002). 3 molecules and 2 Al atoms in the a.u. 24 DoF including bond lengths and bond angles. Free, open-source published under the GPL license

OCTOPUS K.D.M. Harris, M. Tremayne and B.M. Kariuki University of Birmingham, UK Monte Carlo Full profile J. Am. Chem. Soc. 116, 3543 (1994). Red fluorescein - typical structure. Tremayne, Kariuki and Harris. Angew. Chem. Int. Ed. 36, 770 (1997). 25 non-H atoms in the a.u. 7 DoF including 1 torsion angle. Under active development

POWDERSOLVE (part of Reflex Plus integrated package) G. Engel, S. Wilke, D. Brown, F. Leusen, O. Koenig, M. Neumann, C. Conesa- Morarilla Accelrys Ltd., Cambridge, UK Monte Carlo SA and Monte Carlo parallel tempering (Falcioni and Deem. J. Chem. Phys. 110 (1999)1754.) Full profile J. Appl.Cryst. 32, 1169 (1999) Docetaxel (C 43 H 53 NO 14 ·3H 2 O) - most complex structure L. Zaske, M.-A. Perrin and F. Leveiller, J. Phys. IV, Pr10, 221 (2001) 29 DoF including 3 rotations, 12 translations and 14 torsion angles. Can be purchased from Accelrys Inc., generous discounts given to academic researchers

SA Free, including open source. Available at PSSP P. Stephens and S. Pagola State University of New York, Stony Brook, (Powder Structure Solution Program) USA Integrated intensities (Le Bail) with novel handling of peak overlap Submitted to J.Appl.Cryst. Preprint available on Malaria Pigment Beta Haematin - most complex structure. Pagola, Stephens, Bohle, Kosar, and Madsen. Nature (2000) 43 non-H atoms in the a.u. 14 DoF.

SA + option of using a structure envelope. Not ready for general distribution but will be in public domain (still not by the end of 2004). Verify at: SAFE S. Brenner, L.B. McCusker and Ch. Baerlocher ETH Zentrum, Zurich, Switzerland (Simulated Annealing and Fragment search within an Envelope) Full-profile J. Appl. Cryst. 35, 243 (2002) Tri-  -peptide C 32 N 3 O 6 H 53 - the most complex structure. Brenner, McCusker and Baerlocher J.Appl.Cryst. 35, 243 (2002) 17 torsion angles and 6 positional and orientational DoF.

SA Simulated Annealing Y. G. Andreev and P. G. Bruce, University of St. Andrews Full-profile J. Appl. Cryst. 30, 294 (1997) (CH 2 CH 2 O) 6 :LiAsF 6 (CH 2 CH 2 O) 6 :LiAsF 6 - most complex structure. MacGlashan, Andreev, and Bruce Nature (1999) 26 non-H atoms in the a.u. 79 DoF including 15 torsion angles. Free, very user unfriendly. Requires changing of the code for each new structure determination. Customised molecular description without Z- matrix input.

TOPAS A.A. Coelho, R.W.Cheary, A. Kern, T. Taut. Bruker AXS GmbH, Karlsruhe, Germany SA (together with user definable penalty functions, rigid bodies, various bond length restraints and lattice energy minimization techniques including user definable force fields) Full-profile or integrated intensities J. Appl. Cryst. 33, 899 (2000) Caffeine Anhydrous C 8 H 10 N 4 O 2 Stowasser and Lehmann, Abstract submitted to the XIX IUCr Congress 5 molecules in the a.u. 93 DoF. Discounted price (500 €) for academic users. See :

Which software could solve the problems proposed during two previous SDPD Round Robin ? 1998 SDPDRR : DASH SDPDRR : FOX and TOPAS

CONCLUSIONS The capacities for solving structures from powder diffraction data have never been so efficient than during the past 5-12 years evolutions. One has to find his way in the SDPD maze and to select the appropriate methods and computer programs at each step of the problem (identification - which should fail to establish any relation with a known structure-, indexing, whole pattern fitting with cell constraints, structure solution, Rietveld refinement).

Advice When a SDPD is decided, you know already the complexity level. Then select the appropriate radiation, a 3rd generation synchrotron pattern being the best choice for complex cases. It is better to wait a bit for a good pattern and to solve the problem than to waste large time and not to solve the problem. Applying direct space methods requires generally much less data (3 to 5 intensities per degree of freedom may be sufficient) than direct methods. However, big organic or organometallic problems can be completely solved only if one disposes of a maximum of knowledge about the molecular formula together with the most excellent data.

Use your common sense Very complex molecules will present more serious difficulties at the Rietveld structure refinement stage : the ratio of the effective number of structure factors with the number of atomic coordinates to refine may be as small as 3 or less (because there is soon no accurate intensity on the powder pattern at resolution d <1.5 Å), so that the model needs to be constrained/restrained. This may lead to difficulties to locate some additional water molecule, or to be absolutely sure that there is not any misunderstanding somewhere which could explain why the Bragg R factor R B is going to be sometimes as large as 10 or 15%. No need to say that some proposed H atom positions will have sometimes a low credibility. You will have to know « how much is too much », or your manuscript will be rejected (by a good reviewer).

Advertisements…

15 CD-ROM of this distance learning course are available during this workshop (you may duplicate them)

Sessions 1 and 2 are in free access, but after that, you will not obtain the encrypted solutions, nor help, nor the diploma, if you do not pay the fees (250 € for students in developing countries)…

This SDPD distance learning course provides informations and guidance on the complete state of the art, not only on a few selected software. You will find inside of the CD-ROM documents or internet links about everything concerning Structure Determination by Powder Diffractometry.

Next and ultimate (for today ;-) advertisement…: