Structure Prediction (especially with GRINSP) Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR 6010, Avenue O. Messiaen, 72085 Le Mans Cedex 9, France. Email : alb@cristal.org XX Conference on Applied Crystallography, Wisla, Poland, September 2006
CONTENTS Techniques and future role, programs - GRINSP algorithm - Examples of predictions (fast show) - Running GRINSP - Availability - Satellite programs, the PCOD - Demonstration
Techniques and future role, programs Crystal structure prediction is: anticipating (establishing now) the results of future synthesis attempts or discoveries in nature. Main current role: give hope to chemists… If predictions of crystal structures and physical properties become accurate and exhaustive, the future role is clear: restrict synthesis efforts to the most interesting compounds. Knowing the structure in advance may even give clues about a possible synthesis route: for instance, for zeolites or microporous compounds in general, knowing in advance the building units and the size of the cavities or channels indicates which precursors and molecules have a better chance of producing the desired structure.
Some Techniques The ideal approach is to find energy minima by exploring atom combinations with free cell parameters, free symmetry, free composition, etc., and to list the most probable structures. You can imagine the time needed if this has to be done by quantum mechanics calculations, exploring combinations of all elements… Modelling quenching from the melt is also a developing technique (it would not give access to metastable phases; these are approached by modelling the decomposition of organometallic or hydrated compounds, precipitation from a solution, etc.). The building of larger and larger clusters, until three-dimensional order starts to appear, is studied as well.
Full or partial prediction? Blind tests in organic chemistry (organized by the CCDC, which maintains the CSD) provide a molecule shape and ask for the prediction of the cell and atomic positions. This is « molecule packing prediction », not « full prediction ». Hundreds of packing proposals, with different predicted cell parameters, are ranked by energy calculations, in the hope that the real structure is among the 4 « most stable » models (the real structure is known but unpublished, and unknown to the blind test participants). For lists of packing prediction software for organic compounds, see: W.D.S. Motherwell et al., Acta Cryst. B58 (2002) 647-661. G. M. Day et al., Acta Cryst. B61 (2005) 511-527.
Full Prediction Software
Recommended reading (review papers):
1- S.M. Woodley, in: Application of Evolutionary Computation in Chemistry, R. L. Johnston (ed), Structure and Bonding series, Springer-Verlag 110 (2004) 95-132.
2- J.C. Schön & M. Jansen, Z. Krist. 216 (2001) 307-325; 361-383.
The top level is the quantum mechanics approach:
HF (Hartree-Fock): CRYSTAL (Dovesi et al., 1989)
DFT (Density Functional Theory): CASTEP
DFT + FP-LAPW (full-potential linearized augmented plane wave): WIEN2K
The problem is that quantum mechanics is (even now) too demanding in computer time, so it is mainly used as a final check on the feasibility of structure candidates proposed by more empirical approaches (using various potentials, or extrapolating geometrical considerations established by data mining).
Software using simplified potential approaches: G42 (energy landscape), GULP, SPuDS
Software using mixed approaches (potentials + geometry): AASBU (Cerius-2 + GULP); zeolites by graph theory + GULP
Software using a purely geometrical approach: GRINSP (+ WIEN2K if there are not too many models)
Is structure prediction interesting now for solving structures? Yes, in one special case: it is the last chance if no indexing can be obtained. No, in the general case: if crystals or an indexed powder pattern are available, then solving by the now classical direct and/or direct-space methods is much faster and more efficient. But, as prediction becomes more efficient, more and more new structures could be determined directly at the identification stage, by comparison against databases of predicted structures…
GRINSP algorithm Geometrically Restrained INorganic Structure Prediction. Applies the knowledge about the geometrical characteristics of a particular group of inorganic crystal structures (N-connected 3D networks with N = 3, 4, 5, 6, for one or two N values). Explores that limited and special space (exclusively corner-sharing polyhedra) by a Monte Carlo approach. The cost function is very basic, depending on weighted differences between ideal and calculated interatomic distances for first neighbours M-X, X-X and M-M in binary MaXb or ternary MaM'bXc compounds. J. Appl. Cryst. 38, 2005, 389-395. J. Solid State Chem. 179, 2006, 3159-3166.
More details about the GRINSP algorithm Two steps: Step 1 - Generation of raw models. Random choice (Monte Carlo) is used to determine the cell dimensions, select Wyckoff positions and place the M/M' atoms. The cell is progressively filled until the geometrical restraints and constraints fixed by the user (exact coordination, but large tolerance on distances) are satisfied, if possible. The number of M/M' atoms placed is not predetermined. Atoms do not move. It is recommended to survey all 230 space groups.
Step 2 - Optimization The X atoms are placed at the (M/M')-(M/M') midpoints (corner-sharing). Interatomic distances and cell parameters are optimized (by Monte Carlo): it is verified that regular polyhedra (M/M')Xn can really be built starting from the raw initial models with M/M' atoms only. Cost function: $R = [(R_1+R_2+R_3)/(R_{01}+R_{02}+R_{03})]$, where $R_n$ and $R_{0n}$ for n = 1, 2, 3 are defined by $R_n = [w_n(d_{0n}-d_n)]^2$ and $R_{0n} = [w_n d_{0n}]^2$. The $d_{0n}$ are the ideal distances M-X (n=1), X-X (n=2) and M-M (n=3), the $d_n$ being the observed distances in the model. Weighting is applied through the $w_n$.
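The cost function of step 2 can be sketched in a few lines of Python. This is only an illustration, not GRINSP source code: the function name `grinsp_r` and the dictionary layout are hypothetical, and the sketch assumes that each Rn and R0n is accumulated over all distances of type n found in the model (the slide does not show a summation explicitly). It returns the ratio exactly as written on the slide.

```python
def grinsp_r(distances, ideals, weights):
    """Ratio (R1 + R2 + R3) / (R01 + R02 + R03) as written on the slide.

    distances: dict mapping a pair type ("M-X", "X-X", "M-M") to the list
               of distances observed in the model, in angstroms.
    ideals:    dict mapping each pair type to its ideal distance d0n.
    weights:   dict mapping each pair type to its weight wn.
    Assumption: Rn and R0n are summed over every distance of type n.
    """
    numerator = 0.0
    denominator = 0.0
    for pair_type, observed in distances.items():
        d0 = ideals[pair_type]
        w = weights[pair_type]
        for d in observed:
            numerator += (w * (d0 - d)) ** 2
            denominator += (w * d0) ** 2
    if denominator == 0.0:
        raise ValueError("no distances supplied")
    return numerator / denominator


# Ideal Ti-O, O-O and Ti-Ti distances taken from the distgrinsp.txt
# example shown later in the talk; unit weights are an arbitrary choice.
ideals = {"M-X": 1.950, "X-X": 2.758, "M-M": 3.800}
weights = {"M-X": 1.0, "X-X": 1.0, "M-M": 1.0}

perfect = grinsp_r({"M-X": [1.950], "X-X": [2.758], "M-M": [3.800]},
                   ideals, weights)
distorted = grinsp_r({"M-X": [1.90, 2.00]}, ideals, weights)
print(perfect, distorted)
```

A model whose distances all equal the ideal values gives R = 0; any distortion raises R, which is what the Monte Carlo optimization tries to reduce.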
More details on step 2 This time the atoms move, but no jump that would break coordinations is allowed. The cell parameters established at step 1 can change considerably during the optimization (up to 30%). The original space group, whose Wyckoff positions were used to place the M/M' atoms at step 1, may no longer be appropriate after placing the X atoms and optimizing. This is why the final model is proposed in the P1 space group (coordinates written into a CIF). The final choice of the symmetry has to be made with checking software such as PLATON (A.L. Spek).
Examples of predictions obtained from all this software (fast show…)
Hypothetical Carbon Polymorph Suggested By CASTEP
Another CASTEP prediction
Zeolites !
G42 The concept of the 'energy landscape' of chemical systems is used by Schön and Jansen for structure prediction with their program named G42. J.C. Schön & M. Jansen, Z. Krist. 216 (2001) 307-325; 361-383.
SPuDS Dedicated especially to the prediction of perovskites. M.W. Lufaso & P.M. Woodward, Acta Cryst. B57 (2001) 725-738.
AASBU approach
GRINSP Predictions Hypothetical zeolite PCOD1010026: SG P432, a = 14.623 Å, FD = 11.51. 1600 zeotypes predicted up to now with cell parameters < 16 Å.
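As a quick check of the numbers on this slide: the framework density (FD) of a zeotype is conventionally the number of tetrahedral (T) atoms per 1000 Å³. The short sketch below back-calculates the T-atom count of the cubic cell from the quoted a and FD. The function name is hypothetical and the resulting count is an inference from the two quoted values, not a figure given on the slide.

```python
def framework_density(n_t_atoms, cell_volume):
    """Number of T atoms per 1000 cubic angstroms."""
    return 1000.0 * n_t_atoms / cell_volume


a = 14.623                      # cubic cell edge in angstroms (from the slide)
volume = a ** 3                 # cubic cell, so V = a^3
quoted_fd = 11.51               # FD quoted on the slide

# Nearest whole number of T atoms compatible with the quoted FD
n_t = round(quoted_fd * volume / 1000.0)
fd = framework_density(n_t, volume)
print(n_t, round(fd, 2))
```

The rounded result reproduces the quoted FD, which suggests that the rounding of a and FD on the slide is self-consistent.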
Other GRINSP predictions: > 3000 B2O3 polymorphs. Hypothetical B2O3 - PCOD1062004. BO3 triangles sharing corners = 3-connected 3D nets
> 500 V2O5 polymorphs: square-based pyramids = 5-connected 3D nets
12 AlF3 polymorphs: corner-sharing octahedra = 6-connected 3D nets
Borosilicates PCOD2050102, Si5B2O13, R = 0.0055. SiO4 tetrahedra and BO3 triangles. > 3000 models
Aluminoborates Example: [AlB4O9]3-, cubic, SG : Pn-3, a = 15.31 Å, R = 0.0051. AlO6 octahedra and BO3 triangles. > 2000 models
Fluoroaluminates Known Na4Ca4Al7F33: PCOD1000015 - [Ca4Al7F33]4-. Octahedra of two sizes, AlF6 and CaF6
Titanosilicates TiO6 octahedra and SiO4 tetrahedra > 1000 models
Open doors, Limitations, Problems GRINSP limitation: exclusively corner-sharing polyhedra. Even so, this opens the door to potentially > 50,000 hypothetical compounds. The predicted titanosilicates can be extrapolated to phosphates, sulfates, and/or by replacing Ti with Nb, V, Zr, Ga, etc. More than 10,000 should be included in the PCOD before the end of 2006. Their powder patterns will then be calculated and possibly used for search-match identification.
Expected improvements: Edge-, face- and corner-sharing, and mixed cases. Detection of holes and their automatic, appropriate filling for electrical neutrality. Using bond valence rules and/or energy calculations to define a new cost function. Extension to quaternary compounds, combining more than two different polyhedra. Etc., etc. Do it yourself, the GRINSP software is open source…
Two things that do not work well enough up to now… Validation - Ab initio calculations (WIEN2K, etc.) are not fast enough for the validation of > 10000 structure candidates (2 months were needed for 12 AlF3 models). Identification - There is no efficient tool for identifying the known structures (from the ICSD) among > 10000 hypothetical compounds.
Running GRINSP:
1- The user has first to build a file according to his/her desires. Example:
    TiO6/VO5 - space group 55 ! Title line
    55 55 ! Space groups range (you may test the range 1 230)
    2 0 2 192 ! Npol, connectivity, min & max number of M/M' atoms
    6 5 ! Polyhedra coordinations
    Ti O ! Elements for the first polyhedra
    V O ! Elements for the second polyhedra
    3. 30. 3. 30. 3. 30. ! Min & max a, b, c
    5. 35. ! Min & max framework density
    20000 300000 0.02 0.12 ! Ncells, MCmax, Rmax, Rmax to optimize
    5000 1 ! Number of MC steps/atom at optimization, code for cell
    1 ! Code for output files
Note: that calculation would need 1 day with a single processor running at 3 GHz.
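To check such a control file before launching a long run, it can be read back with a few lines of Python. The sketch below is not part of the GRINSP package: the function names and dictionary keys are hypothetical, the meaning of each line is taken only from the "!" comments of the example above, and it assumes that the number of element lines equals Npol (true for this example).

```python
EXAMPLE = """\
TiO6/VO5 - space group 55 ! Title line
55 55 ! Space groups range (you may test the range 1 230)
2 0 2 192 ! Npol, connectivity, min & max number of M/M' atoms
6 5 ! Polyhedra coordinations
Ti O ! Elements for the first polyhedra
V O ! Elements for the second polyhedra
3. 30. 3. 30. 3. 30. ! Min & max a, b, c
5. 35. ! Min & max framework density
20000 300000 0.02 0.12 ! Ncells, MCmax, Rmax, Rmax to optimize
5000 1 ! Number of MC steps/atom at optimization, code for cell
1 ! Code for output files
"""


def read_control(text):
    """Split each non-empty line into (values, comment) at the first '!'."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            values, _, comment = line.partition("!")
            rows.append((values.strip(), comment.strip()))
    return rows


def summarize(rows):
    """Collect a few fields; assumes one element line per polyhedron type."""
    npol = int(rows[2][0].split()[0])
    first_sg, last_sg = (int(v) for v in rows[1][0].split())
    return {
        "title": rows[0][0],
        "space_groups": (first_sg, last_sg),
        "coordinations": [int(v) for v in rows[3][0].split()],
        "polyhedra": [tuple(rows[4 + i][0].split()) for i in range(npol)],
        "cell_limits": [float(v) for v in rows[4 + npol][0].split()],
        "framework_density": tuple(float(v) for v in rows[5 + npol][0].split()),
    }


config = summarize(read_control(EXAMPLE))
for key, value in config.items():
    print(key, "=", value)
```

Printing the fields this way makes it easy to spot a shifted line (for instance a missing element line) before committing a day of computing time.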
2 - Verify that the atom pairs are defined: look into the file distgrinsp.txt distributed with the package:
    V O 5 3.050 4.050 3.550 1.526 2.126 1.826 2.282 2.882 2.582 4.20 7.00
    Ti O 6 3.300 4.300 3.800 1.650 2.250 1.950 2.458 3.057 2.758 4.45 6.95
Minimum, maximum and ideal distances for the pairs V-V, V-O and O-O in fivefold coordination, plus a range for second V-V neighbours (square pyramids favoured). The same for Ti-Ti, Ti-O and O-O in octahedral coordination TiO6. Trigonal prisms may well be produced, but with larger R values.
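The two distgrinsp.txt lines above can be read and cross-checked with simple geometry. The sketch below is illustrative only (function names and dictionary keys are hypothetical); the column order follows the description on the slide. It shows that the ideal X-X distance is √2 times the ideal M-X distance, as expected for X-M-X angles of 90°, and computes the M-X-M bridging angle implied by the ideal M-M distance. These are readings of the numbers, not statements taken from the GRINSP documentation.

```python
import math


def parse_pair_line(line):
    """Split one distgrinsp.txt line into named (min, max, ideal) triples."""
    fields = line.split()
    numbers = [float(v) for v in fields[3:]]
    return {
        "M": fields[0],
        "X": fields[1],
        "coordination": int(fields[2]),
        "M-M": tuple(numbers[0:3]),          # min, max, ideal
        "M-X": tuple(numbers[3:6]),          # min, max, ideal
        "X-X": tuple(numbers[6:9]),          # min, max, ideal
        "second M-M": tuple(numbers[9:11]),  # range for second neighbours
    }


def bridge_angle(d_mm, d_mx):
    """M-X-M angle in degrees for X bonded to two M atoms at distance d_mx."""
    return 2.0 * math.degrees(math.asin(d_mm / (2.0 * d_mx)))


v_line = "V O 5 3.050 4.050 3.550 1.526 2.126 1.826 2.282 2.882 2.582 4.20 7.00"
ti_line = "Ti O 6 3.300 4.300 3.800 1.650 2.250 1.950 2.458 3.057 2.758 4.45 6.95"

for line in (v_line, ti_line):
    pair = parse_pair_line(line)
    ideal_mm, ideal_mx, ideal_xx = pair["M-M"][2], pair["M-X"][2], pair["X-X"][2]
    ratio = ideal_xx / ideal_mx
    angle = bridge_angle(ideal_mm, ideal_mx)
    print(pair["M"], round(ratio, 3), round(angle, 1))
```

The ideal M-M distance being shorter than twice the M-X distance means that the ideal M-X-M bridge is bent rather than linear, even though the X atoms start at the M-M midpoints at the beginning of step 2.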
3- Start GRINSP. It will call other files: wyckoff.txt and connectivity.txt
The wyckoff.txt file contains the general and special positions for all the standard space groups. You don't have to modify it, unless you detect an error or want to insert non-standard settings.
The connectivity.txt file contains the coordination sequences (CS) of identified existing structures (the example file contains the known zeotypes) and/or of virtual structures already predicted. GRINSP will inform you if new predictions match structures whose CS are already included in that file.
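A coordination sequence lists, for a given node of the net, how many nodes are reached in exactly 1, 2, 3, … bonds. A minimal sketch of how such a sequence can be computed by breadth-first search on a periodic net is given below. It is not the code used by GRINSP or by its CONNECT and CIF2CON satellites: the function name and the edge representation (node i, node j, cell offset of j) are assumptions made for the illustration.

```python
def coordination_sequence(n_nodes, edges, start=0, shells=4):
    """Count the nodes at graph distance 1..shells from `start`.

    edges: list of (i, j, (dx, dy, dz)) meaning node i of a cell is bonded
           to node j of the cell shifted by (dx, dy, dz) lattice vectors.
    """
    # Build the adjacency list, adding the reverse direction of every edge.
    neighbours = {i: [] for i in range(n_nodes)}
    for i, j, (dx, dy, dz) in edges:
        neighbours[i].append((j, (dx, dy, dz)))
        neighbours[j].append((i, (-dx, -dy, -dz)))

    origin = (start, (0, 0, 0))
    seen = {origin}
    frontier = [origin]
    sequence = []
    for _ in range(shells):
        next_frontier = []
        for node, (x, y, z) in frontier:
            for other, (dx, dy, dz) in neighbours[node]:
                image = (other, (x + dx, y + dy, z + dz))
                if image not in seen:
                    seen.add(image)
                    next_frontier.append(image)
        sequence.append(len(next_frontier))
        frontier = next_frontier
    return sequence


# Primitive cubic net: one node per cell bonded to its own images along
# the three lattice vectors (the simplest 6-connected 3D net).
cubic = [(0, 0, (1, 0, 0)), (0, 0, (0, 1, 0)), (0, 0, (0, 0, 1))]
# Diamond net: two nodes per primitive cell (a 4-connected 3D net).
diamond = [(0, 1, (0, 0, 0)), (0, 1, (-1, 0, 0)),
           (0, 1, (0, -1, 0)), (0, 1, (0, 0, -1))]

print(coordination_sequence(1, cubic))
print(coordination_sequence(2, diamond))
```

Two models with different coordination sequences cannot have the same topology, which is why a CS is a convenient fingerprint for recognizing already known nets.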
4- Wait… (hours, days, weeks, months…) and see the summary at the end of the output file with extension .imp :
5 - See the results (here by applying Diamond to a CIF):
Availability GRINSP is « Open Source », GNU Public Licence Downloadable from the Internet at : http://www.cristal.org/grinsp/
Satellite programs distributed with the GRINSP package
GRINS: quickly builds isostructural compounds by substituting elements in previous models:
- FeF3, CrF3, GaF3, etc., from AlF3
- gallophosphates, zirconosilicates, sulfates, etc., from titanosilicates.
CUTCIFP, CIF2CON, CONNECT, FRAMDENS: programs for
- cutting multiple CIFs into series of single CIFs,
- extracting coordination sequences from CIFs,
- analysing series of CIFs, recognizing identical/different models and sorting them according to R,
- extracting framework densities and sorting.
Example of CIF produced by GRINSP and inserted into the PCOD The coordination sequence is added at the end as a comment …..
Demonstration Prediction of the τ-AlF3 structure, the latest MX3 structure type, solved in 1992 from powder diffraction data. GRINSP can be used as a structure determination tool in special cases (cell and chemistry known, and certainty that the structure is an N-connected 3D net or an N/N'-connected framework…)