Recent activities around the crystallography open databases COD, PCOD and P2D2 Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures,

Slides:



Advertisements
Similar presentations
Determination of Protein Structure. Methods for Determining Structures X-ray crystallography – uses an X-ray diffraction pattern and electron density.
Advertisements

INTRODUCTION Massive inorganic crystal structure predictions were recently Performed, justifying the creation of new databases. Among them, the PCOD [1]
Recent developments 1) Tests (outlier analysis) and Bug fixing ( with Paul) 2) Regeneration of Values of Bonds and Bond-angles existing all structures.
Practice of analysis and interpretation of X-ray diffraction data
Chem Single Crystals For single crystals, we see the individual reciprocal lattice points projected onto the detector and we can determine the values.
Structure of thin films by electron diffraction János L. Lábár.
Applications and integration with experimental data Checking your results Validating your results Structure determination from powder data calculations.
What is e-Science? e-Science refers to large scale science that will increasingly be carried out through distributed global collaborations enabled by the.
T.G. Fawcett, S. N. Kabbekodu, F. Needham, J. R. Blanton, D. M. Crane, J. Faber International Centre for Diffraction Data Using PDF-4+/Organics to discover.
Frontiers Between Crystal Structure Prediction and Determination by Powder Diffractometry Armel Le Bail Université du Maine, Laboratoire des Oxydes et.
Timothy G. Fawcett, Soorya N. Kabbekodu, Fangling Needham and Cyrus E. Crowder International Centre for Diffraction Data, Newtown Square, PA, USA Experimental.
Inorganic Structure Prediction with GRINSP Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR 6010, Avenue O. Messiaen,
Crystallographic Data Publication at Source International Union of Crystallography Peter R. Strickland and Brian McMahon IUCr 5 Abbey Square Chester CH1.
Inorganic structure prediction : too much and not enough Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR 6010, Avenue.
Microporous Titanium Silicates Predicted by GRINSP Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR 6010, Avenue O. Messiaen,
Advanced Identification Tools. This tutorial will demonstrate how a user can increase both the speed and efficiency of the material identification process.
Putting 3D-print files of crystallographic models into open access International Advisory Board of the Crystallography Open Database & T. J. Snyder solid.
The Need for Speed. The PDF-4+ database is designed to handle very large amounts of data and provide the user with an ability to perform extensive data.
Electron Diffraction Search and Identification Strategies.
Molecular Graphics. Molecular Graphics What? PDF-4 products contain data sets with atomic coordinates. A molecular graphic package embedded in the product.
Data Mining with DDView+ and the PDF-4 Databases Solid Solution Crystal Cell Parameters Some slides of this tutorial have sequentially-layered information.
Dynamic Web Pages (Flash, JavaScript)
Crystallography Open Database (COD), Predicted Crystallography Open Database (PCOD) and Material Properties Open Database (MPOD)
INTRODUCTION The COD was created in March 2003 and was built on the PDB model of open access on the Internet. It is intended that this database [1] consists.
CONCLUSIONS COD server is technically in position to store and serve all structures that are currently solved. COD deposition procedure ensures syntactic.
Advanced Searches Using History Advanced Searches What? For a given session, a list of Standard Format Past Searches is automatically saved each time.
McMaille – Sous le Capot (Under the Bonnet) A.Le Bail Université du Maine Laboratoire des Oxydes et Fluorures CNRS – UMR 6010 FRANCE
EBI is an Outstation of the European Molecular Biology Laboratory. A web service for the analysis of macromolecular interactions and complexes PDBe Protein.
High Throughput Screening of Materials (CCP9) Friday 20 th April 2012 CXD Workshop.
EBI is an Outstation of the European Molecular Biology Laboratory. Annotation Procedures for Structural Data Deposited in the PDBe at EBI.
Crystallographic Databases I590 Spring 2005 Based in part on slides from John C. Huffman.
EBI is an Outstation of the European Molecular Biology Laboratory. A web service for the analysis of macromolecular interactions and complexes PDBe Protein.
1 SIeve+ Introduction SIeve+ is a Plug-In module to the DDView+ software which is integrated in the PDF-4 products. SIeve+ is licensed separately at an.
The PCOD and P2D2 databases (P for Predicted) Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR 6010, Avenue O. Messiaen,
Sorting Options. Search Preferences What? – Search Preferences allow the customization of the DDView+ Results table display fields. Why? – To view the.
INTRODUCTION The results from a third structure determination by powder diffractometry (SDPD) round-robin are summarized. From the 175 potential participants.
COD (CRYSTALLOGRAPHY OPEN DATABASE) and PCOD (PREDICTED) COD Advisory Board : Daniel Chateigner (France), XiaoLong Chen (China), Marco E. Ciriotti (Italy),
News in Open Databases COD, PCOD, TCOD, MPOD, FPSM … Association Française de Cristallographie 2013, Bordeaux D. Chateigner, S. Grazulis, J. Butkus, A.
Pore size distributionassessed by different techniques Pore size distribution assessed by different techniques M. A. Slasli a, F.Stoeckli a, D.Hugi-Cleary.
Structure Prediction (especially with GRINSP)
CHARACTERIZATION OF THE STRUCTURE OF SOLIDS
PDBe Protein Interfaces, Surfaces and Assemblies
Software for Crystallographic and Rietveld Analysis
Crystallography images
Smart IT Job Advisor and Analysis on web application
The Rietveld Method Armel Le Bail
McMaille v3: indexing via Monte Carlo search
Getting the Most out of the PDBe
Database Requirements for CCP4 17th October 2005
Institute of Biotechnology
Dynamic Web Pages (Flash, JavaScript)
Reverse Monte Carlo and Rietveld modelling of BaMn(Fe,V)F7 glass structures from neutron data _____ A. Le Bail Université du Maine – France
Advances in Structure Prediction of Inorganic Compounds
COD (CRYSTALLOGRAPHY OPEN DATABASE) and PCOD (PREDICTED)
Mobilizing EPA’s CompTox Chemistry Dashboard Data on Mobile Devices
PREDICTED CORNER SHARING TITANIUM SILICATES
ICDD Release 2008 New Features
Solid state (Calculations & Doping)
Homology Modeling.
A similarity index for comparing diffraction patterns
Solid state (Calculations & Doping)
Solid Crystal Structures. (based on Chap
Solid Crystal Structures. (based on Chap
The site to download BALBES:
Solid Crystal Structures. (based on Chap
Inorganic Structure Prediction with GRINSP
Web Application Development Using PHP
Ton Spek Utrecht University The Netherlands Vienna –ECM
Instructors Tim Fawcett Suri Kabekkodu Diane Sagnella
Getting Cell Parameters from Powder Diffraction Data
Presentation transcript:

Recent activities around the crystallography open databases COD, PCOD and P2D2 Armel Le Bail Université du Maine, Laboratoire des oxydes et Fluorures, CNRS UMR 6010, Avenue O. Messiaen, 72085 Le Mans Cedex 9, France. Email : lebail@univ-lemans.fr

CONTENT Foundations of the COD, PCOD, P2D2 databases - Current state of these open databases - Some external applications - Future - Conclusion

FOUNDATIONS COD = Crystallography Open Database Foundation March 2003 PCOD = Predicted Crystallography Open Database Foundation December 2003 P2D2 = Predicted Powder Diffraction Database Foundation February 2007

OPEN DATA Crystallography Databases Open access on the Web : and Crystallography Databases ——— Open access on the Web : PDB (proteins) NDB (nucleic acids) AMCSD (minerals) Toll databases : CSD (organic, organometallic) ICSD (inorganic, minerals) CRYSTMET (metals, intermetallics) ICDD (powder patterns)

PETITION RESULTS Signatures for « open access to crystal data » between May and June 2005 : 1150 YES, 4 NO, from 78 countries Including the signature of one Nobel Laureate (Richard J. Roberts, supporting PubChem as well) 35 countries with more than 5 signatures, representing 1054 signatures are listed on the graph : For understanding their motivations, read the texts sent by the signers at: http://www.crystallography.net/petition/ > 1800 signatures in 2008

COD The COD was built on the PDB model of open access on the Internet. This database consists of any small or medium crystal structure (inorganic, organic, organometallic). Currently the total entry number is close to 70000, including ~10000 entries from the American Mineralogist Crystal Structure Database (AMCSD), and CIF files donations from a few laboratories in Europe or from individuals. The distribution is made through an Apache/MYSQL/PHP system that takes queries on chemistry, ranges of cell parameters, volumes, etc, as well as combinations of fields, and can download or upload CIF files.

New COD coordinator since January 2008 : = http://cod.ibt.lt/ New COD coordinator since January 2008 : Dr. Saulius Gražulis, Institute of Biotechnology, Graiciuno 8, LT-02241 Vilnius, Lietuva (Lithuania) 68223 entries – April 2008 Recent addition of the IUCr logo giving permission to download their CIFs in September 2007

SEARCH the COD The COD wishes to offer minimal and simple search possibilities, allowing you : to verify if the structure you intend to solve is not already solved, to find models or fragments for solving your current problem, to make a correct job if an editor asks you to review a manuscript.

SEARCH OPTIONS Search page Results

GET the COD TOOLS EasyPHP (Apache server, MySQL, PHP scripts) You can download the complete database and make it run on your PC. You can reuse the complete system and create your lab CIF repository.

PCOD (P = Predicted) The PCOD, created in December 2003, is a COD subset of crystal structures predicted by the GRINSP computer program. It already contains > 100000 CIF files corresponding to M2X3, MX2, M2X5, MX3 or MaM’bXc formulations (X = O, F; M/M’ = B, Na, Si, Al, P, S, Ca, V, Ti, Fe, Nb, Re, Zr, etc), including hypothetical zeolites and other binary compounds with N-connected 3D frameworks of M atoms (N = 3, 4, 5, 6) as well as ternary compounds with mixed M/M’ frameworks. The PCOD is open for search, download and upload of predicted crystal structures (coming from any prediction computer program, inorganic or small and medium organic molecules).

110210 entries

SEARCHING PCOD Search page Results

VIRTUAL MODELS in PCOD Zeolite B2O3 nanotubes [Ca3Al4F21]3-

Entries in the PCOD

GRINSP is an Open Source software

External Applications 1 - Identification from calculated powder patterns : Actual structures : COD : Match ! Crystal Impact Virtual structures : PCOD : P2D2 -> EVA- Bruker 2 - Structural fingerprints for nanocrystals by means of TEM, HRTEM 3 - Interface with COD and PCOD for visualization, importing, exporting data to other applications like GULP to calculate energies, phonon properties, molecular dynamics, free energies and so on…

Identification from calculated powder patterns (from the COD) : Match Identification from calculated powder patterns (from the COD) : Match! sofware from Crystal Impact

Identification from calculated powder patterns

Predicted crystal structures (from the PCOD) provide predicted fingerprints

Providing a way for « immediate structure solution » Calculated powder patterns in the P2D2 allow for identification by search-match (EVA - Bruker and Highscore - Panalytical) Providing a way for « immediate structure solution » We « simply » need for a complete database of predicted structures ;-)

Example 1 – The actual and virtual structures have the same chemical formula, PAD = 0.52% (percentage of absolute difference on cell parameters, averaged) : -AlF3, tetragonal, a = 10.184 Å, c = 7.174 Å. Predicted : 10.216 Å, 7.241 Å. A global search (no chemical restraint) is resulting in the actual compound (PDF-2) in first position and the virtual one (PPDF-1) in 2nd (green mark in the toolbox).

Example 2 – Model showing uncomplete chemistry, PAD = 0. 63 Example 2 – Model showing uncomplete chemistry, PAD = 0.63. Actual compound : K2TiSi3O9H2O, orthorhombic, a = 7.136 Å, b = 9.908 Å, c =12.941 Å. Predicted framework : TiSi3O9, a = 7.22 Å, b = 9.97 Å, c =12.93 Å. Without chemical restraint, the correct PDF-2 entry is coming at the head of the list, but no virtual model. By using the chemical restraint (Ti + Si + O), the correct PPDF-1 entry comes in second position in spite of large intensity disagreements with the experimental powder pattern (K and H2O are lacking in the PCOD model) :

Example 3 – Model showing uncomplete chemistry, PAD = 0. 88 Example 3 – Model showing uncomplete chemistry, PAD = 0.88. Predicted framework : Ca4Al7F33, cubic, a = 10.876 Å. Actual compound : Na4Ca4Al7F33, a = 10.781 Å. By a search with chemical restraints (Ca + Al + F) the virtual model comes in fifth position, after 4 PDF-2 correct entries, if the maximum angle is limited to 30°(2) :

Example 4 : heulandite

Example 5 : Mordenite

Two main problems in identification by search-match process from the P2D2 : - Inaccuracies in the predicted cell parameters, introducing discrepancies in the peak positions. - Uncomplete chemistry of the models, influencing the peak intensities. However, identification may succeed satisfyingly if the chemistry is restrained adequately during the search and if the averaged difference in cell parameters is smaller than 1%.

A similarity index less sensitive to cell parameter discrepancies « New similarity index for crystal structure determination from X-ray powder diagrams, » D.W.M. Hofmann and L. Kuleshova, J. Appl. Cryst. 38 (2005) 861-866.

Typical case to be solved by prediction δ-Zn2P2O7 Bataille et al., J. Solid State Chem. 140 (1998) 62-70. Uncertain indexing, line profiles broadened by size/microstrain effects (Powder pattern not better from synchrotron radiation than from conventional X-rays) α β δ γ But the fingerprint is there…

P. Moeck and P. Fraundorf, Z. Kristallogr. 222 (2007) 634-635. Other fingerprints than powder patterns may be calculated from structural data : building fingerprints for nanocrystal identification by transmission electron microscopy. P. Moeck and P. Fraundorf, Z. Kristallogr. 222 (2007) 634-635.

J. Appl. Crys. 41 (2008) 471-475.

Examples of search with that COD User Interface

This was done already for the predicted AlF3 , using WIEN2K, not GULP: From that COD/PCOD graphical user interface, you may decide to study more seriously some series of structures predicted by the GRINSP software. This was done already for the predicted AlF3 , using WIEN2K, not GULP: A. Le Bail, F. Calvayrac, J. Solid State Chem. 179 (2006) 3159-3166.

Expected GRINSP improvements : Edge, face, corner-sharing, mixed. Hole detection, filling them automatically, appropriately, for electrical neutrality. Using bond valence rules or/and energy calculations to define a new cost function. Extension to quaternary compounds, combining more than two different polyhedra. Etc, etc. Do it yourself, the GRINSP software is open source… Nothing planned about hybrids…

Two things that don’t work well enough up to now… Validation of the Predictions - Ab initio calculations (WIEN2K, etc) : not fast enough for the validation of > 100000 structure candidates (was 2 months for 12 AlF3 models) Identification (is this predicted structure already known?) - There is no efficient tool for the fast comparison of these thousands of inorganic predicted structures to the known structures (inside of ICSD)

One advice, if you become a structure predictor Send your data (CIFs) to the PCOD, thanks… http://www.crystallography.net/pcod/

Future for the COD, PCOD, P2D2 COD : need to attain > 500000 real structures entries… Need to convince the ACS and RSC to give permission to download systematically their CIFs Need to decide more search-match software producers to incorporate powder patterns calculated from the COD PCOD and P2D2 : virtual structures Need to improve the quality of the predicted crystal structures by bond valence and energy calculations, etc

CONCLUSION To you to see what you can do with or for the COD, PCOD, P2D2 database… Knowing that : Structure and properties full prediction is THE challenge of this XXIth century in crystallography

Participate to the SDPDRR-3 (Structure Determination by Powder Diffractometry Round robin) http://sdpd.univ-lemans.fr/SDPDRR3/ 2 structures to solve: a calcium tartrate and a lanthanum tungsten oxyde Deadline : April 30, 2008