From Papertape Input to ‘Forensic Crystallography’ A History of the Program PLATON Ton Spek, Bijvoet Center Utrecht University The Netherlands K.N.Trueblood.

Presentation transcript:

From Papertape Input to ‘Forensic Crystallography’ A History of the Program PLATON Ton Spek, Bijvoet Center Utrecht University The Netherlands K.N.Trueblood Award Lecture Chicago, July 29, 2010.

Some History Back in 1966 I started crystallography as a student in the Laboratory for Crystal and Structural Chemistry at Utrecht University, which was at that time headed by Prof. A.F. Peerdeman. Peerdeman (co-author of the famous Bijvoet, Peerdeman & van Bommel paper on absolute configuration) was the successor of Prof. J.M. Bijvoet. Dorothy Hodgkin came over during that time to talk about the Vitamin B12 structure and her overseas collaboration with Ken Trueblood.

After WWII, Bijvoet had managed to start a new lab in a stately house (used by the Gestapo during WWII) close to the centre of the city of Utrecht. Part of the house was his private domain. After his retirement, he still kept a pied-a-terre for when he was in Utrecht. As a student, I shared the family bedroom … in its double function as student room.

Former ‘Crystal Palace’ and home of Prof. J.M. Bijvoet

Computing-I The Crystal Palace was the home of the first two generations of computing platforms within Utrecht University (Zebra and X1 respectively). By 1966 computing had moved to a University computing centre elsewhere in the city. Computing was done from then on with an ALGOL-specific X8 computer (from the Dutch company Electrologica, later part of Philips). Processing was essentially one job at a time.

~1966, Electrologica X8 ALGOL60 ‘Mainframe’ (<1MHz) 16kW

Computing-II Jobs were run by an operator during daytime shifts. Most of our crystallographic work was done during the once-a-week 13-hour nightshift, when we as crystallographers had the computer to ourselves. Half of the staff stayed overnight. During that nightshift we were the scientist, the software developer and the system operator in one. I/O was paper tape based. One job at a time. Very little memory. No stored binaries, thus recompilation every time.

Computing-III Programs and data were on paper tape. The preparation of programs and program input was done on the so-called Flexowriter. This very noisy electrical typewriter was also often used as output medium. Editing was done with a pair of scissors, to cut out unwanted material from the source code, and adhesive tape, to glue a substitute into the paper tape.

Flexowriter for the creation and editing of programs and input data

The Science My supervisor, Dr. J.A. Kanters, gave me an interesting assignment to work on. He handed me a batch of white crystals with unknown composition (code named M200). The assignment was to find out what was the structure, using single crystal X-ray techniques only.

Data Collection for M200 Preliminary investigations done with film data pointed at space group P-1. A Patterson synthesis based on integrated Weissenberg projection data subsequently suggested a light atom structure. Eventually a three-dimensional data set was collected with an Enraf-Nonius AD3 diffractometer (two weeks of data collection!).

Nonius AD3 Diffractometer

Structure Determination of M200 It took half a year to finally find the structure. The laboratory had a tradition in Direct Methods (Beurskens, de Vries, Kroon, Krabbendam). However, all available software failed to solve my structure (these were pre-MULTAN days). In the end I had to write my own Direct Methods program (AUDICE), which solved the triclinic structure as well as many other unsolved structures that were hanging around in the lab.

The Structure 3-Methoxy-glutaconic acid

The Program

The Program AUDICE AUDICE was one of the Symbolic Addition programs that were developed in that period. Its specialty was that, at the start of the evaluation of the strong triple product indications for a positive sign, 27 symbols were introduced for strong starting reflections, rather than the three or so used by some other approaches. Eventually, 8 solutions were produced by eliminating 24 symbols based on multiple ‘indications’. In addition, the ‘correlation method’ was used to improve the reliability of triple phase relations.

The Correlation Method (figure labels: H, K, H+K, L) P+ for the triple H, K, H+K depends on |E(H)E(K)E(H+K)|. ‘Correlation Method’: improved P+ on the basis of the P+ of three adjacent triples, |E(H)E(L)E(H+L)|, |E(K)E(L-K)E(L)| and |E(H+K)E(L-K)E(H+L)|. I.e. strengthening of P+(|E(H)E(K)E(H+K)|) when, in addition, E(H+L), E(L-K) and E(L) are strong. (Note: theoretically formalized in terms of neighbourhoods, Hauptman.)
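For orientation, the dependence of P+ on the triple product is commonly written in the Cochran & Woolfson form for a centrosymmetric, equal-atom structure. This is the standard textbook expression rather than a formula taken from the slides, and N (the number of atoms in the unit cell) is a symbol introduced here:

```latex
P_{+}(H,K) \;=\; \tfrac{1}{2} + \tfrac{1}{2}\,
  \tanh\!\left( \frac{\lvert E(H)\,E(K)\,E(H+K) \rvert}{\sqrt{N}} \right)
```

The larger the product of the three |E| values, the closer P+ gets to 1, which is why only strong reflections are used as starting points.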

Epilogue The structure of M200 has been published. Unfortunately, attempts to publish AUDICE in Acta Cryst. stranded on the referee requirement to compare its performance on non-ALGOL (real..) platforms. Anyway, AUDICE was superseded by the program MULTAN (Fortran) on the new CDC University Mainframe. The structure solves and refines in a matter of seconds on current hardware with SYSTEM S =>

Automatic Structure Solution of M200 in the No-Questions-Asked Mode

Direct Methods Meetings Multiple meetings and schools were organized in the 70’s with Direct Methods (software and theory) as their major subject. Examples are the NATO schools in Parma and York, the schools in Erice (1974 & 1978) and the meetings at the Medical Foundation (Buffalo), where I met Ken Trueblood. Important ones were also the CECAM workshops on Direct Methods in the early 70’s in Orsay (near Paris), around a big IBM-360 and with lectures by Hauptman (5 weeks!, bringing together people working in the field to work on current issues). (Participants: Germain, Main, Destro, Viterbo.) The program MULTAN was finalized there. Photos of the participants of the Parma 1973 meeting and the 1978 Erice School are next:

Hauptman Lectures Parma Spring 1973

The National Facility In 1971, a national single crystal service facility was started, with me to make it all happen. I kept that position for 38 years until my emeritus status in [...]. The project is now continued by my former co-worker Martin Lutz. My last postdoc was Maxime Siegler, now staff crystallographer at the Johns Hopkins University. The program PLATON is a side product of the national facility (note: never explicitly funded!).

PLATON Work on PLATON started in [...]. The idea was to produce, with a single ‘CALC ALL’ instruction, an exhaustive listing of derived geometry to give to our clients. Over time numerous additional tools have been added on the basis of the needs in our service setting. PLATON is, in combination with SHELX, one of the major tools for our service.

PLATON Tools The available tools are shown as clickable options on the opening window of the program. Examples are ADDSYM for the detection of missed symmetry, TwinRotMat for automatic twinning detection and SYSTEM S for guided/automated structure determination. Here we will look in some detail at a few of the tools: SQUEEZE, for the handling of disordered solvents; Structure Validation (used as part of the IUCr CheckCIF); FLIPPER, a new approach to structure determination.

The Disordered Solvent Problem Molecules of interest often co-crystallize (only) with the inclusion of a suitable solvent molecule. Solvent molecules often fill voids in a structure with little interaction, are located on symmetry sites and have a population of less than 1.0. Often the nature of the (mixture of) included solvent(s) is unclear. Inclusion of the scattering contribution of the solvent can be done either with a disorder model or with SQUEEZE.

THE MOLECULE THAT INVOKED THE BYPASS/SQUEEZE TOOL Salazopyrin from DMF – R = 0.096

Structure Modeling and Refinement Problem for the Salazopyrin structure: the difference Fourier map shows disordered channels rather than maxima. How to handle this in the refinement? SQUEEZE!

Looking down the Infinite Channels in the Salazopyrin Structure How to model this disorder in the L.S-Refinement ?

The SQUEEZE Tool The SQUEEZE tool offers an alternative to the refinement of a disorder model for a structure containing disordered solvent. The contribution of the disordered solvent to the calculated structure factors is taken into account by back-Fourier transformation of the electron density found in the solvent region of the difference map. This requires an iterative series of difference map improvements. Firstly, the solvent accessible region has to be identified, to be used as a mask over the difference density map.

Solvent Accessible Voids A typical crystal structure has only in the order of 65% of the available space filled. The remaining volume is in voids (cusps) in between atoms (too small to accommodate an H-atom). Solvent accessible voids can be defined as regions in the structure that can accommodate at least a sphere with radius 1.2 Angstrom without intersecting with any of the van der Waals spheres assigned to each atom in the structure. Next slide: Void Algorithm, Cartoon Style.

DEFINE SOLVENT ACCESSIBLE VOID STEP #1 – EXCLUDE VOLUME INSIDE THE VAN DER WAALS SPHERE

STEP #2 – EXCLUDE AN ACCESS RADIAL VOLUME TO FIND THE LOCATION OF ATOMS WITH THEIR CENTRE AT LEAST 1.2 ANGSTROM AWAY White Area: Ohashi Volume. Location of possible atom centres.

DEFINE SOLVENT ACCESSIBLE VOID STEP #3 – EXTEND INNER VOLUME WITH POINTS WITHIN 1.2 ANGSTROM FROM ITS OUTER BOUNDS
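The three cartoon steps can be sketched on a grid. The following is a minimal toy version, not PLATON's implementation: it assumes a non-periodic 2D box (no symmetry or cell translations), and the function name `find_voids`, the atom list and the grid step are all made up for the illustration. Only the 1.2 Angstrom probe radius comes from the slides.

```python
import math

PROBE_RADIUS = 1.2  # Angstrom, the probe radius quoted in the slides


def find_voids(atoms, box, step=0.5, probe=PROBE_RADIUS):
    """Return (ohashi, accessible) grid points for a non-periodic 2D box.

    atoms: list of (x, y, vdw_radius) tuples (hypothetical toy input).
    box:   side length of the square grid region in Angstrom.
    """
    n = int(round(box / step)) + 1
    grid = [(i * step, j * step) for i in range(n) for j in range(n)]

    def clearance(p):
        # Smallest distance from p to any van der Waals surface (negative = inside).
        return min(math.dist(p, (x, y)) - r for x, y, r in atoms)

    # Steps 1 and 2: drop points inside a vdW sphere or closer than one probe
    # radius to it. What remains can hold a probe centre (the Ohashi volume).
    ohashi = [p for p in grid if clearance(p) >= probe]
    # Step 3: extend that inner volume with every point within one probe radius.
    accessible = [p for p in grid
                  if any(math.dist(p, q) <= probe for q in ohashi)]
    return ohashi, accessible


# Four toy atoms on the corners of a 10 x 10 Angstrom box, vdW radius 1.8.
atoms = [(0.0, 0.0, 1.8), (10.0, 0.0, 1.8), (0.0, 10.0, 1.8), (10.0, 10.0, 1.8)]
ohashi, accessible = find_voids(atoms, box=10.0)
```

The accessible region is always at least as large as the Ohashi region, and by construction it never reaches inside a van der Waals sphere.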

EXAMPLE OF A VOID ANALYSIS Listing of all voids in the unit cell. The numbers in [ ] refer to the Ohashi Volume.

VOID APPLICATIONS Detection of Solvent Accessible Voids in a Structure. Calculation of Kitaigorodskii Packing Index. Determination of the available space in solid state reactions (Ohashi). Determination of pore volumes, pore shapes and migration paths in microporous crystals. As part of the SQUEEZE routine to handle the contribution of disordered solvents in a crystal structure.

SQUEEZE Takes the contribution of disordered solvents to the calculated structure factors into account by back-Fourier transformation of the density found in the ‘solvent accessible volume’ outside the ordered part of the structure (iterated). Refine with SHELXL using the solvent-free .hkl, or with CRYSTALS using the SQUEEZE solvent contribution and the full Fobs. Note: SHELXL lacks an option for a fixed contribution to the structure factor calculation.

SQUEEZE Algorithm
1. Calculate difference Fourier map (FFT).
2. Use the VOID-map as a mask on the FFT-map to set all density outside the VOIDs to zero.
3. FFT⁻¹ this masked difference map -> contribution of the disordered solvent to the structure factors.
4. Calculate an improved difference map with F(obs) phases based on F(calc) including the recovered solvent contribution and F(calc) without the solvent contribution.
5. Recycle to 2 until convergence.
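The cycle above can be sketched as a one-dimensional toy. This is an illustration under stated assumptions, not the PLATON code: a plain discrete Fourier transform replaces the FFT, F(000) is treated as if it were observed, a fixed number of cycles stands in for the convergence test, and the names `dft`, `idft` and `squeeze_1d` are invented here.

```python
import cmath


def dft(rho):
    """Forward transform: density on n grid points -> n structure factors."""
    n = len(rho)
    return [sum(rho[x] * cmath.exp(-2j * cmath.pi * h * x / n) for x in range(n))
            for h in range(n)]


def idft(coeffs):
    """Inverse transform, keeping the real part of the resulting map."""
    n = len(coeffs)
    return [sum(coeffs[h] * cmath.exp(2j * cmath.pi * h * x / n)
                for h in range(n)).real / n
            for x in range(n)]


def squeeze_1d(f_obs, f_model, void_mask, cycles=5):
    """Recover the solvent density inside the masked void region (toy version)."""
    f_solv = [0j] * len(f_obs)
    solvent = [0.0] * len(f_obs)
    for _ in range(cycles):
        f_total = [fm + fs for fm, fs in zip(f_model, f_solv)]
        # Steps 1 and 4: |F(obs)| phased with F(calc) including solvent,
        # minus F(calc) of the ordered model alone.
        coeffs = [fo * (ft / abs(ft) if abs(ft) > 0 else 1) - fm
                  for fo, ft, fm in zip(f_obs, f_total, f_model)]
        diff_map = idft(coeffs)
        # Step 2: the void mask sets all density outside the void to zero.
        solvent = [d if inside else 0.0 for d, inside in zip(diff_map, void_mask)]
        # Step 3: back-transform the masked map into the solvent contribution.
        f_solv = dft(solvent)
    return solvent, f_solv


# Toy data: one heavy 'atom' at the origin, 4 electrons of solvent in the void.
n = 32
model = [0.0] * n
model[0] = 10.0
true_solvent = [0.0] * n
true_solvent[15], true_solvent[16], true_solvent[17] = 1.0, 2.0, 1.0
void_mask = [12 <= x <= 20 for x in range(n)]
f_obs = [abs(f) for f in dft([m + s for m, s in zip(model, true_solvent)])]
solvent, f_solv = squeeze_1d(f_obs, dft(model), void_mask)
electrons_in_void = sum(solvent)
```

The toy data are chosen so that the model phases are already correct, which makes the recovered density match the four electrons placed in the void after the first cycle. With real data the phases improve only gradually, which is why the procedure is iterated.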

SQUEEZE In the Complex Plane Fc(model) Fc(solvent) Fc(total) Fobs Solvent Free Fobs Black: Split Fc into a discrete and solvent contribution. Red: For SHELX refinement, temporarily subtract the recovered solvent contribution from Fobs.

Real World Example THF molecule disordered over a center of inversion Comparison of the result of a disorder model refinement with a SQUEEZE refinement

Disorder Model Refinement Final R = 0.033

Disorder Model R = SQUEEZE Model R = Comparison of the Results of the two Modeling Procedures

LISTING OF FINAL SQUEEZE CYCLE RESULTS

ANALYSIS OF R-VALUE IMPROVEMENT WITH RESOLUTION

Concluding Remarks The CSD includes in the order of 1000 entries where SQUEEZE was used. Care should be taken with issues such as charge balance.

Charge Flipping Charge Flipping as an alternative for structure solution by Direct Methods was introduced by G. Oszlanyi & A. Suto (2004). Acta Cryst. A60, 134. Similar to SQUEEZE it involves iterated forward and backward Fourier transforms. PLATON implements an experimental version of Charge Flipping named FLIPPER. Following is an example of the P2₁, Z=2 structure of vitamin C solved by FLIPPER, starting with all reflections assigned a phase of zero degrees.

FLIPPER Charge Flipping is done with data in space group P1. The space group is determined from the solution. The method can be used for automatic structure determination of non-disordered structures. Following is the real time display of the progress in the development of the structure after each Fourier cycle, followed by a full refinement.
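A single charge-flipping cycle is short enough to sketch in one dimension. The code below is a hedged toy, not FLIPPER: it assumes an odd number of grid points (to avoid a Nyquist term), lets F(000) float as in the original Oszlanyi & Suto scheme, and uses a threshold `delta` and cycle count picked arbitrarily. The names `charge_flip`, `dft` and `idft` are invented for the illustration, and no claim is made that this toy run solves the structure.

```python
import cmath


def dft(rho):
    n = len(rho)
    return [sum(rho[x] * cmath.exp(-2j * cmath.pi * h * x / n) for x in range(n))
            for h in range(n)]


def idft(coeffs):
    n = len(coeffs)
    return [sum(coeffs[h] * cmath.exp(2j * cmath.pi * h * x / n)
                for h in range(n)).real / n
            for x in range(n)]


def charge_flip(f_obs, delta, cycles=20):
    """Iterate flip -> transform -> impose |F(obs)| -> back-transform."""
    n = len(f_obs)
    if n % 2 == 0:
        raise ValueError("this toy version expects an odd number of grid points")
    # Start with all phases set to zero, as in the example on the slide.
    rho = idft([complex(a) for a in f_obs])
    for _ in range(cycles):
        # Flip the sign of all density below the threshold delta.
        flipped = [r if r >= delta else -r for r in rho]
        g = dft(flipped)
        f_new = [0j] * n
        f_new[0] = g[0]  # F(000) is not observed; keep the current value
        for h in range(1, (n - 1) // 2 + 1):
            phase = g[h] / abs(g[h]) if abs(g[h]) > 0 else 1
            f_new[h] = f_obs[h] * phase          # observed amplitude, new phase
            f_new[n - h] = f_new[h].conjugate()  # Friedel mate keeps the map real
        rho = idft(f_new)
    return rho


# Toy 'structure': three point atoms on a 15-point grid.
n = 15
rho_true = [0.0] * n
rho_true[2], rho_true[7], rho_true[11] = 3.0, 2.0, 1.0
f_obs = [abs(f) for f in dft(rho_true)]
rho = charge_flip(f_obs, delta=0.2)
```

After every cycle the map reproduces the observed amplitudes exactly; only the phases change from cycle to cycle. In this symmetric toy start the map also stays centrosymmetric, so it is a picture of the mechanics rather than of a successful solution.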

Automated Structure Validation It is easy to miss problems with a structure as a busy author or as a referee. Increasingly: Black-Box style analyses done by non-experts. Limited number of referees & experts available. It is easy to hide problems with a ball-and-stick style illustration. Sadly, fraudulent results and structures have now been identified in the literature, thus contaminating the assumed solid information in the CSD.

Structure Validation with PLATON Automated Structure Validation was pioneered and ‘pushed’ by Syd Hall as section editor of Acta Cryst. C by:
1. The creation of the CIF Standard for data archival and exchange (Hall et al. (1991). Acta Cryst. A47, [...]).
2. Having CIF adopted by Sheldrick for SHELXL93.
3. Making CIF the Acta Cryst. submission standard.
4. Setting up early CIF checking procedures for Acta.
5. Inviting me to include PLATON checking tools such as ADDSYM and VOID search.

WHAT ARE THE VALIDATION QUESTIONS ? Single Crystal Structure Validation addresses three simple but important questions: 1 – Is the reported information complete? 2 – What is the quality of the analysis? 3 – Is the Structure Correct?

How is Validation Currently Implemented ? Validation checks on CIF data can be executed at any time, both in-house (PLATON/CHECK) or through the WEB-based IUCr CHECKCIF server. A file, check.def, defines the issues that are tested (currently more than 400) with levels of severity and associated explanation and advice. Most non-trivial tests on the IUCr CheckCIF server are executed with routines in the program PLATON. (Identified as PLATxyz)

VALIDATION ALERT LEVELS CheckCIF/PLATON creates a report in the form of a list of ALERTS with the following ALERT levels: ALERT A – Serious Problem ALERT B – Potentially Serious Problem ALERT C – Check & Explain ALERT G – Verify or Take Notice

VALIDATION ALERT TYPES 1 - CIF Construction/Syntax errors, Missing or Inconsistent Data. 2 - Indicators that the Structure Model may be Wrong or Deficient. 3 - Indicators that the quality of the results may be low. 4 – Info, Cosmetic Improvements, Queries and Suggestions.

PLATON/CHECK CIF + FCF Results

Which Key Validation Issues are Addressed Missed Space Group symmetry (“being Marshed”). Wrong chemistry (mis-assigned atom types). Too many, too few or misplaced H-atoms. Unusual displacement parameters. Hirshfeld Rigid Bond test violations. Missed solvent accessible voids in the structure. Missed Twinning. Absolute structure. Data quality and completeness.

Evaluation and Performance The validation scheme has been very successful for Acta Cryst. C & E in setting standards for quality and reliability. The missed symmetry problem has been solved for the IUCr journals (unfortunately not generally yet: there are still numerous ‘Marshable’ structures). Most major chemical journals now have some form of a validation scheme implemented. Recently included: FCF validation.

FCF-VALIDATION - Check of the CIF & FCF data Consistency (including R-values, cell dimensions) - Check of Completeness of the reflection data set. - Automatic Detection of ignored twinning - Detection of Applied Twinning Correction without having been Reported in the paper. - Validity check of the reported Flack parameter value against the Hooft parameter value. - Analysis of the details of the Difference Density Fourier Map for unreported features.

Sloppy, Novice or Fraudulent ? Errors are easily made and unfortunately not always discernible from fraud. Wrong element type assignments can be caused as part of an incorrect analysis of an unintended reaction product. Alternative element types can be (and have been) substituted deliberately to create ‘new publishable’ structures. Reported and calculated R-values differing in the first relevant digit !?

Some Relevant ALERTS Wrong atom type assignments generally cause: Serious Hirshfeld Rigid Bond Violation ALERTS. Larger than expected difference map minima and maxima. wR2 >> 2 * R1. High values for the SHELXL refined weight parameter.
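The wR2 versus R1 indicator is easy to reproduce from a reflection list. The sketch below uses the usual SHELXL-style definitions of R1 (on F) and wR2 (on F²); the function names, the unit weights and the plain factor-of-two cutoff are assumptions made for the illustration and are not the actual checkCIF test.

```python
import math


def r1(f_obs, f_calc):
    """R1 = sum||Fo| - |Fc|| / sum|Fo|."""
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return num / sum(abs(fo) for fo in f_obs)


def wr2(f_obs, f_calc, weights=None):
    """wR2 = sqrt( sum w(Fo^2 - Fc^2)^2 / sum w(Fo^2)^2 )."""
    if weights is None:
        weights = [1.0] * len(f_obs)  # unit weights, a simplification
    num = sum(w * (fo ** 2 - fc ** 2) ** 2
              for w, fo, fc in zip(weights, f_obs, f_calc))
    den = sum(w * (fo ** 2) ** 2 for w, fo in zip(weights, f_obs))
    return math.sqrt(num / den)


def wr2_much_larger_than_r1(f_obs, f_calc, factor=2.0):
    """Crude stand-in for the 'wR2 >> 2 * R1' indicator."""
    return wr2(f_obs, f_calc) > factor * r1(f_obs, f_calc)


# Two made-up reflections, amplitudes only.
f_obs = [10.0, 20.0]
f_calc = [9.0, 22.0]
flag = wr2_much_larger_than_r1(f_obs, f_calc)
```

For a well-behaved refinement wR2 is typically around twice R1, so a much larger ratio is a hint that the model, not just the data, is at fault.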

[Sn(IV)(NO3)4(C10H8N2)2] Acta Cryst. (2007), E63, m1566.

2.601 Ang. Missing H in bridge & Sn(IV) => Lanthanide(III)

The Ultimate Shame Recently a whole series of ‘isomorphous’ substitutions was detected for an already published structure. Similar series have now been detected for coordination complexes (transition metals and lanthanides). How could referees let those pass? Over 100 structures now retracted. Fraud detected by looking at all papers of the same authors of a ‘strange’ structure (and their institutions).

Bogus Variations (with Hirshfeld ALERTS) on the Published Structure 2-hydroxy-3,5-nitrobenzoic acid (ZAJGUM)

Comparison of the observed data for two ‘isomorphous’ compounds. SLOPPY or FRAUD? Conclusion: The Same Data! The Only Difference Is the SCALE! Tool: platon -d name1.fcf name2.fcf
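The idea behind such a comparison can be sketched in a few lines. This is not what PLATON does internally: the sketch assumes the two observed-intensity lists have already been read from the FCF files and matched reflection by reflection, and the function name `compare_scaled` is hypothetical.

```python
def compare_scaled(i_obs_1, i_obs_2):
    """Least-squares scale k for I2 ~ k * I1, plus the residual left after scaling."""
    k = sum(a * b for a, b in zip(i_obs_1, i_obs_2)) / sum(a * a for a in i_obs_1)
    residual = (sum(abs(b - k * a) for a, b in zip(i_obs_1, i_obs_2))
                / sum(abs(b) for b in i_obs_2))
    return k, residual


# Made-up intensities; the second set is the first multiplied by 2.5.
set_1 = [120.0, 35.5, 980.2, 4.1]
set_2 = [2.5 * v for v in set_1]
scale, residual = compare_scaled(set_1, set_2)
```

Two independently measured data sets never agree to within rounding, so a residual that is essentially zero after scaling means the 'second' data set is a rescaled copy of the first.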

Thanks ! My former co-workers over 38 years, and in particular my successor Dr. Martin Lutz. Dr. Louis Farrugia, for following my frequent updates with his MS-Windows implementation. The users of the software, for ideas and bug reports. Lachlan Cranswick, for promoting my software, and who is sadly no longer with us here.
