E-Science Technologies in the Simulation of Complex Materials L. Blanshard, R. Tyer, K. Kleese S. A. French, D. S. Coombes, C. R. A. Catlow B. Butchart,

1 e-Science Technologies in the Simulation of Complex Materials. L. Blanshard, R. Tyer, K. Kleese; S. A. French, D. S. Coombes, C. R. A. Catlow; B. Butchart, W. Emmerich (CS); H. Nowell, S. L. Price (Chem). eMaterials

2 Combinatorial Computational Catalysis: explore which sites are involved in catalysis. Catalysis is used in diverse industries including petroleum, chemical, polymers, agrochemicals, and environmental. Polymorphism: prediction of polymorphs. A drug substance may exist as two or more crystalline phases in which the molecules are packed differently.

4 e-Science Issues to Address: simulations take too long to run; data are distributed across many sites and systems; no catalogue system; output in legacy text files, different for each program; few tools to access, manage and transfer data; workflow management is manual; licensing within a distributed environment.

5 Acid Sites in Zeolites: Determine the extra-framework cation position within the zeolite framework. Explore which proton sites are involved in catalysis and then characterise the active sites. Produce a database of structural models and associated vibrational modes for different Si/Al ratios. Improve understanding of the role of the Si/Al ratio in zeolite chemistry.

6 Chabazite: 1 T site, 12 Si centres per unit cell, 8-membered ring channels (3.8 Å × 3.8 Å).

7

8

9

10

11 The Problem: Number of structures per Si/Al ratio: Si/Al 11 = 4; Si/Al 5 = 160; Si/Al 3 = 5760; Si/Al 2 = 184,320. When substitution of a second Al is considered there are now 4 * (10 * 4) possible structures, as symmetry has been broken. The number of calculations quickly becomes an issue when realistic Si/Al ratios are considered. A Si/Al ratio of 2 would require 184,320 calculations at ~100 seconds each = 5120.0 hours = 213 days of cpu time. Note this is for a very simple zeolite with 36 ions per unit cell; materials of interest have 296.
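
The counts and the cpu-time estimate on this slide can be reproduced with a short calculation. This is a sketch, not the authors' method: the slide only states the 4 * (10 * 4) factor for the second Al, and the rule used here for the third and fourth Al (multiply by (12 - n) * 4 for the n-th Al) is an assumption inferred from the quoted totals. The function names are hypothetical.

```python
def structure_counts(t_sites=12, positions=4, max_al=4):
    """Return {Si/Al ratio: number of candidate structures}.

    The first Al gives `positions` structures (all T sites are equivalent).
    ASSUMPTION: the n-th Al (n >= 2) multiplies the count by
    (t_sites - n) * positions, which matches the slide's 4 * (10 * 4).
    """
    counts = {}
    total = positions
    for n_al in range(1, max_al + 1):
        if n_al > 1:
            total *= (t_sites - n_al) * positions
        si_al = (t_sites - n_al) // n_al  # 11, 5, 3, 2 for 1..4 Al
        counts[si_al] = total
    return counts


def cpu_time(n_calcs, seconds_each=100):
    """Return (hours, days) of cpu time for n_calcs calculations."""
    hours = n_calcs * seconds_each / 3600
    return hours, hours / 24


counts = structure_counts()
hours, days = cpu_time(counts[2])
print(counts)
print(f"{hours:.1f} hours = {days:.0f} days")
```

Under that assumed rule the four totals match the slide, and 184,320 runs at 100 seconds each come to the quoted 5120 hours.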

12 MC/EM: A combined Monte Carlo (MC) and energy minimisation (EM) approach has been developed to model zeolitic materials with low and medium Si/Al ratios. First, Al is inserted into a siliceous unit cell, which is then charge-compensated with cations.
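
The Monte Carlo half of this approach can be illustrated as random sampling of Al and cation placements. This is a minimal sketch under stated assumptions, not the MC_subs program used in the talk: sites are represented only by integer indices, `n_cation_sites` is a hypothetical number of candidate cation positions, and the energy minimisation of each configuration is left out.

```python
import random


def random_substitution(n_t_sites, n_al, n_cation_sites, rng):
    """Pick n_al distinct T sites for Al and one cation site per Al."""
    al_sites = tuple(sorted(rng.sample(range(n_t_sites), n_al)))
    cation_sites = tuple(sorted(rng.sample(range(n_cation_sites), n_al)))
    return al_sites, cation_sites


def generate_configurations(n_configs, n_t_sites=96, n_al=8,
                            n_cation_sites=200, seed=0):
    """Return n_configs distinct random configurations.

    n_cation_sites is a HYPOTHETICAL value chosen for illustration.
    """
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n_configs:
        seen.add(random_substitution(n_t_sites, n_al, n_cation_sites, rng))
    return sorted(seen)


configs = generate_configurations(5)
for al_sites, cation_sites in configs:
    print("Al at T sites", al_sites, "cations at", cation_sites)
```

Each generated configuration would then be written out as an input file and energy minimised separately.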

13 RI Condor Pool: We have set up and tested a Condor pool at the RI, which has 50+ heterogeneous nodes, ranging from desktop PCs and machines controlling instruments to the main servers of the DFRL.

Name              OpSys    Arch   State      Activity  LoadAv  Mem   ActvtyTime
vm1-8@faraday.r   IRIX65   SGI    Owner      Idle      1.192    128  3+03:01:02
vm1-14@tyndall.r  IRIX65   SGI    Unclaimed  Idle      0.000    507  0+00:15:09
ising2.ri.ac.     LINUX    INTEL  Unclaimed  Idle      0.200    501  [?????]
vm1-16@strutt1-4  OSF1     ALPHA  Owner      Idle      1.113   1024  0+0:26:46
xp2.ri.ac.uk      OSF1     ALPHA  Owner      Idle      1.113    256  49+12:26:46
xp3.ri.ac.uk      OSF1     ALPHA  Unclaimed  Idle      0.000    256  0+00:55:00
d8.ri.ac.uk       WINNT40  INTEL  Unclaimed  Idle      0.000    255  0+02:09:45
ATLANTIC          WINNT51  INTEL  Unclaimed  Idle      0.008    256  0+01:02:30
BABBLE.ri.ac.     WINNT51  INTEL  Unclaimed  Idle      0.252    512  0+00:22:57
D500.ri.ac.uk     WINNT51  INTEL  Owner      Idle      0.533    254  0+05:26:06
PCDAVIDC.ri.a     WINNT51  INTEL  Unclaimed  Idle      0.000    504  0+03:51:26
e-sam.ri.ac.u     WINNT51  INTEL  Unclaimed  Idle      0.001    512  0+03:16:39
pcalexey.ri.a     WINNT51  INTEL  Unclaimed  Idle      0.002    256  0+00:35:53

               Machines  Owner  Claimed  Unclaimed  Matched  Preempting
ALPHA/OSF1           18      1        0          1        0           0
INTEL/LINUX           1      0        0          1        0           0
INTEL/WINNT40         1      0        0          1        0           0
INTEL/WINNT51        14      1        0          5        0           0
SGI/IRIX65           22     15        0          7        0           0
Total                56     17        0         15        0           0

14 RI Condor Pool (the condor_status listing from slide 13 is repeated). But where is PC-CRAC???

15 Level of Optimisation 50eV

16 Level of Optimisation 240eV

17 MOR: Mordenite. 1-dimensional channel system; simulation cell contains two unit cells (296 atoms), with 96 Si centres (referred to as T sites). Substituting Al at 8 T sites, with 8 Na cations for charge compensation.

18 Workflow (diagram labels): MC_subs, Gulp Files, Gulp, WinXP, Perl script, MS Excel, SRB

19 Workflow II (diagram labels): MC_subs, Gulp Files, Gulp, WinXP, Perl script, MS Excel, SRB, Si-zeo structure, Interatomic pots, Input file, Batch of labelled Gulp files, C++, f90, Scommands, Subset of data in formatted file, Script auto batch sub, Script for cleaning dirs

20 Condor Stats: Extensive use of Condor pools (UCL: ~950 nodes in teaching pools). ~150 cpu-years of previously unused compute resource have been utilised in this study. Close collaboration with the NERC e-minerals project has allowed access to this resource. 150,000 calculations have been performed, each with a varying number of particles per simulation box, giving a total of ~75,000,000 particles included in our simulations of Mordenite to date.

21 Condor Specifics: Jobs submitted in 1,000-job batches, because of stability issues. Shadows: not my game, but a pain when the Condor Master dies due to too many jobs hitting the queue (guilty feeling, as the Master was not solely running the pool but was also being used for science by the pool administrator). Maximum number of jobs in queue.
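
Splitting a large set of runs into 1,000-job batches could look roughly like the sketch below. It is illustrative only: the file naming scheme and the `gulp` executable name are assumptions, and the submit description uses only standard Condor submit commands (`universe`, `executable`, `input`, `output`, `error`, `log`, `queue`), with the input file fed to the program on standard input.

```python
def batches(items, batch_size=1000):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def submit_description(batch_name, n_jobs, executable="gulp"):
    """Build the text of a Condor submit description for one batch.

    File names of the form <batch_name>_<n>.gin are HYPOTHETICAL;
    $(Process) runs from 0 to n_jobs - 1 within the batch.
    """
    return "\n".join([
        "universe   = vanilla",
        f"executable = {executable}",
        f"input      = {batch_name}_$(Process).gin",
        f"output     = {batch_name}_$(Process).gout",
        f"error      = {batch_name}_$(Process).err",
        f"log        = {batch_name}.log",
        f"queue {n_jobs}",
    ])


jobs = list(range(2500))
for index, batch in enumerate(batches(jobs)):
    text = submit_description(f"batch_{index:03d}", len(batch))
    print(text.splitlines()[-1])
```

Each generated description would be saved to a file and handed to `condor_submit` one batch at a time.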

22 Condor Specifics: Handling of data and analysis becomes the rate-determining step (RDS). However, keeping the pool full of jobs is also a tedious step when jobs are short, which is the ideal for the UCL pool (re: turning off the pool once a day): drip feeding. Thought in application design is key: many applications are TOTALLY unsuitable for the UCL Condor pool.
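
Drip feeding, as described here, means topping the queue up in small amounts so that it never exceeds a maximum depth. The sketch below simulates that idea with a stand-in pool; `FakePool`, `drip_feed_step`, and all the numbers are hypothetical. In real use the queue depth would come from polling the scheduler (for example with `condor_q`) and the loop would sleep between polls.

```python
from collections import deque


class FakePool:
    """Stand-in for a Condor pool that finishes a few jobs per tick."""

    def __init__(self, jobs_finished_per_tick):
        self.queue = deque()
        self.done = []
        self.rate = jobs_finished_per_tick

    def submit(self, jobs):
        self.queue.extend(jobs)

    def queue_length(self):
        return len(self.queue)

    def tick(self):
        for _ in range(min(self.rate, len(self.queue))):
            self.done.append(self.queue.popleft())


def drip_feed_step(pending, pool, max_in_queue):
    """Top the queue up to max_in_queue; return the number submitted."""
    free_slots = max(0, max_in_queue - pool.queue_length())
    batch = [pending.popleft() for _ in range(min(free_slots, len(pending)))]
    if batch:
        pool.submit(batch)
    return len(batch)


def run(n_jobs=5000, max_in_queue=1000, jobs_finished_per_tick=300):
    pending = deque(range(n_jobs))
    pool = FakePool(jobs_finished_per_tick)
    deepest = 0
    while pending or pool.queue_length():
        drip_feed_step(pending, pool, max_in_queue)
        deepest = max(deepest, pool.queue_length())
        pool.tick()  # real use: sleep, then poll the scheduler again
    return len(pool.done), deepest


finished, deepest = run()
print(finished, deepest)
```

The cap protects the scheduler from a flood of submissions, while the top-up step keeps short jobs from draining the queue between polls.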

23 MOR: Mordenite. 1-dimensional channel system; simulation cell contains two unit cells (296 atoms), with 96 Si centres (referred to as T sites). Substituting Al at 8 T sites, with 8 Na cations for charge compensation.

24 100 Configurations (plot labels: 0, 100, 20eV). It can be seen that there are two distinct regions, -12079eV to -12076eV and -12075eV to -12073eV, but there is no obvious correlation between total energy and cell volume.

25 10000 Configurations (plot labels: 0, 10000, 25eV). However, when 10,000 structures are considered it is clear that the most stable structures correspond to cation placements that do not cause the cell to expand. This requires that the cations sit in the large channel.

26 10000 Configurations

27 Comparison of Regions -12079.5eV-12075.04eV

28 Analysis: MySQL, which allows input from a text file, a C/C++ program, or the mysql command line and GUI. Properties: total energy, cell volume, lattice parameters, T-O distances, T-O-T bond angles, cation-framework oxygen distances, coordination of user-specified species, etc.
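
Storing per-configuration properties in a relational table and querying them can be sketched as follows. This is not the project's schema: Python's built-in `sqlite3` is used here as a stand-in for MySQL so the example is self-contained, the table and column names are hypothetical, and the three rows are made-up placeholder values, not results from the study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE configuration (
        config_id       INTEGER PRIMARY KEY,
        total_energy_ev REAL NOT NULL,
        cell_volume_a3  REAL NOT NULL
    )
    """
)

# PLACEHOLDER rows for illustration only (id, total energy, cell volume).
rows = [
    (1, -12078.2, 5405.0),
    (2, -12074.1, 5490.0),
    (3, -12076.9, 5412.5),
]
conn.executemany("INSERT INTO configuration VALUES (?, ?, ?)", rows)

# Select the configurations below an energy cut-off, most stable first.
stable = conn.execute(
    "SELECT config_id, total_energy_ev FROM configuration "
    "WHERE total_energy_ev < ? ORDER BY total_energy_ev",
    (-12076.0,),
).fetchall()
print(stable)
```

Further properties from the slide, such as lattice parameters or T-O distances, would be added as extra columns or as a separate table keyed on the configuration id.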

29 Workflow III (diagram labels): MC_subs, Gulp Files, Gulp, WinXP, SRB, db, mysql

30 Building an Ensemble

Property                        Good       Bad
Lattice Energy (eV)             < -12070   > -12068
Al-Na average distance (Å)      > 3.6      < 3.4
cell volume (Å³)                < 5420     > 5475
average cation – Oxygen (Å)     > 2.75     < 2.65
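
The thresholds in this table can be applied programmatically to sort configurations into the two sets. The sketch below makes one assumption the slide does not state: a configuration counts as "good" only if all four good criteria hold, "bad" only if all four bad criteria hold, and is otherwise left unclassified. The property key names are hypothetical.

```python
import operator

OPS = {"<": operator.lt, ">": operator.gt}

# Thresholds copied from the table; key names are HYPOTHETICAL.
CRITERIA = {
    "lattice_energy_ev": {"good": ("<", -12070.0), "bad": (">", -12068.0)},
    "al_na_distance_a":  {"good": (">", 3.6),      "bad": ("<", 3.4)},
    "cell_volume_a3":    {"good": ("<", 5420.0),   "bad": (">", 5475.0)},
    "cation_oxygen_a":   {"good": (">", 2.75),     "bad": ("<", 2.65)},
}


def classify(props):
    """Return 'good', 'bad', or 'unclassified' for one configuration.

    ASSUMPTION: all four criteria of a set must hold at once.
    """
    for label in ("good", "bad"):
        if all(
            OPS[CRITERIA[name][label][0]](props[name], CRITERIA[name][label][1])
            for name in CRITERIA
        ):
            return label
    return "unclassified"


# Placeholder property values for illustration only.
example = {
    "lattice_energy_ev": -12075.0,
    "al_na_distance_a": 3.7,
    "cell_volume_a3": 5400.0,
    "cation_oxygen_a": 2.8,
}
print(classify(example))
```

Other combination rules, such as a majority of the criteria, would fit the table equally well.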

31 Validation: Comparison with experiment is very promising, showing a large difference in the quality of fit between the ‘good’ set and the ‘bad’ set.

32 Monitor

33 Drip Feeding and Interactive Steering using Relational Databases (D. Lewis, R. Coates, S. French; UCL Chem / RI). Diagram labels: Model/Configuration Generator; Jobs; Distributed Computing; Analysis (geometry, energy, fit); db; Portal; Steering; Improve generation / model strategy. User Input: Structural model, Si/Al, cation types, [H2O] etc. User Input: Diffraction data, chemical analysis, building units, Si/Al, cation types, [H2O] etc.

34 Workflow IV (diagram labels: SSH, CML). The workflow service needs to be exposed to the outside world as a web service. Since we require new WSDL interfaces for each application, it is a perfect opportunity to employ a standard representation for chemical structures. The XML standard in chemistry is CML (Chemical Markup Language).
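
As an illustration of what a CML representation looks like, the sketch below writes a small set of atoms as a CML `molecule` element using Python's standard `xml.etree.ElementTree`. It is not the project's interface definition: the helper name `to_cml` and the coordinates are placeholders, and only basic CML elements and attributes (`molecule`, `atomArray`, `atom`, `elementType`, `x3`, `y3`, `z3`) are used.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"


def to_cml(molecule_id, atoms):
    """Serialise (element, x, y, z) tuples as a CML molecule string."""
    ET.register_namespace("", CML_NS)  # emit CML as the default namespace
    molecule = ET.Element(f"{{{CML_NS}}}molecule", {"id": molecule_id})
    atom_array = ET.SubElement(molecule, f"{{{CML_NS}}}atomArray")
    for index, (element, x, y, z) in enumerate(atoms, start=1):
        ET.SubElement(
            atom_array,
            f"{{{CML_NS}}}atom",
            {
                "id": f"a{index}",
                "elementType": element,
                "x3": f"{x:.4f}",
                "y3": f"{y:.4f}",
                "z3": f"{z:.4f}",
            },
        )
    return ET.tostring(molecule, encoding="unicode")


# Placeholder coordinates for illustration only.
atoms = [("Si", 0.0, 0.0, 0.0), ("O", 1.6, 0.0, 0.0), ("Al", 3.2, 0.0, 0.0)]
print(to_cml("fragment1", atoms))
```

A periodic zeolite cell would also need lattice parameters and fractional coordinates, which are omitted here to keep the example short.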

35 Key Achievement: We are now doing science that was not possible before the advancements made within e-Science.

36

37 FER: Ferrierite. 2-dimensional channel system; simulation cell contains 115 atoms. Substituting Al at 4 T sites, with 4 Na cations for charge compensation.

38 100 Configurations (plot label: 14eV). Only 75 out of 100 configurations optimise. Again there are steps in total energy, and again no correlation with volume for the low number of configurations.

39 10000 Configurations (plot label: 15eV). Only 7500 out of 10000 optimise. However, this time when 10,000 structures are considered there are no clear steps in the volume. The volume still increases with decreasing stability, but this is due to cell expansion caused by Al to Al interactions.

40 Comparison of Regions

41

42 MFI: ZSM-5. 3-dimensional channel system; simulation cell contains 292 atoms. Substituting at 4 sites with 4 Na cations.

43 10000 Configurations (plot label: 10eV). There is a step in total energy, but this time only one, and from then on the trend is smooth.

44 What Next: Once the lowest-energy positions of Al are confirmed, the cation is exchanged for a proton and the structure is energy minimised again. This method will allow us to construct realistic models of low and medium Si/Al zeolites. Such structures can be used for further simulations and aid the interpretation of experimental data.

45 Solid Solutions: BaTiO3

46 BaSrTiO3

47 Solid Solutions: SrTiO3

48 Ongoing and Future Work: upload files as part of workflow to SRB; generate metadata; upload extracted data from files; more extensive use of CML.

50

51 Achievements To Date
1. First use of CML schema for defining Web Service port types.
2. Calculation of 50,000 configurations of zeolite Mordenite (24,000,000 particles) to gain insight into structure when a realistic ratio of Al substitution is included in the model.
3. Successfully exposed Fortran codes as OGSI Web Services; prototype application deployed on 80 nodes. The prototype computational polymorph application is being ported to a larger production machine.
4. First use of BPEL standard for orchestrating web services in a Grid application.
5. Open Source BPEL implementation in development, enabling late binding and dynamic deployment of large computational processes.
6. Integration of OGSI and BPEL with Sun Grid Engine.
7. Development of Graphic User Interface for polymorph application; connects to relational database via EJB interface.
8. Infrastructure for metadata and data management.
9. SRB and dataportal are already being used to hold datasets and to transfer data between different scientists and computer applications.
10. Implementation of Condor pool at the RI.

52 Polymorph Prediction: Different crystal structures of a molecule are called polymorphs. Polymorphs may have considerably different properties (e.g. bioavailability, solubility, morphology). Polymorph prediction is of great importance to the pharmaceutical industry, where the discovery of a new polymorph during production or storage of a drug may be disastrous. Drug molecules are often flexible, and this makes the polymorph prediction process more challenging…

53 Polymorph Prediction Workflow: For flexible molecules: conformational optimisation; n feasible rigid molecular probes representing energetically plausible conformers. MOLPAK: generation of ~6000 densely packed crystal structures using rigid molecular probe. DMAREL: lattice energy optimisation. Repeated n times (n = number of conformers). Data: unit cell volume, density, lattice energy. Restricted number of structures selected. Morphology. Crystal structures and properties stored in database.

54 Storage Resource Broker: Store data files from simulations in the Storage Resource Broker.


