E-Science Technologies in the Simulation of Complex Materials

Presentation transcript:

e-Science Technologies in the Simulation of Complex Materials
L. Blanshard, R. Tyer, K. Kleese
S. A. French, D. S. Coombes, C. R. A. Catlow
B. Butchart, W. Emmerich – CS
H. Nowell, S. L. Price – Chem
eMaterials

Combinatorial Computational Catalysis
- Catalysis – explore which sites are involved in catalysis; catalysts are used in diverse industries including petroleum, chemicals, polymers, agrochemicals, and environmental applications.
- Polymorphism – prediction of polymorphs; a drug substance may exist as two or more crystalline phases in which the molecules are packed differently.

e-Science Issues to Address
- simulations take too long to run
- data are distributed across many sites and systems
- no catalogue system
- output in legacy text files, different for each program
- few tools to access, manage and transfer data
- workflow management is manual
- licensing within a distributed environment

Acid Sites in Zeolites
- Determine the extra-framework cation position within the zeolite framework.
- Explore which proton sites are involved in catalysis, then characterise the active sites.
- Produce a database of structural models and associated vibrational modes for a range of Si/Al ratios.
- Improve understanding of the role of the Si/Al ratio in zeolite chemistry.

Chabazite: 1 T site, 12 Si centres per unit cell, 8-membered ring channels (3.8 Å × 3.8 Å).

The Problem
- Si/Al = 11: 4 configurations
- Si/Al = 5: 160 configurations
- Si/Al = 3: 5,760 configurations
- Si/Al = 2: 184,320 configurations
When substitution of a second Al is considered there are 4 × (10 × 4) possible structures, as symmetry has been broken. The number of calculations quickly becomes an issue when realistic Si/Al ratios are considered: a Si/Al ratio of 2 would require 184,320 calculations at ~100 seconds each = ~5,120 hours = ~213 days of CPU time. Note this is for a very simple zeolite with 36 ions per unit cell; materials of interest have 296.
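As a back-of-the-envelope check, the CPU-time figure can be reproduced with a few lines of Python. This is an illustrative sketch only: the raw C(n, k) site counts it prints ignore symmetry and cation placement, so they differ from the slide's configuration counts.

```python
from math import comb

# Raw count of ways to place k Al atoms on n T sites, ignoring symmetry
# and cation placement (so these differ from the slide's figures).
n_sites = 12                          # Si centres (T sites) in chabazite
for n_al in (1, 2, 3, 4):
    print(f"{n_al} Al on {n_sites} sites: {comb(n_sites, n_al)} raw placements")

# CPU-time estimate for the Si/Al = 2 case quoted on the slide
n_calcs, secs_each = 184_320, 100
hours = n_calcs * secs_each / 3600
print(f"{n_calcs:,} calcs x {secs_each} s = {hours:,.0f} h = {hours/24:.0f} days")
```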

MC/EM
A combined Monte Carlo (MC) and energy minimisation (EM) approach has been developed to model zeolitic materials with low and medium Si/Al ratios. First, Al is inserted into a siliceous unit cell; the charge is then compensated with cations.
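A minimal sketch of the MC insertion step, with made-up function and variable names; a real implementation would also enforce constraints such as Löwenstein's rule (no Al–O–Al linkages) and sample the cation positions within the channels.

```python
import random

def mc_substitute(n_t_sites, n_al, rng):
    """Pick distinct T sites for Al and charge-compensate with one Na+ per Al."""
    al_sites = sorted(rng.sample(range(n_t_sites), n_al))  # random distinct sites
    return {"al_sites": al_sites, "n_na": n_al}            # framework charge -1 per Al

rng = random.Random(42)
# mordenite-like cell: 96 T sites, 8 Al -> 8 Na+
config = mc_substitute(n_t_sites=96, n_al=8, rng=rng)
print(config)   # e.g. {'al_sites': [3, 14, ...], 'n_na': 8}
```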

RI Condor Pool
We have set up and tested a Condor pool at the RI, which has 50+ heterogeneous nodes, from desktop PCs and machines controlling instruments to the main servers of the DFRL.
[condor_status listing: SGI/IRIX65, ALPHA/OSF1, INTEL/LINUX, INTEL/WINNT40 and INTEL/WINNT51 machines, with per-machine State/Activity and pool totals; numeric columns lost in extraction]

RI Condor Pool
[same condor_status listing as the previous slide]
But where is PC-CRAC???

Level of Optimisation – 50 eV
[figure]

Level of Optimisation – 240 eV
[figure]

MOR
Mordenite – 1-dimensional channel system
- simulation cell contains two unit cells: 296 atoms, with 96 Si centres (referred to as T sites)
- substituting at 8 T sites, with 8 Na cations

Workflow
[diagram: MC_subs Perl script → GULP input files → GULP on WinXP → MS Excel → SRB]

Workflow II
[diagram: MC_subs (C++/f90) takes the siliceous zeolite structure, interatomic potentials and an input file, and generates a batch of labelled GULP files; a script auto-submits the batch to GULP on WinXP and another script cleans the directories; a subset of the data in a formatted file goes to MS Excel; files are archived to the SRB via Scommands]
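The slides do not show the MC_subs source (which was Perl/C++/f90), so the following is a hypothetical Python stand-in for the batch-generation step: it writes one labelled GULP input per configuration plus a classic Condor submit file that queues the lot. The file names, directory layout and deck contents are invented; `opti conp` is a genuine GULP keyword pair.

```python
from pathlib import Path

def write_batch(configs, workdir="mor_batch"):
    out = Path(workdir)
    out.mkdir(exist_ok=True)
    for i, al_sites in enumerate(configs):
        # one labelled GULP deck per configuration; structure and
        # interatomic potentials are omitted here
        (out / f"mor_{i}.gin").write_text(
            f"# configuration {i}: Al at T sites {al_sites}\n"
            "opti conp\n"
            "# ... structure and potentials ...\n")
    # one Condor cluster, one job per deck
    (out / "gulp.sub").write_text(
        "universe   = vanilla\n"
        "executable = gulp\n"
        "input      = mor_$(Process).gin\n"
        "output     = mor_$(Process).gout\n"
        "error      = mor_$(Process).err\n"
        f"queue {len(configs)}\n")

write_batch([[3, 17, 22], [5, 9, 40]])   # two toy configurations
```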

Condor Stats
- Extensive use of Condor pools (UCL: ~950 nodes in teaching pools).
- ~150 CPU-years of previously unused compute resource have been utilised in this study.
- Close collaboration with the NERC eMinerals project has allowed access to this resource.
- 150,000 calculations have been performed, each with a varying number of particles per simulation box, so a total of ~75,000,000 particles have been included in our simulations of mordenite to date.

Condor Specifics
- Jobs submitted in 1,000-job batches – an issue of stability.
- Shadows – not my game, but a pain when the Condor Master dies due to too many jobs hitting the queue (guilty feeling, as the Master was not solely running the pool but also being used for science by the pool administrator).
- Maximum number of jobs in the queue.

Condor Specifics
- Handling of data and analysis becomes the rate-determining step (RDS).
- However, keeping the pool full of jobs is also a tedious step when jobs are short, which is the ideal for the UCL pool (the pool is turned off once a day) – hence drip feeding.
- Thought in application design is key – many applications run on the UCL pool are TOTALLY unsuitable for it.
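A drip-feeding loop might look like the sketch below. The submit-file names, the queue threshold and the naive condor_q output parsing are all assumptions; condor_q and condor_submit are the standard Condor command-line tools.

```python
import subprocess, time

MAX_QUEUED = 500                     # made-up threshold, well under 1,000

def jobs_in_queue():
    out = subprocess.run(["condor_q"], capture_output=True, text=True)
    # naive: count lines that start with a job id; real code would
    # parse the totals line instead
    return sum(1 for ln in out.stdout.splitlines() if ln.strip()[:1].isdigit())

batches = [f"batch_{i:03d}.sub" for i in range(150)]   # hypothetical submit files
for sub in batches:
    while jobs_in_queue() >= MAX_QUEUED:
        time.sleep(300)              # wait rather than flood the Master
    subprocess.run(["condor_submit", sub])
```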

MOR (recap)
Mordenite – 1-dimensional channel system; simulation cell contains two unit cells: 296 atoms, with 96 Si centres (T sites); substituting at 8 T sites, with 8 Na cations.

Configurations (initial small sample)
[plot: total energy vs. cell volume over a ~20 eV range; configuration count and region boundaries lost in extraction]
It can be seen that there are two distinct energy regions, but there is no obvious correlation between total energy and cell volume.

10,000 Configurations
[plot: total energy vs. cell volume over a ~25 eV range]
However, when 10,000 structures are considered it is clear that the most stable structures correspond to cation placements that do not cause the cell to expand. This requires that the cations sit in the large channel.

10,000 Configurations
[figure]

Comparison of Regions
[figure comparing the two energy regions; region boundaries (eV) lost in extraction]

Analysis
MySQL allows input from a text file, a C/C++ program, or the mysql command line and GUI.
Properties stored: total energy, cell volume, lattice parameters, T–O distances, T–O–T bond angles, cation–framework oxygen distances, coordination of user-specified species, etc.
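The schema is not shown on the slides, so the sketch below invents a minimal one; it uses Python's built-in sqlite3 to stay self-contained, whereas the project itself used MySQL.

```python
import sqlite3

# Illustrative analysis database; column names are not the project's schema.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE configs (
        id            INTEGER PRIMARY KEY,
        total_energy  REAL,   -- eV
        cell_volume   REAL,   -- Angstrom^3
        al_na_dist    REAL,   -- average Al-Na distance, Angstrom
        cat_o_dist    REAL    -- average cation-oxygen distance, Angstrom
    )""")

# rows would normally be parsed from GULP output files; these are made up
db.executemany("INSERT INTO configs VALUES (?,?,?,?,?)",
               [(0, -1203.4, 5391.0, 3.71, 2.81),
                (1, -1188.9, 5502.3, 3.31, 2.60)])

for row in db.execute(
        "SELECT id, total_energy FROM configs ORDER BY total_energy LIMIT 5"):
    print(row)
```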

Workflow III
[diagram: MC_subs → GULP input files → GULP on WinXP → SRB → MySQL database]

Building an Ensemble

Property                             Good      Bad
Lattice energy (eV)                  <         >       (thresholds lost in extraction)
Al–Na average distance (Å)           > 3.6     < 3.4
Cell volume (Å³)                     < 5420    > 5475
Average cation–oxygen distance (Å)   > 2.75    < 2.65
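Applied in code, the table reads as a simple predicate. The lattice-energy thresholds were lost from the slide, so the cutoff is left as a parameter here, and the sample values are invented.

```python
# Sketch of the "good" ensemble filter from the table above.
def is_good(cfg, e_max):
    return (cfg["total_energy"] < e_max        # lattice energy (eV)
            and cfg["al_na_dist"] > 3.6        # average Al-Na distance (Angstrom)
            and cfg["cell_volume"] < 5420      # cell volume (Angstrom^3)
            and cfg["cat_o_dist"] > 2.75)      # average cation-O distance (Angstrom)

cfg = {"total_energy": -1203.4, "cell_volume": 5391.0,
       "al_na_dist": 3.71, "cat_o_dist": 2.81}      # made-up values
print(is_good(cfg, e_max=-1200.0))                   # True
```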

Validation
Comparison with experiment is very promising, showing a large difference in the quality of the fit between the 'good' and 'bad' sets.

Monitor

Drip Feeding and Interactive Steering using Relational Databases
[diagram: Model/Configuration Generator → Jobs → Distributed Computing (Portal) → Analysis (geometry, energy, fit) → database → Steering, which improves the generation/model strategy. User input to the generator: structural model, Si/Al, cation types, [H2O] etc. User input to the analysis: diffraction data, chemical analysis, building units, Si/Al, cation types, [H2O] etc.]
D. Lewis, R. Coates, S. French – UCL Chem / RI

Workflow IV
[diagram: workflow service exposed over SSH; structures exchanged as CML]
The workflow service needs to be exposed to the outside world as a web service. Since we require new WSDL interfaces for each application, it is a perfect opportunity to employ a standard representation for chemical structures. The XML standard in chemistry is CML (Chemical Markup Language).
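For illustration, a minimal CML molecule can be produced with the Python standard library; the element and attribute names (molecule, atomArray, atom, elementType, x3/y3/z3) follow CML conventions, while the fragment and coordinates are invented.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
ET.register_namespace("cml", CML_NS)

# a two-atom toy fragment, just to show the payload shape
mol = ET.Element(f"{{{CML_NS}}}molecule", id="chabazite-fragment")
atoms = ET.SubElement(mol, f"{{{CML_NS}}}atomArray")
for aid, elem, x, y, z in [("a1", "Si", 0.0, 0.0, 0.0),
                           ("a2", "O", 1.6, 0.0, 0.0)]:
    ET.SubElement(atoms, f"{{{CML_NS}}}atom", id=aid, elementType=elem,
                  x3=str(x), y3=str(y), z3=str(z))

print(ET.tostring(mol, encoding="unicode"))
```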

We are now doing science that was not possible before the advancements made within e-Science. Key Achievement

FER
Ferrierite – 2-dimensional channel system; simulation cell contains 115 atoms; substituting at 4 T sites with 4 Na cations.

100 Configurations
[plot: total energy vs. cell volume over a ~14 eV range]
Only 75 out of 100 configurations optimise. Again there are steps in total energy, and again no correlation with volume at this low number of configurations.

10,000 Configurations
[plot: total energy vs. cell volume over a ~15 eV range]
However, this time when 10,000 structures are considered there are no clear steps in the volume. The volume still increases with decreasing stability, but this is due to cell expansion caused by Al–Al interactions. Only 7,500 out of 10,000 optimise.

Comparison of Regions

MFI
ZSM-5 – 3-dimensional channel system; simulation cell contains 292 atoms; substituting at 4 T sites with 4 Na cations.

Configurations
[plot: total energy over a ~10 eV range; configuration count lost in extraction]
There is a step in total energy, but this time only one, and from then on the trend is smooth.

What Next
Once the lowest-energy positions of Al are confirmed, the cation is exchanged for a proton and the structure is again energy minimised. This method will allow us to construct realistic models of low- and medium-Si/Al zeolites. Such structures can be used for further simulations and to aid the interpretation of experimental data.

BaTiO3 Solid Solutions

BaSrTiO3

Solid Solutions: SrTiO3

Ongoing and Future Work
- upload files to the SRB as part of the workflow
- generate metadata
- upload data extracted from the files
- more extensive use of CML
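Scripted uploads via the SRB Scommands could look like the sketch below; Sinit, Sput and Sexit are the standard Scommand tools, while the collection path, file names and lack of error handling are assumptions for illustration.

```python
import subprocess

def archive_to_srb(files, collection="/home/eminerals/mor_runs"):   # hypothetical path
    subprocess.run(["Sinit"], check=True)            # start SRB session
    for f in files:
        subprocess.run(["Sput", f, collection], check=True)   # upload each file
    subprocess.run(["Sexit"], check=True)            # close session

archive_to_srb(["mor_0.gout", "mor_1.gout"])         # hypothetical output files
```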

We are now doing science that was not possible before the advancements made within e-Science. Key Achievement

Achievements To Date
1. First use of the CML schema for defining Web Service port types.
2. Calculation of 50,000 configurations of the zeolite mordenite (24,000,000 particles) to gain insight into its structure when a realistic ratio of Al substitution is included in the model.
3. Successfully exposed Fortran codes as OGSI Web Services – prototype application deployed on 80 nodes. The prototype computational polymorph application is being ported to a larger production machine.
4. First use of the BPEL standard for orchestrating web services in a Grid application.
5. Open-source BPEL implementation in development, enabling late binding and dynamic deployment of large computational processes.
6. Integration of OGSI and BPEL with Sun Grid Engine.
7. Development of a graphical user interface for the polymorph application – connects to a relational database via an EJB interface.
8. Infrastructure for metadata and data management.
9. The SRB and DataPortal are already being used to hold datasets and to transfer data between different scientists and computer applications.
10. Implementation of the Condor pool at the RI.

Polymorph Prediction
- Different crystal structures of a molecule are called polymorphs.
- Polymorphs may have considerably different properties (e.g. bioavailability, solubility, morphology).
- Polymorph prediction is of great importance to the pharmaceutical industry, where the discovery of a new polymorph during production or storage of a drug may be disastrous.
- Drug molecules are often flexible, and this makes the polymorph prediction process more challenging…

Polymorph Prediction Workflow
1. MOLPAK: generation of ~6000 densely packed crystal structures using a rigid molecular probe. For flexible molecules, conformational optimisation gives n feasible rigid molecular probes representing energetically plausible conformers, and the workflow runs n times (n = number of conformers).
2. DMAREL: lattice energy optimisation. Data: unit cell volume, density, lattice energy.
3. A restricted number of structures is selected for morphology calculations; crystal structures and properties are stored in a database.
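As a sketch of the fan-out, the n-conformer loop could be driven as below; `molpak` and `dmarel` stand in for the real executables, whose actual command-line interfaces are not given in the slides, and all file names are invented.

```python
import subprocess

def predict_polymorphs(conformer_files):
    results = []
    for i, conf in enumerate(conformer_files):       # workflow runs n times
        # hypothetical CLIs for the two codes
        subprocess.run(["molpak", conf, f"packed_{i}.dat"])
        subprocess.run(["dmarel", f"packed_{i}.dat", f"opt_{i}.dat"])
        results.append(f"opt_{i}.dat")               # volume, density, lattice energy
    return results   # a restricted subset would then go on to morphology + database

predict_polymorphs(["conformer_a.mol", "conformer_b.mol"])
```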

Storage Resource Broker
Store data files from simulations in the Storage Resource Broker (SRB).

We are now doing science that was not possible before the advancements made within e-Science. Key Achievement