QM/MM Modelling Lecture 2 Implementation Tutorial Practical QM/MM Calculations with the ChemShell Software.

Outline
- Implementation issues: what factors affect the design
- Three QM/MM software approaches
  - QM added as an extension to a forcefield
  - MM environment added to a small-molecule treatment
  - Modular scheme with a range of QM and MM methods
- ChemShell
  - Concepts and sample modules
  - Approaches to periodic systems
- High Performance Computing
  - Approach within QUASI
  - Parallel computing techniques within GAMESS-UK, DL_POLY etc.
- Tutorial session
  - Simple Tcl and ChemShell scripts, using the PyMOL GUI

Implementation Issues
- Most implementations are based on existing QM and MM packages.
- Implementers face a basic conflict between:
  - Modularity
    - keeping programs separate
    - minimal modifications, to allow easy upgrades
  - Performance and functionality
    - supporting more complex interactions
    - minimising overhead, recalculation, file access etc.

Software Approaches
QM/MM implementations can take one of three forms:
- QM added as an extension to a forcefield
  - CHARMM/GAMESS-UK
- MM environment added to a small-molecule treatment
  - GAMESS-UK/AMBER, Gaussian/AMBER (Manchester)
- Modular scheme with a range of QM and MM methods and specialised optimisation algorithms
  - e.g. ChemShell

QM/MM Software Approach (i)
Specialised for a classical modelling approach, by integrating QM code into an MM package.
- Examples
  - CHARMM + GAMESS(US), MNDO (Harvard & NIH)
  - AMBER + Gaussian (UCSF, Manchester)
  - QM/Pot, GULP + TURBOMOLE (Berlin)
- CHARMM + GAMESS-UK (Daresbury & Bernie Brooks, Eric Billings (NIH))
  - Similar to existing ab-initio interfaces; the CHARMM side follows the coupling to GAMESS(US) (Milan Hodoscek)
  - Support for Gaussian delocalised point charges
- Advantages
  - Good MD capabilities, model building etc.
- Disadvantages
  - Restricted to certain classes of systems by forcefield choice

QM/MM Software Approach (ii)
Extend the QM code by adding the MM environment as a perturbation.
- QM code is "in control"
- MM environment will typically be relaxed for each QM structure (hidden coordinates)
- e.g. IMOMM, Gaussian/AMBER (Manchester)
- Advantages
  - Familiarity for quantum chemists
  - Tools for small-molecule manipulation (internal coordinates, transition state search etc.) are available
- Disadvantages
  - Relatively poor tools for management of conformational search of the QM/MM surface, QM/MM dynamics etc.

QM/MM Software Approach (iii)
Modular approach covering a range of MM and QM codes, e.g. ChemShell.
- Advantages
  - Flexibility of application areas
  - Ability to choose the best QM code for the specific task
  - Ease of update of component codes
- Disadvantages
  - Software complexity
    - range of forcefield types
    - wide variation in QM and MM program design
  - Close integration is needed for performance (e.g. HPC), but weak coupling simplifies maintenance

Script design goals
- Keep all control input together in one place
  - Avoid proliferation of small scripts and control files
- Permit flexible user scripting
- Support a hierarchical command structure (modules can invoke each other)
  - Relationship between modules should be clear from the script
  - The same module should be able to appear more than once (with different control arguments)
  - Re-entrant invocation should be possible for some modules (e.g. optimisers)
- Support both loosely coupled and efficient execution
  - Provide an option to keep data objects accessed by different modules in memory

ChemShell
- Extended Tcl interpreter
  - Scripting capability
- Interfaces to a range of QM and MM codes, including
  - GAMESS-UK, DL_POLY, MNDO97, TURBOMOLE, CHARMM, GULP, Gaussian94
- Implementation of QM/MM coupling schemes
  - link atom placement, forces etc.
  - boundary charge corrections

ChemShell Architecture - Languages
An extended Tcl interpreter, written in:
- Tcl
  - Control scripts
  - Interfaces to 3rd party executables
  - GUI construction (Tk and itcl)
  - Extensions
- C
  - Tcl command implementations
  - Object management (fragment, matrix, field, graph)
  - Tcl and C APIs
  - I/O
  - OpenGL graphics
- Fortran77
  - QM and MM codes: GAMESS-UK, GULP, MNDO97, DL_POLY

ChemShell Architecture
- Core: ChemShell Tcl interpreter, with code to support chemistry data
  - Optimiser and dynamics drivers
  - QM/MM coupling schemes
  - Utilities
  - Graphics
- Integrated codes
  - GAMESS-UK (ab-initio, DFT)
  - MNDO (semi-empirical)
  - DL_POLY (MM)
  - GULP (shell model, defects)
  - Features: single executable possible; parallel implementations
- External modules
  - CHARMM, TURBOMOLE, Gaussian94, MOPAC, AMBER, CADPAC
  - Features: interfaces written in Tcl; no changes to 3rd party codes

ChemShell basics
- ChemShell control files are Tcl scripts
  - Usually we use a .chm suffix
- ChemShell commands have some additional structure; usually they take the following form:

    command arg1=value1 arg2=value2

- Arguments can serve many functions:

    mm_defs=dl_poly.ff    Identify a data file to use
    coords=c              Use object c as the source of the structure
    use_pairlist=yes      Provide a Boolean flag (yes/no, 1/0, on/off)
    list_option=full      Provide a keyword setting
    theory=gamess         Indicate which compute module to use

- Sometimes a command also takes a trailing data argument:

    command arg1=value1 arg2=value2 data
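
To make the arg=value convention concrete, here is a minimal pure-Tcl sketch of how such words can be split into names and values. The helper parse_keyvals is hypothetical and written only for this illustration; it is not ChemShell's own argument parser, which may behave differently (for example in how it treats spaces around "=").

```tcl
# Hypothetical helper (not part of ChemShell): turn a list of
# name=value words into a Tcl dict.
proc parse_keyvals {words} {
    set opts [dict create]
    foreach word $words {
        # position of the first "=" in the word
        set eq [string first "=" $word]
        if {$eq < 1} {
            error "expected name=value, got '$word'"
        }
        set name  [string range $word 0 [expr {$eq - 1}]]
        set value [string range $word [expr {$eq + 1}] end]
        dict set opts $name $value
    }
    return $opts
}

set opts [parse_keyvals {mm_defs=dl_poly.ff coords=c use_pairlist=yes}]
puts [dict get $opts coords]
```

Splitting on the first "=" only means a value may itself contain "=" characters.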

ChemShell Object types
- fragment: molecular structure
  - creation: c_create, load_pdb ...
  - use: universal
- zmatrix
  - creation: z_create
  - use: newopt, z_surface
- matrix
  - creation: create_matrix, energy and gradient evaluators, dynamics
- field
  - creation: cluster_potential etc.
  - use: graphical display, charge fits
- 3dgraph
  - GUI only

ChemShell Object Representations
- Between calculations, and sometimes between commands in a script, objects are stored as files.
- Usually there is no suffix; objects are distinguished internally by the block structure.

    % cat c
    block = fragment records = 0
    block = title records = 1
    phosphine
    block = coordinates records = 34
    p 4.45165900000000e+00 0.00000000000000e+00 -8.17756491786826e-16
    c 6.18573550000000e+00 -2.30082107458395e+00 1.93061811508830e+00
    c 8.21288557680633e+00 -3.57856465464377e+00 9.30188790875432e-01
    c 9.49481331797844e+00 -5.31433733714784e+00 2.37468849115612e+00
    c 8.74959098234423e+00 -5.77236643959209e+00 4.81961751564967e+00
    ...

- Multi-block objects are initiated by an empty block (e.g. fragment)
- Unrecognised blocks are silently ignored

Object Caching
During a run, objects can be cached in memory. The command to request this is the name of the object type followed by the object name (similar to a declaration in a compiled language).

    #
    fragment c
    c_create coords=c {
    h 0 0 0
    h 0 0 1
    }
    list_molecule coords=c
    delete_object c
    #
    # No file is created here

Mixing up in-memory objects and files can lead to confusion!

Object Input and Output
- If you access an object from disk, ChemShell will always update the disk copy when it has finished (there is no easy way of telling whether a command or procedure has changed it).
- Usually this is harmless (e.g. output formats are precise enough), but unrecognised data in the input will not be present in the output. Take a copy if you need to keep it.
- E.g. if a GAMESS-UK punchfile contains a fragment object and a single data field (e.g. the potential), you can use it as both a fragment object and a field object:

    % rungamess test
    % cp test.pun my_structure
    % cp test.pun my_field
    % chemsh ...

Energy Gradient Evaluators
- Many modules are designed to work with a variety of methods to compute the energy and gradient.
- The procedure relies on the interfaces to the codes being consistent. Each comprises a set of callable functions, e.g.
  - initialisation
  - energy, gradient
  - kill
  - update
- The particular set of functions is requested by a command option, usually theory=
- Example evaluators (depends on locally available codes)
  - gamess, turbomole
  - dl_poly, charmm
  - mopac, mndo
  - hybrid
- You can write your own in Tcl
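
The idea of a consistent set of callable functions selected by a theory name can be sketched in plain Tcl. Everything here is an assumption made for illustration: the naming scheme (theory.init, theory.energy, theory.gradient, theory.kill), the driver proc evaluate, and the toy 1-D spring model are not taken from ChemShell, whose real evaluator interface is described in its manual.

```tcl
# Toy evaluator "spring": E = 0.5 * k * (x - x0)^2 in one dimension.
# The <theory>.<function> naming is hypothetical, chosen for this sketch.
namespace eval spring {
    variable k  2.0
    variable x0 1.0
}
proc spring.init {}      { return ok }
proc spring.energy {x}   { expr {0.5 * $spring::k * ($x - $spring::x0) ** 2} }
proc spring.gradient {x} { expr {$spring::k * ($x - $spring::x0)} }
proc spring.kill {}      { return ok }

# Generic driver: it only knows the theory name, and calls whichever
# function set has been defined under that name.
proc evaluate {theory x} {
    $theory.init
    set e [$theory.energy $x]
    set g [$theory.gradient $x]
    $theory.kill
    return [list $e $g]
}

lassign [evaluate spring 3.0] e g
puts "energy=$e gradient=$g"
```

Because the driver depends only on the function names, a second evaluator can be added without changing evaluate, which is the property the slide relies on.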

Module options, using the :
- The : syntax is used to pass control options to a sub-module, e.g. when running the optimiser, to set the options for the module computing the energy and gradient.
- {} can be used if there is more than one argument to pass on.
- Nested structures are possible using Tcl lists.

    newopt function=zopt : { theory=gamess : { basis=sto3g } zmatrix=z }

    newopt                       Command
    function=zopt                newopt arguments
    theory=gamess zmatrix=z      zopt arguments
    basis=sto3g                  gamess arguments
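
The nesting implied by the : syntax can be made explicit with a small recursive parser. This is a hypothetical pure-Tcl sketch for illustration only: parse_module_args is not a ChemShell command, and it assumes that a ":" word attaches the following word (read as a Tcl list) as sub-options of the preceding name=value argument.

```tcl
# Hypothetical parser, not ChemShell code. Returns a dict mapping each
# argument name to {value <value> options <parsed sub-options>}.
proc parse_module_args {words} {
    set result [dict create]
    set n [llength $words]
    for {set i 0} {$i < $n} {incr i} {
        set word [lindex $words $i]
        set eq [string first "=" $word]
        if {$eq < 1} { error "expected name=value, got '$word'" }
        set name  [string range $word 0 [expr {$eq - 1}]]
        set value [string range $word [expr {$eq + 1}] end]
        set sub [dict create]
        # A ":" means the next word holds options for the sub-module
        if {[lindex $words [expr {$i + 1}]] eq ":"} {
            set sub [parse_module_args [lindex $words [expr {$i + 2}]]]
            incr i 2
        }
        dict set result $name [dict create value $value options $sub]
    }
    return $result
}

set parsed [parse_module_args \
    {function=zopt : { theory=gamess : { basis=sto3g } zmatrix=z }}]
puts [dict get $parsed function value]
puts [dict get $parsed function options theory options basis value]
```

Reading the result from the outside in reproduces the annotation on the slide: function belongs to newopt, theory and zmatrix to zopt, and basis to gamess.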

Object Oriented Interfaces
- Purpose
  - To allow a script to manage multiple instances of the same data structure
  - Support for recursive/re-entrant algorithms
- Examples
  - PE surface drivers: newopt, dynamics
  - Coordinate transformation routines: copt, zopt, dlcopt
  - QM and MM interfaces: dl_poly, mopac
- Syntax
  - Modelled on Tk, itcl, and other object-oriented Tcl extensions

    mopac mop1 coords=water.c energy=water.e hamiltonian=am1
    mop1 energy
    # additional invocations
    mop1 delete
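
The Tk-style pattern (a class command creates a named instance, which then becomes a command of its own) can be reproduced in a few lines of plain Tcl. The counter class below is a hypothetical stand-in written for this sketch; it only mirrors the create / invoke / delete shape of the mopac example and says nothing about how ChemShell implements its modules.

```tcl
# Hypothetical "counter" class. Per-instance state lives in a global
# array keyed by the instance name.
proc counter {name} {
    global counter_state
    set counter_state($name) 0
    # Define a command named after the instance that forwards to the dispatcher
    proc ::$name {method} "counter_dispatch [list $name] \$method"
    return $name
}

proc counter_dispatch {name method} {
    global counter_state
    switch -- $method {
        bump   { return [incr counter_state($name)] }
        value  { return $counter_state($name) }
        delete {
            unset counter_state($name)
            rename ::$name {}   ;# remove the instance command
            return
        }
        default { error "unknown method '$method'" }
    }
}

counter c1
counter c2
c1 bump
c1 bump
c2 bump
puts "[c1 value] [c2 value]"
c1 delete
```

Two instances of the same class keep independent state, which is what allows the same module to appear more than once in a script.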

Using multiple objects - A QM/MM optimisation scheme
- Object types
  - newopt: optimiser object
  - zopt: Z-matrix/cartesian transformation
  - dl_poly: MM energy and forces
  - hopt: custom-written target function (150 lines of Tcl)
- Instances
  - zopt1: reduced (active site) coordinates
  - zopt2: environment coordinates
  - newopt1: master optimisation, QM/MM hamiltonian, in a reduced coordinate space
  - newopt2: relax the environment using a pure MM energy expression

[Diagram of the module instances: newopt1, hopt, newopt2, zopt2, dl_poly2, zopt1, hybrid, dl_poly1, mopac]

HPC Exploitation
- Adapt Tcl interpreter to run in parallel, master-slave architecture
  - one node reads input, all nodes execute commands
  - in-core storage of data objects
    - replicated: default
    - distributed: specific matrix objects
- Use of parallel third party codes (e.g. TURBOMOLE)
- Direct coupling to parallel codes:
  - GA-based GAMESS-UK
  - MPI-based MNDO, GULP, DL_POLY

GAMESS-UK Parallel Implementation
- Replicated data scheme
  - store P, F, S etc. on every node
  - minimal communications (load balancing, global sum)
  - up to ca. 2000 basis functions
- Message-passing version (MPI, TCGMSG)
  - SCF and DFT
  - Suitable for < 32 processors
- Global Array version
  - Parallel functionality
    - SCF, DFT, MP2, SCF Hessian
  - Parallel algorithms
    - GAs for in-core storage of transformed integrals (to vvoo) and MP2 amplitudes
    - parallel linear algebra (PEIGS, DIIS, MXM etc.)
    - GA-mapped ATMOL file system

Thrombin QM/MM Benchmark
The QM region (69 atoms) is modelled by an SCF 6-31G calculation (401 GTOs); the total system size for the classical calculation is 16,659 atoms. The MM calculation was performed using all non-bonded interactions (i.e. without a pairlist), and the QM/MM interaction was cut off at 15 angstroms using a neutral group scheme (leading to the inclusion in the QM calculation of 1507 classical centres).

[Figure: benchmark plot; x-axis: Number of Nodes]

QM/MM Thrombin Benchmark
QM/MM calculation on the thrombin pocket: the QM region (69 atoms) is modelled by an SCF 6-31G calculation (401 GTOs); the total system size for the classical calculation is 16,659 atoms. The QM/MM interaction was cut off at 15 angstroms using a neutral group scheme (leading to the inclusion in the QM calculation of 1507 classical centres).

[Figure: Scaling for Thrombin + Inhibitor (NAPAP) + Water; x-axis: Number of Nodes]

Workshop Session

Workshop Preparation
- ChemShell overview
- Introduction to Tcl
- Script basics
- Modules overview
  - creating input data objects
  - dl_poly
  - gamess
- QM/MM methods
  - hybrid
  - QM/MM models available
  - Input examples
- ChemShell GUI

ChemShell
A Tcl interpreter for computational chemistry.
- Interfaces
  - ab-initio (GAMESS-UK, Gaussian, CADPAC, TURBOMOLE, MOLPRO, NWChem etc.)
  - semi-empirical (MOPAC, MNDO)
  - MM codes (DL_POLY, CHARMM, GULP)
- Optimisation, dynamics (based on DL_POLY routines)
- Utilities (clusters, charge fitting etc.)
- Coupled QM/MM methods
  - Choice of QM and MM codes
  - A variety of QM/MM coupling schemes: electrostatic, polarised, connection atom, Gaussian blur ...
- QUASI project developments and applications
  - e.g. organometallics, enzymes, oxides, zeolites

Very Simple Tcl (i)
- Variable assignment (all variables are strings)

    set a 1

- Variable use

    [ set a ]
    $a

- Command result substitution

    [ <Tcl command> ]

- Numerical expressions

    set a [ expr 2 * $b ]

- Output to stdout

    puts stdout "this is an output string"

Very Simple Tcl (ii)
- Lists: often passed to ChemShell commands as arguments

    set a { 1 2 3 }
    set a "1 2 3"
    set a [ list 1 2 3 ]

  - $ is evaluated within [ list .. ] and " " but not { }
  - the [ list ... ] construct is best for building nested lists using variables
- Arrays: not used in ChemShell arguments, useful for user scripts
  - Associative: can be indexed using any string

    set a(1) 1
    set a(fred) x
    parray a
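
The three ways of writing a list differ in when $ is substituted and in how values containing spaces are treated. The short tclsh sketch below shows both effects; the variable names and values are made up for the illustration.

```tcl
set b 5
set braces  {1 $b 3}          ;# braces: no substitution, "$b" stays literal
set quotes  "1 $b 3"          ;# quotes: $b is substituted
set viaList [list 1 $b 3]     ;# list: $b is substituted, one element per argument

# A value containing a space shows why [list ...] is the safe choice
# for building nested lists from variables.
set name "dl poly"
set nested [list theory=$name [list basis=dzp charge=0]]
set broken "theory=$name {basis=dzp charge=0}"

puts [lindex $braces 1]
puts [lindex $quotes 1]
puts [llength $nested]
puts [llength $broken]
```

With [list ...] the value of name stays inside one element, so nested has two elements; the quoted form splits at the space and ends up with three.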

Very Simple Tcl (iii)
- Continuation lines: a backslash can escape the newline

    tclsh % set a "this\
    is\
    a single variable"
    this is a single variable
    tclsh %

- { } will incorporate newlines into the list

    tclsh % set a {this
    is
    a single variable}
    this
    is
    a single variable
    tclsh %

Very Simple Tcl (iv)
- Procedures
  - Sometimes needed to pass to ChemShell commands to provide an action

    proc my_procedure { my_arg1 my_arg2 args } {
        puts stdout "my_procedure"
        return "the result"
    }

- Files

    set fp [ open my.dat w ]
    puts $fp "set x $x"
    close $fp
    ...
    source my.dat
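
The file example works because the saved file is itself a Tcl script, so source replays the set command. A runnable version of the same round trip might look as follows; the file name is made up, and [list set x $x] is used instead of a quoted string so that values containing spaces survive (a small change from the slide, not something the slide states).

```tcl
set x "42 degrees"

# Hypothetical scratch file in the current directory
set path [file join [pwd] my_state_example.dat]

# Write a Tcl command that recreates x when the file is sourced
set fp [open $path w]
puts $fp [list set x $x]
close $fp

unset x            ;# forget the value ...
source $path       ;# ... and restore it by executing the saved command
puts $x

file delete $path
```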

Loading Objects - Z-matrices

    z_create zmatrix=z {
    zmatrix angstrom
    c
    x 1 1.0
    n 1 cn 2 ang
    f 1 cf 2 ang 3 phi
    variables
    cn 1.135319
    cf 1.287016
    phi 180.
    constants
    ang 90.
    end
    }
    z_list zmatrix=z
    set p [ z_prepare_input zmatrix=z ]
    puts stdout $p

- z_create provides an input processor for the z-matrix object
- z_list can be used to display the object in a readable form
- z_prepare_input provides the reverse transformation, if you need something to edit
- z_to_c provides the cartesian representation

Additional Z-matrix features (i)
- Can include some atoms specified using cartesian coordinates
- Can use symbols for atom-path values (i1, i2, i3)
- Can append % to symbols to make them unique (e.g. o%1)
- Can create/destroy and set variables using Tcl commands

    z_create zmatrix=z1 {
    zmatrix
    o%1
    o%2 o%1 3.
    h%1 o%1 1.8 o%2 121.0
    h%2 o%2 1.85 o%1 122.0 h%1 29.0
    end
    }
    z_var zmatrix=z1 result=z2 control= "release all"
    z_substitute zmatrix=z2 values= {r2=3.0 r3=2.0}
    z_list zmatrix=z2

Additional Z-matrix features (ii)
- Combine cartesian and internal definitions

    z_create zmatrix=z1 {
    coordinates
    ...
    zmatrix
    ....
    end
    }

- c_to_z will create a fully cartesian z-matrix

Loading Data Object - Coordinates

    c_create coords=h2o.c {
    title
    water dimer
    coordinates au
    o  0.0000000000  0.0000000000  0.0000000000
    h  0.0000000000 -1.4207748912  1.0737442022
    h  0.0000000000  1.4207748912  1.0737442022
    o -4.7459987607  0.0000000000 -2.7401036621
    h -3.1217528345  0.0000000000 -2.0097934033
    h -4.4867611522  0.0000000000 -4.5020127872
    }

- No symbols allowed
- Can also use read_xyz, read_pdb, read_xtl

Periodic Systems (i)
Crystallographic cell constants can be provided, along with fractional coordinates.

    #
    c_create coords=mgo.c {
    space_group
    1
    cell_constants angstrom
    4.211200 4.211200 4.211200 90.00 90.00 90.00
    coordinates
    Mg 0.10000000 0.00000000  0.00000000
    O  0.50000000 0.50000000 -0.50000000
    Mg 0.00000000 0.50000000 -0.50000000
    O  0.50000000 1.00000000 -1.00000000
    Mg 0.50000000 0.00000000 -0.50000000
    O  1.00000000 0.50000000 -1.00000000
    Mg 0.50000000 0.50000000  0.00000000
    O  1.00000000 1.00000000 -0.50000000
    }
    list_molecule coords=mgo.c
    set p [ c_prepare_input coords=mgo.c ]

Periodic Systems (ii)
Alternatively, input the cell explicitly in c_create, or attach it to the structure later.

    #
    c_create coords=d {
    title
    primitive unit cell of diamond
    coordinates au
    c  0.8425347285  0.8425347285  0.8425347285
    c -0.8425347285 -0.8425347285 -0.8425347285
    cell au
    0.00000 3.37014 3.37014
    3.37014 0.00000 3.37014
    3.37014 3.37014 0.00000
    }
    extend_fragment coords=d cell_indices= { -2 2 -2 2 -2 2 } result=d2
    set_cell coords=d cell= {
    0.00000 3.37014 3.37014
    3.37014 0.00000 3.37014
    3.37014 3.37014 0.00000
    }
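
As a rough picture of what replicating a cell over a range of indices involves, the pure-Tcl sketch below translates every atom by integer combinations of the three cell vectors. It rests on assumptions that the slide does not confirm: that each row of the cell block is one lattice vector, and that cell_indices lists a minimum and maximum index for each of the three directions. The proc replicate is hypothetical and is not the implementation of extend_fragment.

```tcl
# Hypothetical illustration, not ChemShell code.
# atoms: list of {symbol x y z}; cell: three lattice vectors;
# range: {imin imax jmin jmax kmin kmax} (assumed meaning).
proc replicate {atoms cell range} {
    lassign $cell a b c
    lassign $range i0 i1 j0 j1 k0 k1
    set out {}
    for {set i $i0} {$i <= $i1} {incr i} {
        for {set j $j0} {$j <= $j1} {incr j} {
            for {set k $k0} {$k <= $k1} {incr k} {
                foreach atom $atoms {
                    lassign $atom sym x y z
                    set shifted [list $sym]
                    # shift each cartesian component by i*a + j*b + k*c
                    foreach r [list $x $y $z] d {0 1 2} {
                        lappend shifted [expr {$r + $i*[lindex $a $d] \
                            + $j*[lindex $b $d] + $k*[lindex $c $d]}]
                    }
                    lappend out $shifted
                }
            }
        }
    }
    return $out
}

set atoms {
    {c  0.8425347285  0.8425347285  0.8425347285}
    {c -0.8425347285 -0.8425347285 -0.8425347285}
}
set cell {{0.0 3.37014 3.37014} {3.37014 0.0 3.37014} {3.37014 3.37014 0.0}}
set big [replicate $atoms $cell {-2 2 -2 2 -2 2}]
puts [llength $big]
```

Under these assumptions the index range -2..2 in each direction gives 5 x 5 x 5 = 125 cells, so the two-atom diamond cell yields 250 atoms.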

QM Code Interfaces
- Provides access to third party codes
  - GAMESS-UK, MOPAC, MNDO, TURBOMOLE, Gaussian98
- Standardised interfaces
  - argument structure
    - hamiltonian (includes functional)
    - charge, mult, scftype
    - basis (internal library or keywords)
    - accuracy
    - direct
    - symmetry
    - maxcyc
    - ...

GAMESS-UK Interface
Can be built in two ways:
- Interface calls GAMESS-UK
  - the job is executed using rungamess (so you may need to have some environment variables set)
  - parallel execution can be requested even if ChemShell is running serially
- GAMESS-UK is built as part of ChemShell
  - mainly intended for parallel machines

    energy coords=c \
        theory=gamess : { basis=dzp hamiltonian=b3lyp } \
        energy=e

Notes
- The jobname is gamess1 unless specified
- Some code-specific options
  - dumpfile= specify dumpfile routing
  - getq= load vectors from foreign dumpfile

GAMESS-UK example - basis library

    energy coords=c \
        theory=gamess : { basisspec = { { sto-3g *} {dzp o} } } \
        energy=e

- basisspec has the structure { { basis1 atoms-spec1 } { basis2 atoms-spec2 } ... }
- Assignment proceeds left to right, using pattern matching for atom labels
- * is a wild card
- This example gives sto-3g for all atoms except o
- Library can be extended in the Tcl script (see examples/gamess/explicitbas.chm)
- ECPs are used where appropriate for the basis
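
The left-to-right assignment rule can be mimicked in plain Tcl with string match, letting later entries override earlier ones. This is a hypothetical re-implementation for illustration only: assign_basis is not a ChemShell command, and it assumes glob-style, case-sensitive matching against the whole atom label, which the slide does not spell out.

```tcl
# Hypothetical sketch, not ChemShell code.
# basisspec: list of {basis pattern}; labels: list of atom labels.
proc assign_basis {basisspec labels} {
    set result [dict create]
    foreach label $labels {
        foreach entry $basisspec {
            lassign $entry basis pattern
            if {[string match $pattern $label]} {
                # entries further right overwrite earlier matches
                dict set result $label $basis
            }
        }
    }
    return $result
}

set spec {{sto-3g *} {dzp o}}
set assigned [assign_basis $spec {c h o}]
puts $assigned
```

Putting the wild-card entry first gives every atom a default, and the more specific entry for o then replaces it.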

Core modules: DL_POLY
- Features
  - Energy and gradient routines from DL_POLY (Bill Smith UK, CCP5)
  - General purpose MM energy expression, including approximations to
    - CFF91 (e.g. zeolites)
    - CHARMM
    - AMBER
    - MM2
  - Topology generator
    - automatic atom typing
    - parameter assignment based on connectivity
    - topology from CHARMM PSF input
- FIELD, CONFIG, CONTROL are generated automatically
- FIELD is built up using terms defined in the file specified by the mm_defs= argument
- Periodic boundary conditions are limited to parallelepiped-shaped cells
- Can have multiple topologies active at one time

DL_POLY forcefield terms
- Terms are input using atom symbols (or the * wild card)
- Individual keyword terms:
  - bond, mm2bond, quarbond
  - angle, mm2angle, quarangle
  - ptor, mm2tor, htor, cfftor
  - aa-couple, aat-couple
  - vdw, powers, m_n_vdw, 6_vdw, mm2_vdw
- Input units are kcal/mol, angstrom etc., in line with most forcefield publications
- For a full description see the manual

Automatic atom type assignment
- The forcefield definition can incorporate connectivity-based atom type definitions, which will be used to assign types
- Atom types are hierarchical; the most specific applicable type will be used (the algorithm is iterative)
- e.g. to use different parameters for the ipso-C of PPh3, define a new type by a connection to phosphorus:

    query ci "ipso c"
    supergroup c
    target c
    atom p
    connect 1 2
    endquery
    charge c -0.15
    charge h 0.15

DL_POLY Example

    # dummy forcefield
    read_input dl_poly.ff {
    bond c c 100 1.5
    bond c h 100 1.0
    angle c c c 100 120
    angle c c h 100 120
    vdw h h 2500 1000000
    vdw c c 2500 1000000
    vdw h c 2500 1000000
    htor c c c c 100 0.0 i-j-k-l
    charge c -0.15
    charge h 0.15
    }
    energy theory=dl_poly : mm_defs = dl_poly.ff coords=c energy=e

Using DL_POLY with CHARMM Parameters
- Replicates the CHARMM energy expression (without UREY)
- Uses standard CHARMM datafiles
- Requires the CHARMM program + script to run as far as energy evaluation for initial setup
- Atom charges and atom types are obtained by communication with a running CHARMM process (usually only run once)

    # run charmm using script provided
    charmm.preinit charmm_script=all.charmm coords=charmm.c
    # Store type names from the topology file
    load_charmm_types2 top_all22_prot.inp charmm_types
    # These require CTCL (i.e. charmm running)
    set types [ get_charmm_types ]
    set charges [ get_charmm_charges ]
    set groups [ get_charmm_groups ]
    #
    charmm.shutdown

Using DL_POLY with CHARMM Parameters (continued)
Then provide the dl_poly interface with:
- .psf (topology): charmm_psf =
- .rtf (for atom types): charmm_mass_file=
- .inp parameter files: charmm_parameter_file=

    theory=dl_poly : [ list \
        list_option=full \
        cutoff = [ expr 15 / 0.529177 ] \
        scale14 = { 1.0 1.0 } \
        atom_types= $types \
        atom_charges= $charges \
        use_charmm_psf=yes \
        charmm_psf_file=4tapap_wat961.psf \
        charmm_parameter_file=par_all22_prot_mod.inp \
        charmm_mass_file= $top ]

Core modules: Geometry Optimisers
- Small molecules
  - Internal coordinates (delocalised, redundant etc.)
  - Full Hessian
  - O(N^3) cost per step
  - BFGS, P-RFO
- Macromolecules
  - Cartesian coordinates
  - Partial Hessian (e.g. diagonal)
  - O(N) cost per step
  - Conjugate gradient, L-BFGS
- Coupled QM/MM schemes
  - Combine cartesian and internal coordinates
    - Reduce cost of manipulating B, G matrices
  - Define a subspace (core region) and relax the environment at each step
    - reduce size of Hessian
    - exploit greater stability of minimisation vs. TS algorithms
  - Use an approximate scheme for environmental relaxation

QUASI - Geometry Optimisation Modules
- newopt
  - A general purpose optimiser
  - Target functions, specified by function=
    - copt: cartesian (obsolete)
    - zopt: z-matrix (now also handles cartesians)
    - new functions can be written in Tcl (see example rosenbrock)
  - For QM/MM applications, e.g.
    - P-RFO adapted for the presence of soft modes
    - Hessian update includes partial finite difference in eigenmode basis
  - New algorithms can be coded in Tcl using primitive steps (forces, updates, steps, etc.)
- hessian
  - Generates Hessian matrices (e.g. for TS searching)
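
For readers who have not met the Rosenbrock test function mentioned above, the plain-Tcl sketch below defines f(x, y) = (1 - x)^2 + 100 (y - x^2)^2 and its analytic gradient, and checks the gradient by central finite differences. It shows only the mathematical function: the proc names are made up, and the interface that newopt expects from a Tcl target function is not shown here (see the rosenbrock example shipped with ChemShell for that).

```tcl
# Rosenbrock function and gradient; names are hypothetical.
proc rosenbrock {x y} {
    expr {(1.0 - $x)**2 + 100.0 * ($y - $x*$x)**2}
}

proc rosenbrock_grad {x y} {
    set gx [expr {-2.0*(1.0 - $x) - 400.0*$x*($y - $x*$x)}]
    set gy [expr {200.0*($y - $x*$x)}]
    return [list $gx $gy]
}

# Central finite-difference estimate of the gradient, used as a check
proc fd_grad {x y {h 1e-6}} {
    set gx [expr {([rosenbrock [expr {$x + $h}] $y] \
                 - [rosenbrock [expr {$x - $h}] $y]) / (2.0*$h)}]
    set gy [expr {([rosenbrock $x [expr {$y + $h}]] \
                 - [rosenbrock $x [expr {$y - $h}]]) / (2.0*$h)}]
    return [list $gx $gy]
}

puts [rosenbrock 1.0 1.0]
puts [rosenbrock_grad -1.2 1.0]
puts [fd_grad -1.2 1.0]
```

The minimum is at (1, 1), where the function is zero; comparing the analytic and finite-difference gradients at another point is a cheap sanity check before handing a new target function to an optimiser.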

Newopt example - minimisation

    #
    # function zopt allows the newopt optimiser to work with
    # the energy as a function of the internal coordinates of
    # the molecule
    #
    newopt function=zopt : { theory=gamess : { basis=dzp } } \
        zmatrix=z

Newopt example - transition state determination

    # functions zopt.* allow the newopt optimiser to work with
    # the energy as a function of the internal coordinates of
    # the molecule
    set args "{theory=gamess : { basis=3-21g } zmatrix=z}"
    hessian function=zopt : [ list $args ] \
        hessian=h_fcn_ts method=analytic
    newopt function=zopt : [ list $args ] \
        method=baker \
        input_hessian=h_fcn_ts \
        follow_mode=1

HDLCopt optimiser
hdlcopt
- Hybrid Delocalised Internal Coordinate scheme (Alex Turner, Walter Thiel, Salomon Billeter)
- Developed within QUASI project
- O(N) overall scaling per step
- Key elements:
  - Residue specification, often taken from a pdb file (pdb_to_res), allows separate delocalised coordinates to be generated for each residue
  - Can perform P-RFO TS search in the first residue with relaxation of the others
    - increased stability for TS searching
    - much smaller Hessians
- Further information on algorithm: S.R. Billeter, A.J. Turner and W. Thiel, PCCP 2000, 2, p 2177
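To put a rough number on "much smaller Hessians": if a Hessian is kept only for the first (core) residue rather than for the whole system, its storage drops with the square of the atom count. The sketch below assumes 3 coordinates per atom and uses hypothetical system sizes (1000 atoms in total, 10 in the core residue); it is a back-of-envelope count, not a description of how HDLCopt stores its data.

```python
def hessian_elements(n_atoms: int) -> int:
    """Number of elements in a dense Hessian over 3 coordinates per atom."""
    n_coords = 3 * n_atoms
    return n_coords * n_coords


# Hypothetical sizes: whole system versus the core residue only.
full_system = hessian_elements(1000)
core_residue = hessian_elements(10)

print(f"full Hessian:  {full_system} elements")
print(f"core Hessian:  {core_residue} elements")
print(f"reduction:     {full_system // core_residue}x")
```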

HDLCopt example

    # procedure to update the last step
    proc hdlcopt_update { args } {
        parsearg update { coords } $args
        write_xyz coords= $coords file=update.xyz
        end_module
    }
    # select residues
    set residues [ pdb_to_res "4tapap_wat83.pdb" ]
    # load coordinates
    read_pdb file=4tapap_wat83.pdb coords=4tapap_wat83.c
    hdlcopt coords=4tapap_wat83.c result=4tapap_wat83.opt \
        theory=mndo : { hamiltonian=am1 charge=1 optstr={ nprint=2 kitscf=200 } } \
        memory=200 residues= $residues \
        update_procedure=hdlcopt_update

GULP Interface
- Simple interface to GULP energy and forces
- GULP licensing from Julian Gale
- GULP must be compiled in
- In alpha version only for the workshop
- ChemShell fragment object supports shells
- Shells are relaxed by GULP with cores fixed; ChemShell typically controls the core positions
- Provide forcefield in standard GULP format

GULP interface example

    read_input gulp.ff {
    # from T.S.Bush, J.D.Gale, C.R.A.Catlow and P.D. Battle
    # J. Mater Chem., 4, 831-837 (1994)
    species
    Li core 1.000
    Na core 1.000
    ...
    buckingham
    Li core O shel 426.480 0.3000 0.00 0.0 10.0
    Na core O shel 1271.504 0.3000 0.00 0.0 10.0
    spring
    Mg 349.95
    Ca 34.05
    }
    add_shells coords=mgo.c symbols= {O Mg}
    newopt function=copt : [ list coords=mgo.c theory=gulp : [ list mm_defs=gulp.ff ] ]

ChemShell CHARMM Interface
- Full functionality from standard academic CHARMM
- Dual process model
  - CHARMM runs as a separate process
  - CHARMM/Tcl interface (CTCL, Alex Turner) uses a named pipe to issue CHARMM commands and return results
- Commands to export data for DL_POLY and hybrid modules
  - atomic types and charges
  - neutral groups
  - topologies and parameters
- Access to coupling models internal to CHARMM
  - GAMESS(US), MOPAC
  - GAMESS-UK (under development)
    - collaboration with NIH
    - explore additional coupling schemes (e.g. double link atom)

CHARMM Interface - example

    # start charmm process, create chemshell object
    # containing the initial structure
    charmm.preinit script=charmm.in coords=charmm.c
    # ChemShell commands with theory=charmm
    hdlcopt theory=charmm coords=charmm.c
    # destroy charmm process
    charmm.shutdown

Molecular Dynamics Module
Design features
- Generic: can integrate QM, MM, QM/MM trajectories
- Based on DL_POLY routines
  - Integration by Verlet leap-frog
  - SHAKE constraints
  - Quaternion rigid body motion
  - NVT, NPT, NVE integration
- Script-based control of primitive steps
  - Simulation protocols: equilibration, simulated annealing
  - Tcl access to ChemShell matrix and coordinate objects, e.g. force modification for harmonic restraint
- Data output
  - trajectory output, restart files
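For readers unfamiliar with Verlet leap-frog integration, the sketch below shows the textbook scheme on a one-dimensional harmonic oscillator: velocities live at half steps, positions at full steps. It is a generic illustration in Python, not the DL_POLY routine; all names are hypothetical, and the harmonic force stands in for the kind of harmonic restraint term mentioned on the slide.

```python
def leapfrog_step(x, v_half, force, mass, dt):
    """One leap-frog step.

    v(t + dt/2) = v(t - dt/2) + dt * F(x(t)) / m
    x(t + dt)   = x(t) + dt * v(t + dt/2)
    """
    v_new = v_half + dt * force(x) / mass
    x_new = x + dt * v_new
    return x_new, v_new


def harmonic_force(x, k=1.0):
    """Restoring force of a harmonic well centred at the origin."""
    return -k * x


# Start at x = 1 with the half-step velocity approximated as zero.
x, v = 1.0, 0.0
dt, mass = 0.01, 1.0
for _ in range(1000):
    x, v = leapfrog_step(x, v, harmonic_force, mass, dt)

# Energy estimate from staggered x and v; it stays close to the initial 0.5.
energy = 0.5 * mass * v * v + 0.5 * x * x
print(f"x = {x:.4f}, v = {v:.4f}, energy = {energy:.4f}")
```

Splitting the work into a force evaluation and a step is also what makes script-level control possible: extra commands (for example a restraint force) can be applied between the two.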

Molecular Dynamics - arguments
Object oriented syntax follows Tk etc.

    dynamics dyn1 coords=c … etc

Arguments
- theory= module used to compute energy and forces
- coords= initial configuration of the system
- timestep= integration timestep (ps) [0.0005]
- temperature= simulation temperature (K) [293]
- mcstep= max step displacement (a.u.) for Monte Carlo [0.2]
- taut= Tau(t) for Berendsen thermostat (ps) [0.5]
- taup= Tau(p) for Berendsen barostat (ps) [5.0]
- compute_pressure= whether to compute pressure and virial (for NVT simulation)
- verbose= provide additional output
- energy_unit= unit for output
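The taut argument is the relaxation time of the Berendsen thermostat. In the standard weak-coupling formulation, velocities are scaled each step by a factor that depends on the ratio of timestep to taut. The Python sketch below uses that textbook formula with the default timestep, taut and temperature listed on the slide; that ChemShell applies exactly this expression is an assumption, and the function name and the instantaneous temperature of 350 K are illustrative.

```python
import math


def berendsen_scale(t_current, t_target, timestep, taut):
    """Velocity scaling factor of the Berendsen weak-coupling thermostat.

    lambda = sqrt(1 + (timestep / taut) * (t_target / t_current - 1))
    """
    return math.sqrt(1.0 + (timestep / taut) * (t_target / t_current - 1.0))


# Slide defaults: timestep 0.0005 ps, taut 0.5 ps, target 293 K.
# Hypothetical instantaneous temperature: 350 K (system too hot).
scale = berendsen_scale(t_current=350.0, t_target=293.0,
                        timestep=0.0005, taut=0.5)
print(f"velocity scale factor = {scale:.8f}")
```

With taut a thousand times larger than the timestep, the factor is only slightly below 1, so the system is nudged towards the target temperature gradually rather than rescaled in one step.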

Molecular Dynamics - arguments (cont.)
Arguments
- rigid_groups= rigid groups (quaternion definitions)
- constraints= interatomic distances for SHAKE
- ensemble= choice of ensemble [NVE]
- frozen= list of frozen atoms
- trajectory_type= additional fields for trajectory (> 0 for velocity, > 1 for forces)
- trajectory_file= dynamics.trj file for trajectory output
Methods, e.g. dyn1 configure temperature=300
- configure - modify simulation parameters
- initvel - initialise random velocities
- forces - evaluate molecular forces
- step - take MD step

Molecular Dynamics - More methods
- update - request MM or QM/MM pairlist update
- mctest - test step (Monte Carlo only)
- output - print data (debugging use only)
- printe - print step number, kinetic, potential and total energy, temperature, pressure, volume and virial
- get - return a variable from the dynamics: temperature, input_temperature, pressure, input_pressure, total_energy, kinetic_energy, potential_energy, time
- trajectory - output the current configuration to the trajectory file
- destroy - free memory and destroy object
- fcap - force cap
- load - recover positions/velocities
- dump - save positions/velocities
- dumpdlp - write REVCON

Molecular Dynamics - Example

    dynamics dyn1 coords=c theory=mndo temperature=300 timestep=0.005
    dyn1 initvel
    set nstep 0
    while {$nstep < 10000 } {
        dyn1 force
        dyn1 step
        .....
        # additional Tcl commands here
        incr nstep
    }
    dyn1 configure temperature=300
    # etc
    dyn1 destroy

Hybrid Module
All control data held in Tcl lists, created
- by setup program (Z-matrix style input)
- by scripts or GUI etc.
- from user-supplied list of QM atoms provided as an argument
Implements
- Book-keeping
  - division of atom lists
  - addition of link atoms
  - summation of energy/forces
- Charge shift, and addition of a compensating dipole at M2
- Force resolution when link atoms are constrained to bond directions, e.g. the force on the first layer MM atom (M1) arising from the force on the link atom is evaluated (expression shown on the original slide)
- Uses neutral group cutoff for QM/MM interactions
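The expression for the force on M1 is not recoverable from this transcript. As an illustration of what force resolution means, the Python sketch below applies the chain rule under one common convention: the link atom L sits on the Q1-M1 bond at a fixed distance d from the QM atom Q1, so r_L = r_Q1 + d u, with u the unit vector from Q1 to M1. This convention, the function name and the numbers are assumptions, not necessarily ChemShell's scheme.

```python
import math


def resolve_link_force(r_q1, r_m1, f_link, d_link):
    """Distribute the force on a link atom onto Q1 and M1.

    With r_L = r_Q1 + d_link * u, the Jacobian dr_L/dr_M1 is
    (d_link / r) * (I - u u^T), where r = |r_M1 - r_Q1|.
    """
    bond = [m - q for q, m in zip(r_q1, r_m1)]
    r = math.sqrt(sum(b * b for b in bond))
    u = [b / r for b in bond]
    # Component of the link-atom force along the bond direction.
    f_parallel = sum(f * ui for f, ui in zip(f_link, u))
    # M1 receives only the perpendicular part, scaled by d_link / r.
    f_m1 = [(d_link / r) * (f - f_parallel * ui) for f, ui in zip(f_link, u)]
    # Q1 receives the remainder, so the total force is conserved.
    f_q1 = [f - fm for f, fm in zip(f_link, f_m1)]
    return f_q1, f_m1


# Hypothetical geometry: Q1 at the origin, M1 1.5 units along x, link at 1.0.
f_q1, f_m1 = resolve_link_force(
    r_q1=(0.0, 0.0, 0.0), r_m1=(1.5, 0.0, 0.0),
    f_link=(1.0, 2.0, 0.0), d_link=1.0)
print("force on Q1:", f_q1)
print("force on M1:", f_m1)
```

Because the link atom position is fully determined by Q1 and M1, it has no independent degrees of freedom; redistributing its force this way keeps the total force unchanged.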

Hybrid Module
Typical input options
- qm_theory=
- mm_theory=
- qm_region = { } list of atoms in the QM region
- coupling = type of coupling
- groups = { } neutral charge groups
- cutoff = QM/MM cutoff
- atom_charges = MM charges

The PyMOL GUI
- Implemented as a Python extension to PyMOL, an open source molecular modelling program
- Provides control of QM, MM, and QM/MM calculations using HDLCopt, Newopt, and Dynamics modules
- Visualisation of structures, frequencies, orbitals
- Under development (this is an understatement: it was written last week)
- Writes a ChemShell script (in Tcl) comprising
  - control parameters
  - job-specific segments

PyMOL tips

The PyMOL ChemShell GUI