Automatic Construction of Ab Initio Potential Energy Surfaces Interpolative Moving Least Squares (IMLS) Fitting of Ab Initio Data for Constructing Global.

Presentation transcript:

Automatic Construction of Ab Initio Potential Energy Surfaces
Interpolative Moving Least Squares (IMLS) Fitting of Ab Initio Data for Constructing Global Potential Energy Surfaces for Spectroscopy and Dynamics
Donald L. Thompson, University of Missouri – Columbia
Richard Dawes, Al Wagner, & Michael Minkoff
Fourth International Meeting: "Mathematical Methods for Ab Initio Quantum Chemistry", November 2008, Laboratoire J.A. Dieudonné, CNRS et Université de Nice - Sophia-Antipolis

Potential Energy Surfaces
- Basis for quantum and classical dynamics and spectroscopy
- Electronic structure calculations can provide accurate energies (even gradients and Hessians), but at a high cost: a highly accurate energy calculation for a single geometry can take hours or days
We want to:
- Generate accurate global PESs fit to a minimum number (100s to 1000s) of ab initio points
- Make ab initio dynamics feasible for the highest levels of quantum chemistry methods (for which gradients may not be directly available)
- Keep the procedure as “black box” as possible

Requirements:
- Minimize the number of ab initio points
- Minimal human effort and cost of fitting
- Low-cost, accurate evaluations
Our approach: Interpolating Moving Least Squares (IMLS)
- Much cheaper than high-level quantum chemistry
- Doesn’t need gradients, but can use gradients and Hessians
- Can use high-degree polynomials
How to make it efficient and practical:
- Optimally place a minimum number of points
- Weight functions
- Reuse fitting coefficients (store local expansions)
- Use a zeroth-order PES and fit the difference
- Other techniques

Least-Squares Fitting
Usual applications are for data with statistical errors, but trends that follow known functional forms.
Fitting ab initio energies:
- Ab initio energies do not have random errors
- A PES does not have a precisely known functional form; the energy points lie on a surface of unknown shape
- Thus, fit with a general basis set (e.g., polynomials)
- Basis functions that resemble the “true” function provide a more compact representation

Weighted least squares equations
$B^{T} W(z)\, B\, a(z) = B^{T} W(z)\, V$
W = 1 gives standard least squares. We use standard routines.
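The weighted least squares equations above can be illustrated with a minimal 1-D sketch. Everything specific here is an assumption rather than the authors' implementation: B is taken to hold polynomial basis functions evaluated at the data points, V the energies, W(z) a diagonal weight matrix, and the inverse-distance weight form (with exponent `p` and small constant `eps`) is a common choice for interpolating fits that the slides do not specify. The names `wls_eval` and `imls_weights` are hypothetical.

```python
import numpy as np

def wls_eval(z, x, v, w, degree):
    """Fitted value at z from the weighted least-squares problem.

    Minimising sum_i w_i (B a - V)_i^2 is equivalent to solving
    B^T W B a = B^T W V; it is done here via a sqrt-weighted lstsq.
    """
    # Polynomial basis centred on z: columns (x - z)^0 ... (x - z)^degree.
    B = np.vander(x - z, degree + 1, increasing=True)
    sw = np.sqrt(w)
    a, *_ = np.linalg.lstsq(B * sw[:, None], v * sw, rcond=None)
    return a[0]  # the basis is centred on z, so the value at z is a_0

def imls_weights(z, x, p=2, eps=1e-12):
    """Assumed weight form: very large at a data point, decaying with distance."""
    return 1.0 / (np.abs(z - x) ** (2 * p) + eps)

# Five illustrative points on a Morse-like curve (not data from the slides).
x = np.linspace(0.5, 3.0, 5)
v = (1.0 - np.exp(-(x - 1.0))) ** 2

# First-degree fits evaluated at the data points themselves.
imls = [wls_eval(xi, x, v, imls_weights(xi, x), degree=1) for xi in x]
standard = [wls_eval(xi, x, v, np.ones_like(x), degree=1) for xi in x]
```

With the distance-dependent weights the first-degree fit reproduces each data value, while the W = 1 fit is a single straight line that cannot pass through all five points.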

Weighted vs. standard least squares
[Figure: standard and IMLS fits to the 5 points. Panels: First Degree, Second Degree. IMLS fits perfectly at each point.]

Optimum Point Placement
- We want to do the fewest possible ab initio calculations
- A non-uniform distribution of points is best
- We can use the fact that IMLS fits perfectly at each point to determine where to place points for the most accurate fit using the fewest possible points
- Use IMLS fits of different degree
[Figure: illustration for a 1-D Morse potential with 5 “seed” points]

Automatic Point Placement: 1-D Illustration
- Start with 5 uniformly placed points
- Fit with 2nd- and 3rd-degree IMLS
- Add a new point where they differ the most
The squared difference indicates where new points are needed.
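The 1-D placement loop described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the Morse parameters, the coordinate range, the inverse-distance weight form, and the dense search grid are all chosen here for the example, and the helper names are hypothetical.

```python
import numpy as np

def morse(r, D=1.0, a=1.0, re=1.0):
    """Illustrative 1-D Morse potential (parameters are assumptions)."""
    return D * (1.0 - np.exp(-a * (r - re))) ** 2

def imls_eval(z, x, v, degree, p=2, eps=1e-12):
    """Interpolating weighted least-squares value at z (assumed weight form)."""
    w = 1.0 / (np.abs(z - x) ** (2 * p) + eps)
    B = np.vander(x - z, degree + 1, increasing=True)
    sw = np.sqrt(w)
    a, *_ = np.linalg.lstsq(B * sw[:, None], v * sw, rcond=None)
    return a[0]

def fit_on_grid(grid, x, v, degree):
    return np.array([imls_eval(z, x, v, degree) for z in grid])

def add_points(x, n_new, grid):
    """Repeatedly add a point where the 2nd- and 3rd-degree fits differ most."""
    x = np.array(x, dtype=float)
    for _ in range(n_new):
        v = morse(x)
        diff2 = (fit_on_grid(grid, x, v, 3) - fit_on_grid(grid, x, v, 2)) ** 2
        x = np.sort(np.append(x, grid[np.argmax(diff2)]))
    return x

def rms_error(x, grid):
    """True RMS error of the 3rd-degree fit, available because morse() is cheap."""
    fit = fit_on_grid(grid, x, morse(x), 3)
    return np.sqrt(np.mean((fit - morse(grid)) ** 2))

grid = np.linspace(0.4, 4.0, 400)   # dense grid searched for the maximum
seed = np.linspace(0.4, 4.0, 5)     # 5 uniformly placed seed points
grown = add_points(seed, 10, grid)  # seed points plus 10 automatically placed ones
```

Because both fits interpolate the data, their squared difference vanishes at existing points, so each new point lands in a gap where the fit is least certain. A grid search is adequate only because this example is one-dimensional.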

Automatic Point Placement
[Figure panels: 5 initial points; 1 new point added; 2 new points added; 3 new points added]

Density adaptive weight function Automatic point placement will generate a nonuniform density of points. Thus, we use a flexible, density-dependent weight function

High Dimensional Model Representation (HDMR) basis set
- Can represent a high-dimensional function through an expansion of lower-order terms
- Can also use a full-dimensional expansion but restrict the order of terms differently
- Evaluation scales as NM^2; HDMR greatly reduces M
- This also reduces the number of points required

Accurate PESs from Low-Density Data
Initial testing in 3-D: HCN-HNC
We used the global PES fit to ab initio points by van Mourik et al.* as a source of (cheap) points.
- Saves time obtaining points
- Allows extensive error analyses
We fit using a (12,9,7) HDMR basis:
- 1-coordinate terms truncated at 12th degree
- 2-coordinate terms truncated at 9th degree
- 3-coordinate terms truncated at 7th degree
- 180 basis functions
* T. van Mourik, G. J. Harris, O. L. Polyansky, J. Tennyson, A. G. Császár, and P. J. Knowles, J. Chem. Phys. 115, 3706 (2001).
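The basis-set sizes quoted in this talk can be reproduced with a short counting sketch. The counting convention is an assumption: a k-coordinate term is taken to be a product of powers of exactly k coordinates, each exponent at least 1, with total degree no larger than the k-th truncation, plus one constant term. The function name `hdmr_basis_size` is hypothetical.

```python
from math import comb

def hdmr_basis_size(max_degrees, n_dim):
    """Count polynomial basis functions for an HDMR-style truncation.

    max_degrees[k-1] is the maximum total degree allowed for terms that
    couple exactly k coordinates (assumed convention, see the text above).
    """
    total = 1  # constant term
    for k, d in enumerate(max_degrees, start=1):
        # Choose which k coordinates are coupled, then count exponent
        # tuples with every exponent >= 1 and sum <= d, which is C(d, k).
        total += comb(n_dim, k) * comb(d, k)
    return total

print(hdmr_basis_size((12, 9, 7), 3))     # 3-D (12,9,7) basis
print(hdmr_basis_size((10, 7, 5, 4), 6))  # 6-D (10,7,5,4) basis
print(hdmr_basis_size((10, 7, 5, 5), 6))  # 6-D (10,7,5,5) basis
```

Under this convention the three calls give 180, 591, and 651, matching the 180 functions quoted for the (12,9,7) basis, the minimum of 591 points quoted for the (10,7,5,4) basis, and the 651 functions quoted for the (10,7,5,5) basis elsewhere in the talk.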

Error as a function of automatically selected data points: 3-D HCN-HNC
- Automatic surface generation using (12,9,7) & (11,8,6) bases
- Data points: van Mourik et al. PES
- Seed points: start with 4, 6, & 8 for r, R & cosθ
- Energy cutoff: 100 kcal/mol
[Figure: RMS and mean errors. Solid symbols: successive-order difference. Open symbols: true error.]
The difference in successive orders closely follows the true error. Thus, adding points based on the difference criterion results in a converged true error.

Convergence rate dependence on basis set: HCN
[Figure: RMS error (kcal/mol) vs. number of points]
- Obeys a power law over 3 orders of magnitude
- Accuracy follows Farwig’s* formula for power-law convergence: linear on a log-log plot with slope ~(n+1)/D, where n = degree of basis
- 8th degree & HDMR (12,9,7) both have ~180 functions, but HDMR converges faster
* R. Farwig, J. Comput. Appl. Math. 16, 79 (1986); Math. Comput. 46, 577 (1986).
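A power-law convergence claim of this kind is usually checked by fitting a straight line to log(error) against log(number of points). The sketch below does that on synthetic data generated from the quoted exponent; the numbers are made up for illustration and are not the HCN results, D is assumed to be the number of dimensions, and the function names are hypothetical. Since the error decreases with more points, the fitted slope is negative and its magnitude is what is compared with (n+1)/D.

```python
import math

def predicted_exponent(n, D):
    """Magnitude of the log-log slope, (n + 1) / D, as quoted on the slide."""
    return (n + 1) / D

def loglog_slope(num_points, errors):
    """Least-squares slope of log(error) versus log(number of points)."""
    xs = [math.log(N) for N in num_points]
    ys = [math.log(e) for e in errors]
    xm = sum(xs) / len(xs)
    ym = sum(ys) / len(ys)
    sxy = sum((a - xm) * (b - ym) for a, b in zip(xs, ys))
    sxx = sum((a - xm) ** 2 for a in xs)
    return sxy / sxx

# Synthetic, exactly power-law data (assumption: n = 8, D = 3, prefactor 50).
n, D = 8, 3
num_points = [100, 200, 400, 800, 1600]
errors = [50.0 * N ** (-predicted_exponent(n, D)) for N in num_points]
slope = loglog_slope(num_points, errors)  # close to -(n + 1) / D
```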

Cutting cost: Local IMLS
Cost of evaluation scales as NM^2 for standard IMLS (N = number of ab initio points, M = number of basis functions). High-degree standard IMLS is too costly to use directly, thus we use local IMLS (L-IMLS): local approximants (polynomials) of the potential near data points are calculated using IMLS (expensive) and the interpolated value is taken to be a weighted sum of them.
- In standard IMLS the coefficients are recomputed at each evaluation point (very accurate, but too costly)
- The coefficients are generally slowly varying
- In the L-IMLS approach coefficients are computed & stored at a relatively small number of points
- Evaluations are low-cost weighted interpolations between stored points
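The store-then-interpolate idea can be sketched in 1-D as follows. Several choices here are assumptions made for the example: local expansions are stored at the data points themselves, the same inverse-distance weight is used both for the local fits and for blending them, and the weighted sum is normalised by the total weight. The class name `LocalIMLS` and the Morse test function are hypothetical.

```python
import numpy as np

def morse(r):
    """Illustrative 1-D potential standing in for ab initio data."""
    return (1.0 - np.exp(-(r - 1.0))) ** 2

def imls_coeffs(c, x, v, degree, p=2, eps=1e-12):
    """Coefficients of the weighted least-squares polynomial centred on c."""
    w = 1.0 / (np.abs(c - x) ** (2 * p) + eps)
    B = np.vander(x - c, degree + 1, increasing=True)
    sw = np.sqrt(w)
    a, *_ = np.linalg.lstsq(B * sw[:, None], v * sw, rcond=None)
    return a

class LocalIMLS:
    """Stores one local polynomial per centre; evaluation only blends them."""

    def __init__(self, x, v, degree=3):
        self.centers = np.asarray(x, dtype=float)
        v = np.asarray(v, dtype=float)
        # Expensive step, done once: one least-squares solve per centre.
        self.coeffs = np.array(
            [imls_coeffs(c, self.centers, v, degree) for c in self.centers]
        )

    def __call__(self, z, p=2, eps=1e-12):
        # Cheap step: evaluate every stored polynomial at z and blend.
        w = 1.0 / (np.abs(z - self.centers) ** (2 * p) + eps)
        powers = np.vander(z - self.centers, self.coeffs.shape[1], increasing=True)
        local_values = np.sum(self.coeffs * powers, axis=1)
        return np.sum(w * local_values) / np.sum(w)

x = np.linspace(0.4, 4.0, 12)
v = morse(x)
model = LocalIMLS(x, v, degree=3)
value = model(1.2)  # no linear solve is needed at evaluation time
```

The design point is that the least-squares solves happen once, in the constructor, so each later evaluation costs only a few polynomial evaluations and a weighted average.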

Overcoming the scaling problem for automatic point selection
We get high accuracy & low cost with high-degree L-IMLS, but must find the optimum place to add each ab initio point.
- Trivial in 1-D, as shown earlier
- With L-IMLS the functions whose maxima we seek are continuously globally defined, as are their gradients
- So, define the negative of the squared-difference surface (the difference between successive orders of IMLS)
- We can use efficient minimization schemes such as conjugate gradient to find local minima
- Can also use the variance of weighted contributions to the interpolated value with local IMLS
- Grid or random search scales very poorly with dimension

Method schematic

Automated PES fitting in 3-D: HCN-HNC
- HDMR (12,9,7) basis
- The PES is fit up to 100 kcal/mol
- Used 30 random starting points for minimizations
- Spectroscopic accuracy: to less than 1 cm^-1 within 792 pts with Hessians or 1000 pts with gradients
[Figure annotations: “Basis set not well supported”; “For 0.1 kcal/mol”; “But we can do even better (discussed below)”]

Dynamic Basis Procedure Avoids including points in the seed data that are not optimally located Start with very small initial grid of points & use automatic surface generation with a small basis, successively increasing the basis as points are added

Automated Dynamic Basis: 6-D (HOOH)
- Dynamic basis
- Fit up to 100 kcal/mol
- Fit to analytic H2O2 PES*
- RMS error based on randomly selected test points
- A minimum of 591 pts. would be needed if we started with the (10,7,5,4) basis. We started with 108.
- Convergence also much faster
* B. Kuhn et al., J. Chem. Phys. 111, 2565 (1999)

Spectroscopic Accuracy: 9-D (CH4)
Test case: Schwenke & Partridge PES*, a least-squares fit to ~8000 CCSD(T)/cc-pVTZ ab initio data over the range 0-26,000 cm^-1
- We fit the range 0-20,000 cm^-1 (57.2 kcal/mol)
- Energies & gradients only (Hessian data not cost effective, as shown earlier)
- Bond distances
- Exploited permutation symmetry
- Dynamic basis procedure
* D. W. Schwenke & H. Partridge, Spectrochim. Acta Part A 57, 887 (2001)

Automated PES fitting in 9-D (CH4)
With 1552 pts. the energy-only RMS error is 0.41 kcal/mol, and including gradients brings it down to 0.32 kcal/mol. The RMS error for the Schwenke-Partridge PES (based on 8000 pts) is ~0.35 kcal/mol. The IMLS fitting is essentially automatic, requiring little human effort and no prior knowledge of the topology.
[Figure label: (9,6,4,4)]

A General 3-Atom IMLS-QC Code
- Input file
- Accuracy target
- Energy range
- Basis set
- Number of seed points and coordinate ranges
- Type of coordinates: Jacobi, valence, bond distances
- Generates input files for Gaussian, Molpro, and Aces II
- Energies only or energies & gradients

A New PES for the Methylene Radical
We have generated a spectroscopically accurate PES for CH2 for energies up to 20,000 cm^-1 (216 vibrational states), using CASSCF calculations in valence coordinates. Vibrational levels were computed using a discrete variable representation (DVR) method. DVR typically requires tens of thousands of ab initio points. For a benchmark we performed a DVR calculation using ab initio calculations at all 22,400 DVR points.

Singlet Methylene: fit to energies and gradients
CASSCF calculation in valence coordinates. Energy range of cm^-1.
- Estimated error vs. true error (sets of 500 random ab initio calculations)
- True errors (RMS and mean) are sub-wavenumber using 355 points
- True and estimated errors are in near-perfect agreement
[Figure legend: black, estimated errors; red, true errors]

Singlet Methylene Vibrational Levels: Discrete Variable Representation (DVR) Calculation
[Figure: absolute errors for 216 vibrational levels (below 20,000 cm^-1)]
Variational vibrational calculations were performed using DVR and a PES fitted with a mean estimated error of 2.0 cm^-1. Exact levels were benchmarked by a DVR calculation using ab initio calculations at all 22,400 DVR points.

Singlet Methylene Vibrational Levels: Discrete Variable Representation (DVR) Calculation
[Figure: absolute errors for 216 vibrational levels (below 20,000 cm^-1)]
Variational vibrational calculations were performed using a DVR and fitted PESs with mean estimated errors of 0.5 cm^-1. Exact levels were benchmarked by a DVR calculation using ab initio calculations at all 22,400 DVR points.

Singlet Methylene Vibrational Levels: Comparisons
[Figure panels: 2.0 cm^-1 mean estimated error; 0.5 cm^-1 mean estimated error]

Singlet Methylene Vibrational Levels: Discrete Variable Representation (DVR) Calculation
[Figure: absolute errors for 216 vibrational levels (below 20,000 cm^-1)]
Variational vibrational calculations were performed using a DVR and a PES fitted with a mean estimated error of 0.33 cm^-1. Exact levels were benchmarked by a DVR calculation using ab initio calculations at all 22,400 DVR points. Mean and maximum errors for levels computed with this PES are 0.10 and 0.41 cm^-1.

Singlet Methylene Vibrational Levels: Comparisons
[Figure panels: 2.0 cm^-1 mean estimated error; 0.33 cm^-1 mean estimated error]

IMLS & Classical Trajectories: Preliminary Efforts
Two different approaches (both under development):
- IMLS-accelerated direct dynamics
- Dynamics-driven fitting
In both cases IMLS “intercepts” ab initio PES calls & the electronic structure code is called only if necessary (based on an error estimate).

Accelerated Direct Dynamics
Test case: HONO cis-trans isomerization
- Trajectories were initiated with 8 quanta in the HON bend to cause rapid IVR & then isomerization (we want rapid exploration of configuration space)
- Integration step size: 0.05 fs
- Trajectories were stopped once they spent 3 times the period of the torsion mode in the range of the trans torsion angle or violated the energy conservation criterion
- Used HF/cc-pVDZ: we want a fast ab initio calculation to test the method
IMLS “intercepts” direct dynamics ab initio PES calls. The electronic structure code is called only if necessary (based on an error estimate). Data collection trajectories are moved back in time if the rare event of adding new ab initio data occurs.
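The “intercept” logic can be sketched with a toy 1-D stand-in. None of this is the authors' code: the expensive call is replaced by a cheap Morse function, the error estimate is assumed to be the difference between 2nd- and 3rd-degree interpolating fits (the successive-order criterion used earlier in the talk), the "trajectory" is just a prescribed sequence of coordinates, and moving trajectories back in time after new data is added is not modelled. The names `InterceptedPES` and `expensive_energy` are hypothetical.

```python
import numpy as np

def expensive_energy(q):
    """Stand-in for an ab initio call (a 1-D Morse curve, purely illustrative)."""
    return (1.0 - np.exp(-(q - 1.0))) ** 2

def imls_eval(z, x, v, degree, p=2, eps=1e-12):
    """Interpolating weighted least-squares value at z (assumed weight form)."""
    w = 1.0 / (np.abs(z - x) ** (2 * p) + eps)
    B = np.vander(x - z, degree + 1, increasing=True)
    sw = np.sqrt(w)
    a, *_ = np.linalg.lstsq(B * sw[:, None], v * sw, rcond=None)
    return a[0]

class InterceptedPES:
    """Answers from the fit when the error estimate allows, else calls the source."""

    def __init__(self, seed_points, tol):
        self.x = np.array(seed_points, dtype=float)
        self.v = expensive_energy(self.x)
        self.tol = tol
        self.n_calls = len(self.x)  # seed points count as expensive calls

    def energy(self, q):
        high = imls_eval(q, self.x, self.v, 3)
        low = imls_eval(q, self.x, self.v, 2)
        if abs(high - low) <= self.tol:
            return high  # the fit is trusted here, no expensive call
        # Estimated error too large: call the source and grow the data set.
        e = expensive_energy(q)
        self.x = np.append(self.x, q)
        self.v = np.append(self.v, e)
        self.n_calls += 1
        return e

pes = InterceptedPES(np.linspace(0.6, 2.4, 5), tol=1e-3)
trajectory = 1.5 + 0.8 * np.sin(0.1 * np.arange(200))  # toy coordinate sequence
energies = [pes.energy(q) for q in trajectory]
evaluations_per_call = len(trajectory) / pes.n_calls
```

Loosening `tol` should raise the number of evaluations served per expensive call, which is the trade-off between speedup and error tolerance that the talk describes.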

Accelerated direct dynamics with IMLS: HONO
- (10,7,5,5) basis of 651 functions
- Values and gradients used
- The fit began after 25 ab initio "seed" points were generated
- Factor of ~20 speedup with 0.06 drift in total energy
- Speedup depends on error tolerance
- 7.6 evaluations per ab initio call for error tolerance
- 76.3 evaluations per ab initio call for error tolerance

Dynamics Driven Fitting: HONO cis-trans isomerization rate A series of sets of trajectories, with various energy conservation limits, are used to explore configuration space.

Accelerated direct dynamics: HONO cis-trans isomerization rate Results for PESs fit with 8 different maximum error tolerances

Concluding Comments
- IMLS allows automated generation of PESs for various applications: spectroscopy, dynamics
- Flexible fits to energies, energies and gradients, or higher derivatives
- Interfaced to a general classical trajectory code: GenDyn
- Interfaced to electronic structure codes: Gaussian, Molpro, Aces II
- Robust, efficient, practical methods that assure fidelity to the ab initio data