
1 Saturday, 02 May 2015. 2005 Taipei Summer Institute on Strings, Particles and Fields. Simulation Algorithms for Lattice QCD. A D Kennedy, School of Physics, The University of Edinburgh.

2 Contents. Monte Carlo methods; Functional Integrals and QFT; Central Limit Theorem; Importance Sampling; Markov Chains; Convergence of Markov Chains; Autocorrelations; Hybrid Monte Carlo; MDMC; Partial Momentum Refreshment; Symplectic Integrators; Dynamical fermions; Grassmann algebras; Reversibility; PHMC; RHMC; Chiral Fermions; On-shell chiral symmetry; Neuberger’s Operator; Into Five Dimensions; Kernel; Schur Complement; Constraint; Approximation; tanh; Золотарев; Representation; Continued Fraction; Partial Fraction; Cayley Transform; Chiral Symmetry Breaking; Numerical Studies; Conclusions; Approximation Theory; Polynomial approximation; Weierstraß’ theorem; Бернштейн polynomials; Чебышев’s theorem; Чебышев polynomials; Чебышев approximation; Elliptic functions; Liouville’s theorem; Weierstraß elliptic functions; Expansion of elliptic functions; Addition formulæ; Jacobian elliptic functions; Modular transformations; Золотарев’s formula; Arithmetico-Geometric mean; Conclusions; Bibliography; Hasenbusch Acceleration.

3 Functional Integrals and QFT. The expectation value of an operator Ω is defined non-perturbatively by the functional integral $\langle\Omega\rangle = \frac{1}{Z}\int d\phi\, e^{-S(\phi)}\,\Omega(\phi)$. The action is S(φ). The normalisation constant Z is chosen such that $\langle 1\rangle = 1$. dφ is the appropriate functional measure. The integral is defined in Euclidean space-time with a lattice regularisation. Continuum limit: lattice spacing a → 0. Thermodynamic limit: physical volume V → ∞.

4 Monte Carlo methods: I. Monte Carlo integration is based on the identification of probabilities with measures. There are much better methods of carrying out low dimensional quadrature. All other methods become hopelessly expensive for large dimensions. In lattice QFT there is one integration per degree of freedom. We are approximating an infinite dimensional functional integral.

5 Monte Carlo methods: II. Generate a sequence of random field configurations $(\phi_1, \phi_2, \ldots, \phi_N)$ chosen from the probability distribution $P(\phi)\,d\phi = \frac{1}{Z}e^{-S(\phi)}\,d\phi$. Measure the value of Ω on each configuration and compute the average $\bar\Omega = \frac{1}{N}\sum_{t=1}^{N}\Omega(\phi_t)$.

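6 Central Limit Theorem: I. Law of Large Numbers: $\langle\Omega\rangle = \lim_{N\to\infty}\bar\Omega$. Central Limit Theorem: $\bar\Omega \sim \langle\Omega\rangle + O\left(\sqrt{C_2/N}\right)$, where the variance of the distribution of Ω is $C_2 = \langle(\Omega - \langle\Omega\rangle)^2\rangle$. The Laplace–DeMoivre Central Limit theorem is an asymptotic expansion for the probability distribution of $\bar\Omega$. Distribution of values for a single sample ω = Ω(φ): $P_\Omega(\omega) = \int d\phi\,P(\phi)\,\delta(\omega - \Omega(\phi))$.

7 Central Limit Theorem: II. Generating function for connected moments: $W_\Omega(k) = \ln\int d\omega\,P_\Omega(\omega)\,e^{ik\omega} = \ln\langle e^{ik\Omega}\rangle = \sum_{n=0}^{\infty}\frac{(ik)^n}{n!}C_n$. The first few cumulants are $C_0 = 0$, $C_1 = \langle\Omega\rangle$, $C_2 = \langle(\Omega-\langle\Omega\rangle)^2\rangle$, $C_3 = \langle(\Omega-\langle\Omega\rangle)^3\rangle$, $C_4 = \langle(\Omega-\langle\Omega\rangle)^4\rangle - 3C_2^2$. Note that this is an asymptotic expansion.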

8 Central Limit Theorem: III. Distribution of the average of N samples. Connected generating function.

9 Central Limit Theorem: IV. Take inverse Fourier transform to obtain distribution.

10 Central Limit Theorem: V. Re-scale to show convergence to Gaussian distribution.

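11 Importance Sampling: I. Integral: $I = \int dx\,f(x)$. Sample from distribution ρ(x). Probability: $0 < \rho(x)$ almost everywhere. Normalisation: $N = \int dx\,\rho(x) = 1$. Estimator of integral: $I = \int \rho(x)\,dx\,\frac{f(x)}{\rho(x)}$. Estimator of variance: $V = \int \rho(x)\,dx\left(\frac{f(x)}{\rho(x)} - I\right)^2 = \int dx\,\frac{f(x)^2}{\rho(x)} - I^2$.

12 Importance Sampling: II. Minimise variance subject to the constraint N = 1, using a Lagrange multiplier λ: $\frac{\delta(V + \lambda N)}{\delta\rho(y)} = -\frac{f(y)^2}{\rho(y)^2} + \lambda = 0$. Optimal measure: $\rho_{\rm opt}(x) = \frac{|f(x)|}{\int dx\,|f(x)|}$. Optimal variance: $V_{\rm opt} = \left(\int dx\,|f(x)|\right)^2 - \left(\int dx\,f(x)\right)^2$.

13 Importance Sampling: III. Example: $f(x) = \sin(x)$ for $0 \le x \le 2\pi$. Optimal weight: $\rho(x) = \tfrac{1}{4}|\sin(x)|$. Optimal variance: $V_{\rm opt} = 16$.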

14 Importance Sampling: IV.
1 Construct cheap approximation to |sin(x)|
2 Calculate relative area within each rectangle
3 Choose a random number uniformly
4 Select corresponding rectangle
5 Choose another random number uniformly
6 Select corresponding x value within rectangle
7 Compute |sin(x)|

15 Importance Sampling: V. With 100 rectangles we have V = 16.02328561. But we can do better! With 100 rectangles we then have V = 0.011642808.
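The seven steps of the rectangle method can be sketched in Python as follows. This is a minimal illustration, not the code behind the quoted numbers: it assumes the integrand is f(x) = sin(x) on [0, 2π], takes the rectangle heights from |sin(x)| at the rectangle midpoints, and uses hypothetical function names. With these assumptions the measured variance should come out close to the optimal value, but need not reproduce the digits quoted on the slide.

```python
import bisect
import math
import random


def build_rectangles(n):
    """Piecewise-constant approximation to |sin(x)| on [0, 2*pi] with n rectangles."""
    width = 2.0 * math.pi / n
    # Midpoint heights are an assumption; the slides do not say how heights are chosen.
    heights = [abs(math.sin((i + 0.5) * width)) for i in range(n)]
    total = sum(heights)
    probs = [h / total for h in heights]      # relative area of each rectangle
    cumulative, running = [], 0.0
    for p in probs:
        running += p
        cumulative.append(running)
    return width, probs, cumulative


def estimate(n_rect=100, n_samples=200_000, seed=1):
    """Estimate I = integral of sin(x) over [0, 2*pi] and the variance V of f/rho."""
    rng = random.Random(seed)
    width, probs, cumulative = build_rectangles(n_rect)
    total = total_sq = 0.0
    for _ in range(n_samples):
        # Steps 3-4: a uniform random number selects a rectangle by its relative area.
        i = min(bisect.bisect_left(cumulative, rng.random()), n_rect - 1)
        # Steps 5-6: a second uniform random number selects x inside that rectangle.
        x = (i + rng.random()) * width
        rho = probs[i] / width                # normalised sampling density at x
        w = math.sin(x) / rho                 # step 7: evaluate the integrand, reweighted
        total += w
        total_sq += w * w
    mean = total / n_samples
    return mean, total_sq / n_samples - mean * mean


if __name__ == "__main__":
    print(estimate())
```

Selecting the rectangle by bisection on the cumulative areas keeps each sample at O(log n) cost, which is what makes the approximation "cheap" relative to evaluating the integrand.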

16 Markov Chains: I. State space Ω. (Ergodic) stochastic transitions P′: Ω → Ω. Deterministic evolution of probability distribution P: Q → Q. Distribution converges to unique fixed point.

17 Convergence of Markov Chains: I. Define a metric $d(Q_1, Q_2)$ on the space of (equivalence classes of) probability distributions. Prove that $d(PQ_1, PQ_2) \le (1-\alpha)\,d(Q_1, Q_2)$ with α > 0, so the Markov process P is a contraction mapping. The sequence Q, PQ, P²Q, P³Q,… is Cauchy. The space of probability distributions is complete, so the sequence converges to a unique fixed point.

18 Convergence of Markov Chains: II

19 Convergence of Markov Chains: III

20 Markov Chains: II. Use Markov chains to sample from Q. Suppose we can construct an ergodic Markov process P which has distribution Q as its fixed point. Start with an arbitrary state (“field configuration”). Iterate the Markov process until it has converged (“thermalized”). Thereafter, successive configurations will be distributed according to Q, but in general they will be correlated. To construct P we only need relative probabilities of states: we do not know the normalisation of Q. We cannot use Markov chains to compute integrals directly, but we can compute ratios of integrals.

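21 Markov Chains: III. How do we construct a Markov process with a specified fixed point $Q(x) = \int dy\,P(x \leftarrow y)\,Q(y)$? Detailed balance: $P(y \leftarrow x)\,Q(x) = P(x \leftarrow y)\,Q(y)$. Integrate w.r.t. y to obtain the fixed point condition; detailed balance is sufficient but not necessary for the fixed point. Metropolis algorithm: $P(x \leftarrow y) = \min\left(1, \frac{Q(x)}{Q(y)}\right)$. Consider the cases Q(x) > Q(y) and Q(x) < Q(y) separately to obtain the detailed balance condition; the Metropolis choice is sufficient but not necessary for detailed balance. Other choices are possible, e.g., $P(x \leftarrow y) = \frac{Q(x)}{Q(x) + Q(y)}$.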
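A minimal Metropolis sketch in Python, assuming a one-dimensional state and a symmetric uniform proposal (both are illustrative choices, as are the function and parameter names). It only ever uses the ratio of the target density at two points, so the normalisation of the fixed point distribution is never needed.

```python
import math
import random


def metropolis(log_q, x0, step, n_steps, seed=0):
    """Sample from the unnormalised density exp(log_q(x)).

    The proposal y = x + uniform(-step, step) is symmetric, so accepting with
    probability min(1, Q(y)/Q(x)) satisfies detailed balance.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    accepted = 0
    for _ in range(n_steps):
        y = x + rng.uniform(-step, step)
        # log(u) < log Q(y) - log Q(x) is equivalent to u < min(1, Q(y)/Q(x)).
        if math.log(1.0 - rng.random()) < log_q(y) - log_q(x):
            x = y
            accepted += 1
        samples.append(x)     # a rejected proposal repeats the old state
    return samples, accepted / n_steps


if __name__ == "__main__":
    chain, rate = metropolis(lambda x: -0.5 * x * x, 0.0, 2.0, 50_000)
    print(rate, sum(chain) / len(chain))
```

Recording the old state again after a rejection matters: dropping rejected steps would change the fixed point.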

22 Markov Chains: IV. Composition of Markov steps. Let P₁ and P₂ be two Markov steps which have the desired fixed point distribution. They need not be ergodic. Then the composition of the two steps P₂ ∘ P₁ will also have the desired fixed point, and it may be ergodic. This trivially generalises to any (fixed) number of steps. For the case where P₁ is not ergodic but (P₁)ⁿ is, the terms weakly and strongly ergodic are sometimes used.

23 Markov Chains: V. This result justifies “sweeping” through a lattice performing single site updates. Each individual single site update has the desired fixed point because it satisfies detailed balance. The entire sweep therefore has the desired fixed point, and is ergodic, but the entire sweep does not satisfy detailed balance. Of course it would satisfy detailed balance if the sites were updated in a random order, but this is not necessary, and it is undesirable because it puts too much randomness into the system.

24 Markov Chains: VI. Coupling from the Past, Propp and Wilson (1996). Use a fixed set of random numbers. Flypaper principle: if states coalesce they stay together forever. Eventually, all states coalesce to some single state with probability one. Any state from t = −∞ will coalesce to that same state, which is a sample from the fixed point distribution.

25 Autocorrelations: I. Exponential autocorrelations. The unique fixed point of an ergodic Markov process corresponds to the unique eigenvector with eigenvalue 1. All its other eigenvalues must lie within the unit circle. In particular, the largest subleading eigenvalue satisfies $|\lambda_{\max}| < 1$. This corresponds to the exponential autocorrelation time $N_{\exp} = -\frac{1}{\ln|\lambda_{\max}|} > 0$.

26 Autocorrelations: II. Integrated autocorrelations. Consider the autocorrelation of some operator Ω. Without loss of generality we may assume $\langle\Omega\rangle = 0$. Define the autocorrelation function $C_\Omega(\ell) = \frac{\langle\Omega(\phi_{t+\ell})\,\Omega(\phi_t)\rangle}{\langle\Omega(\phi)^2\rangle}$.

27 Autocorrelations: III. The autocorrelation function must fall faster than the exponential autocorrelation. Define the integrated autocorrelation function $A_\Omega = \sum_{\ell=1}^{\infty} C_\Omega(\ell)$. For a sufficiently large number of samples the variance of the average is $\langle\bar\Omega^2\rangle \approx (1 + 2A_\Omega)\,\frac{\langle\Omega^2\rangle}{N}$.
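As an illustration of the integrated autocorrelation, the sketch below estimates it on a synthetic AR(1) chain, for which the autocorrelation at lag ℓ is a^ℓ and the sum over ℓ ≥ 1 is a/(1 − a). The AR(1) test chain, the fixed summation window and all names are assumptions made for the example; production analyses usually choose the window adaptively.

```python
import math
import random


def ar1_chain(a, n, seed=0):
    """Unit-variance AR(1) chain x' = a x + sqrt(1 - a^2) * noise, with |a| < 1."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)
    scale = math.sqrt(1.0 - a * a)
    chain = []
    for _ in range(n):
        x = a * x + scale * rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def autocorrelation(chain, lag):
    """Normalised autocorrelation C(lag) estimated from a single chain."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((v - mean) ** 2 for v in chain) / n
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean)
              for t in range(n - lag)) / (n - lag)
    return cov / var


def integrated_autocorrelation(chain, window):
    """Sum of C(lag) for lag = 1..window (a truncated estimate of the infinite sum)."""
    return sum(autocorrelation(chain, lag) for lag in range(1, window + 1))


if __name__ == "__main__":
    data = ar1_chain(0.5, 50_000, seed=3)
    print(integrated_autocorrelation(data, 20))   # exact value for a = 0.5 is 1
```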

28 Hybrid Monte Carlo: I. In order to carry out Monte Carlo computations including the effects of dynamical fermions we would like to find an algorithm which: updates the fields globally, because single link updates are not cheap if the action is not local; takes large steps through configuration space, because small-step methods carry out a random walk which leads to critical slowing down with a dynamical critical exponent z = 2; and does not introduce any systematic errors. z relates the autocorrelation to the correlation length ξ of the system, $A_\Omega \propto \xi^z$.

29 Hybrid Monte Carlo: II. A useful class of algorithms with these properties is the (Generalised) Hybrid Monte Carlo (HMC) method. Introduce a “fictitious” momentum p corresponding to each dynamical degree of freedom q. Find a Markov chain with fixed point ∝ exp[−H(q,p)], where H is the “fictitious” Hamiltonian ½p² + S(q). The action S of the underlying QFT plays the rôle of the potential in the “fictitious” classical mechanical system. This gives the evolution of the system in a fifth dimension, “fictitious” or computer time. This generates the desired distribution exp[−S(q)] if we ignore the momenta p (i.e., the marginal distribution).

30 Hybrid Monte Carlo: III. The HMC Markov chain alternates two Markov steps: Molecular Dynamics Monte Carlo (MDMC) and (Partial) Momentum Refreshment. Both have the desired fixed point. Together they are ergodic.

31 MDMC: I. If we could integrate Hamilton’s equations exactly we could follow a trajectory of constant fictitious energy. This corresponds to a set of equiprobable fictitious phase space configurations. Liouville’s theorem tells us that this also preserves the functional integral measure dp dq as required. Any approximate integration scheme which is reversible and area preserving may be used to suggest configurations to a Metropolis accept/reject test, with acceptance probability min[1, exp(−δH)].

32 MDMC: II. We build the MDMC step out of three parts: Molecular Dynamics (MD), an approximate integrator U(τ): (q,p) → (q′,p′) which is exactly area preserving and reversible; a momentum flip F: p → −p; and a Metropolis accept/reject step. The composition of these gives $(q', p') = \left[F\,U(\tau)\,\theta(e^{-\delta H} - y) + \theta(y - e^{-\delta H})\right](q, p)$, with y being a uniformly distributed random number in [0,1].
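A compact sketch of one way to combine these parts, for a single degree of freedom with fictitious Hamiltonian ½p² + S(q). It assumes full momentum refreshment at the start of every trajectory (so the momentum flip has no effect and is omitted) and a leapfrog integrator; the function names and the toy Gaussian action in the usage lines are illustrative assumptions.

```python
import math
import random


def leapfrog(q, p, force, dt, n_steps):
    """Reversible, area-preserving integrator: half kick, drifts and kicks, half kick."""
    p += 0.5 * dt * force(q)
    for step in range(n_steps):
        q += dt * p
        if step != n_steps - 1:
            p += dt * force(q)
    p += 0.5 * dt * force(q)
    return q, p


def hmc(action, force, q0, dt, n_steps, n_traj, seed=0):
    """HMC for one variable; force(q) must return -dS/dq."""
    rng = random.Random(seed)
    q = q0
    samples = []
    accepted = 0
    for _ in range(n_traj):
        p = rng.gauss(0.0, 1.0)                    # momentum refreshment
        h_old = 0.5 * p * p + action(q)
        q_new, p_new = leapfrog(q, p, force, dt, n_steps)
        delta_h = 0.5 * p_new * p_new + action(q_new) - h_old
        # Metropolis test with acceptance probability min[1, exp(-delta_h)].
        if rng.random() < math.exp(min(0.0, -delta_h)):
            q = q_new
            accepted += 1
        samples.append(q)
    return samples, accepted / n_traj


if __name__ == "__main__":
    qs, rate = hmc(lambda q: 0.5 * q * q, lambda q: -q, 0.0, 0.2, 5, 5_000)
    print(rate)
```

Because the accept/reject step corrects for the integration error, the step size only affects the acceptance rate, not the distribution being sampled.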

33 Partial Momentum Refreshment. This mixes the Gaussian distributed momenta p with Gaussian noise ξ through a rotation by an angle θ. The Gaussian distribution of p is invariant under F. The extra momentum flip F ensures that for small θ the momenta are reversed after a rejection rather than after an acceptance. For θ = π/2 all momentum flips are irrelevant.

34 Hybrid Monte Carlo: IV. Special cases. The usual Hybrid Monte Carlo (HMC) algorithm is the special case where θ = π/2. θ = 0 corresponds to an exact version of the Molecular Dynamics (MD) or Microcanonical algorithm (which is in general non-ergodic). The L2MC algorithm of Horowitz corresponds to choosing arbitrary θ but MDMC trajectories of a single leapfrog integration step (τ = δτ). This method is also called Kramers algorithm.

35 Hybrid Monte Carlo: V. Further special cases. The Langevin Monte Carlo algorithm corresponds to choosing θ = π/2 and MDMC trajectories of a single leapfrog integration step (τ = δτ). The Hybrid and Langevin algorithms are approximations where the Metropolis step is omitted. The Local Hybrid Monte Carlo (LHMC) or Overrelaxation algorithm corresponds to updating a subset of the degrees of freedom (typically those living on one site or link) at a time.

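36 Symplectic Integrators: I. Baker-Campbell-Hausdorff (BCH) formula. If A and B belong to any (non-commutative) algebra then $e^A e^B = e^{A+B+\delta}$, where δ is constructed from commutators of A and B (i.e., is in the Free Lie Algebra generated by {A,B}). More precisely, $\ln(e^A e^B) = \sum_{n\ge1} c_n$ where $c_1 = A + B$ and the higher $c_n$ are nested commutators determined recursively.

37 Symplectic Integrators: II. Explicitly, the first few terms are $\ln(e^A e^B) = A + B + \tfrac{1}{2}[A,B] + \tfrac{1}{12}\left([A,[A,B]] - [B,[A,B]]\right) - \tfrac{1}{24}[B,[A,[A,B]]] + \cdots$. In order to construct reversible integrators we use symmetric symplectic integrators. The following identity follows directly from the BCH formula: $\ln(e^{A/2} e^B e^{A/2}) = A + B - \tfrac{1}{24}\left([A,[A,B]] + 2[B,[A,B]]\right) + \cdots$.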

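38 Symplectic Integrators: III. We are interested in finding the classical trajectory in phase space of a system described by the Hamiltonian $H(q,p) = T(p) + S(q) = \tfrac{1}{2}p^2 + S(q)$. The basic idea of such a symplectic integrator is to write the time evolution operator as $\exp\left(\tau\frac{d}{dt}\right) = \exp\left(\tau\left\{-\frac{\partial H}{\partial q}\frac{\partial}{\partial p} + \frac{\partial H}{\partial p}\frac{\partial}{\partial q}\right\}\right) = \exp\left(\tau\left\{-S'(q)\frac{\partial}{\partial p} + T'(p)\frac{\partial}{\partial q}\right\}\right)$.

39 Symplectic Integrators: IV. Define $Q \equiv T'(p)\frac{\partial}{\partial q}$ and $P \equiv -S'(q)\frac{\partial}{\partial p}$ so that the evolution operator is $e^{\tau(P+Q)}$. Since the kinetic energy T is a function only of p and the potential energy S is a function only of q, it follows that the action of $e^{\tau P}$ and $e^{\tau Q}$ may be evaluated trivially: $e^{\tau Q}: f(q,p) \mapsto f(q + \tau T'(p), p)$ and $e^{\tau P}: f(q,p) \mapsto f(q, p - \tau S'(q))$.

40 Symplectic Integrators: V. From the BCH formula we find that the PQP symmetric symplectic integrator is given by $\left(e^{\frac{1}{2}\delta\tau P}\,e^{\delta\tau Q}\,e^{\frac{1}{2}\delta\tau P}\right)^{\tau/\delta\tau} = \exp\left(\tau\left[(P+Q) - \tfrac{1}{24}\left([P,[P,Q]] + 2[Q,[P,Q]]\right)\delta\tau^2 + O(\delta\tau^4)\right]\right)$. In addition to conserving energy to O(δτ²) such symmetric symplectic integrators are manifestly area preserving and reversible.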
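The two properties claimed for the PQP integrator, reversibility and an energy error of order δτ², can be checked numerically. The sketch below assumes a toy harmonic action S(q) = ½q² (so S′(q) = q) and hypothetical helper names; it is an illustration, not part of the original material.

```python
def pqp_step(q, p, dt):
    """One PQP step for H = p^2/2 + q^2/2."""
    p -= 0.5 * dt * q      # exp(dt/2 P): p -> p - (dt/2) S'(q)
    q += dt * p            # exp(dt Q):   q -> q + dt T'(p)
    p -= 0.5 * dt * q      # exp(dt/2 P)
    return q, p


def integrate(q, p, dt, n_steps):
    for _ in range(n_steps):
        q, p = pqp_step(q, p, dt)
    return q, p


def energy(q, p):
    return 0.5 * p * p + 0.5 * q * q


def energy_error(dt, tau=1.0, q0=1.0, p0=0.0):
    """Change in H along a trajectory of length tau integrated with step dt."""
    q, p = integrate(q0, p0, dt, round(tau / dt))
    return energy(q, p) - energy(q0, p0)


def reversibility_error(dt, n_steps, q0=1.0, p0=0.0):
    """Integrate, flip the momentum, integrate again, flip back; compare with the start."""
    q, p = integrate(q0, p0, dt, n_steps)
    q, p = integrate(q, -p, dt, n_steps)
    return abs(q - q0) + abs(-p - p0)


if __name__ == "__main__":
    # Halving the step size should reduce the energy error by roughly a factor of 4.
    print(energy_error(0.1) / energy_error(0.05))
    print(reversibility_error(0.1, 10))
```

In exact arithmetic the reversibility error is zero; the small residual seen in floating point is the rounding effect discussed under Reversibility.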

41 Symplectic Integrators: VI. For each symplectic integrator there exists a Hamiltonian H′ which is exactly conserved. For the PQP integrator the corresponding vector field is $(P+Q) - \tfrac{1}{24}\left([P,[P,Q]] + 2[Q,[P,Q]]\right)\delta\tau^2 + O(\delta\tau^4)$.

42 Symplectic Integrators: VII. Substituting in the exact forms for the operations P and Q we obtain the vector field explicitly.

43 Symplectic Integrators: VIII. Solving these differential equations for H′ we find $H' = H + \tfrac{1}{24}\left\{2p^2 S''(q) - S'(q)^2\right\}\delta\tau^2 + O(\delta\tau^4)$. Note that H′ cannot be written as the sum of a p-dependent kinetic term and a q-dependent potential term.

44 Dynamical fermions: I. Fermion fields are Grassmann valued. This is required to satisfy the spin-statistics theorem. Even “classical” Fermion fields obey anticommutation relations. Grassmann algebras behave like negative dimensional manifolds.

45 Grassmann algebras: I. Linear space spanned by generators {θ₁, θ₂, θ₃,…} with coefficients a, b,… in some field (usually the real or complex numbers). Algebra structure defined by the nilpotency condition α² = 0 for elements α of the linear space. There are many elements of the algebra which are not in the linear space (e.g., θ₁θ₂). Nilpotency implies anticommutativity αβ + βα = 0, since 0 = α² = β² = (α + β)² = α² + αβ + βα + β² = αβ + βα. Anticommutativity implies nilpotency, 2α² = 0, unless the coefficient field has characteristic 2, i.e., 2 = 0.

46 Grassmann algebras: II. Grassmann algebras have a natural grading corresponding to the number of generators in a given product: deg(1) = 0, deg(θᵢ) = 1, deg(θᵢθⱼ) = 2,... All elements of the algebra can be decomposed into a sum of terms of definite grading. The parity transform of α is $P(\alpha) = (-1)^{\deg(\alpha)}\alpha$. A natural antiderivation is defined on a Grassmann algebra. Linearity: d(aα + bβ) = a dα + b dβ. Anti-Leibniz rule: d(αβ) = (dα)β + P(α)(dβ).

47 Grassmann algebras: III. Definite integration on a Grassmann algebra is defined to be the same as derivation. Hence change of variables leads to the inverse Jacobian. Gaussians over Grassmann manifolds: like all functions, Gaussians are polynomials over Grassmann algebras. There is no reason for this function to be positive even for real coefficients.

48 Grassmann algebras: IV. Gaussian integrals over Grassmann manifolds are given by Pf(a), where Pf(a) is the Pfaffian, Pf(a)² = det(a). Despite the notation, this is a purely algebraic identity. It does not require the matrix a > 0, unlike its bosonic analogue. If we separate the Grassmann variables into two “conjugate” sets then we find the more familiar result in terms of a determinant.
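The identity Pf(a)² = det(a) can be checked directly for a 4×4 antisymmetric matrix, where the Pfaffian has the closed form a₀₁a₂₃ − a₀₂a₁₃ + a₀₃a₁₂. The sketch below uses exact rational arithmetic; the helper names and the sample entries are illustrative assumptions.

```python
from fractions import Fraction


def antisymmetric4(a01, a02, a03, a12, a13, a23):
    """Build a 4x4 antisymmetric matrix from its six independent entries."""
    upper = {(0, 1): a01, (0, 2): a02, (0, 3): a03,
             (1, 2): a12, (1, 3): a13, (2, 3): a23}
    m = [[Fraction(0)] * 4 for _ in range(4)]
    for (i, j), value in upper.items():
        m[i][j] = Fraction(value)
        m[j][i] = -Fraction(value)
    return m


def pfaffian4(a):
    """Pfaffian of a 4x4 antisymmetric matrix."""
    return a[0][1] * a[2][3] - a[0][2] * a[1][3] + a[0][3] * a[1][2]


def determinant(m):
    """Determinant by cofactor expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * determinant(minor)
    return total


if __name__ == "__main__":
    a = antisymmetric4(1, -2, 3, 4, 5, -6)
    print(pfaffian4(a), determinant(a))
```

A matrix such as `antisymmetric4(1, 0, 0, 0, 0, -1)` has a negative Pfaffian even though all its entries are real, which illustrates why positivity cannot be taken for granted.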

49 Dynamical fermions: II. Pseudofermions. Direct simulation of Grassmann fields is not feasible. The problem is not that of manipulating anticommuting values in a computer. It is that $e^{-S_F} = e^{-\bar\psi M\psi}$ is not positive, and thus we get poor importance sampling. We therefore integrate out the fermion fields to obtain the fermion determinant, $\int d\bar\psi\,d\psi\,e^{-\bar\psi M\psi} \propto \det M$. ψ and ψ̄ always occur quadratically. The overall sign of the exponent is unimportant.

50 Dynamical fermions: III. Any operator Ω can be expressed solely in terms of the bosonic fields. E.g., the fermion propagator is $G(x,y) = \langle\psi(x)\bar\psi(y)\rangle = M^{-1}(x,y)$.

51 Dynamical fermions: IV. Including the determinant as part of the observable to be measured is not feasible. The determinant is extensive in the lattice volume, thus again we get poor importance sampling. Represent the fermion determinant as a bosonic Gaussian integral with a non-local kernel, $\det M \propto \int d\bar\phi\,d\phi\,\exp\left[-\bar\phi M^{-1}\phi\right]$. The fermion kernel must be positive definite (all its eigenvalues must have positive real parts) otherwise the bosonic integral will not converge. The new bosonic fields are called pseudofermions.

52 Dynamical fermions: V. It is usually convenient to introduce two flavours of fermion and to write $\det(M^\dagger M) \propto \int d\bar\phi\,d\phi\,\exp\left[-\bar\phi\,(M^\dagger M)^{-1}\phi\right]$. This not only guarantees positivity, but also allows us to generate the pseudofermions from a global heatbath by applying M† to a random Gaussian distributed field. The equations of motion for the boson (gauge) fields follow from this action. The evaluation of the pseudofermion action and the corresponding force then requires us to find the solution of a (large) set of linear equations.

53 Dynamical fermions: VI. It is not necessary to carry out the inversions required for the equations of motion exactly. There is a trade-off between the cost of computing the force and the acceptance rate of the Metropolis MDMC step. The inversions required to compute the pseudofermion action for the accept/reject step do need to be computed exactly, however. We usually take “exactly” to be synonymous with “to machine precision”.

54 Reversibility: I. Are HMC trajectories reversible and area preserving in practice? The only fundamental source of irreversibility is the rounding error caused by using finite precision floating point arithmetic. Floating point arithmetic is not associative. It is more natural to store compact variables as scaled integers (fixed point): this saves memory, but does not solve the precision problem. For fermionic systems we can also introduce irreversibility by choosing the starting vector for the iterative linear equation solver time-asymmetrically. We do this if we want to use a Chronological Inverter, which takes (some extrapolation of) the previous solution as the starting vector.
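The non-associativity of floating point addition is easy to exhibit, and integer (fixed point) addition does not suffer from it. The snippet below is only a small illustration; the scale factor used for the fixed point representation is an arbitrary assumption.

```python
def left_sum(a, b, c):
    return (a + b) + c


def right_sum(a, b, c):
    return a + (b + c)


def to_fixed(x, scale=10**6):
    """Store x as a scaled integer (fixed point); the scale is an arbitrary choice."""
    return round(x * scale)


if __name__ == "__main__":
    # The two groupings differ in the last bit for IEEE double precision.
    print(left_sum(0.1, 0.2, 0.3) == right_sum(0.1, 0.2, 0.3))
    # Integer addition is exactly associative, whatever the grouping.
    fa, fb, fc = to_fixed(0.1), to_fixed(0.2), to_fixed(0.3)
    print(left_sum(fa, fb, fc) == right_sum(fa, fb, fc))
```

Fixed point storage makes additions order-independent, but the conversion itself still rounds, so the limited precision remains.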

55 Reversibility: II. Data for SU(3) gauge theory and QCD with heavy quarks show that rounding errors are amplified exponentially. The underlying continuous time equations of motion are chaotic. Ляпунов exponents characterise the divergence of nearby trajectories. The instability in the integrator occurs when δH » 1, where the acceptance rate is zero anyhow.

56 Reversibility: III. In QCD the Ляпунов exponents ν appear to scale with β as the system approaches the continuum limit β → ∞, with νξ = constant. This can be interpreted as saying that the Ляпунов exponent characterises the chaotic nature of the continuum classical equations of motion, and is not a lattice artefact. Therefore we should not have to worry about reversibility breaking down as we approach the continuum limit. Caveat: data is only for small lattices, and is not conclusive.

57 Reversibility: IV. Data for QCD with lighter dynamical quarks. Instability occurs close to the region in δτ where the acceptance rate is near one. This may be explained as a few “modes” becoming unstable because of large fermionic force. The integrator goes unstable if too poor an approximation to the fermionic force is used.

58 PHMC. Polynomial Hybrid Monte Carlo algorithm: instead of using Чебышев polynomials in the multiboson algorithm, Frezzotti & Jansen and deForcrand suggested using them directly in HMC. Polynomial approximations to 1/x are typically of order 40 to 100 at present. There are numerical stability problems with high order polynomial evaluation. Polynomials must be factored, and correct ordering of the roots is important. Frezzotti & Jansen claim there are advantages from using reweighting.

59 RHMC. Rational Hybrid Monte Carlo algorithm: the idea is similar to PHMC, but uses rational approximations instead of polynomial ones. Much lower orders are required for a given accuracy. Evaluation is simplified by using partial fraction expansion and multiple mass linear equation solvers. 1/x is already a rational function, so RHMC reduces to HMC in this case. It can be made exact using a noisy accept/reject step.

60 Chiral Fermions. Conventions: we work in Euclidean space; γ matrices are Hermitian; we assume all Dirac operators are γ₅ Hermitian.

61 On-shell chiral symmetry: I. It is possible to have chiral symmetry on the lattice without doublers if we only insist that the symmetry holds on shell. Such a transformation should be of the form given by Lüscher. ψ̄ is an independent field from ψ. ψ̄ has the same Spin(4) transformation properties as ψ†. ψ̄ does not have the same chiral transformation properties as ψ† in Euclidean space (even in the continuum).

62 On-shell chiral symmetry: II. For it to be a symmetry the Dirac operator must be invariant. For an infinitesimal transformation this implies a condition on the Dirac operator, which is the Ginsparg-Wilson relation.

63 Neuberger’s Operator: I. We can find a solution of the Ginsparg-Wilson relation as follows. Let the lattice Dirac operator be of a suitable form; it satisfies the GW relation iff a condition on that form holds, and it must also have the correct continuum limit. Both of these conditions are satisfied by the definition due to Neuberger.

64 Into Five Dimensions. H Neuberger hep-lat/9806025; A Boriçi hep-lat/9909057, hep-lat/9912040, hep-lat/0402035; A Boriçi, A D Kennedy, B Pendleton, U Wenger hep-lat/0110070; R Edwards & U Heller hep-lat/0005002; 趙 挺 偉 (T-W Chiu) hep-lat/0209153, hep-lat/0211032, hep-lat/0303008; R C Brower, H Neff, K Orginos hep-lat/0409118; Hernandez, Jansen, Lüscher hep-lat/9808010.

65 Neuberger’s Operator: II. Is D_N local? It is not ultralocal (Hernandez, Jansen, Lüscher). It is local iff D_W has a gap. D_W has a gap if the gauge fields are smooth enough. It seems reasonable that good approximations to D_N will be local if D_N is local and vice versa. Otherwise DWF with n₅ → ∞ may not be local.

66 Neuberger’s Operator: III. Four dimensional space of algorithms: Kernel; Constraint (5D, 4D); Representation (CF, PF, CT=DWF); Approximation.

67 Kernel. Shamir kernel; Möbius kernel; Wilson (Boriçi) kernel.

68 Schur Complement. Consider a block matrix (equivalently a matrix over a skew field = division ring). It may be block diagonalised by an LDU factorisation (Gaussian elimination). The bottom right block of the diagonal factor is the Schur complement. In particular, the determinant factorises accordingly.
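For reference, the standard block LDU factorisation can be written out as below. The block names A, B, C, D are notation assumed here (the transcript does not preserve the slide's own symbols), and A is assumed invertible.

```latex
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
  = \begin{pmatrix} 1 & 0 \\ C A^{-1} & 1 \end{pmatrix}
    \begin{pmatrix} A & 0 \\ 0 & D - C A^{-1} B \end{pmatrix}
    \begin{pmatrix} 1 & A^{-1} B \\ 0 & 1 \end{pmatrix},
\qquad
\det \begin{pmatrix} A & B \\ C & D \end{pmatrix}
  = \det(A)\,\det\!\left(D - C A^{-1} B\right).
```

The bottom right block D − C A⁻¹ B of the middle factor is the Schur complement; the triangular factors have unit determinant, which gives the determinant identity.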

69 Constraint: I. So, what can we do with the Neuberger operator represented as a Schur complement? Consider the five-dimensional system of linear equations, and take its bottom four-dimensional component.

70 Constraint: II. Alternatively, introduce a five-dimensional pseudofermion field and write the pseudofermion functional integral. So we also introduce n−1 Pauli-Villars fields and we are left with just det D_{n,n} = det D_N.

71 Approximation: tanh. Pandey, Kenney, & Laub; Higham; Neuberger. Formulæ are given for even n (analogous formulæ hold for odd n).
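As a hedged illustration of the tanh approximation, the function tanh(n tanh⁻¹ x) can be written as a ratio of polynomials in x, which is what makes it usable as a rational approximation to sgn(x). The sketch below checks that closed form numerically; the function name is hypothetical and this is not the factored form used on the slide.

```python
import math


def tanh_sign(x, n):
    """Rational form of tanh(n * artanh(x)), an approximation to sgn(x) for 0 < |x| < 1.

    Uses exp(2 n artanh(x)) = ((1 + x) / (1 - x))**n, so that
    tanh(n artanh(x)) = ((1+x)^n - (1-x)^n) / ((1+x)^n + (1-x)^n).
    """
    plus = (1.0 + x) ** n
    minus = (1.0 - x) ** n
    return (plus - minus) / (plus + minus)


if __name__ == "__main__":
    for x in (0.01, 0.1, 0.5, 0.9):
        print(x, tanh_sign(x, 8), math.tanh(8 * math.atanh(x)))
```

The approximation is good away from the origin but degrades as |x| becomes small, which is the regime where an approximation optimised over a fixed interval is more efficient.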

72 Approximation: Золотарев. (Figure labels: sn(z/M,λ), sn(z,k), ω_j.)

73 Approximation: Errors. The fermion sgn problem. Approximation over 10⁻² < |x| < 1 by rational functions of degree (7,8). (Plot: ε(x) – sgn(x) against log₁₀ x for Золотарев and for tanh(8 tanh⁻¹ x); vertical axis from −0.01 to 0.01.)

74 Representation: Continued Fraction I. Consider a five-dimensional matrix of tridiagonal form and compute its LDU decomposition; then the Schur complement of the matrix is the continued fraction.

75 Representation: Continued Fraction II. We may use this representation to linearise our rational approximations to the sgn function as the Schur complement of the five-dimensional matrix.
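The link between Gaussian elimination and continued fractions can be seen in a scalar toy model. The sketch below assumes a tridiagonal matrix with diagonal entries A₀,…,Aₙ and ones on the off-diagonals (an illustrative choice; the actual five-dimensional matrix has operator-valued blocks that are not preserved in this transcript) and checks that the last pivot equals Aₙ − 1/(Aₙ₋₁ − 1/(… − 1/A₀)).

```python
from fractions import Fraction


def tridiagonal(diagonal):
    """Matrix with the given diagonal entries and ones on the first off-diagonals."""
    n = len(diagonal)
    return [[Fraction(diagonal[i]) if i == j else Fraction(int(abs(i - j) == 1))
             for j in range(n)] for i in range(n)]


def elimination_pivots(matrix):
    """Pivots of Gaussian elimination without pivoting (the D of an LDU factorisation)."""
    m = [row[:] for row in matrix]
    n = len(m)
    pivots = []
    for k in range(n):
        pivots.append(m[k][k])
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n):
                m[i][j] -= factor * m[k][j]
    return pivots


def continued_fraction(diagonal):
    """A_n - 1/(A_{n-1} - 1/(... - 1/A_0)), evaluated from the innermost term outwards."""
    value = Fraction(diagonal[0])
    for a in diagonal[1:]:
        value = Fraction(a) - 1 / value
    return value


if __name__ == "__main__":
    entries = [2, 3, 5, 7]
    print(elimination_pivots(tridiagonal(entries))[-1], continued_fraction(entries))
```

The last pivot is the Schur complement of the bottom right entry with respect to the rest of the matrix, so solving one larger linear system evaluates the whole continued fraction.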

76 Representation: Partial Fraction I. Consider a five-dimensional matrix of the form introduced by Neuberger & Narayanan.

77 Representation: Partial Fraction II. Compute its LDU decomposition to obtain its Schur complement.

78 Representation: Partial Fraction III. This allows us to represent the partial fraction expansion of our rational function as the Schur complement of a five-dimensional linear system.

79 Representation: Cayley Transform I. Consider a five-dimensional matrix and compute its LDU decomposition. Neither L nor U depends on C, and the Schur complement follows.

80 Representation: Cayley Transform II. The Neuberger operator is expressed through T(x), a Euclidean Cayley transform. In Minkowski space a Cayley transform maps between Hermitian (Hamiltonian) and unitary (transfer) matrices.

81 Representation: Cayley Transform III. The Neuberger operator with a general Möbius kernel is related to the Schur complement of D₅(μ). (Matrix entries shown on the slide include P₋, μP₋, P₊ and μP₊.)

82 Representation: Cayley Transform IV. Cyclically shift the columns of the right-handed part. (Matrix entries shown on the slide include P₊, P₋, μP₊ and μP₋.)

83 Representation: Cayley Transform V. With some simple rescaling the domain wall operator reduces to the form introduced before.

84 Representation: Cayley Transform VI. We solve an equation whose solution satisfies a further relation. It therefore appears to have exact off-shell chiral symmetry, but this violates the Nielsen-Ninomiya theorem (q.v. Pelissetto for a non-local version). Renormalisation induces unwanted ghost doublers, so we cannot use D_DW for dynamical (“internal”) propagators. We must use D_N in the quantum action instead. We can use D_DW for valence (“external”) propagators, and thus use off-shell (continuum) chiral symmetry to manipulate matrix elements.

85 Chiral Symmetry Breaking. Ginsparg-Wilson defect: using the approximate Neuberger operator, Δ_L measures chiral symmetry breaking. The quantity m_res is essentially the usual domain wall residual mass (Brower et al.), where G is the quark propagator. m_res is just one moment of Δ_L.

86 Numerical Studies. Used 15 configurations from the RBRC dynamical DWF dataset. Matched π mass for Wilson and Möbius kernels. All operators are even-odd preconditioned. Did not project eigenvectors of H_W.

87 Comparison of Representation. Configuration #806, single precision.

88 Matching m_π between H_S and H_W.

89 Computing m_res using Δ_L.

90 m_res per Configuration. (Figure annotations: m_res is not sensitive to one small eigenvalue, but it is sensitive to another.)

91 Cost versus m_res.

92 Conclusions: I.
Relatively good: Zolotarev Continued Fraction; Rescaled Shamir DWF via Möbius (tanh).
Relatively poor (so far…): Standard Shamir DWF; Zolotarev DWF (Chiu). Can its condition number be improved?
Still to do: Projection of small eigenvalues. HMC: 5 dimensional versus 4 dimensional dynamics; Hasenbusch acceleration; 5 dimensional multishift?; possible advantage of 4 dimensional nested Krylov solvers. Tunnelling between different topological sectors: algorithmic or physical problem (at μ=0); reflection/refraction.

93 Approximation Theory. We review the theory of optimal polynomial and rational Чебышев approximations, and the theory of elliptic functions leading to Золотарев’s formula for the sign function over the range R={z : ε ≤|z|≤1}. We show how Gauß’ arithmetico-geometric mean allows us to compute the Золотарев coefficients on the fly as a function of ε. This allows us to calculate sgn(H) quickly and accurately for a Hermitian matrix H whose spectrum lies in R.
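As a small illustration of why the arithmetico-geometric mean is useful here, the sketch below computes the complete elliptic integral of the first kind from the classical identity K(k) = π / (2 AGM(1, √(1 − k²))). This is only one ingredient of the elliptic functions involved, not the full computation of the Золотарев coefficients, and the function names are assumptions.

```python
import math


def agm(a, b, tol=1e-15):
    """Arithmetic-geometric mean of two positive numbers (converges quadratically)."""
    for _ in range(64):
        if abs(a - b) <= tol * abs(a):
            break
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


def complete_elliptic_k(k):
    """Complete elliptic integral of the first kind K(k) for modulus 0 <= k < 1."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))


if __name__ == "__main__":
    print(complete_elliptic_k(0.0), complete_elliptic_k(math.sqrt(0.5)))
```

Only a handful of iterations are needed for double precision, which is what makes evaluation "on the fly" practical.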

94 Polynomial approximation. What is the best polynomial approximation p(x) to a continuous function f(x) for x in [0,1]? Best with respect to the appropriate norm $\|p - f\|_n = \left(\int_0^1 dx\,|p(x) - f(x)|^n\right)^{1/n}$ where n ≥ 1.

95 Weierstraß’ theorem. Taking n → ∞ this is the minimax norm $\|p - f\|_\infty = \max_{0\le x\le 1}|p(x) - f(x)|$. Weierstraß: any continuous function can be arbitrarily well approximated by a polynomial.

96 Бернштейн polynomials. The explicit solution is provided by Бернштейн polynomials $p_n(x) = \sum_{k=0}^{n} f\left(\tfrac{k}{n}\right)\binom{n}{k}x^k(1-x)^{n-k}$.
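A short sketch of the Bernstein construction on [0,1], using the standard formula p_n(x) = Σ_k f(k/n) C(n,k) x^k (1−x)^(n−k). The function name and the test functions in the usage lines are illustrative.

```python
import math


def bernstein(f, n, x):
    """Value at x in [0, 1] of the degree-n Bernstein polynomial of f."""
    return sum(f(k / n) * math.comb(n, k) * x ** k * (1.0 - x) ** (n - k)
               for k in range(n + 1))


if __name__ == "__main__":
    kink = lambda t: abs(t - 0.5)
    for n in (10, 100, 1000):
        # The error at the kink shrinks as n grows, but only slowly.
        print(n, bernstein(kink, n, 0.5))
```

The convergence is guaranteed for any continuous function but is slow, which is why the minimax (Чебышев) approximation is preferred in practice.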

97 Чебышев’s theorem. Чебышев: there is always a unique polynomial of any degree d which minimises $\|p - f\|_\infty = \max_{0\le x\le 1}|p(x) - f(x)|$. The error |p(x) - f(x)| reaches its maximum at exactly d+2 points on the unit interval.

98 Чебышев’s theorem: Necessity. Suppose p−f has fewer than d+2 extrema of equal magnitude. Then at most d+1 maxima exceed some magnitude. This defines a “gap”. We can construct a polynomial q of degree d which has the opposite sign to p−f at each of these maxima (Lagrange interpolation), and whose magnitude is smaller than the “gap”. The polynomial p+q is then a better approximation than p to f. Lagrange was born in Turin!

99 Чебышев’s theorem: Sufficiency. Suppose there is a polynomial p′ of degree d which is at least as good an approximation as p. Then at each of the d+2 extrema the error of p′ is no larger in magnitude than that of p. Therefore p′ − p must have d+1 zeros on the unit interval. Thus p′ − p = 0 as it is a polynomial of degree d.

100 Чебышев polynomials. Convergence is often exponential in d. The best approximation of degree d−1 over [−1,1] to $x^d$ is $p_{d-1}(x) = x^d - 2^{1-d}\,T_d(x)$, where the Чебышев polynomials are $T_d(x) = \cos(d\cos^{-1}x)$. The notation is an old transliteration of Чебышев! The error is $\|x^d - p_{d-1}(x)\|_\infty = 2^{1-d}\,\|T_d\|_\infty = 2^{1-d}$.
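The statement about approximating x^d can be checked numerically: subtracting 2^(1−d) T_d(x) from x^d cancels the leading power, and the remaining error has maximum magnitude 2^(1−d) on [−1,1]. The sketch below builds T_d from the recurrence T_{n+1} = 2x T_n − T_{n−1}; the helper names are illustrative.

```python
def chebyshev_coefficients(d):
    """Coefficients (lowest power first) of T_d from T_{n+1} = 2 x T_n - T_{n-1}."""
    previous, current = [1], [0, 1]
    if d == 0:
        return previous
    for _ in range(d - 1):
        shifted = [0] + [2 * c for c in current]                  # 2 x T_n
        padded = previous + [0] * (len(shifted) - len(previous))  # T_{n-1}
        previous, current = current, [s - p for s, p in zip(shifted, padded)]
    return current


def evaluate(coefficients, x):
    """Horner evaluation of a polynomial given lowest power first."""
    result = 0.0
    for c in reversed(coefficients):
        result = result * x + c
    return result


def best_lower_degree_approximation(d):
    """Coefficients of x^d - 2^(1-d) T_d(x) for d >= 1; the x^d term cancels."""
    scale = 2.0 ** (1 - d)
    coefficients = [-scale * c for c in chebyshev_coefficients(d)]
    coefficients[d] += 1.0
    return coefficients


if __name__ == "__main__":
    d = 5
    p = best_lower_degree_approximation(d)
    grid = [-1.0 + 2.0 * i / 2000 for i in range(2001)]
    print(max(abs(x ** d - evaluate(p, x)) for x in grid), 2.0 ** (1 - d))
```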

101 101 Saturday, 02 May 2015A D Kennedy Чебышев rational functions Чебышев’s theorem is easily extended to rational approximations Rational functions with nearly equal degree numerator and denominator are usually best Convergence is still often exponential Rational functions usually give a much better approximation A simple (but somewhat slow) numerical algorithm for finding the optimal Чебышев rational approximation was given by Ремез

102 102 Saturday, 02 May 2015A D Kennedy Чебышев rationals: Example A realistic example of a rational approximation is Using a partial fraction expansion of such rational functions allows us to use a multishift linear equation solver, thus reducing the cost significantly. The partial fraction expansion of the rational function above is This is accurate to within almost 0.1% over the range [0.003,1] This appears to be numerically stable.

103 Polynomials versus rationals. Золотарев’s formula has $L_\infty$ error […]. The optimal $L_2$ approximation with weight […] is […]. This has $L_2$ error of $O(1/n)$. The optimal $L_\infty$ approximation cannot be too much better (or it would lead to a better $L_2$ approximation).

104 Elliptic functions. Elliptic functions are doubly periodic analytic functions […]. Jacobi showed that if $f$ is not a constant it can have at most two primitive periods.

105 Liouville’s theorem. [Figure: period parallelogram with vertices 0, ω, ω+ω’ and ω’.]

106 Liouville’s theorem: Corollary I. Apply this to the logarithmic derivative $g(z) = f'(z)/f(z)$. Laurent expansion about each pole $\zeta_j$: […]. Hence the number of poles must equal the number of zeros (with multiplicity).

107 Liouville’s theorem: Corollary II. If $f$ is an analytic elliptic function with no poles then $f(z) - a$ could have no zeros either. It follows that there are no non-constant holomorphic elliptic functions. The more common form of Liouville’s theorem was actually due to Cauchy.

108 Liouville’s theorem: Corollary III. Apply Liouville’s theorem to $h(z) = z\,g(z)$: […], where $n$ and $n'$ are the number of times $f(z)$ winds around the origin as $z$ is taken along a straight line from 0 to ω or ω’ respectively. If $\alpha_k$ and $\beta_k$ are the locations of the poles and zeros of $f$, then the sum of the locations of the poles minus the locations of the zeros vanishes modulo the periods.

109 Weierstraß elliptic functions: I. A simple way to construct a doubly periodic analytic function is as the convergent sum […]. Note that $Q$ is an odd function.

110 Weierstraß elliptic functions: II. The derivative of an elliptic function is clearly also an elliptic function, but in general its integral is not. But since $Q$ is odd its integral is even, and thus […]. So the Weierstraß ℘ function is elliptic. The only singularity is a double pole at the origin and its periodic images. The name of the function is ℘, and the function is called the Weierstraß function, but what is its name called? (with apologies to the White Knight)

111 Weierstraß elliptic functions: III. ℘ satisfies the differential equation $\wp'^2 = 4\wp^3 - g_2\wp - g_3$, where […]. Thus we may express the functional inverse of ℘ as the elliptic integral $z = \int_{\wp(z)}^{\infty} \frac{dw}{\sqrt{4w^3 - g_2 w - g_3}}$.

112 Weierstraß elliptic functions: IV. The integral of $-\wp$, the Weierstraß ζ function, is not an elliptic function. But it is an odd function, and so satisfies […]. It has a simple pole at the origin with unit residue. And Liouville’s theorem gives the useful identity […].

113 Weierstraß elliptic functions: V. That was so much fun we’ll do it again. Integrating gives the odd function […] with the periodic behaviour […]. It has no poles, and a simple zero at the origin.

114 Expansion of elliptic functions: I. Let $f$ be an elliptic function with periods ω and ω’ whose poles and zeros are at $\alpha_j$ and $\beta_j$ respectively. Recall that $\sum_j (\alpha_j - \beta_j)$ vanishes modulo the periods; w.l.g. set this to 0. The function $g$ […] then has the same poles and zeros as $f$, and is periodic, and hence elliptic.

115 Expansion of elliptic functions: II. By Liouville’s theorem $f/g$ is an elliptic function with no poles, and is thus a constant $C$. Every elliptic function may thus be written in the “rational” form […].

116 Addition formulæ: I. Similarly, any elliptic function $f$ can be expanded in “partial fraction” form in terms of Weierstraß ζ functions. By considering […] this allows us to show that the ζ function satisfies the addition formula […].

117 Addition formulæ: II. And by differentiation we obtain an addition formula for ℘ too: […]. The principal conclusion is that any elliptic function $f$ can be represented in terms of rational functions of Weierstraß functions of the same argument with the same periods.

118 Jacobian elliptic functions: I. Although Weierstraß elliptic functions are most elegant for theoretical studies, in practice it is useful to follow Jacobi and introduce the function sn(z;k) through the inversion of the elliptic integral $z = \int_0^{\mathrm{sn}(z;k)} \frac{dt}{\sqrt{(1-t^2)(1-k^2t^2)}}$.

119 Jacobian elliptic functions: II. Of course, this cannot be anything new: it must be expressible in terms of the Weierstraß functions with the same periods, […], where the “roots” of the cubic form in the Weierstraß differential equation are […].

120 Jacobian elliptic functions: III. We are hiding the fact that the Weierstraß functions depend not only on $z$ but also on the periods ω and ω’. It is not too hard to show that the periods of the Jacobian elliptic functions are $4K(k)$ and $2iK(k')$, where $k^2 + k'^2 = 1$ and the complete elliptic integral is $K(k) = \int_0^1 \frac{dt}{\sqrt{(1-t^2)(1-k^2t^2)}}$.

121 Jacobian elliptic functions: IV. For later purposes we note the relation between the even Jacobi and Weierstraß elliptic functions: […]. A corollary of this is that any even elliptic function with periods […] may be expressed as a rational function of $\mathrm{sn}(z;k)^2$.

122 Modular transformations: I. We observed earlier that there are many choices of periods ω and ω’ which generate the same period lattice: if we choose […] with […] this will be the case. Since the period lattices are the same, elliptic functions with these periods are rationally expressible in terms of each other. This is called a first degree transformation.

123 Modular transformations: II. We can also choose periods whose lattice has the original one as a sublattice, e.g., […]. Elliptic functions with these periods must also be rationally expressible in terms of ones with the original periods, although the inverse may not be true. This is called a transformation of degree $n$.

124 Modular transformations: III. Let us construct an elliptic function with periods $4K$ and $2iK'/n$. This is easily done by taking sn(z;k) and scaling $z$ by some factor $1/M$ and choosing a new parameter λ. We thus consider sn(z/M; λ) with periods for $z/M$ of $4L$ and $2iL'$. The periods for $z$ are thus $4LM$ and $2iL'M$, so we must have $LM = K$ and $L'M = K'/n$.

125 Modular transformations: IV. The even elliptic function […] must therefore be expressible as a rational function of $\mathrm{sn}(z;k)^2$. Matching the poles and zeros, and fixing the overall normalisation from the behaviour near $z = 0$, we find […].

126 Modular transformations: V. The quantity $M$ is determined by evaluating this identity at the half period $K$, where […]. Likewise, the parameter λ is found by evaluating the identity at $z = K + iK'/n$. It is useful to write the identity in parametric form: […].

127 Золотарев’s formula: I. Real period $4K$, $K = K(k)$. Imaginary period $2iK'$, $K' = K(k')$, $k^2 + k'^2 = 1$. [Figure: fundamental region in the complex $z$ plane, with Re z marked at −2K, −K, 0, K, 2K and Im z at −iK’, iK’; points are labelled with the values ∞, 1/k, 1, 0, −1, −1/k.]

128 Золотарев’s formula: II. [Figure: sn(z/M, λ) and sn(z, k) in the complex $z$ plane, with Re z marked at −2K, −K, 0, K, 2K and Im z at −iK’, iK’.]

129 Arithmetico-Geometric mean: I. Long before the work of Jacobi and Weierstraß, Gauß considered the following sequence: $a_{n+1} = \tfrac12(a_n + b_n)$, $b_{n+1} = \sqrt{a_n b_n}$. Since these are means, $a_n \ge a_{n+1} \ge b_{n+1} \ge b_n$ (for $a_0 \ge b_0 > 0$). Furthermore […]. So it converges to the AG mean.

130 Gauß transformation. Gauß discovered the second degree modular transformation (but not in this notation, of course). Let […]. Then the Gauß transformation tells us that […]. This is just a special case of the degree $n$ modular transformation we used before.

131 Arithmetico-Geometric mean: II. On the other hand […]. So the quantity […] is a constant (the RHS is independent of $n$).

132 Arithmetico-Geometric mean: III. Therefore it must have the value […]. We may thus compute […]; if we start with […], then also […].

133 Arithmetico-Geometric mean: IV. We have a rapidly converging recursive function to compute sn(u;k) and K(k) for real $u$ and $0 < k < 1$. Values for arbitrary complex $u$ and $k$ can be computed from this using simple modular transformations. These are best done analytically and not numerically. With a little care about signs we can also compute cn(u;k) and dn(u;k), which are then also needed. Testing for $a = b$ in floating point arithmetic is sinful: it is better to continue until $a$ ceases to decrease.
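A minimal Python sketch of the AG mean iteration, using the stopping rule recommended above (iterate until $a$ ceases to decrease rather than testing $a = b$). It assumes the standard relation $K(k) = \pi / \bigl(2\,\mathrm{AGM}(1, k')\bigr)$ with $k' = \sqrt{1-k^2}$, which is not written out in the transcript; the function names are illustrative, and computing sn(u;k) is not attempted here.

```python
import math


def agm(a, b):
    """Arithmetico-geometric mean of two positive reals."""
    a, b = max(a, b), min(a, b)
    while True:
        a_next = 0.5 * (a + b)        # arithmetic mean
        b_next = math.sqrt(a * b)     # geometric mean
        if a_next >= a:               # a has ceased to decrease: converged
            return a
        a, b = a_next, b_next


def complete_elliptic_k(k):
    """Complete elliptic integral K(k) for 0 <= k < 1 via the AG mean."""
    k_prime = math.sqrt(1.0 - k * k)
    return math.pi / (2.0 * agm(1.0, k_prime))
```

Because the arithmetic means form a non-increasing sequence in exact arithmetic, the loop stops as soon as rounding prevents any further decrease, which in practice takes only a handful of iterations.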

134 Conclusions: II. One can compute matrix functions exactly using fairly low-order rational approximations. “Exactly” means to within a relative error of 1 u.l.p., which is what “exactly” normally means for scalar computations. One can also use an absolute error, but it makes no difference for the sign function! For the case of the sign function we can evaluate the Золотарев coefficients needed to obtain this precision on the fly, just to cover the spectrum of the matrix under consideration. There seem to be no numerical stability problems.

135 Bibliography: I. Elliptic functions: Наум Ильич Ахиезер (Naum Il’ich Akhiezer), Elements of the Theory of Elliptic Functions, Translations of Mathematical Monographs, 79, AMS (1990). ISBN 0065-9282 QA343.A3813.

136 Bibliography: II. Approximation theory: Наум Ильич Ахиезер (Naum Il’ich Akhiezer), Theory of Approximation, Constable (1956). ISBN 0094542007. Theodore J. Rivlin, An Introduction to the Approximation of Functions, Dover (1969). ISBN 0-486-64069-8.

137 Hasenbusch Acceleration. Hasenbusch introduced a simple method that significantly improves the performance of dynamical fermion computations. We shall consider a variant of it that accelerates fermionic molecular dynamics algorithms by introducing $n$ pseudofermion fields coupled with the $n$-th root of the fermionic kernel.

138 Non-linearity of CG solver. Suppose we want to solve $A^2x = b$ for Hermitian $A$ by CG. It is better to solve $Ax = y$, $Ay = b$ successively. The condition number is $\kappa(A^2) = \kappa(A)^2$. The cost is thus $2\kappa(A) < \kappa(A^2)$ in general. Suppose we want to solve $Ax = b$. Why don’t we solve $A^{1/2}x = y$, $A^{1/2}y = b$ successively? The square root of $A$ is uniquely defined if $A > 0$; this is the case for fermion kernels. All this generalises trivially to $n$-th roots. No tuning is needed to split the condition number evenly. How do we apply the square root of a matrix?
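The condition number relations on this slide can be illustrated with a small NumPy sketch. The test matrix, its size and its target condition number are hypothetical choices made only for the illustration, and the square root is formed by explicit diagonalisation, which is affordable only for such toy matrices.

```python
import numpy as np


def spd_with_condition(n, kappa, seed=0):
    """Random symmetric positive definite matrix with condition number kappa."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal basis
    eigenvalues = np.geomspace(1.0 / kappa, 1.0, n)    # spectrum in [1/kappa, 1]
    return q @ np.diag(eigenvalues) @ q.T


def matrix_sqrt(a):
    """Positive square root of a symmetric positive definite matrix."""
    w, u = np.linalg.eigh(a)
    return u @ np.diag(np.sqrt(w)) @ u.T


A = spd_with_condition(6, 100.0)            # hypothetical example matrix
kappa_a = np.linalg.cond(A)                 # approximately 100
kappa_a2 = np.linalg.cond(A @ A)            # approximately 100**2
kappa_sqrt = np.linalg.cond(matrix_sqrt(A)) # approximately sqrt(100)
```

In this toy case two solves with $A$ see a condition number of about 100 each, whereas one solve with $A^2$ sees about $10^4$; splitting $A$ into two square roots would bring each factor down to about 10.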

139 Rational matrix approximation. Functions on matrices are defined for a Hermitian matrix by diagonalisation: $H = UDU^{-1}$, $f(H) = f(UDU^{-1}) = U f(D) U^{-1}$. Rational functions do not require diagonalisation: $\alpha H^m + \beta H^n = U(\alpha D^m + \beta D^n)U^{-1}$ and $H^{-1} = UD^{-1}U^{-1}$. Rational functions have nice properties: cheap (relatively) and accurate.
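The point that rational functions of a matrix need only linear solves, not a diagonalisation, can be sketched in NumPy as follows. The rational function is given in partial fraction form with hypothetical coefficients, and a dense `np.linalg.solve` stands in for whatever (multishift) solver would be used in practice.

```python
import numpy as np


def apply_via_diagonalisation(h, f, v):
    """f(H) v computed as U f(D) U^T v for a real symmetric H."""
    d, u = np.linalg.eigh(h)
    return u @ (f(d) * (u.T @ v))


def apply_partial_fractions(h, c, residues, poles, v):
    """r(H) v = c v + sum_j a_j (H - p_j)^{-1} v, using one solve per pole."""
    identity = np.eye(h.shape[0])
    result = c * v
    for a, p in zip(residues, poles):
        result = result + a * np.linalg.solve(h - p * identity, v)
    return result


# Hypothetical rational function r(x) = c + sum_j a_j / (x - p_j).
C = 2.0
RESIDUES = [-2.0 / 3.0, -7.0 / 3.0]
POLES = [-1.0, -4.0]


def r(x):
    return C + sum(a / (x - p) for a, p in zip(RESIDUES, POLES))


rng = np.random.default_rng(1)
b = rng.standard_normal((5, 5))
H = b @ b.T + np.eye(5)                  # symmetric positive definite test matrix
v = rng.standard_normal(5)

by_eigen = apply_via_diagonalisation(H, r, v)
by_solves = apply_partial_fractions(H, C, RESIDUES, POLES, v)
```

The two results should agree to rounding error; only the second route remains feasible when $H$ is too large to diagonalise.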

140 No Free Lunch Theorem. We must apply the rational approximation with each CG iteration: $M^{1/n} \approx r(M)$. The condition number for each term in the partial fraction expansion is approximately $\kappa(M)$, so the cost of applying $M^{1/n}$ is proportional to $\kappa(M)$, even though the condition number $\kappa(M^{1/n}) = \kappa(M)^{1/n}$, and even though $\kappa(r(M)) = \kappa(M)^{1/n}$. So we don’t win this way…

141 Pseudofermions. We want to evaluate a functional integral including the fermionic determinant $\det M$. We write this as a bosonic functional integral over a pseudofermion field $\phi$ with kernel $M^{-1}$: $\det M \propto \int d\phi^\dagger\, d\phi\, \exp(-\phi^\dagger M^{-1} \phi)$.

142 Multipseudofermions. We are introducing extra noise into the system by using a single pseudofermion field to sample this functional integral. This noise manifests itself as fluctuations in the force exerted by the pseudofermions on the gauge fields. This increases the maximum fermion force, which triggers the integrator instability, which in turn requires decreasing the integration step size. A better estimate is $\det M = [\det M^{1/n}]^n$.

143 Hasenbusch’s method. Clever idea due to Hasenbusch: start with the Wilson fermion action $M = 1 - \kappa H$; introduce the quantity $M' = 1 - \kappa' H$; use the identity $M = M'(M'^{-1}M)$; write the fermion determinant as $\det M = \det M' \det(M'^{-1}M)$; introduce separate pseudofermions for each determinant; adjust $\kappa'$ to minimise the cost. This easily generalises to more than two pseudofermions and to the Wilson-clover action.
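The determinant splitting can be checked on a toy matrix. In the NumPy sketch below $H$ is a random symmetric matrix scaled to unit spectral norm, standing in for the hopping term, and the two hopping parameters are hypothetical values chosen only so that both kernels are positive definite.

```python
import numpy as np

rng = np.random.default_rng(2)
b = rng.standard_normal((6, 6))
H = (b + b.T) / 2.0
H /= np.linalg.norm(H, 2)              # scale so that the spectrum lies in [-1, 1]

KAPPA, KAPPA_PRIME = 0.9, 0.5          # hypothetical hopping parameters

identity = np.eye(6)
M = identity - KAPPA * H
M_prime = identity - KAPPA_PRIME * H
ratio = np.linalg.solve(M_prime, M)    # M'^{-1} M

det_m = np.linalg.det(M)
det_split = np.linalg.det(M_prime) * np.linalg.det(ratio)

cond_m = np.linalg.cond(M)
cond_m_prime = np.linalg.cond(M_prime)
cond_ratio = np.linalg.cond(ratio)
```

In this commuting toy model the condition number of $M$ is the product of the condition numbers of the two factors, so each pseudofermion kernel is better conditioned than the original one.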

144 Violation of NFL Theorem. So let’s try using our $n$-th root trick to implement multipseudofermions. The condition number is $\kappa(r(M)) = \kappa(M)^{1/n}$, so the maximum force is reduced by a factor of $n\,\kappa(M)^{(1/n)-1}$. This is a good approximation if the condition number is dominated by a few isolated tiny eigenvalues, which is so in the case of interest. The cost is reduced by a factor of $n\,\kappa(M)^{(1/n)-1}$. The optimal value is $n_{\mathrm{opt}} \approx \ln \kappa(M)$, so the optimal cost reduction is $(e \ln \kappa)/\kappa$. This works!
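The optimum quoted here follows from minimising $n\,\kappa^{(1/n)-1}$ over $n$: setting the derivative of $\ln n + (1/n - 1)\ln\kappa$ to zero gives $n = \ln\kappa$, and substituting back gives $(e\ln\kappa)/\kappa$ because $\kappa^{1/\ln\kappa} = e$. The short Python sketch below checks this for a hypothetical condition number; the names are illustrative.

```python
import math


def cost_factor(n, kappa):
    """Cost reduction factor n * kappa**(1/n - 1) for n pseudofermion fields."""
    return n * kappa ** (1.0 / n - 1.0)


KAPPA = 1.0e4                               # hypothetical condition number
n_star = math.log(KAPPA)                    # continuous optimum, n = ln(kappa)
optimal_cost = math.e * math.log(KAPPA) / KAPPA

# Best whole number of pseudofermion fields, found by direct search.
best_integer_n = min(range(1, 51), key=lambda n: cost_factor(n, KAPPA))
```

For this value of $\kappa$ the continuous optimum is $n \approx 9.2$, and the direct search over whole numbers selects the neighbouring integer.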

145 RHMC technicalities. In order to keep the algorithm exact to machine precision (as a Markov process): use a good (machine precision) rational approximation for the pseudofermion heatbath; use a good rational approximation for the HMC acceptance test at the end of each trajectory; use as poor a rational approximation as we can get away with to compute the MD force.

