Doing Very Big Calculations on Modest Size Computers: Reducing the Cost of Exact Diagonalization Using Singular Value Decomposition. Marvin Weinstein, Assa Auerbach and V. Ravi Chandra
Some Uses For Exact Diagonalization. CORE: use ED to carry out RG transformations for clusters. Mean field theory computations and DMRG (density matrix renormalization group) are all tested on small clusters against ED. Also: to compute short-wavelength dynamical response functions, and to compute Chern numbers for quantum Hall systems.
Counting Memory Requirements. Consider a lattice with 2N sites and one spin-1/2 degree of freedom per site: the Hilbert space has dimension $2^{2N}$. This, of course, means memory needs grow exponentially with the number of sites. For 36 sites a single vector is ½ TB of RAM; for 40 sites it is 8 TB. This means simple Lanczos becomes very memory unfriendly! The problem is not computational speed; it is reducing the memory footprint.
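The memory figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes 8 bytes per amplitude (real double precision), which is not stated on the slide but reproduces the quoted ½ TB and 8 TB when a TB is read as 2^40 bytes. The function name is illustrative.

```python
def vector_memory_bytes(n_sites: int, bytes_per_amplitude: int = 8) -> int:
    """Bytes needed to store one state vector for n_sites spin-1/2 sites.

    Assumption: one real double-precision number (8 bytes) per amplitude.
    """
    return (2 ** n_sites) * bytes_per_amplitude


TIB = 2 ** 40  # bytes in one binary terabyte

# Memory per vector, in TiB, for a few lattice sizes.
memory_table = {n: vector_memory_bytes(n) / TIB for n in (24, 36, 40)}

for n_sites, tib in memory_table.items():
    print(f"{n_sites} sites: {tib:g} TiB per vector")
```

Each extra site doubles the footprint, so four more sites cost a factor of 16, which is the jump from ½ TB to 8 TB.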
Singular Value Decomposition. Key message: for problems where the space is a tensor product space, SVD allows us to reduce the memory needed to store big vectors. The SVD says any matrix can be written as $M = U S V^{\dagger}$, where M is an n x m matrix, U is n x n, S holds at most min(n,m) non-zero entries on its diagonal, V is m x m, and the entries in S are arranged in decreasing order.
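As a concrete illustration of the shapes involved, here is a minimal NumPy sketch. The 6 x 4 random matrix is an arbitrary example, not something from the talk. Note that `numpy.linalg.svd` returns the singular values as a 1-D array and returns the conjugate transpose of V rather than V itself.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))  # arbitrary n x m example, n = 6, m = 4

# full_matrices=True gives U as n x n and Vh (V dagger) as m x m.
U, s, Vh = np.linalg.svd(M, full_matrices=True)

# Embed the min(n, m) singular values in an n x m rectangular diagonal matrix.
S = np.zeros_like(M)
S[:len(s), :len(s)] = np.diag(s)

# Rebuild M from its factors.
M_rebuilt = U @ S @ Vh
```

The singular values in `s` come back sorted in decreasing order, which is what makes "keep the largest few" a simple slice.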
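Rewriting Tensor Product States As Matrices. Suppose we have two blocks, A and B, with N sites each. A generic vector is a sum of tensor products of states in block A and block B, i.e., $|\psi\rangle = \sum_{i,j} \psi_{ij}\, |i\rangle_A \otimes |j\rangle_B$. Generically, given a vector, we can write its coefficients either as a matrix $\psi_{ij}$ or as a single vector.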
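In array terms, switching between the "single vector" and "matrix" pictures is just a reshape. The sketch below uses two small blocks of 3 sites each (an arbitrary choice for illustration) and checks that a simple tensor product becomes a rank-1 matrix.

```python
import numpy as np

n_sites_per_block = 3              # illustrative size, not from the talk
dim_block = 2 ** n_sites_per_block

rng = np.random.default_rng(1)
psi = rng.standard_normal(dim_block * dim_block)
psi /= np.linalg.norm(psi)

# Row index labels block-A states, column index labels block-B states.
psi_matrix = psi.reshape(dim_block, dim_block)
psi_flat = psi_matrix.reshape(-1)  # back to a single vector

# A simple tensor product |a> (x) |b> reshapes to the outer product a b^T.
a = rng.standard_normal(dim_block)
b = rng.standard_normal(dim_block)
product_matrix = np.kron(a, b).reshape(dim_block, dim_block)
product_rank = np.linalg.matrix_rank(product_matrix)
```

A generic `psi_matrix` has full rank, while `product_matrix` has rank 1; the SVD interpolates between these two extremes.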
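So SVD Says… Every vector in the product space can be written as a sum of products of the form $|\psi\rangle = \sum_k \lambda_k\, |u_k\rangle_A \otimes |v_k\rangle_B$, where the $\lambda_k$'s are the entries of S and the vectors $|u_k\rangle_A$ and $|v_k\rangle_B$ are built from the columns of U and V. We can choose to represent a vector by a sum of a few simple tensor products, ignoring the small enough $\lambda_k$'s.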
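A minimal sketch of the truncation step follows. The test state is synthetic: its singular values are chosen by hand to decay as a power law, purely so that the truncation has something to discard. For a normalized state, the norm of the discarded piece equals the root-sum-square of the dropped singular values, which the code checks.

```python
import numpy as np

def truncate_svd_state(psi_matrix, n_keep):
    """Keep the n_keep largest singular values of a bipartite state."""
    U, lam, Vh = np.linalg.svd(psi_matrix, full_matrices=False)
    return U[:, :n_keep], lam[:n_keep], Vh[:n_keep, :], lam[n_keep:]

rng = np.random.default_rng(2)
dim = 64  # block dimension, illustrative

# Synthetic state with hand-picked, rapidly decaying singular values.
Q_a, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
Q_b, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
lam_true = 1.0 / (1.0 + np.arange(dim)) ** 3
lam_true /= np.linalg.norm(lam_true)
psi_matrix = (Q_a * lam_true) @ Q_b.T

U_k, lam_k, Vh_k, discarded = truncate_svd_state(psi_matrix, n_keep=8)

# Rebuild the truncated state and measure the error in the state.
psi_approx = (U_k * lam_k) @ Vh_k
error = np.linalg.norm(psi_matrix - psi_approx)
```

Only `U_k`, `lam_k` and `Vh_k` need to be stored, so the cost scales with the number of kept terms times the block dimension rather than with the full product dimension.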
Another Look At Memory Requirements. Once again, consider a 36-site lattice and ask how much memory we need to store the ground state of the Hamiltonian (before, it was ½ TB). Assume we keep the 100 largest singular values in S. Size of an 18-site vector: 2 MB. Size of the SVD form of the ground state: 100 x 2 x 2 MB = 400 MB, or roughly ½ GB. We have gone from undoable to easily done, although it takes some time. How do we do it?
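The same bookkeeping can be written out for the SVD form. As before, 8 bytes per amplitude is an assumption that matches the quoted 2 MB for an 18-site vector; the function names are illustrative.

```python
def block_vector_bytes(n_sites_per_block, bytes_per_amplitude=8):
    """Bytes for one block vector (assumes 8-byte real amplitudes)."""
    return (2 ** n_sites_per_block) * bytes_per_amplitude

def svd_state_bytes(n_sites_per_block, n_terms, bytes_per_amplitude=8):
    """Bytes for n_terms triples (singular value, left vector, right vector)."""
    per_term = (2 * block_vector_bytes(n_sites_per_block, bytes_per_amplitude)
                + bytes_per_amplitude)
    return n_terms * per_term

MIB = 2 ** 20

block_mib = block_vector_bytes(18) / MIB            # one 18-site vector
svd_mib = svd_state_bytes(18, n_terms=100) / MIB    # 100-term SVD state
full_mib = (2 ** 36) * 8 / MIB                      # full 36-site vector

print(f"block vector: {block_mib:g} MiB")
print(f"SVD state:    {svd_mib:.1f} MiB")
print(f"full vector:  {full_mib:g} MiB")
```

The storage for the singular values themselves is negligible; essentially all of the 400 MB is the 200 block vectors.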
Key Idea For Manipulating SVD States. We are starting with simple tensor product states. The Hamiltonian is a sum of products of an operator acting on one block with an operator acting on the other. There are 11 operators that act on each side, so the effective Hilbert space for the SVD is 11 times the size of the SVD space we started with. This is important because doing an SVD on the full 30-, 36- or 40-site form of a vector is prohibitive.
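To see why the number of terms is multiplied by the number of operators per side, here is a hedged sketch. It assumes the Hamiltonian can be written as a sum of K products A_k (x) B_k of block operators; the random symmetric matrices stand in for the actual block operators, which are not given here. Applying H to an n-term state yields n*K terms without ever forming the full vector, and the result is checked against a dense calculation on a small example.

```python
import numpy as np

def apply_operator_sum(left_vecs, weights, right_vecs, terms):
    """Apply H = sum_k A_k (x) B_k to psi = sum_r w_r |l_r> (x) |r_r>.

    left_vecs, right_vecs: arrays whose columns are the block vectors.
    terms: list of (A_k, B_k) pairs. Returns a state with n * K terms.
    """
    new_left = np.hstack([op_a @ left_vecs for op_a, _ in terms])
    new_right = np.hstack([op_b @ right_vecs for _, op_b in terms])
    new_weights = np.tile(weights, len(terms))
    return new_left, new_weights, new_right

def random_symmetric(rng, dim):
    """Placeholder block operator (hypothetical, for illustration only)."""
    m = rng.standard_normal((dim, dim))
    return (m + m.T) / 2

rng = np.random.default_rng(3)
dim_a = dim_b = 8
n_terms = 3
identity = np.eye(dim_a)

# Block Hamiltonians times the unit operator, plus three coupling terms.
terms = [(random_symmetric(rng, dim_a), identity),
         (identity, random_symmetric(rng, dim_b))]
terms += [(random_symmetric(rng, dim_a), random_symmetric(rng, dim_b))
          for _ in range(3)]

left = rng.standard_normal((dim_a, n_terms))
right = rng.standard_normal((dim_b, n_terms))
weights = rng.random(n_terms)

new_left, new_weights, new_right = apply_operator_sum(left, weights, right, terms)

# Dense cross-check, feasible only because the example is tiny.
psi = ((left * weights) @ right.T).reshape(-1)
h_dense = sum(np.kron(op_a, op_b) for op_a, op_b in terms)
h_psi_dense = h_dense @ psi
h_psi_factored = ((new_left * new_weights) @ new_right.T).reshape(-1)
```

The new block vectors are generally neither orthogonal nor independent, so the enlarged state has to be compressed back down before the next multiplication.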
To Be Specific: Kagome Lattice. Why is this interesting?
CORE & Magen-David (MD) Blocking. Each MD has 12 sites (4096-dimensional Hilbert space). Diagonalizing the single MD block yields two degenerate spin-0 states. These are RVB states, i.e., they are pairwise spin-0 singlets, as shown to the left. A possible CORE computation: truncate the space to the two singlets and then compute the new renormalized Hamiltonian between two blocks (a 24-site computation). Then do the same for three MD’s arranged in a triangle (36 sites).
Magen-David (MD) Blocking: Two computations that have been done. We did the 2 MD (24-site) blocking because it can also be done exactly by brute force. Of course, we also needed it for CORE. Results of SVD Lanczos and ordinary Lanczos were compared, and the convergence is very good for ~100 states. The 3 MD (36-site) blocking is under way. With ~100 states we get convergence to or better.
Magen-David (MD) Blocking: The 24-site computation. We see that there are 8 operators acting on the first block and 8 on the second. Thus, if we start with a single tensor-product state, we have 8 states on the left and 8 on the right. But they aren’t orthonormal on left and right, so we can have fewer states on each side. Orthonormalize them and expand the 8 states in the new basis. In this basis, combine the coefficients into a 64-component vector, do the SVD, and reduce back to an SVD state having the desired number of SVD states. After a few multiplications this grows to the desired 100 states.
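The orthonormalize, combine, SVD and reduce steps can be sketched as follows. This is one possible realization, not necessarily the procedure used for the results in this talk: QR factorization is used here as the orthonormalization step, and the random 8-term state is a placeholder. With 8 terms per side the small coefficient matrix is 8 x 8, i.e. the 64 components mentioned above.

```python
import numpy as np

def recompress(left_vecs, weights, right_vecs, n_keep):
    """Compress psi = sum_r w_r |l_r> (x) |r_r> to at most n_keep SVD terms.

    Assumption: QR is used to orthonormalize each side; only the small
    coefficient matrix is ever passed to the SVD.
    """
    q_left, r_left = np.linalg.qr(left_vecs)      # orthonormal left basis
    q_right, r_right = np.linalg.qr(right_vecs)   # orthonormal right basis
    core = (r_left * weights) @ r_right.T         # small coefficient matrix
    u, lam, vh = np.linalg.svd(core)
    k = min(n_keep, len(lam))
    return q_left @ u[:, :k], lam[:k], q_right @ vh[:k, :].T

rng = np.random.default_rng(4)
dim_block, n_terms = 64, 8   # illustrative sizes

left = rng.standard_normal((dim_block, n_terms))
right = rng.standard_normal((dim_block, n_terms))
weights = np.ones(n_terms)
psi_before = (left * weights) @ right.T

# Keeping all 8 terms changes the representation but not the state.
new_left, lam, new_right = recompress(left, weights, right, n_keep=8)
psi_after = (new_left * lam) @ new_right.T

# Keeping fewer terms gives the truncated state.
left_4, lam_4, right_4 = recompress(left, weights, right, n_keep=4)
```

After recompression the block vectors on each side are orthonormal and the weights are genuine singular values in decreasing order, so the state is ready for the next application of the Hamiltonian.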
Three MD Blocking: The 36-site blocking. Now there are 6 bonds, or 18 spin operators, linking the two blocks. With the single-block Hamiltonian and the unit operator, this means the number of terms in the SVD state is multiplied by 20. Thus, for 100 states, we have a 2000 x 2000 matrix to do the SVD on. This is a “piece of cake”.
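The counting on this slide is simple enough to write down as a rule: three spin operators per linking bond, plus the block Hamiltonian and the unit operator. The sketch below encodes that rule; the function names are illustrative, and the 8-byte entry size used for the memory estimate is an assumption.

```python
def operators_per_side(n_bonds, spin_components=3):
    """Spin operators on the linking bonds, plus block Hamiltonian and identity."""
    return spin_components * n_bonds + 2

def svd_matrix_size(n_svd_states, n_bonds):
    """Linear size of the small matrix handed to the SVD after one multiplication."""
    return n_svd_states * operators_per_side(n_bonds)

multiplier = operators_per_side(6)              # 36-site case: 6 linking bonds
size = svd_matrix_size(100, 6)                  # 100 SVD states kept
dense_megabytes = size * size * 8 / 1e6         # assumes 8-byte real entries

print(f"multiplier: {multiplier}, matrix: {size} x {size}, "
      f"about {dense_megabytes:g} MB")
```

The same rule gives 8 operators for 2 linking bonds and 11 for 3, which would be consistent with the counts of 8 and 11 quoted on the earlier slides.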
Some Results for 24 and 36 Site MD’s. Error in energy as a function of the number of SVD states for the 24-site problem; here we have the exact answer to compare to. Rate of convergence of the energy as a function of the number of Lanczos iterations for the 36-site problem.
CORE Hexagon-Triangle Blocking. This lattice can also be covered with hexagons and triangles. The ground state on a hexagon is a spin-0 singlet. There are two degenerate spin-1/2 doublets (four states) for each triangle. Problem: truncate to four states per triangle and compute the triangle-triangle CORE renormalized Hamiltonian.
These Triangles Then Form A New Hexagonal Lattice. Now each vertex has four spin-½ states. The coupling between the vertices is a spin-pseudospin coupling, and the coefficients rotate with direction. This rotation is the remnant of the original geometric frustration. NOTE: The hexagonal lattice then blocks (second CORE transformation) to a triangular lattice, but with additional frustration.
SVD Entropy – Some Analytics. Assume we have an SVD decomposition of the form. Introduce the parameter ‘s’ s.t. Then introduce the density of states. How good is the power law assumption? NOTE: The integral of must be unity.
SVD Density of States for 30-Site Blocking. NOTE: This is a power law.
SVD Entropy - More Analytics. Given that the number of states as a function of the parameter ‘s’ is a power law, it is convenient to introduce the SVD entropy. Then if we choose a cutoff on the integral s.t. It follows that the error in the state is and so This is consistent with our results!
More Than Bi-Partite SVD Lanczos. Consider a disk of radius R so that N is proportional to the area of the disk. Partition the disk first in half, then divide each half again, etc. In this way we obtain $P = 2^p$ clusters. Assume the SVD entropy of each partition goes like the ‘area’ of the boundary, i.e., ~R. The total number of SVD vectors needed to get a fixed error is then known. From this we can estimate the optimal partitioning. The cost grows more slowly than $2^N$.
Recap. I have shown how one can, using the singular value decomposition of a vector, carry out Lanczos, or contractor, computations of eigenstates to high accuracy. The methods allow you to check convergence and, by plotting the density of states and fitting to a power law, estimate the error of the computation. This allows those of us who are not running on machines with 2000 cores and 1.5 TB of RAM to do big computations. Imagine what people with those resources can do!