Random Matrix Theory and Numerical Linear Algebra: A story of communication. Alan Edelman, Mathematics, Computer Science & AI Labs. ILAS Meeting, June 3, 2013.


An Intriguing Thesis. The results/methodologies from NUMERICAL LINEAR ALGEBRA would be valuable even if computers had not been built. … but I'm glad we have computers … & I'm especially grateful for the Julia computing system.

Eigenvalues of GOE (β=1 means reals). Naïve way in four languages:
MATLAB: A=randn(n); S=(A+A')/sqrt(2*n); eig(S)
Julia: A=randn(n,n); S=(A+A')/sqrt(2*n); eigvals(S)
R: A=matrix(rnorm(n*n),ncol=n); S=(A+t(A))/sqrt(2*n); eigen(S,symmetric=T,only.values=T)$values
Mathematica: A=RandomArray[NormalDistribution[],{n,n}]; S=(A+Transpose[A])/Sqrt[2*n]; Eigenvalues[S]

Eigenvalues of GOE, continued (same four snippets). If for you simulation is just intuition, a check, a figure, or a student project, then software like this is written quickly and victory is declared.

Eigenvalues of GOE, continued. For me, simulation is a chance to research efficiency, Numerical Linear Algebra style, and this research cycles back to the mathematics!

Tridiagonal Model Is More Efficient (β=1: same eigenvalue distribution!) (Silverstein, Trotter; general β: Dumitriu & Edelman, etc.). Diagonal entries g_i ~ N(0,2); off-diagonal entries are χ-distributed. Julia: eig(SymTridiagonal(d,e)). Storage: O(n) (vs O(n²)). Time: O(n²) (vs O(n³)).
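A minimal Julia sketch of this model (my reconstruction, assuming the Distributions.jl package for the χ draws; for β = 1 the 1/sqrt(βn) rescaling matches the dense S = (A+A')/sqrt(2*n) of the previous slides):

    using LinearAlgebra, Distributions

    # β-Hermite tridiagonal model (Dumitriu & Edelman): diagonal N(0,2),
    # off-diagonals χ_{β(n-1)}, χ_{β(n-2)}, …, χ_β.
    function tridiag_goe_eigs(n; β = 1)
        d = sqrt(2) .* randn(n)
        e = [rand(Chi(β * k)) for k in n-1:-1:1]
        eigvals(SymTridiagonal(d, e) / sqrt(β * n))   # O(n) storage, O(n²) time
    end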

Histogram without Histogramming: Sturm Sequences. Count #eigs < s: count sign changes in det((A − s·I)[1:k,1:k]) as k runs from 1 to n. Count #eigs in [x, x+h]: take the difference of the sign-change counts at x+h and at x. Speed comparison vs. the naïve way. Mentioned in Dumitriu and Edelman 2006; used theoretically in Albrecht, Chan, and Edelman 2008.
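For a tridiagonal matrix the leading principal minors satisfy a three-term recurrence, so each count costs only O(n). A hedged sketch (my code, not the talk's; the ratio recurrence assumes no pivot lands exactly on zero):

    # Count eigenvalues of SymTridiagonal(a, b) below the shift s by counting
    # sign changes in the Sturm sequence of leading principal minors, via the
    # ratio recurrence d_k = (a_k - s) - b_{k-1}^2 / d_{k-1}.
    function sturm_count(a::AbstractVector, b::AbstractVector, s::Real)
        count = 0
        d = a[1] - s
        d < 0 && (count += 1)
        for k in 2:length(a)
            d = (a[k] - s) - b[k-1]^2 / d
            d < 0 && (count += 1)
        end
        return count       # number of eigenvalues less than s
    end

    # A histogram bin [x, x+h] without computing a single eigenvalue:
    bin_count(a, b, x, h) = sturm_count(a, b, x + h) - sturm_count(a, b, x)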

A good computational trick is a good theoretical trick! Finite Semi-Circle Laws for Any Beta! Finite Tracy-Widom Laws for Any Beta!

Efficient Tracy-Widom Simulation. Naïve way: A=randn(n); S=(A+A')/sqrt(2*n); max(eig(S)). Better way: only create the leading ~10n^(1/3) segment of the diagonal and off-diagonal, since the "Airy" behavior tells us that the largest eigenvalue hardly depends on the rest. This led to the notion of the stochastic operator limit.
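A sketch of the truncated simulation (my reconstruction; the centering at 2 and the n^(2/3) scaling are the standard ones for the β = 1 normalization above, assumed here to carry over to general β):

    using LinearAlgebra, Distributions

    # One Tracy-Widom sample using only the leading k ≈ 10 n^(1/3) corner of
    # the β-Hermite tridiagonal: the largest eigenvalue barely sees the rest.
    function tracy_widom_sample(n; β = 1)
        k = clamp(round(Int, 10 * n^(1/3)), 1, n)
        d = sqrt(2) .* randn(k)                       # diagonal segment
        e = [rand(Chi(β * (n - j))) for j in 1:k-1]   # leading off-diagonals
        λmax = maximum(eigvals(SymTridiagonal(d, e))) / sqrt(β * n)
        return n^(2/3) * (λmax - 2)                   # ≈ Tracy-Widom fluctuation
    end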

Google Translate

Google Translate (in my dreams). Lost in translation: eigs = svd's squared.

The Jacobi Random Matrix Ensemble (Constantine 1963). Suppose A and B are randn(m1,n) and randn(m2,n) (iid standard normals). The eigenvalues of (A'A+B'B)⁻¹A'A, or in symmetric form (A'A+B'B)^(−1/2) A'A (A'A+B'B)^(−1/2), have joint density c ∏_{i<j}|λᵢ−λⱼ|^β ∏ᵢ λᵢ^a₁ (1−λᵢ)^a₂, the Jacobi form of the densities slide below, with exponents a₁, a₂ determined by m₁, m₂, n, and β.
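A naive β = 1 sample of this ensemble, as a hedged sketch (the function name is mine; the generalized symmetric-definite eigenproblem avoids forming the inverse):

    using LinearAlgebra

    # Eigenvalues of (A'A + B'B)^(-1) A'A; all values land in [0, 1].
    function jacobi_eigs(m1, m2, n)
        A, B = randn(m1, n), randn(m2, n)
        eigvals(Symmetric(A'A), Symmetric(A'A + B'B))
    end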

The Jacobi Ensemble Geometrically. Take a random n-hyperplane in R^m (m > n), uniformly. Take a reference hyperplane (any dimension). The orthogonal projection of the unit ball in the random hyperplane onto the reference hyperplane is an ellipsoid. The semi-axis lengths are Jacobi cosines: the cosines of the principal angles between the two subspaces.

The GSVD: Usual Presentation vs. Underlying Geometry. Take the hyperplane spanned by the columns of [A; B], where A is m₁×n, B is m₂×n, and [A; B] is m×n (m = m₁ + m₂). X = span(first m₁ columns of I_m), Y = span(last m₂ columns of I_m).

GSVD Mathematics (Cosine, Sine). Thus any such hyperplane is the span of [UC; VS] with U'U = V'V = I and C² + S² = I (diagonal). One can directly verify this by taking Jacobians, projecting out the W'dW directions that we do not care about, and thereby further understand C.
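A quick numerical check of the cosine-sine structure (my sketch; the QR factor of a Gaussian matrix gives a uniformly random n-plane):

    using LinearAlgebra

    # For a random n-plane in R^(m1+m2): singular values of the first-m1-rows
    # block (cosines) and last-m2-rows block (sines) pair so that c² + s² = 1.
    m1, m2, n = 6, 7, 4
    Q = Matrix(qr(randn(m1 + m2, n)).Q)       # orthonormal basis of a random n-plane
    c = svdvals(Q[1:m1, :])                   # cosines, sorted decreasing
    s = svdvals(Q[m1+1:end, :])               # sines, sorted decreasing
    @assert c.^2 .+ reverse(s).^2 ≈ ones(n)   # largest cosine pairs with smallest sine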

Circular Ensembles. A random complex unitary matrix has eigenvalues distributed on the unit circle with density c ∏_{j<k}|e^(iθⱼ)−e^(iθₖ)|^β (β = 2 for Haar measure). It is desirable to construct random matrices that we can compute with for general β (Killip and Nenciu 2004: product-of-tridiagonals solution).

Numerical linear algebra: Ammar, Gragg, Reichel (1991), Hessenberg unitary matrices built from 2×2 blocks G_j = [ᾱ_j ρ_j; ρ_j −α_j], ρ_j = √(1−|α_j|²) (Killip-Nenciu convention). Recalled in Forrester and Rains (2005), which converts Killip and Nenciu to the Ammar-Gragg-Reichel format. Classical orthogonal polynomials on the unit circle! The story relates to CMV matrices. Verblunsky coefficients = Schur parameters.

Implementation. Needed facts: (1) the Ammar-Gragg-Reichel format; (2) generating the variables requires some thinking: |α_j| ~ sqrt(1 − rand()^((2/β)/j)), α_j = |α_j|·exp(2πi·rand()). Remark: the authors would rightly feel that all the mathematical information is in their papers. Still, this presenter had considerable trouble gathering enough of the pieces to produce the ultimate arbiter of understanding the result: a working code! Plea: Random Matrix Theory is so valuable to so many that making pseudocode or code available is a great technology-transfer mechanism.

Julia Code: (similar to MATLAB etc.) really useful!
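The slide's code appeared as an image; here is a minimal reconstruction assembled from the pieces above (the 2×2 block convention follows my reading of Killip-Nenciu / Ammar-Gragg-Reichel and should be checked against the papers):

    using LinearAlgebra

    # Sample β-circular-ensemble eigenvalues: draw Verblunsky coefficients as
    # on the previous slide, build the unitary Hessenberg matrix H = G_1⋯G_n,
    # and take its eigenvalues (n points on the unit circle).
    function circular_beta_eigs(n, β)
        α = [sqrt(1 - rand()^((2/β)/j)) * exp(2π*im*rand()) for j in n-1:-1:1]
        push!(α, exp(2π*im*rand()))              # final coefficient: uniform phase
        H = Matrix{ComplexF64}(I, n, n)
        for j in 1:n
            G = Matrix{ComplexF64}(I, n, n)
            if j < n
                ρ = sqrt(1 - abs2(α[j]))
                G[j, j]   = conj(α[j]);  G[j, j+1]   = ρ
                G[j+1, j] = ρ;           G[j+1, j+1] = -α[j]
            else
                G[n, n] = conj(α[n])
            end
            H *= G
        end
        return eigvals(H)
    end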

The method of Ghosts and Shadows for Beta Ensembles

So far I have tried to hint about β. Introduction to Ghosts: G₁ is a standard normal N(0,1); G₂ is a complex normal (G₁ + iG₁, independent copies); G₄ is a quaternion normal (G₁ + iG₁ + jG₁ + kG₁); G_β (β > 0) seems to often work just fine: a "Ghost Gaussian".

Chi-squared. Defn: χ_β² is the sum of β iid squares of standard normals if β = 1, 2, …; it generalizes to non-integer β just as the gamma function interpolates the factorial. χ_β is the square root of the sum of squares (which also generalizes; see the Wikipedia chi-distribution page). |G₁| is χ₁, |G₂| is χ₂, |G₄| is χ₄. So why not |G_β| is χ_β? I call χ_β the shadow of G_β.
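Numerically the shadow is just a χ draw, e.g. with the Chi distribution from Distributions.jl:

    using Distributions

    # |G_β| realized as a χ_β sample for non-integer β: no ambient dimension needed.
    abs_G = rand(Chi(3/4))    # β = 3/4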


Working with Ghosts: manipulations whose end result is a real quantity.

Singular Values of a Ghost Gaussian times a real diagonal (see related work by Forrester 2011; Dubbs, Edelman, Koev, Venkataramana 2013).

The Algorithm

Removing U and V

Algorithm, cont.

Completion of Recursion

Monte Carlo Trials: Parallel Histograms. No tears!

Julia: Parallel Histogram. 3rd eigenvalue, pylab plot, 8 seconds! 75 processors.
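The plotted code was an image; a hedged sketch of the same pattern using Julia's Distributed standard library (names and counts are mine):

    using Distributed
    addprocs(4)                    # the talk used 75 processors
    @everywhere using LinearAlgebra

    # One Monte Carlo draw: the 3rd smallest eigenvalue of a GOE matrix.
    @everywhere function third_eig(n)
        A = randn(n, n)
        eigvals(Symmetric((A + A') / sqrt(2n)))[3]   # eigvals returns sorted values
    end

    samples = pmap(_ -> third_eig(200), 1:100_000)
    # bin `samples` with any histogram routine (the talk plotted via pylab)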

Linear algebra is too limited in most systems; Julia lets me put together what I need, e.g.: a tridiagonal eigensolver, a fast rank-one update, an arrow-matrix eigensolver. I can surgically use LAPACK without tears.


Weekend Julia Run. Histogram of the smallest singular value of randn(200)
– spiked by doubling the first column
– Julia: A=randn(200,200); A[:,1] *= 2
– reduced mathematically to bidiagonal form
– used Julia's bidiagonal svd
– ran 2.25 billion trials in 20 hours on 75 processors
– a naïve serial algorithm would take 16 years
I knew the limiting histogram from my thesis, universality, etc. With this kind of power, I could obtain the first-order correction for finite n! Semilogy plot of the absolute correction and a prediction: this is like having an electron microscope to see small details that would be invisible with conventional tools!
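One naive trial of this experiment, as a sketch (the talk's code first reduced the matrix mathematically to bidiagonal form; svdvals below is the slow dense version):

    using LinearAlgebra

    # Smallest singular value of a 200×200 Gaussian with first column doubled.
    function smallest_sv_trial(n = 200)
        A = randn(n, n)
        A[:, 1] .*= 2              # the spike
        minimum(svdvals(A))
    end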

Conclusion. If you already held Numerical Linear Algebra in high esteem, then thanks to random matrix theory there are now even more reasons! Try Julia. (Google: julia)


Wishart Matrices (arbitrary covariance). G = m×n matrix of Gaussians; Σ = n×n positive semidefinite matrix. G'G·Σ is similar to A = Σ^(1/2) G'G Σ^(1/2). For β = 1, 2, 4, the joint eigenvalue density of A has a formula (the ₀F₀ expression below). Known for β = 2 in some circles as Harish-Chandra-Itzykson-Zuber.
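A quick β = 1 numerical check of this similarity (my sketch; sqrt of a symmetric positive definite matrix is Julia's matrix square root):

    using LinearAlgebra

    # G'G*Σ and the symmetric Σ^(1/2) G'G Σ^(1/2) share eigenvalues.
    m, n = 8, 5
    G = randn(m, n)
    X = randn(n, n)
    Σ = Symmetric(X'X + I)             # a positive definite covariance
    Σh = sqrt(Σ)
    λ1 = sort(real(eigvals(G'G * Σ)))  # eigenvalues of the nonsymmetric product
    λ2 = eigvals(Symmetric(Σh * (G'G) * Σh))
    @assert λ1 ≈ λ2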

Eigenvalue density of G'G·Σ (similar to A = Σ^(1/2) G'G Σ^(1/2)). Goals: present an algorithm for sampling from this density; show how the method of Ghosts and Shadows can be used to derive this algorithm; further evidence that β = 1, 2, 4 need not be special.

Scary Ideas in Mathematics: Zero, Negative, Radical, Irrational, Imaginary. Ghosts: something like a sometimes-commutative algebra of random variables that generalizes random Reals, Complexes, and Quaternions, and inspires theoretical results and numerical computation.

Did you say "commutative"?? Quaternions don't commute. Yes, but random quaternions do! If x and y are G₄, then x·y and y·x are identically distributed!

RMT Densities.
Hermite: c ∏_{i<j}|λᵢ−λⱼ|^β e^(−∑λᵢ²/2) (Gaussian ensembles)
Laguerre: c ∏_{i<j}|λᵢ−λⱼ|^β ∏ᵢ λᵢ^m e^(−∑λᵢ) (Wishart matrices)
Jacobi: c ∏_{i<j}|λᵢ−λⱼ|^β ∏ᵢ λᵢ^m₁ (1−λᵢ)^m₂ (MANOVA matrices)
Fourier: c ∏_{i<j}|e^(iθᵢ)−e^(iθⱼ)|^β (on the complex unit circle) (circular ensembles)
(orthogonalized by Jack polynomials)


Joint Eigenvalue density of G'G·Σ. The ₀F₀ function is a hypergeometric function of two matrix arguments that depends only on the eigenvalues of the matrices. Formulas and software exist.

Generalization of Laguerre. Laguerre: c ∏_{i<j}|λᵢ−λⱼ|^β ∏ᵢ λᵢ^m e^(−∑λᵢ), versus Wishart: the same Vandermonde and power factors, with the exponential replaced by a ₀F₀ factor carrying Σ.

General β? The joint density above is a probability density for all β > 0. Goals: an algorithm for sampling from this density; a feel for the density's "ghost" meaning.

Main Result. An algorithm derived from ghosts that samples eigenvalues, and a MATLAB implementation that is consistent with other beta-ized formulas: largest eigenvalue, smallest eigenvalue.

More practice with Ghosts

Bidiagonalizing Σ = I. Z'Z has the Σ = I density, giving a special case of the general density.

Numerical Experiments – Largest Eigenvalue. Analytic formula for the largest-eigenvalue distribution; Edelman and Koev: software to compute it.




Smallest Eigenvalue as Well. The cdf of the smallest eigenvalue (formula on the slide).

Cdf's of the smallest eigenvalue

Goals. A continuum of Haar measures generalizing orthogonal, unitary, symplectic. Place finite random matrix theory "β" into the same framework as infinite random matrix theory: specifically, β as a knob to turn down the randomness, e.g. the stochastic Airy operator −d²/dx² + x + (2/√β)·dW, with dW white noise.

Formally. Let S_n = 2π^(n/2)/Γ(n/2) = "surface area of the sphere", defined for any n = β > 0. A β-ghost x is formally defined by a function f_x(r) such that ∫₀^∞ f_x(r) r^(β−1) S_{β−1} dr = 1. Note: for integer β, x can be realized as a random spherically symmetric variable in β dimensions. Example: a β-normal ghost is defined by f(r) = (2π)^(−β/2) e^(−r²/2). Example: zero is defined by constant·δ(r). Can we do algebra? Can we do linear algebra? Can we add? Can we multiply?

Understanding ∏|λᵢ−λⱼ|^β. Define the volume element (dx)^∧ by (r dx)^∧ = r^β (dx)^∧ (β-dimensional volume, like fractals, though I don't really see any fractal theory here). Jacobians: A = QΛQ' (symmetric eigendecomposition), so Q'dAQ = dΛ + (Q'dQ)Λ − Λ(Q'dQ). Then (dA)^∧ = (Q'dAQ)^∧ = (diagonal part)^∧ ∧ (strictly upper part)^∧ = ∏dλᵢ · ∏_{i<j}((Q'dQ)ᵢⱼ(λᵢ−λⱼ))^∧ = (dΛ)^∧ (Q'dQ)^∧ ∏_{i<j}|λᵢ−λⱼ|^β.