Paper discussion: Why does deep and cheap learning work so well?
What are Hamiltonians?
According to Wikipedia, a Hamiltonian is an "energy function" H: it maps a state to the negative log-likelihood of that state, H({state}) = -log P({state}).
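In equation form, with the partition function Z made explicit (the slide's version absorbs the constant log Z into H), the correspondence between a Hamiltonian and a probability distribution is:

```latex
P(x) = \frac{e^{-H(x)}}{Z}, \qquad Z = \sum_{x} e^{-H(x)},
\qquad\text{so}\qquad H(x) = -\log P(x) - \log Z .
```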
Hamiltonians
Lin’s argument 1: Why cheap learning?
Neural networks can approximate polynomial Hamiltonians. The kind of Hamiltonians we want to approximate are:
- Low-degree polynomials
- Local
- Symmetric
Low degree polynomials
Hamiltonians in physics are low-degree polynomials:
- The Standard Model of particle physics has a Hamiltonian that is a degree-4 polynomial
- The Maxwell equations for electromagnetism, the Navier-Stokes equations for fluid dynamics, etc. all have Hamiltonians that are polynomials of degree 2, 3 or 4
- If data is generated by a multivariate Gaussian, it has a Hamiltonian that is a degree-2 polynomial (written out below)
- The Central Limit Theorem means lots of things can be well approximated by multivariate Gaussians
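A concrete instance of the degree-2 case, using nothing beyond the textbook Gaussian density: for a multivariate Gaussian with mean \mu and covariance \Sigma, the Hamiltonian H(x) = -log p(x) is a quadratic polynomial in x, with the normalization collected into the constant term.

```latex
p(x) = \frac{\exp\!\big(-\tfrac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\big)}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}
\quad\Longrightarrow\quad
H(x) = \tfrac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu) + \tfrac{1}{2}\log\!\big((2\pi)^{n}|\Sigma|\big)
```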
Local
Things only affect what's in their immediate vicinity.
Therefore, the total number of non-zero coefficients in the Hamiltonian grows only linearly with n. Why is this true? Lin: model the spins as vertices of a Markov network and the dependencies as edges, and fix the size of the largest clique.
Symmetric
Many probability distributions we care about in physics and machine learning are invariant under translational and rotational symmetry. Symmetry reduces not only the number of model parameters but also the computational complexity: with translational symmetry we can use convolutions, performing the multiplications in O(n log n) with FFTs rather than O(n^2), as in the sketch below.
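A minimal numerical sketch of that FFT point (illustrative code, not from the paper; the array names are made up): applying a translation-invariant linear operator with periodic boundaries is a multiplication by a circulant matrix, and the same result can be computed as a circular convolution with FFTs in O(n log n) instead of O(n^2).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
kernel = rng.normal(size=n)   # one shared coupling stencil for every site (translation symmetry)
x = rng.normal(size=n)

# O(n^2) route: build the full circulant matrix C[i, j] = kernel[(i - j) % n] and multiply.
C = np.array([[kernel[(i - j) % n] for j in range(n)] for i in range(n)])
dense_result = C @ x

# O(n log n) route: the same circular convolution computed with FFTs.
fft_result = np.real(np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)))

print(np.allclose(dense_result, fft_result))  # expected: True
```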
Example
Say we've got n pieces of data.
- Low-degree polynomial: assume the Hamiltonian has degree 2; this brings the number of parameters N down from infinity to (n+1)(n+2)/2 (counted below)
- Locality: further assume nearest-neighbor coupling; now N = 2n
- Symmetry: further assume translational symmetry; now N = 3
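A quick check of the first count: a generic degree-2 polynomial in n variables has 1 constant term, n linear terms, and n(n+1)/2 distinct quadratic terms (squares plus pairs), so

```latex
N = 1 + n + \frac{n(n+1)}{2} = \frac{(n+1)(n+2)}{2}.
```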
Example
The Standard Model of physics has just N = 32 continuous parameters after taking these three factors into account.
Lin’s argument 1: Why cheap learning?
Neural networks can approximate polynomial Hamiltonians. The kind of Hamiltonians we want to approximate are:
- Low-degree polynomials
- Local
- Symmetric
Lin’s argument 2: Why deep learning?
Data sets in physics are generated by hierarchical Markovian processes; we need deep learning to reverse-engineer this hierarchy.
- Renormalization
- Linear no-flattening
By induction …
We can use the minimal sufficient statistic at each step of the process to get the minimal sufficient statistic for the previous step. By induction, we can work our way back and recover the first point.
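For reference, the standard (Bayesian) definitions being used here, stated for inferring the first point y_0 from data y; this is textbook material rather than something specific to the paper:

```latex
% Sufficiency: the posterior over y_0 depends on the data y only through T(y).
P(y_0 \mid y) = P\big(y_0 \mid T(y)\big)

% Minimality: T can be computed from any other sufficient statistic T' via some function g.
T = g \circ T'
```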
Renormalization
Problem: uncertainty propagates as well.
Solution: renormalization, H -> H', a remapping to a coarser resolution level. An analogy is drawn with stacking RBMs together (a toy coarse-graining step is sketched below).
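To make "remapping to a coarser resolution level" concrete, here is a toy block-spin decimation on an Ising-like configuration (an illustrative sketch of coarse-graining in general, not the paper's specific renormalization scheme; the function name block_spin is made up): every 2x2 block of ±1 spins is replaced by the sign of its sum.

```python
import numpy as np

def block_spin(spins: np.ndarray, b: int = 2) -> np.ndarray:
    """Coarse-grain a 2D array of +/-1 spins by majority vote over b x b blocks."""
    n = spins.shape[0]
    assert spins.shape == (n, n) and n % b == 0
    block_sums = spins.reshape(n // b, b, n // b, b).sum(axis=(1, 3))
    # Majority vote; break ties (block sum == 0) in favour of +1.
    return np.where(block_sums >= 0, 1, -1)

rng = np.random.default_rng(1)
fine = rng.choice([-1, 1], size=(8, 8))   # fine-grained configuration
coarse = block_spin(fine)                 # coarser description, one level up
print(fine.shape, "->", coarse.shape)     # (8, 8) -> (4, 4)
```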
Linear no-flattening
Successive multiplication by n different matrices is equivalent to multiplication by a single matrix, so why can't we just flatten everything? Lin gives examples:
- If we flatten the intermediate representations, we lose the sparse factorization and its O(n log n) cost, and are back to O(n^2) (see the sketch below)
- Strassen's algorithm for matrix multiplication performs better than the naïve O(n^3) one
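An illustrative sketch of the first bullet (my toy construction, not the paper's; the canonical example is the FFT itself, whose butterfly factors have 2 nonzeros per row, roughly 2n log n entries in total, versus n^2 for the flattened DFT matrix): a product of sparse factors computes the same linear map as one flattened matrix, but the flattened matrix has lost the sparsity and with it the cheap cost.

```python
import numpy as np

def random_sparse(n: int, k: int, rng: np.random.Generator) -> np.ndarray:
    """n x n matrix with exactly k random nonzeros per row."""
    m = np.zeros((n, n))
    for i in range(n):
        cols = rng.choice(n, size=k, replace=False)
        m[i, cols] = rng.normal(size=k)
    return m

rng = np.random.default_rng(0)
n, k, depth = 64, 3, 6
factors = [random_sparse(n, k, rng) for _ in range(depth)]

# Deep form: keep the factors and apply them one after another -- O(depth * k * n) work.
deep_nnz = sum(np.count_nonzero(f) for f in factors)

# Flattened form: one equivalent matrix -- the same linear map, but the sparsity is gone.
flat = np.linalg.multi_dot(factors)
flat_nnz = np.count_nonzero(flat)

print("nonzeros kept as a product of factors:", deep_nnz)      # depth * k * n
print("nonzeros after flattening into one matrix:", flat_nnz)  # approaches n * n
```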
Lin’s argument 2: Why deep learning?
Data sets in physics are generated by hierarchical Markovian processes; we need deep learning to reverse-engineer this hierarchy.
- Renormalization
- Linear no-flattening
Summary
Neural networks are good approximators of the probability distributions that arise in physics. Since everything else is ultimately physics too (cats look the way they do because of physics), neural networks are a good way to do machine learning.