
Paper discussion: Why does cheap and deep learning work so well?


1 Paper discussion: Why does cheap and deep learning work so well?

2

3 What are Hamiltonians?
According to Wikipedia, a Hamiltonian is an "energy function": it maps a state to the negative log-likelihood of that state, H(state) = -log P(state).
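To make the definition concrete, here is a minimal Python sketch (my own illustration, not from the slides), assuming the physicist's convention P(s) = exp(-H(s))/Z for a toy Ising-style Hamiltonian; the negative log-likelihood then recovers H up to the additive constant log Z.

```python
import numpy as np

# Toy energy function: 3 spins with a hypothetical nearest-neighbor coupling J.
J = 1.0
states = [(s0, s1, s2) for s0 in (-1, 1) for s1 in (-1, 1) for s2 in (-1, 1)]

def hamiltonian(s):
    s0, s1, s2 = s
    return -J * (s0 * s1 + s1 * s2)

# Boltzmann distribution: P(s) = exp(-H(s)) / Z
energies = np.array([hamiltonian(s) for s in states])
Z = np.exp(-energies).sum()
P = np.exp(-energies) / Z

# The negative log-likelihood equals the Hamiltonian up to the constant log Z.
nll = -np.log(P)
assert np.allclose(nll, energies + np.log(Z))
print("H(s) and -log P(s) differ only by log Z =", np.log(Z))
```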

4 Hamiltonians

5 Lin’s argument 1: Why cheap learning?
Neural networks can approximate polynomial Hamiltonians. The kinds of Hamiltonians we want to approximate are:
- Low-degree polynomials
- Local
- Symmetric

6 Low-degree polynomials
- Hamiltonians in physics are low-degree polynomials.
- The Standard Model of particle physics has a degree-4 polynomial Hamiltonian.
- Maxwell's equations for electromagnetism, the Navier-Stokes equations for fluid dynamics, etc. all have Hamiltonians that are polynomials of degree 2, 3, or 4.
- If data is generated by a multivariate Gaussian, it has a degree-2 polynomial Hamiltonian (see the sketch below).
- The Central Limit Theorem means that many things can be approximated with multivariate Gaussians.
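A short sketch of the Gaussian case (my own example with made-up numbers, not from the slides): for a multivariate Gaussian, -log P(x) is exactly a degree-2 polynomial in x, namely 0.5 (x-mu)^T Sigma^{-1} (x-mu) plus a constant.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical 2-D Gaussian (numbers chosen only for illustration).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
rv = multivariate_normal(mean=mu, cov=Sigma)

x = np.array([0.5, 0.7])

# Hamiltonian as negative log-likelihood:
H = -rv.logpdf(x)

# Explicit degree-2 polynomial: 0.5 (x-mu)^T Sigma^{-1} (x-mu) + constant.
diff = x - mu
quad = 0.5 * diff @ np.linalg.inv(Sigma) @ diff
const = 0.5 * np.log(np.linalg.det(2 * np.pi * Sigma))
assert np.isclose(H, quad + const)
print("Gaussian Hamiltonian is quadratic in x:", H)
```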

7 Local
Things only affect what is in their immediate vicinity. Therefore, the total number of non-zero coefficients in the Hamiltonian grows only linearly with n. Why is this true? Lin: model the spins as vertices of a Markov network and the dependencies as edges, and fix the size of the largest clique.
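A small counting sketch (my own illustration): for n spins on a chain, a generic quadratic Hamiltonian has one coupling per pair of spins, which grows like n^2, while restricting to nearest-neighbor (local) couplings leaves only O(n) non-zero coefficients.

```python
def n_pairwise_couplings(n):
    """Generic quadratic Hamiltonian: one coupling per pair of spins."""
    return n * (n - 1) // 2

def n_local_couplings(n):
    """Nearest-neighbor chain: one coupling per adjacent pair."""
    return n - 1

for n in (10, 100, 1000):
    print(n, n_pairwise_couplings(n), n_local_couplings(n))
# The generic count grows like n^2 / 2; the local count grows linearly with n.
```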

8 Symmetric
Many probability distributions we are concerned with in physics and machine learning are invariant under translational and rotational symmetry. Symmetry reduces not only the complexity of the model but also the computational complexity: with translational symmetry we can use convolutions and perform the multiplications in O(n log n) with FFTs rather than O(n^2).
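A minimal numpy sketch of that last point (my own example, not from the slides): multiplying by a translation-invariant (circulant) matrix is a circular convolution, so the same result can be computed with FFTs in O(n log n) instead of a dense O(n^2) matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
kernel = rng.normal(size=n)   # translation-invariant coupling
x = rng.normal(size=n)

# O(n^2) route: build the full circulant matrix and multiply.
C = np.array([[kernel[(i - j) % n] for j in range(n)] for i in range(n)])
dense_result = C @ x

# O(n log n) route: pointwise multiplication in Fourier space.
fft_result = np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)).real

assert np.allclose(dense_result, fft_result)
```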

9 Example
Say we have n pieces of data.
- Low-degree polynomial: assume the Hamiltonian is of degree 2; this brings the number of parameters N down from infinity to (n+1)(n+2)/2.
- Locality: further assume nearest-neighbor coupling; we now have N = 2n.
- Symmetry: further assume translational symmetry; we now have N = 3 (these counts are checked in the snippet below).
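A quick Python check of the three parameter counts, with the formulas copied directly from this slide:

```python
def param_counts(n):
    """Number of Hamiltonian parameters N for n data points (per the slide)."""
    generic_degree2 = (n + 1) * (n + 2) // 2   # any degree-2 polynomial
    nearest_neighbor = 2 * n                   # + locality
    translation_invariant = 3                  # + translational symmetry
    return generic_degree2, nearest_neighbor, translation_invariant

for n in (10, 100, 1000):
    print(n, param_counts(n))
```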

10 Example
After taking these three factors into account, the Standard Model of particle physics has only N = 32 continuous parameters.

11 Lin’s argument 1: Why cheap learning?
Neural networks can approximate polynomial Hamiltonians. The kinds of Hamiltonians we want to approximate are:
- Low-degree polynomials
- Local
- Symmetric

12 Lin’s argument 2: Why deep learning?
Data sets in physics are generated by hierarchical Markovian processes (a toy version is sketched below), and we need deep learning to reverse-engineer this hierarchy.
- Renormalization
- Linear no-flattening
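As a toy illustration of such a hierarchy (my own sketch with made-up distributions, not the paper's model): a high-level cause generates intermediate variables, which in turn generate noisy observations, and each level depends only on the one above it (the Markov property).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hierarchy():
    # Level 0: high-level cause (e.g., a class label).
    y0 = rng.integers(0, 2)
    # Level 1: intermediate variables depending only on y0.
    y1 = rng.normal(loc=3.0 * y0, scale=1.0, size=4)
    # Level 2: observed data depending only on y1.
    y2 = y1 + rng.normal(scale=0.5, size=4)
    return y0, y1, y2

y0, y1, y2 = sample_hierarchy()
# A deep network would try to invert this chain step by step: y2 -> y1 -> y0.
print(y0, y2.round(2))
```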

13

14

15 By induction…
We can use the minimal sufficient statistic at each point of the process to obtain the minimal sufficient statistic at the previous point. By induction, we can recover the first point.

16 Renormalization
Problem: uncertainty propagates as well.
Solution: renormalization, H -> H', a remapping to a coarser resolution level. An analogy is drawn with stacking RBMs together.
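A toy block-spin coarse-graining in Python (my own sketch of the general idea, not the paper's exact renormalization map): groups of three spins are replaced by their majority vote, giving a coarser description at each level, H -> H'.

```python
import numpy as np

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=27)   # fine-grained configuration

def coarse_grain(s, block=3):
    """Replace each block of spins by its majority vote (block-spin RG step)."""
    blocks = s.reshape(-1, block)
    return np.sign(blocks.sum(axis=1)).astype(int)

level1 = coarse_grain(spins)   # 27 -> 9 spins
level2 = coarse_grain(level1)  # 9 -> 3 spins
print(spins, level1, level2, sep="\n")
```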

17 Linear no flattening Successive multiplication with n different matrices is equivalent to multiplication with one matrix, so why can’t we just flatten everything? Lin gives examples: If we flatten the intermediate representation, we lose the sparse factorization of O(nlogn) and get back O(n^2) Strassen’s algorithm for matrix multiplication performs better than the naïve O(n^3) one

18 Lin’s argument 2: Why deep learning?
Data sets in physics are generated by hierarchical Markovian processes, and we need deep learning to reverse-engineer this hierarchy.
- Renormalization
- Linear no-flattening

19 Summary
Neural networks are good approximators of the probability distributions that arise in physics. Since everything else is ultimately governed by physics (cats look the way they do because of physics), neural networks are a good way to do ML in general.

