1
CMS 165 Lecture 8: Approximation and Generalization in Neural Networks
2
Recall from previous lecture:
Hypothesis class: $\mathcal{H}$, with $h \in \mathcal{H}$, $h : \mathcal{X} \to \mathcal{Y}$
A loss function: $\ell(y, h(x)) \in \mathbb{R}$, e.g., $\ell(y - h(x))$
Expected risk: $L(h) := \mathbb{E}_{(x, y) \sim P}\,[\ell(y, h(x))]$
Expected risk minimizer: $h^* \in \arg\min_{h \in \mathcal{H}} L(h)$
Given a set of samples $\{(x_i, y_i)\}_{i=1}^{n}$:
Empirical risk: $\hat{L}(h) = \frac{1}{n} \sum_{i=1}^{n} \ell(y_i, h(x_i))$
Empirical risk minimizer: $\hat{h} \in \arg\min_{h \in \mathcal{H}} \hat{L}(h)$
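As a concrete illustration (my addition, not from the slides), a minimal NumPy sketch of these definitions: empirical risk under squared loss, and the empirical risk minimizer over a small finite hypothesis class. The class of threshold functions and the data here are toy assumptions.

```python
import numpy as np

# Squared loss: l(y, h(x)) = (y - h(x))^2
def empirical_risk(h, xs, ys):
    return np.mean((ys - h(xs)) ** 2)

# Toy finite hypothesis class: threshold functions h_t(x) = 1[x > t]
hypotheses = [lambda x, t=t: (x > t).astype(float)
              for t in np.linspace(-1.0, 1.0, 21)]

rng = np.random.default_rng(0)
xs = rng.uniform(-1.0, 1.0, size=200)
ys = (xs > 0.3).astype(float)  # labels generated by an in-class hypothesis

# Empirical risk minimizer: the hypothesis with smallest training loss
h_hat = min(hypotheses, key=lambda h: empirical_risk(h, xs, ys))
print("empirical risk of h_hat:", empirical_risk(h_hat, xs, ys))
```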
3
Measures of Complexity
$\mathbb{E}\left[\sup_{h \in \mathcal{H}} L(h) - \hat{L}(h)\right] \le 2\,R_n(\mathcal{H}, P)$
Rademacher complexity: $R_n(\mathcal{H}, P) = \mathbb{E}\left[\sup_{h \in \mathcal{H}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i\, h(z_i)\right]$, where the $\sigma_i$ are i.i.d. Rademacher random variables, uniform on $\{-1, +1\}$
$L(\hat{h}) - L(h^*) \le 2\,R_n(\mathcal{H}) + \sqrt{\frac{\log(2/\delta)}{n}}$ with probability at least $1 - \delta$
VC dimension: $R_n \le \sqrt{\frac{2\,\mathrm{VC}(\mathcal{H}) \log(n + 1)}{n}}$
Linear class: $R_n \le \sqrt{\frac{d \log(2n)}{n}}$
Bounded linear class: $R_n \le B\sqrt{\frac{\log(2d)}{n}}$
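A hedged numerical companion (my addition): the empirical Rademacher complexity of a finite class can be estimated by Monte Carlo over random sign vectors. The class and data below are toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.normal(size=(n, d))

# Toy finite class: 50 random unit-norm linear predictors
W = rng.normal(size=(50, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)
preds = X @ W.T  # h(z_i) for every hypothesis; shape (n, 50)

# R_n = E_sigma[ sup_h (1/n) sum_i sigma_i h(z_i) ], averaged over draws
trials = 2000
sigmas = rng.choice([-1.0, 1.0], size=(trials, n))
R_hat = np.mean(np.max(sigmas @ preds / n, axis=1))
print(f"estimated empirical Rademacher complexity: {R_hat:.4f}")
```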
4
Rademacher complexity of NN
From the notes of Percy Liang
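The slide body itself is an image; as a hedged reconstruction of the kind of result covered in Liang's notes, here is a standard norm-based Rademacher bound for deep networks (exact constants vary across sources):

```latex
% For f_W(x) = W_L \phi(W_{L-1}\phi(\cdots W_1 x)) with 1-Lipschitz
% activation \phi, \|W_j\|_F \le B_j, and inputs \|x_i\|_2 \le C,
% a layer-peeling argument gives
R_n(\mathcal{H}) \;\le\; \frac{C \prod_{j=1}^{L} 2 B_j}{\sqrt{n}}.
```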
5
Decomposition of Errors
Derivation for linear regression
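The derivation is on the slide image; the decomposition it refers to, in its standard form (my reconstruction), is:

```latex
% Excess risk of the ERM \hat{h} splits into estimation error (finite
% data) plus approximation error (limited hypothesis class), where
% L^* = \inf_h L(h) over all measurable h:
L(\hat{h}) - L^* \;=\;
  \underbrace{L(\hat{h}) - L(h^*)}_{\text{estimation error}}
  \;+\;
  \underbrace{L(h^*) - L^*}_{\text{approximation error}}.
```

In the linear-regression derivation, these two terms align with the variance of the fitted coefficients and the bias from restricting to linear predictors, respectively.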
6
Universality of NN
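The slide content is an image; the result it refers to is the universal approximation theorem (Cybenko 1989; Hornik et al. 1989), one standard form of which is:

```latex
% For any continuous f on a compact K \subset \mathbb{R}^d, any
% \epsilon > 0, and any sigmoidal \sigma, there exist m and weights
% \{a_k, b_k, c_k\} such that
\sup_{x \in K}\,\Big| f(x) - \sum_{k=1}^{m} c_k\,\sigma(a_k^\top x + b_k) \Big| \;<\; \epsilon.
```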
7
Approximation in Shallow NN
The universality proof is loose: it may require an exponential number of units. Can we get a better bound? A better basis? How does the bound improve for various classes of functions?
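One classical answer to these questions (my addition; the slide may present it differently) is Barron's theorem: for targets with bounded Fourier norm $C_f$, a one-hidden-layer sigmoidal network with $m$ units attains squared error $O(C_f^2/m)$, with no explicit exponential dependence on the input dimension:

```latex
% Barron (1993): if C_f = \int_{\mathbb{R}^d} \|\omega\|\,|\hat{f}(\omega)|\,d\omega < \infty,
% then for every m there is a one-hidden-layer network f_m with m units s.t.
\int_{B_r} \big(f(x) - f_m(x)\big)^2\,\mu(dx) \;\le\; \frac{(2 r\, C_f)^2}{m}.
```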
8
Deep vs. Shallow Networks
What is the advantage of deep networks? Compositionality: a compositional function that a deep network represents efficiently can require an exponential number of units in a shallow network (see the sketch below).
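A hedged illustrative example (my addition): a binary-tree-structured target that a deep network can mirror level by level, whereas a shallow one must approximate it in a single layer:

```latex
% Compositional target on x \in \mathbb{R}^8, structured as a binary tree:
f(x) = h_3\Big( h_2\big(h_1(x_1, x_2),\, h_1(x_3, x_4)\big),\;
                h_2\big(h_1(x_5, x_6),\, h_1(x_7, x_8)\big) \Big).
% A deep network matching the tree uses polynomially many units per level;
% depth-separation results (e.g., Telgarsky 2016) exhibit functions of this
% kind for which any shallow approximant needs exponentially many units.
```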
9
Classical NN theory
10
Modern Neural Networks
From Belkin et al., "Reconciling modern machine-learning practice and the classical bias-variance trade-off"
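A hedged numerical sketch (my addition, not from the paper) of the double-descent curve that the Belkin et al. figure shows, using minimum-norm least squares on random ReLU features; the data and feature map are toy choices, and the test error typically peaks near the interpolation threshold $m \approx n$ before falling again:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_test, d = 40, 500, 10
X = rng.normal(size=(n, d))
X_test = rng.normal(size=(n_test, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)
y_test = X_test @ w_true

for m in [5, 10, 20, 40, 80, 320, 1280]:
    V = rng.normal(size=(d, m)) / np.sqrt(d)   # random ReLU feature map
    F, F_test = np.maximum(X @ V, 0), np.maximum(X_test @ V, 0)
    coef = np.linalg.pinv(F) @ y               # minimum-norm least squares
    print(f"m={m:5d}  test MSE={np.mean((F_test @ coef - y_test)**2):8.3f}")
```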
11
Seems to be true in practice
Slides from Ben Recht
12
Is it really true?
Slides from Ben Recht
13
Look closely at the data...
Slides from Ben Recht
14
Solution? Better Test Sets...
Slides from Ben Recht
15
Accuracy on harder test set
Slides from Ben Recht
16
True even on ImageNet
Slides from Ben Recht
17
Is this a good summary?
Slides from Ben Recht