Approximation and Generalization in Neural Networks

Presentation on theme: "Approximation and Generalization in Neural Networks". Presentation transcript:

1 Approximation and Generalization in Neural Networks
CMS 165, Lecture 8: Approximation and Generalization in Neural Networks

2 Recall from previous lecture:
Hypothesis class: $h \in H$, $h : X \to Y$
Loss function: $l(Y, h(X)) \in \mathbb{R}$, e.g., $\mathbb{I}(Y \neq h(X))$
Expected risk: $L(h) := E_P[\, l(Y, h(X)) \,]$, where $(X, Y) \sim P$
Expected risk minimizer: $h^* \in \arg\min_{h \in H} L(h)$
Given a set of samples: $\{(x_i, y_i)\}_{i=1}^{n}$
Empirical risk: $\hat{L}(h) := \frac{1}{n} \sum_{i=1}^{n} l(y_i, h(x_i))$
Empirical risk minimizer: $\hat{h} \in \arg\min_{h \in H} \hat{L}(h)$
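The definitions on this slide can be made concrete with a small sketch. It assumes a finite hypothesis class of one-dimensional threshold classifiers and the indicator loss; the threshold class, the toy data, and the helper names (`make_threshold`, `erm`) are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def zero_one_loss(y, y_pred):
    """Indicator loss 1[y != h(x)], computed elementwise."""
    return (y != y_pred).astype(float)

def empirical_risk(h, xs, ys):
    """Average loss of hypothesis h over the n samples (x_i, y_i)."""
    return zero_one_loss(ys, h(xs)).mean()

def make_threshold(t):
    """Hypothetical hypothesis h_t(x) = 1 if x >= t else 0."""
    return lambda x: (x >= t).astype(int)

def erm(hypotheses, xs, ys):
    """Return the index and empirical risk of a minimizer in a finite class."""
    risks = np.array([empirical_risk(h, xs, ys) for h in hypotheses])
    best = int(np.argmin(risks))
    return best, risks[best]

# Toy data (assumed): labels follow a threshold between 0.4 and 0.6.
xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 0.9])
ys = np.array([0, 0, 0, 1, 1, 1])
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
H = [make_threshold(t) for t in thresholds]
best_idx, best_risk = erm(H, xs, ys)
```

Here the empirical risk minimizer is the threshold at 0.5, which makes no mistakes on the six samples. The expected risk $L(h)$ cannot be computed this way because $P$ is unknown; only its empirical counterpart is available.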

3 Measures of Complexity
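Uniform deviation: $E\big[\sup_{h \in H} \big(L(h) - \hat{L}(h)\big)\big] \le 2 R_n(H, l)$
Rademacher complexity: $R_n(H, l) = E\big[\sup_{h \in H} \frac{1}{n} \sum_{i=1}^{n} \sigma_i\, l(Y_i, h(X_i))\big]$, where each $\sigma_i$ is a Rademacher random variable taking values in $\{-1, 1\}$
Excess risk, writing $R$ for $R_n(H, l)$: $L(\hat{h}) - L(h^*) \le 2R + \sqrt{\frac{\log(2/\delta)}{n}}$ with probability at least $1 - \delta$
VC dimension: $R \le \sqrt{\frac{2\, VC(H)\, (\log n + 1)}{n}}$
Linear class: $R \le O\big(\sqrt{\frac{d \log 2d}{n}}\big)$
Bounded linear class: $R \le O\big(\beta \sqrt{\frac{\log 2d}{n}}\big)$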
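For a finite hypothesis class, the supremum inside the Rademacher complexity is a maximum, so the quantity can be estimated by sampling the signs $\sigma_i$. The sketch below is a Monte Carlo estimate of the empirical Rademacher complexity, that is, conditioned on a fixed sample; $R_n(H, l)$ is its expectation over samples. The function name, the random loss matrix, and the number of draws are assumptions made for illustration.

```python
import numpy as np

def empirical_rademacher(loss_matrix, num_draws=2000, seed=0):
    """Monte Carlo estimate of E_sigma[ max_h (1/n) sum_i sigma_i * l_i(h) ].

    loss_matrix has shape (num_hypotheses, n); entry [j, i] is l(y_i, h_j(x_i)).
    """
    rng = np.random.default_rng(seed)
    num_h, n = loss_matrix.shape
    # Each row of sigma is one draw of n independent signs in {-1, +1}.
    sigma = rng.choice([-1.0, 1.0], size=(num_draws, n))
    correlations = sigma @ loss_matrix.T / n  # shape (num_draws, num_h)
    # Max over hypotheses for each draw, then average over draws.
    return correlations.max(axis=1).mean()

# Assumed toy input: random 0/1 losses for 8 hypotheses on n = 50 samples.
rng = np.random.default_rng(1)
losses = rng.integers(0, 2, size=(8, 50)).astype(float)
r_one = empirical_rademacher(losses[:1])  # class with a single hypothesis
r_all = empirical_rademacher(losses)      # the full class of 8 hypotheses
```

A single fixed hypothesis cannot correlate with random signs on average, so `r_one` is close to zero, while the larger class gives a larger value. This is the sense in which the quantity measures the complexity of the class.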

4 Rademacher complexity of NN
From notes of Percy Liang

5 Decomposition of Errors
Derivation for linear regression
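The slide's linear-regression derivation is not reproduced in the transcript. As a generic sketch of the decomposition, using $\hat{h}$ and $h^*$ from the recap slide and one assumed symbol, $L^{\min}$, for the smallest expected risk over all functions (not only those in $H$):

```latex
L(\hat{h}) - L^{\min}
  = \underbrace{L(\hat{h}) - L(h^*)}_{\text{estimation error}}
  + \underbrace{L(h^*) - L^{\min}}_{\text{approximation error}},
\qquad
L^{\min} := \inf_{h : X \to Y} L(h).
```

The first term is what the complexity bounds control; the second depends only on how expressive $H$ is, which is where approximation results for neural networks enter.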

6 Universality of NN

7 Approximation in Shallow NN
The universality proof is loose: it requires an exponential number of units. Is there a better bound? A better basis? How does the bound improve for various classes of functions?

8 Deep vs. Shallow Networks
What is the advantage of deep networks? Compositionality: a function built by composition can require an exponential number of units in a shallow network.
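One standard illustration of this gap, which the slide itself does not name, is the repeated tent map. Composing a two-unit ReLU layer `depth` times yields a sawtooth with $2^{\text{depth}}$ linear pieces, whereas a one-hidden-layer ReLU network on a scalar input with $m$ units has at most $m + 1$ pieces. The sketch below counts the pieces numerically; the function names and the grid resolution are assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def tent(x):
    """Tent map on [0, 1] built from two ReLU units: 2x if x <= 1/2, else 2 - 2x."""
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    """Compose the tent map `depth` times (two ReLU units per layer)."""
    for _ in range(depth):
        x = tent(x)
    return x

def count_linear_pieces(depth, grid_exp=12):
    """Count linear pieces on [0, 1] via slope changes on a dyadic grid.

    Assumes depth <= grid_exp so every breakpoint lies on a grid point.
    """
    x = np.linspace(0.0, 1.0, 2**grid_exp + 1)
    y = deep_sawtooth(x, depth)
    slopes = np.diff(y) / np.diff(x)
    changes = np.sum(~np.isclose(slopes[1:], slopes[:-1]))
    return int(changes) + 1

pieces = {d: count_linear_pieces(d) for d in (1, 3, 5)}
```

With depth 5 the deep network uses 10 units and produces 32 pieces, so a shallow network matching it exactly would need at least 31 units; the gap grows exponentially with depth.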

9 Classical NN theory

10 Modern Neural Networks
From Belkin et al., "Reconciling modern machine learning and the bias-variance trade-off"

11 Seems to be true in practice
Slides from Ben Recht

12 Is it really true?
Slides from Ben Recht

13 Look closely at the data...
Slides from Ben Recht

14 Solution? Better Test Sets..
Slides from Ben Recht

15 Accuracy on harder test set
Slides from Ben Recht

16 True even on ImageNet
Slides from Ben Recht

17 Is this a good summary?
Slides from Ben Recht

