1
Brain Damage: Algorithms for Network Pruning Andrew Yip HMC Fall 2003
2
The Idea
Networks with excessive weights "over-train" on the data and, as a result, generalize poorly. The goal is a technique that effectively reduces the size of the network without reducing validation accuracy. By reducing complexity, network pruning may actually increase the generalization capability of the net.
3
History
"Removing" a weight means setting it to zero and freezing it. The first attempts at network pruning removed the weights of smallest magnitude. Later approaches minimize a cost function composed of both the training error and a measure of network complexity.
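The magnitude-based baseline described above can be sketched in a few lines of numpy (this sketch and its function name are mine, not from the slides):

```python
import numpy as np

def magnitude_prune(weights, fraction):
    """Zero out (i.e. "remove" and freeze) the given fraction of the
    weights with smallest magnitude."""
    w = weights.copy()
    k = int(fraction * w.size)               # number of weights to remove
    if k == 0:
        return w
    # indices of the k smallest-|w| entries in the flattened array
    idx = np.argsort(np.abs(w), axis=None)[:k]
    w.flat[idx] = 0.0                        # removal = set to zero and freeze
    return w

w = np.array([0.05, -2.0, 0.6, -0.01, 1.3, 0.2])
pruned = magnitude_prune(w, 0.5)             # removes the 3 smallest-|w| weights
# pruned -> [ 0.  -2.   0.6  0.   1.3  0. ]
```

Magnitude is a cheap but crude saliency proxy; the rest of the talk is about replacing it with a second-order estimate of how much the error actually changes when a weight is deleted.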
4
LeCun's Take
Derive a more theoretically sound ordering for weight removal from a second-order Taylor expansion of the error function:
δE ≈ Σ_i g_i δw_i + ½ Σ_i h_ii δw_i² + ½ Σ_{i≠j} h_ij δw_i δw_j
At a local minimum the gradient terms g_i vanish, and OBD assumes the Hessian is diagonal, so the cost of deleting weight w_k is its saliency s_k = h_kk w_k² / 2.
5
Computing the 2nd Derivatives
Network expressed as: x_i = f(a_i), with a_i = Σ_j w_ij x_j
Second derivatives, back-propagated analogously to the gradients:
∂²E/∂a_i² = f′(a_i)² Σ_l w_li² (∂²E/∂a_l²) − f″(a_i) ∂E/∂x_i
Diagonals of the Hessian: h_kk = ∂²E/∂w_ij² = (∂²E/∂a_i²) x_j²
6
The Recipe
1. Train the network until a local minimum is obtained
2. Compute the second derivatives for each parameter
3. Compute the saliencies
4. Delete the low-saliency parameters
5. Iterate
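One iteration of steps 2–4 can be sketched as follows; computing the Hessian diagonal `h_diag` is assumed done elsewhere (e.g. via the back-propagation recurrences on the previous slide), and the function name is mine:

```python
import numpy as np

def obd_prune_step(w, h_diag, n_remove, frozen):
    """One OBD iteration: saliency s_k = h_kk * w_k^2 / 2,
    delete the n_remove lowest-saliency weights."""
    saliency = 0.5 * h_diag * w**2
    saliency = np.where(frozen, np.inf, saliency)   # deleted weights stay deleted
    idx = np.argsort(saliency)[:n_remove]
    w = w.copy()
    w[idx] = 0.0                                    # delete = zero and freeze
    frozen = frozen.copy()
    frozen[idx] = True
    return w, frozen

w = np.array([1.0, 0.1, -2.0])
h = np.array([1.0, 1.0, 0.5])          # Hessian diagonal (assumed precomputed)
frozen = np.zeros(3, dtype=bool)
w2, frozen2 = obd_prune_step(w, h, 1, frozen)
# saliencies are [0.5, 0.005, 1.0], so the middle weight is deleted
```

After each such step the network is retrained to a new local minimum before the next round of deletions, which is the "Iterate" in the recipe.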
7
Results
(Figure: results of OBD compared to magnitude-based pruning)
8
Results Continued
(Figure: comparison of MSE with retraining versus without retraining)
9
LeCun's Conclusions
Optimal Brain Damage reduced the number of parameters by up to a factor of four, while general recognition accuracy increased. OBD can be used either as an automatic pruning tool or an interactive one.
10
Babak Hassibi: Return of LeCun
Several problems arise from LeCun's simplifying assumptions. For smaller networks, OBD's diagonal approximation can choose the wrong parameter to delete. The full inverse Hessian can be calculated recursively, yielding a more accurate saliency measure.
11
The Optimal Brain Surgeon math (from the Hassibi–Stork paper in the references): minimize the error increase ½ δwᵀ H δw subject to the constraint that weight q is deleted, e_qᵀ δw + w_q = 0. Solving with a Lagrange multiplier gives the saliency L_q = w_q² / (2 [H⁻¹]_qq) and the update to all remaining weights δw = −(w_q / [H⁻¹]_qq) H⁻¹ e_q.
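A minimal sketch of one OBS deletion, assuming the inverse Hessian `H_inv` has already been computed (e.g. by Hassibi and Stork's recursive update); the function name is my own:

```python
import numpy as np

def obs_delete_one(w, H_inv):
    """Delete the weight with smallest OBS saliency and adjust the rest.
    Saliency: L_q = w_q^2 / (2 [H^-1]_qq)
    Update:   dw  = -(w_q / [H^-1]_qq) * H^-1 e_q
    """
    diag = np.diag(H_inv)
    L = w**2 / (2.0 * diag)
    q = int(np.argmin(L))                       # cheapest weight to delete
    dw = -(w[q] / H_inv[q, q]) * H_inv[:, q]    # H^-1 e_q is column q of H^-1
    w_new = w + dw
    w_new[q] = 0.0                              # force exact zero
    return w_new, q

w = np.array([0.5, 2.0])
H_inv = np.eye(2)                               # toy inverse Hessian
w_new, q = obs_delete_one(w, H_inv)
# with an identity Hessian, L = [0.125, 2.0], so q = 0 and w_new = [0.0, 2.0]
```

Note that with an identity Hessian the saliency reduces to w_q²/2, i.e. OBS degenerates to magnitude pruning; the off-diagonal Hessian terms are exactly what lets OBS compensate the surviving weights instead of just retraining.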
12
The MONK's Problems
A set of problems involving classifying artificial robots based on six discrete-valued attributes. Binary decision problems, e.g. (head_shape = body_shape). The study was performed in 1991; back-propagation with weight decay (BPWD) was found to be the most accurate solution at the time.
13
Results: Hassibi Wins

Problem   Method   Accuracy (%)   # Weights
MONK-1    BPWD     100            58
          OBS      100            14
MONK-2    BPWD     100            39
          OBS      100            15
MONK-3    BPWD     93.4           39
          OBS      97.2            4
14
References
Le Cun, Yann. "Optimal Brain Damage." AT&T Bell Laboratories, 1990.
Hassibi, Babak, and Stork, David. "Optimal Brain Surgeon and General Network Pruning." Ricoh California Research Center, 1993.
Thrun, S.B. "The MONK's Problems." CMU, 1991.
15
Questions? (Brain Background Courtesy Brainburst.com)