Brain Damage: Algorithms for Network Pruning Andrew Yip HMC Fall 2003.

The Idea
Networks with too many weights over-train on the data and, as a result, generalize poorly.
Goal: a technique that effectively reduces the size of the network without hurting performance on validation data.
Hopefully, by reducing complexity, network pruning can improve the generalization capabilities of the net.

History
Removing a weight means setting it to 0 and freezing it there.
The first attempts at network pruning removed the weights of least magnitude.
Another approach: minimize a cost function composed of both the training error and a measure of network complexity.
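A minimal sketch of the magnitude-based approach, assuming NumPy and a plain gradient-descent update. The function names `magnitude_prune` and `masked_update` and the learning rate are illustrative, not taken from the slides.

```python
import numpy as np

def magnitude_prune(weights, fraction):
    """Zero the given fraction of weights with the smallest |w|.

    Returns the pruned weights and a boolean keep-mask of the same shape.
    """
    n_prune = int(fraction * weights.size)
    order = np.argsort(np.abs(weights), axis=None)  # flat indices, smallest |w| first
    mask = np.ones(weights.size, dtype=bool)
    mask[order[:n_prune]] = False
    mask = mask.reshape(weights.shape)
    return weights * mask, mask

def masked_update(weights, grad, mask, lr=0.1):
    """Gradient step that keeps pruned weights frozen at zero."""
    return (weights - lr * grad) * mask

w = np.array([[0.9, -0.05], [0.01, -1.2]])
pruned, mask = magnitude_prune(w, fraction=0.5)
stepped = masked_update(pruned, np.ones_like(w), mask)
```

Multiplying by the mask after every update is one simple way to "freeze" a removed weight; it stays at zero no matter what the gradient says.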

LeCun’s Take
Derive a more theoretically sound ordering for weight removal using the second derivatives of the error function with respect to the parameters.
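The slide's equation is not in the transcript. As a hedged reconstruction from the Optimal Brain Damage paper rather than from the slide itself, the starting point is a Taylor expansion of the error E under a perturbation δU of the parameter vector, where g_i are the gradient components and h_ij the Hessian entries:

```latex
\delta E = \sum_i g_i \,\delta u_i
         + \frac{1}{2} \sum_i h_{ii} \,\delta u_i^2
         + \frac{1}{2} \sum_{i \neq j} h_{ij} \,\delta u_i \,\delta u_j
         + O\!\left(\lVert \delta U \rVert^3\right)

% At a local minimum (g_i = 0), ignoring cross terms and higher-order terms:
\delta E \approx \frac{1}{2} \sum_i h_{ii} \,\delta u_i^2

% Deleting parameter u_k means \delta u_k = -u_k, giving its saliency:
s_k = \frac{h_{kk} \, u_k^2}{2}
```

The three simplifications (zero gradient, diagonal Hessian, quadratic error surface) are the assumptions the later slides refer to.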

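Computing the 2nd Derivatives
The slide gave three equations (the network expression, the diagonals of the Hessian, and the second derivatives); the equation images are not preserved in this transcript.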
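For reference, a sketch of what those three expressions look like in the OBD formulation, derived here by the chain rule under the diagonal approximation. The notation (unit state x_i, total input a_i, squashing function f, the set V_k of connections sharing parameter u_k, targets d_i) is assumed and may not match the slide exactly.

```latex
% Network expressed as:
x_i = f(a_i), \qquad a_i = \sum_j w_{ij} \, x_j

% Diagonals of the Hessian (summing over connections that share parameter u_k):
h_{kk} = \sum_{(i,j) \in V_k} \frac{\partial^2 E}{\partial w_{ij}^2}

% Second derivatives, back-propagated layer by layer:
\frac{\partial^2 E}{\partial w_{ij}^2} = \frac{\partial^2 E}{\partial a_i^2} \, x_j^2

\frac{\partial^2 E}{\partial a_i^2}
  = f'(a_i)^2 \sum_l w_{li}^2 \, \frac{\partial^2 E}{\partial a_l^2}
  + f''(a_i) \, \frac{\partial E}{\partial x_i}

% Boundary condition at the output layer, assuming E = \sum_i (d_i - x_i)^2:
\frac{\partial^2 E}{\partial a_i^2} = 2 f'(a_i)^2 - 2 (d_i - x_i) \, f''(a_i)
```

The recursion has the same shape as ordinary back-propagation, so computing the diagonal costs about as much as one gradient pass.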

The Recipe
1. Train the network until a local minimum is obtained
2. Compute the second derivatives for each parameter
3. Compute the saliencies
4. Delete the low-saliency parameters
5. Iterate
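The recipe can be sketched in a few lines of NumPy. This is an illustration under simplifying assumptions, not the paper's setup: the "network" is a single linear unit with error E = mean((Xw - y)^2) / 2, so the Hessian diagonal is exactly mean(x_k^2), and "training" is a least-squares fit. The helper names and the synthetic data are hypothetical.

```python
import numpy as np

def obd_saliencies(weights, hessian_diag):
    """OBD saliency s_k = h_kk * w_k**2 / 2 for each parameter."""
    return 0.5 * hessian_diag * weights ** 2

def prune_lowest(weights, saliencies, mask, n_prune):
    """Zero and freeze the n_prune still-active weights with the lowest saliency."""
    active = np.flatnonzero(mask)
    order = active[np.argsort(saliencies[active])]
    new_mask = mask.copy()
    new_mask[order[:n_prune]] = False
    return weights * new_mask, new_mask

def fit_masked(X, y, mask):
    """Stand-in for (re)training: least-squares fit over the active inputs only."""
    w = np.zeros(X.shape[1])
    if mask.any():
        w[mask] = np.linalg.lstsq(X[:, mask], y, rcond=None)[0]
    return w

# Synthetic data: only inputs 0, 1 and 4 actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
true_w = np.array([2.0, -1.5, 0.0, 0.0, 0.8, 0.0])
y = X @ true_w + 0.01 * rng.normal(size=200)

mask = np.ones(6, dtype=bool)
w = fit_masked(X, y, mask)                  # step 1: train to a minimum
for _ in range(3):                          # step 5: iterate
    h_diag = np.mean(X ** 2, axis=0)        # step 2: second derivatives (exact here)
    s = obd_saliencies(w, h_diag)           # step 3: saliencies
    w, mask = prune_lowest(w, s, mask, 1)   # step 4: delete lowest-saliency weight
    w = fit_masked(X, y, mask)              # retrain with the deleted weights frozen
```

For a real multilayer network the Hessian diagonal would come from the back-propagated second derivatives instead of the closed form used here.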

Results
[Figure: results of OBD compared to magnitude-based damage]

Results Continued
[Figure: comparison of MSE with retraining versus without retraining]

LeCun’s Conclusions
Optimal Brain Damage reduced the number of parameters by up to a factor of four, and general recognition accuracy increased.
OBD can be used either as an automatic pruning tool or as an interactive one.

Babak Hassibi: Return of LeCun
Several problems arise from LeCun’s simplifying assumptions.
For smaller networks, OBD can choose the wrong parameter to delete.
It is possible to calculate the inverse of the full Hessian recursively, yielding a more accurate approximation than the diagonal one.
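As a hedged summary of the Optimal Brain Surgeon formulation (taken from the OBS paper, not from this slide), the saliency of weight w_q, the accompanying adjustment to all weights, and the recursion for the inverse Hessian over P training patterns can be written as follows. Here e_q is the unit vector selecting weight q, X_m is the derivative vector for pattern m, and α is a small positive constant.

```latex
% Saliency of deleting weight q, using the full inverse Hessian:
L_q = \frac{w_q^2}{2 \, [\mathbf{H}^{-1}]_{qq}}

% Adjustment applied to all weights when weight q is deleted:
\delta \mathbf{w} = - \frac{w_q}{[\mathbf{H}^{-1}]_{qq}} \, \mathbf{H}^{-1} \mathbf{e}_q

% Recursive computation of the inverse Hessian, one training pattern at a time:
\mathbf{H}^{-1}_{m+1} = \mathbf{H}^{-1}_{m}
  - \frac{\mathbf{H}^{-1}_{m} \mathbf{X}_{m+1} \mathbf{X}_{m+1}^{T} \mathbf{H}^{-1}_{m}}
         {P + \mathbf{X}_{m+1}^{T} \mathbf{H}^{-1}_{m} \mathbf{X}_{m+1}},
\qquad \mathbf{H}^{-1}_{0} = \alpha^{-1} \mathbf{I}
```

Unlike OBD, the update δw changes the surviving weights at the moment of deletion, which reduces the need for retraining after each removal.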

**Insert Math Here** (I have no idea what I’m talking about)
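One way to make the missing math concrete is a single OBS step on a toy problem. This sketch assumes a linear unit with error E = mean((Xw - y)^2) / 2, so the Hessian X^T X / N is exact and can be inverted directly instead of built recursively. The function name `obs_step` and the data are illustrative.

```python
import numpy as np

def obs_step(w, H_inv):
    """One OBS step: delete the lowest-saliency weight and adjust the others.

    Handles a single deletion only; already-deleted weights are not tracked.
    """
    saliency = w ** 2 / (2.0 * np.diag(H_inv))      # L_q = w_q^2 / (2 [H^-1]_qq)
    q = int(np.argmin(saliency))
    delta_w = -(w[q] / H_inv[q, q]) * H_inv[:, q]   # compensating update for all weights
    w_new = w + delta_w
    w_new[q] = 0.0   # already zero up to rounding; set exactly
    return w_new, q, saliency[q]

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X @ np.array([1.0, -2.0, 0.05, 3.0]) + 0.01 * rng.normal(size=300)

w = np.linalg.lstsq(X, y, rcond=None)[0]   # trained to the minimum of E
H_inv = np.linalg.inv(X.T @ X / len(X))    # exact inverse Hessian for this E
w_pruned, q, L_q = obs_step(w, H_inv)

# Because E is exactly quadratic here, the OBS update should match
# retraining from scratch with input q removed.
keep = np.arange(4) != q
refit = np.zeros(4)
refit[keep] = np.linalg.lstsq(X[:, keep], y, rcond=None)[0]
```

On a real network the error is only approximately quadratic around the minimum, so the adjusted weights are an approximation and some retraining may still help.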

The MONK’s Problems
A set of problems that involve classifying artificial robots based on six discrete-valued attributes.
Binary decision problems, e.g. (head_shape = body_shape).
The study was performed in 1991; back-propagation with weight decay (BPWD) was found to be the most accurate solution at the time.

Results: Hassibi Wins
Problem   Training      # weights (BPWD)   # weights (OBS)
MONK1     100           58                 14
MONK2     100           39                 15
MONK3     93.4  97.2    39                 4

References
Le Cun, Yann. “Optimal Brain Damage”. AT&T Bell Laboratories, 1990.
Hassibi, Babak, and Stork, David. “Optimal Brain Surgeon and General Network Pruning”. Ricoh California Research Center, 1993.
Thrun, S.B. “The MONK’s Problems”. CMU, 1991.

Questions? (Brain Background Courtesy Brainburst.com)
