cs638/838 - Spring 2017 (Shavlik©), Week 10 - 3/28/17

Today's Topics
- Some Lab 3 comments
- Talk by Akshay Sood on recurrent ANNs, LSTMs, etc.
- Talk by Luisa Polania Cabrera of American Family Insurance on some of their deep-ML projects
- One-on-one Q&A on Lab 3, projects, etc.
- I have run the turned-in Lab 3 code - don't redefine Vector!
- Project report comments emailed; lots on RL!
- April 18: Google-Madison talk on TPUs
Sample, Latest CNN Results (32x32 images, batch size = 10)

ADAM?  drHU/drIn  TrainErr@Epoch  TestErr@Epoch  ExtraTrain  #FlatHUs
yes    0.00/0.00    0@20           18@20          18,636       256
yes    0.00/0.05  177@40           19@13          28,509       128
yes    0.50/0.05  113@80           20@74          18,568       256
no     0.00/0.00  410@30           22@25          28,460       256
no     0.00/0.00    0@160          25@159              0        64

- ADAM worked well
- Extra examples helped a good deal
- Convolution kernels of 4x4 worked better than 5x5 (3x3 in between)
- 'Zero padding' worked about the same as without it
- Using 10 plates instead of 20 worked OK
- Better test-set accuracy than I expected, given the small dataset!
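The kernel-size and zero-padding observations above follow from the usual convolution output-size arithmetic. A minimal sketch (plain Python; the function name is mine, not from the lab code):

```python
def conv_output_size(in_size, kernel, padding=0, stride=1):
    """Spatial size of a convolution's output along one dimension."""
    return (in_size - kernel + 2 * padding) // stride + 1

# 32x32 input, 4x4 kernel, no padding -> 29x29 feature map
print(conv_output_size(32, 4))             # 29
# 'Zero padding' of 2 with a 5x5 kernel keeps the 32x32 size
print(conv_output_size(32, 5, padding=2))  # 32
```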
Ensembles
- Ensembles often greatly increase accuracy
- Combining all models with 25 or fewer test-set errors led to 12 errors!
- But this is a cheat! Why?
- Correcting the cheat led to 19 errors
- Key question: how do we pick the N best models?
- Don't forget about ensembles!
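To be concrete about combining models, here is a minimal majority-vote sketch (my own illustration, not the lab's API). Note the cheat above arises when the member models are selected by their TEST-set errors; selecting on the tune set avoids it.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: per-model predicted labels for ONE example.
    Returns the most common label (ties broken arbitrarily)."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(models_outputs):
    """models_outputs: list (one entry per model) of per-example label lists."""
    return [majority_vote(labels) for labels in zip(*models_outputs)]

# Three models vote on four examples; voting fixes each single-model mistake.
m1 = ["watch", "piano", "flower", "starfish"]
m2 = ["watch", "piano", "flower", "watch"]
m3 = ["piano", "piano", "flower", "starfish"]
print(ensemble_predict([m1, m2, m3]))
# ['watch', 'piano', 'flower', 'starfish']
```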
Thought These are FLOWERS
Thought These are AIRPLANES
- Some white borders!
- Not centered
More Ensemble Errors
- NOT butterflies
- NOT pianos
- NOT starfish
- NOT a watch
Tune-set vs. Test-set Accuracies
- We'd like to threshold the Y axis, but we need to threshold the X!
(Scatter plot: Y = test-set errors, X = tune-set errors)
Some Lab 3 Report Comments
- Two senses of 'learning curve' (see original Lab 3 slides)
- CURVES are better than TABLES!
- Some learning curves are STEEP (next slide); this suggests value in getting more original images
- Dropout worked for some, not for others
- Generating 'perturbed' examples greatly helps
- We really should replicate runs to get 'error bars' (i.e., different random seeds)
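A minimal sketch of generating 'perturbed' training examples, assuming images are 2-D lists of pixel values (the helper names are mine, not the lab's; real augmentation would also include rotations, noise, etc.):

```python
def shift_right(image, fill=0):
    """Copy of a 2-D image shifted one pixel right, padding with `fill`."""
    return [[fill] + row[:-1] for row in image]

def flip_horizontal(image):
    """Mirror a 2-D image left-to-right."""
    return [row[::-1] for row in image]

img = [[1, 2, 3],
       [4, 5, 6]]
print(shift_right(img))      # [[0, 1, 2], [0, 4, 5]]
print(flip_horizontal(img))  # [[3, 2, 1], [6, 5, 4]]
```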
An Encouraging Learning Curve!
My Learning Curve (used the top-10 TUNE-set models)
(Plot: Y = test-set errors, X = number of (original) training examples)
What Action is This?
Impact of Random Seed
- Be careful to avoid 'cherry picking'!
- Avoid 'peeking at the test set' while making decisions!
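To quantify seed sensitivity, report a mean and standard deviation over several runs rather than a single (possibly cherry-picked) number. A quick sketch with made-up per-seed error counts:

```python
import statistics

def summarize_runs(test_errors):
    """Mean and sample standard deviation of per-seed test errors."""
    return statistics.mean(test_errors), statistics.stdev(test_errors)

# Hypothetical test-error counts from five random seeds.
errors = [18, 22, 19, 25, 20]
mean, sd = summarize_runs(errors)
print(f"{mean:.1f} +/- {sd:.1f} errors")  # 20.8 +/- 2.8 errors
```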
A Nice Overfitting Curve (from Lab 2)
(Plot: Y = error, X = epoch)
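An overfitting curve like this is why we stop at the epoch with the lowest TUNE-set error and then report the TEST-set error from that same epoch, rather than choosing the epoch on the test set. A sketch with hypothetical per-epoch error lists (the helper name is mine):

```python
def pick_epoch(tune_errors, test_errors):
    """Choose the epoch minimizing tune-set error; report its test error."""
    best = min(range(len(tune_errors)), key=lambda e: tune_errors[e])
    return best, test_errors[best]

# Made-up errors: tune error bottoms out at epoch 2, then overfitting sets in.
tune = [40, 25, 18, 20, 24]
test = [42, 28, 21, 23, 27]
print(pick_epoch(tune, test))  # (2, 21)
```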
Some Lab 3 Report Comments (2)
- Learning curves should start at 0 or at the accuracy of 'always guessing the most common output'
- Some initial weights should be negative
- Too many plates might overfit?
- A third CONVOLUTION layer probably hurts
- WATCH was predicted a lot because it is the most common class, probably not because it was LAST (reorder the enum to check!)
- 'Rotated by 180' has three possible meanings!
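On the three possible meanings of 'rotated by 180': one plausible reading (my interpretation, not stated on the slide) is that a true 180-degree rotation reverses both the row order and each row, which differs from a horizontal flip or a vertical flip alone. A small sketch:

```python
def rotate_180(image):
    """True 180-degree rotation: reverse the row order AND each row."""
    return [row[::-1] for row in image[::-1]]

def flip_vertical(image):
    """Reverse row order only (upside down, not mirrored)."""
    return image[::-1]

def flip_horizontal(image):
    """Mirror each row only (left-right flip)."""
    return [row[::-1] for row in image]

img = [[1, 2],
       [3, 4]]
print(rotate_180(img))       # [[4, 3], [2, 1]]
print(flip_vertical(img))    # [[3, 4], [1, 2]]
print(flip_horizontal(img))  # [[2, 1], [4, 3]]
```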