Using Neural Networks to Determine NFL Game Outcomes
Presented by Alex Dixon
Motivation
- NFL games are enjoyed by 264 million people each week.
- Millions of dollars are bet on the NFL each week.
- Experts try to predict the winners every week.
- Numerous programs already exist to predict the winner of a game.
Accuracy of Current Prediction Methods
NFL Analysts
- The very best analysts picked around 64% of games correctly.
- This year, the average accuracy of the ten best analysts is 65.3%.
Computer Models
- Last year, Microsoft’s “Cortana” finished with a 161-95 record (63 percent), which was still a few games ahead of the Las Vegas oddsmakers.
- This season, through Week 13 [2]:
  - Nate Silver’s “Elo” predictor is 63% accurate.
  - Microsoft’s “Cortana” is 66% accurate.
Feature Vectors & Data Collection
Each feature vector has 9 features, extracted from www.pro-football-reference.com:
- Home/Away
- Differential statistics:
  - Points For
  - Points Allowed
  - Yards For
  - Yards Allowed
  - First Downs For
  - First Downs Allowed
  - Turnovers For
  - Turnovers Against
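One plausible reading of "differential statistics" is a team's season average minus its opponent's for each statistic, with Home/Away as a 0/1 flag. The sketch below assumes that reading. The key names, the `feature_vector` helper, and the sample numbers are hypothetical and are not taken from pro-football-reference.com.

```python
# Hypothetical key names for the eight per-game season averages.
STAT_KEYS = [
    "points_for", "points_allowed",
    "yards_for", "yards_allowed",
    "first_downs_for", "first_downs_allowed",
    "turnovers_for", "turnovers_against",
]

def feature_vector(team, opponent, is_home):
    """Build a 9-element vector: a home/away flag plus eight differentials.

    Assumption: each differential is the team's average minus the opponent's.
    """
    flag = 1.0 if is_home else 0.0
    return [flag] + [team[k] - opponent[k] for k in STAT_KEYS]

# Illustrative numbers only.
team = {
    "points_for": 24.5, "points_allowed": 21.0,
    "yards_for": 350.0, "yards_allowed": 330.0,
    "first_downs_for": 20.0, "first_downs_allowed": 19.0,
    "turnovers_for": 1.5, "turnovers_against": 1.0,
}
opponent = {
    "points_for": 21.0, "points_allowed": 23.5,
    "yards_for": 340.0, "yards_allowed": 345.0,
    "first_downs_for": 19.5, "first_downs_allowed": 20.5,
    "turnovers_for": 1.0, "turnovers_against": 1.5,
}

vec = feature_vector(team, opponent, is_home=True)
print(len(vec), vec)
```

Building the vector from one team's perspective means the same game seen from the opponent's side has every differential negated and the flag flipped.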
Neural Network Choice
Multilayer Perceptron
- Can process a large amount of data
- Good choice for modeling nonlinear relationships in data
Multilayer Perceptron Configuration
- 9 inputs
- 5 hidden nodes
- 1 output (0 or 1)
- Sigmoidal activation function
- Trained with backpropagation
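The configuration above can be sketched in plain Python as a 9-5-1 network with sigmoid units, trained by backpropagation on squared error. This is not the presenter's code. The learning rate, weight initialization range, random seed, and the 0.5 decision threshold are all assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class MLP:
    """One hidden layer, one sigmoid output, trained by backpropagation."""

    def __init__(self, n_in=9, n_hidden=5, seed=0):
        rng = random.Random(seed)  # assumed seed, for repeatability only
        # One row per hidden node; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        # Output weights, one per hidden node, plus a trailing bias.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # zip stops at len(x), so the trailing bias is added separately.
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden))
                      + self.w_out[-1])
        return hidden, out

    def train_step(self, x, target, lr=0.1):
        """One gradient step on 0.5 * (out - target)^2; returns squared error."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas are computed before the output weights change.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x):
                row[i] -= lr * delta_hidden[j] * v
            row[-1] -= lr * delta_hidden[j]
        return (out - target) ** 2

    def predict(self, x):
        # Assumed rule: an output of 0.5 or more counts as class 1.
        return 1 if self.forward(x)[1] >= 0.5 else 0

net = MLP()
sample = [1.0] + [0.5] * 8   # illustrative 9-feature input
for _ in range(200):
    net.train_step(sample, 1.0)
print(net.predict(sample))
```

Changing `n_in` and `n_hidden` gives other layer sizes without touching the training code.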
Original Multilayer Perceptron Cross Validation
Each cross-validation run is for 50 epochs.

Data Set   Model MSE   Correct/Total   % Correctly Predicted
Set 1      .09878      138/240         57.5%
Set 2      .1001       144/240         60.0%
Set 3      .1006       129/240         53.8%
Set 4      .0981       130/240         54.2%
TOTAL      .0994       541/960         56.35%
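The TOTAL row can be reproduced from the four per-set rows. The sketch below assumes the total MSE is the unweighted mean of the set MSEs, which matches the reported figure. The `summarize` helper is a name chosen for illustration.

```python
# (MSE, correct, total) for Sets 1-4, copied from the table above.
FOLDS = [
    (0.09878, 138, 240),
    (0.1001, 144, 240),
    (0.1006, 129, 240),
    (0.0981, 130, 240),
]

def summarize(folds):
    """Return (mean MSE, games correct, games total, accuracy in percent)."""
    mean_mse = sum(mse for mse, _, _ in folds) / len(folds)
    correct = sum(c for _, c, _ in folds)
    total = sum(t for _, _, t in folds)
    return mean_mse, correct, total, 100.0 * correct / total

mean_mse, correct, total, pct = summarize(FOLDS)
print(f"{mean_mse:.4f} {correct}/{total} {pct:.2f}%")
```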
Feature Vector Elimination Based upon Correlation Coefficients

Feature               Correlation Coefficient   Remain in Feature Vector?
Home/Away              0.1211                   Yes
Points For             0.1297
Points Allowed        -0.1628
Yards For              0.0722                   No
Yards Allowed         -0.0384
First Downs For        0.0656
First Downs Allowed   -0.0021
Turnovers Committed   -0.1810
Turnovers Allowed      0.0845
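The slide does not state the rule used to decide which features remain. The sketch below assumes the five features with the largest absolute correlation were kept. That assumption agrees with Home/Away being marked "Yes", Yards For being marked "No", and five features being carried into the final model. The `pearson` function shows how such a coefficient could be computed. The coefficients in `REPORTED_R` are the ones reported above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Correlation of each feature with the game outcome, as reported on the slide.
REPORTED_R = {
    "Home/Away": 0.1211,
    "Points For": 0.1297,
    "Points Allowed": -0.1628,
    "Yards For": 0.0722,
    "Yards Allowed": -0.0384,
    "First Downs For": 0.0656,
    "First Downs Allowed": -0.0021,
    "Turnovers Committed": -0.1810,
    "Turnovers Allowed": 0.0845,
}

def strongest(coeffs, keep):
    """Names of the `keep` features with the largest |r| (assumed rule)."""
    ranked = sorted(coeffs, key=lambda name: abs(coeffs[name]), reverse=True)
    return ranked[:keep]

kept = strongest(REPORTED_R, keep=5)
print(kept)
```

Ranking by absolute value matters here because strongly negative coefficients, such as Turnovers Committed, carry as much signal as positive ones.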
Feature Vector Elimination Cross Validation
Each cross-validation run is for 50 epochs.

Data Set   Model MSE   Correct/Total   % Correct
Set 1      .1103       143/240         59.6%
Set 2      .1142       152/240         63.3%
Set 3      .1114       131/240         54.6%
Set 4      .1111       140/240         58.3%
TOTAL      .1118       566/960         59.0%
LAST       .0994       541/960         56.35%
CHANGE     +.0124      +25/960         +2.6%

Accuracy improved by 2.6 percentage points over the original model, although the MSE also rose.
The Final Model
- A “Winning Percentage” feature was calculated and extracted.
- Its correlation coefficient is .2266, much higher than that of any other feature.
- This feature was then added to the 5 retained in the previous model.
Final Multilayer Perceptron
- 6 inputs
- 3 hidden nodes
- Sigmoidal activation function
Final Model Cross Validation
Each cross-validation run is for 50 epochs.

Data Set   Model MSE   Correct/Total   % Correct
Set 1      .1067       134/240         55.8%
Set 2      .1085       154/240         64.2%
Set 3      .1095       143/240         59.6%
Set 4      .1087       136/240         56.7%
TOTAL      .1084       567/960         59.1%
Results of Final Model on the 2017 NFL Season So Far
- The predictor correctly determined the outcome of 110 of the 175 games played so far this season (Week 2 – Week 13).
- 62.9% correct
  - About 0.1 percentage points below Nate Silver’s “Elo” predictor
  - About 3.1 percentage points below Microsoft’s “Cortana”
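The accuracy and the gaps to the two benchmarks can be recomputed from the raw counts. The benchmark values below are the rounded percentages quoted earlier in the slides, so the gaps are approximate. The `accuracy_pct` helper is a name chosen for illustration.

```python
def accuracy_pct(correct, total):
    """Share of games predicted correctly, in percent."""
    return 100.0 * correct / total

model_pct = accuracy_pct(110, 175)

# Rounded season-to-date figures quoted for the two benchmark predictors.
BENCHMARKS = {"Elo": 63.0, "Cortana": 66.0}

# Gap in percentage points; a positive gap means the benchmark is ahead.
gaps = {name: pct - model_pct for name, pct in BENCHMARKS.items()}

print(f"model: {model_pct:.1f}%")
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f} points")
```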
Possible Reasons for Error
- The model relies on past statistics averaged over the season.
- The model does not take into account a team's current state:
  - Injuries to key players
  - Win/loss streaks
  - Teams that clinch a playoff spot often bench their starters
- Averages are unstable at the start of the season, when few games have been played.
References
[1] http://thesportsquotient.com/nfl/2015/1/12/the-accuracy-of-nfl-experts
[2] http://www.businessinsider.com/nfl-picks-microsoft-cortana-elo-week-13-2017-11
[3] https://www.pro-football-reference.com/