Football for KMS: NFL ‘01. Abhijit Kumar, Kaijia Bao, Vishal Rupani. April 30th, 2008. Course Instructor: Prof. Hsinchun Chen.


1 Football for KMS: NFL ‘01. Abhijit Kumar, Kaijia Bao, Vishal Rupani. April 30th, 2008. Course Instructor: Prof. Hsinchun Chen

2 Agenda
Tasks: Data Collection; Client Relations; Final Presentation; Knowledge Discovery; Statistical Analysis; Data Mining Techniques; Key Findings; KMS Demonstration
Presenters: Abhi, Vishal, Kai
Sections: Objectives; Literature Overview; Conclusion; Data Cleaning; Statistical Analysis; Final Paper; Data Import; Data Transformation; Data Mining

3 Research Objectives
- Pattern identification
- Descriptive Statistics
- Data Mining Techniques
- Prediction
- Developing a strategy
- Fantasy League

4 Literature Overview
- Moneyball: The Art of Winning an Unfair Game, by Michael Lewis
- Las Vegas Odds: www.VegasInsider.com
- NFL Fantasy League: www.Nfl.com/fantasy

5 Knowledge Discovery Process
DATA (SQL 2005 IS)
- Pro-Football: 3 tables, 40 columns, 82,346 rows
- Lisa Ordonez: 1 table, 90 columns, 50,417 rows
TRANSFORMATION (SQL 2005 AS)
- Dependent Variables: Play Decision, Intended Player, Play Direction, Yards
- Calculated Variables: GameNum, IsPlayChal, PlayZone, TotalOffTO, PlayDecision, QtrTimeLeft, HalfTimeLeft, GameTimeLeft
- Independent Variables: Defense, Down, GAP, Halftime Left, Off Ydl, Offense, Play Zone, QTR, ToGo, Total Off TO
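The slides name the calculated time columns (QtrTimeLeft, HalfTimeLeft, GameTimeLeft) but do not show how they were derived. A minimal sketch of one plausible derivation follows, assuming the clock is stored as seconds left in the current quarter and that only the four regulation quarters matter; the function name `time_left` and the seconds-based representation are assumptions, not taken from the presentation.

```python
# Hypothetical helper: the slides list these column names but not their formulas.
QUARTER_SECONDS = 15 * 60  # an NFL regulation quarter lasts 15 minutes


def time_left(qtr: int, qtr_time_left: int) -> dict:
    """Derive half and game time remaining (seconds) for regulation quarters 1-4."""
    if not 1 <= qtr <= 4:
        raise ValueError("this sketch only handles regulation quarters 1-4")
    if not 0 <= qtr_time_left <= QUARTER_SECONDS:
        raise ValueError("qtr_time_left must be between 0 and 900 seconds")
    quarters_left_in_half = qtr % 2   # 1 after Q1 or Q3, 0 after Q2 or Q4
    quarters_left_in_game = 4 - qtr   # full quarters still to be played
    return {
        "QtrTimeLeft": qtr_time_left,
        "HalfTimeLeft": qtr_time_left + quarters_left_in_half * QUARTER_SECONDS,
        "GameTimeLeft": qtr_time_left + quarters_left_in_game * QUARTER_SECONDS,
    }


# Two minutes left in the second quarter.
print(time_left(2, 120))
```

Keeping all three columns lets a model react separately to end-of-quarter, end-of-half, and end-of-game situations, which is presumably why the deck lists them side by side.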

6 Knowledge Discovery Process
DATA (SQL 2005 IS)
- Pro-Football: 3 tables, 40 columns, 82,346 rows
- Lisa Ordonez: 1 table, 90 columns, 53,000 rows
TRANSFORMATION (SQL 2005 AS)
- Dependent Variables
- Calculated Variables
- Independent Variables
PROCESSING (MS Excel 2007)
- Simple Statistics: Play Decision, Intended Player, Play Direction, Yards
MINING (SQL 2005 AS)
- Models: ID3, Neural Networks
- Accuracy: Lift Charts, Classification Matrix

7 Dependency Network

8

9 Intended Player: Statistics
Top 3 intended players for passes, for the 4 teams that played in the semi-finals
- Steelers: H.Ward (142), P.Burress (121), B.Shaw (44)
- Patriots: T.Brown (143), D.Patten (93), M.Edwards (39)
- Rams: T.Holt (133), M.Faulk (104), I.Bruce (103)
- Eagles: J.Thrash (107), D.Staley (89), T.Pinkston (83)

10 Play Direction: Statistics
- Direction of rushes for all plays in the 2001 season
[Diagram labels: Left End, Left Tackle, Left Guard, Middle, Right Guard, Right Tackle, Right End]

11 Play Direction: Statistics
- Direction of rushes for all plays in the 2001 season
[Chart axes: Number of Rushes, Direction]

12 Yardage: Statistics
- Yardage during each down for Pass and Rush
[Chart axes: Yards To Go, Average Yards Covered; series: Passes, Rushes]
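The chart itself did not survive in the transcript, but the aggregation behind it can be sketched: average yards gained, grouped by play decision and yards to go. The example below is illustrative only. The column names PlayDecision, ToGo and Yards come from the variable lists on the earlier slides; the function name `average_yards` and the four sample rows are made up for the demonstration and are not data from the 2001 season.

```python
from collections import defaultdict


def average_yards(plays, keys=("PlayDecision", "ToGo")):
    """Average the Yards column within each group defined by `keys`."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of yards, play count]
    for play in plays:
        group = tuple(play[k] for k in keys)
        totals[group][0] += play["Yards"]
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Made-up rows, for illustration only.
plays = [
    {"PlayDecision": "Pass", "ToGo": 10, "Yards": 12},
    {"PlayDecision": "Pass", "ToGo": 10, "Yards": 0},
    {"PlayDecision": "Rush", "ToGo": 10, "Yards": 4},
    {"PlayDecision": "Rush", "ToGo": 1, "Yards": 2},
]

for group, avg in sorted(average_yards(plays).items()):
    print(group, avg)
```

Passing `keys=("PlayDecision", "Down")` instead would group by down, which is the other reading of the slide title.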

13 Play Decision: Statistics
- Play decisions for the 4 teams that played in the semi-finals
[Chart axes: Number of Decisions, Play Decision Type]

14 Play Decision: Analysis Overview
- Discovery of what environmental and/or game factors affect play decision
- Discovery of football expert knowledge through data mining
- Prediction of play decisions based on game factors

15 Play Decision: ID3 Analysis
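The tree shown on this slide is not preserved in the transcript. As a reminder of how ID3 ranks candidate attributes, the sketch below computes information gain, the criterion ID3 uses to pick each split. It is a generic illustration, not the Analysis Services model the authors ran; the toy rows are invented, and only the column names Down, QTR and PlayDecision are borrowed from the slides.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction in `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Invented rows: here Down fully determines the decision and QTR says nothing.
rows = [
    {"Down": 1, "QTR": 1, "PlayDecision": "Rush"},
    {"Down": 1, "QTR": 2, "PlayDecision": "Rush"},
    {"Down": 3, "QTR": 1, "PlayDecision": "Pass"},
    {"Down": 3, "QTR": 2, "PlayDecision": "Pass"},
    {"Down": 4, "QTR": 1, "PlayDecision": "Punt"},
    {"Down": 4, "QTR": 2, "PlayDecision": "Punt"},
]

# ID3 splits first on the attribute with the highest gain.
best = max(["Down", "QTR"], key=lambda a: information_gain(rows, a, "PlayDecision"))
print(best)
```

Applied recursively to each resulting subset, this selection rule produces the kind of decision tree from which factor rankings such as "Down, Off Ydl, Time" can be read.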

16

17 Play Decision: Accuracy

18 Rush Accuracy: Lift Chart

19 Field Goal Accuracy: Lift Chart
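The lift charts on these two slides are images that did not survive extraction. A rough sketch of the quantity such a chart plots: rank plays by the model's predicted probability for the target decision, take the top fraction, and measure what share of the actual target plays it captures. The function names and the ten scored plays below are hypothetical and are not output from the authors' models.

```python
def cumulative_gain(scored, fraction):
    """Share of all target plays found in the top `fraction` of plays by score.

    `scored` is a list of (predicted_probability, is_target) pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    total_targets = sum(1 for _, hit in ranked if hit)
    if total_targets == 0:
        raise ValueError("no target plays in the data")
    cutoff = round(len(ranked) * fraction)
    captured = sum(1 for _, hit in ranked[:cutoff] if hit)
    return captured / total_targets


def lift(scored, fraction):
    """Gain relative to a random ordering, which would capture `fraction` of targets."""
    return cumulative_gain(scored, fraction) / fraction


# Hypothetical scores for ten plays; True marks plays that really were the target decision.
scored = [
    (0.90, True), (0.80, True), (0.70, False), (0.60, True), (0.50, False),
    (0.40, False), (0.30, True), (0.20, False), (0.10, False), (0.05, False),
]

print(cumulative_gain(scored, 0.2), lift(scored, 0.2))
```

A model whose curve sits well above the diagonal (lift greater than 1) at small fractions is separating the target decision from the rest; a curve on the diagonal is no better than guessing.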

20 Play Decision: Classification Matrix
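The matrix on this slide is not preserved. As an illustration of what a classification matrix reports, the sketch below tabulates actual against predicted play decisions and derives overall accuracy and per-decision recall. The labels and predictions are invented, and the orientation (rows for actual, columns for predicted) is an assumption that may differ from the tool the authors used.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Return (labels, matrix) with rows = actual class, columns = predicted class."""
    counts = Counter(zip(actual, predicted))
    labels = sorted(set(actual) | set(predicted))
    return labels, [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(actual, predicted):
    """Fraction of plays whose predicted decision matches the actual one."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def recall_by_class(actual, predicted):
    """For each actual decision, the fraction of its plays predicted correctly."""
    labels, matrix = classification_matrix(actual, predicted)
    return {label: row[i] / sum(row)
            for i, (label, row) in enumerate(zip(labels, matrix)) if sum(row)}


# Invented example: six plays and a model's guesses.
actual = ["Pass", "Pass", "Rush", "Rush", "Punt", "Pass"]
predicted = ["Pass", "Rush", "Rush", "Rush", "Punt", "Pass"]

labels, matrix = classification_matrix(actual, predicted)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

Per-decision recall makes it visible when a model predicts one decision well and another poorly, even if overall accuracy looks acceptable.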

21 Play Decision: Key Findings
- Football strategy can be discovered from data, instead of from knowledge experts
- Top 3 factors affecting decision: Down, Off Ydl, Time
- Model accuracy differs depending on the decision we are trying to predict
- Team-specific strategies may be discovered with more data

22 Play Direction: Analysis Overview
- Discover each team's strengths and weaknesses in defense and/or offense
- Prediction of play directions based on game factors
[Diagram labels: Left End, Left Tackle, Left Guard, Middle, Right Guard, Right Tackle, Right End]

23 Play Direction: Accuracy

24 Play Direction: Key Findings (ID3)

25 Intended Player: Analysis Overview
- Discover each team’s favored recipient of a pass
- Prediction of intended player based on game factors

26 Intended Player: Lift Chart

27 Intended Player: Key Findings
- There are 400+ intended players
- Not enough data to accurately predict intended players
- Not enough data to gain knowledge over statistical models

28 Conclusions
- PLAY DECISION: Accurate; Gained knowledge
- PLAY DIRECTION: Less accurate; Enough data to gain knowledge
- INTENDED PLAYERS: Insufficient data; No knowledge gained; Need to increase sample size

29 Future Direction
- Increase sample set: more instances of different scenarios
- Incorporate additional information: Pro-football-Reference.com, VegasInsider.com (odds for favorites)
- Extend analysis: nested case (historical performance)

30 References
- Prof. Lisa Ordóñez: Professor in Statistics
- Steve Aldrich: Author of Moneyball in Football
- About Football: Glossary of terms

31 Knowledge Discovery Process
DATA (SQL 2005 IS)
- Pro-Football: 3 tables, 40 columns, 82,346 rows
- Lisa Ordonez: 1 table, 90 columns, 53,000 rows
TRANSFORMATION (SQL 2005 AS)
- Dependent Variables
- Calculated Variables
- Independent Variables
PROCESSING (MS Excel 2007)
- Simple Statistics: Play Decision, Intended Player, Play Direction, Yards
MINING (SQL 2005 AS)
- Models: ID3, Neural Networks
- Accuracy: Lift Charts, Classification Matrix

32 Research Objectives; Literature Overview; Knowledge Discovery; Statistics: Intended Player; Statistics: Play Direction; Statistics: Yardage; Statistics: Play Decision; Accuracy: Lift Charts; Analysis: Play Decision; Analysis: Play Direction; Analysis: Intended Player; Conclusions; Future Directions; System Design

33 Backup Slide Section

34 Data Collection
- Sources: Football Outsiders, Pro-Football
- Initial Dataset: 55,000 rows, 90 columns
- Cleaning (Hierarchy, Relevance): 47,033 rows, 30 columns
- Processing (Dependent, Independent, Calculated): Dependent 4, Independent 10, Calculated 9
- Analysis

35 System Design
[Diagram labels: NFL Season 2001; Football Data; DB; NFL KMS; Model Building; Testing/Accuracy; Pattern Analysis; Formations; Substitutions; Play Decisions; Field Strategy; Defense Strategy; Metrics; Accuracy; Performance]

36 Yards Analysis
- Yards gained on the play is used as a metric to measure effort
- Discover how environmental and/or game factors affect players’ efforts
- Key Findings: Top 4 environmental factors: Off Ydl, Time, Down, Gap

