Huyen Nguyen, Dung Phan, and Girish Shirodkar


Examining factors that influence English Premier Soccer Results Using JMP® Pro 11
Huyen Nguyen, Dung Phan, and Girish Shirodkar
Oklahoma State University, Stillwater, OK 74078

Introduction
Soccer is the most popular sport in the world, with more than 250 million players in over 200 countries. The English Premier League is broadcast in 212 territories to 643 million homes and a TV audience of 4.7 billion. It is therefore of broad interest to determine which attributes drive English Premier League game results. Few concrete studies have explored the factors that influence soccer game results. This study, based on 10 seasons of English Premier League game data, explores those factors from the perspective of the Home Team. JMP® Pro 11 is used for data preparation, data analysis, and predictive modeling.

Predictive Modeling
Predictive models, including a Stepwise Logistic Regression Model, a Forward Logistic Regression Model, a Decision Tree, and a Neural Network, were built, and the competing models were analyzed and compared with each other.

Fig. 1a: Forward Logistic Regression Confusion Matrix
Fig. 1b: Forward Logistic Regression Odds Ratios
Fig. 2a: Neural Network Confusion Matrix
Fig. 2b: Neural Network
Fig. 3a: Decision Tree
Fig. 3b: Decision Tree Confusion Matrix

Data Preparation
The English Premier League games dataset consists of 3680 observations and 23 variables. The target variable, Home Team Results, is derived from two variables: Full Time Home Goal and Full Time Away Goal. It is a binary variable: 0 means the Home Team loses or draws, and 1 means the Home Team wins. The data were consolidated and prepared in JMP® Pro 11 before predictive modeling was applied. Variable selection was performed using domain knowledge and statistical methods, and 21 key variables were selected.
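The target definition given under Data Preparation can be sketched in a few lines of Python. This is only an illustration of the rule as stated (1 for a Home Team win, 0 for a loss or draw). The authors worked in JMP® Pro 11, so the function and parameter names below are hypothetical and not taken from their workflow.

```python
from typing import Iterable, List, Tuple


def home_team_result(full_time_home_goals: int, full_time_away_goals: int) -> int:
    """Return 1 if the Home Team wins, 0 if it loses or draws.

    Hypothetical helper: the parameter names mirror the poster's
    'Full Time Home Goal' and 'Full Time Away Goal' variables.
    """
    return 1 if full_time_home_goals > full_time_away_goals else 0


def label_matches(scores: Iterable[Tuple[int, int]]) -> List[int]:
    """Apply the binary target rule to (home goals, away goals) pairs."""
    return [home_team_result(home, away) for home, away in scores]


if __name__ == "__main__":
    # Made-up scores for illustration only: a home win, a draw, a home loss.
    example_scores = [(2, 1), (0, 0), (1, 3)]
    print(label_matches(example_scores))
```

Folding draws into class 0 keeps the target binary, which is what lets the logistic regression, decision tree, and neural network models be compared on one misclassification rate.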

Model                  Misclassification rate  Generalized R square  AICc     BIC
Logistic Regression 1  22.15%                  48.13%                1056.5   1106.39
Logistic Regression 2  22.88%                  47.56%                1059.76  1099.7
Decision Tree          21.78%                  46.58%                N/A
Neural Network         20.92%                  53.00%

Based on the misclassification rate criterion, the Stepwise Logistic Regression Model outperforms the other models, with a misclassification rate of 22.15%. The model identifies Half Time Home Goal, Half Time Away Goal, Home Team Red Cards, Away Team Red Cards, Home Team Shots, and Away Team Shots as the most important predictors of English Premier League game results. It yields a sensitivity of 86.20% and a specificity of 87.18%.

Fig. 4b: Stepwise Logistic Regression model results
Fig. 4c: Stepwise Logistic Regression model results

Conclusion and Discussion
The Stepwise Logistic Regression Model is selected as the final model. Because its most important predictors include half-time goals alongside red cards and shots for each team, it is feasible to predict game results with high accuracy after the first half of the game.

The effects of the influential factors on game results can be quantified. For each additional goal the Away Team scores by half time, its chance of winning rises by 264%, whereas for each additional goal the Home Team scores, its chance of losing or drawing decreases by only 79.3%. The same pattern is observed in the effects of red cards on the full-time result. If the Home Team receives an additional red card, its chance of losing or drawing goes up by 122%, while the corresponding figure for the Away Team is 32.3%.
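The percentage effects quoted above are the kind of figure usually read off logistic regression odds ratios. The sketch below assumes that each percentage is a percent change in odds, so that a change of p% corresponds to an odds ratio of 1 + p/100 and a coefficient of ln(odds ratio). The poster does not list the underlying odds ratios or coefficients, so the values computed here are implied by that assumption, not reported by the authors. The function names are hypothetical.

```python
import math


def odds_ratio_from_percent_change(percent_change: float) -> float:
    """Odds ratio implied by a percent change in odds (assumed reading)."""
    return 1.0 + percent_change / 100.0


def coefficient_from_odds_ratio(odds_ratio: float) -> float:
    """Logistic regression coefficient that corresponds to an odds ratio."""
    return math.log(odds_ratio)


def percent_change_from_coefficient(coefficient: float) -> float:
    """Percent change in odds for a one-unit increase in the predictor."""
    return (math.exp(coefficient) - 1.0) * 100.0


if __name__ == "__main__":
    # Percentages as quoted in the text; the event each one refers to
    # follows the poster's wording and is not independently verified.
    quoted_effects = {
        "Away Team half-time goal (Away Team wins)": 264.0,
        "Home Team half-time goal (Home Team loses or draws)": -79.3,
        "Home Team red card (Home Team loses or draws)": 122.0,
        "Away Team red card": 32.3,
    }
    for label, percent in quoted_effects.items():
        ratio = odds_ratio_from_percent_change(percent)
        beta = coefficient_from_odds_ratio(ratio)
        print(f"{label}: odds ratio {ratio:.3f}, coefficient {beta:+.3f}")
```

Under this reading, a percentage above 100 simply means the odds more than double. It does not mean a probability above 1, which is why the conversion goes through odds and not through probabilities.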
The differences in how these factors drive game results can be attributed to the influence of playing at home. Although Home Teams have a certain advantage from playing in their own stadium, the quantified effects above suggest that the Home Team is also under more pressure, so the effects of half-time goals and red cards are diluted for the Home Team.

Fig. 4a: Stepwise Logistic Regression ROC

Reference
http://www.football-data.co.uk/englandm.php

Acknowledgements
Dr. Goutam Chakraborty, founder of the SAS and OSU Business Analytics Program at Oklahoma State University, for his continued support and guidance.