Published by Mark Blair. Modified over 9 years ago.
1
NCAA Basketball Ranking With a Neural Network
By Erik O’Connor. Copyright 2000.
2
Outline
Problem; Current Attacks; My Approach; MLP with Back Propagation; Inputs Used; Data Used; Tests Performed; Results; Conclusion; Sources
3
Problem
There are a great many ranking systems, and they all give different results. There is no standard set of inputs, and each system uses a different method to determine rank. Many ranking systems also include human input, which may be biased.
4
Current Attacks
Sagarin Ratings: uses a Bayesian network to sort inputs.
RPI: looks primarily at strength of schedule.
ESPN/USA Today and AP Top 25: both are polls determined by votes.
Teamrankings.com rankings: determined by the probability that a team will win.
There are countless other systems that use various other methods.
5
My Approach
I have based my approach on the RPI, in that I focus on the strength of a team's schedule and how the team performed against it. This way no human input affects the result; only the team's performance counts, and more specifically how well the team performed against quality opponents.
6
Multi-Layer Perceptron with Back Propagation
I chose a multi-layer perceptron for this problem because, over the course of the semester, I felt that it gave the best results and could handle a wide variety of network designs (number of inputs and number of outputs). The number of inputs I used ranged from 10 to 80, with 1 to 8 outputs. The ranges reflect the need for more inputs to improve learning and for more outputs to predict a more specific rank.
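The slides do not show the network code. As a minimal sketch of a multi-layer perceptron trained with back propagation, the following uses one hidden layer of sigmoid units and the smallest configuration mentioned (10 inputs, 1 output). The class name `TinyMLP`, the hidden-layer size, the learning rate, and the made-up training data are all assumptions chosen for illustration, not the author's settings.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyMLP:
    """One hidden layer of sigmoid units, trained by plain backpropagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        out = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0])))
               for row in self.w_out]
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        hidden, out = self.forward(x)
        # Output deltas: error times the sigmoid derivative.
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, out)]
        # Hidden deltas: propagate the output deltas back through w_out.
        d_hidden = [h * (1.0 - h) * sum(d * row[j] for d, row in zip(d_out, self.w_out))
                    for j, h in enumerate(hidden)]
        for row, d in zip(self.w_out, d_out):
            for j, v in enumerate(hidden + [1.0]):
                row[j] += lr * d * v
        for row, d in zip(self.w_hidden, d_hidden):
            for j, v in enumerate(x + [1.0]):
                row[j] += lr * d * v

    def mse(self, samples):
        total = 0.0
        for x, target in samples:
            _, out = self.forward(x)
            total += sum((t - o) ** 2 for t, o in zip(target, out))
        return total / len(samples)

def make_toy_samples(n_samples=20, n_in=10, seed=1):
    """Made-up data (not the real ratings): target is 1 when the first input exceeds 0.5."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        x = [rng.random() for _ in range(n_in)]
        samples.append((x, [1.0 if x[0] > 0.5 else 0.0]))
    return samples

def train(net, samples, epochs=200, lr=0.5):
    # One epoch is one pass over every training sample.
    for _ in range(epochs):
        for x, target in samples:
            net.train_step(x, target, lr)
```

Calling `train(TinyMLP(10, 5, 1), make_toy_samples())` runs 200 epochs on the toy data. The real experiments would replace the toy samples with the ten rating inputs per team.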
7
Inputs Used
The initial set of inputs used for the network: Official RPI, Home RPI, Road RPI, Neutral RPI, Strength of Schedule RPI, Points Rating, Wins Rating, Last 10 Games Ranking, League RPI, and Non-League RPI.
These inputs are described on the following slides.
8
Figuring in the Tournament
One factor that I decided was necessary is the team's performance over its last 10 games. A team may have struggled at the beginning of the season but played well toward the end. More specifically, the only team that can be undefeated over its last 10 games is the one that won the NCAA tournament. Teams that do well in the tournament also benefit from this input.
9
Inputs in Detail
Home, Away, and Neutral inputs are rankings determined by how a team performed at each type of location through the season.
Wins and Points rankings are based on quality of wins, and on how much a team scores and how much it outscores its opponents, respectively.
Strength of schedule is based on the records and quality of wins of a team's opponents.
The official RPI is the index used by the NCAA selection committee.
League and Non-League inputs are based on a team's performance in and out of league play. Teams in tougher leagues benefit from this input.
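The slides do not state how the RPI is computed. The commonly published basic formula weights a team's own winning percentage at 25%, its opponents' winning percentage at 50%, and its opponents' opponents' winning percentage at 25%. The sketch below assumes that formula; the function name `basic_rpi` and the sample numbers are made up for illustration, and the home, road, neutral, league, and non-league variants taken from teamrankings.com may be computed differently.

```python
def basic_rpi(win_pct, opp_win_pct, opp_opp_win_pct):
    """Basic RPI: 25% own record, 50% opponents' record, 25% opponents' opponents' record.

    All three arguments are winning percentages expressed as fractions in [0, 1].
    """
    return 0.25 * win_pct + 0.50 * opp_win_pct + 0.25 * opp_opp_win_pct

# Hypothetical team: wins 80% of its games against opponents that win 60%,
# whose own opponents win 50%.
example = basic_rpi(0.80, 0.60, 0.50)
```

Because half of the weight falls on the opponents' record, the index rewards a strong schedule, which is the property the approach in these slides relies on.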
10
Data
Rankings and ratings from the 1998-1999 and 1999-2000 seasons were used. A random grouping from the 1999-2000 season served as the testing set, and the remaining data from both seasons served as training data. The main ranking from www.teamrankings.com is used as the official output, since all of the inputs were obtained from this site.
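One way to build such a split is sketched below: a random subset of the newer season is held out for testing, and everything else from both seasons goes into training. The function name `split_seasons`, the `test_size` parameter, and the seed are assumptions; the slides do not say how large the test group was or how it was drawn.

```python
import random

def split_seasons(older, newer, test_size, seed=0):
    """Hold out `test_size` random rows of `newer`; train on all remaining rows."""
    rng = random.Random(seed)
    # Pick distinct row positions in the newer season for the test set.
    test_idx = set(rng.sample(range(len(newer)), test_size))
    test = [row for i, row in enumerate(newer) if i in test_idx]
    train = list(older) + [row for i, row in enumerate(newer) if i not in test_idx]
    return train, test
```

Fixing the seed keeps the held-out group the same from run to run, so results from different network designs can be compared on identical test teams.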
11
Tests Done
First test: 10 inputs and 1 output, to see whether the network can learn and predict whether a team is in the top 50%. This was then altered to predict the top 100 teams out of 300+.
I am currently attempting to convert the inputs and outputs to binary, which increases the network to 80-90 inputs and up to 8 outputs.
Once the conversion is complete, possible tests include predicting by groups of 50, which requires 3 binary outputs.
The most difficult test will involve 8 outputs, which will ultimately represent ranks 1-256 for the teams.
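The binary conversion can be sketched as follows. The slides do not give the encoding, so this assumes that rank 1 maps to all zeros (the value `rank - 1` is encoded), that bits are ordered most significant first, and that each of the 10 inputs is an integer rank encoded with a fixed number of bits. Under those assumptions, 8 bits per input gives 80 inputs and 9 bits gives 90, which matches the 80-90 range quoted. All function names are hypothetical.

```python
def to_bits(value, n_bits):
    """Return `value` as a list of 0/1 bits, most significant bit first."""
    if not 0 <= value < 2 ** n_bits:
        raise ValueError("value does not fit in the requested number of bits")
    return [(value >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]

def from_bits(bits):
    """Inverse of to_bits."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

def rank_to_target(rank, n_bits=8):
    """Exact rank 1..256 as 8 output bits (rank 1 becomes all zeros)."""
    return to_bits(rank - 1, n_bits)

def group_to_target(rank, group_size=50, n_bits=3):
    """Group of 50 containing the rank (1-50, 51-100, ...) as 3 output bits."""
    return to_bits((rank - 1) // group_size, n_bits)

def encode_inputs(ranks, n_bits=8):
    """Concatenate the bit patterns of every input rank into one input vector."""
    encoded = []
    for rank in ranks:
        encoded.extend(to_bits(rank - 1, n_bits))
    return encoded
```

Three bits cover up to 8 groups, enough for 300+ teams in groups of 50. Eight bits stop at rank 256, so ranking every team in a field of 300+ would need a ninth bit.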
12
Results
The first test I ran involved 10 inputs and 1 output. The output indicated whether or not a team was in the top 100. I ran this with both seasons as testing and training data and got results that varied between 67% and 88%. Because I trained for 10,000 epochs the results cycled, and at some points during training 97% was obtained.
The second test I ran involved 10 inputs and 3 outputs; this is the first step toward generating more precise results. For this test, results of 47%-60% were obtained.
Currently I am in the process of converting the inputs from decimal to binary. This will result in 80 inputs, which will be used to predict each team down to its exact rank.
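The slides do not say how the percentages were scored. A minimal sketch for the top-100 test, assuming the single network output is thresholded at 0.5 and compared with the label derived from the reference rank, might look like this. The function names, the threshold, and the sample values are assumptions.

```python
def top_n_label(rank, n=100):
    """1 if the reference rank is inside the top n, otherwise 0."""
    return 1 if rank <= n else 0

def accuracy(outputs, true_ranks, n=100, threshold=0.5):
    """Fraction of teams whose thresholded network output matches the top-n label."""
    correct = 0
    for out, rank in zip(outputs, true_ranks):
        predicted = 1 if out >= threshold else 0
        if predicted == top_n_label(rank, n):
            correct += 1
    return correct / len(true_ranks)

# Hypothetical outputs for four teams ranked 5, 150, 210 and 80:
# the first two are classified correctly, the last two are not.
score = accuracy([0.9, 0.2, 0.7, 0.4], [5, 150, 210, 80])
```

Scoring the test set after every epoch, rather than only at the end, would show the kind of cycling described above and make it possible to keep the weights from the best epoch.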
13
Conclusion
The results obtained so far are a good indication that this approach will be somewhat successful. As the results show, an increase in outputs makes it necessary to increase the number of inputs, which will most likely improve performance drastically. It is hard to believe that the network will reach 100% correctness, because there may be flaws in the rankings it is compared against. A more precise team-by-team look will reveal a solid ranking system. The resulting rankings should differ from other systems but still be close enough that the results can be considered valid, because many teams are very close in terms of record and performance.
14
Sources
www.teamrankings.com, the source of the data.
http://www.mratings.com, for more information on ranking systems.
http://www.kiva.net/~jsagarin/sports/cbsend.htm, for information on how Jeff Sagarin computes his rankings.
Simon Haykin, Neural Networks: A Comprehensive Foundation, for information on MLPs.