Forecasting The Future of Movies By: Kevin Conti, Robbie Arnold, Sean Zeringue
What’s The Deal? Ever look at a movie and wonder whether or not it will be worth your time? We want to create different forecast models that predict the IMDB ratings of a specific producer's movies with reasonable accuracy and precision. This would allow you to figure out ahead of time whether or not a movie is worth seeing.
The Formalities We have implemented different forecast algorithms that try to make the sum of squared errors as small as possible: $\sum (A - X)^2$, where A = actual movie rating and X = forecasted value produced by a model. Algorithms will be judged on accuracy, time and complexity. We also implemented a classification algorithm where: Classification = $\max(g_1(x), g_2(x), g_3(x))$, where $g_i(x)$ is the discriminant function for class i. The movie is assigned to the class whose discriminant value is largest.
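As a rough illustration of the two formulas above, the sketch below computes the sum of squared errors and picks the class with the largest discriminant value. The function names and the sample numbers are made up for this example and are not taken from the project.

```python
def sum_squared_error(actual, forecast):
    """Sum of (A - X)^2 over paired actual and forecasted ratings."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum((a - x) ** 2 for a, x in zip(actual, forecast))


def pick_class(discriminant_values):
    """Return the 1-based index of the largest discriminant value."""
    best = max(range(len(discriminant_values)),
               key=lambda i: discriminant_values[i])
    return best + 1


# Hypothetical ratings, for illustration only.
actual_ratings = [7.0, 8.0, 6.0]
forecast_ratings = [6.0, 8.0, 8.0]
error = sum_squared_error(actual_ratings, forecast_ratings)  # 1 + 0 + 4

# Hypothetical values of g1(x), g2(x), g3(x) for one movie.
chosen = pick_class([-3.2, -1.1, -4.0])  # class 2 has the largest value
```

A lower sum of squared errors means the forecasts sit closer to the actual ratings.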
Constraints Our algorithms focus on creating forecast models for the IMDB movie ratings of a single producer, Steven Spielberg. This allows us to work with a defined set of data and reduces the number of independent variables, which makes it easier to identify a trend in the data. The attributes used by our algorithms include: Previous IMDB movie ratings for the producer. Budget for a production. Run time of a production. Previous movie release dates.
The Algorithms Used Whether these methods will provide an accurate prediction is yet to be seen. These are the algorithms that we will be working with initially: Neural Network: A neural network of linear perceptrons that tries to guess the weight of each attribute and adjusts as more data points are fed through. Discriminant Function Analysis: A statistical analysis used to predict a categorical dependent variable (a movie's rating class) from one or more continuous variables (budget, runtime, release date). Classical Evolutionary Programming (CEP): An evolutionary algorithm that stochastically generates individuals in the gene pool, checks how accurate they are, and modifies them until only the most optimized individuals remain.
Experimental Procedure (Neural Network) The network of perceptrons was initialized with random starting weights between -1 and 1 and a learning constant of 0.1. The network then examined the linear relationship between each attribute and the movie rating and returned the slope of that relationship as a weight. The size of the hidden layer of the network was modified for several different runs. The network would ‘guess’ the slope of the relationship and feed forward a 1 or -1 depending on how accurate its guess was. If the result was -1, the weights were adjusted by: New Weight = Weight + Error * Input * Learning Constant
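The weight update rule above can be sketched with a single textbook perceptron. This is a minimal illustration, not the project's network: the sign activation, the definition of error as target minus output, and all function names are assumptions made for the example. Only the weight range, the learning constant of 0.1, and the update formula come from the slide.

```python
import random

LEARNING_CONSTANT = 0.1  # value stated on the slide


def init_weights(n_inputs, rng):
    """Random starting weights between -1 and 1."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]


def feed_forward(weights, inputs):
    """Assumed sign activation: output 1 or -1."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= 0 else -1


def train_step(weights, inputs, target):
    """New Weight = Weight + Error * Input * Learning Constant."""
    error = target - feed_forward(weights, inputs)  # assumed error definition
    return [w + error * x * LEARNING_CONSTANT
            for w, x in zip(weights, inputs)]


# Hypothetical example: the perceptron outputs -1 but the target is 1,
# so error = 2 and every weight moves in the direction of its input.
start = [0.5, -0.5]
updated = train_step(start, inputs=[1.0, 2.0], target=1)
```

When the output already matches the target, the error is 0 and the weights are left unchanged.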
Experimental Procedure (Discriminant Function Analysis) To build the model, a leave-one-out method was used: the movie to be tested was taken out of the data set, the model was built from the remaining movies, and the held-out movie was then plugged into the model and classified. The discriminant function: $g_i(x) = -\tfrac{1}{2}(x - \mu_i)^t \Sigma_i^{-1} (x - \mu_i) - \tfrac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i)$ where $g_i(x)$: discriminant function for class i; $\mu_i$: mean vector of the class; $\Sigma_i$: covariance matrix of the class; $\Sigma_i^{-1}$: inverse of $\Sigma_i$; $|\Sigma_i|$: determinant of $\Sigma_i$; $(x - \mu_i)^t$: the transpose of the vector $(x - \mu_i)$; $P(\omega_i)$: prior probability of the given class.
Experimental Procedure (Discriminant Function Analysis) Cont’d Three classes were created, with the movie data presorted into them: Class 1: Ratings [4,6). Class 2: Ratings [6,8). Class 3: Ratings [8,10]. Class data was normalized. The discriminant was calculated for each class with the test movie as input. The class with the highest discriminant value was the class the movie was placed into.
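The leave-one-out classification described above might look like the following sketch. It assumes the standard quadratic discriminant (Mahalanobis term weighted by -1/2), only two features per movie so the 2x2 covariance matrix can be inverted by hand, priors estimated from class sizes, and no normalization step. The function names and the toy data are invented for illustration and are not the project's data.

```python
import math


def mean_vector(rows):
    """Per-feature mean of a list of (feature_1, feature_2) pairs."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def covariance_2x2(rows, mu):
    """Sample covariance entries (a, b, c, d) of [[a, b], [c, d]]."""
    n = len(rows)
    a = sum((r[0] - mu[0]) ** 2 for r in rows) / (n - 1)
    d = sum((r[1] - mu[1]) ** 2 for r in rows) / (n - 1)
    b = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows) / (n - 1)
    return a, b, b, d


def discriminant(x, rows, prior):
    """g_i(x) for one class; needs at least 3 non-collinear points."""
    mu = mean_vector(rows)
    a, b, c, d = covariance_2x2(rows, mu)
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^t Sigma^-1 (x - mu), using the closed-form 2x2 inverse.
    mahalanobis = (dx * (d * dx - b * dy) + dy * (a * dy - c * dx)) / det
    return -0.5 * mahalanobis - 0.5 * math.log(det) + math.log(prior)


def classify(x, classes):
    """Return the label whose discriminant value is highest."""
    total = sum(len(rows) for rows in classes.values())
    scores = {label: discriminant(x, rows, len(rows) / total)
              for label, rows in classes.items()}
    return max(scores, key=scores.get)


def leave_one_out_accuracy(classes):
    """Hold each point out, rebuild the model, and classify the point."""
    correct = 0
    total = 0
    for label, rows in classes.items():
        for k, x in enumerate(rows):
            held_out = dict(classes)
            held_out[label] = rows[:k] + rows[k + 1:]
            correct += classify(x, held_out) == label
            total += 1
    return correct / total


# Toy, made-up data: two well separated classes of four points each.
example_classes = {
    "low": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "high": [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)],
}
accuracy = leave_one_out_accuracy(example_classes)
```

With more than two features, the same idea applies, but the inverse and determinant would come from a linear algebra library rather than the hand-written 2x2 formulas.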
Experimental Procedure: Classical Evolutionary Programming (CEP) Terminology: individual, population, weights, parents and children. The starting population is randomly generated, with each individual’s weights bounded from -1 to 2, inclusive. The individuals have their fitness tested, and then copies of them are mutated to create children (mutation only, no crossover). Parents and children are compared, the top 50% are selected to continue the gene pool, and the rest are eliminated. This is repeated for a set number of iterations. The more iterations, the more time the process takes, but the more accurate the algorithm is on average.
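The loop described above can be sketched as follows. This is a simplified illustration under several assumptions: fitness is the sum of squared errors of a linear model (lower is better), mutation adds Gaussian noise with a fixed step size `sigma`, and the population size, iteration count, feature scaling, and sample movies are all made up. Only the initial weight bounds of -1 to 2, mutation without crossover, and keeping the top 50% come from the slide.

```python
import random


def fitness(weights, movies):
    """Sum of squared errors of a linear model; lower is better."""
    return sum(
        (rating - sum(w * f for w, f in zip(weights, features))) ** 2
        for features, rating in movies
    )


def evolve(movies, n_weights, pop_size=20, iterations=200,
           sigma=0.1, seed=0):
    """Mutation-only evolution keeping the best half each generation."""
    rng = random.Random(seed)
    # Random starting population with weights in [-1, 2].
    population = [[rng.uniform(-1.0, 2.0) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(iterations):
        # Each parent produces one mutated copy (no crossover).
        children = [[w + rng.gauss(0.0, sigma) for w in parent]
                    for parent in population]
        combined = population + children
        combined.sort(key=lambda ind: fitness(ind, movies))
        population = combined[:pop_size]  # top 50% survive
    return min(population, key=lambda ind: fitness(ind, movies))


# Hypothetical movies: ((budget, runtime) in arbitrary scaled units, rating).
example_movies = [
    ((0.5, 2.0), 7.5),
    ((1.0, 2.5), 8.0),
    ((0.2, 1.5), 6.5),
    ((1.5, 2.2), 7.0),
]
best_weights = evolve(example_movies, n_weights=2)
```

Because parents compete with their children, the best individual is never lost, so the best fitness cannot get worse from one generation to the next.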
Some Results! (Neural Network)

| Hidden nodes | Weighted model produced | Accuracy rating | Time taken (seconds) |
|---|---|---|---|
| 1 | -0.4379(Budget) + 2.9756(Runtime) | 73.46 | 1.07510 |
| 5 | -0.4648(Budget) + 2.1070(Runtime) | 34.859 | 5.5063 |
| 10 | -0.6134(Budget) + 1.7144(Runtime) | 42.9603 | 9.5585 |
| 20 | -0.4519(Budget) + 2.45797(Runtime) | 35.488 | 18.9690 |

Complexity of implementation = O(n)
More Results! (Discriminant Function Analysis)
Maximum Optimization - Results
Within 2: How quick can we be accurate?
Within 2: Iterations View
Fun Facts: On average, the runtime weight was ~0.05. By comparison, the budget weight was ~0.00007. The most accurate result in testing came from the 1 million iteration run, but it found that result within 50,000 iterations. It spent the last 950,000 iterations trying to beat it and failed.
Time Complexities Neural Network: O(n). Discriminant Function: O(n^2). Classical Evolutionary Programming: O(n).
Possible holes Due to the subjective nature of movies, the attributes don’t have clear correlations with the ratings. Using linear perceptrons may not have been the best choice since the data is extremely varied. Because roughly 84% of the data for the Discriminant Function Analysis was located in class 2, that class was very broad compared to classes 1 and 3, which were both small. Classes 1 and 3 could be thrown off by outliers in the data, and their movies may be too similar to those in class 2. For CEP, the algorithm often gets stuck in local minima.
Future Work Experiment with different activation functions for the neural network. Try using different categorical inputs for the movies. Use different producers or actors and their respective movies. Implement the FEP (Fast Evolutionary Programming) variant, which uses Cauchy-distributed random mutations to allow bigger jumps. Improve the “accuracy” score by telling the algorithm that 0 <= rating <= 10.
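Two of the ideas above are small enough to sketch. The snippet below shows one way a Cauchy-distributed mutation could be drawn using only the standard library, and one way a forecast could be restricted to the 0 to 10 rating range. The function names and the `scale` parameter are hypothetical, and this is not part of the implemented project.

```python
import math
import random


def cauchy_mutation(weight, scale, rng):
    """Add Cauchy-distributed noise via the inverse-CDF method."""
    u = rng.random()  # uniform in [0, 1)
    return weight + scale * math.tan(math.pi * (u - 0.5))


def clamp_rating(value, low=0.0, high=10.0):
    """Force a forecasted rating into the valid range low..high."""
    return max(low, min(high, value))


rng = random.Random(0)
mutated = cauchy_mutation(0.05, scale=0.1, rng=rng)
bounded = clamp_rating(12.3)  # forecasts above 10 are cut back to 10.0
```

The Cauchy distribution has much heavier tails than a Gaussian, so it occasionally produces very large steps, which is what could help an individual jump out of a local minimum.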
Why is Our Project 380 Worthy? Our project applied existing algorithms and forecasting techniques in an attempt to forecast future movie ratings.
Questions
1. What did the learning constant do in this neural network implementation?
2. What could cause misclassifications in the Discriminant Function Analysis?
3. For the Discriminant Function Analysis, what causes the time complexity to be O(n^2)?
4. How does a CEP algorithm replicate the concept of children?
5. What role do the bounds of the original gene pool play in the speed of this particular CEP algorithm?
Answers!
1. It determined how rapidly each weight would be adjusted when the network “learned”.
2. Outliers and small data sets.
3. Matrix arithmetic.
4. Using random mutation based on the “parent’s” values.
5. An extremely important one.