Eco 6380: Predictive Analytics for Economists, Spring 2016
Professor Tom Fomby, Department of Economics, SMU
Presentation 8: Ensemble Predictions, Bagging, and Boosting (Classroom Notes)
Benefits of Combining Forecasts
The accuracy of ensemble (combination) predictions is usually better than that of the individual methods, especially if the methods making up the ensemble are pre-picked ("trimmed") and there are not too many of them (rule of thumb: four or fewer). This general result holds in both prediction and classification problems.
Ensemble Predictions for a Numeric Target Variable Based on M Competing Forecasts (Predictions)
Philosophy: "Essentially, all models are wrong, but some are useful." (G. E. P. Box)
Philosophy: "Don't put all of your eggs in one basket." (A well-accepted rule in finance and portfolio-diversification theory.)

The Nelson combination (ensemble):

$$f_N = w_1 f_1 + w_2 f_2 + \cdots + w_{M-1} f_{M-1} + \left(1 - w_1 - w_2 - \cdots - w_{M-1}\right) f_M$$

Note: the weights add to one.

The Granger-Ramanathan combination (ensemble):

$$f_{GR} = w_0 + w_1 f_1 + w_2 f_2 + \cdots + w_{M-1} f_{M-1} + w_M f_M$$

Note: the weights on the forecasts do not necessarily add to one, and the intercept $w_0$ need not be zero.

"Good" (trimmed) ensembles are built from the forecasts of the most accurate forecasting models (preferably the four or fewer most accurate), and it is hoped that the errors of these best models are negatively correlated, or at least minimally correlated, with one another. Correlation-matrix plots of the forecast errors can be helpful in this regard (see the sketch below).
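A quick way to inspect those error correlations is to compute the correlation matrix of the forecast errors directly. A minimal Python sketch; the data and array names here are hypothetical, purely for illustration:

```python
import numpy as np

# Hypothetical data: actual values and M = 3 competing forecasts (one column per model).
y_actual  = np.array([10.2, 11.5,  9.8, 12.1, 10.9])
forecasts = np.array([[10.0, 10.5,  9.9],
                      [11.0, 11.8, 11.4],
                      [10.1,  9.5, 10.0],
                      [12.5, 11.9, 12.0],
                      [10.4, 11.2, 10.8]])

# Forecast errors, one column per model.
errors = y_actual[:, None] - forecasts

# Correlation matrix of the errors: look for small or negative
# off-diagonal entries when choosing models to combine.
print(np.corrcoef(errors, rowvar=False))
```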
Obtaining the Weights for the Ensemble Methods
The Nelson ensemble weights are obtained by regressing the actual values of the target variable in an independent data set on the M forecasts (predictions) of the target variable produced for that data set by the M forecasting (prediction) models. This is a restricted regression: the intercept is set to zero, and the weights (coefficients) on the forecasts are restricted to add to one. The Granger-Ramanathan weights are obtained in the same way, except that the intercept is not set to zero and the weights need not add to one. Some software packages provide the Simple Average ensemble, which is nothing more than the Nelson ensemble with each forecast weight set to 1/M. The Simple Average ensemble is likely to be less accurate than the more sophisticated Nelson and Granger-Ramanathan methods, except when the forecasting methods making up the ensemble are approximately equally accurate. A sketch of both weight-fitting regressions follows.
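A minimal Python sketch of both regressions, assuming y holds the actual target values on an independent data set and F is an N-by-M matrix whose columns are the M forecasts (the function names are illustrative). The Nelson restriction is handled by substituting $w_M = 1 - w_1 - \cdots - w_{M-1}$, which turns the restricted problem into ordinary least squares:

```python
import numpy as np

def granger_ramanathan_weights(F, y):
    """Unrestricted regression of y on the M forecasts plus an intercept.

    Returns (w0, w), where w0 is the intercept and w the M forecast
    weights; the weights need not sum to one.
    """
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

def nelson_weights(F, y):
    """Restricted regression: zero intercept, weights summing to one.

    Substituting w_M = 1 - w_1 - ... - w_{M-1} turns the restricted
    problem into OLS of (y - f_M) on (f_i - f_M), i = 1..M-1.
    """
    fM = F[:, -1]
    Z = F[:, :-1] - fM[:, None]
    w_head, *_ = np.linalg.lstsq(Z, y - fM, rcond=None)
    return np.append(w_head, 1.0 - w_head.sum())
```

The ensemble prediction for a new forecast matrix F_new is then w0 + F_new @ w for Granger-Ramanathan and F_new @ w for Nelson.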
Classification Ensembles: The Majority Voting Rule
When there are M binary classification models, each predicting either 1 (a "success") or 0 (a "failure"), a majority voting rule can be used. If M is odd, the majority-voting ensemble prediction is the outcome that receives the majority of the votes. If M is even and the vote is tied, a coin flip can break the tie. It is also possible to use weighted voting rules in which the most accurate classification models carry greater voting power, as in the sketch below.
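A minimal sketch of the (optionally weighted) voting rule, assuming the M classifiers' 0/1 predictions are stacked column-wise in an array; the names are illustrative:

```python
import numpy as np

def majority_vote(preds, weights=None):
    """Combine M binary (0/1) classifiers by (weighted) majority vote.

    preds   : (N, M) array of 0/1 predictions, one column per model.
    weights : optional length-M voting weights (e.g., validation accuracies);
              equal weights reproduce the simple majority rule.
    Ties (possible when M is even, or with weighted votes) are broken
    by a coin flip.
    """
    preds = np.asarray(preds)
    if weights is None:
        weights = np.ones(preds.shape[1])
    votes_for_1 = preds @ weights
    votes_for_0 = weights.sum() - votes_for_1
    out = (votes_for_1 > votes_for_0).astype(int)
    ties = votes_for_1 == votes_for_0
    out[ties] = np.random.randint(0, 2, size=ties.sum())  # coin flip
    return out
```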
Bagging Prediction and Classification Methods
"Bagging" stands for bootstrap aggregation. Prediction and classification models are often improved in accuracy when they are "bagged."

Take, for example, multiple linear regression (MLR) applied to predicting a numeric target variable in an independent data set. More accurate predictions might be obtained by picking a set of MLR models from a large number (B) of random 3/4-to-1/4 cross-validations:

- Let N be the number of observations available to estimate the MLR coefficients. Draw N bootstrap observations ("cases") with replacement.
- Randomly assign (3/4)N of these observations to a training data set and (1/4)N to a validation data set.
- Obtain three MLR models (to the extent they are unique) by the backward, forward, and stepwise selection methods, and keep the model that is most accurate on the validation data.
- Repeat this process B times (maybe B = 100).

One then has B MLR models with which to predict the target variable in an independent data set. The "bagged" MLR prediction is just the simple average of the predictions from the B models. In many cases, the bagged MLR predictions are more accurate than those of any single MLR model obtained from a one-time training/validation run.

Classification models, such as the CART model, can be bagged in the same way. Such bagging of decision trees is sometimes called building a "random forest" of trees. A simplified sketch of the bootstrap-and-average core appears below.
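A simplified Python sketch of the bootstrap-and-average core of the recipe above; it omits the 3/4-1/4 split and the backward/forward/stepwise selection step, refitting the full MLR on each bootstrap sample instead (names are illustrative):

```python
import numpy as np

def bagged_mlr_predict(X_train, y_train, X_new, B=100, seed=0):
    """Bag a multiple linear regression: fit OLS on B bootstrap samples
    and average the B predictions for the new data.

    Simplified relative to the slide's recipe: each bootstrap fit uses
    all predictors rather than choosing among selection methods.
    """
    rng = np.random.default_rng(seed)
    n = len(y_train)
    Xt = np.column_stack([np.ones(n), X_train])       # add intercept
    Xn = np.column_stack([np.ones(len(X_new)), X_new])
    preds = np.zeros((B, len(X_new)))
    for b in range(B):
        idx = rng.integers(0, n, size=n)              # draw n cases with replacement
        coef, *_ = np.linalg.lstsq(Xt[idx], y_train[idx], rcond=None)
        preds[b] = Xn @ coef
    return preds.mean(axis=0)                         # bagged (averaged) prediction
```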
"Boosting" a Model

Boosting combines models of the same type (e.g., decision trees) and is iterative: each new model is influenced by the performance of the previously built model. If, for example, several cases were very poorly predicted by a CART tree, those cases are given more weight and a second CART tree is fit to the data. The process continues with successive reweighting of the "tough to classify" cases and the building of a succession of trees reflecting those reweightings. As with bagging, boosting a classifier (or predictor) leads to majority voting of the classifications (or simple averaging of the predictions) across the succession of boosted versions of the given model. The most popular boosting method is called AdaBoost.M1; for more on it, see AdaBoost.M1.pdf. A minimal sketch of the reweighting loop follows.
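A minimal sketch of the AdaBoost.M1 reweighting loop for a binary target coded -1/+1, using a depth-one CART tree (a "stump") from scikit-learn as the base model. The function names are illustrative, and this is the textbook version of the loop rather than code from the course:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_m1(X, y, M=50):
    """AdaBoost.M1 for y in {-1, +1}: refit a stump M times, upweighting
    the cases the previous stump misclassified.

    Returns the fitted stumps and their voting weights (alphas).
    """
    n = len(y)
    w = np.full(n, 1.0 / n)                 # start with equal case weights
    stumps, alphas = [], []
    for _ in range(M):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        miss = stump.predict(X) != y
        err = max(w[miss].sum() / w.sum(), 1e-10)
        if err >= 0.5:                      # no better than chance: stop
            break
        alpha = np.log((1.0 - err) / err)
        w = w * np.exp(alpha * miss)        # upweight the "tough" cases
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)

def adaboost_predict(stumps, alphas, X):
    """Weighted vote of the boosted stumps: sign of the alpha-weighted sum."""
    votes = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(votes)
```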
For a discussion of the Nelson and Granger-Ramanathan ensemble methods, read the comments in the SAS program combo.sas and run it. The data analyzed are from the article by J. A. Brandt and D. A. Bessler, "Price Forecasting and Evaluation: An Application in Agriculture," Journal of Forecasting (July-Sept. 1983), pp.