Using Clustering to Make Prediction Intervals For Neural Networks Claus Benjaminsen ECE539 - final project fall 2005
What is a prediction interval? An interval within which the true target value is predicted to be Prediction intervals are often defined by Interval end values and an associated probability A Gaussian distribution with a certain mean and variance
Motivation Give user a more informative output Precision of prediction (width of interval) Max and min limits (end values of interval) Certainty of prediction (associated probability) Make user able to make better decisions
Approach 1 Use clustering to group training feature vectors Associate the mean training error of all the features within each cluster to the corresponding cluster center Estimate prediction interval for all training features in a cluster by scaling the associated error
Approach 2 Given a new input its membership to one of the cluster centers is determined The prediction interval for the new input is then the same as for the training data belonging to that cluster
Data set used Synthetic data set 1 input feature 1 output target 256 samples Divided into: 156 training samples 100 testing samples
Results Plot of the test data along with prediction intervals estimated by the clustering method
Results 2 cost_test_clus 0.001277 The cost function is calculated as the mean squared distance from the edge of the prediction interval to the test target for all targets not inside prediction intervals The clustering method has the best performance of the three different methods cost_test_clus 0.001277 cost_test_var 0.004856 cost_test_basis 0.002414
Discussion Model can become big, when input feature space is big and the number of training samples is high Problems when new inputs doesn’t “look” like any of the training inputs
Conclusion The clustering method can estimate prediction intervals, which yields a very good performance Data used is very well suited for clustering method – might not show same good result for other types of datasets This will have to be tested in the future!