About the Capability of Some Parallel Program Metric Prediction Using Neural Network Approach. Vera Yu. Goritskaya, Nina N. Popova
Problem Area. Rising complexity of multiprocessor systems: heterogeneous clusters; distributed systems that include clusters and other multiprocessors. Parallel program specifics: parallel program execution is affected by many factors: the communication environment; system load; the nodes on which an application is scheduled can vary; … As a result, it is almost impossible to estimate in advance how a program will behave when running on a multiprocessor.
Parallel Program Metric Examples. Run time (flow time): T_flow = T_computation + T_communication + T_idle. Speedup = T_1 / T_N, where T_1 is the run time on one processor and T_N the run time on N processors. Efficiency = Speedup / N. Scalability of a parallel system: efficiency stays the same when the number of processors and the size of the problem are increased together. …
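The metric definitions on this slide can be written down directly. The following is a minimal sketch; the function names and the argument checks are illustrative assumptions, not part of the original talk.

```python
def flow_time(t_computation, t_communication, t_idle):
    """T_flow = T_computation + T_communication + T_idle (all in seconds)."""
    return t_computation + t_communication + t_idle


def speedup(t1, tn):
    """Speedup = T_1 / T_N: one-processor time over N-processor time."""
    if tn <= 0:
        raise ValueError("tn must be positive")
    return t1 / tn


def efficiency(t1, tn, n):
    """Efficiency = Speedup / N, where N is the number of processors."""
    if n <= 0:
        raise ValueError("n must be positive")
    return speedup(t1, tn) / n
```

For example, with hypothetical timings of 100 s on one processor and 25 s on 8 processors, `speedup(100.0, 25.0)` gives 4.0 and `efficiency(100.0, 25.0, 8)` gives 0.5.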
Related Works. Neural network mechanism for application performance prediction: E. Ipek, B.R. de Supinski, M. Schulz, S.A. McKee. “An Approach to Performance Prediction for Parallel Applications” // Euro-Par 2005 Parallel Processing, Volume 3648, 2005. Analytic modeling for performance tuning of parallel programs: Mark E. Crovella, Thomas J. LeBlanc. “Parallel Performance Prediction Using Lost Cycles Analysis” // Proceedings of Supercomputing ’94. Job execution time estimation based on program source analysis: V.V. Balashov, A.P. Kapitonova, V.A. Kostenko, R.L. Smelyanskiy, N.V. Yuschenko. “Method for estimating platform-optimized application execution time based on its high-level language source code” // Proceedings of the 1st international conference “Digital signal processing and its applications”, Volume IV.
Project Features. Parallel application flow time prediction, i.e. prediction of the time that a program will spend inside the multiprocessor system. Considers a large number of parameters: communication environment; system load; node characteristics; scheduling features; … The neural network mechanism can potentially be applied to various parallel systems and applications.
Project Features (2). Processed data: can be gathered without modifying the source code of an application; can be gathered using standard OS and job management system utilities; includes the job submission moment, the number of required processors, the maximum required execution time, the system load at the submission moment, the size of the executable, etc.
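One way to picture the per-job data listed on this slide is as a flat record that is turned into a numeric input vector. This is only a sketch: the field names, the units, the 0-to-1 load scale, and the time-of-day encoding are assumptions made for illustration, since the slides do not specify how the inputs were encoded.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """One submitted job, described only by data visible outside the program."""
    submit_time: float        # job submission moment, seconds since epoch (assumed unit)
    required_processors: int  # number of processors requested
    max_required_time: float  # requested execution time limit, seconds
    system_load: float        # system load at submission, assumed scale 0.0 to 1.0
    executable_size: int      # size of the executable, bytes

    def to_features(self):
        """Flatten the record into a numeric vector usable as network input."""
        # Hypothetical encoding: keep only the time of day, as a fraction of 24 h.
        time_of_day = (self.submit_time % 86400.0) / 86400.0
        return [
            time_of_day,
            float(self.required_processors),
            self.max_required_time,
            self.system_load,
            float(self.executable_size),
        ]
```

None of these values require instrumenting the application, which matches the slide's point that the data can be collected with standard OS and job manager utilities.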
Project Features (3). Improving prediction accuracy: we gather characteristics of multiple executions of an application (“execution history”). Sample “historical” characteristics: average execution time for a given application with a fixed number of required processors; average required time for a given application; average size of the executable; etc.
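The “historical” characteristics above amount to simple averages over past runs. The sketch below shows one possible way to compute them; the dictionary keys (`app`, `processors`, `exec_time`, `required_time`, `exe_size`) are hypothetical names chosen for this example.

```python
from collections import defaultdict
from statistics import mean


def history_features(runs):
    """Average past runs per application and per (application, processors).

    `runs` is an iterable of dicts with the hypothetical keys
    app, processors, exec_time, required_time, exe_size.
    """
    by_app = defaultdict(list)
    by_app_procs = defaultdict(list)
    for run in runs:
        by_app[run["app"]].append(run)
        by_app_procs[(run["app"], run["processors"])].append(run["exec_time"])

    features = {}
    for (app, procs), times in by_app_procs.items():
        app_runs = by_app[app]
        features[(app, procs)] = {
            # average execution time for this application at this processor count
            "avg_exec_time": mean(times),
            # averages over all runs of this application
            "avg_required_time": mean(r["required_time"] for r in app_runs),
            "avg_exe_size": mean(r["exe_size"] for r in app_runs),
        }
    return features
```

The resulting values can be appended to the per-job input vector, so that the network sees both the current submission and a summary of how the same application behaved before.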
Data Pre-Processing. Grouping: parallel programs from the input set are grouped into categories according to their average execution time; 4 groups in the described case (< 100 sec, 100 – 1000 sec, 1000 – 5000 sec, > 5000 sec). Excluding “noise” data: samples with the max and min execution time values for each job were removed from the input data sets; samples corresponding to rejected jobs were also excluded.
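The two pre-processing steps can be sketched as follows. Several details are assumptions: the group boundaries of 100, 1000 and 5000 seconds, the handling of values that fall exactly on a boundary, the removal of exactly one minimum and one maximum sample per job, and computing the average after the noise samples are removed.

```python
from bisect import bisect_right

# Assumed boundaries in seconds, giving four groups:
# < 100, 100 to 1000, 1000 to 5000, > 5000.
BOUNDARIES = [100.0, 1000.0, 5000.0]


def group_index(avg_exec_time):
    """Return 0..3, the category of a program by average execution time."""
    return bisect_right(BOUNDARIES, avg_exec_time)


def drop_extremes(exec_times):
    """Remove one sample with the minimum and one with the maximum time."""
    if len(exec_times) <= 2:
        return []
    return sorted(exec_times)[1:-1]


def preprocess(samples):
    """samples: dict mapping job name -> list of (exec_time, rejected) tuples."""
    groups = {0: {}, 1: {}, 2: {}, 3: {}}
    for job, runs in samples.items():
        # Samples of rejected jobs are excluded first.
        times = [t for t, rejected in runs if not rejected]
        kept = drop_extremes(times)
        if not kept:
            continue
        avg = sum(kept) / len(kept)
        groups[group_index(avg)][job] = kept
    return groups
```

A separate network can then be trained on each of the four groups, which keeps the range of target values within one group comparatively narrow.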
Target Platforms. IBM eServer pSeries 690 (“Regatta”): 16-processor SMP architecture. IBM eServer pSeries 360 (“Hill”): 10-processor cluster.
Neural Network Architectures. Multilayer feedforward network with sigmoid transfer function (1 hidden layer). Elman backpropagation network.
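To make the difference between the two architectures concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the linear output layer, the use of a sigmoid in the Elman hidden layer, the random initial weights and the class names are assumptions, and training by backpropagation is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(weights, biases, inputs, activation):
    """One fully connected layer: activation(W * inputs + b), row by row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class FeedforwardNet:
    """Input -> one sigmoid hidden layer -> linear output (assumed)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = random_matrix(n_hidden, n_in, rng)
        self.b1 = [0.0] * n_hidden
        self.w2 = random_matrix(n_out, n_hidden, rng)
        self.b2 = [0.0] * n_out

    def forward(self, inputs):
        hidden = dense(self.w1, self.b1, inputs, sigmoid)
        return dense(self.w2, self.b2, hidden, lambda v: v)


class ElmanNet(FeedforwardNet):
    """Same layers, plus context units that hold the previous hidden state."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        # The hidden layer sees the inputs concatenated with the context units.
        super().__init__(n_in + n_hidden, n_hidden, n_out, seed)
        self.context = [0.0] * n_hidden

    def forward(self, inputs):
        hidden = dense(self.w1, self.b1, list(inputs) + self.context, sigmoid)
        self.context = hidden  # fed back into the hidden layer on the next call
        return dense(self.w2, self.b2, hidden, lambda v: v)
```

The feedforward network treats every job independently, while the Elman network carries state from one sample to the next, so the order in which jobs are presented matters for it.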
Training Results. Columns: job group (according to the average execution time); performance (multilayer feedforward network with sigmoid transfer function); performance (Elman backpropagation network). Rows: < 100 sec: 10^-2 – ; – 1000 sec: ; – 5000 sec: 10^-4 – ; > 5000 sec: 10^-5.
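The slides do not say how the “performance” figure is defined. A common choice for network training is the mean squared error on normalized targets, and the sketch below assumes that reading; both the error measure and the min-max scaling are assumptions, not something stated in the talk.

```python
def mean_squared_error(predicted, actual):
    """Average of squared differences between predictions and targets."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def min_max_scale(values):
    """Map values linearly onto [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]
```

Under this reading, an error of the order of 10^-4 on targets scaled to [0, 1] corresponds to a typical deviation of about 1% of the group's execution time range.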
Testing: Execution Time Prediction (“Regatta”). [Figure: two plots of predicted values against real values on “Regatta”: execution time prediction with the feedforward NN, and execution time prediction with the Elman NN.]
Neural Network Architecture (“Hill”). Elman backpropagation network.
Testing: Execution Time Prediction (“Hill”). [Figure: predicted values against real values; axes: tasks, time (sec).] Hill’s flow is more homogeneous than Regatta’s flow.
Conclusions and Future Work. Improving the data processing methods may lead to more accurate results. New scheduling algorithms can be developed using the described approach. Applying NN prediction mechanisms on other multiprocessor platforms. Problem Solving Environments.
Thank you! Q & A ?