About the Capability of Some Parallel Program Metric Prediction Using Neural Network Approach
Vera Yu. Goritskaya, Nina N. Popova

Problem Area
- Rising complexity of multiprocessor systems
  - heterogeneous clusters
  - distributed systems which include clusters and other multiprocessors
- Parallel program specifics
  - parallel program execution is affected by many factors: communication environment load, the nodes on which an application is scheduled can vary, ...
- It is almost impossible to estimate how a program will behave when running on a multiprocessor

Parallel Program Metric Examples
- Run time (flow time): Tflow = Tcomputation + Tcommunication + Tidle
- Speedup = T1 / TN
- Efficiency = Speedup / N
- Scalability of a parallel system: efficiency stays the same when both the number of processors and the size of the problem are increased
- ...
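The metrics above are straightforward to compute once timings are available. A minimal sketch, with all timing values invented for illustration:

```python
def flow_time(t_computation, t_communication, t_idle):
    """Tflow = Tcomputation + Tcommunication + Tidle."""
    return t_computation + t_communication + t_idle

def speedup(t1, tn):
    """Speedup = T1 / TN (serial time over time on N processors)."""
    return t1 / tn

def efficiency(t1, tn, n):
    """Efficiency = Speedup / N."""
    return speedup(t1, tn) / n

# Hypothetical run: 120 s serially, 18 s on N = 8 processors.
t1, tn, n = 120.0, 18.0, 8
print(speedup(t1, tn))       # about 6.67
print(efficiency(t1, tn, n)) # about 0.83
```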

Related Works
- Neural network mechanism for application performance prediction: E. Ipek, B.R. de Supinski, M. Schulz, S.A. McKee, "An Approach to Performance Prediction for Parallel Applications" // Euro-Par 2005 Parallel Processing, Volume 3648, 2005.
- Analytic modeling for performance tuning of parallel programs: Mark E. Crovella, Thomas J. LeBlanc, "Parallel Performance Prediction Using Lost Cycles Analysis" // Proceedings of Supercomputing '94.
- Job execution time estimation based on program source analysis: V.V. Balashov, A.P. Kapitonova, V.A. Kostenko, R.L. Smelyanskiy, N.V. Yuschenko, "Method for estimating platform-optimized application execution time based on its high-level language source code" // Proceedings of the 1st international conference "Digital signal processing and its applications", Volume IV.

Project Features
- Parallel application flow time prediction, i.e. the time a program spends inside the multiprocessor system
- Considers a large number of parameters: communication environment load, node characteristics, scheduling features, ...
- The neural network mechanism can potentially be applied to various parallel systems and applications

Project Features (2)
Processed data:
- can be gathered without modifying the application's source code
- can be gathered using standard OS and job-management system utilities
- includes: job submission moment, number of required processors, maximum required execution time, system load at the submission moment, size of the executable, etc.

Project Features (3)
Improving prediction accuracy:
- characteristics of multiple executions of an application are gathered (its "execution history")
- sample "historical" characteristics: average execution time of a given application with a fixed required-processor count, average required time of a given application, average size of the executable, etc.
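Deriving such "historical" characteristics amounts to aggregating over past runs. A sketch with a hypothetical history of (application, processors, execution time) records; the record layout and values are assumptions for illustration:

```python
# Hypothetical "execution history": (app name, required processors, exec time in sec).
history = [
    ("lu_solver", 8, 410.0),
    ("lu_solver", 8, 430.0),
    ("lu_solver", 16, 250.0),
    ("fft_bench", 4, 95.0),
]

def avg_exec_time(history, app, procs):
    """Average execution time for a given application with a fixed
    number of required processors, as described on the slide."""
    times = [t for a, p, t in history if a == app and p == procs]
    return sum(times) / len(times) if times else None

print(avg_exec_time(history, "lu_solver", 8))  # 420.0
```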

Data Pre-Processing
- Parallel programs from the input set are grouped into categories according to average execution time: 4 groups in the described case (< 100 sec, 100 – 1000 sec, 1000 – 5000 sec, > 5000 sec)
- "Noise" data is excluded: samples with the maximum and minimum execution time values for each job were removed from the input data sets; samples corresponding to rejected jobs were also excluded
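Both pre-processing steps can be sketched in a few lines. The group boundaries below follow the four job groups that appear in the training-results table; the exact tie-handling of the noise filter is an assumption:

```python
def group_by_avg_time(avg_time):
    """Map an application's average execution time (sec) to one of the
    four categories used in the described case."""
    if avg_time < 100:
        return 0
    if avg_time < 1000:
        return 1
    if avg_time <= 5000:
        return 2
    return 3

def drop_noise(samples):
    """Remove the samples with the max and min execution time for a job
    (one occurrence of each; a simplifying assumption)."""
    if len(samples) <= 2:
        return []
    ordered = sorted(samples)
    return ordered[1:-1]

print(group_by_avg_time(3000))   # 2
print(drop_noise([5.0, 1.0, 9.0, 3.0]))  # [3.0, 5.0]
```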

Target Platforms
- IBM eServer pSeries 690 ("Regatta"): 16-processor SMP architecture
- IBM eServer pSeries 360 ("Hill"): 10-processor cluster

Neural Network Architectures
- Multilayer feedforward network with sigmoid transfer function (1 hidden layer)
- Elman backpropagation network
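The first architecture (one hidden layer, sigmoid transfer) can be sketched in plain Python. This is only a forward pass with random, untrained weights; the original work would have trained such a network on the gathered job features, and the sizes and seed here are arbitrary:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class Feedforward:
    """Minimal 1-hidden-layer feedforward network with sigmoid transfer
    function, per the first architecture on the slide. Weights are
    randomly initialized (untrained sketch)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rnd = random.Random(seed)  # seeded for reproducibility
        self.w1 = [[rnd.uniform(-1, 1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [rnd.uniform(-1, 1) for _ in range(n_hidden)]
        self.w2 = [rnd.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = rnd.uniform(-1, 1)

    def predict(self, x):
        # Hidden layer: sigmoid of weighted inputs plus bias.
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Linear output unit for the regression target (flow time).
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

net = Feedforward(n_in=3, n_hidden=4)
y = net.predict([0.1, 0.2, 0.3])  # some float; untrained, so not meaningful
```

The Elman network differs in that the hidden layer additionally receives its own previous activations through a recurrent (context) layer, which is what lets it pick up temporal patterns in the job flow.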

Training Results

Job group (by average execution time) | Performance (multilayer feedforward network with sigmoid transfer function) | Performance (Elman backpropagation network)
< 100 sec | 10^-2 – ... | ...
100 – 1000 sec | ... | ...
1000 – 5000 sec | 10^-4 – ... | ...
> 5000 sec | 10^-5 | ...

Testing: Execution Time Prediction ("Regatta")
[Plots: execution time prediction on "Regatta" with the feedforward NN and with the Elman NN; each plot compares predicted values against real values]

Neural Network Architecture
- Elman backpropagation network

Testing: Execution Time Prediction ("Hill")
[Plot: predicted values vs. real values of execution time (sec) per task]
"Hill"'s job flow is more homogeneous than "Regatta"'s.

Conclusions and Future Work
- Improving the data-processing methods will likely lead to more accurate results
- New scheduling algorithms can be developed using the described approach
- Applying the NN prediction mechanism on other multiprocessor platforms
- Problem-solving environments

Thank you! Q & A ?