Using extremal optimization for Java program initial placement in clusters of JVMs E. Laskowski 1, M. Tudruj 1,3, I. De Falco 2, U. Scafuri 2, E. Tarantino.

Slides:

Advertisements

Similar presentations

Agents & Mobile Agents.

Advertisements

Hadi Goudarzi and Massoud Pedram

Resource Management §A resource can be a logical, such as a shared file, or physical, such as a CPU (a node of the distributed system). One of the functions.

Decentralized Reactive Clustering in Sensor Networks Yingyue Xu April 26, 2015.

Distributed Process Scheduling Summery Distributed Process Scheduling Summery BY:-Yonatan Negash.

An Approach to Evaluate Data Trustworthiness Based on Data Provenance Department of Computer Science Purdue University.

Operating System Concepts with Java – 7 th Edition, Nov 15, 2006 Silberschatz, Galvin and Gagne ©2007 Processes and Their Scheduling.

1 Complexity of Network Synchronization Raeda Naamnieh.

Reference: Message Passing Fundamentals.

A Grid Parallel Application Framework Jeremy Villalobos PhD student Department of Computer Science University of North Carolina Charlotte.

Introduction to Evolutionary Computation  Genetic algorithms are inspired by the biological processes of reproduction and natural selection. Natural selection.

Rutgers PANIC Laboratory The State University of New Jersey Self-Managing Federated Services Francisco Matias Cuenca-Acuna and Thu D. Nguyen Department.

Learning Objectives Understanding the difference between processes and threads. Understanding process migration and load distribution. Understanding Process.

1 Introduction to Load Balancing: l Definition of Distributed systems. Collection of independent loosely coupled computing resources. l Load Balancing.

The new The new MONARC Simulation Framework Iosif Legrand  California Institute of Technology.

Online Data Gathering for Maximizing Network Lifetime in Sensor Networks IEEE transactions on Mobile Computing Weifa Liang, YuZhen Liu.

GHS: A Performance Prediction and Task Scheduling System for Grid Computing Xian-He Sun Department of Computer Science Illinois Institute of Technology.

Exposure In Wireless Ad-Hoc Sensor Networks S. Megerian, F. Koushanfar, G. Qu, G. Veltri, M. Potkonjak ACM SIG MOBILE 2001 (Mobicom) Journal version: S.

Design and Implementation of a Single System Image Operating System for High Performance Computing on Clusters Christine MORIN PARIS project-team, IRISA/INRIA.

Self-Organizing Agents for Grid Load Balancing Junwei Cao Fifth IEEE/ACM International Workshop on Grid Computing (GRID'04)

1 IEEE Trans. on Smart Grid, 3(1), pp , Optimal Power Allocation Under Communication Network Externalities --M.G. Kallitsis, G. Michailidis.

Venkatram Ramanathan 1. Motivation Evolution of Multi-Core Machines and the challenges Summary of Contributions Background: MapReduce and FREERIDE Wavelet.

Optimized Java computing as an application for Desktop Grid Olejnik Richard 1, Bernard Toursel 1, Marek Tudruj 2, Eryk Laskowski 2 1 Université des Sciences.

Heterogeneous Parallelization for RNA Structure Comparison Eric Snow, Eric Aubanel, and Patricia Evans University of New Brunswick Faculty of Computer.

Neural and Evolutionary Computing - Lecture 10 1 Parallel and Distributed Models in Evolutionary Computing  Motivation  Parallelization models  Distributed.

Active Monitoring in GRID environments using Mobile Agent technology Orazio Tomarchio Andrea Calvagna Dipartimento di Ingegneria Informatica e delle Telecomunicazioni.

Young Suk Moon Chair: Dr. Hans-Peter Bischof Reader: Dr. Gregor von Laszewski Observer: Dr. Minseok Kwon 1.

Scalable Web Server on Heterogeneous Cluster CHEN Ge.

Zorica Stanimirović Faculty of Mathematics, University of Belgrade

Energy Aware Task Mapping Algorithm For Heterogeneous MPSoC Based Architectures Amr M. A. Hussien¹, Ahmed M. Eltawil¹, Rahul Amin 2 and Jim Martin 2 ¹Wireless.

G-JavaMPI: A Grid Middleware for Distributed Java Computing with MPI Binding and Process Migration Supports Lin Chen, Cho-Li Wang, Francis C. M. Lau and.

Boltzmann Machine (BM) (§6.4) Hopfield model + hidden nodes + simulated annealing BM Architecture –a set of visible nodes: nodes can be accessed from outside.

Stochastic DAG Scheduling using Monte Carlo Approach Heterogeneous Computing Workshop (at IPDPS) 2012 Extended version: Elsevier JPDC (accepted July 2013,

Tool Integration with Data and Computation Grid GWE - “Grid Wizard Enterprise”

1 M. Tudruj, J. Borkowski, D. Kopanski Inter-Application Control Through Global States Monitoring On a Grid Polish-Japanese Institute of Information Technology,

Performance evaluation of component-based software systems Seminar of Component Engineering course Rofideh hadighi 7 Jan 2010.

Static Process Scheduling Section 5.2 CSc 8320 Alex De Ruiter

Resource Mapping and Scheduling for Heterogeneous Network Processor Systems Liang Yang, Tushar Gohad, Pavel Ghosh, Devesh Sinha, Arunabha Sen and Andrea.

A High Performance Middleware in Java with a Real Application Fabrice Huet*, Denis Caromel*, Henri Bal + * Inria-I3S-CNRS, Sophia-Antipolis, France + Vrije.

UAB Dynamic Tuning of Master/Worker Applications Anna Morajko, Paola Caymes Scutari, Tomàs Margalef, Eduardo Cesar, Joan Sorribes and Emilio Luque Universitat.

Performance evaluation on grid Zsolt Németh MTA SZTAKI Computer and Automation Research Institute.

Hwajung Lee.  Interprocess Communication (IPC) is at the heart of distributed computing.  Processes and Threads  Process is the execution of a program.

Computing Simulation in Orders Based Transparent Parallelizing Pavlenko Vitaliy Danilovich, Odessa National Polytechnic University Burdeinyi Viktor Viktorovych,

1 Supporting Dynamic Migration in Tightly Coupled Grid Applications Liang Chen Qian Zhu Gagan Agrawal Computer Science & Engineering The Ohio State University.

Department of Computer Science MapReduce for the Cell B. E. Architecture Marc de Kruijf University of Wisconsin−Madison Advised by Professor Sankaralingam.

Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 3: Process-Concept.

1 Lecture 6 Neural Network Training. 2 Neural Network Training Network training is basic to establishing the functional relationship between the inputs.

Design Issues of Prefetching Strategies for Heterogeneous Software DSM Author :Ssu-Hsuan Lu, Chien-Lung Chou, Kuang-Jui Wang, Hsiao-Hsi Wang, and Kuan-Ching.

OPERATING SYSTEMS CS 3530 Summer 2014 Systems and Models Chapter 03.

Static Process Scheduling

Proposal of Asynchronous Distributed Branch and Bound Atsushi Sasaki†, Tadashi Araragi†, Shigeru Masuyama‡ †NTT Communication Science Laboratories, NTT.

1 OASIS Team, INRIA Sophia-Antipolis/I3S CNRS, Univ. Nice Christian Delbé Data Grid Explorer 15/09/03 Large Scale Emulation Mobility in ProActive.

Scheduling MPI Workflow Applications on Computing Grids Juemin Zhang, Waleed Meleis, and David Kaeli Electrical and Computer Engineering Department, Northeastern.

Efficient Resource Allocation for Wireless Multicast De-Nian Yang, Member, IEEE Ming-Syan Chen, Fellow, IEEE IEEE Transactions on Mobile Computing, April.

Operating Systems (CS 340 D) Dr. Abeer Mahmoud Princess Nora University Faculty of Computer & Information Systems Computer science Department.

Tool Integration with Data and Computation Grid “Grid Wizard 2”

1 Καστοριά Μάρτιος 13, 2009 Efficient Service Task Assignment in Grid Computing Environments Dr Angelos Michalas Technological Educational Institute of.

Euro-Par, HASTE: An Adaptive Middleware for Supporting Time-Critical Event Handling in Distributed Environments ICAC 2008 Conference June 2 nd,

Genetic algorithms for task scheduling problem J. Parallel Distrib. Comput. (2010) Fatma A. Omara, Mona M. Arafa 2016/3/111 Shang-Chi Wu.

Agenda  Quick Review  Finish Introduction  Java Threads.

Artificial Intelligence By Mr. Ejaz CIIT Sahiwal Evolutionary Computation.

1 Comparative Study of two Genetic Algorithms Based Task Allocation Models in Distributed Computing System Oğuzhan TAŞ 2005.

Architecture for Resource Allocation Services Supporting Interactive Remote Desktop Sessions in Utility Grids Vanish Talwar, HP Labs Bikash Agarwalla,

1 Performance Impact of Resource Provisioning on Workflows Gurmeet Singh, Carl Kesselman and Ewa Deelman Information Science Institute University of Southern.

OPERATING SYSTEMS CS 3502 Fall 2017

Introduction to Load Balancing:

Operating Systems (CS 340 D)

Operating Systems (CS 340 D)

Liang Chen Advisor: Gagan Agrawal Computer Science & Engineering

L. Glimcher, R. Jin, G. Agrawal Presented by: Leo Glimcher

Presentation transcript:

Using extremal optimization for Java program initial placement in clusters of JVMs E. Laskowski 1, M. Tudruj 1,3, I. De Falco 2, U. Scafuri 2, E. Tarantino 2, R. Olejnik 4 1 Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland 2 Institute of High Performance Computing and Networking, ICAR-CNR, Naples, Italy 3 Polish-Japanese Institute of Information Technology, Warsaw, Poland 4 Computer Science Laboratory of Lille, University of Science and Technology of Lille, France. {laskowsk,

ASTEC Contents Motivation The ProActive environment Metrics Extremal Optimization Experimental results Conclusions

ASTEC Motivation Efficient load balancing on Grid platform Distribution management: load metrics: CPU queue length, resource utilization, response time communications metrics: transferred data volume, message exchange frequency. Balancing strategies: optimization of initial distribution of components of an application (initial object deployment) dynamic load balancing (migration of objects).

ASTEC ProActive ProActive: a Java-based framework for cluster and Grid computing an Java API + tools Desktop→SMP→LAN→Cluster→Grid Application model: Remote Mobile Objects, Group Communications Asynchronous Communications with synchronization (Futures mechanism) OO SPMD, Migration, Web Services, Grid support Various protocols: rmi, ssh, LSF, Globus

ASTEC Active Objects Active Object is a standard Java object with an attached thread: asynchronous method calls wait by necessity. Active Objects are created on nodes of the parallel system: the deployment is specified by external XML description and/or API calls the location is completely transparent to the client. Active objects are mobile.

ASTEC Active Objects

ASTEC Distributed multi-threaded model

ASTEC Initial deployment optimization steps Measure the properties of the environment: CPU power and availability, network utilization. Execute a program for some representative data (data sample): carry out the measurements of the number of mutual method calls and data volume create a method call graph (a DAG) with the use of method dependency graph and measured data. Find the optimal mapping of the graph Deploy and run the application in ProActive

ASTEC Observation Load monitoring (system observation): predicts workstation load and network utilization principle:  the average idle time that threads pass to the OS is directly related to the CPU load  the maximal number of RMI calls per second. Object monitoring (application observation): gives the intensity of communication between active objects principle: the number of method calls between active objects and the volume of serialized data.

ASTEC Application observation Application objects: global objects (ProActive's active objects) local objects (traditional Java objects) Only global (active) objects are observed Observed items: quantity of objects’ work intensity of communications between objects Counting the method invocation number (remote communication)

ASTEC Invocation counters G – an Active Object G Local objects Output local invocations Input invocations Output global invocations counter counter s

ASTEC The system description The system consists of N computing resources (nodes) The state of system resources: node power α i : the number of instructions computed per time unit on the node i average load of each node ℓ i (Δt) in a particular time span Δt: ℓ i (Δt) ranges in [0.0, 1.0], where 0.0 means a node with no load and 1.0 a node loaded at 100% network bandwidth β ij : the communication bandwidth between the pair of nodes i and j.

ASTEC The application An application is subdivided into P communicating subtasks, two possible models:  an application is described by an undirected weighted graph G tig = {P, E} (the TIG model)  an application is described by a weighted directed acyclic graph G dag = {P, E} (the DAG model)  it is possible to translate DAG into TIG Parameters of a subtask k:  the number of instructions γ k to be executed  the amount of communications ψ km to be performed with the other m-th subtask

ASTEC EO algorithm An introductory optimization algorithm determines an initial distribution of application components on JVMs located on Grid nodes The problem is to assign each subtask to one node in the grid in a way that the execution of the application task is as efficient as possible the optimal mapping of application tasks onto the nodes in heterogeneous environment is NP–hard So, we use the Extremal Optimization algorithm for mapping of tasks to nodes

ASTEC The principle of the EO Extremal Optimization is a co-evolutionary algorithm proposed by Boettcher and Percus in 1999 EO works with one single solution S made of a given number of components s i, each of which is a variable of the problem, is thought to be a species of the ecosystem, and is assigned a fitness value φ i Two fitness functions, one for the variables and one for the global solution.

ASTEC The outline of the EO an initial random solution S is generated and its fitness Φ(S) is computed repeat the following until a termination criterion becomes satisfied: the fitness value φ i is computed for each of the components s i the worst variable (in terms of φ i ) is randomly updated, so that the solution is transformed into another solution S’ belonging to its neighborhood Neigh(S)

ASTEC Problems and improvements The basic EO leads to a deterministic process, i.e., it gets stuck in a local optimum  to avoid this behavior, Boettcher and Percus introduced a probabilistic version of EO /τ-EO/ the variables are ranked in increasing order of fitness values /for the minimization problem/ a distribution probability over the ranks k is conside- red for a given value of the parameter τ: p k ~ k -τ, 1 ≤ k ≤ n at each update a rank k is selected according to p k and the variable s i with i = π(k) randomly changes its state.

ASTEC Pseudocode of the τ-EO algorithm (for a minimization problem) The only algorithm parameters are:  the maximum number of iterations N iter  the probabilistic selection value τ

ASTEC Encoding of mapping problem A mapping solution is represented by a vector μ of P integers ranging in the interval [1,N] In this example the first subtask of the task is placed on the grid node denoted with the number 12, the second on grid node 7, and so on. Sub- task P Sub- task P-1 Sub- task P-2 …Sub- task 4 Sub- task 3 Sub- task 2 Sub- task …45712

ASTEC Fitness of a solution Fitness function of a mapping solution: where  θ ij comp the computation time needed to execute the subtask i on the node j to which it is assigned by the proposed mapping solution  θ ij comm the communication time requested to execute the subtask i on the node j to which it is assigned by the proposed mapping solution  θ ij = θ ij comp + θ ij comm is the total time needed to execute the subtask i on the node j to which it is assigned by the proposed mapping solution.

ASTEC Experimental results τ-EO parameter setting N iter = 200,000 τ = runs on each problem Two experiments reported in the presentation: a simulated execution of an application in a test grid (10 sites, 184 nodes with different power and load) an optimization of a ProActive application in a cluster (7 homogenous, two-core nodes).

ASTEC Experiment 1 – the test grid A grid with 10 sites, a total of 184 nodes

ASTEC Experiment 1 - features of the nodes Average loads: ℓ i (Δt) = 0.0 for all nodes apart: ℓ i (Δt) = 0.5 for each i є [22, …, 31] ℓ i (Δt) = 0.5 for each i є [42, …, 47]  the most powerful nodes are the first 22 of A and the first 10 of B

ASTEC Experiment 1 – the application P=30 nodes G1G1 G2G2 T 14 TiTi T0T0 T 15 T i+15 T 29 :::::: :::::: :::::: γ k = 90,000 MI ψ km = 100 Mbit ::::::

ASTEC Experiment 1 – the result The optimal allocation entails both the use of the most powerful nodes and the distribution of the communicating tasks in pairs on the same site so that communications are faster (only intersite, no intrasite) the solution allocates 11 task pairs on the 22 unloaded nodes in A and the remaining 4 pairs on 8 unloaded nodes in B:

ASTEC Experiment 2 – the cluster A cluster: 7 homogenous two-core nodes Gigabit Ethernet LAN average extra load ℓ i (Δt) = 0.0 for all nodes each node has Sun JVM installed and a ssh agent The scenario of the experiment: 1.CPU power, load and network utilization monitoring 2.application parameters' measuring (using the sample data) 3.mapping optimization and the final run.

ASTEC Experiment 2 – the application A ProActive Java multi-threaded application, working according to the DAG model 58 nodes the DAG is executed in the loop (200 iterations)

ASTEC Experiment 2 – the result Since the nodes are homogenous and without the extra load, the EO mapping balanced the amount of computations assigned to each node: Node nb:Amount of computations

ASTEC Typical evolution of τ-EO on a mapping problem Evolution of the best-so-far value is shown on the left, and both best-so-far and current solutions for the first 200 iterations are shown on the right

ASTEC Conclusions Extremal Optimization has been proposed as a viable approach to the mapping of the tasks making up an application in grid environments The unique feature of the presented approach is the ability to deal with different load of nodes and the diversity in network bandwidths τ-EO shows two very interesting features when compared to other optimization tools based on Evolutionary Algorithms (e.g. Differential Evolution: – a much higher speed – its ability to provide stable solutions.