Optimized Java computing as an application for Desktop Grid Olejnik Richard 1, Bernard Toursel 1, Marek Tudruj 2, Eryk Laskowski 2 1 Université des Sciences et Technologies de Lille Laboratoire d’Informatique Fondamentale de Lille (LIFL UMR CNRS 8022) {olejnik, 2 Institute of Computer Science Polish Academy of Sciences Warsaw, Poland {tudruj,
2 Components for Desktop Grid
3 Heterogeneous Applications Each application is composed of one or more tasks. Different tasks may have different computational needs. There may be inter-task communication. Each task can be assigned to a computer with any architecture.
4 Grid'5000 Grid'5000 is a research effort developing a large-scale infrastructure for Grid research. 10 laboratories are involved, with the objective of providing the community of Grid researchers with a testbed that allows experiments in all the software layers, from the network protocols up to the applications. The current plans are to assemble a physical platform featuring 8 clusters, connected by the Renater Education and Research Network at 1 Gb/s (10 Gb/s is expected in the future).
5 Objectives Execution efficiency of parallel and distributed Java applications on the Grid. Transparent and optimized object placement with dynamic load balancing strategies (execution aspects). Transparent control of parallelism and easy collection of results (programming aspects).
6 What we propose Single System Image (SSI) of clusters. Special mechanisms at the middleware level: –Dynamic and automatic adaptation to variations of computation methods and execution platforms. –Special mechanisms at the programming environment level, which facilitate expression of parallelism and distribution. Using Components in Grid environment
7 Issues Issues in building distributed applications –Collaboration between different applications –Cross programming languages and platforms –Controlling parallel applications transparently –Managing software complexity and evolution –Reuse and sharing of existing scientific code –Encapsulation and modular construction
8 DG-ADAJ Environment [Figure: layered architecture of a Grid working node. Figure labels: JVM and Network (portability); RMI (distant access); JavaParty (transparency); ADAJ (load balancing system, tools for expression of parallelism, migration of active objects); CCA (application builder, framework services, control components).]
9 CCADAJ Component Architectures Control component library –Provides parallel/distributed control –Connects components in a parallel way –Data exchange between components: -Demand driven & Data driven -Event notification Multiple level composition CCADAJ features –CCA Compliant (Common Component Architecture) –Services -Instantiation, Connection -ConnectionEvent, ComponentEvent -Composition
10 CCA Features Components and Ports –Components interact through interfaces (ports) –Components can provide ports: -implementation of port interfaces -the service a component offers –Components can use ports -Call of the method in the port -Capability the component needs to use –Connecting components through ”provides-uses” ports
11 CCA: Interactions between components GoPort –Special port to “execute” a component –Implements the go() method, which starts execution of the component –The framework searches for go ports and uses them UsesPort (MultiplierPort, AdderPort) –Call getPort on the services object to obtain the port –Call a method on the port Ex: x=multiplierPort.getProduct(y,z); Connection of uses/provides ports –“No uses” / “uses neither provides” / “provides” Ports connected by types –Port types must match –Port names are unique in a component The framework puts info about the provider into the user component’s services object [Figure: StarterComponent (GoPort, MultiplierPort), MultiplierComponent (MultiplierPort, AdderPort), AdderComponent (AdderPort)] connect StarterComponent MultiplierPort MultiplierComponent MultiplierPort connect MultiplierComponent AdderPort AdderComponent AdderPort
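The uses/provides wiring on this slide can be illustrated with a small self-contained sketch. It does not use the real CCA or CCADAJ API: `Services`, `Framework`, `instantiate`, `connect` and the port interfaces below are hypothetical stand-ins, written only to show how a framework might hand a provider's port to a user component and then fire a go port.

```java
import java.util.HashMap;
import java.util.Map;

class PortsSketch {
    /** Marker interface for all ports (hypothetical, not the CCA API). */
    interface Port {}
    interface AdderPort extends Port { int getSum(int a, int b); }
    interface MultiplierPort extends Port { int getProduct(int a, int b); }
    interface GoPort extends Port { int go(); }

    /** Per-component services object: provided ports plus connected uses ports. */
    static class Services {
        private final Map<String, Port> provides = new HashMap<>();
        private final Map<String, Port> uses = new HashMap<>();
        void addProvidesPort(String name, Port p) { provides.put(name, p); }
        Port getPort(String name) {
            Port p = uses.get(name);
            if (p == null) throw new IllegalStateException("port not connected: " + name);
            return p;
        }
    }

    interface Component { void setServices(Services s); }

    static class AdderComponent implements Component {
        public void setServices(Services s) {
            s.addProvidesPort("AdderPort", (AdderPort) (a, b) -> a + b);
        }
    }

    /** Provides MultiplierPort and uses AdderPort (repeated addition, b >= 0 assumed). */
    static class MultiplierComponent implements Component {
        public void setServices(Services s) {
            s.addProvidesPort("MultiplierPort", (MultiplierPort) (a, b) -> {
                AdderPort adder = (AdderPort) s.getPort("AdderPort");
                int product = 0;
                for (int i = 0; i < b; i++) product = adder.getSum(product, a);
                return product;
            });
        }
    }

    /** Provides GoPort and uses MultiplierPort. */
    static class StarterComponent implements Component {
        public void setServices(Services s) {
            s.addProvidesPort("GoPort", (GoPort) () ->
                ((MultiplierPort) s.getPort("MultiplierPort")).getProduct(6, 7));
        }
    }

    /** Toy framework: instantiates components, connects ports by name, runs go ports. */
    static class Framework {
        private final Map<String, Services> services = new HashMap<>();
        void instantiate(String name, Component c) {
            Services s = new Services();
            c.setServices(s);
            services.put(name, s);
        }
        void connect(String user, String usesPort, String provider, String providesPort) {
            Port p = services.get(provider).provides.get(providesPort);
            if (p == null) throw new IllegalArgumentException("no such port: " + providesPort);
            // Put the provider's port into the user component's services object.
            services.get(user).uses.put(usesPort, p);
        }
        int go(String name) {
            return ((GoPort) services.get(name).provides.get("GoPort")).go();
        }
    }

    static int run() {
        Framework fw = new Framework();
        fw.instantiate("StarterComponent", new StarterComponent());
        fw.instantiate("MultiplierComponent", new MultiplierComponent());
        fw.instantiate("AdderComponent", new AdderComponent());
        fw.connect("StarterComponent", "MultiplierPort", "MultiplierComponent", "MultiplierPort");
        fw.connect("MultiplierComponent", "AdderPort", "AdderComponent", "AdderPort");
        return fw.go("StarterComponent");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The two `connect` calls mirror the connect lines on the slide. A real CCA framework would also check that the port types match before connecting, which this sketch leaves to the casts.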
12 Composition (super-component) Super component –Component -Incorporates “provides”, “uses”, and/or “go” ports –Additional services (framework) -Instantiation of inner components -Mechanism for exposing inner components’ ports to the outside -Activation of inner components -Connection of inner components [Figure: a SuperComponent with UsesPort, ProvidesPort and GoPort, containing InnerComponent (InnerUses, InnerProvides), InnerComp2 (InUses2) and some additional services]
13 The static optimization heuristics Before a Java program is executed on a Grid, an initial optimization algorithm is run to determine an initial distribution of program elements (objects) across Java Virtual Machines. The algorithm starts by creating a method dependence graph (MDG) of the Java program. In an MDG, methods are nodes and the calls between them are edges.
14 The optimization algorithm –Execute the program on some representative data –Measure the number of mutual method calls and the number of new thread spawnings –Store the measurement results in a trace file –Create a method call graph (MCG) from the method dependence graph and the trace file –Perform the clustering and mapping phases: -in the clustering phase, the algorithm merges pairs of MCG nodes if doing so reduces the program execution time, -the mapping phase assigns clusters to the physical JVMs with load balancing.
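The clustering and mapping phases above can be sketched as follows. This is not the authors' heuristic: the slide merges nodes when the estimated execution time drops, whereas this sketch substitutes a simpler rule (merge along the most frequently used call edges while a cluster stays under an even share of the total cost) and then maps clusters greedily to the least loaded JVM. The names `Edge`, `place` and `demo`, and the per-method cost values, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class McgClusteringSketch {
    /** Weighted MCG edge: number of calls observed between two methods in the trace. */
    static final class Edge {
        final String from, to;
        final int calls;
        Edge(String from, String to, int calls) { this.from = from; this.to = to; this.calls = calls; }
    }

    /** Returns method -> JVM index. Every method named in an edge must appear in cost. */
    static Map<String, Integer> place(Map<String, Integer> cost, List<Edge> edges, int jvmCount) {
        // Clustering phase: each method starts as its own cluster (simple union-find).
        Map<String, String> parent = new HashMap<>();
        Map<String, Integer> clusterCost = new HashMap<>();
        int total = 0;
        for (Map.Entry<String, Integer> e : cost.entrySet()) {
            parent.put(e.getKey(), e.getKey());
            clusterCost.put(e.getKey(), e.getValue());
            total += e.getValue();
        }
        int cap = (total + jvmCount - 1) / jvmCount; // even share per JVM, rounded up
        List<Edge> sorted = new ArrayList<>(edges);
        sorted.sort((x, y) -> Integer.compare(y.calls, x.calls)); // heaviest edges first
        for (Edge e : sorted) {
            String a = find(parent, e.from), b = find(parent, e.to);
            if (!a.equals(b) && clusterCost.get(a) + clusterCost.get(b) <= cap) {
                parent.put(b, a); // merge cluster b into cluster a
                clusterCost.put(a, clusterCost.get(a) + clusterCost.get(b));
                clusterCost.remove(b);
            }
        }
        // Mapping phase: largest clusters first, each to the currently least loaded JVM.
        List<String> roots = new ArrayList<>(clusterCost.keySet());
        roots.sort((x, y) -> {
            int c = Integer.compare(clusterCost.get(y), clusterCost.get(x));
            return c != 0 ? c : x.compareTo(y);
        });
        int[] load = new int[jvmCount];
        Map<String, Integer> jvmOfRoot = new HashMap<>();
        for (String r : roots) {
            int best = 0;
            for (int j = 1; j < jvmCount; j++) if (load[j] < load[best]) best = j;
            load[best] += clusterCost.get(r);
            jvmOfRoot.put(r, best);
        }
        Map<String, Integer> placement = new HashMap<>();
        for (String m : cost.keySet()) placement.put(m, jvmOfRoot.get(find(parent, m)));
        return placement;
    }

    static String find(Map<String, String> parent, String x) {
        while (!parent.get(x).equals(x)) x = parent.get(x);
        return x;
    }

    /** Hypothetical example: four methods of equal cost, two JVMs. */
    static Map<String, Integer> demo() {
        Map<String, Integer> cost = new HashMap<>();
        cost.put("A", 10); cost.put("B", 10); cost.put("C", 10); cost.put("D", 10);
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("A", "B", 100));
        edges.add(new Edge("C", "D", 80));
        edges.add(new Edge("B", "C", 5));
        return place(cost, edges, 2);
    }
}
```

In the `demo` data, the heavily communicating pairs A/B and C/D end up co-located, while the rarely used B/C edge is the one that crosses JVMs.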
Method Call Graph (MCG)
16 Dynamic Objects Redistribution –Workload is computed as a function of the intensity of method invocations –A two-phase algorithm runs concurrently with application execution: -to detect an application distribution imbalance -to correct the imbalance in a distributed manner –Objects from overloaded machines are transferred to underloaded ones
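The detect-then-correct idea on this slide can be sketched in a few lines. The sketch is centralized for brevity, whereas the slide describes a distributed correction, and it makes several assumptions not found in the source: workload is taken to be the plain sum of per-object invocation counts, a JVM counts as overloaded when its load exceeds the mean by a `tolerance` fraction, and `rebalance` and `demo` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RedistributionSketch {
    /** Load of one JVM: sum of the invocation counts of the objects it hosts. */
    static int load(Map<String, Integer> jvm) {
        int sum = 0;
        for (int c : jvm.values()) sum += c;
        return sum;
    }

    /** Moves objects between JVMs (non-empty list assumed); returns the number of moves. */
    static int rebalance(List<Map<String, Integer>> jvms, double tolerance) {
        int moves = 0;
        while (true) {
            int[] loads = new int[jvms.size()];
            int hi = 0, lo = 0;
            long total = 0;
            for (int i = 0; i < loads.length; i++) {
                loads[i] = load(jvms.get(i));
                total += loads[i];
                if (loads[i] > loads[hi]) hi = i;
                if (loads[i] < loads[lo]) lo = i;
            }
            // Phase 1: detect imbalance against the mean load.
            double mean = (double) total / loads.length;
            if (loads[hi] <= mean * (1 + tolerance)) return moves;
            // Phase 2: move the busiest object that still narrows the gap.
            int gap = loads[hi] - loads[lo];
            String pick = null;
            for (Map.Entry<String, Integer> e : jvms.get(hi).entrySet()) {
                int c = e.getValue();
                if (c > 0 && c < gap && (pick == null || c > jvms.get(hi).get(pick))) {
                    pick = e.getKey();
                }
            }
            if (pick == null) return moves; // no single move would help
            jvms.get(lo).put(pick, jvms.get(hi).remove(pick));
            moves++;
        }
    }

    /** Hypothetical example: one overloaded JVM and two nearly idle ones. */
    static List<Map<String, Integer>> demo() {
        List<Map<String, Integer>> jvms = new ArrayList<>();
        Map<String, Integer> j0 = new HashMap<>();
        j0.put("a", 50); j0.put("b", 30); j0.put("c", 20);
        Map<String, Integer> j1 = new HashMap<>();
        j1.put("d", 10);
        Map<String, Integer> j2 = new HashMap<>();
        j2.put("e", 10);
        jvms.add(j0); jvms.add(j1); jvms.add(j2);
        rebalance(jvms, 0.25);
        return jvms;
    }
}
```

Requiring the moved object's count to be smaller than the gap means every move strictly reduces the imbalance between the two machines, so the loop terminates. A real system would also weigh migration cost and the relations between objects.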
17 Observation mechanism of relations between objects [Figure: global and local objects inside a JVM, with the invocation counters outputGlobalInvocation (OGI), outputLocalInvocation (OLI) and inputInvocation (II). Legend: invocation, sum, JVM.]
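A minimal sketch of such counters follows. The slide only names the counters, so their reading here is an assumption: OGI counts calls an object makes to global objects, OLI counts calls it makes to local objects, and II counts calls it receives. The class names and the `workload` formula (a plain sum) are likewise hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

class InvocationCountersSketch {
    /** An object whose outgoing and incoming invocations are observed. */
    static final class ObservedObject {
        final String name;
        final boolean global; // assumed meaning: remotely accessible, migratable object
        final AtomicLong ogi = new AtomicLong(); // outputGlobalInvocation
        final AtomicLong oli = new AtomicLong(); // outputLocalInvocation
        final AtomicLong ii = new AtomicLong();  // inputInvocation

        ObservedObject(String name, boolean global) {
            this.name = name;
            this.global = global;
        }

        /** Records one invocation from this object to the target. */
        void invoke(ObservedObject target) {
            if (target.global) ogi.incrementAndGet();
            else oli.incrementAndGet();
            target.ii.incrementAndGet();
        }

        /** Assumed workload measure: total observed invocation activity. */
        long workload() {
            return ogi.get() + oli.get() + ii.get();
        }
    }

    public static void main(String[] args) {
        ObservedObject g1 = new ObservedObject("g1", true);
        ObservedObject g2 = new ObservedObject("g2", true);
        ObservedObject l1 = new ObservedObject("l1", false);
        for (int i = 0; i < 3; i++) g1.invoke(g2);
        for (int i = 0; i < 2; i++) g1.invoke(l1);
        System.out.println(g1.name + " workload = " + g1.workload());
    }
}
```

Atomic counters are used because invocations may arrive from several threads at once.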
18 Inter-Object Dynamic Relation Graph
19 Conclusions We presented optimization algorithms for distributed Java programs based on the analysis of graph representations. The optimization explores both static and dynamic dependencies between objects and their methods. The proposed static analysis is a preliminary stage, followed by dynamic load balancing, which in most cases uses a small amount of static information derived from the Java source code.
20 A dynamic load balancing mechanism is used, supported by three kinds of information: information about the computer load and performance, information about dynamic relations between objects, and information deduced from code analysis. Compile-time optimizations do not introduce any penalty in the execution time of the program, so they can use more sophisticated heuristics, which give better results.
21 Future Works CCADAJ development –Add new functionality –Emphasis on parallel/distributed control components –Component deployment –Collaboration with other CCA-Compliant frameworks Real problem solving