MPJ (Message Passing in Java): The past, present, and future


MPJ (Message Passing in Java): The past, present, and future
Aamir Shafi
Distributed Systems Group, University of Portsmouth
November 30, 2018

Presentation Outline
- Introduction:
  - Java messaging systems,
  - Java NIO (New I/O package),
  - Comparison of Java with C,
  - The trend towards SMP clusters.
- Background and review
- MPJ design & implementation
- Performance evaluation
- Conclusions

Introduction
A lot of interest in a Java messaging system:
- There is no reference messaging system in pure Java,
- A reference system should follow the API defined by the MPJ specification.
What does a Java messaging system have to offer?
- Portability: write once, run anywhere.
- Object-oriented programming: a higher level of abstraction for parallel programming.
- An extensive set of API libraries: avoids reinventing the wheel.
- A multi-threaded language with thread-safety mechanisms: 'synchronized' blocks, wait() and notify() in the Object class.
- Automatic memory management.

Introduction
But isn't Java slower than C (in terms of I/O)?
- The traditional I/O package of Java is a blocking API: a separate thread is needed to handle each socket connection.
- The Java New I/O package adds non-blocking I/O to the Java language: C select()-like functionality.
- Direct buffers: conventional Java objects are allocated on the JVM heap; direct buffers, by contrast, are allocated in native OS memory. They provide faster I/O and are not subject to garbage collection.
- JIT (Just In Time) compilers convert bytecode into native machine code.
Communication performance:
- Comparison of the Java NIO and C Netpipe drivers,
- Java performs similarly to C on Fast Ethernet.
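As a minimal sketch of the direct-buffer idea above (plain java.nio, nothing MPJ-specific; the buffer size and contents are illustrative):

```java
import java.nio.ByteBuffer;

// Direct buffers live in native OS memory rather than on the JVM heap,
// so socket reads/writes can use them without an intermediate copy.
public class DirectBufferDemo {
    public static ByteBuffer fill(int n) {
        ByteBuffer buf = ByteBuffer.allocateDirect(n);  // native memory, not heap
        for (int i = 0; i < n; i++) buf.put((byte) i);
        buf.flip();                                     // switch from writing to reading
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = fill(4);
        System.out.println(buf.isDirect());   // true
        System.out.println(buf.get(3));       // 3
    }
}
```

The same `ByteBuffer` type also backs the non-blocking channel reads and writes that the NIO `Selector` multiplexes.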

[Transfer time graph: x-axis, y-axis, and per-line legend explained.] Transfer time graphs are important for short messages.

[Bandwidth graph.] Bandwidth is important for large messages.

Introduction – Some background
Having discussed Java, we now cover some parallel programming background.
Parallel programming paradigms:
- Shared memory: standard: SHMEM and, more recently, OpenMP; implementation: JOMP (Java OpenMP).
- Distributed memory: standard: Message Passing Interface (MPI); implementation: MPJ (Message Passing in Java).
- Hybrid paradigms.
Clusters have become a cost-effective alternative to traditional HPC hardware. This trend towards clusters led to the emergence of SMP clusters:
- StarBug, the DSG cluster, consists of eight dual-CPU nodes,
- Shared memory for intra-node communications,
- Distributed memory for inter-node communications,
- Thus, a framework based on a hybrid programming paradigm.

Aims of the project
- Research and development of a reference messaging system based on Java NIO.
- A secure runtime infrastructure to bootstrap and control MPJ processes.
- An MPJ framework for SMP clusters:
  - Integrate MPJ and JOMP: use MPJ for distributed memory, JOMP for shared memory.
  - Map the parallel application to the underlying hardware for optimal execution.
- Debugging, monitoring, and profiling tools.
This talk discusses MPJ and the secure runtime, and motivates efficient execution on shared-memory processors.

Presentation Outline
- Introduction
- Background and review
- MPJ design & implementation
- Performance evaluation
- Conclusions

Background and review
This section of the talk discusses:
- Messaging systems in Java,
- Shared memory libraries in Java,
- The runtime infrastructures.
A detailed literature review is available in the DSG first-year technical report: "A Status Report: Early Experiences with the implementation of a Message Passing System using Java NIO" http://dsg.port.ac.uk/~shafia/res/papers/DSG_2.pdf

Messaging systems in Java
Three approaches to building messaging systems in Java:
- RMI (Remote Method Invocation): a Java API that allows the execution of remote objects; meant for client-server interaction; transfers primitive datatypes as objects.
- JNI (Java Native Interface): an interface that allows Java to invoke C (and other languages); not truly portable; additional copying between Java and C.
- The sockets interface: the Java standard I/O package, or the Java New I/O package.

Using RMI
- JMPI (University of Massachusetts): Cons: not active; poor performance because of RMI; KaRMI was used instead of RMI (KaRMI runs on Myrinet).
- CCJ (Vrije Universiteit Amsterdam): not active; supports the transfer of objects as well as basic datatypes; poor performance because of RMI.
- JMPP (National Chiao-Tung University).

Using JNI
- mpiJava (Indiana University + UoP): Pros: moving towards the MPJ API specification; well-supported and widely used. Cons: uses JNI and native MPI as the communication medium.
- JavaMPI (University of Westminster): no longer active (uses the Native Method Interface, NMI); source code not available.
- M-JavaMPI (The University of Hong Kong): supports process migration using JVMDI (JVM Debug Interface), which was deprecated in Java 1.5 in favour of JVMTI (JVM Tool Interface).

Using the sockets interface …
- MPJava (University of Maryland): Pros: based on Java NIO. Cons: no runtime infrastructure; source code not available.
- MPP (University of Bergen): based on Java NIO; a subset of MPI functionality because of a bug in the TCP/IP stack.

Shared memory libraries in Java
- OpenMP implementation in Java (EPCC): JOMP (Java OpenMP); a single JVM starts multiple threads to match the number of processors on an SMP node.
- Efficient shared memory communications can also be implemented with the MappedByteBuffer class: memory is mapped to a file; the sender may lock and write to the file; the reader may lock and read from it.
- A single-JVM implementation of mpjdev, where threads act as processes.
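The MappedByteBuffer approach above can be sketched as follows. This is a single-process sketch, under the assumption that a second JVM mapping the same file would see the written value; the locking step (java.nio.channels.FileLock) mentioned on the slide is omitted, and the class and file names are illustrative.

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// A memory-mapped file as a shared-memory segment: a "sender" JVM writes
// into the mapping and a "receiver" JVM mapping the same file reads it.
public class SharedMemorySketch {
    public static int roundTrip() throws Exception {
        File f = File.createTempFile("mpj-shm", ".bin");  // stand-in for the shared file
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            map.putInt(0, 42);        // sender side: write at offset 0
            return map.getInt(0);     // receiver side would read the same offset
        } finally {
            f.delete();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());  // 42
    }
}
```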

The runtime infrastructures
- Shell/Perl scripts: most messaging systems use SSH to start remote processes (on Linux).
- SMPD (Argonne National Lab): part of MPICH-2; SMPD stands for "Super Multi-Purpose Daemon"; different implementations for Linux and Windows.
- Java is ideal for implementing the runtime infrastructure: portability, since the same implementation runs on different operating systems.

Presentation Outline
- Introduction
- Literature review
- MPJ design & implementation
- Performance evaluation
- Conclusions

Design Goals
- Portability.
- Standard Java: assuming no language extensions.
- High performance.
- Modular and layered architecture: device drivers, and other layers.
- Allows a higher level of abstraction: by enabling the transfer of objects.

The Generic Design
- High Level MPJ: collective operations; process topologies.
- Base Level MPJ: all point-to-point modes; groups; communicators; datatypes.
- MPJ Device Level: isend, irecv, waitany, …; physical process ids (no groups); contexts and tags (no communicators); byte vector data; buffer packing/unpacking; JNI wrapper.
- Communication medium: Java NIO and Thread APIs; native MPI; specialised hardware libraries (e.g. VIA communication primitives).
- Process creation and monitoring: MPJ service daemon; Java Reflection API to start processes; dynamic class loading.
We first discuss the top three layers, then the bottom layer.

Implementation of MPJ
- Device drivers: the Java NIO device driver (mpjdev); the native MPI device driver (native mpjdev); swapped in/out of MPJ; similar to device drivers in MPICH.
- MPJ point-to-point communications: blocking and non-blocking; communicators, virtual topologies.
- MPJ collective communications: various collective communication methods.
- Instantiation of the MPJ design (on the next slide).

[Diagram: instantiation of the MPJ design.]

Java NIO device driver
Communication protocols:
- Eager-send: assumes the receiver has infinite memory; for small messages (< 128 Kbytes); may incur additional copying.
- Rendezvous: exchange of control messages before the actual transmission; for long messages (≥ 128 Kbytes).
The buffering API:
- Supports gathering/scattering of data,
- Supports the transfer of Java objects.
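The protocol switch described above can be sketched as a simple size test. The 128 Kbyte threshold comes from the slide; the class and method names are hypothetical, not the actual mpjdev code.

```java
// Hypothetical sketch of the mpjdev protocol choice: eager-send pushes the
// payload immediately (the receiver must buffer it), while rendezvous first
// exchanges control messages so the receiver can post a matching buffer.
public class ProtocolSelect {
    static final int EAGER_LIMIT = 128 * 1024;  // threshold from the slide, in bytes

    public static String choose(int messageBytes) {
        return messageBytes < EAGER_LIMIT ? "eager-send" : "rendezvous";
    }

    public static void main(String[] args) {
        System.out.println(choose(1024));     // eager-send
        System.out.println(choose(1 << 20));  // rendezvous
    }
}
```

The trade-off is visible in the performance graphs later in the talk: eager-send avoids the control-message round trip that dominates small-message latency, while rendezvous avoids extra copying for large payloads.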

Pt2Pt and collective methods
Point-to-point communications:
- Blocking/non-blocking methods,
- Buffered/ready/synchronous modes of send: supported by the eager-send and rendezvous protocols at the device level,
- Communicators,
- Virtual topologies.
Collective communications:
- Provided as a utility to MPI programmers,
- Gather/scatter/all-to-all/reduce/all-reduce/scan.

The runtime infrastructure

Design of the runtime infrastructure

Implementation of the runtime
The administrator installs MPJDaemons:
- SSH allows us to install the daemons remotely on Linux,
- The admin certificate is added to every daemon's keystore (a repository of certificates).
Using the daemons:
- The administrator adds the user certificate into the keystore,
- The MPJRun module is used to run the parallel application.
Copying executables from MPJRun to MPJDaemon:
- Via dynamic class loading,
- Stdout/stderr is redirected to MPJRun.

Implementation issues
Issues with Java NIO:
- The "Taming the NIO circus" thread: http://forum.java.sun.com/thread.jsp?forum=4&thread=459338&start=0&range=15&hilite=false&q
- Allocating direct buffers led to OutOfMemoryError (a bug),
- Selectors taking 100 percent CPU: there is no need to register for write events; only register for read events,
- J2SE (Java 2 Standard Edition) 1.5 has solved many problems.
MPJDaemons ran out of memory because of direct buffers:
- These buffers are not subject to garbage collection,
- Details shown in the coming slides,
- Solved by starting a new JVM at the MPJDaemon.
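The "register for read events only" fix can be illustrated with a self-contained example; a Pipe stands in for a socket here so the sketch runs without a network, and all names are illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Registering only OP_READ: the selector wakes when data arrives, instead of
// spinning on write events, which are almost always ready.
public class ReadOnlySelector {
    public static int demo() throws Exception {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        Selector sel = Selector.open();
        pipe.source().register(sel, SelectionKey.OP_READ);  // read events only

        pipe.sink().write(ByteBuffer.wrap(new byte[]{7}));  // make data available
        sel.select(1000);                                   // returns once data arrives

        ByteBuffer buf = ByteBuffer.allocate(1);
        pipe.source().read(buf);
        sel.close();
        pipe.sink().close();
        pipe.source().close();
        return buf.get(0);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());  // 7
    }
}
```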

Machine names where the MPJDaemon will be installed

Installing the daemon from the initiator machine

First execution …

Memory stats of one of the machines where the MPJDaemon is installed and is executing an MPJ app

After second execution …

After a few more executions …

Finally, out of memory …

Presentation Outline
- Introduction
- Literature review
- MPJ design & implementation
- Performance evaluation
- Conclusion

Sequence of perf evaluation graphs
Java NIO device driver evaluation, compared to native mpjdev (which uses MPICH by interfacing through JNI):
- Remote nodes of a cluster: transfer time (microseconds); throughput achieved (Mbit/s),
- Same node of a cluster: importance of OpenMP for shared memory communications.
Evaluation of the eager-send & rendezvous protocols:
- Throughput achieved (Mbit/s).
MPJ Pt2Pt evaluation.

The Java NIO device driver (red line) performs similarly to the native mpjdev device driver. Latency (the time taken to transfer one byte) is ~260 microseconds.

Throughput for both devices is ~89 Mbit/s. The change from the eager-send to the rendezvous protocol is visible in the graph.

mpjdev (red line) communicates through sockets; 'native mpjdev' communicates through shared memory (the time is dictated by the memory bus bandwidth). A problem for SMP clusters!

- Eager-send is for small messages (< 128 Kbytes), but may incur additional copying.
- The time for exchanging control messages in rendezvous dictates the communication time for small messages.
- Rendezvous is suitable for large messages (> 128 Kbytes).

Sequence of perf evaluation graphs
- Java NIO device driver evaluation, compared to native mpjdev (which uses MPICH by interfacing through JNI): remote nodes of a cluster: transfer time (microseconds), throughput achieved (Mbit/s); same node of a cluster: importance of OpenMP for shared memory communications.
- Evaluation of the eager-send & rendezvous protocols: throughput achieved (Mbit/s).
- MPJ Pt2Pt evaluation: throughput achieved (Mbit/s).

Sequence of perf evaluation graphs
- Java NIO device driver evaluation: remote nodes of a cluster: transfer time (microseconds), throughput achieved (Mbit/s); same node of a cluster: importance of OpenMP for shared memory communications.
- Evaluation of the eager-send & rendezvous protocols: throughput achieved (Mbit/s).
- MPJ Pt2Pt evaluation: compared to MPICH and mpiJava (mpiJava uses MPICH by interfacing through JNI).

MPICH (C MPI) ~89 Mbit/s; MPJ ~88 Mbit/s; mpiJava ~82 Mbit/s. The overhead of JNI comes into play for mpiJava.

Parallel matrix multiplication
Aims:
- Test the functionality of MPJ,
- Measure the speed-up of the parallel version against the sequential version.
The parallel version (Total Processes = TP):
- Suppose matrices A and B, with M rows and N columns,
- Send (M/TP) rows of matrix A to each process, along with matrix B, to compute (N/TP) columns of the resultant matrix C,
- Receive (N/TP) columns of the resultant matrix C from each process.
A trivial parallel application:
- The parallel version used eight processors on StarBug, the DSG cluster.
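The decomposition above can be sketched in plain Java, with threads standing in for MPJ processes; the row split is the same, while the matrix sizes and class names are illustrative.

```java
// Row-block parallel matrix multiplication: each "process" p computes
// rows [p*M/TP, (p+1)*M/TP) of C = A * B from its rows of A and all of B.
public class RowBlockMatMul {
    static void multiplyRows(double[][] A, double[][] B, double[][] C, int lo, int hi) {
        for (int i = lo; i < hi; i++)
            for (int j = 0; j < B[0].length; j++) {
                double s = 0;
                for (int k = 0; k < B.length; k++) s += A[i][k] * B[k][j];
                C[i][j] = s;
            }
    }

    public static double[][] multiply(double[][] A, double[][] B, int tp)
            throws InterruptedException {
        int m = A.length;
        double[][] C = new double[m][B[0].length];
        Thread[] workers = new Thread[tp];
        for (int p = 0; p < tp; p++) {
            final int lo = p * m / tp, hi = (p + 1) * m / tp;  // this process's rows
            workers[p] = new Thread(() -> multiplyRows(A, B, C, lo, hi));
            workers[p].start();
        }
        for (Thread t : workers) t.join();  // wait for all "processes"
        return C;
    }

    public static void main(String[] args) throws InterruptedException {
        double[][] A = {{2, 0}, {0, 2}}, B = {{3, 0}, {0, 3}};  // 2I * 3I = 6I
        System.out.println(multiply(A, B, 2)[1][1]);            // 6.0
    }
}
```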

The default 64 MB heap runs out of memory at this point

Java on Gigabit Ethernet
- The max throughput achieved by the C Netpipe driver is ~900 Mbit/s.
- The max throughput achieved by the Java Netpipe driver is ~680 Mbit/s.
- The Java driver should be right up with the C driver!

Things only get worse with mpjdev on Gigabit Ethernet!
JVM/GC configurations tried:
1. Aggressive heap
2. -client JVM
3. Concurrent GC
4. Concurrent I/O
5. Fixed no. of GC threads
6. Incremental GC
7. No class GC
8. Par new GC
9. GC time ratio
10. Simple (nothing)

Java on Gigabit Ethernet
The performance of Java is not satisfactory on Gigabit Ethernet:
- It depends on how well the application is written: well in the sense of consuming as little memory as possible.
- Suspicion: garbage collection starts using a lot of CPU.
- This problem was identified while comparing the Netpipe C and Java drivers on Gigabit Ethernet.
- Understanding this behaviour/problem is work in progress.

Presentation Outline
- Introduction
- Literature review
- MPJ design & implementation
- Performance evaluation
- Conclusions

Summary
- There is a lot of interest in the community in implementing a reference Java messaging system.
- The overall aim of this project is to develop a framework for parallel programming in Java on SMP clusters.
- The past year has been spent developing a reference messaging system: a Java NIO based device driver; MPJ Pt2Pt and collective communications are in progress.
- A secure runtime infrastructure executes the MPJ application over a cluster of workstations connected by a fast network.

Future work
- Implementation of the MPJ standard: virtual topologies, communicators, derived datatypes, etc.
- Support for multi-dimensional arrays: multi-dimensional arrays in Java are really 'arrays of arrays'; inefficient and confusing for application developers.
- Support for shared memory communications: JOMP; a specialised device driver that uses threads as processes on an SMP node; or MappedByteBuffer-based shared memory communications between two JVMs on an SMP node. It is not clear which of these three options is the most efficient.
- Integration of JOMP and MPJ: it is not clear what the best way to integrate them is.
- Monitoring, debugging, and profiling tools.
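The 'arrays of arrays' point above can be made concrete: each row of a Java 2-D array is a separate heap object, so a messaging layer must walk and copy row by row; flattening yields one contiguous buffer to hand to the device. A minimal sketch (class and method names illustrative):

```java
// Flatten a 2-D Java array (an array of separate row objects) into one
// contiguous 1-D array that can be sent as a single buffer.
public class FlattenDemo {
    public static double[] flatten(double[][] m) {
        int rows = m.length, cols = m[0].length;
        double[] flat = new double[rows * cols];
        for (int i = 0; i < rows; i++)
            System.arraycopy(m[i], 0, flat, i * cols, cols);  // one copy per row object
        return flat;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2}, {3, 4}};
        System.out.println(flatten(m)[3]);  // 4.0
    }
}
```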

Conclusions
- The performance of MPJ suggests that Java is a viable option for a messaging system.
- Java NIO adds useful functionality to Java: non-blocking I/O; direct buffers.
- MPJ can be used as: a teaching tool (easy to use: concentrate on message passing concepts, no memory leaks); a simulation tool enabling Rapid Application Development (to test fault-tolerant algorithms: load balancing and process migration).
- The MPJ runtime infrastructure allows execution on heterogeneous systems.

Questions/Suggestions