MPJ: The second generation ‘MPI for Java’

Aamir Shafi, Bryan Carpenter, Mark Baker

MPJ: The second generation ‘MPI for Java’ Aamir Shafi 26th April, 2005 Distributed Systems Group http://dsg.port.ac.uk

People Aamir Shafi Bryan Carpenter: Mark Baker Open Middleware Infrastructure Institute (OMII) Mark Baker November 16, 2018

Presentation outline: Introduction; Design and implementation of MPJ; The runtime infrastructure; Implementation issues; Conclusion.

Introduction. MPI was introduced in June 1994 as a standard message passing API for parallel scientific computing, with language bindings for C, C++, and Fortran. The ‘Java Grande Message Passing Workgroup’ defined Java bindings in 1998. Previous efforts follow two approaches: the JNI approach, and the pure Java approach (Remote Method Invocation (RMI) or sockets). Outline the project.

Introduction: Pure Java approach. RMI: meant for client-server applications. Java sockets: the Java New I/O (NIO) package adds non-blocking I/O to the Java language. Direct buffers are allocated in native OS memory, and the JVM attempts to provide faster I/O on them. Communication performance: in a comparison of Java NIO and C Netpipe drivers, Java performs similarly to C on Fast Ethernet. This is a very naïve comparison.
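To make the two NIO features mentioned on this slide concrete, here is a minimal sketch. It assumes nothing about MPJ itself: the class name `DirectBufferSketch` and the method `roundTrip` are hypothetical, and an in-process `Pipe` stands in for a network socket so that the example runs without a peer. It allocates direct buffers with `ByteBuffer.allocateDirect` and reads from a channel that has been switched to non-blocking mode.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

// Hypothetical demo class; not part of MPJ.
class DirectBufferSketch {

    // Sends one int through an in-process pipe using direct buffers and a
    // non-blocking read, then returns the value that arrived.
    static int roundTrip(int value) {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                // Non-blocking mode: read() returns immediately, possibly with 0 bytes.
                source.configureBlocking(false);

                // Direct buffers live outside the Java heap, in native memory.
                ByteBuffer out = ByteBuffer.allocateDirect(Integer.BYTES);
                out.putInt(value);
                out.flip();
                while (out.hasRemaining()) {
                    sink.write(out);
                }

                ByteBuffer in = ByteBuffer.allocateDirect(Integer.BYTES);
                // Poll until the whole int has arrived. A real device would
                // register the channel with a Selector instead of spinning.
                while (in.hasRemaining()) {
                    if (source.read(in) < 0) {
                        throw new IOException("pipe closed early");
                    }
                }
                in.flip();
                return in.getInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
    }
}
```

The same buffer and channel calls apply to a `SocketChannel`; only the way the channel is opened differs.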

The latency is ~250 microseconds. After 1 KB, the latency starts increasing due to fragmentation of packets. Netpipe is a simple single-threaded benchmark. Speaker note: explain the x-axis, the y-axis, and what each line represents. Transfer time graphs are important for short messages.

Max throughput is ~90 Mbps. It would be great if MPJ, with all its complexities, could reach ~80 Mbps. Bandwidth is important for large messages.

Introduction: JNI approach. The importance of JNI cannot be ignored: where Java fails, JNI makes it work. HPC communication hardware has continued to advance: network latency has been reduced to a couple of microseconds. ‘Pure Java’ looks like an impractical solution: in the presence of Myrinet, no application developer or user would opt for Fast Ethernet. Cons: not in keeping with the Java philosophy of ‘write once, run anywhere’.

Introduction. For Java messaging, there is no ‘one size fits all’ approach. Portability and high performance are often contradictory requirements: pure Java gives portability, JNI gives high performance. The choice between portability and high performance is best left to application developers. The challenging issue is how to manage these contradictory requirements: how can we provide a flexible mechanism that helps applications swap communication protocols?

Presentation outline: Introduction; Design and implementation; The runtime infrastructure; Implementation issues; Conclusion.

Design. Aim: support swapping various communication devices. Two device levels: (1) The MPJ device level (mpjdev) separates the native MPI device from all other devices. The ‘native MPI’ device is a special case: it makes it possible to cut through and use the native implementation of advanced MPI features. (2) The xdev device level (xdev): ‘gmdev’ is an xdev based on the GM 2.x comms library; ‘niodev’ is an xdev based on the Java NIO API; ‘smpdev’ is an xdev based on the Threads API.
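The idea of swapping devices behind a common interface can be sketched as follows. This is only an illustration under stated assumptions: `CommDevice`, `InMemoryDevice`, and `DeviceFactory` are hypothetical names, not the actual mpjdev or xdev API, and the toy device simply passes byte arrays between ranks through in-memory queues. The only detail taken from the slides is that a device is selected by a name such as ‘smpdev’.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical interface standing in for an xdev-style device.
interface CommDevice {
    String name();
    void send(int destRank, byte[] message);
    byte[] recv(int selfRank);
}

// Toy in-memory device: one mailbox queue per rank.
class InMemoryDevice implements CommDevice {
    private final Map<Integer, BlockingQueue<byte[]>> mailboxes = new ConcurrentHashMap<>();

    private BlockingQueue<byte[]> mailbox(int rank) {
        return mailboxes.computeIfAbsent(rank, r -> new LinkedBlockingQueue<>());
    }

    @Override
    public String name() {
        return "smpdev";
    }

    @Override
    public void send(int destRank, byte[] message) {
        // Copy so the sender may reuse its array after send() returns.
        mailbox(destRank).add(message.clone());
    }

    @Override
    public byte[] recv(int selfRank) {
        try {
            return mailbox(selfRank).take(); // blocks until a message arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical factory: upper layers ask for a device by name only.
class DeviceFactory {
    static CommDevice newInstance(String deviceName) {
        switch (deviceName) {
            case "smpdev":
                return new InMemoryDevice();
            default:
                throw new IllegalArgumentException("no device registered for: " + deviceName);
        }
    }
}

class DeviceDemo {
    public static void main(String[] args) {
        CommDevice dev = DeviceFactory.newInstance("smpdev");
        dev.send(1, new byte[] {7});
        System.out.println(dev.name() + " delivered " + dev.recv(1)[0]);
    }
}
```

Because callers depend only on the interface, adding a socket-based or native device would mean adding one class and one `case`, without touching the code above it.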

MPJ design

Implementation. Point-to-point communications. Collective communications. Groups, communicators, and contexts. Derived datatypes: Vector, Indexed, Contiguous, and Struct. Explicit packing and unpacking. Process topologies: Cartesian and Graph. Possible to cut through to the native MPI implementation. As of today, three methods (Dims_create, Cancel, and Wtick) are left unimplemented.

Presentation outline: Introduction; Design and implementation; The runtime infrastructure; Implementation issues; Conclusion.

The runtime infrastructure. All MPI libraries face the task of bootstrapping MPI processes over networked computers. RSH/SSH based scripts are the most common approach. The LAM/MPI daemons and runtime system work on UNIX based OSes; there is no version of LAM for Windows. MPICH has recently introduced SMPD (Super Multi Purpose Daemon): according to the docs, it works on Linux and Windows, but it is difficult (if not impossible) to interface with Java.

Runtime: MPJDaemon and MPJStarter modules. The runtime consists of two modules: the daemon that runs on compute nodes (MPJDaemon), and the starter module that runs on head nodes (MPJStarter). Installing MPJDaemon on compute nodes: RSH/SSH based scripts can easily install the daemon on UNIX based OSes, where it could be installed as a service (/etc/init.d). Two files are required to install it as a service on Windows.

Runtime: MPJDaemon on UNIX based OSes. $MPJ_HOME/bin/mpjdaemon is an rc shell script that starts and stops the daemon. Installation as an app: ‘cd $MPJ_HOME/bin’, then ‘./mpjdaemon start’. An RSH/SSH script could be used to install it on a whole UNIX cluster. Installation as a service: ‘cp $MPJ_HOME/bin/mpjdaemon /etc/init.d’. Adding it to the default runlevel: ‘rc-update add mpjdaemon default’ (Gentoo Linux). Controlling the service: ‘/etc/init.d/mpjdaemon start/stop/status’.

Runtime: MPJDaemon on Windows. ‘cd %MPJ_HOME%/bin’, then ‘InstallMPJDaemon-NT.bat’. This bat file installs the daemon as a service.

Runtime: MPJDaemon as services. Apache Commons Daemon: the source bundle does not even compile, and the project is no longer active. Spent a week trying to make it work on Windows, then gave up. Java Service Wrapper: simple and does what it says. Support is available for almost all platforms where you can run Java. Distributed under the MIT License: redistribute without any restrictions.

Runtime: JMX M&M. Claims to provide monitoring and management of Java apps. Start the Java app with the following switch: -Dcom.sun.management.jmxremote. Run ‘jconsole’: it is possible to connect to remote and local JVMs. Useful if the application is an MBean: application attributes can then be read and set remotely. Possibility: MPJDaemon could be operated remotely.

JMX M&M: Connection GUI

JMX M&M: Connection summary

JMX M&M: JVM memory

JMX M&M: JVM threads

JMX M&M: JVM info.

Runtime: Dynamic class loading (1). The application (parallel program) and the MPJ library are dynamically loaded into the daemon JVM: there is no need to copy jar files, and no shared file system is assumed. MPJStarter starts a lightweight HTTP server (Jetty), which serves the jar file containing the parallel program.

Runtime: Dynamic class loading (2). For example, ‘HiMPJ.java’ is a parallel program that requires mpj.jar to compile and run. Bundle it into a jar file, specifying a manifest file whose Class-Path attribute points to mpj.jar. Write the manifest file with these three lines: ‘Manifest-Version: 1.0’, ‘Main-Class: HiMPJ’, ‘Class-Path: mpj.jar’. Build the jar: ‘jar -cfm himpj.jar manifest HiMPJ.class’. Copy it to the $MPJ_HOME/lib directory. Executing MPJStarter: ‘cd $MPJ_HOME/bin’, then ‘starter.[sh/bat] 2 himpj.jar ../lib -xdev niodev’. JarClassLoader will load himpj.jar and mpj.jar into the daemon's JVM.
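A loader needs two facts from the manifest described on this slide: which class to start, and which other jars to put on the class path. The sketch below shows how those attributes can be read with the standard `java.util.jar.Manifest` API. It is not MPJ's JarClassLoader; the class `ManifestSketch` and its method are hypothetical, and the manifest text is passed as a string so the example runs without a jar file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical demo class; not MPJ's JarClassLoader.
class ManifestSketch {

    // Returns { Main-Class, Class-Path } read from the manifest's main section.
    static String[] mainClassAndClassPath(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            Attributes main = manifest.getMainAttributes();
            return new String[] {
                main.getValue(Attributes.Name.MAIN_CLASS),
                main.getValue(Attributes.Name.CLASS_PATH)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The three manifest lines from the slide; each line must end with a newline.
        String text = "Manifest-Version: 1.0\n"
                    + "Main-Class: HiMPJ\n"
                    + "Class-Path: mpj.jar\n";
        String[] values = mainClassAndClassPath(text);
        System.out.println("start class: " + values[0]);
        System.out.println("extra jars:  " + values[1]);
    }
}
```

For a jar served over HTTP, one plausible next step would be to hand the jar's URL to a `java.net.URLClassLoader` and look up the class named by `Main-Class`; whether MPJ does exactly this is not stated in the slides.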

Presentation outline: Introduction; Design and implementation; The runtime infrastructure; Implementation issues; Conclusion.

Issue 1: Shared memory device. Based on the Java Threads API: each thread is an MPI process and communicates with other threads by sending messages. All threads run in the same JVM, so the parallel program cannot have static variables, and static variables within the MPJ library require synchronized access.
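The need for synchronized access follows from every rank being a thread in one JVM: a static variable is a single shared memory location. A minimal sketch, with the hypothetical class `SmpStaticSketch` standing in for library code that keeps static state:

```java
// Hypothetical demo class; illustrates why shared static state needs locking
// when each "MPI process" is a thread in the same JVM.
class SmpStaticSketch {
    private static int messagesSent = 0;

    // All threads share one copy of messagesSent, so updates are serialized.
    private static synchronized void recordSend() {
        messagesSent++;
    }

    private static synchronized int sentSoFar() {
        return messagesSent;
    }

    private static synchronized void reset() {
        messagesSent = 0;
    }

    // Starts one thread per rank; each rank records 1000 sends.
    static int runRanks(int ranks) {
        reset();
        Thread[] threads = new Thread[ranks];
        for (int r = 0; r < ranks; r++) {
            threads[r] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    recordSend();
                }
            }, "rank-" + r);
            threads[r].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sentSoFar();
    }

    public static void main(String[] args) {
        System.out.println(runRanks(4)); // 4 ranks x 1000 sends each
    }
}
```

Without the `synchronized` keyword on `recordSend`, concurrent increments could be lost and the total would be unpredictable.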

Issue 2: Synchronization problems with threads in smpdev. Each MPJDaemon is assigned a number of processes to execute; in the case of smpdev, all processes run on the same machine. MPJDaemon loads the parallel program with ‘JarClassLoader.loadClass(parallelProgramName)’. Once loaded, the program is started with ‘JarClassLoader.invokeClass(pClass, args)’.

Issue 2: Synchronization problems with threads in smpdev. For example, MPJStarter requests MPJDaemons to start 2 processes (threads). MPJDaemon starts two threads, each of which first loads and then starts the program. Processes (threads) started in this way do not share static variables and cannot synchronize. In order to share static variables and synchronize on them, the class should be loaded just once and executed N times. It was implemented this way because niodev requires the exact opposite behaviour: no sharing of static variables. Currently, the user specifies which device should be used: in the case of niodev, the loading is done twice; in the case of smpdev, the loading is done only once.
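The ‘load once, execute N times’ behaviour wanted for smpdev can be sketched with plain reflection. This does not use MPJ's JarClassLoader: `HiProgram` is a hypothetical stand-in for the user's parallel program, and `LoadOnceSketch` plays the role of the daemon. The class is resolved a single time, then its `main` method is invoked from N threads, so all of them see the same static state.

```java
import java.lang.reflect.Method;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a user's parallel program.
class HiProgram {
    static final AtomicInteger started = new AtomicInteger();

    public static void main(String[] args) {
        started.incrementAndGet(); // shared by every thread that runs main
    }
}

// Hypothetical daemon-side logic: load the class once, run it N times.
class LoadOnceSketch {

    static int loadOnceRunN(int n) {
        try {
            HiProgram.started.set(0);

            // Single load: every thread below reuses this Class object.
            Class<?> program = Class.forName(HiProgram.class.getName());
            Method main = program.getMethod("main", String[].class);

            Thread[] threads = new Thread[n];
            for (int i = 0; i < n; i++) {
                threads[i] = new Thread(() -> {
                    try {
                        main.invoke(null, (Object) new String[0]);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException(e);
                    }
                });
                threads[i].start();
            }
            for (Thread t : threads) {
                t.join();
            }
            return HiProgram.started.get();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadOnceRunN(2));
    }
}
```

Loading the class through a separate class loader per thread would give each thread its own copy of `started`, which matches the isolation the slide says niodev requires.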

Issue 3: Cygwin. If running MPJ on Cygwin: ‘chmod o+w $MPJ_HOME/logs’ and ‘chmod a+x $MPJ_HOME/lib/*.dll’. Open question: is MPJDaemon a Windows service, or a Linux service on Cygwin?

(Future) Issue 4: Specifying multiple devices. Currently, only one device can be specified: either niodev or smpdev is selected as the primary comms device. For SMP clusters, it would be ideal to use smpdev within an SMP node and niodev/gmdev for inter-node comms.

(Future) Issue 5: Starting MPJ with the native MPI device. The mpiJava/native MPI device uses ‘mpirun’ to bootstrap MPI processes. To bring it in line with the other devices, the native MPI device will have to be started by the MPJ runtime infrastructure.

Issue 6: Multiple users running MPJDaemons at the same time. Install the daemons as an app, and agree on the port numbers.

Presentation outline: Introduction; Design and implementation; The runtime infrastructure; Implementation issues; Conclusion.

Summary. The key issue for Java messaging is not the debate between the pure Java and JNI approaches, but providing a flexible mechanism to swap various comms protocols. MPJ has a pluggable architecture: we are implementing ‘niodev’, ‘gmdev’, ‘smpdev’, and the native MPI device. The MPJ runtime infrastructure allows bootstrapping MPI processes across various platforms. MPJDaemons can be installed as a native OS service.

Conclusions. We are slowly but surely moving towards the first release of MPJ, the next generation of ‘MPI for Java’. Current status: unit testing. MPJ follows the same API as mpiJava, so parallel applications built on top of mpiJava will work with MPJ. There are some differences in the API: Bsend, and explicit packing/unpacking (see the release docs for more details). Arguably, this is the first MPI library for Java that implements the actual messaging in pure Java.

Questions?