
1 IPDPS 2009: Dynamic High-level Scripting in Parallel Applications. Filippo Gioachin, Laxmikant V. Kalé. Department of Computer Science, University of Illinois at Urbana-Champaign.

2 Outline
● Overview
  – Motivations
  – Charm++ RTS
● Scripting interface
  – Execution flow
  – Cross communication
● Case studies
  – CharmDebug (parallel debugger)
  – Salsa (particle analysis tool)
● Future work

3 Motivations
● Need for extra functionality at runtime
  – Steering the computation
  – Analyzing data
  – Adding correctness checks while debugging
● Long-running applications
  – Recompiling the code is time consuming (if the source is available at all)
  – Need to wait for the application to re-execute
● Useful to upload scripts that perform procedures not foreseen at compile time

4 Execution flow
(Diagram: after registration, the client sends Execute to the server and receives an ID; Print(ID) returns the script's output, e.g. "Hello World"; IsFinished?(ID) returns Yes/No.)
Execute can:
● Create a new Python interpreter
● Wait for termination of the script before returning the ID
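The flow above can be simulated end-to-end in a few lines. This is a minimal sketch with a hypothetical MockServer standing in for the real CCS server; the class and method names are illustrative, not the actual API.

```python
# Simulated Execute / IsFinished / Print protocol from the diagram.
# MockServer is a stand-in: the real server runs the script in a Python
# interpreter embedded in the parallel application.
import itertools

class MockServer:
    def __init__(self):
        self._ids = itertools.count(1)   # interpreter ID generator
        self._output = {}                # buffered print output per ID
        self._done = {}                  # completion flag per ID

    def execute(self, script):
        # Returns an interpreter ID; here the "script" completes at once
        # and we pretend its output is the script text itself.
        handle = next(self._ids)
        self._output[handle] = script
        self._done[handle] = True
        return handle

    def is_finished(self, handle):
        return self._done.get(handle, False)

    def print_output(self, handle):
        # Hands back (and clears) the buffered output for this ID.
        return self._output.pop(handle, "")

# Client-side sequence matching the diagram:
server = MockServer()
handle = server.execute("Hello World")   # Execute -> ID
while not server.is_finished(handle):    # IsFinished?(ID) -> Yes/No
    pass
print(server.print_output(handle))       # Print(ID) -> "Hello World"
```

The key design point the diagram makes is that Execute returns an ID immediately, so the client can poll or fetch output later instead of blocking.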

5 Charm++ Overview
● Middleware written in C++
● The user decomposes work among objects (chares)
● The system maps chares to processors:
  – automatic load balancing
  – communication optimizations
● Chares communicate through asynchronous messages
(Figure: system view vs. user view of the chare-to-processor mapping.)

6 Charm++ RTS
(Diagram: Python modules hosted inside the Charm++ RTS, connected to an external client through the Converse client-server (CCS) layer.)

7 Interface overhead
● Single Python interpreter:
  – Creation of one Python interpreter: 40~50 ms
  – Other overhead: 1~2 ms
  – Independent of the number of processors
● Multiple Python interpreters (results shown as a chart)
Machine: Abe, NCSA Linux cluster (dual-socket quad-core Intel Clovertown, 2.33 GHz)

8 Registration on the server
● In the code, in the Charm Interface (.ci) file:

    module MyModule {
      array [1D] [python] MyArray {
        entry MyArray();
        .....
      }
    }

  This declares a chare collection type called "MyArray", indexable by a 1-dimensional index.
● At runtime:

    arrayProxy = CProxy_MyArray::ckNew(elem);
    arrayProxy.registerPython("pyCode");

  This creates a new chare collection of type MyArray; CCS requests with tag "pyCode" are then treated as Python requests bound to the just-created collection.

9 Usage on the client. Java snippet from CharmDebug:

    PythonExecute code = new PythonExecute(input.getText(), input.getMethod(),
        new PythonIteratorGroup(input.getChareGroup()), false, true, 0);
    code.setKeepPrint(true);
    code.setWait(true);
    code.setPersistent(true);
    if (interpreterHandle > 0) code.setInterpreter(interpreterHandle);
    byte[] reply = server.sendCcsRequestBytes("CpdPythonGroup", code.pack(), 0, true);
    if (reply.length == 0) {
        System.out.println("The python module was not linked in the application");
        return;
    }
    interpreterHandle = CcsServer.readInt(reply, 0);

    PythonFinished finished = new PythonFinished(interpreterHandle, true);
    byte[] finishReply = server.sendCcsRequestBytes("CpdPythonGroup", finished.pack(), 0, true);

    PythonPrint print = new PythonPrint(interpreterHandle, false);
    byte[] output = server.sendCcsRequestBytes("CpdPythonGroup", print.pack(), 0, true);
    System.out.println("Python printed: " + new String(output));

10 Communication (1): low-level
● Always available to Python scripts:
  – ck.mype()
  – ck.numpes()
  – ck.myindex()
● Additionally implementable in the server code:
  – ck.read(what)
  – ck.write(what, where)
  C++ signatures on the server side:

    PyObject* read(PyObject*);
    void write(PyObject*, PyObject*);
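From the script's point of view, these primitives look like ordinary calls on a ck module. A minimal sketch of an uploaded script, with a hypothetical CkStub standing in for the module the runtime injects; the ("temperature", i) key format is only illustrative of a server-defined read/write protocol.

```python
# CkStub is an illustrative stand-in for the runtime-provided "ck" module.
class CkStub:
    def __init__(self, pe, npes, data):
        self._pe, self._npes, self._data = pe, npes, dict(data)
    def mype(self):            # index of the processor running the script
        return self._pe
    def numpes(self):          # total number of processors
        return self._npes
    def read(self, what):      # server-implemented read hook
        return self._data[what]
    def write(self, what, value):  # server-implemented write hook
        self._data[what] = value

ck = CkStub(pe=0, npes=4, data={("temperature", 0): 1.5})

# The uploaded script itself: run an update only on processor 0.
if ck.mype() == 0:
    t = ck.read(("temperature", 0))
    ck.write(("temperature", 0), t * 2)
print(ck.read(("temperature", 0)))  # -> 3.0
```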

11 Communication (2): high-level
● Allows other functions to be called from Python
  – Accessible through the "charm" module
● These functions can suspend and perform parallel operations
Method definition (.ci):

    module MyModule {
      array [1D] [python] MyArray {
        entry MyArray();
        entry [python] void run(int);
        entry [python] void jump(int);
        .....
      }
    }

Method implementation (.C):

    void run(int handle) {
      PyObject *args = pythonGetArg(handle);
      /* use args with Python/C API */
      thisProxy.performAction(...parameters..., CkCallbackPython(msg));
      int *value = (int*)msg->getData();
      pythonReturn(handle, Py_BuildValue("i", *value));
    }
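On the script side, a high-level entry method appears as an ordinary blocking call on the charm module. A minimal sketch with a hypothetical stub; in the real system the call suspends the script while the chares perform the parallel operation and resumes it when pythonReturn delivers the value.

```python
# CharmStub is an illustrative stand-in for the "charm" module exposed
# to uploaded scripts; run() mimics a [python] entry method.
class CharmStub:
    def run(self, n):
        # Stand-in for the parallel work MyArray::run would trigger;
        # here we just pretend a parallel reduction doubled the input.
        return n * 2

charm = CharmStub()
result = charm.run(21)  # blocks until the entry method returns a value
print(result)           # -> 42
```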

12 High-level call overhead
● ~10 μs/call
● This timing depends on the Python/C API
Measured on Abe, 1 processor (to avoid interference)

13 Communication (3): iterative
● Example: double the mass of all particles with high velocity
● Server-side hooks:
  – int buildIterator(PyObject*& data, void* iter);
  – int nextIteratorUpdate(PyObject*& data, PyObject* result, void* iter);
Using low-level communication:

    size = ck.read(("numparticles", 0))
    for i in range(0, size):
        vel = ck.read(("velocity", i))
        mass = ck.read(("mass", i))
        if vel > 1:
            ck.write(("mass", i), mass * 2)

Using iterative communication:

    def increase(p):
        if p.velocity > 1:
            p.mass = p.mass * 2
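Both variants have the same effect. A runnable sketch of the iterative semantics, where a plain driver loop stands in for the server-side buildIterator / nextIteratorUpdate machinery and the Particle class is illustrative:

```python
# Illustrative element type; in the real application this would be the
# server's particle data exposed to the script one element at a time.
class Particle:
    def __init__(self, velocity, mass):
        self.velocity, self.mass = velocity, mass

def increase(p):            # the uploaded script, as on the slide
    if p.velocity > 1:
        p.mass = p.mass * 2

particles = [Particle(0.5, 1.0), Particle(2.0, 1.0)]
for p in particles:         # what the iterator interface does per element
    increase(p)
print([p.mass for p in particles])  # -> [1.0, 2.0]
```

The design point is that the script touches one element at a time, so the runtime, not the script, decides how to traverse the distributed data.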

14 Iterative interface overhead
● Each element performs work for 10 μs
● The overhead to iterate over the elements is zero

15 CharmDebug: overview
(Diagram: the CharmDebug Java GUI on the local machine connects across a firewall, via CCS (Converse Client-Server), to the parallel application on the remote machine; the CharmDebug component runs alongside the application, with GDB available underneath.)

16 CharmDebug: introspection

17 Salsa: cosmological data analysis
● Write your own piece of Python script, or
● Use a graphical tool that internally issues Python scripts to perform the task

18 5-point 2D Jacobi
● Matrix size: size × size
● Running on 32 processors
● Python interface used through CharmDebug
● Execution time in ms
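The benchmark above runs a 5-point 2D Jacobi solver written in Charm++. As a reference for what one sweep computes, here is the same stencil in plain, sequential Python; this is illustrative only, not the benchmarked parallel code.

```python
def jacobi_sweep(grid):
    """One 5-point Jacobi sweep: each interior cell becomes the average
    of itself and its four neighbors; boundary cells are left fixed."""
    n, m = len(grid), len(grid[0])
    new = [row[:] for row in grid]          # copy so reads see old values
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 5.0
    return new

grid = [[0.0, 1.0, 0.0],
        [1.0, 0.0, 1.0],
        [0.0, 1.0, 0.0]]
print(jacobi_sweep(grid)[1][1])  # -> 0.8
```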

19 Conclusions
● Interface to dynamically upload Python scripts into a running parallel application
● The scripts can interact with the application:
  – Low-level
  – High-level (initiating parallel operations)
  – Iterative mode
● The overhead of the interface is minimal:
  – Negligible for human interactivity (Salsa)
  – Low enough to allow non-human interactivity (CharmDebug)

20 Future work
● Handle errors more effectively
  – Currently mostly left to the programmer
● Define a good atomicity of operations
  – Checkpoint/rollback for automatic recovery
● Extend to MPI and other languages built on top of Charm++
  – When any MPI routine is called, or
  – When a specific routine is called (e.g. MPI_Python)
● Export into the MPI standard
  – Add the capability for MPI to receive external messages
  – Execute Python scripts upon their arrival

21 Questions?
Thank you.
http://charm.cs.uiuc.edu/

