
1 This work was performed under the auspices of the U.S. Department of Energy by University of California, Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.
Steering Massively Parallel Applications Under Python
Patrick Miller
Lawrence Livermore National Laboratory
patmiller@llnl.gov

2 Characterization of Scientific Simulation Codes
Big!
  – Many options
  – Many developers
Long lived
  – Years to develop
  – 20+ year life
Used as a tool for understanding
  – Used in unanticipated ways

3 Steering
Steering uses an interpreted language as the interface to “compiled assets.”
  – E.g. Mathematica
Rich command structure lets users “program” within the simulation paradigm
Exploit interactivity

4 Why Steering
Expect the unexpected
Let end-users participate in solutions to their problems
Users can set, calculate with, and control low-level details
Developers can test and debug at a higher level
Users can write their own instrumentation or even full-blown physics packages
It works!
  – Mathematica, SCIrun, BASIS, IDL, Tecolote (LANL), Perl, Python!!!!

5 Why Python?
Objects everywhere
Loose enough to be practical
Small enough to be safe
  – The tiny core could be maintained by one person
Open source for the core AND many useful bits & pieces already exist

6 The big picture
C++ and FORTRAN compiled assets for speed
Python for flexibility
Small, nimble objects are better than monolithic objects
No "main" - The control loop(s) are in Python
  – Python co-ordinates the actions of C++ objects
  – Python is the integration “glue”
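
The "no main" idea can be pictured with a short sketch. Everything here is hypothetical: `Hydro`, `Diagnostics`, and `run` are stand-in names written in pure Python (Python 3) for illustration, not Kull's actual classes. In the real code the `advance` call would drop into compiled C++.

```python
class Hydro:
    """Hypothetical stand-in for a compiled physics package."""

    def __init__(self):
        self.time = 0.0

    def advance(self, dt):
        # In a steered code this method would be implemented in C++.
        self.time += dt


class Diagnostics:
    """Hypothetical user-written instrumentation, living entirely in Python."""

    def __init__(self):
        self.samples = []

    def record(self, package):
        self.samples.append(package.time)


def run(packages, diagnostics, dt, cycles):
    """The control loop is ordinary Python; there is no compiled main()."""
    for _ in range(cycles):
        for package in packages:
            package.advance(dt)
        diagnostics.record(packages[0])
    return diagnostics.samples


samples = run([Hydro()], Diagnostics(), dt=0.25, cycles=4)
print(samples)
```

Because the loop is plain Python, a user can insert instrumentation or reorder packages without recompiling anything.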

7 Putting it together - Pyffle
The biggest issue is always getting C++ to "talk with" Python (wrapping)
SWIG, PyFORT, CXX Python, Boost Python, PYFFLE
Our big requirements are...
  – support for inheritance
  – support for Python-like interfaces
  – tolerance of templating<>

8 The shadow layer
Pyffle generates constructor functions - NOT Python classes
We build extension classes in Python that "shadow" the C++ object to
  – Allow Python-style inheritance
  – Add functionality
  – Enforce consistency
  – Modify interface
  – ...
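
A minimal sketch of the shadowing idea, assuming a wrapper that exposes only a constructor function. The names `make_mesh`, `_RawMesh`, `Mesh`, and `RefinedMesh` are hypothetical and written in pure Python 3; they are not Pyffle output or Kull classes.

```python
class _RawMesh:
    """Stand-in for the object a generated constructor function returns."""

    def __init__(self, nx, ny):
        self.nx = nx
        self.ny = ny

    def zone_count(self):
        return self.nx * self.ny


def make_mesh(nx, ny):
    """Hypothetical constructor function: a factory, not a Python class."""
    return _RawMesh(nx, ny)


class Mesh:
    """Python class that shadows the wrapped object."""

    def __init__(self, nx, ny):
        # Enforce consistency before touching the wrapped object.
        if nx <= 0 or ny <= 0:
            raise ValueError("mesh dimensions must be positive")
        self._shadowed = make_mesh(nx, ny)

    def __getattr__(self, name):
        # Only called when normal lookup fails: forward to the wrapped object.
        if name == "_shadowed":
            raise AttributeError(name)
        return getattr(self._shadowed, name)

    def __len__(self):
        # Modify the interface: give the object a Python-like len().
        return self._shadowed.zone_count()


class RefinedMesh(Mesh):
    """Python-style inheritance now works, because Mesh is a real class."""

    def __init__(self, nx, ny, factor):
        Mesh.__init__(self, nx * factor, ny * factor)
        self.factor = factor


print(len(Mesh(3, 4)), len(RefinedMesh(2, 2, 2)))
```

Delegating through `__getattr__` is one design choice among several; explicit forwarding methods would be more verbose but easier to document.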

9 Parallelism
Kull uses SPMD-style computation with an MPI-enabled Python co-ordinating
  – Most parallelism is directly controlled "under the covers" by C++ (MPI and/or threads)
  – Started with a version with limited MPI support (barrier, allreduce)
  – New pyMPI supports a majority of MPI standard calls (comm, bcast, reduces, sync & async send/recv,...)

10 Basic MPI in Python
mpi.bcast(value)
mpi.reduce(item,operation, root?)
mpi.barrier()
mpi.rank and mpi.nprocs
mpi.send(value,destination, tag?)
mpi.recv(sender,tag?)
mpi.sendrecv(msg, dest, src?, tag?, rtag?)

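11 MPI calcPi
import mpi

def f(a):
    return 4.0/(1.0 + a*a)

def integrate(rectangles,function):
    # Every rank uses the rectangle count broadcast from the root
    n = mpi.bcast(rectangles)
    h = 1.0/n
    sum = 0.0
    # Each rank takes every mpi.procs-th rectangle, sampled at its midpoint
    for i in range(mpi.rank+1, n+1, mpi.procs):
        sum = sum + function(h * (i-0.5))
    myAnswer = h * sum
    # Combine the per-rank partial sums
    answer = mpi.reduce(myAnswer, mpi.SUM)
    return answer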
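
The round-robin decomposition behind calcPi can be checked without an MPI installation. The sketch below is an assumption-laden stand-in in plain Python 3: `partial_sum` and `simulated_integrate` are hypothetical names, the rank and process count are passed as ordinary arguments, and a serial `sum` plays the role of the reduction. Each rectangle is sampled at its midpoint, h * (i - 0.5).

```python
import math


def f(a):
    return 4.0 / (1.0 + a * a)


def partial_sum(rank, nprocs, n, function):
    """Work one rank would do: rectangles rank+1, rank+1+nprocs, ..."""
    h = 1.0 / n
    total = 0.0
    for i in range(rank + 1, n + 1, nprocs):
        total += function(h * (i - 0.5))  # midpoint of rectangle i
    return h * total


def simulated_integrate(n, nprocs, function):
    """Serial stand-in for the reduce step: add up every rank's share."""
    return sum(partial_sum(rank, nprocs, n, function) for rank in range(nprocs))


estimate = simulated_integrate(1000, 4, f)
print(estimate, abs(estimate - math.pi))
```

Striding by the process count gives each rank an interleaved, nearly equal share of the rectangles, and every rectangle is visited exactly once.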

12 Making it fast
Where we have very generic base classes in the code (e.g. Equation-of-state), we have written Pythonized descendant classes that invoke arbitrary user-written Python functions
C++ component doesn't need to know it's invoking Python
There is a speed issue :-(
But there is hope!
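
A rough illustration of a "Pythonized" descendant class, in pure Python 3. All names here (`EquationOfState`, `PythonEquationOfState`, `total_pressure`, `ideal_gas`) are hypothetical; in the real code the base class and the calling component would be C++, which is where the per-call speed cost comes from.

```python
class EquationOfState:
    """Stand-in for a generic compiled base class."""

    def pressure(self, density, energy):
        raise NotImplementedError


class PythonEquationOfState(EquationOfState):
    """Descendant that forwards each call to a user-written function."""

    def __init__(self, function):
        self._function = function

    def pressure(self, density, energy):
        return self._function(density, energy)


def total_pressure(eos, zones):
    """Stand-in for a compiled component: it sees only the base interface."""
    return sum(eos.pressure(density, energy) for density, energy in zones)


def ideal_gas(density, energy, gamma=1.5):
    """User-supplied physics, written at the prompt or in an input deck."""
    return (gamma - 1.0) * density * energy


eos = PythonEquationOfState(ideal_gas)
print(total_pressure(eos, [(2.0, 4.0), (1.0, 2.0)]))
```

The component never learns that `pressure` is implemented in Python, but it pays an interpreter call for every zone.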

13 PyCOD - Compile on demand!
Builds accelerated Python functions AND C++ functions
Input is a Python function object and a type signature to compile to
Dynamically loads accelerated versions
Caches compiled code (matched against bytecode) to eliminate redundant compiles
Early results are encouraging (20X to 50X speedup)
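
One way to picture a cache "matched against bytecode" is sketched below in Python 3. This is not PyCOD's implementation: `CompileCache` is a hypothetical name, the cache key (bytecode, constants, names, signature) is an assumption, and the "compiler" passed in is a placeholder that returns the function unchanged.

```python
class CompileCache:
    """Caches compiled results keyed on a function's bytecode and signature."""

    def __init__(self, compile_function):
        self._compile = compile_function
        self._cache = {}
        self.compile_count = 0

    def accelerate(self, function, signature):
        code = function.__code__
        # Assumed key: two functions with the same bytecode, constants,
        # and referenced names compile identically for a given signature.
        key = (code.co_code, code.co_consts, code.co_names, tuple(signature))
        if key not in self._cache:
            self.compile_count += 1
            self._cache[key] = self._compile(function, signature)
        return self._cache[key]


def add_one(x):
    return x + 1


def also_add_one(x):
    return x + 1


# Placeholder "compiler": a real one would generate and load native code.
cache = CompileCache(lambda function, signature: function)
cache.accelerate(add_one, (int, int))
cache.accelerate(add_one, (int, int))       # cache hit, no recompile
cache.accelerate(also_add_one, (int, int))  # same bytecode, also a hit
cache.accelerate(add_one, (float, float))   # new signature, compiles again
print(cache.compile_count)
```

Keying on bytecode rather than on the function object means a redefined but unchanged function does not trigger a recompile.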

14 PyCOD example
import compiler
from types import *

def sumRange(first,last):
    sum = 0
    for i in xrange(first,last+1):
        sum = sum + i
    return sum

# Type signature to compile to
signature = (IntType,IntType,IntType)
compiledSumRange, rawFunctionPointer = \
    compiler.accelerate(sumRange,signature)
print "Compiled result:", compiledSumRange(1,100)

15 Summary
MPI-enabled Python is a reality
  – We have run long-running simulations on 1500 processors, running fully under a Python core
PYFFLE provides a reasonable conduit for developers to build capability in C++ and easily integrate that capability into Python

