Published by Leo Merritt. Modified over 9 years ago.
1
Exploring Parallelism with Python
Joseph Pantoga, Jon Simington
2
Issues Between Python and C
- Python is inherently slower than C, especially when using libraries that take advantage of Python's relationship with C / C++ code
- Thanks to the interpreter and the dynamic typing scheme
- Python 3 can be comparable to C in some respects, but is still slower in the average case (we use Python 2.7.10)
- Python too popular? So many developers with so many ideas lead to many incomplete projects, but leave plenty of room for contribution
- Python's Global Interpreter Lock (GIL) prevents more than one thread from executing Python bytecode at a time
3
The Global Interpreter Lock
- A lock enforced by the Python interpreter so that threads cannot concurrently touch interpreter memory that is not thread-safe
- Limits how much parallelism multiple threads can achieve, even though they run concurrently
- Yields very little speedup, if any, on a multiprocessor machine
4
The Global Interpreter Lock

    from threading import Thread

    def countdown(n):
        while n > 0:
            n -= 1

    count = 100000000

    # Sequential: one call does all the work
    countdown(count)

    # 2 threads: each thread counts down half of the total
    t1 = Thread(target=countdown, args=(count//2,))
    t2 = Thread(target=countdown, args=(count//2,))
    t1.start(); t2.start()
    t1.join(); t2.join()

    # 4 threads: each thread counts down a quarter of the total
    t1 = Thread(target=countdown, args=(count//4,))
    t2 = Thread(target=countdown, args=(count//4,))
    t3 = Thread(target=countdown, args=(count//4,))
    t4 = Thread(target=countdown, args=(count//4,))
    t1.start(); t2.start(); t3.start(); t4.start()
    t1.join(); t2.join(); t3.join(); t4.join()

    Sequential    2 Threads    4 Threads
    7.8s          15.4s        15.7s

- The GIL ruins everything!
- Thread-based parallelism is often not worth it with Python

*Test completed on a 3.1GHz x4 machine with Python 2.7.10
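To reproduce the comparison on another machine, the countdown test can be wrapped in a small timing harness. This is a minimal sketch, not the authors' original script: the helper names `time_threads` and `run_benchmark` and the reduced default iteration count are assumptions made for illustration, and the timings will differ by interpreter version and hardware.

```python
import time
from threading import Thread


def countdown(n):
    """CPU-bound loop of pure Python bytecode, so the GIL is needed throughout."""
    while n > 0:
        n -= 1


def time_threads(total, num_threads):
    """Split `total` iterations evenly over `num_threads` threads; return elapsed seconds."""
    threads = [Thread(target=countdown, args=(total // num_threads,))
               for _ in range(num_threads)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


def run_benchmark(total=2000000):
    """Time the sequential run, then the 2-thread and 4-thread runs (hypothetical helper)."""
    results = {}
    start = time.time()
    countdown(total)
    results["sequential"] = time.time() - start
    for num_threads in (2, 4):
        results["%d threads" % num_threads] = time_threads(total, num_threads)
    return results


if __name__ == "__main__":
    for label, seconds in sorted(run_benchmark().items()):
        print("%-12s %.3fs" % (label, seconds))
```

Raising `total` toward the 100000000 used on the slide makes the gap between the sequential and threaded runs easier to see.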
5
Getting around the GIL
- Make calls to outside libraries and circumvent the interpreter's rules entirely
- Python modules that call external C libraries have inherent latency
- BUT! In certain cases, Python + C MPI performance can be comparable to the native C libraries
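The first point can be illustrated without MPI. The sketch below uses the standard library's `hashlib`, whose documentation notes that the GIL is released while hashing buffers larger than 2047 bytes, so the threads can overlap inside the C code. This is an illustration chosen here, not an example from the talk; the helper names `digest_into` and `parallel_digests` and the 1 MiB payload size are assumptions.

```python
import hashlib
from threading import Thread


def digest_into(results, index, payload):
    """Hash one payload in C code and store the hex digest at its slot."""
    results[index] = hashlib.sha256(payload).hexdigest()


def parallel_digests(payloads):
    """Hash every payload in its own thread; results keep the input order."""
    results = [None] * len(payloads)
    threads = [Thread(target=digest_into, args=(results, i, p))
               for i, p in enumerate(payloads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Four 1 MiB buffers, each filled with a different byte value.
    payloads = [bytes(bytearray([i])) * (1024 * 1024) for i in range(4)]
    for digest in parallel_digests(payloads):
        print(digest)
```

Whether this yields a real speedup depends on the payload size and the number of cores; the point is only that the work happens outside the interpreter's bytecode loop.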
6
How does Python + C compare to C?
The following was tested on the Beowulf class cluster `Geronimo` at CIMEC with ten Intel P4 2.4GHz processors, each equipped with 1GB DDR 333MHz RAM, connected together on a 100Mbps Ethernet switch. The mpi4py library was compiled with MPICH 1.2.6.

    from mpi4py import mpi
    import numarray as na

    # Send buffer of 2**20 Float64 values (8 MiB)
    sbuff = na.array(shape=2**20, type=na.Float64)

    wt = mpi.Wtime()  # start the timer
    if mpi.even:
        # Even ranks send first, then receive from the next rank
        mpi.WORLD.Send(sbuff, mpi.rank + 1)
        rbuff = mpi.WORLD.Recv(mpi.rank + 1)
    else:
        # Odd ranks receive first, then send to the previous rank
        rbuff = mpi.WORLD.Recv(mpi.rank - 1)
        mpi.WORLD.Send(sbuff, mpi.rank - 1)
    wt = mpi.Wtime() - wt  # elapsed time for the exchange

    # Collect every rank's timing on rank 0 and print it there
    tp = mpi.WORLD.Gather(wt, root=0)
    if mpi.zero:
        print tp

http://www.cimec.org.ar/ojs/index.php/cmm/article/viewFile/8/11
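The listing above uses the early `from mpi4py import mpi` interface. For readers on a current mpi4py release, the same paired exchange might look roughly like the sketch below. It is an assumption-laden translation, not code from the cited paper: `partner_rank` and `run_exchange` are hypothetical helper names, a plain `bytearray` stands in for the numarray buffer, and the exchange is skipped for a rank that has no partner.

```python
try:
    from mpi4py import MPI  # assumption: a current mpi4py release is installed
except ImportError:
    MPI = None


def partner_rank(rank):
    """Even ranks pair with rank + 1, odd ranks with rank - 1."""
    return rank + 1 if rank % 2 == 0 else rank - 1


def run_exchange(nbytes=8 * 2 ** 20):
    """Exchange an nbytes buffer with the partner rank and gather the timings."""
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    partner = partner_rank(rank)
    sbuff = bytearray(nbytes)  # 8 MiB by default, same size as 2**20 Float64
    rbuff = bytearray(nbytes)

    start = MPI.Wtime()
    if partner < size:  # the last rank has no partner when size is odd
        if rank % 2 == 0:
            comm.Send([sbuff, MPI.BYTE], dest=partner)
            comm.Recv([rbuff, MPI.BYTE], source=partner)
        else:
            comm.Recv([rbuff, MPI.BYTE], source=partner)
            comm.Send([sbuff, MPI.BYTE], dest=partner)
    elapsed = MPI.Wtime() - start

    # Every rank contributes its timing; only rank 0 receives the list.
    times = comm.gather(elapsed, root=0)
    if rank == 0:
        print(times)
    return times


if __name__ == "__main__" and MPI is not None:
    run_exchange()
```

Such a script would typically be launched through the MPI launcher, for example `mpiexec -n 2 python pingpong.py`, where the file name is hypothetical.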
7
How does Python + C compare to C?
The rest of the graphs display time analysis from similar programs, with only the MPI instruction differing.
http://www.cimec.org.ar/ojs/index.php/cmm/article/viewFile/8/11
8
How does Python + C compare to C?
http://www.cimec.org.ar/ojs/index.php/cmm/article/viewFile/8/11
9
How does Python + C compare to C?
- For large data sets, Python performs very similarly to C
- Python has less bandwidth available, because mpi4py goes through a C MPI library to perform general networking calls
- But, in general, Python is slower than C
http://www.cimec.org.ar/ojs/index.php/cmm/article/viewFile/8/11
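To put the bandwidth remark in context, the payload of the test program and the best-case wire time on the cluster's 100Mbps switch can be worked out directly. The sketch below is plain arithmetic on the figures quoted earlier (2**20 Float64 values, 100Mbps link); the function names are hypothetical, and protocol overhead is ignored, so real transfers will take longer.

```python
ELEMENTS = 2 ** 20            # buffer length used in the test program
BYTES_PER_FLOAT64 = 8         # a Float64 value occupies 8 bytes
LINK_BITS_PER_SECOND = 100e6  # 100Mbps Ethernet switch


def payload_bytes(elements=ELEMENTS, itemsize=BYTES_PER_FLOAT64):
    """Size of one message in bytes."""
    return elements * itemsize


def min_transfer_seconds(num_bytes, link_bps=LINK_BITS_PER_SECOND):
    """Lower bound on the time to push num_bytes over the link."""
    return num_bytes * 8.0 / link_bps


def effective_bandwidth_mbps(num_bytes, seconds):
    """Bandwidth in megabits per second implied by a measured transfer time."""
    return num_bytes * 8.0 / seconds / 1e6


if __name__ == "__main__":
    size = payload_bytes()
    print("payload: %d bytes" % size)                               # 8388608 bytes (8 MiB)
    print("best case: %.3f s" % min_transfer_seconds(size))         # about 0.671 s
```

A measured time `t` for one message can then be converted with `effective_bandwidth_mbps(payload_bytes(), t)` and compared against the 100Mbps ceiling.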
10
Python’s Parallel Programming Libraries
- Message Passing Interface (MPI)
  - pyMPI
  - mpi4py - uses the C MPI library directly
  - Pypar
  - Scientific Python (MPIlib)
  - MYMPI
- Bulk Synchronous Parallel (BSP)
  - Scientific Python (BSPlib)
11
pyMPI
- Almost-full MPI instruction set
- Requires a modified Python interpreter, which allows for ‘interactive’ parallelism
- Not maintained since 2013
- The modified interpreter is the parallel application, so you have to recompile the interpreter whenever you want to do different tasks
12
Pydusa (formerly MYMPI)
- 33KB Python module, with no custom Python interpreter to maintain
- While the MPI Standard contains 120+ routines, MYMPI contains 35 “important” MPI routines
- Syntax is very similar to the Fortran and C MPI libraries
- Your Python code is the parallel application
13
pypar
- No modified interpreter needed!
- Still maintained on GitHub
- Few MPI interfaces are implemented
- Can’t handle topologies well and prefers simple data structures in parallel calculations
14
mpi4py
- Still being maintained on Bitbucket (updated 11/23/2015)
- Makes calls to external C MPI functions to avoid the GIL
- Attempts to borrow ideas from other popular modules and integrate them together
15
Scientific Python
- GREAT documentation -> Easy to use with their examples
- Supports both MPI and BSP
- Requires installation of both an MPI and a BSP library
16
Is Parallelism Fully Implemented?
- From our research so far, we have not found a publicly available Python package that implements the full MPI instruction set
- Not all popular languages have complete and extensive libraries for every task or use case!
17
Conclusion
- You CAN create parallel programs and applications with Python
- Doing so efficiently can require the compilation of a large custom Python interpreter
- Should they try to keep it in future versions, or even maintain the current implementations?
- From our research, it seems the community has done just about all it can to bring parallelism to Python, but some sacrifices have to be made, mainly a restriction on which data types can and can’t be supported
18
Conclusion Cont.
- Maybe Python isn’t the best language to implement parallel algorithms in, but there are many other languages besides C and Fortran which have interesting approaches to solving parallel problems
19
Julia
- Really good documentation for parallel tasks, with examples
- Able to send a task to n connected computers and asynchronously receive the results back, both upon request and automatically when the task completes
- Has pre-defined topology configurations for networks, like all-to-all and master-slave
- Allows for custom worker configurations to fit your specific topology
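The "send a task, get the result back on request or automatically on completion" pattern can be sketched in Python for comparison. This is only an analogy, not Julia code and not a claim about how Julia implements it: it uses the standard library's `concurrent.futures` (Python 3) with local threads standing in for connected computers, and `square` and `run_tasks` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Stand-in for a task that would be shipped to a worker."""
    return x * x


def run_tasks(values, max_workers=4):
    """Submit one task per value and collect results in two ways."""
    completed = []  # filled by callbacks as each task finishes
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(square, v) for v in values]
        for future in futures:
            # "Automatically when the task completes": a completion callback.
            future.add_done_callback(lambda f: completed.append(f.result()))
        # "Upon request": block on each future, in submission order.
        requested = [future.result() for future in futures]
    # Leaving the with-block waits for all workers, so every callback has run.
    return requested, sorted(completed)


if __name__ == "__main__":
    requested, completed = run_tasks([1, 2, 3, 4])
    print(requested)
    print(completed)
```

Because the workers here are threads in one process, CPU-bound tasks remain subject to the GIL; the sketch shows only the request/callback shape of the interaction.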
20
Go
- Fairly good documentation, along with an interactive interpreter on the site to learn the basics without installing anything
- Initial installation comes with all required libraries for parallel coding, so there are no extra libraries to search for or install
- Lightweight and easy to learn
- Can write several parallel programs using simple functions in Go
21
Questions?
22
Sources
http://www.researchgate.net/profile/Mario_Storti/publication/220380647_MPI_for_Python/links/00b495242ba3b30eb3000000.pdf
http://www.researchgate.net/profile/Leesa_Brieger/publication/221134069_MYMPI_-_MPI_programming_in_Python/links/0c960521cd051bc649000000.pdf
http://uni.getrik.com/wp-content/uploads/2010/04/pyMPI.pdf
http://www.researchgate.net/profile/Konrad_Hinsen/publication/220439974_High-Level_Parallel_Software_Development_with_Python_and_BSP/links/09e4150c048e4e7cd8000000.pdf
http://www.researchgate.net/profile/Ola_Skavhaug/publication/222545480_Using_B_SP_and_Python_to_simplify_parallel_programming/links/0fcfd507e6cac3eb63000000.pdf
http://downloads.hindawi.com/journals/sp/2005/619804.pdf
23
Sources
http://geco.mines.edu/workshop/aug2010/slides/fri/mympi.pdf
http://sourceforge.net/projects/pydusa/
http://docs.julialang.org/en/latest/manual/parallel-computing/
http://dirac.cnrs-orleans.fr/plone/software/scientificpython
http://dirac.cnrs-orleans.fr/ScientificPython/ScientificPythonManual/