PYTHON FOR HIGH PERFORMANCE COMPUTING
OUTLINE Compiling for performance; native ways to gain performance; generator examples.
IMPORTANCE OF COMPILED CODE When code (e.g. a for loop) is written in C, the compiler has the opportunity to optimize its use of memory and floating-point units. The Python interpreter, on the other hand, has far less opportunity to do so. The remedy: import a Python module whose implementation is in C or Fortran.
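A minimal sketch of the point above, assuming CPython, where the standard `math` module is implemented in C. The helper name `python_sum` and the input size are illustrative choices, not part of the original slides. The same sum is computed once by an interpreted loop and once by `math.fsum`, and both are timed.

```python
import math
import timeit


def python_sum(values):
    """Sum the values with an explicit loop run by the interpreter."""
    total = 0.0
    for v in values:
        total += v
    return total


data = [float(i) for i in range(100_000)]

# Same computation, two implementations: interpreted loop vs. C routine.
loop_result = python_sum(data)
c_result = math.fsum(data)

# Timings depend on the machine, so they are printed rather than asserted.
loop_time = timeit.timeit(lambda: python_sum(data), number=20)
c_time = timeit.timeit(lambda: math.fsum(data), number=20)
print(f"python loop: {loop_time:.4f} s   math.fsum: {c_time:.4f} s")
```

The two results agree; only the time spent differs, and the ratio will vary from one system to another.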
NUMPY EXAMPLE Using NumPy for mathematical operations on arrays and vectors can be hundreds of times faster than looping over the elements in pure Python. NumPy's routines are well tested, and integrating C or Fortran subroutines you have written yourself becomes easier. Tutorial:
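As an illustration, here is a small sketch that assumes NumPy is installed. The function names and the array size are made up for this example. It computes a sum of squares with a pure-Python loop and with `np.dot`, then prints both timings. The actual speedup depends on the machine and the operation.

```python
import timeit

import numpy as np

n = 100_000
xs = list(range(n))                    # plain Python list
arr = np.arange(n, dtype=np.float64)   # NumPy array holding the same values


def python_sum_of_squares(values):
    """Sum of squares with an interpreted loop."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def numpy_sum_of_squares(a):
    """Sum of squares as a dot product, evaluated in compiled code."""
    return float(np.dot(a, a))


py_result = python_sum_of_squares(xs)
np_result = numpy_sum_of_squares(arr)

# Timings are machine dependent, so they are only printed.
py_time = timeit.timeit(lambda: python_sum_of_squares(xs), number=20)
np_time = timeit.timeit(lambda: numpy_sum_of_squares(arr), number=20)
print(f"python loop: {py_time:.4f} s   numpy: {np_time:.4f} s")
```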
CYTHON An open-source project. Almost a Python compiler. An extended Python language for writing fast Python extension modules and for interfacing Python with C libraries. Credit: Behnel
NATIVE WAYS TO GAIN PERFORMANCE Python gives the programmer less control over memory management than C does. Use the del keyword to remove a variable that is no longer needed, so that the object it refers to can be reclaimed. Use generators: they greatly reduce memory usage and simplify programming when used in a particular design pattern.
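A rough sketch of both techniques, assuming CPython. The variable names and the size `n` are illustrative. `sys.getsizeof` reports only the size of the container object itself, which is enough to show that a generator does not grow with the number of items.

```python
import sys

n = 1_000_000

squares_list = [i * i for i in range(n)]   # all values stored at once
squares_gen = (i * i for i in range(n))    # values produced on demand

# Size of the list object vs. the generator object (elements not included).
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(f"list: {list_bytes} bytes   generator: {gen_bytes} bytes")

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)

# del removes the name; once no references remain, CPython frees the list.
del squares_list
```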
GENERATOR A generator is a function that produces a sequence of results instead of a single value. Calling a generator function creates a generator object. When the generator returns, iteration stops. A generator function is a more convenient way of writing an iterator. A generator is a one-time operation: unlike a list, it cannot be iterated twice, so the generator function must be called again for another iteration.
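The properties listed above can be seen in a short sketch. The function `countdown` is an illustrative example, not taken from the slides.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, then return, which ends the iteration."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)                # creates a generator object; the body has not run yet
first_pass = list(gen)            # consumes the generator
second_pass = list(gen)           # the same object is now exhausted
fresh_pass = list(countdown(3))   # calling the function again gives a new iteration
```

Here `first_pass` and `fresh_pass` both hold `[3, 2, 1]`, while `second_pass` is empty.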
GENERATOR EXPRESSIONS General syntax: (expression for i in s if condition). This is equivalent to a generator function whose body is: for i in s: if condition: yield expression. E.g. the generator version of a list comprehension.
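To make the equivalence concrete, here is a small sketch. The condition, the expression and the name `even_squares` are chosen only for illustration.

```python
def even_squares(s):
    """Generator function equivalent to the generator expression below."""
    for i in s:
        if i % 2 == 0:
            yield i * i


data = range(10)

from_expression = list(i * i for i in data if i % 2 == 0)   # generator expression
from_function = list(even_squares(data))                    # generator function
from_list_comp = [i * i for i in data if i % 2 == 0]        # list comprehension
```

All three produce the same values. The difference is that the list comprehension builds the whole list at once, while the generator forms yield one item at a time.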
EXAMPLE: PROCESSING DATA FILES Task: sum the last column of data in a log file. Two approaches: a non-generator solution and a generator solution.
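The code for the two solutions is not reproduced here, so the following is only a sketch of what they might look like. The log format, the sample lines, the file name and the function names are all assumptions made for this example. A last column of "-" is taken to mean that no byte count was recorded.

```python
import os
import tempfile

# Hypothetical sample data: the last column is a byte count or "-".
SAMPLE_LOG = (
    'host1 - - "GET /a.html" 200 512\n'
    'host2 - - "GET /b.html" 304 -\n'
    'host3 - - "GET /c.html" 200 1024\n'
)


def sum_last_column_loop(path):
    """Non-generator solution: one explicit loop over the lines."""
    total = 0
    with open(path) as f:
        for line in f:
            last = line.rsplit(None, 1)[1]   # last whitespace-separated field
            if last != "-":
                total += int(last)
    return total


def sum_last_column_generators(path):
    """Generator solution: each step is a generator expression."""
    with open(path) as f:
        columns = (line.rsplit(None, 1)[1] for line in f)
        sizes = (int(x) for x in columns if x != "-")
        return sum(sizes)


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "access-log")
    with open(log_path, "w") as f:
        f.write(SAMPLE_LOG)
    loop_total = sum_last_column_loop(log_path)
    generator_total = sum_last_column_generators(log_path)
```

Both functions read the file one line at a time, so neither holds the whole log in memory.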
GENERATORS AS A PIPELINE Each step is defined by iteration/generation. Instead of focusing on the problem at a line-by-line level, we break it down into big operations that operate on the whole file. The iteration that occurs in each step holds the pipeline together.
EXAMPLE FURTHER DEVELOPED The previous program, extended to read hundreds of logs across various directories. The function os.walk is used to search the file system.
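One way to wrap `os.walk` in a generator is sketched below. The name `gen_find`, the filename pattern and the temporary directory layout are assumptions for illustration.

```python
import fnmatch
import os
import tempfile


def gen_find(filepat, top):
    """Yield the path of every file under top whose name matches filepat."""
    for path, dirlist, filelist in os.walk(top):
        for name in fnmatch.filter(filelist, filepat):
            yield os.path.join(path, name)


# Build a small hypothetical directory tree to search.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "www", "old"))
    for rel in (
        "access-log",
        os.path.join("www", "access-log-2"),
        os.path.join("www", "old", "notes.txt"),
    ):
        with open(os.path.join(top, rel), "w") as f:
            f.write("")

    found = sorted(os.path.relpath(p, top) for p in gen_find("access-log*", top))
```

Only the two files whose names start with `access-log` are yielded. `notes.txt` is skipped.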
Open a sequence of filenames. Concatenate items from one or more sources into a single sequence of items.
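These two steps might be written as follows. The names `gen_open` and `gen_cat`, and the sample files, are illustrative assumptions. This sketch handles plain text files only.

```python
import os
import tempfile


def gen_open(filenames):
    """Open each filename in turn and yield the file object."""
    for name in filenames:
        with open(name) as f:
            yield f   # the file is closed when the consumer asks for the next one


def gen_cat(sources):
    """Concatenate items from several iterables into a single sequence."""
    for source in sources:
        yield from source


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in (("a.log", "a1\na2\n"), ("b.log", "b1\n")):
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    all_lines = list(gen_cat(gen_open(paths)))
```

Because `gen_cat` drains each file before requesting the next, only one file is open at a time.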
Generate a sequence of lines that match a given regular expression.
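A possible implementation of this step is sketched here. The name `gen_grep`, the pattern and the sample lines are assumptions for illustration.

```python
import re


def gen_grep(pattern, lines):
    """Yield only the lines in which the regular expression is found."""
    compiled = re.compile(pattern)
    for line in lines:
        if compiled.search(line):
            yield line


# Hypothetical input lines.
sample_lines = [
    "GET /index.html 200 512",
    "GET /ply/ply.html 200 97238",
    "GET /robots.txt 404 -",
]

matches = list(gen_grep(r"ply", sample_lines))
```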
Call Generator Functions
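A sketch of how the generator functions could be chained into one pipeline. All function names, the filename pattern, the regular expression and the sample log contents are assumptions made for this example. No file is read until `sum` starts pulling items through the chain.

```python
import fnmatch
import os
import re
import tempfile


def gen_find(filepat, top):
    """Yield paths under top whose file name matches filepat."""
    for path, dirlist, filelist in os.walk(top):
        for name in fnmatch.filter(filelist, filepat):
            yield os.path.join(path, name)


def gen_open(filenames):
    """Yield an open file object for each filename."""
    for name in filenames:
        with open(name) as f:
            yield f


def gen_cat(sources):
    """Concatenate several iterables into one sequence."""
    for source in sources:
        yield from source


def gen_grep(pattern, lines):
    """Yield the lines in which the pattern is found."""
    compiled = re.compile(pattern)
    return (line for line in lines if compiled.search(line))


with tempfile.TemporaryDirectory() as top:
    # Hypothetical logs spread over two directories.
    logs = {
        "site1": 'host1 "GET /ply/a.html" 200 100\nhost2 "GET /other.html" 200 999\n',
        "site2": 'host3 "GET /ply/b.html" 200 250\nhost4 "GET /ply/c.html" 304 -\n',
    }
    for subdir, text in logs.items():
        os.makedirs(os.path.join(top, subdir))
        with open(os.path.join(top, subdir, "access-log"), "w") as f:
            f.write(text)

    # Each stage consumes the previous one lazily.
    lognames = gen_find("access-log*", top)
    logfiles = gen_open(lognames)
    loglines = gen_cat(logfiles)
    plylines = gen_grep(r"/ply/", loglines)
    columns = (line.rsplit(None, 1)[1] for line in plylines)
    sizes = (int(x) for x in columns if x != "-")
    total_bytes = sum(sizes)
```

With this sample data, `total_bytes` is 350: the sum of the byte counts on the lines that mention `/ply/`.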
REFERENCES Behnel, S. Using the Cython Compiler to Write Fast Python Code. Beazley, D. M. Generator Tricks for Systems Programmers.