Parallelization of a Non-Linear Analysis Code
Lee Hively and Jim Nutaro (mentors), Computational Sciences and Engineering
Travis Whitlow, Research Alliance in Math and Science Program, Alabama Agricultural and Mechanical University
August 8, 2007, Oak Ridge, Tennessee
Oak Ridge National Laboratory, U.S. Department of Energy
Outline
- Problem statement
- Research/objectives
- Solution approach
- MPI (Message Passing Interface)
- Results
- Summary
Problem
- FORTRAN-based research code
- Predicts biomedical events (seizures)
- Searches over a large parameter space
  - 1 ≤ w ≤ 500 (500)
  - 5000 ≤ N ≤ 100,000 (95,000)
  - 2 ≤ S ≤ ≤ d ≤ 26
  - 2 ≤ λ ≤ 250 (250)
  - 0 ≤ uE ≤ 1 (2)
  - (231)
- "Lots of combinations"
What is Parallel Computing?
- Theory
  - An executable task can be divided into smaller chunks that run on multiple processors to produce results faster
  - The chunks are computed simultaneously
- Figure: initialization, sent to processors, gather results, output
Parallel Computing Dependencies
- Software compatibility
- Memory
- Each program segment independent of other segments
Research
- ORNL developed and patented a novel forewarning technology
  - Biomedical events
  - Machine failures
- FORTRAN-based research code
  - Accurately predicts epileptic seizures
  - 4.5 hours of forewarning
- R&D 100 Award
Design Patterns of the Code
- Initialization
  - Begins with process-indicative, time-serial data
  - Rejects data of inadequate quality to avoid the garbage-in, garbage-out syndrome
- Removes confounding artifacts
  - Eye blinks
  - Muscular artifacts
- Converts filtered data
  - Statistical distribution function
  - Captures a baseline signature
  - Compares to signatures of dissimilarity
- Indicates condition change that detects or predicts the event
- Accurate prediction requires that the algorithm use a good parameter set
Goal
- Perform a Monte Carlo search over a large space of possible parameters
- Analyze each parameter-set choice
  - Compare predictions made with each parameter set to known data sets
  - How well can this parameter set predict events?
    - True positive(s)
    - True negative(s)
- The total true rate (sum of true positives and true negatives) for a specific parameter set is independent of every other parameter set
  - Ideal problem for parallel computing!
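The total true rate can be made concrete with a small FORTRAN sketch. It assumes the rate is reported as the percentage of known data sets classified correctly, that is (TP + TN) / (TP + TN + FP + FN). The module, function, and argument names and the sample counts are illustrative and are not taken from the research code.

```fortran
module ttr_mod
  implicit none
contains
  ! Total true rate in percent: correctly classified data sets
  ! (true positives + true negatives) over all data sets.
  ! Assumed definition; all names here are illustrative only.
  pure function total_true_rate(tp, tn, fp, fn) result(rate)
    integer, intent(in) :: tp, tn, fp, fn
    real :: rate
    integer :: total
    total = tp + tn + fp + fn
    if (total > 0) then
      rate = 100.0 * real(tp + tn) / real(total)
    else
      rate = 0.0   ! no data sets: report zero rather than divide by zero
    end if
  end function total_true_rate
end module ttr_mod

program ttr_demo
  use ttr_mod
  implicit none
  ! Made-up counts for one parameter set: 18 of 20 data sets correct.
  print '(a, f6.1, a)', 'total true rate = ', total_true_rate(8, 10, 1, 1), ' %'
end program ttr_demo
```

Because the function reads only the counts for its own parameter set, calls for different parameter sets share no state, which is what makes the search easy to spread across processors.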
Seizure Prediction Process
1. EEG data gathered and sent to SeizAlert device
2(a). Artifact removal done and discrete points generated
2(b). Analysis of dissimilarity
3. Forewarning result
Objectives
1. Parallelize existing research-class, nonlinear statistical FORTRAN code
2. Analyze brain-wave seizure data
   - Statistical parameters for seizure forewarning
3. Find parameter sets that maximize total true rate
Approach
- Note all possible options
- Choose the most suitable option
- Develop ways to implement that option
  - MPI
MPI (Message Passing Interface)
- What is message passing?
  - A programming paradigm widely used on parallel computers with distributed memory and on networks of workstations (NOWs)
- Figure: head node
How Does MPI Work?
Program Example
FORTRAN:
    integer :: a
    a = 0
    do I = 1,10
      a = a+1
    end do
    print*, a
FORTRAN + MPI:
    include 'mpif.h'
    integer myrank,a
    call MPI_INIT (…)
    call MPI_COMM_RANK(…)
    …
    call MPI_FINALIZE
    end
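One way the elided MPI calls might be filled in is sketched here, assuming an MPI installation that provides mpif.h and the standard FORTRAN bindings. Every process runs the same loop, and the per-process counts are summed onto rank 0 with MPI_REDUCE. The names nprocs, total, and ierr are illustrative, and unlike the runs described in this talk, rank 0 also does a share of the work in this sketch.

```fortran
program mpi_count
  implicit none
  include 'mpif.h'
  integer :: myrank, nprocs, ierr, i, a, total

  ! Start MPI, then ask for this process's rank and the process count.
  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  ! Every process runs the same serial loop on its own copy of a.
  a = 0
  do i = 1, 10
    a = a + 1
  end do

  ! Sum the per-process results onto rank 0.
  call MPI_REDUCE(a, total, 1, MPI_INTEGER, MPI_SUM, 0, MPI_COMM_WORLD, ierr)

  ! Only rank 0 holds the reduced value, so only rank 0 prints it.
  if (myrank == 0) print *, 'processes:', nprocs, ' total:', total

  call MPI_FINALIZE(ierr)
end program mpi_count
```

With a typical installation this would be built with a compiler wrapper such as mpif90 and launched with a command such as mpirun -np 4; on 4 processes the reduced total is 40, since each process contributes 10.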
Results
- After each iteration, a single-line forewarning summary is written, detailing:
  - Parameter set values
  - True positives and true negatives
  - The total true rate
- Figure: a standard forewarning summary line, labeled controls, search area, and forewarning details
Isolating each value,
  a b c d e f g h i j k l m n p
where
  a = montage (0 = monopolar, 1 = bipolar)
  b = half width of the artifact filter window (w)
  c = # pts per cutset
  d = # of dimensions in the PS*
  e = # symbols (S)
  f = time delay lag (λ)
  g = uE (0 = uniform, 1 = equiprobable)
  h = # of base-case cutsets
  i = inter-PS*-symbol lag
  j = # DM above threshold within successive occurrence window
  k = # successive occurrences above l
  l = threshold
  m = total true rate (%)
  n = # false negatives
  p = # false positives
*PS = phase space, DM = dissimilarity measures
Also,
- Successive runs were performed on 2, 4, 8, 16, 32, 64, 128, and 256 processors (head node not included)
- Each run had a unique random seed
- Table columns: # N, # P, N * P, CPU time (s), Relative CPU time
  - N = parameter sets
  - P = processors
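A unique seed matters because processes that start from the same seed would draw the same parameter sets. The sketch below shows one way to derive a distinct seed per process using the intrinsic RANDOM_SEED, and then draws a half width w from the range 1 to 500 given earlier. It is an assumption that seeding is done per process rank; the base value, the stride, and all names are illustrative and not taken from the research code.

```fortran
module seed_mod
  implicit none
contains
  ! Seed the intrinsic generator differently for each process rank.
  ! The base value 12345 and the strides are arbitrary illustrative choices.
  subroutine seed_for_rank(rank)
    integer, intent(in) :: rank
    integer :: n, i
    integer, allocatable :: seed(:)
    call random_seed(size=n)        ! seed length is compiler dependent
    allocate(seed(n))
    do i = 1, n
      seed(i) = 12345 + 1000*rank + 37*i
    end do
    call random_seed(put=seed)
  end subroutine seed_for_rank

  ! Draw a half width w uniformly from the integers 1..500.
  function draw_w() result(w)
    integer :: w
    real :: u
    call random_number(u)           ! 0 <= u < 1
    w = min(500, 1 + int(500.0 * u))
  end function draw_w
end module seed_mod

program seed_demo
  use seed_mod
  implicit none
  integer :: rank
  ! Stand-in for MPI ranks: reseed and draw once per "process".
  do rank = 0, 3
    call seed_for_rank(rank)
    print *, 'rank', rank, ' first w =', draw_w()
  end do
end program seed_demo
```

In an MPI program the loop over rank would disappear, and each process would call seed_for_rank once with the rank returned by MPI_COMM_RANK.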
Plotting the relative CPU time…
Analysis
- Examining the forewarning summaries showed that better parameters can be chosen by graphing the total true rate versus each parameter. Doing so:
  - Makes the useful region of each parameter's values visible
  - Shows which values generate higher total true rates
- Table columns: #NODES, w, N, d, S, λ, uE; TTR = 0.90
Total true rate vs. parameter(s)
Summary
- The research depends heavily on the ability to produce many results without sacrificing time
  - Achieved with MPI (Message Passing Interface)
  - Obtained multiple results in the time frame of a single result
- Chose better parameters visually
  - Graphed total true rate vs. parameter(s)
Acknowledgements
- Dr. Lee Hively
- Dr. Jim Nutaro
- Dr. Nancy Munro
- Mark Elmore
- Debbie McCoy
- Dr. Z.T. Deng and Dr. Cathy Qian
The Research Alliance in Math and Science program is sponsored by the Mathematical, Information, and Computational Sciences Division, Office of Advanced Scientific Computing Research, U.S. Department of Energy. The work was performed at the Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under Contract No. DE-AC05-00OR. This work has been authored by a contractor of the U.S. Government; accordingly, the U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.
Questions?