1 On-line Parallel Tomography Shava Smallen UCSD.


Presentation on theme: "1 On-line Parallel Tomography Shava Smallen UCSD."— Presentation transcript:

1 On-line Parallel Tomography. Shava Smallen, UCSD

2 Talk Outline
I) Introduction to On-line Parallel Tomography
II) Tunable On-line Parallel Tomography
III) User-directed application-level scheduler
IV) Experiments
V) Conclusion

3 What is tomography?
A method for reconstructing the interior of an object from its projections.
At the National Center for Microscopy and Imaging Research (NCMIR), tomography is applied to electron microscopy to study specimens at the cellular and subcellular level.

4 Example
Tomogram of spiny dendrite (images courtesy of Steve Lamont)

5 Parallel Tomography at NCMIR
Embarrassingly parallel
[Figure labels: X, Y, Z axes; specimen; slice; projection; scanline]

6 NCMIR Usage Scenarios
Off-line parallel tomography (off-line PT)
– Data resides somewhere on secondary storage
– Single, high-quality tomogram
– Reduce turnaround time
– Previous work (HCW '00)
On-line parallel tomography (on-line PT)
– Data streamed from the electron microscope
– Long makespan, configuration errors, etc.
– Iteratively computed tomogram
– Soft real-time execution

7 On-line PT
Real-time feedback on quality of data acquisition
1) First projection acquired from microscope
2) Generate coarse tomogram
3) Iteratively refine tomogram using subsequent projections (refresh)
– Update each voxel value
– Size of tomogram is constant
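The refinement loop above can be sketched in a few lines. This is only an illustration: it uses plain unfiltered backprojection on a single square slice as a stand-in, since the transcript does not say which reconstruction method NCMIR uses, and the names `backproject` and `refine_online` are hypothetical.

```python
import math


def backproject(slice_img, scanline, angle_deg):
    """Add one scanline's contribution to every pixel of a square slice, in place.

    Assumption: simple unfiltered backprojection with nearest-neighbour lookup.
    """
    n = len(slice_img)
    centre = (n - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    for y in range(n):
        for x in range(n):
            # Position of pixel (x, y) along the detector for this tilt angle.
            t = (x - centre) * cos_t + (y - centre) * sin_t + centre
            i = int(round(t))
            if 0 <= i < len(scanline):
                slice_img[y][x] += scanline[i]


def refine_online(projections, n):
    """Yield (projection count, slice) after each newly acquired scanline.

    The slice is allocated once, so its size stays constant; each new
    projection only updates the existing pixel values.
    """
    slice_img = [[0.0] * n for _ in range(n)]
    for count, (angle_deg, scanline) in enumerate(projections, start=1):
        backproject(slice_img, scanline, angle_deg)
        yield count, slice_img
```

Each yielded slice is the same object, updated in place, which mirrors the slide's point that a refresh changes voxel values but never the tomogram size.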

8 NCMIR Target Platform
Multi-user, heterogeneous resources
– NCMIR cluster: SGI Indigo2, SGI Octane, SUN ULTRA, SUN Enterprise; IRIX, Solaris
– Meteor cluster: Pentium III dual proc; Linux, PBS
– Blue Horizon: AIX, LoadLeveler, Maui Scheduler
[Figure label: network]

9 On-line PT Architecture
[Figure labels: preprocessor, ptomo, writer; projection, scanlines, slices, tomogram]

10 On-line PT Design
1) Frame on-line parallel tomography as a tunable application
– Resource limitations / dynamic
– Availability of alternate configurations [Chang et al.]; each configuration corresponds to different output quality and resource usage
2) Coupled with user-directed application-level scheduler (AppLeS)
– Adaptive scheduler
– Promote application performance

11 On-line PT Configuration
Triple: (f, r, su)
Reduction factor (f)
– Reduce resolution of data → reduce both computation and communication
Projections per refresh (r)
– Reduce refinement frequency → reduce communication
Service units (su)
– Increase cost of execution → increase computational power

12 User Preferences
Best configuration: (f, r, su) = (1, 1, 0)
Several possible configurations → user specifies bounds
– Projections should be at least size 256x256: 1 ≤ f ≤ 4 or 1 ≤ f ≤ 8
– User could tolerate up to a 10-minute wait: 1 ≤ r ≤ 13
– Reasonable upper bound: 0 ≤ su ≤ (50 x acquisition period x c)
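One way to see where bounds like these could come from is to derive them from the user's stated tolerances. The sketch below is an assumption-laden illustration: the full projection sizes (1024 and 2048 pixels per side) and the 45-second acquisition period are not given in the transcript and were picked only because they reproduce the bounds on the slide. The function names are hypothetical.

```python
def max_reduction_factor(full_size, min_size):
    """Largest integer f for which full_size / f is still at least min_size."""
    return full_size // min_size


def max_projections_per_refresh(max_wait_s, acquisition_period_s):
    """Largest r for which r acquisition periods fit within the tolerated wait."""
    return int(max_wait_s // acquisition_period_s)


# Assumed inputs (not from the slides): 1024- or 2048-pixel projections,
# a 256-pixel minimum, a 600 s wait, and a 45 s acquisition period.
f_bound_small = max_reduction_factor(1024, 256)
f_bound_large = max_reduction_factor(2048, 256)
r_bound = max_projections_per_refresh(600, 45)
```

Under those assumptions the bounds come out as f ≤ 4, f ≤ 8, and r ≤ 13. The su bound is left out because the constant c is not defined in the transcript.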

13 User-directed
Feasible?
– Use dynamic load information
– If work allocation found
Better?
– e.g. triples of (reduction factor, projections per refresh, service units):
1. (1, 6, 4) - best f
2. (2, 2, 8) - good su/r
3. (2, 1, 20) - best r

14 User-directed AppLeS
[Flowchart between User and User-directed AppLeS. Steps: generate request, process request, find work allocation, display triples, review triples, adjust request, execute on-line PT. Edge labels: feasible, infeasible, accepts one, rejects all]

15 Triple Search
Search parameter space
– If triple satisfies constraints → feasible
Constrained optimization problem based on soft real-time execution
– Compute constraint
– Transfer constraint
Heuristics to reduce search space
– e.g. assume user will always choose (1,2,1) over (1,2,4)
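The pruning heuristic can be sketched as a search that, for each (f, r) pair, stops at the cheapest feasible su. This is a minimal illustration, not the scheduler's actual search: `search_triples` is a hypothetical name, and the feasibility test is a toy stand-in for the compute and transfer constraints, which the transcript does not spell out.

```python
from itertools import product


def search_triples(f_values, r_values, su_values, is_feasible):
    """Return one feasible triple per (f, r): the one with the smallest su.

    Encodes the assumption that a user always prefers (1, 2, 1) to (1, 2, 4),
    so larger su values for the same (f, r) are never offered.
    """
    found = []
    for f, r in product(sorted(f_values), sorted(r_values)):
        for su in sorted(su_values):
            if is_feasible(f, r, su):
                found.append((f, r, su))
                break  # skip every more expensive su for this (f, r)
    return found


def toy_feasible(f, r, su):
    """Toy rule for illustration only: more reduction, fewer refreshes, or
    more service units all make a configuration easier to satisfy."""
    return (1 + su) * r * f * f >= 4


candidates = search_triples([1, 2], [1, 2], range(0, 5), toy_feasible)
```

With the toy rule, (1, 2, 1) is offered and (1, 2, 4) is never evaluated, which is the search-space reduction the slide describes.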

16 Work Allocation
[Diagram labels: work allocation, transfer constraints, cost, user constraints, compute constraints, cpu availability, processor availability, ptomo-to-writer bandwidth, subnet-to-writer bandwidth]
Multiple mixed-integer programs → approximate solution
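As a rough intuition for what a work allocation is, the sketch below splits slices across machines in proportion to an effective speed. It is deliberately much simpler than the mixed-integer programs named on the slide: it ignores cost, user constraints, and both bandwidth constraints, and the function name and speed values are hypothetical.

```python
def allocate_slices(n_slices, speeds):
    """Split n_slices across machines in proportion to their effective speed.

    speeds maps a machine name to a positive number, for example benchmark
    speed scaled by predicted CPU availability (an assumption). Rounding
    uses the largest-remainder rule so the counts always sum to n_slices.
    """
    total = sum(speeds.values())
    shares = {m: n_slices * s / total for m, s in speeds.items()}
    alloc = {m: int(share) for m, share in shares.items()}
    leftover = n_slices - sum(alloc.values())
    # Hand the remaining slices to the machines with the largest fractions.
    by_remainder = sorted(shares, key=lambda m: shares[m] - alloc[m], reverse=True)
    for m in by_remainder[:leftover]:
        alloc[m] += 1
    return alloc


example = allocate_slices(10, {"a": 9, "b": 5, "c": 1})
```

A proportional split like this balances compute time only; adding the transfer constraints is what turns the problem into the constrained optimization the slide refers to.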

17 Experiments
Impact of dynamic information on scheduler performance
Usefulness of tunability
Grid environments
Scheduling latency

18 Dynamic Information
We fix the triple and let schedulers determine work allocation.

19 Simulation
Evaluate schedulers
– Repeatability
– Long makespan
– Several resource environments
Simgrid (Casanova [CCGrid’2001])
– API for evaluating scheduling algorithms
– Tasks; resources modeled using traces
– E.g. parameter sweep applications [HCW’00]
Simtomo

20 Performance Metric
Relative refresh lateness
[Figure labels: expected refresh period, actual refresh period, relative refresh lateness]
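The transcript keeps only the figure labels for this metric, not its formula. The sketch below shows one plausible reading, stated as an assumption: lateness is how far the actual refresh period overshoots the expected one, clipped at zero, and it is made relative by dividing by the expected period. The function name is hypothetical.

```python
def relative_refresh_lateness(expected_period_s, actual_period_s):
    """Assumed definition: max(0, actual - expected) / expected.

    Returns 0.0 for a refresh that arrives on time or early, and 1.0 for a
    refresh that takes twice the expected period.
    """
    if expected_period_s <= 0:
        raise ValueError("expected refresh period must be positive")
    lateness = max(0.0, actual_period_s - expected_period_s)
    return lateness / expected_period_s
```

Normalizing by the expected period makes runs with different values of r comparable, which fits a metric that is averaged across many simulations.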

21 NCMIR experiments
Traces (8 machines)
– 8-hour work day on March 8th, 2001 (8:00 am to 4:00 pm)
Ran simulations throughout the day at 10-minute intervals

22 Perfect Load Predictions
[Plot: mean relative refresh lateness vs. hours since 3/8/2001 - 8:00 PST; series: wwa, wwa+cpu, wwa+bw, AppLeS]

23 Imperfect Load Predictions
[Plot not captured in transcript]

24 Synthetic Grids
Bandwidth predictability
– Average prediction error
– p_i ∈ {L, M, H}
– Grid type written as p_1 p_2 p_3, e.g. LMH
– 27 types
– 2510 Grids x 4 schedulers
– 10,040 simulations
[Figure labels: p_1, p_2, p_3]
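The count of 27 grid types follows from three positions with three levels each. A short enumeration makes that concrete; `grid_types` is a hypothetical helper name.

```python
from itertools import product

LEVELS = "LMH"  # low, medium, high


def grid_types(n_positions):
    """All strings of length n_positions over the levels L, M, H."""
    return ["".join(combo) for combo in product(LEVELS, repeat=n_positions)]


bandwidth_types = grid_types(3)
```

The same construction with five categorized traces gives 3^5 = 243 types, matching the count quoted for the tunability experiments.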

25 Relative Scheduler Performance
[Plot; values shown: 705.89, 658.91, 127.10, 1.07]

26 Partial Ordering
Performance vs. bandwidth predictability
Grid predictability
– Partial orders using p_1 p_2 p_3
– Comparable / not comparable
– e.g. HML is comparable to HLL
– e.g. HLM is not comparable to LHM
HHH, HHM, HMM, HLM, MLM, LLM, LLL
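The comparable and not-comparable examples are consistent with a componentwise ordering in which H ranks above M and M above L. The sketch below assumes that reading, which the transcript does not state outright; the function names are hypothetical.

```python
RANK = {"L": 0, "M": 1, "H": 2}


def dominates(a, b):
    """True if grid type a is at least as predictable as b in every position."""
    return all(RANK[x] >= RANK[y] for x, y in zip(a, b))


def comparable(a, b):
    """Two grid types are comparable if one dominates the other."""
    return dominates(a, b) or dominates(b, a)


def is_chain(types):
    """True if each type in the list dominates the one that follows it."""
    return all(dominates(a, b) for a, b in zip(types, types[1:]))
```

Under this ordering HML and HLL are comparable, HLM and LHM are not, and the sequence HHH, HHM, HMM, HLM, MLM, LLM, LLL forms a chain, so performance along it can be read as a function of decreasing predictability.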

27 Example Partial Order
[Plot: relative refresh lateness (seconds) for grid types HHH, HHM, HMM, HLM, MLM, LLM, LLL; series: wwa, wwa+cpu, wwa+bw, AppLeS]

28 Tunability Experiments
How useful is tunability?
– Variability
Fixed topology
– Categorized traces: L, M, H
– v_1 v_2 v_3 v_4 v_5
– 243 Grid types
[Figure labels: v_1, v_2, v_3, v_4, v_5]

29 Tunability Experiments
Run over a 2-day period
– Back-to-back
– Assume single user model
Set of triples chosen
– T = {1,…,61}
[Figure label: f, r, su]

30 Tunability Results
Count how many times a triple changed per 2-day simulation
– e.g. 12.9%
– e.g. 25.7%
[Plot: fraction of changes (0 to 1) for parameters f, r, su]

31 Scheduling Latency
Time to search for feasible triples
– e.g. 88% under 1 sec
– e.g. 63% under 1 sec

32 Conclusions and Future Work
Grid-enabled version of on-line parallel tomography
– Tunable application
Tunability is useful in Grid environments
– User-directed AppLeS
Importance of bandwidth predictability
– e.g. rescheduling
Scheduling latency is nominal
Production use

33 Search optimization
[Diagram labels: (f_min, r_min, su_min), (f_max, r_max, su_max), (f_min, r_min), (f_max, r_max), (f_min, su_min), (f_max, su_max), (r_min, su_min)]

