Parallel Tomography
Shava Smallen, CSE Dept., U.C. San Diego
Parallel Tomography
- Tomography: reconstruction of a 3D image from 2D projections
- Used for electron microscopy at the National Center for Microscopy and Imaging Research (NCMIR)
- Coarse-grain, embarrassingly parallel application
- Project goal: achieve performance using AppLeS scheduling and Globus services
Parallel Tomography Structure (Off-line)
[Diagram: source, preprocessor, reader, ptomo, and writer processes feeding a destination, coordinated by the driver; solid lines = data flow, dashed lines = control]
Parallel Tomography Structure (On-line)
[Diagram: source, preprocessor, reader, ptomo, and writer processes feeding a destination, coordinated by the driver; solid lines = data flow, dashed lines = control]
Parallel Tomography Structure
- driver: directs communication among processes and controls the work queue
- preprocessor: serially formats raw data from the microscope for parallel processing
- reader: reads preprocessed data and passes it to a ptomo process
- ptomo: performs the tomography processing
- writer: writes processed data to its destination (e.g., a visualization device or tape)
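The driver/work-queue pattern above can be sketched as follows. This is an illustrative sketch only: the chunk representation, the `process_chunk` placeholder, and the threading model are assumptions, not the actual ptomo implementation.

```python
from queue import Queue
from threading import Thread

def process_chunk(chunk):
    # Stand-in for the ptomo reconstruction step on one data chunk.
    return sum(chunk)

def worker(work_queue, results):
    # Each worker plays the role of a ptomo process pulling from the queue.
    while True:
        chunk = work_queue.get()
        if chunk is None:          # sentinel: no more work
            work_queue.task_done()
            break
        results.append(process_chunk(chunk))
        work_queue.task_done()

def driver(chunks, num_workers=4):
    # The driver hands out chunks of preprocessed data to idle workers
    # until the queue is drained (the reader would enqueue, the writer
    # would consume the results).
    work_queue, results = Queue(), []
    threads = [Thread(target=worker, args=(work_queue, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for chunk in chunks:
        work_queue.put(chunk)
    for _ in threads:              # one sentinel per worker
        work_queue.put(None)
    for t in threads:
        t.join()
    return results
```

Because the application is embarrassingly parallel, chunk order does not matter; the writer can emit results as they arrive.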
NCMIR Environment
- Platform: collection of heterogeneous resources
  - workstations (e.g., SGI, Sun)
  - NPACI supercomputer time (e.g., SP-2, T3E)
- How can users achieve execution performance in heterogeneous, multi-user environments?
Performance and Scheduling
- Users want minimum turnaround time in the following scenarios:
  - off-line: an already-collected data set
  - on-line: data streamed from the electron microscope
- The goal of our project is to develop adaptive scheduling strategies that promote both performance and flexibility for tomography in multi-user Globus environments
Adaptive Scheduling Strategy
- Develop a schedule that adapts to deliverable resource performance at execution time
- The application scheduler will dynamically:
  - select sets of resources based on a user-defined performance measure
  - plan possible schedules for each feasible resource set
  - predict the performance of each schedule
  - implement the best predicted schedule on the selected infrastructure
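The select/plan/predict/implement loop above can be sketched as a simple search over candidate schedules. The `plan` and `predict` callables and the turnaround-time measure are illustrative assumptions; the real AppLeS scheduler draws its predictions from dynamic load information.

```python
def select_schedule(feasible_resource_sets, plan, predict):
    """Pick the schedule with the best (lowest) predicted turnaround time.

    feasible_resource_sets: candidate sets of resources
    plan:    maps a resource set to a list of candidate schedules
    predict: maps a schedule to a predicted turnaround time
    """
    best_schedule, best_time = None, float("inf")
    for resources in feasible_resource_sets:
        for schedule in plan(resources):
            t = predict(schedule)
            if t < best_time:
                best_schedule, best_time = schedule, t
    # The caller would then implement best_schedule on the
    # selected infrastructure (e.g., via Globus GRAM).
    return best_schedule, best_time
```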
AppLeS = Application-Level Scheduling
- Each AppLeS is an adaptive application scheduler
  - AppLeS + application = self-scheduling application
- Scheduling decisions are based on:
  - dynamic information (e.g., resource load forecasts)
  - static application and system information
Resource Selection
- Available resources:
  - workstations: run immediately, but execution may be slow due to load
  - supercomputers: may have to wait in a queue, but execution is fast on dedicated nodes
- We want to schedule on both types of resources together for improved execution performance
Allocation Strategy
- We have developed a strategy that simultaneously schedules on both:
  - workstations
  - immediately available supercomputer nodes
    - avoids wait time in the batch queue
    - availability information is exported by the batch scheduler
- Overall, this strategy performs better than running on either type of resource alone
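A toy model makes the combined-allocation claim concrete: work is spread over slow-but-available workstations and fast, immediately available SP-2 nodes. The per-resource processing rates and the greedy assignment policy are assumptions for the sketch, not the actual AppLeS allocation algorithm.

```python
import heapq

def turnaround(num_chunks, ws_rates, sp2_nodes, sp2_rate):
    """Predicted time to finish num_chunks, greedily assigning each
    chunk to the resource that becomes free earliest.

    ws_rates:  chunks/unit-time for each workstation (load-adjusted)
    sp2_nodes: number of immediately available SP-2 nodes
    sp2_rate:  chunks/unit-time per dedicated SP-2 node
    """
    rates = list(ws_rates) + [sp2_rate] * sp2_nodes
    if not rates:
        raise ValueError("no resources available")
    heap = [(0.0, r) for r in rates]   # (time resource is free, rate)
    heapq.heapify(heap)
    finish = 0.0
    for _ in range(num_chunks):
        free_at, rate = heapq.heappop(heap)
        done = free_at + 1.0 / rate    # time to process one chunk
        finish = max(finish, done)
        heapq.heappush(heap, (done, rate))
    return finish
```

Under this model, using both resource pools together never predicts a worse turnaround than either pool alone, mirroring the slide's conclusion.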
Preliminary Globus/AppLeS Tomography Experiments
- Resources:
  - 6 workstations at the Parallel Computation Laboratory (PCL) at UCSD
  - immediately available nodes on the SDSC SP-2 (128 nodes)
    - the Maui scheduler exports the number of immediately available nodes, e.g., 5 nodes available for the next 30 minutes, 10 nodes available for the next 10 minutes
  - Globus installed everywhere
[Diagram: PCL workstations and NCMIR connected to the SDSC SP-2]
Allocation Strategies Compared
- Four strategies compared:
  - SP2Immed/WS: workstations and immediately available SP-2 nodes
  - WS: workstations only
  - SP2Immed: immediately available SP-2 nodes only
  - SP2Queue(n): traditional batch-queue submission using n nodes
- Experiments were performed in a production environment
  - experiments were run in sets, each set containing all strategies (e.g., SP2Immed, SP2Immed/WS, WS, SP2Queue(8))
  - within a set, experiments ran back-to-back
Results (8 nodes on SP-2)
Results (16 nodes on SP-2)
Results (32 nodes on SP-2)
Targeting Globus
- AppLeS uses Globus services
  - GRAM, GSI, and RSL
    - process startup on workstations and batch-queue systems
    - remote process control
  - Nexus
    - interprocess communication
    - multi-threaded
    - callback functions for fault tolerance
Experiences with Globus
- What we would like to see improved:
  - free-node information in the MDS (e.g., time availability, frequency)
  - the steep learning curve for initial installation, which requires knowledge of the underlying Globus infrastructure
  - startup scripts: more documentation and more flexible configuration
Experiences with Globus
- What worked:
  - once installed, the software works well
  - the responsiveness and willingness to help of the Globus team (mailing list, web pages)
Future Work
- Develop a contention model to address network overloading, including:
  - network capacity information
  - a model of the application's communication requirements
- Expand the allocation policy to include additional resources:
  - other Globus supercomputers and reserved resources (GARA)
  - resource availability
- NPACI Alpha project with NCMIR and Globus
AppLeS
- AppLeS/NCMIR/Globus Tomography Project
  - Shava Smallen, Jim Hayes, Fran Berman, Rich Wolski, Walfredo Cirne (AppLeS)
  - Mark Ellisman, Marty Hadida-Hassan, Jaime Frey (NCMIR)
  - Carl Kesselman, Mei-Hui Su (Globus)
- AppLeS Home Page: www-cse.ucsd.edu/groups/hpcl/apples