1
The AppLeS Project: Harvesting the Grid
Francine Berman, U. C. San Diego
2
Computing Today
[Diagram: today's computing landscape - data archives, networks, visualization, instruments, MPPs, clusters, PCs, workstations, wireless devices]
3
The Computational Grid
–ensemble of heterogeneous, distributed resources
–emerging platform for high-performance computing
The Grid is the ultimate "supercomputer"
–Grids aggregate an enormous number of resources
–Grids support applications that cannot be executed at any single site
Moreover,
–Grids provide increased access to each kind of resource (MPPs, clusters, etc.)
–the Grid can be accessed at any time
4
Computational Grids
Computational Grids you know and love:
–your computer lab
–the Internet
–the PACI partnership resources
–your home computer, CS, and all the resources you have access to
You are already using the Computational Grid …
5
Programming the Grid I
Basics
–need a way to log in, authenticate in different domains, transfer files, coordinate execution, etc.
[Diagram: layered stack - Application Development Environment on top of Grid Infrastructure (Globus, Legion, Condor, NetSolve, PVM) on top of Resources]
6
Programming the Grid II
Performance-oriented programming
–need a way to develop and execute performance-efficient programs
–programs must achieve performance in the context of heterogeneity, dynamic behavior, and multiple administrative domains
This can be extremely challenging.
–What approaches are most effective for achieving Grid application execution performance?
7
Performance-oriented Execution
Careful scheduling is required to achieve application performance in parallel and distributed environments
For the Grid, adaptivity is the right approach to leverage deliverable resource capacities
[Chart: file transfer time to SC'99 in Portland from UCSD, UTK, and OGI, with NWS prediction; Alan Su]
8
Adaptive Application Scheduling
Fundamental components:
–application-centric performance model: provides a quantifiable measure of Grid resources in terms of their potential impact on the application
–prediction of deliverable resource performance at execution time
–user's preferences and performance criteria
These components form the basis for AppLeS.
9
What is AppLeS?
AppLeS = Application Level Scheduler
–joint project with Rich Wolski (U. of Tenn.)
AppLeS is a methodology
–the project has investigated adaptive application scheduling using dynamic information, application-specific performance models, and user preferences
–the AppLeS approach is based on real-world scheduling
AppLeS is software
–we have developed multiple AppLeS-enabled applications which demonstrate the importance and usefulness of adaptive scheduling on the Grid
10
How Does AppLeS Work?
Select resources
For each feasible resource set, plan a schedule
–for each schedule, predict application performance at execution time
–consider both the prediction and its qualitative attributes
Deploy the "best" of the schedules with respect to the user's performance criteria
–execution time
–convergence
–turnaround time
[Diagram: AppLeS + application driving resource selection, schedule planning, and deployment on top of the Grid infrastructure, NWS, and resources]
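A minimal runnable sketch of this select/plan/deploy cycle in Python; the host data, the desirability-prefix selection, and the toy performance model are illustrative stand-ins for the real AppLeS components:

```python
# Illustrative sketch of the AppLeS select/plan/deploy cycle.
# Names and numbers are invented, not the real AppLeS API.

def feasible_resource_sets(resources):
    # Sort by a desirability score, then take growing prefixes
    # of the sorted list as candidate resource sets.
    ranked = sorted(resources, key=lambda r: r["desirability"], reverse=True)
    return [ranked[: k + 1] for k in range(len(ranked))]

def predicted_time(resource_set, total_work):
    # Toy performance model: work is split evenly; each host runs at
    # speed * available_cpu (the latter would come from NWS forecasts).
    share = total_work / len(resource_set)
    return max(share / (r["speed"] * r["cpu_avail"]) for r in resource_set)

def best_schedule(resources, total_work):
    candidates = [(predicted_time(rs, total_work), rs)
                  for rs in feasible_resource_sets(resources)]
    # "Best" is judged here by predicted execution time; other user
    # criteria (turnaround, convergence) would slot in the same way.
    return min(candidates, key=lambda c: c[0])

hosts = [
    {"name": "a", "desirability": 3.0, "speed": 2.0, "cpu_avail": 0.9},
    {"name": "b", "desirability": 2.0, "speed": 1.0, "cpu_avail": 0.5},
    {"name": "c", "desirability": 1.0, "speed": 1.0, "cpu_avail": 0.2},
]
time, chosen = best_schedule(hosts, total_work=100.0)
print(time, [r["name"] for r in chosen])
```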
11
Network Weather Service (Wolski, U. Tenn.)
The NWS provides dynamic resource information for AppLeS
NWS is a stand-alone system
NWS
–monitors the current system state
–provides the best forecast of resource load from multiple models
[Diagram: sensor interface feeding multiple forecasting models; the forecaster reports through a reporting interface]
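A small Python sketch of the multi-model forecasting idea: run several predictors over the measurement history and report the forecast of whichever has been most accurate recently. The three predictors here (running mean, median, last value) are illustrative, not the actual NWS model suite:

```python
# Replay a measurement history, score each simple model by its past
# absolute error, and report the forecast of the best-scoring model.
from statistics import mean, median

MODELS = {
    "running_mean": lambda hist: mean(hist),
    "median":       lambda hist: median(hist),
    "last_value":   lambda hist: hist[-1],
}

def best_forecast(history, window=10):
    errors = {name: 0.0 for name in MODELS}
    for t in range(1, len(history)):
        past = history[max(0, t - window):t]
        for name, model in MODELS.items():
            errors[name] += abs(model(past) - history[t])
    winner = min(errors, key=errors.get)
    return winner, MODELS[winner](history[-window:])

cpu_avail = [0.8, 0.7, 0.75, 0.9, 0.6, 0.65, 0.7, 0.72]
print(best_forecast(cpu_avail))
```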
12
AppLeS Example: Jacobi2D
Jacobi2D is a regular, iterative SPMD application
Local communication using a 5-point stencil
AppLeS written in KeLP/MPI by Wolski
Partitioning approach = strip decomposition
Goal is to minimize execution time
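For reference, a minimal NumPy sketch of one Jacobi iteration with the 5-point stencil (serial; in the real application the rows would be strip-decomposed across processors, with neighbors exchanging boundary rows each iteration):

```python
import numpy as np

def jacobi_step(u):
    # Each interior point becomes the average of its 4 neighbors.
    v = u.copy()
    v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    return v

u = np.zeros((64, 64))
u[0, :] = 1.0            # a fixed boundary condition
for _ in range(100):
    u = jacobi_step(u)
```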
13
Jacobi2D AppLeS Resource Selector
Feasible resources determined according to application-specific "desirability" metrics
–memory
–application-specific benchmark
The desirability metric is used to sort resources
Feasible resource sets are formed from initial subsets of the sorted desirability list
–the next step is to plan a schedule for each feasible resource set
–the scheduler will choose the schedule with the best predicted execution time
14
Jacobi2D Performance Model and Schedule Planning
Execution time for the ith strip:
T_i = Area_i × comp_i / load_i + comm_i
where
comp_i = per-unit computation time on resource i (application-specific benchmark)
load_i = predicted percentage of CPU time available (NWS)
comm_i = time to send and receive messages, factored by predicted BW (NWS)
AppLeS uses time-balancing to determine the best partition on a given set of resources: solve for {Area_i} such that T_1 = T_2 = … = T_p
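A runnable sketch of the time-balancing computation: pick strip areas so every processor finishes at the same time T, with the areas summing to the whole grid. The per-unit costs and communication terms below are invented; in AppLeS they come from the benchmark and NWS forecasts:

```python
import numpy as np

def balance_strips(unit_cost, comm, total_area):
    # Unknowns: area_1..area_p and the common finish time T.
    # Equations: unit_cost[i] * area_i + comm[i] - T = 0   (p of them)
    #            sum(area_i) = total_area                  (1 more)
    p = len(unit_cost)
    A = np.zeros((p + 1, p + 1))
    b = np.zeros(p + 1)
    for i in range(p):
        A[i, i] = unit_cost[i]
        A[i, p] = -1.0
        b[i] = -comm[i]
    A[p, :p] = 1.0
    b[p] = total_area
    x = np.linalg.solve(A, b)
    return x[:p], x[p]          # strip areas, balanced time T

# unit_cost ~ benchmark_time / predicted_cpu_availability
areas, T = balance_strips(unit_cost=[1.0, 2.0, 4.0],
                          comm=[0.5, 0.5, 0.5],
                          total_area=1000)
print(areas, T)
```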
15
Jacobi2D Experiments
Experiments compare
–compile-time block [HPF] partitioning
–compile-time irregular strip partitioning [no NWS, no resource selection]
–run-time strip AppLeS partitioning
Runs for the different partitioning methods were performed back-to-back on production systems
Average execution time recorded
Distributed UCSD/SDSC platform: Sparcs, RS6000, Alpha Farm, SP-2
16
Jacobi2D AppLeS Experiments
Representative Jacobi2D AppLeS experiment
Adaptive scheduling leverages the deliverable performance of a contended system
The spike occurs when a gateway between PCL and SDSC goes down; subsequent AppLeS experiments avoid the slow link
17
Work-in-Progress: AppLeS Templates
AppLeS-enabled applications perform well in multi-user environments. The methodology is right on target, but …
–AppLeS must be integrated with the application -- labor-intensive and time-intensive
–you generally can't just take an AppLeS and plug in a new application
18
Templates
Current thrust is to develop AppLeS templates which
–target structurally similar classes of applications
–can be instantiated in a user-friendly timeframe
–provide good application performance
[Diagram: AppLeS template - application module, performance module, scheduling module, and deployment module behind an API, backed by the Network Weather Service]
19
Case Study: Parameter Sweep Template
Parameter sweeps = class of applications which are structured as multiple instances of an "experiment" with distinct parameter sets
Independent experiments may share input files
Examples:
–MCell
–INS2D
[Diagram: application model]
20
Example Parameter Sweep Application: MCell
MCell = general simulator for cellular microphysiology
Uses a Monte Carlo diffusion and chemical reaction algorithm in 3D to simulate complex biochemical interactions of molecules
–the molecular environment is represented as a 3D space in which the trajectories of ligands against cell membranes are tracked
Researchers plan huge runs which will make it possible to model entire cells at the molecular level
–100,000s of tasks
–10s of Gbytes of output data
–would like to perform execution-time computational steering, data analysis, and visualization
21
PST AppLeS Template
Being developed by Henri Casanova and Graziano Obertelli
Resource selection:
–for small parameter sweeps, can dynamically select a performance-efficient number of target processors [Gary Shao]
–for large parameter sweeps, can assume that all resources may be used
[Diagram: platform model - user's host and storage connected over network links to clusters, MPPs, and remote storage]
22
Scheduling Parameter Sweeps
Contingency scheduling: allocation developed by dynamically generating a Gantt chart for scheduling unassigned tasks
Basic skeleton (sketched in code below):
1. Compute the next scheduling event
2. Create a Gantt chart G
3. For each computation and file transfer currently underway, compute an estimate of its completion time and fill in the corresponding slots in G
4. Select a subset T of the tasks that have not started execution
5. Until each host has been assigned enough work, heuristically assign tasks to hosts, filling in slots in G
6. Implement the schedule
[Diagram: Gantt chart G - rows for network links and the hosts of two clusters, with computation and transfer slots filled in between scheduling events]
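A simplified, runnable sketch of this skeleton. Network-link slots are omitted, the batch size and task costs are invented, and the greedy earliest-finish assignment stands in for the heuristics of the next slide:

```python
def schedule_event(in_flight, pending, hosts, batch=4):
    # Steps 2-3: create Gantt chart G, seeded with the estimated
    # completion times of computations already underway.
    gantt = {h: 0.0 for h in hosts}
    for host, est_completion in in_flight:
        gantt[host] = max(gantt[host], est_completion)
    # Steps 4-5: pick a subset of unstarted tasks and assign each to
    # the host that would finish it earliest (a simple greedy rule).
    assignments = []
    for task_cost in pending[:batch]:
        host = min(gantt, key=gantt.get)
        gantt[host] += task_cost
        assignments.append((task_cost, host))
    return assignments            # step 6: implement this schedule

hosts = ["h1", "h2"]
in_flight = [("h1", 3.0)]         # h1 busy until t = 3.0
pending = [2.0, 1.0, 4.0, 1.5, 2.5]
print(schedule_event(in_flight, pending, hosts))
```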
23
Parameter Sweep Heuristics
Currently studying scheduling heuristics useful for parameter sweeps in Grid environments
HCW 2000 paper compares several heuristics (see the sketch after this list)
–Min-Min [task/resource pair that can complete the earliest is assigned first]
–Max-Min [longest of the task/earliest-resource times is assigned first]
–Sufferage [task that would "suffer" most if given a poor schedule is assigned first, as computed by the difference between its second-best and best completion times]
–Extended Sufferage [minimal completion times computed for the task on each cluster; the sufferage heuristic is applied to these]
–Workqueue [randomly chosen task assigned first]
Criteria for evaluation:
–How sensitive are the heuristics to the location of shared input files and the cost of data transmission?
–How sensitive are the heuristics to inaccurate performance information?
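Illustrative Python sketches of two of these heuristics over a completion-time matrix; the matrix values are invented, and file-transfer costs are folded into the entries for brevity:

```python
# ct[t][h] = estimated run time of task t on host h; a host's current
# ready time is added on top when computing completion times.

def min_min(ct):
    pending, ready, order = set(range(len(ct))), [0.0] * len(ct[0]), []
    while pending:
        # Schedule the task whose earliest completion anywhere is smallest.
        best = min(pending, key=lambda t: min(ready[h] + ct[t][h]
                                              for h in range(len(ready))))
        host = min(range(len(ready)), key=lambda h: ready[h] + ct[best][h])
        ready[host] += ct[best][host]
        order.append((best, host))
        pending.remove(best)
    return order

def sufferage(ct):
    pending, ready, order = set(range(len(ct))), [0.0] * len(ct[0]), []
    while pending:
        def suff(t):
            # Gap between best and second-best completion times.
            times = sorted(ready[h] + ct[t][h] for h in range(len(ready)))
            return times[1] - times[0]
        best = max(pending, key=suff)   # schedule the biggest "sufferer" first
        host = min(range(len(ready)), key=lambda h: ready[h] + ct[best][h])
        ready[host] += ct[best][host]
        order.append((best, host))
        pending.remove(best)
    return order

ct = [[3, 5], [2, 9], [4, 4]]           # 3 tasks x 2 hosts
print(min_min(ct), sufferage(ct))
```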
24
Preliminary PST/MCell Results
Comparison of the performance of scheduling heuristics when it is up to 40 times more expensive to send a shared file across the network than it is to compute a task
The "extended sufferage" scheduling heuristic takes advantage of file sharing to achieve good application performance
[Chart comparing Max-min, Workqueue, XSufferage, Sufferage, and Min-min]
25
Preliminary PST/MCell Results with “Quality of Information”
26
AppLeS in Context
Application scheduling: Mars, Prophet/Gallop, MSHN, etc.
Grid programming environments: GrADS, Programmers' Playground, VDCE, Nile, etc.
Scheduling services: Globus GRAM, Legion Scheduler/Collection/Enactor
Resource and high-throughput schedulers: PBS, LSF, Maui Scheduler, Condor, etc.
PSEs: Nimrod, NEOS, NetSolve, Ninf
Performance monitoring, prediction, and steering: Autopilot, SciRun, Network Weather Service, Remos/Remulac, Cumulus
The AppLeS project contributes a careful and useful study of adaptive scheduling for Grid environments
27
New Directions
Quality of Information
–adaptive scheduling
–AppLePilot / GrADS
Resource Economies
–Bushel of AppLeS
–UCSD Active Web
Application Flexibility
–computational steering
–co-allocation
–target-less computing
28
Quality of Information
How can we deal with imperfect or imprecise predictive information?
Quantitative measures of qualitative performance attributes can improve scheduling and execution
–lifetime
–cost, overhead
–accuracy
–penalty
[Chart: prediction quality attributes illustrated over time]
29
Using Quality of Information
Stochastic scheduling: information about the variability of the target resources can be used by the scheduler to determine allocation
–resources with more performance variability are assigned slightly less work
–preliminary experiments show that the resulting schedule performs well and can be more predictable
[Chart: SOR experiment; Jenny Schopf]
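A toy sketch of the idea: discount each host's capacity by its predicted variability before dividing the work. The conservatism factor k and the host numbers are invented knobs for illustration, not the actual stochastic scheduling model:

```python
def stochastic_shares(mean_avail, stddev, total_work, k=1.0):
    # Penalize each host's effective capacity by its variability, so
    # less predictable hosts receive slightly less work.
    eff = [m / (1.0 + k * s) for m, s in zip(mean_avail, stddev)]
    scale = total_work / sum(eff)
    return [e * scale for e in eff]

# Two hosts with equal average availability; the second is far less stable:
print(stochastic_shares(mean_avail=[0.8, 0.8],
                        stddev=[0.05, 0.4],
                        total_work=100))
```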
30
Current "Quality of Information" Projects
AppLePilot
–combines the AppLeS adaptive scheduling methodology with the "fuzzy logic" decision-making mechanism from Autopilot
–provides a framework in which to negotiate Grid services and promote application performance
–collaboration with Reed, Aydt, Wolski
–builds on the software being developed for GrADS
31
GrADS: Building an Adaptive Grid Application Development and Execution Environment
System being developed as a performance economy
–performance contracts used to exchange information, bind resources, and renegotiate the execution environment
Joint project with a large team of researchers: Ken Kennedy, Andrew Chien, Rich Wolski, Ian Foster, Carl Kesselman, Jack Dongarra, Dennis Gannon, Dan Reed, Lennart Johnsson, Fran Berman
[Diagram: Grid Application Development System - a PSE and whole-program compiler turn the source application and libraries into a configurable object program; a scheduler/service negotiator, real-time performance monitor, and dynamic optimizer interact with the Grid runtime system (Globus) through negotiation and performance feedback]
32
In Summary...
Development of the AppLeS methodology, applications, templates, and models provides a careful investigation of adaptivity for emerging Grid environments
Goal of current projects is to use real-world strategies to promote performance
–dynamic scheduling
–qualitative and quantitative modeling
–multi-agent environments
–resource economies
33
AppLeS Corps:
–Fran Berman, UCSD
–Rich Wolski, U. Tenn.
–Jeff Brown
–Henri Casanova
–Walfredo Cirne
–Holly Dail
–Marcio Faerman
–Jim Hayes
–Graziano Obertelli
–Gary Shao
–Otto Sievert
–Shava Smallen
–Alan Su
Thanks to NSF, NASA, NPACI, DARPA, DoD
AppLeS Home Page: http://apples.ucsd.edu
35
Resource Economies
All components want to achieve performance on the Grid
–competing performance goals
–wide variation in "tolerance"
–any approach must promote system stability
Resource economies = market-based pricing schemes for Grid applications and resources
–currently a very active and important area of investigation
–How many currencies? Time vs. money; memory vs. BW vs. compute cycles
–What is the role of diversity?
–What is the right Grid economic system? (Capitalism, socialism, …?)
36
A "Bushel of AppLeS"
AppLeS do a good job of achieving performance for individual applications
What if multiple applications have AppLeS schedulers?
–How much performance can any individual application gain?
–Will the activity of multiple AppLeS destabilize the system?
–What mechanisms should the system have in place to support the activities of multiple AppLeS?
37
Preliminary "Bushel of AppLeS" Experiments
Replaced different numbers of jobs (NAS benchmarks) with Supercomputer AppLeS (SA)
Ran all jobs on a simulation of a batch-scheduled MPP with conservative backfilling
Workload taken from KTH (Swedish Royal Institute of Technology)
General observation: the more jobs use adaptive scheduling, the better the system runs, but the harder it is for individual applications to see dramatic improvements in performance
38
Distribution for 50% Replacement in the KTH Workload
Improvement Factor =
–−turnaround(AppLeS) / turnaround(no AppLeS), if turnaround(AppLeS) >= turnaround(no AppLeS)
–turnaround(no AppLeS) / turnaround(AppLeS), otherwise
39
Distributed Data Application Template
Represents a class of applications/tools which transmit/process remote data/images for clients
Examples:
–JPL/SDSC's SARA application
–Microsoft's TerraServer
–SDSC's SRB...
[Diagram: client, compute servers, and data servers]
Move the computation or move the data? Which compute servers to use? Which servers to use for multiple files?
40
Distributed Data Application Example: SARA
SARA = Synthetic Aperture Radar Atlas
–application developed at JPL and SDSC
Goal: assemble/process files for the user's desired image
–radar data is organized into tracks
–the user selects the track of interest and the properties to be highlighted
–raw data is filtered and converted to an image format
–the image is displayed in a web browser
41
Simple SARA
Simple performance model; experimental setup:
–data for the image is accessed over shared networks
–data sets of 1.4-3 megabytes, representative of SARA file sizes
Servers used for experiments (via vBNS and via the general Internet):
–lolland.cc.gatech.edu
–sitar.cs.uiuc
–perigee.chpc.utah.edu
–mead2.uwashington.edu
–spin.cacr.caltech.edu
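A minimal sketch of this kind of server selection, predicting each server's transfer time as latency plus file size over forecast bandwidth and fetching from the fastest. The bandwidth and latency numbers are invented; in the experiments such forecasts would come from the NWS:

```python
def pick_server(size_mb, servers):
    # Predicted transfer time: latency + (size in megabits) / bandwidth.
    def predicted_time(s):
        return s["latency_s"] + size_mb * 8.0 / s["bandwidth_mbps"]
    return min(servers, key=predicted_time)

servers = [
    {"host": "lolland.cc.gatech.edu", "bandwidth_mbps": 5.0, "latency_s": 0.05},
    {"host": "perigee.chpc.utah.edu", "bandwidth_mbps": 2.0, "latency_s": 0.02},
]
print(pick_server(3.0, servers)["host"])
```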
42
Preliminary Results
AppLeS developed by Alan Su
Experiment with the larger data set (3 Mbytes)
During this time frame, the general Internet mostly delivered data faster than the vBNS
43
Supercomputing '99
Preliminary results: from the Portland SC'99 floor, UCSD and UTK were generally "closer" than the Oregon Graduate Institute (OGI) in Portland
44
Co-allocation
Space-shared supercomputers: dedicated access to nodes provides good performance, but cycles may be hard to get
Time-shared clusters: cycles are easy to get, but the computers are time-shared, so performance may be slow
Adaptive approach: leverage whatever is available, i.e., immediately available supercomputer cycles plus free workstation cycles
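A toy sketch of the co-allocation idea: pool whatever supercomputer nodes are free right now with time-shared workstations and split the work by effective speed. All names and numbers are invented for illustration:

```python
def coallocate(total_work, sp_nodes_free, sp_node_speed, workstations):
    # Effective speed of a time-shared workstation = raw speed times the
    # fraction of CPU it is predicted to deliver (e.g., from NWS).
    pools = [("sp2", sp_nodes_free * sp_node_speed)]
    pools += [(w["name"], w["speed"] * w["cpu_avail"]) for w in workstations]
    total_speed = sum(speed for _, speed in pools)
    return {name: total_work * speed / total_speed for name, speed in pools}

ws = [{"name": "ws1", "speed": 1.0, "cpu_avail": 0.6},
      {"name": "ws2", "speed": 1.0, "cpu_avail": 0.3}]
print(coallocate(total_work=100.0, sp_nodes_free=4, sp_node_speed=2.0,
                 workstations=ws))
```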
45
Parallel Tomography (GTOMO)
Master/slave (driver/ptomo) application reconstructs a 3D image from a series of 2D projections
GTOMO AppLeS developed by Shava Smallen, Walfredo Cirne, Jaime Frey
–Globus provides the underlying Grid infrastructure
–production AppLeS used daily by neuroscience researchers
[Diagram: source feeds a pre-processor and reader; the driver distributes work to ptomo slaves; a writer assembles the destination output]
46
GTOMO Experiments
Experimental design:
–ran on a workstation cluster at UCSD and on the SP2 at SDSC
–shared resources with other users
–ran back-to-back experiments for each set of scheduling strategies, with problem sizes provided by users
Scheduling strategies:
–SP2Immed/WS (co-allocation)
–SP2Immed
–WS only
–SP2Queue
47
Better Toast
On the electric power grid, power is either adequate or it's not
–on the Computational Grid, application performance depends on the deliverable performance of the underlying resources
Major Grid research and development thrusts:
–developing performance-oriented Grid applications and software
–developing and standardizing Grid infrastructure
[Diagram: application environment on top of Grid infrastructure on top of resources]
48
Grid Computing
Computing on the Grid can be even more challenging than computing on MPPs
–most resources cannot be dedicated; space sharing/resource reservation is only a small part of the picture
–programs must achieve performance in the context of heterogeneity, dynamism, and multiple administrative domains
–the target platform is never stable
49
Scheduling for Performance
Experience with parallel and distributed codes shows that careful coordination of tasks and data (i.e., scheduling) is required to achieve application performance
On the Grid, scheduling is particularly challenging
–the scheduling model is messy
–no centralized scheduler controls all resources
–resource and job schedulers prioritize utilization or throughput over application performance
–applications are on their own
50
Adaptivity
Adaptivity is a fundamental paradigm for scheduling on the Grid
–heterogeneity of resources and dynamic load variations cause the performance characteristics of the platform to vary over time and with load
–applications must adapt to deliverable resource capacities to achieve performance
[Chart: file transfer time to SC'99 in Portland from UCSD, UTK, and OGI]
51
AppLeS Example: PMHD3D
PMHD3D is a 3D magnetohydrodynamics code used as a Legion prototype application
–regular, iterative SPMD stencil application
–column decomposition used to partition data into "slabs"
–performance goal is minimum execution time
AppLeS developed by Graziano Obertelli and Holly Dail
52
PMHD3D AppLeS Resource Selection
Feasible resources determined according to application-specific "desirability" metrics
–computation power
–connectivity
–hybrid
Feasible resource sets formed from initial subsets of each sorted desirability list
The scheduler chooses the schedule with the best predicted execution time from all 3 sorted lists
53
PMHD3D Performance Model
Execution time for the ith slab:
T_i = Over + Slab_i × comp_i / load_i + comm_i
where
Over = perceived overhead in the Legion system (curve-fitting)
load_i = predicted percentage of CPU time available (NWS)
comm_i = time to send and receive messages, factored by predicted BW and latency (NWS)
55
Schedule Planning and Performance Estimation
AppLeS uses time-balancing to determine the best partition on a given set of resources
Performance model:
–solve for {Area_i} with T_1 = T_2 = … = T_p, where T_i = Area_i × comp_i / load_i + comm_i
–load_i = predicted percentage of CPU time available on resource i, and comm_i = time to send and receive messages factored by predicted BW (dynamic predictions provided by NWS)
56
Determining Slab Heights
If system constraints minimally impact application performance, then slab heights can be determined as the solution to a linear system of time-balancing equations
If system constraints considerably impact application performance, slab heights can be determined using the simplex method (see the sketch below)
–example constraints: memory constraints; time-balancing (each processor's time as close as possible); compiler constraints; sum of slab heights (can't exceed N)
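A sketch of the constrained case posed as a small linear program, solved here with SciPy's LP solver: minimize the makespan t with per-slab time a_i·h_i + b_i ≤ t, the heights summing to N, and per-host memory caps. The coefficients and caps are invented, and slab heights are left continuous for simplicity:

```python
import numpy as np
from scipy.optimize import linprog

def slab_heights(a, b, N, mem_cap):
    p = len(a)
    c = np.zeros(p + 1)
    c[p] = 1.0                                # minimize t (last variable)
    A_ub = np.zeros((p, p + 1))
    b_ub = np.zeros(p)
    for i in range(p):                        # a_i*h_i - t <= -b_i
        A_ub[i, i] = a[i]
        A_ub[i, p] = -1.0
        b_ub[i] = -b[i]
    A_eq = np.zeros((1, p + 1))
    A_eq[0, :p] = 1.0                         # sum(h_i) = N
    bounds = [(0, cap) for cap in mem_cap] + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[N], bounds=bounds)
    return res.x[:p], res.x[p]                # slab heights, makespan t

h, t = slab_heights(a=[1.0, 2.0, 4.0], b=[0.5, 0.5, 0.5],
                    N=300, mem_cap=[200, 200, 200])
print(h, t)
```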
57
Jacobi2D Strip Partitioning
Solving for {Area} gives strip sizes based on predicted deliverable resource performance at execution time
[Diagram: a grid partitioned into strips P1, P2, P3 of different widths]
58
Preliminary PMHD3D Experiments
First set of experiments investigated
–the accuracy of the performance model (dedicated runs with and without Legion)
–the effectiveness of the resource selection approach (non-dedicated runs with Legion)
A "black box" overhead term was used to represent evolving Legion system overheads
–determined by curve-fitting an initial set of experiments
[Chart: performance model vs. measured execution times]
59
Preliminary "Smart User" Experiments
Additional set of experiments compared
–the AppLeS adaptive scheduling approach
–the Legion default scheduler
–a "smart user" (who tries to choose an optimal number of processors)
Experiments conducted on the Legion Centurion cluster with ambient load
60
Jacobi2D Experiments
Adaptive scheduling enables the application to respond to dynamic events