Presentation is loading. Please wait.

Presentation is loading. Please wait.

Achieving Application Performance on the Computational Grid Francine Berman U. C. San Diego and NPACI This presentation will probably involve audience.

Similar presentations


Presentation on theme: "Achieving Application Performance on the Computational Grid Francine Berman U. C. San Diego and NPACI This presentation will probably involve audience."— Presentation transcript:

1 Achieving Application Performance on the Computational Grid Francine Berman U. C. San Diego and NPACI This presentation will probably involve audience discussion, which will create action items. Use PowerPoint to keep track of these action items during your presentation In Slide Show, click on the right mouse button Select “Meeting Minder” Select the “Action Items” tab Type in action items as they come up Click OK to dismiss this box This will automatically create an Action Item slide at the end of your presentation with your points entered.

2 The Computing Landscape data archives networks visualization instruments MPPs clusters PCs Workstations Wireless

3 Computing Platforms Combining resources “in the box” –focus is on new hardware Combining resources as a “virtual box” –focus is on software infrastructure

4 The Computational Grid Computational Grid = ensemble of distributed and heterogeneous resources Metaphor: Electric Power Grid –for users, power is ubiquitous –you can plug in anywhere –you don’t need to know where the power is coming from

5 Better Toast On the electric power grid, power is either adequate or it’s not –On the computational grid, application performance depends on the underlying system state Major Grid research and development thrusts: –Building the Grid –Programming the Grid

6 Programming for Performance Performance Paradigm: To achieve performance, applications must be designed and implemented to leverage the performance characteristics of the underlying resources. Performance Characteristics of the Grid Resources are distributed, heterogeneous Resources shared by multiple users Resource performance may be hard to predict

7 How Can Applications Achieve Performance on the Grid? Build programs to be grid-aware Leverage deliverable resource performance during execution –Scheduling is fundamental Key Grid scheduling components –dynamic information –quantitative and qualitative predictions –adaptivity

8 Achieving Application Performance Many entities will schedule the application PSEPSE Config. object program whole program compiler Source appli- cation libraries Realtime perf monitor Dynamic optimizer Grid runtime system negotiation Software components Service negotiator Scheduler Performance feedback Perf problem Grid Application Development System

9 Application Scheduling Application schedulers must –perceive the performance impact of system resources on the application –adapt application execution schedule to dynamic conditions –optimize application schedule for Grid according to the user’s performance criteria Application scheduler tasked with promoting application performance over the performance of other applications and system components

10 Self-Centered Scheduling: Everything in the system is evaluated in terms of its impact on the application. performance of each system component can be considered as a measurable quantity forecasts of quantities relevant to the application can be manipulated to determine schedule This simple paradigm forms the basis for AppLeS. Paradigm for Application Scheduling

11 AppLeS AppLeS = Application-Level Scheduler –agent-based approach –each application integrated with its own AppLeS –each AppLeS develops and implements a custom application schedule NWS (Wolski) User Prefs App Perf Model Planner Resource Selector Application Act. Grid/cluster resources/ infrastructure Joint project with Rich Wolski at U. Tenn

12 AppLeS Approach Select resources For each feasible resource set, plan a schedule –For each schedule, predict application performance at execution time –consider both the prediction and its qualitative attributes Implement the “best” of the schedules wrt user’s performance criteria –execution time –convergence –turnaround time NWS (Wolski) User Prefs App Perf Model Planner Resource Selector Application Act. Grid/cluster resources/ infrastructure

13 Network Weather Service (Wolski, U. Tenn.) The NWS provides dynamic resource information for AppLeS NWS is stand-alone system NWS –monitors current system state –provides best forecast of resource load from multiple models Sensor Interface Reporting Interface Forecaster Model

14 The Role of Prediction Is monitoring enough for scheduling?

15 Monitoring vs. Prediction Monitored data provides a snapshot of what has happened, forecasting tells us: what will happen... Last value is not always the best predictor Monitored data

16 Using Forecasting in Scheduling How much work should each processor be given? Jacobi2D AppLeS solves equations for Area: P1 P2P3

17 Good Predictions Promote Good Schedules Jacobi2D experiments

18 SARA: An AppLeS-in-Progress SARA = Synthetic Aperture Radar Atlas –application developed at JPL and SDSC Goal: Assemble/process files for user’s desired image –thumbnail image shown to user –user selects desired bounding box for more detailed viewing –SARA provides detailed image in variety of formats

19 Simple SARA AppLeS focuses on resource selection problem: Which site can deliver data the fastest? Goal is to optimize performance by minimizing transfer time Code developed by Alan Su Compute Server Data Server Data Server Data Server Computation servers and data servers are logical entities, not necessarily different nodes Network shared by variable number of users Computation assumed to be done at compute servers

20 Experimental Setup Data for image accessed over shared networks Data sets 1.4 - 3 megabytes, representative of SARA file sizes Servers used for experiments –lolland.cc.gatech.edu –sitar.cs.uiuc –perigee.chpc.utah.edu –mead2.uwashington.edu –spin.cacr.caltech.edu via vBNS via general Internet

21 Which is “Closer”? Sites on the east coast or sites on the west coast? Sites on the vBNS or sites on the general Internet? Consistently the same site or different sites at different times?

22 Which is “Closer”? Sites on the east coast or sites on the west coast? Sites on the vBNS or sites on the general Internet? Consistently the same site or different sites at different times? Depends a lot on traffic...

23 Simple SARA Experiments Ran back-to-back experiments from remote sites to UCSD/PCL Wolski’s Network Weather Service provides forecasts of network load and availability Experiments run during normal business hours mid-week

24 Preliminary Results Experiment with larger data set (3 Mbytes) During this time-frame, general Internet provides data mostly faster than vBNS

25 Experiment with smaller data set (1.4 Mbytes) During this time frame, east coast sites provide data mostly faster than west coast sites More Preliminary Results

26 9/21/98 Experiments Clinton Grand Jury webcast commenced at trial 62

27 What if File Sizes are Larger? Storage Resource Broker (SRB) SRB provides access to distributed, heterogeneous storage systems –UNIX, HPSS, DB2, Oracle,.. –files can be 16MB or larger –resources accessed via a common SRB interface

28 An SRB AppLeS SRB Client Network Weather Service SRB Server MCAT Distributed Physical Storage AppLeS Network Being developed by Marcio Faerman Like Simple SARA, SRB focuses on resource selection NWS probe is 64K, SRB file size is 16MB How to predict SRB file transfer time?

29 Predicting Large File Transfer Times NWS and SRB present distinct behaviors Dec 3 Dec 11 Current approach: Use linear regression on NWS bandwidth measurements to track SRB behavior

30 Distributed Data Applications SARA and SRB representative of larger class of distributed data applications Simple SARA template being extended to accommodate –replicated data sources –multiple files per image –parallel data acquisition –intermediate compute sites –web interface, etc.

31 Distributed Data Applications... Compute Servers Data Servers Client Move the computation or move the data? Which compute servers to use? Which servers to use for multiple files? Simple SARA and SRB representative of a larger class of distributed data applications Goal is to develop AppLeS scheduler for “end-to-end” applications

32 A Bushel of AppLeS … almost During the first “phase” of the project, we’ve focused on developing AppLeS applications –Jacobi2D –DOT –SRB –Simple SARA –magnetohydrodynamics –CompLib –INS2D –Tomography,... What have we learned?

33 Lessons Learned From AppLeS Compile-time Blocked Partitioning Run-time AppLeS Non- Uniform Strip Partitioning Dynamic information is critical. Jacobi2D

34 Lessons Learned from AppLeS Program execution and parameters may exhibit a range of performance

35 Lessons Learned from AppLeS Knowing something about the “goodness” of performance predictions can improve scheduling SORCompLib

36 Lessons Learned from AppLeS Performance of application sensitive to scheduling policy, data, and system characteristics

37 Achieving Application Performance on the Grid AppLeS uses adaptivity to leverage deliverable resource performance Performance impact of all components considered AppLeS agents target dynamic, multi-user distributed environments AppLeS is leading project in application scheduling

38 Related Work Application Schedulers –Mars, Prophet/Gallop, VDCE,... Scheduling Services –Globus GRAM Resource Allocators –I-Soft, PBS, LSF, Maui Scheduler, Nile, Legion PSEs –Nimrod, NEOS, NetSolve, Ninf High-Throughput Schedulers –Condor Performance Steering –Autopilot, SciRun

39 New Directions AppLeS Templates –distributed data applications –parameter sweeps –master/slave applications –data parallel stencil applications AppLeS Template Retargeting Engineering Environment Application Module Performance Module Scheduling Module Deployment Module API Network Weather Service dynamic benchmarking suite selection

40 New Directions Expanding AppLeS target execution sites –interactive clusters linux, NT –Globus, Legion –batch systems –high-throughput clusters (Condor) –all of the above SCHED AppLeS

41 New Directions Real World Scheduling scheduling with –partial information –poor information –dynamically changing information Multischeduling resource economies scheduling “social structure” X

42 The Brave New World Design, development, and execution of grid-aware applications PSEPSE Config. object program whole program compiler Source appli- cation libraries Realtime perf monitor Dynamic optimizer Grid runtime system negotiation Software components Service negotiator Scheduler Performance feedback Perf problem Grid Application Development System

43 The AppLeS Project AppLeS Corps: –Fran Berman, UCSD –Rich Wolski, U. Tenn –Henri Casanova –Walfredo Cirne –Marcio Faerman –Jaime Frey –Jim Hayes –Graziano Obertelli –Gary Shao –Shava Smallen –Alan Su –Dmitrii Zagorodnov Thanks to NSF, NASA, NPACI, DARPA, DoD AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html


Download ppt "Achieving Application Performance on the Computational Grid Francine Berman U. C. San Diego and NPACI This presentation will probably involve audience."

Similar presentations


Ads by Google