Why Performance Models Matter for Grid Computing
Ken Kennedy
Center for High Performance Software, Rice University
http://vgrads.rice.edu/presentations/grids06

Vision: Global Distributed Problem Solving
Where We Want To Be: transparent Grid computing
- Submit job
- Find & schedule resources
- Execute efficiently
Where We Are: some simple (but useful) tools
- Programmer must manage: heterogeneous resources, scheduling of computation and data movement, fault tolerance and performance adaptation
What Do We Propose as a Solution?
- Separate application development from resource management, through an abstraction called the Virtual Grid
- Provide tools to bridge the gap between conventional and Grid computation: scheduling, resource management, distributed launch, simple programming models, fault tolerance, grid economies
[Figure: databases and supercomputers as distributed Grid resources]
Speaker notes: The intent of this slide is to define the problem. Key points: we want to enable a researcher sitting in Houston to use significant resources at other sites (e.g., use supercomputers to correlate huge genome databases). This is too hard today; it requires detailed systems programming and rare expertise, and is not feasible for the discipline scientist (unless she has a collaboration with the CS department). VGrADS is raising this level of abstraction.

The VGrADS Team
VGrADS is an NSF-funded Information Technology Research project.
Team: Keith Cooper, Ken Kennedy, Charles Koelbel, Richard Tapia, Linda Torczon, Dan Reed, Jack Dongarra, Lennart Johnsson, Carl Kesselman, Rich Wolski, Fran Berman
Plus many graduate students, postdocs, and technical staff!
Speaker notes: Can also use this slide to thank our program managers (Frederica Darema, Kamal Abdali, Mike Foster). Students involved in demos at SC: Wahid Chrabakh (GridSAT, UCSB), Anirban Mandal (EMAN, Rice), Bo Lu (EMAN, UH).

VGrADS Project Vision
Virtual Grid Abstraction
- Separation of concerns: resource location, reservation, and management vs. application development and resource requirement specification
- Permits true scalability and control of resources
Tools for Application Development
- Easy application scheduling, launch, and management
- Off-line scheduling of workflow graphs
- Automatic construction of performance models
- Abstract programming interfaces
Support for Fault Tolerance and Reliability
- Collaboration between application and virtual grid execution system
Research Driven by Real Application Needs
- EMAN, LEAD, GridSAT, Montage

VGrADS Application Collaborations
- EMAN: Electron Micrograph Analysis
- GridSAT: Boolean Satisfiability
- LEAD: Atmospheric Science
- Montage: Astronomy
[Figure: LEAD workflow architecture. Components shown: BPEL Workflow Engine, LDM Service, GridFTP Service, WRF Service, Ensemble Broker, Rice Scheduler, vgES resource information, Data Mining, and Visualization; the workflow has a static part and a dynamic part triggered when data arrives.]

VGrADS Big Ideas
Virtualization of Resources
- Application specifies required resources in the Virtual Grid Definition Language (vgDL)
  - "Give me a loose bag of 1000 processors, with 1 GB memory per processor, with the fastest possible processors"
  - "Give me a tight bag of as many Opterons as possible"
- The Virtual Grid Execution System (vgES) produces a specific virtual grid matching the specification
- Avoids the need to schedule against the entire space of global resources
Generic In-Advance Scheduling of Application Workflows
- Application includes performance models for all workflow nodes (performance models automatically constructed)
- Software schedules applications onto the virtual Grid, minimizing total makespan, including both computation and data movement times
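To make the in-advance scheduling idea concrete, here is a minimal sketch of offline list scheduling of a workflow DAG onto a small virtual grid, using per-node performance-model estimates and a simple bandwidth model for data movement. This is illustrative only, not the VGrADS scheduler or its heuristics; the workflow, timing numbers, bandwidth, and resource classes are all assumptions.

```python
# Minimal sketch (not the actual VGrADS scheduler): offline list scheduling of a
# workflow DAG, using per-node performance models for compute time and a simple
# bandwidth model for data movement. All names and numbers are illustrative.

workflow = {            # task -> (predecessors, output data in MB)
    "preproc":   ([], 100),
    "analyze_a": (["preproc"], 50),
    "analyze_b": (["preproc"], 50),
    "combine":   (["analyze_a", "analyze_b"], 10),
}

# Performance model: estimated run time (seconds) of each task on each resource class.
perf_model = {
    ("preproc", "opteron"): 120, ("preproc", "itanium"): 360,
    ("analyze_a", "opteron"): 300, ("analyze_a", "itanium"): 900,
    ("analyze_b", "opteron"): 300, ("analyze_b", "itanium"): 900,
    ("combine", "opteron"): 60,  ("combine", "itanium"): 180,
}
bandwidth_mb_s = 10.0                              # assumed inter-cluster bandwidth

resources = {"opteron": 2, "itanium": 4}           # virtual grid returned by vgES
free_at = {(r, i): 0.0 for r in resources for i in range(resources[r])}

finish, placement = {}, {}
for task in workflow:                              # dict order is a valid topological order here
    preds, _ = workflow[task]
    best = None
    for slot in free_at:
        rclass, _ = slot
        # Earliest start: resource free AND all inputs transferred (no transfer if co-located).
        ready = max([0.0] + [finish[p] +
                    (0.0 if placement[p][0] == rclass else workflow[p][1] / bandwidth_mb_s)
                    for p in preds])
        start = max(ready, free_at[slot])
        end = start + perf_model[(task, rclass)]
        if best is None or end < best[0]:
            best = (end, slot)
    finish[task], placement[task] = best[0], best[1]
    free_at[best[1]] = best[0]

print("makespan:", max(finish.values()), "placement:", placement)
```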

Workflow Scheduling Results
- Dramatic makespan reduction of offline scheduling over online scheduling (application: Montage)
- Value of performance models and heuristics for offline scheduling (application: EMAN)
References:
- "Scheduling Strategies for Mapping Application Workflows onto the Grid," HPDC'05
- "Resource Allocation Strategies for Workflows in Grids," CCGrid'05

Virtual Grid Results
- Virtual Grid prescreening of resources produces schedules dramatically faster without significant loss of schedule quality
  - Huang, Casanova, and Chien. "Using Virtual Grids to Simplify Application Scheduling." IPDPS 2006.
  - Zhang, Mandal, Casanova, Chien, Kee, Kennedy, and Koelbel. "Scalable Grid Application Scheduling via Decoupled Resource Selection and Scheduling." CCGrid 2006.
- Current VGrADS heuristics are quadratic in the number of distinct resource classes; the two-phase approach brings this much closer to linear time

Performance Model Construction
Problem: performance models are difficult to construct
- Building models by hand takes a great deal of time
- Accuracy is often poor
Solution (Mellor-Crummey and Marin): construct performance models automatically
- From the application binary and execution profiles on a single resource
- Generate a distinct model for each target resource
Current results
- Uniprocessor modeling (can be extended to parallel MPI steps)
- Memory hierarchy behavior
- Models for instruction mix
- Application-specific models
- Scheduling using delays provided as machine specification

Performance Prediction Overview
[Figure: tool-chain diagram. Static analysis of the object code recovers the control flow graph, loop nesting structure, and basic-block instruction mix; a binary instrumenter produces instrumented code whose execution yields basic-block counts, memory reuse distance, and communication volume & frequency. A post-processing tool combines these into an architecture-neutral model, which, together with an architecture description, yields a performance prediction for the target architecture that is passed to the scheduler.]
Speaker notes: Our approach aims at building parameterized models of black-box applications in a semi-automatic way, in terms of architecture-neutral application signatures. We build models using information from both static and dynamic analysis of an application's binary. We use static analysis to construct the control flow graph of each routine in an application and to look at the instruction mix inside the most frequently executed loops. We use dynamic analysis to collect data about the execution frequency of basic blocks, information about synchronization among processes, and reuse distance of memory accesses. By looking at application binaries instead of source code, we are able to build language-independent tools, and we can naturally analyze applications that have modules written in different languages or are linked with third-party libraries. By analyzing binaries, the tool can also be useful to both application writers and compiler developers by enabling them to validate the effectiveness of their optimizations.
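The following is a rough sketch of how an architecture-neutral application signature (instruction mix plus a reuse-distance summary) could be combined with an architecture description to predict execution time on different targets. It is illustrative only, not the Mellor-Crummey/Marin tool chain; the signature format, architecture parameters, and all numbers are assumptions.

```python
# Minimal sketch of architecture-neutral performance prediction (illustrative only).

# Architecture-neutral signature, gathered once from static + dynamic binary analysis.
app_signature = {
    "instruction_mix": {"flop": 4.0e9, "int": 1.0e9, "load_store": 2.5e9},
    # (reuse distance in cache lines, fraction of memory accesses at that distance)
    "reuse_histogram": [(32, 0.70), (4096, 0.20), (10**6, 0.10)],
}

# Per-target architecture descriptions: cycles per op, clock, memory levels
# as (capacity in cache lines, access latency in cycles).
architectures = {
    "opteron": {"cpi": {"flop": 0.5, "int": 0.3, "load_store": 0.3},
                "clock_hz": 2.2e9,
                "mem_levels": [(512, 3), (16384, 14), (float("inf"), 250)]},
    "itanium": {"cpi": {"flop": 0.7, "int": 0.5, "load_store": 0.4},
                "clock_hz": 1.5e9,
                "mem_levels": [(256, 2), (8192, 12), (float("inf"), 300)]},
}

def predict_seconds(sig, arch):
    # Instruction cost from the mix and per-op CPI.
    cycles = sum(count * arch["cpi"][op] for op, count in sig["instruction_mix"].items())
    # Memory cost: map each reuse-distance bin to the smallest level that can hold it.
    loads = sig["instruction_mix"]["load_store"]
    for distance, fraction in sig["reuse_histogram"]:
        latency = next(lat for cap, lat in arch["mem_levels"] if distance <= cap)
        cycles += loads * fraction * latency
    return cycles / arch["clock_hz"]

for name, arch in architectures.items():
    print(name, round(predict_seconds(app_signature, arch), 1), "s")
```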

Modeling Memory Reuse Distance
Speaker notes for the figure:
- The plot is a reuse-distance histogram for a single program reference (normalized frequency vs. reuse distance)
- Measured data is shown first, then the model
- The data shows surprisingly complex behavior: at least three qualitatively different types of growth, namely constant, slowly growing (likely linear), and likely quadratic
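As a hedged illustration of the modeling step, the sketch below selects among constant, linear, and quadratic growth models for one reference's reuse distance as a function of problem size, by least-squares fit over training runs. The sample data and the model-selection penalty are made up; this is not the actual VGrADS modeling code.

```python
# Illustrative sketch: choose whether a reference's reuse distance grows as a
# constant, linear, or quadratic function of problem size (made-up data).
import numpy as np

problem_sizes = np.array([16, 32, 64, 128, 256], dtype=float)
reuse_distance = np.array([130, 510, 2050, 8190, 32770], dtype=float)  # roughly quadratic

best = None
for degree, label in [(0, "constant"), (1, "linear"), (2, "quadratic")]:
    coeffs = np.polyfit(problem_sizes, reuse_distance, degree)
    residual = np.sum((np.polyval(coeffs, problem_sizes) - reuse_distance) ** 2)
    # Penalize extra parameters slightly so a simpler fit wins when it is adequate.
    score = residual * (1.0 + 0.01 * degree)
    if best is None or score < best[0]:
        best = (score, label, coeffs)

_, label, coeffs = best
print(f"best model: {label}, coefficients: {np.round(coeffs, 3)}")
# The fitted polynomial can then be evaluated at a production problem size to
# predict reuse distance, and hence memory hierarchy behavior, on a target machine.
```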

Execution Behavior: NAS LU 2D 3.0

Value of Performance Models
True Grid Economy
- All resources have a "rate of exchange"
  - Example: 1 CPU unit of Opteron = 2 units of Itanium (at the same GHz)
- Rate of exchange determined by averaging over all applications, weighted by each application's global resource usage percentage
Improved Global Resource Utilization
- A particular application may be able to take advantage of performance models to reduce costs
  - Example: for EMAN, 1 Opteron unit = 3 Itanium units, so EMAN should always favor Opterons when available
- If all applications do this, total system resources will be used far more efficiently
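The exchange-rate arithmetic can be shown in a few lines; the application names, usage shares, and prices below are assumptions, not measured values.

```python
# Small sketch of the "grid economy" arithmetic described above (illustrative numbers).

# Per-application exchange rates: how many Itanium CPU-units equal one Opteron
# CPU-unit for that application, as given by its performance model.
per_app_rate = {"EMAN": 3.0, "Montage": 1.5, "GridSAT": 1.8}

# Share of total grid resource usage attributed to each application.
usage_share = {"EMAN": 0.2, "Montage": 0.5, "GridSAT": 0.3}

# Global rate of exchange: usage-weighted average over all applications.
global_rate = sum(per_app_rate[a] * usage_share[a] for a in per_app_rate)
print(f"global rate: 1 Opteron unit = {global_rate:.2f} Itanium units")

# An individual application compares its own rate to posted prices and buys the
# cheaper resource: EMAN gets 3x the work per Opteron unit, so unless Opterons
# cost more than 3x as much, it should favor Opterons.
price = {"opteron": 2.0, "itanium": 1.0}      # assumed cost per CPU-unit-hour
cost_per_work = {"opteron": price["opteron"] / per_app_rate["EMAN"],
                 "itanium": price["itanium"] / 1.0}
print("EMAN should run on:", min(cost_per_work, key=cost_per_work.get))
```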

Performance Models and Batch Scheduling
Currently, VGrADS supports scheduling using estimated batch queue waiting times
- Batch queue estimates are factored into communication time
  - E.g., the delay in moving from one resource to another is data movement time + estimated batch queue waiting time
- Unfortunately, estimates can have large standard deviations
Next phase: limiting variability through two strategies
- Resource reservations: partially supported on the TeraGrid and other schedulers
- In-advance queue insertion: submit jobs before data arrives, based on estimates
  - Can be used to simulate advance reservations
  - Exploiting this requires a preliminary schedule indicating when the resources are needed
- Problem: how to build an accurate schedule when exact resource types are unknown
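A tiny sketch of folding batch-queue wait estimates into the inter-resource transition delay described above; the wait times and bandwidths are assumed placeholders standing in for a real wait-time predictor, not any actual VGrADS interface.

```python
# Transition delay = data movement time + expected wait in the destination's
# batch queue (all numbers below are assumptions).

queue_wait_estimate_s = {"teragrid_cluster_a": 900.0, "campus_cluster_b": 120.0}
bandwidth_mb_s = {"teragrid_cluster_a": 50.0, "campus_cluster_b": 10.0}

def transition_delay(dest, data_mb):
    """Delay for a workflow step whose inputs move to `dest`:
    data movement time plus the estimated wait in dest's batch queue."""
    return data_mb / bandwidth_mb_s[dest] + queue_wait_estimate_s[dest]

for dest in queue_wait_estimate_s:
    print(dest, round(transition_delay(dest, data_mb=2000), 1), "s")
```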

Solution to the Preliminary Scheduling Problem
Use performance models to specify alternative resources
- "For step B, I need the equivalent of 200 Opterons, where 1 Opteron = 3 Itanium = 1.3 Power5" (equivalence from the performance model)
This permits an accurate preliminary schedule, because the performance model standardizes the time for each step
- Scheduling can then proceed with accurate estimates of when each resource collection will be needed
- Makes advance reservations more accurate: data will arrive neither egregiously early nor egregiously late
The system may provide a mixture of resources to meet the computational requirements, if the specification permits
- "Give me a loose bag of tight bags containing the equivalent of 200 Opterons; minimize the number of tight bags and the overall cost"
- Solution might be 150 Opterons in one cluster and 150 Itaniums in another (150 + 150/3 = 200 Opteron-equivalents)
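Here is a small sketch of turning a performance-model-standardized request (an Opteron-equivalent count) into a mixed allocation across clusters. The exchange rates and cluster names/sizes are assumptions, and this greedy selection stands in for, but is not, vgDL or the real vgES resource selector.

```python
# Greedy satisfaction of "the equivalent of 200 Opterons" from a mix of clusters.
import math

opteron_equiv_per_node = {"opteron": 1.0, "itanium": 1.0 / 3.0, "power5": 1.0 / 1.3}

# Candidate tight bags (homogeneous clusters): (name, node type, nodes available).
clusters = [("cluster_A", "opteron", 150), ("cluster_B", "itanium", 400),
            ("cluster_C", "power5", 60)]

def satisfy(request_equiv):
    """Pick nodes cluster by cluster until the Opteron-equivalent total is met."""
    chosen, remaining = [], float(request_equiv)
    for name, kind, nodes in clusters:
        if remaining <= 0:
            break
        rate = opteron_equiv_per_node[kind]
        take = min(nodes, math.ceil(remaining / rate))
        chosen.append((name, kind, take))
        remaining -= take * rate
    return chosen, remaining

allocation, shortfall = satisfy(200)
print(allocation)                              # e.g. 150 Opterons + 150 Itaniums
print("unmet Opteron-equivalents:", max(0.0, round(shortfall, 2)))
```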

The LEAD Vision: Adaptive Cyberinfrastructure
[Figure: the LEAD pipeline. Dynamic observations feed analysis/assimilation (quality control, retrieval of unobserved quantities, creation of gridded fields), then prediction/detection on PCs to teraflop systems, then product generation, display, and dissemination to end users (NWS, private companies, students); models and algorithms in turn drive the sensors.]
The CS challenge: build cyberinfrastructure services that provide adaptability, scalability, availability, usability, and real-time response.

Scheduling to a Deadline
The LEAD project has a feedback loop
- After each preliminary workflow, results are used to adjust Doppler radar configurations to get more accurate information in the next phase
- This must be done under a tight time constraint; performance models make this scheduling possible
What happens if the first schedule misses the deadline?
- The time must be reduced, either by doing less computation or by using more resources
- But how many more resources? Suppose we can differentiate the model to compute, for each step, the sensitivity of running time to additional resources
  - Automatic differentiation technologies exist, but other strategies may also make this possible
- Derivatives can be used to predict the resources needed to meet the deadline along the critical path
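The sensitivity idea can be sketched with a finite-difference derivative of an assumed per-step running-time model (an Amdahl-style serial fraction plus parallel work). This is illustrative only; the models, step names, and numbers are not LEAD's or VGrADS's.

```python
# Illustrative sketch: add processors to the step whose running time is most
# sensitive to resources until the predicted critical path meets the deadline.

# Assumed per-step models: serial_seconds + parallel_work_seconds / p.
steps = {"assimilation": (120.0, 3600.0), "forecast": (300.0, 14400.0)}
procs = {name: 64 for name in steps}

def runtime(name, p):
    serial, work = steps[name]
    return serial + work / p

def critical_path():
    return sum(runtime(n, procs[n]) for n in procs)   # steps assumed sequential here

deadline_s = 500.0
while critical_path() > deadline_s:
    # dT/dp by finite difference; pick the step with the steepest improvement.
    best = min(procs, key=lambda n: runtime(n, procs[n] + 1) - runtime(n, procs[n]))
    gain = runtime(best, procs[best]) - runtime(best, procs[best] + 1)
    if gain < 1e-6:
        break                                          # more processors no longer help
    procs[best] += 1

print(procs, "predicted makespan:", round(critical_path(), 1), "s")
```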

LEAD Workflow
[Figure: the LEAD workflow DAG (13 numbered steps). Inputs: terrain data files; NAM, RUC, and GFS data; surface data, upper-air mesonet data, and wind profiler data; level II and level III radar data; satellite data; surface and terrestrial data files. Steps: Terrain Preprocessor; WRF Static Preprocessor (run once per forecast region); 3D Model Data Interpolator for initial and lateral boundary conditions; 88D Radar Remapper; NIDS Radar Remapper; Satellite Data Remapper; ADAS; ARPS-to-WRF Data Interpolator; WRF; WRF-to-ARPS Data Interpolator; ARPS Plotting Program; IDV bundle. Annotations: "Triggered if a storm is detected," "Repeated periodically for new data," "Visualization on user's request."]

Maximizing Robustness
Two sources of robustness problems
- Running time estimates are really probabilistic, with a distribution of values; the critical path might vary if some tasks take longer than expected
- Resources to which steps are assigned might fail
Solution strategies
- Schedule the workflow to minimize the probability of exceeding the deadline
  - Allocate the slack judiciously, using the sensitivity-based strategy described earlier
- Replication driven by reliability history can be used to overcome resource failure to within a specified tolerance
  - More of a less reliable resource might be better
- Note of caution: the tolerance cannot be too small if costs are to be managed
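A speculative Monte Carlo sketch of estimating the probability of missing a deadline under probabilistic step times, with and without replicating one step; every distribution and parameter here is invented for illustration and is not a LEAD or VGrADS measurement.

```python
# Estimate P(makespan > deadline) for a tiny workflow with random step times,
# comparing a single copy of step B against a replicated copy.
import random

def run_once(replicate_b=False):
    a = random.lognormvariate(5.0, 0.3)                  # step A time (seconds)
    b = random.lognormvariate(5.5, 0.5)                  # step B time
    if replicate_b:
        b = min(b, random.lognormvariate(5.5, 0.5))      # redundant copy: take the first finisher
    c = random.lognormvariate(5.3, 0.4)                  # step C runs alongside B
    return a + max(b, c)

def p_miss(deadline, replicate_b, trials=20000):
    random.seed(0)                                       # same stream for a fair comparison
    return sum(run_once(replicate_b) > deadline for _ in range(trials)) / trials

for rep in (False, True):
    print("replicated B:" if rep else "single B:   ", p_miss(deadline=600, replicate_b=rep))
```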

Summary
The VGrADS project uses two mechanisms for scheduling
- Abstract resource specification to produce a preliminary collection of resources
- More precise scheduling on the abstract collection using performance models
- The combined strategy produces excellent schedules in practice
- Performance models can be constructed automatically
New challenges require sophisticated methods
- Minimizing the cost of application execution
- Scheduling with batch queues and advance reservations
- Scheduling to deadlines
- Scheduling for robustness
Current driving application
- LEAD: Linked Environments for Atmospheric Discovery
- Requires deadlines and efficient resource utilization

Relationship to Future Architectures
Future supercomputers will have heterogeneous components
- Cray, Cell, Intel, CPU + coprocessor (GPU), ...
Scheduling subcomputations onto units while managing data movement costs will be the key to effective usage of such systems
- Adapt the VGrADS strategy
- Difference: must consider alternative compilations of the same computation, then model the performance of the result
- Could be used to tailor systems or chips to particular users' application mixes