New Development in the AppLeS Project, or User-Level Middleware for the Grid
Francine Berman, University of California, San Diego



The Evolving Grid
[diagram: Applications mapped directly onto Resources]
In the beginning, there were applications and resources, and it took ninja programmers and many months to implement the applications on the Grid …

The Evolving Grid
[diagram: a Grid Middleware layer now sits between Applications and Resources]
And behold, there were services, and programmers saw that it was good (even though their performance was still often less than desirable) …

The Evolving Grid
[diagram: a User-Level Middleware layer is added between Applications and Grid Middleware]
… and it came to pass that user-level middleware was promised to promote the performance of Grid applications, and the users rejoiced …

The Middleware Promise
Grid Middleware
– Provides infrastructure/services to enable usability of the Grid
– Promotes portability and retargetability
User-Level Middleware
– Hides the complexity of the Grid from the end user
– Adapts to dynamic resource performance variations
– Promotes application performance

How Do Applications Achieve Performance Now?
AppLeS = Application-Level Scheduler
– Joint project with R. Wolski
– AppLeS + application = self-scheduling Grid application
– AppLeS-enabled applications adapt to dynamic performance variations in Grid resources
[diagram: AppLeS-enabled applications running directly on Grid Middleware and Resources]

AppLeS Architecture
The agent refines candidate schedules in four stages:
Resource Discovery → accessible resources
Resource Selection → feasible resource sets
Schedule Planning and Performance Modeling → evaluated schedules
Decision Model → "best" schedule
Schedule Deployment then launches the chosen schedule on Grid middleware and resources.
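To make the pipeline concrete, here is a minimal runnable sketch of the four stages. The function names, the toy proportional-split performance model, and the host data are all illustrative assumptions, not the actual AppLeS code:

```python
# Illustrative sketch of the four-stage AppLeS pipeline (not the real code).
from itertools import combinations

def best_schedule(total_work, hosts):
    """hosts: name -> (peak speed, currently available fraction), the kind
    of data a monitor such as NWS provides."""
    # Resource Discovery: every resource the user can see and use.
    accessible = list(hosts)
    # Resource Selection: here, every non-empty subset is a feasible set.
    feasible_sets = [c for r in range(1, len(accessible) + 1)
                     for c in combinations(accessible, r)]
    # Schedule Planning and Performance Modeling: predicted makespan if the
    # work is split in proportion to each host's available speed, plus a
    # fixed per-host startup overhead (a made-up model).
    def predicted_makespan(resource_set):
        speed = sum(hosts[h][0] * hosts[h][1] for h in resource_set)
        return total_work / speed + 5.0 * len(resource_set)
    # Decision Model: keep the "best" schedule; deployment would follow.
    best = min(feasible_sets, key=predicted_makespan)
    return best, predicted_makespan(best)

print(best_schedule(1e12, {"a": (1e9, 0.5), "b": (2e9, 0.9), "c": (5e8, 1.0)}))
```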

From AppLeS-Enabled Applications to User-Level Middleware
[diagram: on one side, an AppLeS agent integrated within each application, running directly on Grid Middleware and Resources; on the other, a shared User-Level Middleware layer between Applications and Grid Middleware]

AppLeS User-Level Middleware
Focus is the development of templates which
– target structurally similar classes of applications
– can be instantiated in a user-friendly timeframe
– provide good application performance
[AppLeS Template Architecture diagram: Application Module, Scheduling Module, and Deployment Module layered over Grid Middleware and Resources]

APST – AppLeS Parameter Sweep Template
Parameter sweeps = class of applications structured as multiple instances of an "experiment" with distinct parameter sets
– Joint work with Henri Casanova
– First AppLeS middleware package to be distributed to users
Parameter sweeps are a common application structure used in various fields of science and engineering
– Most notably: simulations (Monte Carlo, etc.)
Large number of tasks, no task precedences in the general case → easy scheduling? Not quite:
– I/O constraints
– Need for meaningful partial results
– Multiple stages of post-processing
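To illustrate the structure, a small sketch: each point in the parameter space becomes one independent task, with no precedences between tasks. The experiment name and parameters are invented for the example:

```python
# Toy illustration of the parameter-sweep structure: one independent task
# per point in the parameter cross product.
from itertools import product

def sweep_tasks(experiment, parameter_space):
    """Yield (experiment, bindings) for every point in the parameter space."""
    names = sorted(parameter_space)
    for values in product(*(parameter_space[n] for n in names)):
        yield experiment, dict(zip(names, values))

tasks = list(sweep_tasks("mcell_run", {"seed": range(3), "rate": [0.1, 0.5]}))
print(len(tasks), "independent tasks")  # 6 tasks, schedulable in any order
```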

APST Scheduling Issues
– Large shared files, if any, must be stored strategically
– Post-processing must minimize file transfers
– Adaptive scheduling is necessary to account for a changing environment

Scheduling Approach
Contingency scheduling: an allocation is developed by dynamically generating a Gantt chart that schedules unassigned tasks between scheduling events.
Basic skeleton (sketched in code below):
1. Compute the next scheduling event
2. Create a Gantt chart G
3. For each computation and file transfer currently underway, compute an estimate of its completion time and fill in the corresponding slots in G
4. Select a subset T of the tasks that have not started execution
5. Until each host has been assigned enough work, heuristically assign tasks to hosts, filling in slots in G
6. Implement the schedule
[Gantt chart figure: resource rows for network links and the hosts of two clusters; the time axis spans consecutive scheduling events, with computation and transfer slots filled in G]
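A runnable sketch of this skeleton, under simplifying assumptions: the Gantt chart G is reduced to a map from each host to the time it next becomes free, the heuristic is simply earliest-free-host-first, and completion-time estimates come from a caller-supplied function:

```python
# Simplified contingency-scheduling event (steps 2-6 of the skeleton above).

def scheduling_event(free_at, running, pending, duration, horizon):
    # Steps 2-3: build the chart and fill in slots for work already underway.
    gantt = dict(free_at)
    for host, remaining_time in running:
        gantt[host] += remaining_time
    # Step 4: select tasks that have not started (here: all pending tasks).
    todo = list(pending)
    assignments = []
    # Step 5: assign tasks to the earliest-free host until every host has
    # work past the next scheduling event (the "enough work" condition).
    while todo and min(gantt.values()) < horizon:
        host = min(gantt, key=gantt.get)
        task = todo.pop(0)
        assignments.append((task, host, gantt[host]))  # slot filled in G
        gantt[host] += duration(task, host)
    return assignments  # Step 6: implement this schedule

dur = lambda task, host: 12.0  # toy performance estimate, same on every host
print(scheduling_event({"h1": 0.0, "h2": 0.0}, [("h1", 4.0)],
                       ["t1", "t2", "t3", "t4"], dur, horizon=25.0))
```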

Scheduling Heuristics
Scheduling algorithms for PS applications:
Self-scheduling algorithms (workqueue, workqueue w/ work stealing, workqueue w/ work duplication, …)
– Easy to implement and quick
– No need for performance predictions
– Insensitive to data placement
Gantt-chart heuristics (MinMin, MaxMin, Sufferage, XSufferage, …)
– More difficult to implement
– Need performance predictions
– Sensitive to data placement
Simulation results (HCW '00 paper) show that:
– heuristics are worth it
– XSufferage is a good heuristic even when predictions are bad
– complex environments require better planning (Gantt chart)
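For concreteness, here are textbook formulations of two of the Gantt-chart heuristics named above, MinMin and Sufferage; these follow the standard definitions, not the APST source. etc[t][h] is the predicted execution time of task t on host h:

```python
# Textbook MinMin and Sufferage over an execution-time matrix etc[t][h].

def min_min(etc, hosts):
    """Repeatedly schedule the task whose best completion time is smallest."""
    free = {h: 0.0 for h in hosts}          # when each host becomes available
    order, remaining = [], set(etc)
    while remaining:
        def best(t):  # (completion time, host) of t's best host right now
            return min((free[h] + etc[t][h], h) for h in hosts)
        task = min(remaining, key=lambda t: best(t)[0])
        ct, host = best(task)
        free[host] = ct
        order.append((task, host))
        remaining.remove(task)
    return order

def sufferage(etc, hosts):
    """Prioritize the task that would suffer most if denied its best host."""
    free = {h: 0.0 for h in hosts}
    order, remaining = [], set(etc)
    while remaining:
        def ranked(t):  # completion times on all hosts, best first
            return sorted((free[h] + etc[t][h], h) for h in hosts)
        # sufferage value = second-best completion time minus best
        task = max(remaining, key=lambda t: ranked(t)[1][0] - ranked(t)[0][0]
                   if len(hosts) > 1 else 0.0)
        ct, host = ranked(task)[0]
        free[host] = ct
        order.append((task, host))
        remaining.remove(task)
    return order

etc = {"t1": {"a": 5.0, "b": 9.0}, "t2": {"a": 4.0, "b": 6.0}}
print(min_min(etc, ["a", "b"]), sufferage(etc, ["a", "b"]))
```

On this tiny instance the two disagree: MinMin packs both tasks onto the fast host, while Sufferage gives t1 its preferred host first because t1 has more to lose.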

APST Architecture
[architecture diagram, reconstructed from the flattened layout:]
APST Client: a command-line client whose Controller interacts with and triggers the daemon.
APST Daemon, three components behind four APIs:
– Scheduler (scheduler API): Workqueue, Workqueue++, and the Gantt-chart heuristic algorithms MinMin, MaxMin, Sufferage, XSufferage; it actuates the Actuator and retrieves reports from the Metadata Bookkeeper
– Actuator (transport API: GASS, IBP, NFS; execution API: GRAM, NetSolve; Condor, Ninf, Legion, …): transfers files and executes tasks
– Metadata Bookkeeper (metadata API: NWS): queries and stores metadata
Grid resources and middleware: NetSolve, Globus, Legion, NWS, Ninf, IBP, Condor.

APST
APST is being used for:
– INS2D (NASA fluid dynamics application)
– MCell (Salk Institute, molecular modeling for biology)
– Tphot (SDSC, proton transport application)
– NeuralObjects (NSI, neural network simulations)
– CS simulation applications for our own research (model validation, long-range forecasting validation)
The Actuator's APIs are interchangeable and mixable (see the sketch below)
– (NetSolve+IBP) + (GRAM+GASS) + (GRAM+NFS)
The Scheduler API allows for dynamic adaptation
No Grid software is required
– However, the lack of it (NWS, GASS, IBP) may lead to poorer performance
More details in the SC'00 paper
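A hedged sketch of what "interchangeable and mixable" could look like: each site pairs one execution backend with one transport backend behind a common interface. Class and method names here are illustrative, not the real APST APIs:

```python
# Illustrative mix-and-match actuator: execution and transport backends
# are swappable behind two tiny interfaces.
from dataclasses import dataclass

class Execution:
    def launch(self, task): raise NotImplementedError

class Transport:
    def put(self, f, dest): raise NotImplementedError

class GRAM(Execution):
    def launch(self, task): print(f"GRAM: submit {task}")

class NetSolve(Execution):
    def launch(self, task): print(f"NetSolve: call {task}")

class GASS(Transport):
    def put(self, f, dest): print(f"GASS: copy {f} -> {dest}")

class IBP(Transport):
    def put(self, f, dest): print(f"IBP: store {f} @ {dest}")

class NFS(Transport):
    def put(self, f, dest): print(f"NFS: {f} already visible at {dest}")

@dataclass
class Actuator:
    execution: Execution
    transport: Transport
    def run(self, task, input_file, host):
        self.transport.put(input_file, host)   # stage input, then execute
        self.execution.launch(task)

# One daemon can drive different pairings on different sites at once:
for act in (Actuator(NetSolve(), IBP()), Actuator(GRAM(), GASS()),
            Actuator(GRAM(), NFS())):
    act.run("mcell_task", "input.dat", "remote_host")
```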

APST Validation Experiments
Three sites, driven by one APST client and daemon:
– University of Tennessee, Knoxville: NetSolve + IBP
– University of California, San Diego: GRAM + GASS
– Tokyo Institute of Technology: NetSolve + NFS, NetSolve + IBP

APST Test Application – MCell
MCell = general simulator for cellular microphysiology
– Uses a Monte Carlo diffusion and chemical reaction algorithm in 3D to simulate complex biochemical interactions of molecules
– Focus of a new multi-disciplinary ITR project, which will address large-scale execution-time computational steering, data analysis, and visualization

Experimental Results
Experimental setting: an MCell simulation with 1,200 tasks
– composed of 6 Monte Carlo simulations
– input files: 1, 1, 20, 20, 100, and 100 MB
4 scenarios for initial file placement:
(a) all input files are only in Japan
(b) the 100 MB files are replicated in California
(c) in addition, one 100 MB file is replicated in Tennessee
(d) all input files are replicated everywhere
[chart comparing workqueue against the Gantt-chart algorithms across the four scenarios]

New Directions: "Mega-programming"
Grid programs
– can reasonably obtain some information about the environment (NWS predictions, MDS, HBM, …)
– can assume that login, authentication, monitoring, etc. are available on target execution machines
– can assume that programs run to completion on the execution platform
Mega-programs
– cannot assume any information about the target environment
– must be structured to treat the target device as an unfriendly host (cannot assume ambient services)
– must be structured for "throwaway" end devices
– must be structured to run continuously

Success with Mega-programming
[slide shows a deployed mega-program as the example]
– Over 2 million users
– Sustains teraflop computing
Can we run non-embarrassingly parallel codes successfully at this scale?
– Computational biology, genomics, …

– Joint work with Derrick Kondo
– Application template for peer-to-peer platforms
– First algorithm (Needleman-Wunsch global alignment) uses dynamic programming
– Plan is to use the template with additional genomics applications
– Being developed for a "web" rather than Grid environment
[figure: dynamic-programming matrix aligning GTAAG against ATACCG; optimal alignments are determined by traceback]
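Since the slide names the algorithm, here is a standard Needleman-Wunsch global alignment: fill a dynamic-programming score matrix, then trace back from the bottom-right corner for one optimal alignment. The scoring values (match +1, mismatch -1, gap -1) are illustrative choices, and the demo sequences are the ones in the slide's figure:

```python
# Standard Needleman-Wunsch global alignment with traceback.

def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + d,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback from the bottom-right corner to recover one optimal alignment.
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

print(needleman_wunsch("GTAAG", "ATACCG"))
```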

Mega-programs
Provide the algorithmic counterpart for very-large-scale platforms:
– peer-to-peer platforms, Entropia, etc.
– Condor flocks
– large "free agent" environments
– Globus
– new platforms: networks of low-level devices, etc.
A different computing paradigm than MPP or the Grid.
[diagram: algorithms such as DNA alignment deployed across Globus, Legion, Entropia, Condor, and free-agent platforms]

Coming soon to a computer near you:
– Release of APST v0.1 by SC'00
– Release of AMWAT (AppLeS Master/Worker Application Template) v0.1 by Jan '01
– First prototype of 2001
– AppLeS software and papers:
Thanks!
– NSF, NPACI, NASA
Grid Computing Lab: Fran Berman, Henri Casanova, Walfredo Cirne, Holly Dail, Marcio Faerman, Jim Hayes, Derrick Kondo, Graziano Obertelli, Gary Shao, Otto Sievert, Shava Smallen, Alan Su, Renata Teixeira, Nadya Williams, Eric Wing, Qiao Xin

Scheduling Results
[1] "Heuristics for Scheduling Parameter Sweep Applications in Grid Environments", H. Casanova, A. Legrand, D. Zagorodnov, F. Berman (HCW '00)
Scheduling algorithms for PS applications:
Self-scheduling algorithms (workqueue, workqueue w/ work stealing, workqueue w/ work duplication, …)
– Easy to implement and quick
– No need for performance predictions
– Extremely adaptive
– No planning (resource selection, I/O, …)
Algorithms using Gantt charts with heuristics (MinMin, MaxMin, Sufferage, XSufferage, …)
– More difficult to implement
– Slower to run
– Need performance predictions
– Tunable adaptivity
– Heuristics enable better planning
Simulation results in [1] show that:
– heuristics are worth it
– XSufferage is a good heuristic even when predictions are bad
– complex environments require better planning (Gantt chart)


Scheduling Results
[simulation chart comparing Workqueue, Min-min, Max-min, Sufferage, and XSufferage; data transfer <= 40x task computation time, scheduling event every 250 seconds]
From "Heuristics for Scheduling Parameter Sweep Applications in Grid Environments", H. Casanova, A. Legrand, D. Zagorodnov, F. Berman (HCW '00).
Simulation results show that:
– heuristics are worth it
– XSufferage is a good heuristic even when predictions are bad
– complex environments require better planning