Presentation transcript:

Grid Application Development Software Project

Outline
• Resource Selection: Current Directions
• Contracts: Current Directions
• Current Status
  – Resource Selection
    > Request Protocol
    > Response Protocol
  – Resource “Scheduling”
  – Contracts
  – Migration Manager

Resource Selection Current Directions

Current Architecture Under Development
[Diagram labels: Resource Selection Client Thorn; External Resource Selection Service; “Worm” Migration Module; Cactus Worm Server Thorns; Cactus Application Unit; Cactus Flesh; Performance Degradation Detection; User Supplied Application Payload; External Processes; Migration Logic Manager; GridFTP Client Thorn; External GridFTP Server (Source); External GridFTP Server (Destination); Data transfer]

Resource Selector Architecture: UCSD
[Diagram labels: (UCSD) Resource Selection Client Thorn; Resource Selection Library; UCSD (HFA/GradsSoft); HFA/GradsSoft Translator; Request in ClassAds format; Response (format?); MDS; NWS; GRIS’s; Protocol? HTTP? SOAP?]

Resource Selector Architecture: ClassAds
[Diagram labels: (ClassAds) Resource Selection Client Thorn; ClassAds library; Resource Selection Engine; Request in ClassAds format; Response (format?); MDS; NWS; GRIS’s; Protocol? HTTP? SOAP?; UTk Project; Needed for recovery and timeliness?]

Resource Selector Architecture: Other RS’s
[Diagram labels: (Other) Resource Selection Client Thorn; Other Resource Selection Service; Request in some format; Response in some format; Protocol? HTTP? SOAP?]

Contract Monitoring Current directions

Contract Monitor
• Driven by three user-controllable parameters
  – Time quantum for “time per iteration”
  – % degradation in time per iteration (relative to prior average) before noting violation
  – Number of violations before migration
• Potential causes of violation
  – Competing load on CPU
  – Computation requires more processing power: e.g., mesh refinement, new subcomputation
  – Hardware problems

Contract Monitor Details
• The end user specifies several variables.
• These variables can be changed at runtime through the application's HTTP interface.
• These variables include:
  – time quantum
  – % degradation
  – number of violations before migration
• The system then calculates the average wall clock time per iteration for each time quantum.
• If the average time per iteration in any quantum is slower, by the specified percentage, than the average over all previous quanta, a violation is noted.
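The violation rule described above can be sketched in Python. This is a minimal illustration, not the project's Cactus thorn: the class name `ContractMonitor`, its field names, and the choice to take the baseline as an unweighted mean of earlier per-quantum averages (including quanta that were themselves flagged) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContractMonitor:
    """Hypothetical monitor; names and behaviour are illustrative only."""
    time_quantum: float      # wall clock seconds covered by one quantum
    degradation_pct: float   # tolerated slowdown versus the prior average
    max_violations: int      # violations tolerated before migration
    violations: int = 0
    _quantum_averages: List[float] = field(default_factory=list)
    _current: List[float] = field(default_factory=list)
    _elapsed: float = 0.0

    def record_iteration(self, seconds: float) -> bool:
        """Record one iteration time; return True when migration is due."""
        self._current.append(seconds)
        self._elapsed += seconds
        if self._elapsed < self.time_quantum:
            return False  # the current quantum is still open

        # Close the quantum: average time per iteration within it.
        avg = sum(self._current) / len(self._current)
        if self._quantum_averages:
            # Assumption: baseline is the mean of all earlier quantum averages.
            baseline = sum(self._quantum_averages) / len(self._quantum_averages)
            if avg > baseline * (1 + self.degradation_pct / 100):
                self.violations += 1
        self._quantum_averages.append(avg)
        self._current, self._elapsed = [], 0.0

        # Migration is due once more than the allowed number was noted.
        return self.violations > self.max_violations
```

Calling `record_iteration` once per iteration with the measured wall clock time is enough to drive the sketch; the three constructor arguments correspond to the three user-controllable parameters.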

Actions Taken on Contract Violation
• Triggered when more than the specified number of violations have been noted
• Requests a new set of resources from the ResourceSelector
• Checkpoints the application
• Moves checkpoint data to the new resources, along with other data needed for restart
• Restarts the application on the new resources
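The ordering of these actions can be sketched as a small Python function. Everything here is hypothetical: the function name and the four injected callables stand in for the ResourceSelector request, the checkpoint, the data transfer, and the restart, none of whose real interfaces are given in the slides.

```python
from typing import Callable, Sequence


def handle_contract_violation(
    request_resources: Callable[[], Sequence[str]],
    checkpoint: Callable[[], str],
    transfer: Callable[[str, Sequence[str]], None],
    restart: Callable[[str, Sequence[str]], None],
) -> Sequence[str]:
    """Run the migration steps in the order listed on the slide."""
    new_resources = request_resources()       # ask the selector for resources
    checkpoint_path = checkpoint()            # write application state
    transfer(checkpoint_path, new_resources)  # move checkpoint and restart data
    restart(checkpoint_path, new_resources)   # resume on the new resources
    return new_resources
```

Passing the steps in as callables keeps the sketch independent of any particular transfer or checkpoint mechanism.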

Current Status

Resource Selection
• Demonstrated migration using RS with simple protocol (using raw sockets).
• Working on more robust protocol over HTTP using ClassAds as request and XML as response
  – Robustness (error handling) critical on real grid
  – Important to use well-known protocol
• Working on incorporating performance model into ClassAds

Resource Selection: Example Input
[
  Type = "request";
  Owner = "dangulo";
  RequiredDomains = {"cs.uiuc.edu", "ucsd.edu"};
  requirements = other.opSys == "LINUX" && other.minMemSize > (100G / other.CPUCount) && Include(other.domains, RequiredDomains);
  Rank = other.minCPUSpeed * other.CPUCount / (other.maxCPULoad + 1);
]
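To make the request concrete, the following Python sketch evaluates the same requirements and Rank against a few made-up candidates. It is not a ClassAds implementation. The candidate dictionaries are invented, `100G` is assumed to mean 100 in the same unit as `minMemSize`, and `Include` is assumed to mean that every domain of the candidate appears in `RequiredDomains`; the slides do not define either.

```python
from typing import Dict, List, Optional

REQUIRED_DOMAINS = {"cs.uiuc.edu", "ucsd.edu"}


def meets_requirements(res: Dict) -> bool:
    """Mirror of the request's requirements expression (assumed semantics)."""
    return (
        res["opSys"] == "LINUX"
        and res["minMemSize"] > 100 / res["CPUCount"]
        and set(res["domains"]) <= REQUIRED_DOMAINS  # assumed meaning of Include
    )


def rank(res: Dict) -> float:
    """Rank = minCPUSpeed * CPUCount / (maxCPULoad + 1), as in the request."""
    return res["minCPUSpeed"] * res["CPUCount"] / (res["maxCPULoad"] + 1)


def select(candidates: List[Dict]) -> Optional[Dict]:
    """Return the highest-ranked candidate that meets the requirements."""
    eligible = [c for c in candidates if meets_requirements(c)]
    return max(eligible, key=rank, default=None)


# Invented candidates for illustration only.
candidates = [
    {"name": "a", "opSys": "LINUX", "minMemSize": 16, "CPUCount": 8,
     "minCPUSpeed": 1000, "maxCPULoad": 1.0, "domains": ["ucsd.edu"]},
    {"name": "b", "opSys": "LINUX", "minMemSize": 32, "CPUCount": 16,
     "minCPUSpeed": 800, "maxCPULoad": 0.0, "domains": ["cs.uiuc.edu"]},
    {"name": "c", "opSys": "SOLARIS", "minMemSize": 64, "CPUCount": 32,
     "minCPUSpeed": 900, "maxCPULoad": 0.0, "domains": ["ucsd.edu"]},
]
best = select(candidates)
```

The `+ 1` in the denominator keeps Rank finite on an idle machine, and a loaded machine is penalised in proportion to its load.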

Resource Selection: Input
• Need to specify other user-centric information
  – Cactus is installed in user space
• We’re investigating whether we can put the Performance Model equations into the ClassAds format in order to pass them to the Resource Selector.
  – The “Rank” value in the preceding slide shows a simple example of this.

Resource Selection: Example output

Resource Selection: Example output
No resource is found
<result statusCode="204" statusMessage="No match Resource is Found"/>

Resource Selection: Example output
Bad request from client (request format error)

Resource Selection: Example output
MDS server is down
<result statusCode="601" statusMessage="MDS Service is not available"/>
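A client could read status responses of this shape with Python's standard `xml.etree.ElementTree` module. The sketch below only handles the `statusCode` and `statusMessage` attributes that appear in the error examples; the function name `parse_result` is hypothetical, and the format of a successful response is not shown in the slides, so it is not modelled here.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def parse_result(xml_text: str) -> Tuple[int, str]:
    """Extract (statusCode, statusMessage) from a <result/> element."""
    root = ET.fromstring(xml_text)
    if root.tag != "result":
        raise ValueError(f"unexpected root element: {root.tag}")
    code = int(root.attrib["statusCode"])
    message = root.attrib.get("statusMessage", "")
    return code, message


# The two codes below are the ones that appear in the examples.
NO_MATCH = 204
MDS_UNAVAILABLE = 601

code, message = parse_result(
    '<result statusCode="601" statusMessage="MDS Service is not available"/>'
)
```

Note that the attribute values must be delimited by plain ASCII double quotes for the XML to parse.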

Resource “Scheduling”
• What word do we use for allocating machines to data? (“Scheduling” seems wrong.)
• We’re assuming that RS does this
• We need to map RS output to Cactus machine distribution

Contract Monitoring
• Demonstrated detection of performance degradation
• Application monitors placed in Cactus scheduling
  – routine called once per iteration
  – accesses Cactus internal timing API
  – synchronization implies that timings on all nodes are identical
    > could use different Cactus scheduling times to get node-dependent results

Migration Manager
• In initial development
• Will allow RS selection to occur asynchronously
• Will make intelligent choice on whether migration will actually help
  – Will not migrate to seemingly lower quality resources