Nomadic Grid Applications: The Cactus WORM
G. Lanfermann, Max Planck Institute for Gravitational Physics (Albert Einstein Institute), Golm
Dave Angulo, University of Chicago, Chicago, IL

Grid Application Development Software Project

Outline
- The Worm: Migration on the Grid
  - Motivation
  - Design
- The Worm: Adaptive Migration
  - Data transfer using GridFTP
  - Resource selection using MDS-2 and ClassAds
  - Contract monitoring using MDS-2
  - Intelligent migration using GRAM

This Talk
- Slides: s/CactusGrADS-Aug7-GlobusRetreat.ppt
- Other documents on GrADS in the Cactus architecture: s/arch/
- Paper available in the back of the 2001 Globus Retreat book

Migration on the Grid
[Diagram: a Payload migrates across your Grid; a Resource Broker matches requested and available resources.]

Large-Scale HPC Simulation: The Daily Routine
The "daily" routine of doing large-scale numerical simulations:
- Take an educated guess at the memory requirements, number of processors, and disk space needed.
- Start with the first parameter in a range of values to explore the behavior of your simulation.
  - Select a machine and submit to the queuing system. Wait.
  - Archive and analyze the data; make changes to the parameter file; resubmit to the queuing system. Wait.
- For the large production run, increase the resolution of your experiment and take an educated guess at memory, ...
- Select a big machine, submit to the queue. Wait 3-7 days.
- Archive data & checkpoint file, resubmit to the queue. Wait 3-7 days.
- Archive data & checkpoint file, resubmit. Wait 3-7 days.
- ...

Automating the Routine
- Let the computer find out about the code's resource requirements.
- Automatically contact appropriate machines, stage executables, and submit to the queuing system.
- Let the computer monitor the quality of the requested resources as the simulation progresses.
- Perform multiple simulations over a range of parameters automatically and in parallel.
- Archive the data and give the user uniform access to it.
There is plenty of room to automate the way simulations are carried out today.

Cactus + Grid
- Cactus-based Application Thorns: the physics (initial data, evolution, analysis, etc.)
- Grid-Aware Application Thorns: drivers for contract management, dynamic resource detection, simulation relocation
- Grid-Enabled Communication Library: the MPICH-G2 implementation of MPI, which can run MPI programs across heterogeneous computing resources (alongside standard MPI and single-processor runs)

The Grid Layer Concept
- Application Thorns provide: initial data, analysis, evolution
- Grid Thorns provide: migration & resource management
[Diagram: the Cactus Computational Framework plus Grid Thorns yields a Grid-Enabled Computational Framework; adding Application Thorns yields a Grid-Enabled Simulation.]

Migrating Applications on the Grid
[Diagram components: Payload (the application), Application Information Server (AIS), Migration Unit, Resource Management, Resource Selector Client, Worm Layer, Hibernation Storage (off-site Data Server), Resource Broker.]

The WORM at HPDC10
[Diagram: Information Server and Migration Server.]

Current Architecture (Under Development)
[Diagram components:
- Cactus Application Unit: Cactus Flesh, user-supplied application payload, "Worm" Migration Module, Cactus Worm Server Thorns, Performance Degradation Detection
- Client thorns: Resource Selection Client Thorn, GridFTP Client Thorn
- External processes: External Resource Selection Service, Migration Logic Manager, External GridFTP Server (source), External GridFTP Server (destination)
- Data transfer via GridFTP; job startup via GRAM.]

Migration of Checkpoint Files
Uses an alpha version of GridFTP:
- Allows third-party transfer
  - Without this, the migrator would need to do a GET to transfer files from the source to the migrator, then a PUT to transfer files from the migrator to the destination.
- Uses GSI security
  - Allows a grid proxy with only a single sign-on while retaining tight security.
- Allows fast, efficient, reliable transfer.

Resource Selector Architecture (ClassAds)
[Diagram: the Resource Selection Client Thorn sends a request in ClassAds format to the Resource Selection Engine (ClassAds library, UTk project) and receives a response in XML; the engine queries MDS-2 (GIIS and GRISs) and NWS for resource information.]

MDS-2 Future Plans
- Resource selector goes to the GRIS directly after resources are discovered.
- To investigate: strategies for managing update traffic.
- Would like persistent queries to support notification of changes in resource status.

Resource Selection: Example Input (ClassAds format)

  [
    Type = "request";
    Owner = "dangulo";
    RequiredDomains = {"cs.uiuc.edu", "ucsd.edu"};
    requirements = "other.opSys == 'LINUX' && other.minMemSize > (100G / other.CPUCount) && Include(other.domains, RequiredDomains)";
    Rank = other.minCPUSpeed * other.CPUCount / (other.maxCPULoad + 1);
  ]
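The effect of the Rank expression on the slide can be sketched in a few lines of Python. This is only an illustration of the scoring arithmetic, not the real ClassAds matchmaker: the candidate resource dictionaries and host names are hypothetical, and only the operating-system clause of the Requirements expression is reproduced.

```python
# Illustrative ranking of candidate resource ads using the slide's Rank
# expression: minCPUSpeed * CPUCount / (maxCPULoad + 1).
# Candidate hosts are made up; the real system evaluates this inside
# the ClassAds matchmaking library.

def rank(resource):
    """Score a resource ad; higher is better."""
    return resource["minCPUSpeed"] * resource["CPUCount"] / (resource["maxCPULoad"] + 1)

def select_best(resources, required_os="LINUX"):
    """Filter by the OS clause of Requirements, then pick the top-ranked ad."""
    eligible = [r for r in resources if r["opSys"] == required_os]
    return max(eligible, key=rank) if eligible else None

candidates = [
    {"name": "hostA", "opSys": "LINUX", "minCPUSpeed": 500, "CPUCount": 8,  "maxCPULoad": 1.0},
    {"name": "hostB", "opSys": "LINUX", "minCPUSpeed": 800, "CPUCount": 4,  "maxCPULoad": 0.0},
    {"name": "hostC", "opSys": "IRIX",  "minCPUSpeed": 900, "CPUCount": 16, "maxCPULoad": 0.0},
]
best = select_best(candidates)  # hostB: 800*4/1 = 3200 beats hostA's 500*8/2 = 2000
```

Note how the `maxCPULoad + 1` denominator penalizes loaded machines: hostA has more CPUs, but its load halves its score.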

Resource Selection: Example Output
[The XML response shown on this slide was an image and did not survive transcription.]

Performance Model
Working on putting the performance model into ClassAds.
- Each processor is assigned XYZ/N grid points (XYZ total grid points, N processors).
- Requested memory > 16 * 10^-6 * (XYZ / N) MB, where 16 and 10^-6 are constants.
- Time needed to perform an iteration = (computation time + communication time) * slowdown
- 800 floating-point operations per grid point per iteration.
- Computation time = 800 * (XYZ / N) / cpuspeed, where cpuspeed is in FLOPS.
- Communication time = 1/G * 2 * (T1 + 2 * T2 * GXYR)
  - T1 is the communication latency between two processors (latency from NWS).
  - T2 is the transmit time for a word: T2 = 1 / (available bandwidth), with available bandwidth from NWS.
- Slowdown = 1 + cpuload
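The model above can be written out directly as code. This is a sketch under the slide's own constants (16e-6 MB per grid point, 800 flops per grid point); the symbols G and GXYR are not defined on the slide, so they are passed through as opaque parameters rather than guessed at.

```python
# Sketch of the slide's performance model.
# XYZ = total grid points, N = processors, cpuspeed in FLOPS, cpuload from NWS.
# g and gxyr correspond to the slide's undefined G and GXYR symbols.

def memory_per_proc_mb(xyz, n):
    """Lower bound on memory needed per processor, in MB (16e-6 MB per grid point)."""
    return 16 * 1e-6 * (xyz / n)

def iteration_time(xyz, n, cpuspeed, t1, bandwidth, cpuload, g, gxyr):
    """Estimated wall-clock time per iteration, in seconds."""
    computation = 800 * (xyz / n) / cpuspeed      # 800 flops per grid point
    t2 = 1.0 / bandwidth                          # transmit time for one word
    communication = (1.0 / g) * 2 * (t1 + 2 * t2 * gxyr)
    slowdown = 1 + cpuload                        # contention from competing load
    return (computation + communication) * slowdown
```

For example, a 10^6-point grid on 16 processors needs at least `memory_per_proc_mb(1e6, 16)` = 1 MB per processor under this model.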

Contract Monitor
- Driven by three user-controllable parameters:
  - Time quantum for "time per iteration"
  - % degradation in time per iteration (relative to the prior average) before noting a violation
  - Number of violations before migration
- Potential causes of violation:
  - Competing load on the CPU
  - Computation requiring more processing power, e.g., mesh refinement or a new subcomputation
  - Hardware problems

Contract Monitor Details
- The end user specifies several variables, which can be changed at runtime by contacting the application through an HTTP interface:
  - time quantum
  - % degradation
  - number of violations before migration
- The system calculates the average wall-clock time per iteration for each time quantum.
- If the average iteration in any time quantum has lower performance (by more than the specified percentage) than the average over all previous quanta, a violation is noted.
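The violation-counting logic described above can be sketched as follows. The class and method names are illustrative; the real monitor runs inside Cactus as a thorn and takes its thresholds from the HTTP interface mentioned on the slide.

```python
# Minimal sketch of the contract-monitor logic: compare each quantum's
# average time per iteration against the running average of all previous
# quanta, count violations, and signal migration after too many.

class ContractMonitor:
    def __init__(self, degradation_pct, max_violations):
        self.degradation_pct = degradation_pct   # allowed slowdown before a violation, in %
        self.max_violations = max_violations     # violations tolerated before migrating
        self.quantum_averages = []               # avg time/iteration for each past quantum
        self.violations = 0

    def end_of_quantum(self, avg_time_per_iteration):
        """Record one quantum's average; return True once migration should trigger."""
        if self.quantum_averages:
            baseline = sum(self.quantum_averages) / len(self.quantum_averages)
            if avg_time_per_iteration > baseline * (1 + self.degradation_pct / 100.0):
                self.violations += 1
        self.quantum_averages.append(avg_time_per_iteration)
        return self.violations >= self.max_violations

monitor = ContractMonitor(degradation_pct=20, max_violations=3)
```

With these settings, three quanta that each run more than 20% slower than the historical average would trigger migration.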

Actions Taken on Contract Violation
- Triggered when more than the specified number of violations has been noted.
- A new set of resources is requested from the Resource Selector.
- The application is checkpointed.
- The checkpoint data is moved to the new resources, along with any other data needed for restart.
- The application is restarted on the new resources.

Migration Manager
- Allows resource selection to occur asynchronously.
- Makes an intelligent choice about whether migration will actually help:
  - Will not migrate to seemingly lower-quality resources.
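The "will not migrate to lower-quality resources" check can be sketched by reusing the ClassAds Rank expression from the resource-selection slide as a quality score. Function names and the resource dictionaries are illustrative assumptions, not the Migration Manager's actual interface.

```python
# Hedged sketch of the migration decision: migrate only when the offered
# resources rank strictly better than the resources currently in use.
# Quality is scored with the Rank expression from the ClassAds example.

def rank(resource):
    """Quality score: minCPUSpeed * CPUCount / (maxCPULoad + 1)."""
    return resource["minCPUSpeed"] * resource["CPUCount"] / (resource["maxCPULoad"] + 1)

def should_migrate(current, offered):
    """True only if the offered resources are a genuine improvement."""
    return rank(offered) > rank(current)

current = {"minCPUSpeed": 500, "CPUCount": 8, "maxCPULoad": 2.0}
worse   = {"minCPUSpeed": 400, "CPUCount": 4, "maxCPULoad": 0.5}
better  = {"minCPUSpeed": 800, "CPUCount": 8, "maxCPULoad": 0.0}
```

The asynchronous part matters because the Resource Selector may hand back an offer long after the violation that prompted it, by which time the current machine may have recovered; the strict comparison guards against a pointless move.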

Summary
- The Worm gives researchers in physics and computational science easy adaptability to changing Grid environments:
  - Data transfer using GridFTP
  - Resource selection using MDS-2 and ClassAds
  - Contract monitoring using MDS-2
  - Intelligent migration using GRAM