Inter-cluster Job Deployment by AgentTeamwork Sentinel Agents Emory Horvath CSS497 Spring 2006 Advisor: Dr. Munehiro Fukuda.

What is Grid Computing?
Grid computing seeks to pool together large numbers of computers, allowing unused CPU cycles to be shared for CPU-intensive tasks.
Examples:
- Condor
Issues:
- Job coordination
- Security
- Software installation and maintenance
- Fault tolerance

What is AgentTeamwork?
- Portable Java-based grid computing platform, based on the mobile agent paradigm.
- Decentralized architecture, without a central manager.
- Easy installation and participation.
- Designed with fault tolerance in mind.
- Participating computers run a Java process (UWPlace).
- Each UWPlace can host one or more mobile agent Java processes (UWAgents).
- A central FTP server hosts the list of available computers.

How AgentTeamwork Works
[Figure: architecture diagram. Labels: FTP server; User A and User B; commander, resource, sentinel, and bookkeeper agents; user program wrappers, each with snapshot methods and GridTCP; User A's and User B's processes linked by TCP communication; snapshots; results.]

How AgentTeamwork Works - 2
[Figure: software layers, from the applications down to the operating system.]
- Java user applications
- mpiJava API
- mpiJava-A, mpiJava-S
- GridTcp, Java socket
- User program wrapper
- Commander, resource, sentinel, and bookkeeper agents
- UWAgents mobile agent execution platform
- Operating systems

Single-Cluster Hierarchy
[Figure: agent tree for a job in one cluster. Legend: id = agent id, rank = MPI rank. Arrow labels: job submission, XML query, spawn, snapshot.]
- User
- Commander: id 0
- Resource: id 1 (queries eXist)
- Sensor: id 5
- Sentinels: id 2 rank 0; id 8 rank 1; id 9 rank 2; id 10 rank 3; id 11 rank 4; id 32 rank 5; id 33 rank 6; id 34 rank 7
- Bookkeepers: id 3 rank 0; id 12 rank 1; id 13 rank 2; id 14 rank 3; id 15 rank 4; id 48 rank 5; id 49 rank 6; id 50 rank 7
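The agent ids on the Single-Cluster Hierarchy slide follow a regular pattern: sentinel 2 sits above sentinels 8 to 11, sentinel 8 above 32 to 34, bookkeeper 3 above 12 to 15, and bookkeeper 12 above 48 to 50. A minimal sketch of that pattern follows, assuming each agent can spawn up to four children numbered `parent * 4 + k`. The fan-out of 4 and the function names `child_ids` and `parent_id` are assumptions inferred from the slide's numbers, not taken from the AgentTeamwork source.

```python
FAN_OUT = 4  # assumption: inferred from the ids on the slide, not stated there


def child_ids(parent, fan_out=FAN_OUT):
    """Return the ids a parent agent could give its children.

    Children are numbered parent * fan_out + k for k in 0..fan_out-1.
    Id 0 is skipped because it belongs to the root (commander) itself.
    """
    first = parent * fan_out
    return [i for i in range(first, first + fan_out) if i > 0]


def parent_id(agent, fan_out=FAN_OUT):
    """Return the id of the agent's parent, or None for the root."""
    if agent == 0:
        return None
    return agent // fan_out


if __name__ == "__main__":
    print(child_ids(0))   # commander's children: resource, sentinel, bookkeeper
    print(child_ids(2))   # first-level sentinels
    print(parent_id(50))  # bookkeeper 50 hangs under bookkeeper 12
```

Because the parent id can be computed from the child id alone, an agent in such a scheme could locate its parent without asking a central registry, which fits the decentralized design described earlier.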

Single-Cluster Job Resumption
[Figure: Sentinels id 2 rank 0, id 8 rank 1, id 9 rank 2, id 10 rank 3, and id 11 rank 4, linked by MPI connections; Bookkeeper id 15 rank 4; a new Sentinel id 11 rank 4 replacing the failed one.]
Steps shown in the figure:
(0) Send a new snapshot periodically
(1) Detect a ping error
(2) Search for the latest snapshot
(3) Retrieve the snapshot
(4) Send a new agent
(5) Notify about the restart
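Steps (0) to (5) of the resumption figure can be sketched as a small in-memory simulation. This is an illustration only: the `Bookkeeper` class, the `resume_job` function, and the dictionary standing in for a new sentinel are hypothetical names, and the sketch makes no claim about which agent performs each step in AgentTeamwork.

```python
from dataclasses import dataclass, field


@dataclass
class Bookkeeper:
    """Hypothetical stand-in for a bookkeeper agent holding snapshots."""
    snapshots: list = field(default_factory=list)

    def store(self, seq, state):
        # Step (0): a sentinel sends a new snapshot periodically.
        self.snapshots.append((seq, state))

    def latest(self):
        # Steps (2)-(3): search for and retrieve the latest snapshot.
        if not self.snapshots:
            return None
        return max(self.snapshots, key=lambda s: s[0])


def resume_job(rank, ping_ok, bookkeepers, peer_ranks):
    """Restart the process of one rank if its sentinel stopped answering."""
    if ping_ok:
        return None  # Step (1) found no ping error; nothing to do.
    snapshot = bookkeepers[rank].latest()
    if snapshot is None:
        raise LookupError(f"no snapshot stored for rank {rank}")
    seq, state = snapshot
    # Step (4): send a new agent that starts from the retrieved state.
    new_sentinel = {"rank": rank, "resumed_from": seq, "state": state}
    # Step (5): notify the other ranks about the restart.
    notified = [r for r in peer_ranks if r != rank]
    return new_sentinel, notified


if __name__ == "__main__":
    keeper = Bookkeeper()
    keeper.store(1, "iteration 100")
    keeper.store(2, "iteration 200")
    print(resume_job(4, False, {4: keeper}, [0, 1, 2, 3, 4]))
```

Selecting the snapshot by sequence number rather than arrival order is a design choice of this sketch; it keeps the lookup correct even if snapshots reach the bookkeeper out of order.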

Extending to Multiple Clusters
The existing AgentTeamwork system allows job deployment only within a single intranet cluster. The primary focus of my project was to extend AgentTeamwork to allow job deployment and resumption across multiple clusters:
- Rewrite and extend existing AgentTeamwork algorithms to support multiple clusters.
- Rewrite job deployment code to deploy gateway tasks and remote-cluster jobs.
- Integrate new gateway-enabled Java socket functionality.
- Rewrite job-resumption code to resume failed remote clusters and remote compute nodes.

Multiple-Cluster Hierarchy
[Figure: agent tree spanning several clusters and a group of desktop computers.]
- User
- Commander: id 0
- Resource: id 1
- Sentinel: id 2
- Bookkeeper: id 3 rank 0
- Cluster gateways 0, 1, 2, and 3: Sentinels id 8 rank -8; id 33 rank -33; id 34 rank -34; id 35 rank -35
- Cluster 0: Sentinels id 32 rank 0; id 128 rank 1; id 129 rank 2; id 130 rank 3; id 131 rank 4; id 512 rank 5
- Cluster 1: Sentinels id 132 rank 6; id 528 rank 7; id 529 rank 8; id 530 rank 9; id 531 rank 10
- Clusters 2 and 3: only the gateway sentinels are shown
- Desktop computers: Sentinels id 9 rank X; id 36 rank X+1; id 37 rank X+2; id 38 rank X+3; id 39 rank X+4
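In the multiple-cluster figure, the sentinels placed at cluster gateways carry a negative rank equal to minus their agent id (id 8 rank -8, id 33 rank -33), while compute-node sentinels carry ordinary MPI ranks. The sketch below walks up the agent tree to find the gateway a compute-node sentinel sits behind. It assumes that a parent's id is the child's id divided by 4, which matches the ids on the slide but is not stated there; `ancestors`, `gateway_of`, and `RANKS` are hypothetical names, and `RANKS` holds only values read off the slide.

```python
# Ranks read off the slide; agents without a numeric rank are omitted.
RANKS = {
    8: -8, 33: -33, 34: -34, 35: -35,                 # gateway sentinels
    32: 0, 128: 1, 129: 2, 130: 3, 131: 4, 512: 5,   # cluster 0
    132: 6, 528: 7, 529: 8, 530: 9, 531: 10,         # cluster 1
}


def ancestors(agent_id, fan_out=4):
    """Return the chain of parent ids from agent_id up to the root (id 0).

    Assumption: parent id = child id // fan_out, inferred from the slide.
    """
    chain = []
    while agent_id > 0:
        agent_id //= fan_out
        chain.append(agent_id)
    return chain


def gateway_of(agent_id, ranks):
    """Return the nearest ancestor with a negative rank, or None."""
    for ancestor in ancestors(agent_id):
        rank = ranks.get(ancestor)
        if rank is not None and rank < 0:
            return ancestor
    return None


if __name__ == "__main__":
    print(ancestors(531))          # path from a cluster 1 node to the commander
    print(gateway_of(531, RANKS))  # nearest gateway sentinel above id 531
    print(gateway_of(36, RANKS))   # desktop sentinel: no gateway above it
```

Under these assumptions, sentinel 531 resolves to gateway sentinel 33 and sentinel 512 to gateway sentinel 8, which agrees with the grouping of clusters 1 and 0 in the figure.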

Multiple-Cluster Job Resumption
[Figure: the multiple-cluster agent tree (Commander id 0, Resource id 1, Sentinel id 2, Bookkeeper id 3, cluster gateways 0 and 1 with Sentinels id 8 rank -8 and id 33 rank -33, Cluster 0 with ranks 0 to 5, Cluster 1 with ranks 6 to 10, and desktop computers), annotated with resumption targets. Labels: compute node, cluster gateway, extra node, extra cluster, extra cluster gateway, new sentinel.]

Other Current & Ongoing Tasks
AgentTeamwork is an ongoing project, with parallel contributions by many other team members:
- RMI to Java socket enhancements, developed by Duncan Smith, were integrated.
- Agent file I/O enhancements (Jumpei Miyauchi) and sensor agent enhancements (Jun Morisaki) were also integrated.
Although I am presenting now, I will continue on the project over the summer:
- Completion of inter-cluster fault tolerance and job redeployment.
- Completion of inter-cluster performance tests.
- Assisting Cuong Ngo as needed with the implementation of dynamic resource allocation.

Acknowledgements
- Professor Fukuda, my advisor.
- NSF Middleware Initiative.
- The UW-Bothell CSS Program.
- Graphics and other slide content contributed by Prof. Fukuda from earlier AgentTeamwork presentations and papers.

Questions?