8/25/2005 IEEE PacRim
The Design Concept and Initial Implementation of AgentTeamwork Grid Computing Middleware
Munehiro Fukuda
  Computing & Software Systems, University of Washington, Bothell
Koichi Kashiwagi, Shinya Kobayashi
  Computer Science, Ehime University
Funded by
Background
  Most grid-computing systems
    Centralized resource/job management
    Two drawbacks
      A powerful central server is essential to manage all slave computing nodes
      Applications are based on the master-slave or parameter-sweep model
  Mobile agents
    An execution model previously highlighted as a prospective infrastructure for distributed systems
    So far no more than an alternative approach to implementing centralized grid middleware
  Our motivation
    Decentralized job distribution and coordination
    Decentralized fault tolerance
    Applications based on a variety of communication models
Objective
  A mobile agent execution platform fitted to grid computing
    Allowing an agent to identify which MPI rank to handle and which agent to send a job snapshot to
  Fault-tolerant inter-process communication
    Recovering lost messages
    Allowing over-gateway connections
  Agent-collaborative algorithms for job coordination
    Allocating computing nodes in a distributed manner
    Implementing decentralized snapshot maintenance and job recovery
System Overview
  Diagram labels:
    Users: User A, User B
    Agents: Commander Agent, Resource Agent, Sentinel Agent, Bookkeeper Agent
    Per-process stack: User program wrapper, Snapshot Methods, GridTCP
    Processes: User A's Process (two instances), User B's Process, linked by TCP Communication
    Data flows: snapshots, Results, FTP Server
Execution Layer
  Layer stack (top to bottom)
    Java user applications
    mpiJava API
    mpiJava-A, mpiJava-S
    GridTcp, Java socket
    User program wrapper
    Commander, resource, sentinel, and bookkeeper agents
    UWAgents mobile agent execution platform
    Operating systems
UWAgents Execution Platform
  An agent domain is created for each submission from the Unix shell
  The number of children each agent can spawn is given at the initial submission
  No name server
  Messages are forwarded through an agent tree
  A user job is scheduled as a thread, using suspend/resume
  Diagram labels:
    UWInject: submits a new agent (id 0) from the shell
    Agent domain (time = 3:31pm, 8/25/05; ip = perseus.uwb.edu; name = fukuda)
    Agent domain (time = 3:30pm, 8/25/05; ip = medusa.uwb.edu; name = fukuda)
    Tree with -m 4: id 0; id 1 to id 3; id 4 to id 12
    Tree with -m 3: id 0; id 1, id 2
    User, UWPlace, a user job
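
The ids in the two trees are consistent with a simple numbering rule: with a spawn limit m, the children of agent i receive ids i*m through i*m + m - 1 (id 0 skips its own id). The slide does not state this rule, so the sketch below is an inference from the diagram, and the function names `child_ids`, `parent_id`, and `route` are hypothetical. It also shows how a message could be forwarded along tree edges without a name server.

```python
def child_ids(agent_id, m):
    """Ids an agent could assign to its children, assuming child = parent * m + k."""
    ids = [agent_id * m + k for k in range(m)]
    # Agent 0 would otherwise list itself (0 * m + 0 == 0).
    return [i for i in ids if i != agent_id]


def parent_id(agent_id, m):
    """Parent of an agent under the same assumed rule; the root (id 0) has none."""
    if agent_id == 0:
        return None
    return agent_id // m


def route(src, dst, m):
    """Hops from src to dst using only parent/child links.

    Ids grow with tree depth under the assumed rule, so the larger id is
    never shallower than the smaller one and can safely be moved up first.
    """
    up, down = [src], [dst]
    a, b = src, dst
    while a != b:
        if a > b:
            a = parent_id(a, m)
            up.append(a)
        else:
            b = parent_id(b, m)
            down.append(b)
    # 'down' was collected from dst upward; reverse it and drop the shared ancestor.
    return up + down[::-1][1:]
```

For example, `child_ids(0, 4)` gives ids 1 to 3 and `child_ids(0, 3)` gives ids 1 and 2, matching the two trees in the diagram; `route(8, 12, 4)` climbs to the root and descends again.
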
Job Distribution
  Diagram flows: Job Submission (User to Commander), XML Query (Resource to eXist), Spawn, snapshot
  Legend: id = agent id, rank = MPI rank
  Agent        id          rank
  Commander    0
  Resource     1
  Sentinel     2           0
  Sentinel     8 to 11     1 to 4
  Sentinel     32 to 34    5 to 7
  Bookkeeper   3           0
  Bookkeeper   12 to 15    1 to 4
  Bookkeeper   48 to 50    5 to 7
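
The id/rank pairs in this diagram fit a breadth-first numbering of each subtree: the root sentinel (id 2) or root bookkeeper (id 3) is rank 0, its children are ranks 1 to 4, and so on. The sketch below reproduces that mapping under two assumptions that the slide does not spell out: children of agent i get ids i*m to i*m + m - 1 with m = 4, and a sentinel sends its snapshot to the bookkeeper of the same rank. `rank_of` and `id_of` are hypothetical names.

```python
def rank_of(agent_id, root_id, m):
    """MPI rank of agent_id inside the subtree rooted at root_id (root_id >= 1, m >= 2)."""
    level, first = 0, root_id
    # Level L of the subtree holds the ids first .. first + m**L - 1.
    while not (first <= agent_id < first + m ** level):
        if first > agent_id:
            raise ValueError("agent %d is not in the subtree of %d" % (agent_id, root_id))
        level += 1
        first *= m
    ranks_above = (m ** level - 1) // (m - 1)  # agents in all shallower levels
    return ranks_above + (agent_id - first)


def id_of(rank, root_id, m):
    """Inverse of rank_of: agent id holding the given rank in root_id's subtree."""
    level, first, ranks_above = 0, root_id, 0
    while rank >= ranks_above + m ** level:
        ranks_above += m ** level
        level += 1
        first *= m
    return first + (rank - ranks_above)


def snapshot_target(sentinel_id, sentinel_root=2, bookkeeper_root=3, m=4):
    """Bookkeeper id with the same rank as the sentinel (assumed pairing)."""
    return id_of(rank_of(sentinel_id, sentinel_root, m), bookkeeper_root, m)
```

With these definitions, sentinel id 34 maps to rank 7 and bookkeeper id 48 maps to rank 5, as in the diagram, and `snapshot_target(11)` returns 15.
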
Resource Allocation
  Job submission: User to Commander (id 0); Resource (id 1) sends an XML query to eXist
  XML query fields: CPU architecture, OS, memory, disk, total nodes, multiplier
  A list of available nodes is returned (total nodes x multiplier), then agents are spawned
  Case 1: total nodes = 2, multiplier = 1.5 (Node 0 to Node 2)
  Case 2: total nodes = 2, multiplier = 3 (Node 0 to Node 5, with nodes marked for future use)
  Agents in diagram: Sentinel id 2 rank 0, Sentinel id 8 rank 1, Bookkeeper id 2 rank 0, Bookkeeper id 12 rank 5
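
The two cases can be checked with a little arithmetic: 2 x 1.5 gives 3 nodes and 2 x 3 gives 6 nodes. The sketch below models the query fields named on the slide. The class name, field names, units, and the rounding-up rule for fractional products are all assumptions made for illustration, not the middleware's actual schema.

```python
import math
from dataclasses import dataclass


@dataclass
class ResourceQuery:
    """Hypothetical container for the query fields listed on the slide."""
    cpu_architecture: str
    os: str
    memory_mb: int       # unit is an assumption
    disk_mb: int         # unit is an assumption
    total_nodes: int     # nodes the user job itself needs
    multiplier: float    # over-allocation factor

    def nodes_requested(self):
        # Rounding up is an assumption; both cases on the slide are exact.
        return math.ceil(self.total_nodes * self.multiplier)


case1 = ResourceQuery("x86", "linux", 512, 1024, total_nodes=2, multiplier=1.5)
case2 = ResourceQuery("x86", "linux", 512, 1024, total_nodes=2, multiplier=3)
```

Here `case1.nodes_requested()` is 3 and `case2.nodes_requested()` is 6, matching the node counts drawn for Case 1 and Case 2.
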
Job Resumption by a Parent Sentinel
  Agents in diagram: Sentinel id 2 rank 0; Sentinels id 8 to id 11 (ranks 1 to 4), linked by MPI connections; Bookkeeper id 15 rank 4; new Sentinel id 11 rank 4
  (0) Send a new snapshot periodically
  (1) Detect a ping error
  (2) Search for the latest snapshot
  (3) Retrieve the snapshot
  (4) Send a new agent
  (5) Restart a user program
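
Steps (0) to (5) can be sketched as a small in-memory simulation. This is only an illustration of the control flow: the `Bookkeeper` class, the sequence-numbered snapshots, and the `ping` and `spawn` callbacks are hypothetical stand-ins, not the UWAgents or AgentTeamwork API.

```python
class Bookkeeper:
    """Toy snapshot store keyed by MPI rank (hypothetical)."""

    def __init__(self):
        self.snapshots = {}  # rank -> list of (sequence number, state)

    def store(self, rank, seq, state):
        # Step (0): a sentinel periodically sends a new snapshot.
        self.snapshots.setdefault(rank, []).append((seq, state))

    def latest(self, rank):
        entries = self.snapshots.get(rank, [])
        return max(entries, key=lambda e: e[0]) if entries else None


def resume_child(rank, ping, bookkeepers, spawn):
    """Parent-side recovery of the child sentinel that handles the given rank."""
    if ping(rank):
        return None  # child is alive, nothing to do
    # Step (1): the ping failed. Step (2): search for the latest snapshot.
    found = [bk.latest(rank) for bk in bookkeepers]
    found = [s for s in found if s is not None]
    # Step (3): retrieve the newest one, if any bookkeeper holds one.
    snapshot = max(found, key=lambda s: s[0]) if found else None
    # Steps (4) and (5): send a new agent, which restarts the user program.
    return spawn(rank, snapshot)


bk = Bookkeeper()
bk.store(4, 1, "state-1")
bk.store(4, 2, "state-2")
restarted = resume_child(
    4,
    ping=lambda rank: False,                      # simulate a crashed child
    bookkeepers=[bk],
    spawn=lambda rank, snapshot: (rank, snapshot),
)
```

In this run the parent restarts rank 4 from the snapshot with the highest sequence number; with no snapshot available, `spawn` receives `None` and the job would start from its beginning.
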
Job Resumption by a Child Sentinel
  Agents in diagram: Commander id 0; Resource id 1; Sentinel id 2 rank 0; Bookkeeper id 3 rank 0; Sentinel id 8 rank 1; Bookkeeper id 12 rank 1
  (1) No pings for 8 * 5 (= 40 sec); no pings for 12 * 5 (= 60 sec)
  (2) Search for the latest snapshot
  (3) Search for the latest snapshot
  (4) Retrieve the snapshot
  (5) Send a new agent (new Sentinel id 2 rank 0)
  (6) No pings for 2 * 5 (= 10 sec)
  (7) Search for the latest snapshot
  (8) Search for the latest snapshot
  (9) Retrieve the snapshot
  (10) Send a new agent (new Commander id 0)
  (11) Detect a ping error
  (12) Restart a new resource agent from its beginning (new Resource id 1)
  (13) Detect a ping error and follow the same child resumption procedure as in Job Resumption by a Parent Sentinel
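
The waiting times on this slide (8 * 5, 12 * 5, 2 * 5) coincide with the agent ids 8, 12, and 2 multiplied by 5 seconds. Assuming that reading is right, each agent waits its own id times a fixed unit before treating its parent as dead, so the child with the smallest id reacts first. The names below are hypothetical.

```python
PING_UNIT_SEC = 5  # the factor used in all three timeouts on the slide


def ping_timeout_sec(agent_id):
    """Seconds without pings before an agent assumes its parent has failed (assumed rule)."""
    return agent_id * PING_UNIT_SEC


def first_to_react(child_ids):
    """Child whose timeout expires first, i.e. the one with the smallest id."""
    return min(child_ids, key=ping_timeout_sec)
```

Under this rule, sentinel id 8 times out after 40 seconds and bookkeeper id 12 after 60 seconds, so among the children of a failed parent the lower-id agent would start the recovery.
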
Computational Granularity 1
Computational Granularity 2
Computational Granularity 3
Performance Evaluation - Series
Performance Evaluation - RayTracer
Performance Evaluation – MolDyn
Overhead of Job Resumption
Conclusions
  Our focus
    A decentralized job execution and fault-tolerant environment
    Applications not restricted to the master-slave or parameter-sweeping model
  Applications
    40,000 doubles x 10,000 floating-point operations
    Moderate data transfer combined with massive/collective communication
    At least three times larger than its computational granularity
  Future work
    UWAgents enhancement: over-gateway deployment and security
    Programming support: preprocessor implementation
    Job scheduling algorithms: priority-based agent migration