CSS434 Grid Computing. Textbook: No Corresponding Chapters. Professor: Munehiro Fukuda. A portion of these slides were compiled from The Grid: Blueprint for.


1 CSS434 Grid Computing
Textbook: no corresponding chapters
Professor: Munehiro Fukuda
A portion of these slides were compiled from The Grid: Blueprint for a New Computing Infrastructure.

2 Network Infrastructure
Users first log in to their own organization's systems, locally or remotely.
If they are affiliated with other organizations, they can log in from their main system to some other systems. (This gives them an opportunity to use those resources in parallel.)
Problems:
  Users must orchestrate job execution among the resources they use.
  Should those resources be limited to such a handful of researchers?
Figure label: High-speed information highway

3 Purposes of Computational Grid
Use computing resources connected to the high-speed information highway as if we were using the electric power grid.
  Only 30% utilization in academic/commercial environments.
  Many applications have only episodic requirements.
  So, why don't we share computational resources?
Computational results and data should also be made available to all users.
Users:
  Computational scientists and engineers
  Experimental scientists
  Associations and corporations
  Training and education
  Consumers (e-commerce)

4 Grid Applications
Category | Examples | Characteristics
Distributed supercomputing | DIS and stellar dynamics | Very large problems needing lots of computing resources at a time
High throughput | Chip design and parameter studies | Harnessing many idle resources to increase aggregate throughput
On demand | Medical instrumentation | Allocating special resources dynamically
Data intensive | Sky survey | Using distributed data and needing high-volume data flows
Collaborative | Collaborative design, education | Supporting communication or collaborative work

5 Grid Services Architecture
(from a www.globus.org slide)
Applications: High-energy physics data analysis, Regional climate studies, Collaborative engineering, Parameter studies, On-line instrumentation
Application Toolkit Layer: Distributed computing, Data-intensive, Collab. design, Remote viz, Remote control
Grid Services Layer: Information, Resource mgmt, Security, Data access, Fault detection, ...
Grid Fabric Layer: Transport, Multicast, Instrumentation, Control interfaces, QoS mechanisms, ...

6 Programming Model
Uniform Access Paradigm
  Bag of tasks or master-worker (Condor-MW)
  Client-server (NetSolve)
  Object-oriented (Legion)
  Synchronous applications (not suited for massively parallel computation)
Language Support
  MPI-G: message passing (Globus)
  OpenMP: shared memory
  Math library: remote procedure (NetSolve)
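The bag-of-tasks (master-worker) paradigm listed above can be sketched in a few lines. This is only an illustration on a single machine using Python's standard thread pool, not the Condor-MW API; `run_bag_of_tasks` and `square` are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bag_of_tasks(tasks, worker_fn, num_workers=4):
    """Master: hand independent tasks to a pool of workers and gather results."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() returns results in task order, even though the workers
        # may finish in any order.
        return list(pool.map(worker_fn, tasks))


def square(x):
    """Worker: one independent unit of work (hypothetical example task)."""
    return x * x


results = run_bag_of_tasks(range(5), square)
print(results)  # prints [0, 1, 4, 9, 16]
```

Because the tasks do not communicate with each other, the master only needs to distribute work and collect results, which is why this model tolerates slow or loosely coupled grid resources.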

7 Resource Management
Discovery, Allocation, and Scheduling
Centralized resource manager
  +: easy to manage
  -: a bottleneck
Decentralized resource manager
  A collection of centralized managers (Condor's gateway flocking)
  A combination of meta and local schedulers
Systems | Resource descriptions | Front-end process | Resource manager | Job launcher
Globus | RSL: resource spec. language | Broker and MDS | GRAM
Condor | ClassAd and DAGMan | Schedd Agent | Matchmaker and startd | Sandbox (Starter)
Legion | IDL: interface def. language | Scheduler | Collection | Enactor
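A centralized resource manager that matches resource descriptions against job requests can be sketched as follows. This is a toy model loosely inspired by the idea of two-sided matching, not Condor's ClassAd language or its real matchmaker; the dictionary fields (`memory_mb`, `owner`, `requirements`) and the greedy first-fit policy are assumptions made for the example.

```python
def matches(job, machine):
    """A pair matches only when both sides' requirements hold."""
    return job["requirements"](machine) and machine["requirements"](job)


def matchmake(jobs, machines):
    """Greedy central matchmaker: give each job the first free machine that matches."""
    free = list(machines)
    pairs = []
    for job in jobs:
        for machine in free:
            if matches(job, machine):
                pairs.append((job["name"], machine["name"]))
                free.remove(machine)  # a machine runs at most one job here
                break
    return pairs


# Hypothetical resource descriptions: each machine also states which jobs it accepts.
machines = [
    {"name": "m1", "memory_mb": 512, "requirements": lambda job: job["owner"] != "guest"},
    {"name": "m2", "memory_mb": 2048, "requirements": lambda job: True},
]
# Hypothetical job requests: each job states what it needs from a machine.
jobs = [
    {"name": "j1", "owner": "alice", "requirements": lambda m: m["memory_mb"] >= 1024},
    {"name": "j2", "owner": "guest", "requirements": lambda m: m["memory_mb"] >= 256},
]

pairs = matchmake(jobs, machines)
print(pairs)  # prints [('j1', 'm2')]
```

Here `j2` stays unmatched because the only free machine refuses guest jobs, which illustrates how a resource owner's local policy can be respected by a central manager.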

8 Fault Tolerance
Check-pointing
  At the master (Condor)
  At each node but collected at the master (Catalina)
  Using a whiteboard (Optimal Grid)
Re-execution of faulty worker jobs from the beginning (Bayanihan, Optimal Grid)
Error code (NetSolve)
  The user is responsible for handling errors.
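Two of the strategies above, check-pointing at the master and re-execution of a failed worker job from the beginning, can be combined in a small sketch. It does not reproduce how Condor, Bayanihan, or any other listed system implements them; the JSON checkpoint file, `run_with_checkpoint`, and the simulated failure are all assumptions made for illustration.

```python
import json
import os
import tempfile


def run_with_checkpoint(tasks, worker_fn, checkpoint_path, max_retries=3):
    """Master loop: skip tasks already checkpointed, retry failed ones from scratch."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)  # results saved by an earlier run
    for task in tasks:
        key = str(task)  # JSON object keys must be strings
        if key in done:
            continue
        for _ in range(max_retries):
            try:
                done[key] = worker_fn(task)
                break
            except RuntimeError:
                continue  # re-execute the failed job from the beginning
        else:
            raise RuntimeError(f"task {task} failed {max_retries} times")
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)  # checkpoint at the master after each task
    return done


calls = {}


def flaky_double(x):
    """Hypothetical worker that fails the first time it sees task 2."""
    calls[x] = calls.get(x, 0) + 1
    if x == 2 and calls[x] == 1:
        raise RuntimeError("simulated worker failure")
    return 2 * x


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
results = run_with_checkpoint([1, 2, 3], flaky_double, path)
print(results)  # prints {'1': 2, '2': 4, '3': 6}
```

Running `run_with_checkpoint` a second time with the same path returns the saved results without calling the worker again, which is the point of keeping the checkpoint at the master.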

9 Security
Resources covered with security layers
  Legion (Message/MayI layers)
  Entropia (intercepting all system calls)
Use of commodity tools
  SSL
  Public key
  Security certificate
  Java sandbox
  Kerberos

10 NetSolve
http://icl.cs.utk.edu/netsolve/
RPC-based approach
Clients
  Include a set of APIs that are called as (asynchronous) RPCs
Agents
  Match clients' requests for services with servers
Servers
  Encapsulate remotely accessed numerical libraries
Figure labels: Client, Agent, network of servers (MPP servers, scalar server); arrows: request, choice, reply
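The client, agent, and server roles can be sketched as plain Python objects. This is not the NetSolve API: the class names, the in-process "remote" call, and the least-loaded selection rule are all assumptions, since the slide only says that agents match requests with servers.

```python
class Server:
    """Wraps a set of numerical routines that clients call remotely."""

    def __init__(self, name, load, routines):
        self.name = name
        self.load = load  # hypothetical load figure reported to the agent
        self.routines = routines

    def call(self, routine, *args):
        return self.routines[routine](*args)


class Agent:
    """Matches a client's request for a routine with a suitable server."""

    def __init__(self, servers):
        self.servers = servers

    def choose(self, routine):
        candidates = [s for s in self.servers if routine in s.routines]
        if not candidates:
            raise LookupError(f"no server offers {routine!r}")
        # Assumed policy: pick the least-loaded server that offers the routine.
        return min(candidates, key=lambda s: s.load)


def client_call(agent, routine, *args):
    """Client side: request a server from the agent, then invoke the routine on it."""
    server = agent.choose(routine)
    return server.name, server.call(routine, *args)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


agent = Agent([
    Server("scalar", load=0.9, routines={"dot": dot}),
    Server("mpp", load=0.2, routines={"dot": dot}),
])
chosen, value = client_call(agent, "dot", [1, 2, 3], [4, 5, 6])
print(chosen, value)  # prints mpp 32
```

Keeping the choice of server inside the agent means the client code does not change when servers are added or removed.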

11 Legion
http://legion.virginia.edu/
Legion classes
  Act as managers and make policy
Core objects
  Provide mechanisms that classes use to implement policies: hosts (processors), vaults (memory), contexts, binding agents, etc.
Per-program scheduling
  Participating sites can enforce their local policies.
  Users can choose a scheduling policy.
Figure labels: Class Host, Class tty, Resources, Resource database, Scheduler, Prog, Enactor, Collection; arrows: search, request, reserve
  Names are converted to Legion object IDs by context objects.
  Legion object IDs are converted to Legion object addresses by binding agents.
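The two-step naming scheme, where context objects map names to Legion object IDs and binding agents map those IDs to object addresses, can be sketched with two lookup tables. The classes, the ID string, and the address format below are hypothetical and do not reflect Legion's actual interfaces.

```python
class ContextObject:
    """Maps human-readable names to object IDs."""

    def __init__(self):
        self._ids = {}

    def bind(self, name, object_id):
        self._ids[name] = object_id

    def lookup(self, name):
        return self._ids[name]


class BindingAgent:
    """Maps object IDs to the current address of the object."""

    def __init__(self):
        self._addresses = {}

    def register(self, object_id, address):
        self._addresses[object_id] = address

    def resolve(self, object_id):
        return self._addresses[object_id]


def resolve_name(context, agent, name):
    """Name -> object ID (context object) -> object address (binding agent)."""
    return agent.resolve(context.lookup(name))


context = ContextObject()
agent = BindingAgent()
context.bind("/hosts/alpha", "id-0001")       # hypothetical name and ID
agent.register("id-0001", "10.0.0.5:4000")    # hypothetical address
address = resolve_name(context, agent, "/hosts/alpha")

# If the object migrates, only the binding changes; the name and ID stay the same.
agent.register("id-0001", "10.0.0.9:4000")
moved = resolve_name(context, agent, "/hosts/alpha")
print(address, moved)  # prints 10.0.0.5:4000 10.0.0.9:4000
```

Separating the ID from the address is what lets an object move between hosts without invalidating the names users hold.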

12 Condor
http://www.cs.wisc.edu/condor/
A: User's local agent
R: Each computer resource
M: Central manager
I/O is forwarded to the user's home machine.

13 AgentTeamwork at UWB
Architecture

14 Paper Review by Students
Globus, Legion, Condor, NetSolve
Discussions
  What programming or execution model is each system based on?
  What resource allocation and scheduling algorithm does each system use?
  Are they fault-tolerant?
  Do they have any special security features of their own?

