Introduction to the Grid


1 Introduction to the Grid
Peter Kacsuk MTA SZTAKI

2 Agenda
From Metacomputers to the Grid
Grid Applications
Job Managers in the Grid – Condor
Grid Middleware – Globus
Grid Application Environments
© Peter Kacsuk

3 Grid Computing in the News
© Peter Kacsuk Credit to Fran Berman

4 Real World Distributed Applications
3.8M users in 226 countries
1200 CPU years/day
38 TF sustained (the Japanese Earth Simulator is 40 TF peak)
1.7 zettaFLOP over the last 3 years (10^21, beyond peta and exa ...)
Highly heterogeneous: >77 different processor types
© Peter Kacsuk Credit to Fran Berman

5 Progress in Grid Systems
[Diagram: paths converging on Grid systems — supercomputing (PVM/MPI) and clusters lead to high-performance cluster computing; network computing (sockets) and client/server lead to high-throughput computing (Condor); web computing (scripts) and OO computing (CORBA) lead to the Object Web and Web Services; these converge in Grid systems: Globus, OGSA, the Semantic Grid]
© Peter Kacsuk

6 Progress to the Grid
[Chart: performance (GFlops) growing from single-processor computers through supercomputers and clusters to the metacomputer]
© Peter Kacsuk

7 Original motivation for metacomputing
Grand-challenge problems run for weeks or months even on supercomputers and clusters
Various supercomputers/clusters must be connected by wide-area networks in order to solve grand-challenge problems in a reasonable time
© Peter Kacsuk

8 Original meaning of metacomputing
Metacomputing = supercomputing + wide-area network
Original goal of metacomputing: distributed supercomputing, to achieve higher performance than individual supercomputers/clusters can provide
© Peter Kacsuk

9 Distributed Supercomputing
SF-Express Distributed Interactive Simulation (Caltech, USC/ISI), running across the Caltech Exemplar, NCSA Origin, Maui SP and Argonne SP
Issues:
Resource discovery, scheduling
Configuration
Multiple communication methods
Message passing (MPI)
Scalability
Fault tolerance
© Peter Kacsuk

10 Technologies for metacomputers
Supercomputing + WAN technology + distributed computing => metacomputers
© Peter Kacsuk

11 What is a Metacomputer?
A metacomputer is a collection of computers that:
are heterogeneous in every respect
are geographically distributed
are connected by a wide-area network
form the image of a single computer
Metacomputing means network-based distributed supercomputing
© Peter Kacsuk

12 Further motivations for metacomputing
Better usage of computing and other resources accessible via wide-area networks
Various computers must be connected by wide-area networks in order to exploit their spare cycles
Various special devices must be accessible over wide-area networks for collaborative work
© Peter Kacsuk

13 Motivations for grid computing
To form a computational grid, similar to information access on the web
Any computers/devices must be connectable by wide-area networks in order to form a universal source of computing power
Grid = generalised metacomputing
© Peter Kacsuk

14 Technologies that led to the Grid
Supercomputing + network technology + web technology => Grid
© Peter Kacsuk

15 What is a Grid?
A Grid is a collection of computers, storage and other devices that:
are heterogeneous in every respect
are geographically distributed
are connected by a wide-area network
form the image of a single computer
Generalised metacomputing means network-based distributed computing
© Peter Kacsuk

16 Application areas of the Grid
Distributed supercomputing
High-throughput computing (parameter studies)
Virtual laboratory
Collaborative design
Data-intensive applications (sky surveys, particle physics)
Geographic information systems
Teleimmersion
Enterprise architectures
© Peter Kacsuk

17 Distributed Supercomputing
SF-Express Distributed Interactive Simulation (Caltech, USC/ISI), running across the Caltech Exemplar, NCSA Origin, Maui SP and Argonne SP
Issues:
Resource discovery, scheduling
Configuration
Multiple communication methods
Message passing (MPI)
Scalability
Fault tolerance
© Peter Kacsuk

18 High-Throughput Computing
Schedule many independent tasks: parameter studies, data analysis
Issues: resource discovery, data access, scheduling, reservation, security, accounting, code management, deadlines, cost, available machines
Nimrod-G: Monash University
© Peter Kacsuk
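The parameter-study pattern above boils down to expanding a parameter space into many independent tasks that a scheduler can place on any available machine. A minimal sketch of that expansion (illustrative only, not Nimrod-G code; all names are hypothetical):

```python
from itertools import product

def make_tasks(params):
    """Expand a parameter space into one task description per combination."""
    names = sorted(params)
    return [dict(zip(names, values))
            for values in product(*(params[n] for n in names))]

# A toy parameter space: each combination becomes one independent job.
space = {"pressure": [1.0, 2.0], "temperature": [300, 350, 400]}
tasks = make_tasks(space)
print(len(tasks))  # 6 independent, mutually unordered tasks
```

Because the tasks share no state, the only coordination the Grid must provide is matching each one to an idle resource.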

19 High-throughput Computing: Condor
Goal: exploit the spare cycles of computers in the Grid
Realization step 1: turn your desktop into a personal Condor machine
[Diagram: jobs flow from your workstation into your personal Condor]
© Peter Kacsuk Credit to Miron Livny

20 High-throughput Computing: Condor
Realization step 2: create your institute-level Condor pool
[Diagram: jobs flow from your personal Condor on your workstation into the SZTAKI cluster Condor pool]
© Peter Kacsuk Credit to Miron Livny

21 High-throughput Computing: Condor
Realization step 3: connect "friendly" Condor pools
[Diagram: your personal Condor and the SZTAKI cluster Condor pool share jobs with the friendly BME Condor pool]
© Peter Kacsuk Credit to Miron Livny

22 High-throughput Computing: Condor
Realization step 4: temporary exploitation of Grid resources via glide-ins
[Diagram: jobs glide in from your personal Condor, the SZTAKI cluster Condor pool and the friendly BME Condor pool to Hungarian Grid resources managed by PBS, LSF and Condor]
© Peter Kacsuk Credit to Miron Livny
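A job handed to Condor in the steps above is typically described by a submit file. A minimal sketch (the executable and file names are hypothetical):

```
universe   = vanilla
executable = analyze
arguments  = $(Process)
output     = out.$(Process)
error      = err.$(Process)
log        = analyze.log
queue 100
```

The `queue 100` line submits 100 independent instances; Condor matches each one to an idle machine in the pool, which is exactly how spare cycles get exploited.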

23 NUG30 – solved!
Solved in 7 days instead of an estimated 10.9 years
[Chart: number of workers over the first 600,000 seconds of the run]
© Peter Kacsuk Credit to Miron Livny

24 The Condor model
[Diagram: the resource provider publishes its configuration description and the resource requestor publishes its resource requirement as ClassAds to the match-maker; matched parties then communicate directly over TCP/IP]
Your program moves to the resource(s)
Security is a serious problem!
© Peter Kacsuk
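The publish-and-match cycle above can be sketched in a few lines. This is a toy model of ClassAd-style matchmaking, not the real ClassAd language: both sides publish attributes plus a requirements predicate, and a match needs both predicates to accept the other party.

```python
# Toy ClassAd matchmaking: an "ad" is a dict of attributes plus a
# "requirements" predicate over the other side's ad.

def match(request, offers):
    """Return the first resource offer where request and offer
    each satisfy the other's requirements, or None."""
    for offer in offers:
        if request["requirements"](offer) and offer["requirements"](request):
            return offer
    return None

offers = [
    {"name": "ws1", "arch": "INTEL", "memory_mb": 512,
     "requirements": lambda req: req.get("owner") != "untrusted"},
    {"name": "ws2", "arch": "SPARC", "memory_mb": 2048,
     "requirements": lambda req: True},
]
request = {"owner": "kacsuk",
           "requirements": lambda off: off["arch"] == "SPARC"
                                       and off["memory_mb"] >= 1024}
chosen = match(request, offers)
print(chosen["name"])  # ws2
```

The symmetry is the key design point: providers can refuse requestors just as requestors can refuse providers, which is also where the security concerns noted on the slide enter.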

25 Generic Grid Architecture
Appl. Dev. Environments Analysis & Visualisation Collaboratories Problem Solving Environments Grid Portals Application Environments Application Support Grid Common Services Grid Fabric - local resources MPI CONDOR CORBA JAVA/JINI OLE DCOM Other... Information Services Global Sceduling Data Access Caching Resource Co-Allocation Authentication Authorisation Monitoring Fault Management Policy Accounting Resource Management CPUs Tertiary Storage Online Storage Communications Scientific Instruments © Peter Kacsuk

26 Middleware concepts
Goal of the middleware: to turn a radically heterogeneous environment into a virtually homogeneous one
Three main concepts:
Toolkit (mix-and-match) approach – Globus
Object-oriented approach – Legion, Globe
Commodity Internet/WWW approach – Web services
© Peter Kacsuk

27 Globus Layered Architecture
Applications
Application Toolkits: GlobusView, Testbed Status, DUROC, MPI, Condor-G, HPC++, Nimrod/G, globusrun
Grid Services: Nexus, GRAM, I/O, MDS-2, GSI, GSI-FTP, HBM, GASS
Grid Fabric: Condor, MPI, TCP, UDP, LSF, PBS, NQE, Linux, NT, Solaris, DiffServ
© Peter Kacsuk

28 Globus Approach: Hourglass
High-level services: resource brokers, resource co-allocators (analogous to TCP, FTP, HTTP, etc.)
Neck of the hourglass: the GRAM protocol (analogous to the Internet Protocol)
Low-level tools: Condor, LSF, NQE, PBS, etc. (analogous to Ethernet, ATM, FDDI, etc.)
© Peter Kacsuk

29 Globus hierarchical resource management architecture
The application submits an RSL request to brokers: "Run DIS with 100K entities"
Brokers refine it, using the information service (MDS-2), into ground RSL: "80 nodes on the Argonne SP-2, 256 nodes on the CIT Exemplar"
Co-allocators split this into simple ground RSL requests: "Run SF-Express on 80 nodes", "Run SF-Express on 256 nodes"
GRAM forwards each request to the local resource managers (e.g. the Argonne and SDSC resource managers)
© Peter Kacsuk
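A co-allocated ground RSL request of the kind described above might look roughly like the following multirequest; this is an illustrative sketch, and the resource manager contacts and executable path are hypothetical:

```
+ ( & (resourceManagerContact="sp2.anl.gov")
      (count=80)
      (executable="/usr/local/bin/sf-express") )
  ( & (resourceManagerContact="exemplar.cit.edu")
      (count=256)
      (executable="/usr/local/bin/sf-express") )
```

The `+` combines per-site conjunctions (`&` blocks) so that a co-allocator can hand each block to the appropriate local GRAM.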

30 The Globus Model
[Diagram: the resource provider publishes its configuration description to the information system (MDS-2) via the MDS-2 API; the resource requestor discovers the resource there and submits to it through the GRAM API]
Your program moves to the resource(s)
Security is a serious problem!
© Peter Kacsuk

31 "Standard" MDS Architecture (MDS-2)
Each resource runs a standard information service (GRIS) which speaks LDAP and provides information about that resource (no searching)
The GIIS is an index node providing a "caching" service, much like a web search engine: resources register with the GIIS, and the GIIS pulls information from their GRIS services when a client request arrives and the cache has expired
The GIIS provides the collective-level indexing/searching function; index nodes can be designed and optimized for various requirements
[Diagram: Clients 1 and 2 request info directly from the GRIS on resources A and B; Client 3 uses the GIIS, whose cache contains info from A and B, for searching collective information]
© Peter Kacsuk
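The pull-on-expiry behaviour described above is the essence of the GIIS. A toy model (not the real MDS/LDAP protocol; all names are invented for illustration):

```python
import time

class GIISCache:
    """Toy GIIS: serves cached per-resource info and re-pulls from a
    resource's GRIS callback only when the cached entry has expired."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.cache = {}        # resource name -> (timestamp, info)
        self.registered = {}   # resource name -> GRIS callback

    def register(self, name, gris_callback):
        self.registered[name] = gris_callback

    def lookup(self, name):
        now = time.monotonic()
        entry = self.cache.get(name)
        if entry is None or now - entry[0] > self.ttl:
            info = self.registered[name]()   # pull from GRIS on miss/expiry
            self.cache[name] = (now, info)
            return info
        return entry[1]

pulls = {"A": 0}
def gris_a():
    pulls["A"] += 1
    return {"cpus": 4}

giis = GIISCache(ttl_seconds=60)
giis.register("A", gris_a)
giis.lookup("A")
giis.lookup("A")      # served from the cache, no second pull
print(pulls["A"])     # 1
```

This is why the slide compares the GIIS to a web search engine: clients query one index, and the index amortizes the cost of contacting every resource.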

32 Grid Security Infrastructure (GSI)
PKI (CAs and certificates) for credentials
SSL (Secure Sockets Layer) for authentication and message protection
Proxies and delegation (GSI extensions to SSL) for secure single sign-on
© Peter Kacsuk

33 Grid application environments
Integrated environments: Cactus, P-GRADE (Parallel Grid Run-time and Application Development Environment)
Application-specific environments: NetSolve
Problem solving environments
Grid portals
© Peter Kacsuk

34 A Collaborative Grid Environment based on Cactus
[Diagram: Grid-enabled Cactus runs on distributed machines (T3E in Garching, Origin at NCSA) connected by Globus; simulations are launched from the Cactus portal; data flows via DataGrid/DPSS, HTTP and HDF5 with downsampling and isosurfaces; remote visualisation of data from previous simulations in a Vienna café, remote viz in St Louis, remote viz and steering from Berlin, remote steering and monitoring from an airport]
© Peter Kacsuk Credit to Ed Seidel

35 P-GRADE: Software Development and Execution
Editing, debugging, performance analysis and execution on the Grid
© Peter Kacsuk

36 Nowcast Meteorology Application in P-GRADE
[Diagram: the application's process graph, with component processes replicated 25x, 10x, 25x and 5x]
© Peter Kacsuk

37 Performance visualisation in P-GRADE
© Peter Kacsuk

38 Nowcast Meteorology Application in P-GRADE
[Diagram: the process graph partitioned into five jobs (1st to 5th), with component processes replicated 25x, 10x, 25x and 5x]
© Peter Kacsuk

39 Layers of TotalGrid
P-GRADE
PERL-GRID
Condor or SGE
PVM or MPI
Internet / Ethernet
© Peter Kacsuk

40 PERL-GRID
A thin layer for Grid-level job management between P-GRADE and various local job managers (Condor, SGE, etc.) and for file staging
Applied in the Hungarian Cluster Grid
© Peter Kacsuk

41 Hungarian Cluster Grid Initiative
Goal: to connect 99 new clusters of the Hungarian higher-education institutions into a Grid
Each cluster contains 20 PCs and a network server PC
Daytime: the components of the clusters are used for education
At night: all the clusters are connected into the Hungarian Grid via the Hungarian academic network (2.5 Gbit/s)
Total Grid capacity by the end of 2003: 2079 PCs
Current status: about 400 PCs already connected at 8 universities
Condor-based Grid system over a VPN (Virtual Private Network)
Open Grid: other clusters can join at any time
© Peter Kacsuk

42 Structure of the Hungarian Cluster Grid
[Diagram: in 2003, 99 clusters of 21 PCs each (2079 PCs total), each running Condor => TotalGrid, connected by the 2.5 Gb/s academic Internet backbone]
© Peter Kacsuk

43 Problem Solving Environments
Examples: problem solving environments for computational chemistry (ECCE' from Pacific Northwest National Laboratory); application web portals
Issues: remote job submission, monitoring and control; resource discovery; distributed data archives; security; accounting
© Peter Kacsuk

44 Grid Portals
GridPort (https://gridport.npaci.edu)
Grid Resource Broker (GRB)
Grid Portal Development Kit (GPDK)
Genius
© Peter Kacsuk

45 GPDK © Peter Kacsuk

46 Genius © Peter Kacsuk

47 Summary
The Grid is a new technology which integrates supercomputing, wide-area network technology and WWW technology
The computational Grid will lead to a new infrastructure similar to the electrical power grid
This infrastructure will have a tremendous influence on the Information Society
© Peter Kacsuk

