1
Introduction to the Grid
Peter Kacsuk MTA SZTAKI
2
Agenda
From Metacomputers to the Grid
Grid Applications
Job Managers in the Grid - Condor
Grid Middleware - Globus
Grid Application Environments
© Peter Kacsuk
3
Grid Computing in the News
© Peter Kacsuk Credit to Fran Berman
4
Real World Distributed Applications
3.8M users in 226 countries
1200 CPU years/day
38 TF sustained (the Japanese Earth Simulator is 40 TF peak)
1.7 zettaflops over the last 3 years (10^21, beyond peta and exa ...)
Highly heterogeneous: >77 different processor types
© Peter Kacsuk Credit to Fran Berman
5
Progress in Grid Systems
[Diagram: progress in Grid systems. Parallel threads of development (supercomputing with PVM/MPI and high-performance computing; network computing with sockets, clusters and cluster computing; client/server and OO computing with CORBA leading to the Object Web; Web computing with scripts leading to Web Services; high-throughput computing with Condor) converge through Globus and OGSA toward the Semantic Grid and today's Grid systems.] © Peter Kacsuk
6
Progress to the Grid
[Chart: performance in GFlops growing from single-processor computers through supercomputers and clusters to the metacomputer] © Peter Kacsuk
7
Original motivation for metacomputing
Grand challenge problems run for weeks or months even on supercomputers and clusters. Various supercomputers/clusters must be connected by wide area networks in order to solve grand challenge problems in reasonable time. © Peter Kacsuk
8
Original meaning of metacomputing
Metacomputing = supercomputing + wide area network. Original goal of metacomputing: distributed supercomputing to achieve higher performance than individual supercomputers/clusters can provide. © Peter Kacsuk
9
Distributed Supercomputing
SF-Express Distributed Interactive Simulation (Caltech, USC/ISI), running across the Caltech Exemplar, NCSA Origin, Maui SP, and Argonne SP.
Issues: resource discovery, scheduling; configuration; multiple communication methods; message passing (MPI); scalability; fault tolerance.
© Peter Kacsuk
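The sites in such a run cooperate by message passing. As a minimal illustration (my own sketch, not SF-Express code; mpi4py is assumed as the MPI binding), two ranks exchanging a piece of simulation state:

```python
# Minimal point-to-point message passing sketch using mpi4py
# (illustrative only; SF-Express itself is not shown in the slides).
# Run with: mpiexec -n 2 python demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Rank 0 sends a chunk of simulation state to rank 1.
    state = {"entities": 100_000, "step": 1}
    comm.send(state, dest=1, tag=11)
elif rank == 1:
    # Rank 1 receives it and reports.
    state = comm.recv(source=0, tag=11)
    print(f"rank 1 received step {state['step']} with {state['entities']} entities")
```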
10
Technologies for metacomputers
Supercomputing + WAN technology + distributed computing => metacomputers. © Peter Kacsuk
11
What is a Metacomputer? A metacomputer is a collection of
computers that are heterogeneous in every aspect, geographically distributed, connected by a wide-area network, and that together form the image of a single computer. Metacomputing means network-based distributed supercomputing. © Peter Kacsuk
12
Further motivations for metacomputing
Better usage of computing and other resources accessible via wide area networks. Various computers must be connected by wide area networks in order to exploit their spare cycles. Various special devices must be accessed via wide area networks for collaborative work. © Peter Kacsuk
13
Motivations for grid computing
To form a computational grid that does for computing power what the web did for information access. Any computers/devices must be connected by wide area networks in order to form a universal source of computing power. Grid = generalised metacomputing. © Peter Kacsuk
14
Technologies that led to the Grid
Supercomputing + network technology + web technology => Grid. © Peter Kacsuk
15
What is a Grid? A Grid is a collection of
computers, storage and other devices that are heterogeneous in every aspect, geographically distributed, connected by a wide-area network, and that together form the image of a single computer. Generalised metacomputing means network-based distributed computing. © Peter Kacsuk
16
Application areas of the Grid
Distributed supercomputing
High-throughput computing (parameter studies)
Virtual laboratory
Collaborative design
Data-intensive applications (sky survey, particle physics)
Geographic information systems
Teleimmersion
Enterprise architectures
© Peter Kacsuk
17
Distributed Supercomputing
SF-Express Distributed Interactive Simulation (Caltech, USC/ISI), running across the Caltech Exemplar, NCSA Origin, Maui SP, and Argonne SP.
Issues: resource discovery, scheduling; configuration; multiple communication methods; message passing (MPI); scalability; fault tolerance.
© Peter Kacsuk
18
High-Throughput Computing
Schedule many independent tasks: parameter studies, data analysis.
Issues: resource discovery, data access, scheduling, reservation, security, accounting, code management.
[Diagram: Nimrod-G broker trading off deadline, cost, and available machines]
© Peter Kacsuk Nimrod-G: Monash University
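To make the parameter-study pattern concrete, here is a minimal sketch (my own illustration, not Nimrod-G's actual interface): one independent task per parameter combination, farmed out to whatever workers are available.

```python
# Minimal parameter-study sketch: one independent task per parameter
# combination (my own illustration; not the Nimrod-G interface).
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def simulate(pressure: float, temperature: float) -> float:
    # Stand-in for a real simulation executable staged to a Grid node.
    return pressure * temperature

if __name__ == "__main__":
    pressures = [1.0, 2.0, 5.0]
    temperatures = [250.0, 300.0, 350.0]
    combos = list(product(pressures, temperatures))
    # Every combination is an independent job, so the sweep can use as
    # many machines as the scheduler discovers.
    with ProcessPoolExecutor() as pool:
        futures = [pool.submit(simulate, p, t) for p, t in combos]
        for (p, t), f in zip(combos, futures):
            print(f"p={p}, T={t} -> {f.result()}")
```

Because no task depends on any other, the same structure scales from a local pool to Grid resources.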
19
High-Throughput Computing: Condor
Goal: exploit the spare cycles of computers in the Grid. Realization steps (1): turn your desktop (your workstation) into a personal Condor machine that manages your jobs. © Peter Kacsuk Credit to Miron Livny
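For a flavour of what submitting to a personal Condor looks like, a sketch using the htcondor Python bindings (assuming a recent HTCondor installation with a local schedd; the executable and file names are illustrative):

```python
# Minimal job submission sketch using the htcondor Python bindings
# (assumes HTCondor is installed and a schedd is running locally;
# executable and file names are illustrative).
import htcondor

sub = htcondor.Submit({
    "executable": "/usr/bin/sleep",   # any program present on the machine
    "arguments": "60",
    "output": "job.out",
    "error": "job.err",
    "log": "job.log",
})

schedd = htcondor.Schedd()            # talk to the local schedd
result = schedd.submit(sub, count=1)  # queue one copy of the job
print("submitted cluster", result.cluster())
```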
20
High-Throughput Computing: Condor
Realization steps (2): create your institute-level Condor pool (the SZTAKI cluster Condor pool), to which your workstation's personal Condor forwards jobs. © Peter Kacsuk Credit to Miron Livny
21
High-Throughput Computing: Condor
Realization steps (3): connect "friendly" Condor pools, so jobs can flow between your workstation's personal Condor, the SZTAKI cluster Condor pool, and the friendly BME Condor pool. © Peter Kacsuk Credit to Miron Livny
22
Realization steps (4): temporary exploitation of Grid resources through glide-ins: Condor jobs from your workstation's personal Condor, the SZTAKI cluster Condor pool, and the friendly BME Condor pool glide in to Hungarian Grid resources managed by PBS, LSF, and Condor. © Peter Kacsuk Credit to Miron Livny
23
NUG30 - Solved!
Solved in 7 days instead of 10.9 years. [Chart: number of workers over the first 600K seconds] © Peter Kacsuk Credit to Miron Livny
24
The Condor model
Resource providers publish their configuration description as ClassAds to the match-maker; resource requestors publish their resource requirements; after a match, requestor and provider communicate directly over TCP/IP. Your program moves to the resource(s). Security is a serious problem! © Peter Kacsuk
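A toy sketch of the matchmaking idea (a deliberate simplification; real ClassAds are a full expression language evaluated from both the job's and the machine's side):

```python
# Toy ClassAd-style matchmaking (a simplification of Condor's model).
machines = [
    {"name": "n1", "os": "LINUX", "memory_mb": 2048},
    {"name": "n2", "os": "SOLARIS", "memory_mb": 512},
]
jobs = [
    {"id": 1, "requirements": lambda m: m["os"] == "LINUX" and m["memory_mb"] >= 1024},
    {"id": 2, "requirements": lambda m: m["memory_mb"] >= 256},
]

# The matchmaker pairs each job with the first machine whose advertised
# attributes satisfy the job's requirements expression.
for job in jobs:
    match = next((m for m in machines if job["requirements"](m)), None)
    print(f"job {job['id']} -> {match['name'] if match else 'no match'}")
```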
25
Generic Grid Architecture
Application Environments: Appl. Dev. Environments, Analysis & Visualisation, Collaboratories, Problem Solving Environments, Grid Portals
Application Support: MPI, CONDOR, CORBA, JAVA/JINI, OLE DCOM, Other...
Grid Common Services: Information Services, Global Scheduling, Data Access, Caching, Resource Co-Allocation, Authentication, Authorisation, Monitoring, Fault Management, Policy, Accounting, Resource Management
Grid Fabric - local resources: CPUs, Tertiary Storage, Online Storage, Communications, Scientific Instruments
© Peter Kacsuk
26
Middleware concepts
Goal of the middleware: to turn a radically heterogeneous environment into a virtually homogeneous one. Three main concepts: the toolkit (mix-and-match) approach - Globus; the object-oriented approach - Legion, Globe; the commodity Internet/WWW approach - Web services. © Peter Kacsuk
27
Globus Layered Architecture
Applications
Application Toolkits: GlobusView, Testbed Status, DUROC, MPI, Condor-G, HPC++, Nimrod/G, globusrun
Grid Services: Nexus, GRAM, I/O, MDS-2, GSI, GSI-FTP, HBM, GASS
Grid Fabric: Condor, MPI, TCP, UDP, LSF, PBS, NQE, Linux, NT, Solaris, DiffServ
© Peter Kacsuk
28
Globus Approach: Hourglass
High-level services (resource brokers, resource co-allocators) sit above the neck; the GRAM protocol is the neck; low-level tools and local schedulers (Condor, LSF, NQE, PBS, etc.) sit below. This mirrors the Internet hourglass: TCP, FTP, HTTP, etc. above the Internet protocol, with Ethernet, ATM, FDDI, etc. underneath. © Peter Kacsuk
29
Globus hierarchical resource management architecture
An application passes RSL ("Run DIS with 100K entities") to brokers. Brokers, consulting the information service (MDS-2), refine it into ground RSL ("80 nodes on the Argonne SP-2, 256 nodes on the CIT Exemplar"). Co-allocators split this into simple ground RSL requests ("Run SF-express on 80 nodes", "Run SF-express on 256 nodes") handed via GRAM to the local resource managers at Argonne and SDSC. © Peter Kacsuk
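A sketch of the refinement step in Python (purely illustrative: the sizing rule and catalogue are invented, and real RSL is a parenthesised attribute syntax such as `& (executable=sf_express) (count=80)` rather than a dict):

```python
# Illustrative broker step: refine an abstract request into ground
# requests against a mocked MDS-2 resource catalogue. The dicts and
# the sizing rule are invented for this sketch.
abstract_rsl = {"executable": "sf_express", "entities": 100_000}

# Pretend MDS-2 answer: nodes free at each site right now.
mds_catalogue = {"ANL SP-2": 80, "CIT Exemplar": 256}

def refine(request, catalogue):
    """Split one abstract request into per-site ground requests."""
    needed = request["entities"] // 1000   # invented sizing rule
    ground = []
    for site, free in catalogue.items():
        take = min(free, needed)
        if take:
            ground.append({"site": site, "executable": request["executable"], "count": take})
            needed -= take
    return ground

for g in refine(abstract_rsl, mds_catalogue):
    print(f"GRAM submit to {g['site']}: run {g['executable']} on {g['count']} nodes")
```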
30
The Globus Model
Resource providers publish their configuration description to the info system (MDS-2) through the MDS-2 API; resource requestors query MDS-2 and submit to the provider through the GRAM API. Your program moves to the resource(s). Security is a serious problem! © Peter Kacsuk
31
“Standard” MDS Architecture (MDS-2)
Resources run a standard information service (GRIS), which speaks LDAP and provides information about the resource (no searching). The GIIS provides a "caching" service much like a web search engine: resources register with the GIIS, and the GIIS pulls information from them when a client asks and the cache has expired. The GIIS provides the collective-level indexing/searching function. [Diagram: clients 1 and 2 request info directly from the GRIS on resources A and B; client 3 uses the GIIS, an index node whose cache contains info from A and B, for searching collective information. Index nodes can be designed and optimized for various requirements.] © Peter Kacsuk
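Because GRIS speaks LDAP, any ordinary LDAP client can query it. A hedged sketch with the ldap3 Python library (the hostname is invented; port 2135 and the base DN below are the customary MDS-2 defaults and may differ per deployment):

```python
# Query a GRIS the way any LDAP client would (sketch only).
# Host is invented; port 2135 and the base DN were the customary
# MDS-2 defaults and may differ per deployment.
from ldap3 import Server, Connection, SUBTREE

server = Server("gris.example.org", port=2135)
conn = Connection(server, auto_bind=True)   # anonymous bind

conn.search(
    search_base="Mds-Vo-name=local, o=grid",
    search_filter="(objectclass=*)",        # everything the GRIS publishes
    search_scope=SUBTREE,
    attributes=["*"],
)
for entry in conn.entries:
    print(entry.entry_dn)
```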
32
Grid Security Infrastructure (GSI)
PKI (CAs and certificates) for credentials; SSL (Secure Socket Layer) for authentication and message protection; proxies and delegation (GSI extensions) for secure single sign-on. © Peter Kacsuk
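A toy model of the delegation chain (deliberately without real cryptography; all names are invented): each proxy is a short-lived credential issued by the one before it, and verification walks the chain back to the CA-issued user certificate.

```python
# Toy model of GSI proxy delegation (no real cryptography; the point
# is the chain-of-signatures structure, not the crypto itself).
import time
from dataclasses import dataclass

@dataclass
class Credential:
    subject: str
    issuer: "Credential | None"   # None marks the CA-issued user cert
    expires: float                # unix timestamp

def delegate(parent: Credential, lifetime_s: float = 3600) -> Credential:
    """Create a short-lived proxy 'signed' by the parent credential."""
    return Credential(parent.subject + "/proxy", parent, time.time() + lifetime_s)

def verify(cred: Credential) -> bool:
    """Walk the chain back to the user cert, checking every expiry."""
    while cred is not None:
        if cred.expires < time.time():
            return False
        cred = cred.issuer
    return True

user = Credential("/O=Grid/CN=Alice", None, time.time() + 365 * 86400)
proxy = delegate(user)            # single sign-on: jobs carry this
print(verify(delegate(proxy)))    # second-level delegation -> True
```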
33
Grid application environments
Integrated environments: Cactus, P-GRADE (Parallel Grid Run-time and Application Development Environment)
Application-specific environments: NetSolve
Problem solving environments
Grid portals
© Peter Kacsuk
34
A Collaborative Grid Environment based on Cactus
Simulations are launched from the Cactus Portal; Grid-enabled Cactus runs on distributed machines (T3E: Garching, Origin: NCSA) via Globus. Remote viz and steering from Berlin; remote viz in St Louis; remote steering and monitoring from an airport; viz of data from previous simulations in a Vienna café. Data is served over HTTP as HDF5 through DataGrid/DPSS, with downsampling and isosurfaces. © Peter Kacsuk Credit to Ed Seidel
35
P-GRADE: Software Development and Execution
[Diagram: edit and debugging, performance analysis, and execution stages, with execution on the Grid] © Peter Kacsuk
36
Nowcast Meteorology Application in P-GRADE
[Workflow diagram: component jobs replicated 25x, 10x, 25x, and 5x] © Peter Kacsuk
37
Performance visualisation in P-GRADE
© Peter Kacsuk
38
Nowcast Meteorology Application in P-GRADE
[Workflow diagram: five component jobs (1st-5th), with replication factors 25x, 10x, 25x, and 5x] © Peter Kacsuk
39
Layers of TotalGrid
P-GRADE
PERL-GRID
Condor or SGE
PVM or MPI
Internet
Ethernet
© Peter Kacsuk
40
PERL-GRID
A thin layer for grid-level job management between P-GRADE and various local job managers (Condor, SGE, etc.) and for file staging. Applied in the Hungarian Cluster Grid. © Peter Kacsuk
41
Hungarian Cluster Grid Initiative
Goal: to connect 99 new clusters of the Hungarian higher education institutions into a Grid
Each cluster contains 20 PCs and a network server PC
Day-time: the components of the clusters are used for education
At night: all the clusters are connected to the Hungarian Grid by the Hungarian academic network (2.5 Gbit/s)
Total Grid capacity by the end of 2003: 2079 PCs
Current status: about 400 PCs already connected at 8 universities
Condor-based Grid system
VPN (Virtual Private Network)
Open Grid: other clusters can join at any time
© Peter Kacsuk
42
Structure of the Hungarian Cluster Grid
2003: 99 x 21-PC Linux clusters, 2079 PCs in total, connected by 2.5 Gb/s Internet; each cluster migrates from Condor to TotalGrid. © Peter Kacsuk
43
Problem Solving Environments
Examples: problem solving environment for computational chemistry (ECCE': Pacific Northwest National Laboratory); application web portals.
Issues: remote job submission, monitoring, and control; resource discovery; distributed data archive; security; accounting.
© Peter Kacsuk
44
Grid Portals GridPort (https://gridport.npaci.edu)
Grid Resource Broker (GRB)
Grid Portal Development Kit (GPDK)
Genius
© Peter Kacsuk
45
GPDK © Peter Kacsuk
46
Genius © Peter Kacsuk
47
Summary
Grid is a new technology which integrates supercomputing, wide-area network technology, and WWW technology. The computational Grid will lead to a new infrastructure similar to the electrical grid. This infrastructure will have a tremendous influence on the Information Society. © Peter Kacsuk