Introduction to the Grid

Peter Kacsuk, MTA SZTAKI (www.lpds.sztaki.hu)

Agenda
- From Metacomputers to the Grid
- Grid Applications
- Job Managers in the Grid: Condor
- Grid Middleware: Globus
- Grid Application Environments

Grid Computing in the News (Credit: Fran Berman)

Real-World Distributed Applications: SETI@home
- 3.8M users in 226 countries
- 1200 CPU years/day
- 38 TF sustained (the Japanese Earth Simulator is 40 TF peak)
- 1.7 zettaflop over the last 3 years (10^21, beyond peta and exa …)
- Highly heterogeneous: >77 different processor types
(Credit: Fran Berman)

Progress in Grid Systems (diagram): supercomputing (PVM/MPI), network computing (sockets), cluster computing, client/server, Web computing (scripts), OO computing (CORBA) and the Object Web converge, through high-performance and high-throughput computing (Condor, Globus), Web Services and OGSA, towards Grid systems and the Semantic Grid.

Progress to the Grid (diagram): from single-processor computers through clusters and GFlops-range supercomputers to the metacomputer.

Original Motivation for Metacomputing
- Grand challenge problems run for weeks and months even on supercomputers and clusters
- Various supercomputers/clusters must be connected by wide-area networks in order to solve grand challenge problems in a reasonable time

Original Meaning of Metacomputing
- Metacomputing = supercomputing + wide-area network
- Original goal of metacomputing: distributed supercomputing, to achieve higher performance than individual supercomputers/clusters can provide

Distributed Supercomputing
Example: SF-Express Distributed Interactive Simulation (Caltech, USC/ISI), run across the Caltech Exemplar, NCSA Origin, Maui SP and Argonne SP.
Issues:
- Resource discovery, scheduling
- Configuration
- Multiple communication methods
- Message passing (MPI)
- Scalability
- Fault tolerance
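To make the message-passing style concrete, here is a minimal sketch of the kind of MPI program such a distributed run executes. It assumes Python with the mpi4py package; the entity count and partitioning are hypothetical, and in the metacomputing era the wide-area launch itself was handled by Globus-enabled MPI (MPICH-G) rather than a plain mpiexec.

```python
# Minimal MPI sketch (assumes Python + mpi4py on every participating node).
# Each rank takes a share of the simulated entities; rank 0 gathers a summary.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's id
size = comm.Get_size()   # total number of processes across all machines

TOTAL_ENTITIES = 100_000                 # hypothetical problem size
local_entities = TOTAL_ENTITIES // size  # crude static partitioning

shares = comm.gather(local_entities, root=0)
if rank == 0:
    print(f"{size} ranks simulate about {sum(shares)} entities in total")
```

On a single cluster this would be started with something like "mpiexec -n 80 python sketch.py"; making the same program span several supercomputing centres is exactly what the metacomputing infrastructure had to add on top.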

Technologies for Metacomputers (diagram): supercomputing, WAN technology and distributed computing together lead to metacomputers.

What is a Metacomputer?
A metacomputer is a collection of computers that
- are heterogeneous in every aspect
- are geographically distributed
- are connected by a wide-area network
- form the image of a single computer
Metacomputing means network-based distributed supercomputing.

Further Motivations for Metacomputing
- Better usage of computing and other resources accessible via wide-area networks
- Various computers must be connected by wide-area networks in order to exploit their spare cycles
- Various special devices must be accessible over wide-area networks for collaborative work

Motivations for Grid Computing
- To form a computational grid, analogous to the way the Web gives universal access to information
- Any computers/devices must be connected by wide-area networks in order to form a universal source of computing power
- Grid = generalised metacomputing

Technologies that Led to the Grid (diagram): supercomputing, network technology and Web technology together lead to the Grid.

What is a Grid?
A Grid is a collection of computers, storage and other devices that
- are heterogeneous in every aspect
- are geographically distributed
- are connected by a wide-area network
- form the image of a single computer
Generalised metacomputing means network-based distributed computing.

Application Areas of the Grid
- Distributed supercomputing
- High-throughput computing: parameter studies
- Virtual laboratory
- Collaborative design
- Data-intensive applications: sky surveys, particle physics, geographic information systems
- Teleimmersion
- Enterprise architectures

High-Throughput Computing
Schedule many independent tasks: parameter studies, data analysis.
Issues:
- Resource discovery
- Data access
- Scheduling and reservation (deadline, cost, available machines)
- Security
- Accounting
- Code management
Example: Nimrod-G (Monash University)
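The shape of such a parameter study can be sketched in a few lines; the model function and parameter ranges below are hypothetical placeholders, and where this sketch uses local processes, a broker such as Nimrod-G farms the same independent tasks out to Grid resources subject to deadline and cost.

```python
# Toy parameter study: every (pressure, temperature) pair is an independent
# task. Here they run as local processes; on a Grid a broker schedules them.
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def run_model(pressure: float, temperature: float) -> float:
    """Stand-in for one independent simulation run (placeholder maths)."""
    return pressure * temperature

pressures = [1.0, 2.0, 5.0]
temperatures = [280.0, 300.0, 320.0]

if __name__ == "__main__":
    params = list(product(pressures, temperatures))
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_model, *zip(*params)))
    for (p, t), value in zip(params, results):
        print(f"pressure={p} temperature={t} -> {value}")
```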

High-Throughput Computing: Condor
Goal: exploit the spare cycles of computers in the Grid.
Realization step (1): turn your desktop workstation into a personal Condor machine.
(Credit: Miron Livny)
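Once a personal Condor is running, jobs are described in a submit file and queued with condor_submit. The sketch below is a generic example rather than one taken from the slides: the executable and file names are placeholders, and Python is used only to keep all examples in this section in one language.

```python
# Sketch: write a classic Condor submit description and queue it with
# condor_submit (assumes a running personal Condor; names are placeholders).
import subprocess

submit_description = """\
universe   = vanilla
executable = my_simulation
arguments  = --steps 1000
output     = run.out
error      = run.err
log        = run.log
queue
"""

with open("my_job.sub", "w") as handle:
    handle.write(submit_description)

# On success condor_submit reports the cluster id of the queued job.
subprocess.run(["condor_submit", "my_job.sub"], check=True)
```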

High-Throughput Computing: Condor
Realization step (2): create your institute-level Condor pool (in the example, the SZTAKI cluster Condor pool together with the personal Condor on your workstation).
(Credit: Miron Livny)

High-Throughput Computing: Condor
Realization step (3): connect "friendly" Condor pools (in the example, the friendly BME Condor pool alongside the SZTAKI cluster pool and your personal Condor).
(Credit: Miron Livny)

High-Throughput Computing: Condor
Realization step (4): temporary exploitation of Grid resources via glide-ins, reaching PBS-, LSF- and Condor-managed resources of the Hungarian Grid from your personal Condor, the SZTAKI cluster pool and the friendly BME pool.
(Credit: Miron Livny)
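One historical way of reaching such remote Globus-fronted PBS or LSF resources from Condor was a Condor-G submit description of roughly the following shape; the hostname and executable are placeholders, and glide-in then starts Condor daemons on the allocated nodes so that they temporarily join the local pool.

```python
# Sketch of a classic Condor-G submit description targeting a remote Globus
# GRAM jobmanager in front of PBS (hostname/executable are placeholders;
# later Condor versions express this with "universe = grid" instead).
condor_g_submit = """\
universe        = globus
globusscheduler = grid.example.hu/jobmanager-pbs
executable      = my_simulation
output          = run.out
log             = run.log
queue
"""
print(condor_g_submit)
```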

NUG30 Solved!
The NUG30 quadratic assignment problem was solved in 7 days instead of an estimated 10.9 years (plot: number of workers over the first 600K seconds).
(Credit: Miron Livny)

The Condor Model (diagram)
The resource requestor publishes its resource requirements and the resource provider publishes its configuration description as ClassAds; the match-maker pairs them, and your program then moves over TCP/IP to the matched resource(s). Security is a serious problem!
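The matchmaking idea can be illustrated with plain dictionaries standing in for ClassAds; real ClassAds are a richer expression language, so this is only a toy sketch with invented attribute names.

```python
# Toy ClassAd-style matchmaking: dicts stand in for ClassAds, and each
# "requirements" entry is a predicate evaluated against the other ad.
job_ad = {
    "Owner": "kacsuk",
    "requirements": lambda machine: machine["MemoryMB"] >= 512
                                    and machine["Arch"] == "INTEL",
}

machine_ads = [
    {"Name": "ws01", "Arch": "INTEL", "MemoryMB": 256,
     "requirements": lambda job: True},
    {"Name": "node07", "Arch": "INTEL", "MemoryMB": 1024,
     "requirements": lambda job: job["Owner"] != "untrusted"},
]

def matchmake(job, machines):
    """Return the first machine whose ad matches the job and vice versa."""
    for machine in machines:
        if job["requirements"](machine) and machine["requirements"](job):
            return machine
    return None

match = matchmake(job_ad, machine_ads)
print("matched to", match["Name"] if match else "nothing")
```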

Generic Grid Architecture (layers)
- Application environments: application development environments, analysis and visualisation, collaboratories, problem solving environments, Grid portals
- Application support: MPI, Condor, CORBA, Java/Jini, OLE/DCOM, other...
- Grid common services: information services, global scheduling, data access, caching, resource co-allocation, authentication, authorisation, monitoring, fault management, policy, accounting, resource management
- Grid fabric (local resources): CPUs, tertiary storage, online storage, communications, scientific instruments

Middleware Concepts
Goal of the middleware: to turn a radically heterogeneous environment into a virtually homogeneous one.
Three main concepts:
- Toolkit (mix-and-match) approach: Globus
- Object-oriented approach: Legion, Globe
- Commodity Internet/WWW approach: Web services

Globus Layered Architecture
- Applications
- Application toolkits: GlobusView, Testbed Status, DUROC, MPI, Condor-G, HPC++, Nimrod/G, globusrun
- Grid services: Nexus, GRAM, I/O, MDS-2, GSI, GSI-FTP, HBM, GASS
- Grid fabric: Condor, MPI, TCP, UDP, LSF, PBS, NQE, Linux, NT, Solaris, DiffServ

Globus Approach: Hourglass
The GRAM protocol is the neck of an hourglass: above it sit high-level services such as resource brokers and resource co-allocators; below it sit local managers such as Condor, LSF, NQE and PBS. This mirrors the Internet Protocol, with TCP, FTP, HTTP, etc. above and Ethernet, ATM, FDDI, etc. below.

Globus Hierarchical Resource Management Architecture (diagram)
An application request expressed in RSL ("run DIS with 100K entities") goes to brokers, which consult the MDS-2 information service and produce ground RSL ("80 nodes on the Argonne SP-2, 256 nodes on the CIT Exemplar"). Co-allocators split this into simple ground RSL ("run SF-Express on 80 nodes", "run SF-Express on 256 nodes"), which GRAM passes to the local resource managers (e.g. at Argonne and SDSC).
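To give a flavour of such ground RSL, the sketch below builds a co-allocation request as a Python string; the resource manager contacts and executable paths are placeholders, and the multirequest form follows the classic GT2 RSL syntax as best remembered, so treat it as a sketch rather than a verified specification.

```python
# Sketch: a classic GT2-style RSL multirequest asking for 80 nodes at one
# site and 256 at another (contacts and paths are placeholders). Such a
# string would be handed to globusrun, with DUROC co-allocating across the
# two GRAM resource managers.
rsl_multirequest = (
    '+( &(resourceManagerContact="sp2.anl.example.org")'
    '(count=80)(executable="/opt/sfexpress/bin/sfexpress") )'
    ' ( &(resourceManagerContact="exemplar.cit.example.org")'
    '(count=256)(executable="/opt/sfexpress/bin/sfexpress") )'
)
print(rsl_multirequest)
```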

The Globus Model (diagram)
The resource provider publishes its configuration description into the MDS-2 information system; the resource requestor queries MDS-2 through its API and then uses the GRAM API to move your program to the selected resource(s). Security is a serious problem!

"Standard" MDS Architecture (MDS-2)
- Each resource runs a standard information service (GRIS) which speaks LDAP and provides information about that resource (no searching).
- The GIIS provides a "caching" service much like a web search engine: resources register with the GIIS, and the GIIS pulls information from them when a client request arrives and the cache has expired.
- The GIIS provides the collective-level indexing/searching function; index nodes can be designed and optimised for various requirements.
- Diagram: clients 1 and 2 request information directly from the GRIS on resources A and B; client 3 uses the GIIS, whose cache contains information from A and B, to search collective information.
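Because GRIS and GIIS speak LDAP, they can be queried with any LDAP client. The sketch below uses the third-party ldap3 package; the hostname is a placeholder, and the port 2135, search base and attribute names follow the historical MDS-2 defaults as best remembered, so they are assumptions rather than guarantees.

```python
# Sketch: query an MDS-2 GRIS/GIIS over LDAP (assumes the ldap3 package;
# port 2135, the "mds-vo-name=local, o=grid" base and the attribute name
# follow historical MDS-2 conventions and may differ per installation).
from ldap3 import ALL, Connection, Server

server = Server("grid.example.hu", port=2135, get_info=ALL)
conn = Connection(server, auto_bind=True)  # MDS-2 typically allowed anonymous binds

conn.search(
    search_base="mds-vo-name=local, o=grid",
    search_filter="(objectclass=MdsHost)",  # hosts known to this index (assumed class)
    attributes=["Mds-Host-hn"],             # host name attribute (assumed)
)
for entry in conn.entries:
    print(entry)
```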

Grid Security Infrastructure (GSI)
- PKI (CAs and certificates) for credentials
- SSL (Secure Socket Layer) for authentication and message protection
- Proxies and delegation (GSI extensions) for secure single sign-on
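In the Globus Toolkit, single sign-on is realised by creating a short-lived proxy certificate from the user's long-term credential. Below is a minimal sketch driving the standard command-line tools from Python, assuming a Globus Toolkit installation with grid-proxy-init and grid-proxy-info on the PATH.

```python
# Sketch: create and inspect a GSI proxy certificate with the Globus Toolkit
# command-line tools (assumes they are installed and on the PATH; the user
# is prompted for the passphrase of the long-term private key).
import subprocess

# Create a short-lived proxy from the long-term certificate/key pair.
subprocess.run(["grid-proxy-init"], check=True)

# Show the subject and remaining lifetime of the proxy just created.
subprocess.run(["grid-proxy-info"], check=True)
```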

Grid Application Environments
- Integrated environments: Cactus, P-GRADE (Parallel Grid Run-time and Application Development Environment)
- Application-specific environments: NetSolve
- Problem solving environments
- Grid portals

A Collaborative Grid Environment Based on Cactus (diagram)
Grid-enabled Cactus runs on distributed machines (a T3E at Garching, an Origin at NCSA), with simulations launched from the Cactus portal via Globus; data moves over HTTP/HDF5 with DataGrid/DPSS, downsampling and isosurfaces; remote visualisation and steering take place from a Vienna café, St. Louis, Berlin and even an airport, together with visualisation of data from previous simulations.
(Credit: Ed Seidel)

P-GRADE: Software Development and Execution (diagram): editing, debugging, performance analysis and execution on the Grid.

Nowcast Meteorology Application in P-GRADE (diagram of the parallel application with 25x, 10x, 25x and 5x process multiplicities).

Performance Visualisation in P-GRADE (screenshot).

Nowcast Meteorology Application in P-GRADE (diagram): the application is partitioned into five jobs (1st to 5th), with process multiplicities of 25x, 10x, 25x and 5x.

Layers of TotalGrid
- P-GRADE
- PERL-GRID
- Condor or SGE
- PVM or MPI
- Internet
- Ethernet

PERL-GRID
A thin layer providing
- Grid-level job management between P-GRADE and various local job managers such as Condor, SGE, etc.
- file staging
for applications in the Hungarian Cluster Grid.

Hungarian Cluster Grid Initiative
- Goal: to connect 99 new clusters of the Hungarian higher-education institutions into a Grid
- Each cluster contains 20 PCs and a network server PC
- Daytime: the components of the clusters are used for education
- At night: all the clusters are connected to the Hungarian Grid by the Hungarian academic network (2.5 Gbit/s)
- Total Grid capacity by the end of 2003: 2079 PCs
- Current status: about 400 PCs are already connected at 8 universities
- Condor-based Grid system over a VPN (Virtual Private Network)
- Open Grid: other clusters can join at any time

Structure of the Hungarian Cluster Grid (diagram): 99 Linux clusters of 21 PCs each (2079 PCs in total in 2003), each running Condor => TotalGrid and connected by the 2.5 Gb/s academic Internet backbone.

Problem Solving Environments
Examples: problem solving environments for computational chemistry (ECCE', Pacific Northwest National Laboratory), application web portals.
Issues:
- Remote job submission, monitoring and control
- Resource discovery
- Distributed data archives
- Security
- Accounting

Grid Portals
- GridPort (https://gridport.npaci.edu)
- Grid Resource Broker (GRB) (http://sara.unile.it/grb)
- Grid Portal Development Kit (GPDK) (http://www.doesciencegrid.org/Grid)
- Genius (http://www.infn.it/grid)

GPDK (screenshot).

Genius (screenshot).

Summary
- The Grid is a new technology which integrates supercomputing, wide-area network technology and WWW technology
- The computational Grid will lead to a new infrastructure similar to the electrical power grid
- This infrastructure will have a tremendous influence on the Information Society