Introduction to Grids and Grid applications Gergely Sipos MTA SZTAKI
What is Grid? ● They are heterogeneous in every aspect ● A Grid is a collection of computers, storages, special devices, services that can dynamically join and leave the Grid Internet ● They are geographically distributed and connected by a wide-area network ● They can be accessed on- demand by a set of users
Why use a Grid? A user has a complex problem that requires many services/resources in order to Reduce computation time Access databases Share equipments Collaborate with other users Internet
Typical Grid application areas Demand for computation capacity High-performance computing (HPC) Shorten the execution time of a single parallel application Reguirement: parallel computing High-throughput computing (HTC) Execute as many similar jobs as possible during a given period Requirement: exploit spare CPU cycles Demand for large data storage With the involvement of physically distributed data bases Demand for collaborative work Integrate several users’ knowledge in order to solve a complex problem
Example: Large Hidron Collider, CERN, Genf ATLAS CMS LHCb Mont Blanc (4810 m) Downtown Geneva ~10-15 PetaBytes /year ~10 8 events/year ~10 3 batch and interactive users
Example: Rolls Royce aircraft engines 1Gb data per engine per flight ● Real time download to basis airport ● Historical data mining ● Across airports ● Evaluation, analysis ● Pattern search on distributed platform ● Preparing maintenance crew
Other examples In silico drug discovery – molecule simulations to find drug candidates Earth science, space research – sharing, analyzing satellite pictures Archeology – digital archives, virtual simulations Weather prediction – data integration, model selection, simulations, evaluation Engineering – simulation of buildings, vehicles in extreme conditions...
Why to use grids? ● Most of these problems were solved by supercomputers and clusters 5-10 years ago. Now grids are used: ● A Grid is able to utilize spare cycles without extra investments ● Grids can share risk, can lower participation cost ● Grids can integrate resources – HW, SW, data ● Grids provide flexible access to resources
GRIDMIDDLEWAREGRIDMIDDLEWARE Visualising Workstation Mobile Access Supercomputer, PC-Cluster Data-storage, Sensors, Experiments Internet, networks Grid vision
Problems to solve ● Standard access to resources ● Computers ● Storage resources ● Special devices ● Software ● Data ● Access policy, security ● Load balancing ● Monitoring of resources ● Monitoring of applications ● Error handling ● Application methodology, programming models ●...
… then where are we now? If “The Grid” vision leads us here… Utility computing Cloud computing E-Infrastructure Cycle scavenging … IBM Grid HP Grid Oracle Grid …
Generic Grid modell Internet Donating free resources Requiring resources Inst1 Inst2 Inst4 Inst3
Two players of the Grid Resource donors = D Resource users = U Relationship between the two characterizes the Grid: if U ~ D generic Grid model if U >> D utility Grid model if U << D desktop Grid model
Generic Grid model is complex… ● Endless possible usage patterns ● Involved security solutions ● Real time information system ● Complex brokering, load balancing architecture ● Flexible programming architecture ● Simplifications were made to achieve something useful: ● Utility grids ● Desktop grids
Utility grids
Utility Grid model Internet Donating resources static 7/24 mode Dynamic resource requirements Inst1 User 1 Inst2 User N Donor and user
Characteristics of the utility Grid model Donors must be “professional” resource providers who provide production service (7/24 mode) Simplification Homogeneous resources Simplification Anybody can use the donated resources for solving her/his own applications Asymmetric relationship between donors and users: U >> D
Utility Grid example: EGEE ● > 200 sites in 40 countries ● ~ CPUs ● ~ 5 PB storage ● 98k jobs/day ● > 200 Virtual Organizations ● gLite middleware ● The World’s largest multi-disciplinary Grid
Utility Grid example: Open Science Grid 30 Virtual Organizations 105 Resources 26 Support Agencies Middleware: – Virtual Data Toolkit (VDT): collection of grid tools – Condor – Globus – VO Management Service
Dynamic Grid ~ 33 sites, ~1400 CPUS Production Grid – Applications from various scientific disciplines – Sites operate 24/7 – Mostly unattended by administrators Middleware: – Advanced Resource Connector (ARC) Utility Grid example: NorduGrid
Utility Grid example: UK National Grid Service ● 4 core sites: Leeds, Oxford, Manchester, Rutherford ● 6 partner sites ● 3 affiliate sites ● Middleware: Globus Toolkit 2 ● Additional SW services: ● OGSA DAI ● Storage Resource Broker ● NGS Oracle DB ● P-GRADE Portal ●...
Architecture of Utility Grids CPUs Tertiary Storage Online Storage Communications Scientific Instruments Resource Management Application Environments Application Support Grid Common Services: Middleware services Grid Fabric - local resources Information Services Resource Sceduling Data Access Caching Resource Co-Allocation Authentication Authorisation Monitoring Fault Management Policy Accounting Instrument Management Analysis & Visualisation Collaboratories Problem Solving Environments Grid Portals MPICONDORCORBAJAVA/JINI OLE DCOM Other... Applictations
Virtual Organizations and Utility Grids Grid: – Resources that host the same middleware version – People who use them VO: – Logical subset of sites and users – Security policy – Dynamic? Atlas VO tens of years WISDOM data challenge few weeks Grid Virtual Organization
Virtual Organizations and Utility Grids Grid: – Resources that host the same middleware version – People who use them VO: – Logical subset of sites and users – Security policy – Dynamic? Atlas VO tens of years WISDOM data challenge VO few weeks Grid Virtual Organization The Grid problem is to enable “coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.” From ”The Anatomy of the Grid” by Ian Foster, Carl Kesselman, Steven Tuecke
Utility Grids: Based on Service Oriented Architecture Registry Service Consumers Services Register an available service Send name & description
Architecture of Service Grids: Service Oriented Architecture Registry Service Consumers Services Request a service Send a description
Architecture of Service Grids: Service Oriented Architecture Registry Service Consumers Services Set (possibly empty) of matching services
Architecture of Service Grids: Service Oriented Architecture Registry Service Consumers Services Request service operation
Architecture of Service Grids: Service Oriented Architecture Registry Service Consumers Services Return result or Error
Architecture of Service Grids: Service Oriented Architecture Registries Service Consumers Services Server programs run on the resources High availability is a must Standard protocols expected Security architecture is complicated Requires expertise at every site
Service Oriented Grids Supercomputing (PVM/MPI) Network Computing (sockets) Cluster computing OO Computing (CORBA) Web Computing (scripts) High-throughput computing High-performance computing Object Web Condor Globus, LCG-2 Web Services Client/server SOA meets the Grid Clusters Semantic Grid Utility Grid Systems
Service Oriented Grids Supercomputing (PVM/MPI) Network Computing (sockets) Cluster computing OO Computing (CORBA) Web Computing (scripts) High-throughput computing High-performance computing Object Web Condor Globus, LCG-2 Web Services Client/server SOA meets the Grid Clusters Semantic Grid Utility Grid Systems The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration Ian Foster, Carl Kesselman, Jeffrey M. Nick, Steven Tuecke
Desktop Grids
Desktop Grid model Internet Dynamic resource donation Work package distribution Company/ univ. server Donor: Company/ univ. or private PC Donor: Company/ Univ. or private PC Donor: Company/ univ. or private PC Application
Characteristics of the desktop Grid model Anybody can donate resources Heterogeneous resources, that dynamically join and leave One or a small number of projects can use the resources Simplification Resources run clients: Expertise only at the server Simplification Asymmetric relationship between donors and users: U << D Advantage: Donating a PC is extremely easy Setting up and maintaining a DG server is much easier than installing the server sw of utility grids
Types of Desktop Grids Global Desktop Grid Aim is to collect resources for grand-challenge scientific problems Example: BOINC Local Desktop Grid Aim is to enable the quick and easy creation of grid for any community (company, univ. city, etc.) to solve their own applications Example: SZTAKI Desktop Grid
SETI: a global desktop grid ● ● 3.8M users in 226 countries ● 1200 CPU years/day ● 38 TF sustained (Japanese Earth Simulator is 32 TF sustained. Currently 30rd on TOP500) ● Highly heterogeneous: >77 different processor types ● Infrastructure is separated now from application: BIONC
SZTAKI Desktop Grid: a global and local DG system Extension of BOINC Simplifies the creation of DG applications Simplifies the installation and maintenance of DG servers Global and local configuration Global installation: Mathematical problem: Search for number dimension Installation package is available Three steps to try and use the system: 1.Donate one PC to the Global system 2.Port application to the global system 3.Set up a DG for your own community Step 1 is easy. SZTAKI helps in steps 2 and 3
SZTAKI Desktop Grid global installation SZTAKI DG global installation:1318 GFlops Largest supercomputer of Hungary: 900 GFlops TOP 500 entry performance:5929 GFlops
FP7 Project - EDGeS ● Enabing Desktop Grids for e-Science ● Two years, started on the 1 st of January, ● Integrate Desktop Grids and Utility Grids ● Including BIONC and gLite technologies (besides other DG tools) ● DG jobs UG ● UG jobs DG ● Integrated portal environment to develop and manage applications on DG, UG platforms ● Coordinator MTA SZTAKI, Hungary – ● Watch for news at
Programming the grid
Available parallelism in grids Utility Grids – Master-slave (parameter study) – Inter-site parallelism – Intra-site parallelism – Workflow – Combination of the above Desktop Grids – Master-slave (parameter study)
Master-slave (parameter study) parallelism Internet Master Work package 1 Work package 2 Work package 3 Work package N
Inter-site parallelism Internet
Intra-site parallelism Internet
Workflow parallelism Internet
Combined parallelism Example: Inter-site and parameter study Internet
User concerns of Grid systems ● How to cope with the variety of these Grid systems? ● How to develop/create new Grid applications? ● How to execute Grid applications? ● How to observe the application execution in the Grid? ● How to tackle performance issues? ● How to port legacy applications ● to Grid systems ● between Grid systems? ● How to execute Grid applications over several Grids?
Goal of the course ● This is not a middleware developer course ● This is a user course specialised on gLite and related technologies. ● Why gLite? ● A utility grid implementation Anybody can access to execute applications ● Widely used, well supported, large community behind it ● Several potential use cases ● Several extra tools „around” it
Conclusion Generic grid model is good, but hard to implement Simplification in practice: Utility grids Desktop grids Existing production installation from both types EGEE, US OSG, NorduGrid, UK NGS Various patterns BOINC, SZTAKI DG Master-slave Course focus on Utility grids gLite middleware and related tools Application development and usage, installation, administration
Thank you for the attention ? sztaki. hu