1
Introduction to Grid Technology
Antun Balaz
SCL, Institute of Physics Belgrade, Serbia
25/03/2011
2
Agenda
NGI_AEGIS, EGI and EGI-InSPIRE
AEGIS infrastructure and management
gLite overview and basic services
Managing jobs with gLite
3
NGI_AEGIS and EGI
EGI.eu created in February 2010
Established as an international consortium based in Amsterdam
Serbia represented in the EGI Council and other bodies by IPB
Coordinates EGI-InSPIRE project, May 2010 – April 2014
IPB represents Serbia as a partner
4
EGI-InSPIRE
FP7 RI-261323 project, ESFRI
WP1: Management (NA1)
WP2: External relations (NA2)
WP3: User community coordination (NA3)
WP4: Operations (SA1)
WP5: Provisioning the software infrastructure (SA2)
WP6: Services for HUC (SA3)
WP7: Operational tools (JRA1)
5
IPB and EGI-InSPIRE
IPB is involved in NA2, NA3, SA1
Operations:
AEGIS operations
Coordination of middleware deployment
OMB
OTAG
6
AEGIS infrastructure (1)
Production:
AEGIS01-IPB-SCL (704 CPUs, 26 TB)
AEGIS02-RCUB (48 CPUs, 113 GB)
AEGIS03-ELEF-LEDA (64 CPUs, 1.5 TB)
AEGIS04-KG (48 CPUs, 480 GB)
AEGIS07-IPB-ATLAS (128 CPUs)
AEGIS11-MISANU (64 CPUs)
7
AEGIS infrastructure (2)
Certification: AEGIS05-ETFBG, AEGIS09-FTN-KM
Demo/training: AEGIS08-IPB-DEMO
New: UOB Faculty of Physics
8
AEGIS management
NGI_AEGIS management (A. Balaz, D. Vudragovic, V. Slavnic)
Helpdesk: helpdesk.aegis.rs
Nagios: nagios.aegis.rs
Mailing lists
9
gLite – Grid middleware
The Grid relies on advanced software, the middleware, which interfaces between resources and the applications
The Grid middleware:
Finds convenient places for apps to run
Optimises use of resources
Organises efficient access to data
Deals with authentication at different sites
Runs the job & monitors progress
Transfers the result back to the scientist
10
gLite – Overview
First release 2005, currently gLite 3.13.2
Developed from existing components (Globus, Condor, ...)
Interoperability & co-existence with deployed infrastructure
Robust: performance & fault tolerance
Open Source license
11
Set of basic Grid services
Job submission/management
File transfer (individual, queued database access)
Data management (replication, metadata)
Monitoring/Indexing system information
Advanced School in High Performance and GRID Computing – Concepts and Applications, ICTP, Trieste, Italy
12
Basic services of gLite
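[Diagram: basic gLite services and their interactions. The user creates a credential through the Authorization Service (VOMS) and, from the User Interface, submits the job to the Workload Management System, queries its status and retrieves the output. The Workload Management System queries the Information System and the File and Replica Catalog, then submits the job to the Computing Element at Site X, which processes it next to the Storage Element. Sites publish their state to the Information System; job status and logging data go to Logging and Bookkeeping.]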
13
User interface
User describes job in text file using Job Description Language
Submits job to WMS using (usually) the command-line interface
UI (user interface) has preinstalled client software
[Diagram: the user connects by ssh from the Local Workstation to the UI; the UI submits jobs to the WMS (Workload Management System), which passes them on to the CEs.]
14
Managing jobs with gLite
[Diagram: job flow between the User Interface, the Information System, Logging & bookkeeping and the Computing Element. The job is submitted with its input "sandbox"; the Computing Element publishes its state to the Information System; the submit event and job status updates are recorded by Logging & bookkeeping; the user sends status / log queries and finally gets the output "sandbox" with stdout.txt and stderr.txt.]
Example job: /bin/hostname
A worker node is allocated by the local jobmanager
STD input stream is read from file
STD out and err. streams are redirected into files (stdout.txt, stderr.txt)
Slide inherited from EDG – European Data Grid
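The job description for the /bin/hostname example above can be sketched as a small JDL file. The Python below is only an illustration: `render_jdl` is a hypothetical helper, and the attribute names (`Executable`, `StdOutput`, `StdError`, `OutputSandbox`) are assumed from common JDL usage rather than taken from the slides.

```python
def render_jdl(attributes):
    """Render a dict of job attributes as JDL text, one 'Name = value;' per line.

    Hypothetical helper for illustration; strings are quoted and lists are
    written in braces, which is the usual JDL notation.
    """
    def fmt(value):
        if isinstance(value, (list, tuple)):
            return "{" + ", ".join(fmt(v) for v in value) + "}"
        return '"' + str(value) + '"'

    lines = [f"{name} = {fmt(value)};" for name, value in attributes.items()]
    return "\n".join(lines) + "\n"


# Assumed attribute names; the file names come from the slide.
hostname_job = {
    "Executable": "/bin/hostname",
    "StdOutput": "stdout.txt",
    "StdError": "stderr.txt",
    "OutputSandbox": ["stdout.txt", "stderr.txt"],
}

jdl_text = render_jdl(hostname_job)
print(jdl_text)
```

Saved to a text file, a description of this kind is what the user hands to the WMS from the UI.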
15
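Characteristics of resources
[Diagram: WMS architecture. The User Interface talks to the WMS, which contains the Network Daemon, the Workload Manager and the Job Controller (CondorG). The LFC provides the location of files. The Information Service provides the characteristics of resources, gathering CE and SE characteristics and status from the Computing Element and the Storage Element.]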
16
glite-wms-job-submit myjob.jdl
Network Daemon: daemon responsible for accepting incoming requests
Job states shown: submitted, waiting
[Diagram: the User Interface sends the JDL and the Input Sandbox files to the Network Daemon of the WMS (RB); the files are kept in RB storage.]
17
WM (Workload Manager): responsible for taking the appropriate actions to satisfy the request
Job states shown: submitted, waiting
[Diagram: the job has passed from the Network Daemon to the Workload Manager inside the WMS.]
18
Where can this job be executed?
Job states shown: submitted, waiting
[Diagram: the Workload Manager asks the Match-Maker/Broker where the job can be executed.]
19
Matchmaker: responsible for finding the “best” CE to which to submit a job
Job states shown: submitted, waiting
[Diagram: the Match-Maker/Broker sits next to the Workload Manager inside the WMS and draws on the LFC and the Information Service.]
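The idea of picking the “best” CE can be illustrated with a toy matchmaker: drop the CEs that do not meet the job's requirements, then rank the rest. This is a sketch under assumptions, not the gLite algorithm; the `ComputingElement` fields, the ranking by estimated wait and the host names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    # Hypothetical summary of what an information service might publish.
    name: str
    free_cpus: int
    estimated_wait_s: int
    close_to_input_data: bool


def match(ces, min_free_cpus, needs_input_data):
    """Return the CEs that satisfy the requirements, best candidate first."""
    candidates = [
        ce for ce in ces
        if ce.free_cpus >= min_free_cpus
        and (ce.close_to_input_data or not needs_input_data)
    ]
    # Assumed ranking: the shortest estimated wait wins.
    return sorted(candidates, key=lambda ce: ce.estimated_wait_s)


ces = [
    ComputingElement("ce-a.example.org", 0, 10, True),
    ComputingElement("ce-b.example.org", 16, 300, True),
    ComputingElement("ce-c.example.org", 8, 60, False),
    ComputingElement("ce-d.example.org", 4, 120, True),
]

ranked = match(ces, min_free_cpus=1, needs_input_data=True)
best = ranked[0].name if ranked else None
print(best)
```

Here `ce-a` is rejected for having no free CPUs and `ce-c` for being far from the input data, so the remaining two are ordered by estimated wait.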
20
Where is the needed InputData? What is the status of the Grid?
Job states shown: submitted, waiting
[Diagram: the Match-Maker/Broker asks the LFC where the needed input data is and asks the Information Service for the status of the Grid (CE and SE characteristics and status).]
21
CE choice
Job states shown: submitted, waiting
[Diagram: the Match-Maker/Broker returns the CE choice to the Workload Manager.]
22
JA (Job Adapter): responsible for the final “touches” to the job before performing submission (e.g. creation of wrapper script, etc.)
Job states shown: submitted, waiting
[Diagram: the Job Adapter sits between the Workload Manager and the Job Controller (CondorG) inside the WMS.]
23
JC (Job Controller): responsible for the actual job management operations (done via CondorG)
Job states shown: submitted, waiting, ready
[Diagram: the job has reached the Job Controller (CondorG) inside the WMS.]
24
Job scheduled on the Computing Element
Job states shown: submitted, waiting, ready, scheduled
[Diagram: the Job Controller (CondorG) sends the job and the Input Sandbox files to the Computing Element.]
25
Job running
Job states shown: submitted, waiting, ready, scheduled, running
[Diagram: the job runs on the Computing Element with its Input Sandbox and performs “Grid enabled” data transfers/accesses to the Storage Element.]
26
Job done
Job states shown: submitted, waiting, ready, scheduled, running, done
[Diagram: the Output Sandbox files are transferred from the Computing Element back to the WMS.]
27
glite-wms-get-output <jobID>
Job states shown: submitted, waiting, ready, scheduled, running, done
[Diagram: the User Interface asks the Network Daemon for the Output Sandbox held by the WMS.]
28
Job cleared
Job states shown: submitted, waiting, ready, scheduled, running, done, cleared
[Diagram: the Output Sandbox files are delivered to the User Interface.]
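The states that accumulate over the preceding slides form a simple linear lifecycle for a successful job. A minimal sketch, assuming only the seven states shown and nothing about failure handling (the function names are hypothetical):

```python
# States in the order in which they appear on the slides.
JOB_STATES = [
    "submitted", "waiting", "ready", "scheduled", "running", "done", "cleared",
]


def next_state(current):
    """Return the state that follows `current`, or None after the last one."""
    index = JOB_STATES.index(current)  # raises ValueError for unknown states
    if index + 1 < len(JOB_STATES):
        return JOB_STATES[index + 1]
    return None


def remaining_path(current):
    """List the states a job still has to pass through, starting at `current`."""
    path = [current]
    while next_state(path[-1]) is not None:
        path.append(next_state(path[-1]))
    return path


print(" -> ".join(remaining_path("submitted")))
```

Only the successful path is modelled here; states reached when a job fails or is cancelled are not shown on these slides and are left out.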
29
Job monitoring
glite-wms-job-status <jobID>
glite-wms-job-logging-info <jobID>
LB (Logging & Bookkeeping): receives and stores job events; processes corresponding job status
[Diagram: the Network Daemon, the Workload Manager and the Job Controller (CondorG) in the WMS, together with the Computing Element, send their log of job events through the LB proxy to Logging & Bookkeeping, which returns the job status to the User Interface.]
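The two monitoring commands can be thought of as two views of the same event store: one returns the current status, the other the full event log. The class below is a toy model of that idea, not the real LB service; the class, its method names and the job ID `job-1` are hypothetical.

```python
class LoggingAndBookkeeping:
    """Toy event store: keeps job events and derives the job status from them."""

    def __init__(self):
        self._events = {}  # job_id -> list of (sequence number, state)

    def log_event(self, job_id, state):
        """Record a new event for the job, numbered in arrival order."""
        events = self._events.setdefault(job_id, [])
        events.append((len(events) + 1, state))

    def job_status(self, job_id):
        """Current status: the state of the latest event (cf. glite-wms-job-status)."""
        events = self._events.get(job_id)
        return events[-1][1] if events else None

    def logging_info(self, job_id):
        """Full event history (cf. glite-wms-job-logging-info)."""
        return list(self._events.get(job_id, []))


lb = LoggingAndBookkeeping()
for state in ["submitted", "waiting", "ready", "scheduled"]:
    lb.log_event("job-1", state)

print(lb.job_status("job-1"))
print(lb.logging_info("job-1"))
```

Deriving the status from the stored events, rather than storing it separately, keeps the two views consistent by construction.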
30
Enjoy further details in presentations and hands-on sessions during the day!