INFSO-RI-508833 Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical.

Slides:



Advertisements
Similar presentations
EU 2nd Year Review – Jan – Title – n° 1 WP1 Speaker name (Speaker function and WP ) Presentation address e.g.
Advertisements

Workload management Owen Maroney, Imperial College London (with a little help from David Colling)
INFSO-RI Enabling Grids for E-sciencE Workload Management System and Job Description Language.
FP7-INFRA Enabling Grids for E-sciencE EGEE Induction Grid training for users, Institute of Physics Belgrade, Serbia Sep. 19, 2008.
Job Submission The European DataGrid Project Team
INFSO-RI Enabling Grids for E-sciencE EGEE Middleware The Resource Broker EGEE project members.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Services Abderrahman El Kharrim
Basic Grid Job Submission Alessandra Forti 28 March 2006.
Makrand Siddhabhatti Tata Institute of Fundamental Research Mumbai 17 Aug
INFSO-RI Enabling Grids for E-sciencE Comparison of LCG-2 and gLite Author E.Slabospitskaya Location IHEP.
INFSO-RI Enabling Grids for E-sciencE gLite Data Management Services - Overview Mike Mineter National e-Science Centre, Edinburgh.
FESR Consorzio COMETA Grid Introduction and gLite Overview Corso di formazione sul Calcolo Parallelo ad Alte Prestazioni (edizione.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Configuring and Maintaining EGEE Production.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
Computational grids and grids projects DSS,
Enabling Grids for E-sciencE ENEA and the EGEE project gLite and interoperability Andrea Santoro, Carlo Sciò Enea Frascati, 22 November.
Grid Technologies  Slide text. What is Grid?  The World Wide Web provides seamless access to information that is stored in many millions of different.
Enabling Grids for E-sciencE Workload Management System on gLite middleware Matthieu Reichstadt CNRS/IN2P3 ACGRID School, Hanoi (Vietnam)
DataGrid WP1 Massimo Sgaravatto INFN Padova. WP1 (Grid Workload Management) Objective of the first DataGrid workpackage is (according to the project "Technical.
INFSO-RI Enabling Grids for E-sciencE Workload Management System Mike Mineter
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Information System on gLite middleware Vincent.
- Distributed Analysis (07may02 - USA Grid SW BNL) Distributed Processing Craig E. Tull HCG/NERSC/LBNL (US) ATLAS Grid Software.
November SC06 Tampa F.Fanzago CRAB a user-friendly tool for CMS distributed analysis Federica Fanzago INFN-PADOVA for CRAB team.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Using gLite API Vladimir Dimitrov IPP-BAS “gLite middleware Application Developers.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan.
Enabling Grids for E-sciencE Introduction Data Management Jan Just Keijser Nikhef Grid Tutorial, November 2008.
June 24-25, 2008 Regional Grid Training, University of Belgrade, Serbia Introduction to gLite gLite Basic Services Antun Balaž SCL, Institute of Physics.
E-science grid facility for Europe and Latin America GridwWin: porting gLite to run under Windows Fabio Scibilia – Consorzio COMETA 30/06/2008.
CLRC and the European DataGrid Middleware Information and Monitoring Services The current information service is built on the hierarchical database OpenLDAP.
Getting started DIRAC Project. Outline  DIRAC information system  Documentation sources  DIRAC users and groups  Registration with DIRAC  Getting.
E-infrastructure shared between Europe and Latin America FP6−2004−Infrastructures−6-SSA gLite Information System Pedro Rausch IF.
1 Andrea Sciabà CERN Critical Services and Monitoring - CMS Andrea Sciabà WLCG Service Reliability Workshop 26 – 30 November, 2007.
1 Grid2Win: porting of gLite middleware to Windows Dario Russo INFN Catania
Glite. Architecture Applications have access both to Higher-level Grid Services and to Foundation Grid Middleware Higher-Level Grid Services are supposed.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE Site Architecture Resource Center Deployment Considerations MIMOS EGEE Tutorial.
INFSO-RI Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical.
High-Performance Computing Lab Overview: Job Submission in EDG & Globus November 2002 Wei Xing.
INFSO-RI Enabling Grids for E-sciencE Introduction Data Management Ron Trompert SARA Grid Tutorial, September 2007.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Grid2Win: Porting of gLite middleware to.
Workload Management System Jason Shih WLCG T2 Asia Workshop Dec 2, 2006: TIFR.
David Adams ATLAS ATLAS distributed data management David Adams BNL February 22, 2005 Database working group ATLAS software workshop.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid2Win : gLite for Microsoft Windows Roberto.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Command Line Grid Programming Spiros Spirou Greek Application Support Team NCSR “Demokritos”
INFSO-RI Enabling Grids for E-sciencE /10/20054th EGEE Conference - Pisa1 gLite Configuration and Deployment Models JRA1 Integration.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Practical using WMProxy advanced job submission.
EGEE 3 rd conference - Athens – 20/04/2005 CREAM JDL vs JSDL Massimo Sgaravatto INFN - Padova.
User Interface UI TP: UI User Interface installation & configuration.
13th EELA Tutorial, La Antigua, 18-19, October E-infrastructure shared between Europe and Latin America FP6−2004−Infrastructures−6-SSA
EGEE-II INFSO-RI Enabling Grids for E-sciencE Architecture of LHC File Catalog Valeria Ardizzone INFN Catania – EGEE-II NA3/NA4.
RI EGI-TF 2010, Tutorial Managing an EGEE/EGI Virtual Organisation (VO) with EDGES bridged Desktop Resources Tutorial Robert Lovas, MTA SZTAKI.
Consorzio COMETA - Progetto PI2S2 UNIONE EUROPEA Grid2Win : gLite for Microsoft Windows Elisa Ingrà - INFN.
FESR Consorzio COMETA - Progetto PI2S2 Introduction to Grid Computing Pietro Di Primo INFN – Catania , Catania.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America LFC Server Installation and Configuration.
Introduction to Computing Element HsiKai Wang Academia Sinica Grid Computing Center, Taiwan.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
Enabling Grids for E-sciencE Work Load Management & Simple Job Submission Practical Shu-Ting Liao APROC, ASGC EGEE Tutorial.
Enabling Grids for E-sciencE Claudio Cherubino INFN DGAS (Distributed Grid Accounting System)
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) Overveiw of the gLite middleware Yaodong Cheng
Gri2Win: Porting gLite to run under Windows XP Platform
Grid2Win Porting of gLite middleware to Windows XP platform
Workload Management System on gLite middleware
Workload Management System ( WMS )
gLite Grid Services Salma Saber
Grid2Win: Porting of gLite middleware to Windows XP platform
Introduction to Grid Technology
Grid Services Ouafa Bentaleb CERIST, Algeria
Gri2Win: Porting gLite to run under Windows XP Platform
Data services in gLite “s” gLite and LCG.
EGEE Middleware: gLite Information Systems (IS)
Overview of gLite Middleware
Presentation transcript:

INFSO-RI Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical University of Athens gLite middleware

Enabling Grids for E-sciencE INFSO-RI Let’s enter the Grid…

Enabling Grids for E-sciencE INFSO-RI Application’s structure The execution of a typical Grid application follows this scenario: –The user submits its application’s job to the “Grid” –The job is being executed –The job’s execution may include the processing of one or more Input Files stored in a Storage node –The job may produce one or more Output Files –The Output Files can be stored somewhere in the Grid system (perhaps in the Storage Element or in the User Interface)‏ –The User can access the Output Files using the corresponding Grid mechanisms

Enabling Grids for E-sciencE INFSO-RI A typical structure of a Grid platform

Enabling Grids for E-sciencE INFSO-RI gLite Middleware Services API Access Workload Mgmt Services Computing Element Workload Management Metadata Catalog Data Management Storage Element Data Movement File & Replica Catalog Authorization Security Services Authentication Information & Monitoring Information & Monitoring Services Application Monitoring Connectivity Accounting Auditing Job Provenance Package Manager CLI

Enabling Grids for E-sciencE INFSO-RI Basic gLite components Security –Virtual Organization Server (VOMS) –MyProxy server (Proxy) Information System (IS) Job handling –Workload Management System (WMS) –Logging & Bookkeeping (LB) Data Management –File Catalog –File Transfer Service –File Placement Service

Enabling Grids for E-sciencE INFSO-RI User Interface (1)‏ Allows users to access Grid functionalities A machine where users have a personal account and where the user certificate is installed Gateway to Grid Services

Enabling Grids for E-sciencE INFSO-RI User Interface (2)‏ It provides a Command Line Interface to perform some basic Grid operations such as:  List all the resources suitable to execute a given job  Submit jobs for execution  Show the status of submitted jobs  Cancel one or more jobs  Retrieve the logging and bookkeeping information of jobs  Retrieve the output of finished jobs  Copy, replicate and delete files from Grid

Enabling Grids for E-sciencE INFSO-RI Workload Management System The resource broker is responsible for the acceptance of submitted jobs and for sending those jobs to the appropriate Computing Element Retrieves information from Information Catalogues so as to find the proper available resources depending on the job requirements

Enabling Grids for E-sciencE INFSO-RI Computing Element Grid interface” It is built on a farm of a computing nodes called Worker Nodes (WNs) Executes the basic queues functions In the Computing Element, a process is being executed that accepts jobs and dispatch them for execution to the Worker nodes (WNs)‏ The state of an executing job is being watched by the Computing Element

Enabling Grids for E-sciencE INFSO-RI Worker Node The submitted jobs are being executed in the Worker nodes Need only inbound connectivity Only basic services of middleware are required to be provided by the Worker nodes such as –Application libraries –Application Programming Interfaces (API) –Commands for performing actions on Grid resources and Grid data

Enabling Grids for E-sciencE INFSO-RI Storage Element It provides uniform access to storage resources (it may control simple disk servers, large disk arrays or Mass Storage Systems (MSS)‏ Each site may provide one or more SEs

Enabling Grids for E-sciencE INFSO-RI Job flow (1)‏ j Input/ Output Sandbox RB storage c c c c “Grid enabled” data transfers/ f j i e d c b Job Status Submitted Waiting Ready Scheduled Running Done Cleared Match Maker/ Broker RLS Information Service Job Adapter Network Server Workload Manager Job Control UI JOB SE accesses d RB node b

Enabling Grids for E-sciencE INFSO-RI Job flow (2)‏ Job submission –The user logs in the UI and submits the job to a Resource broker. –If one or more files need to be copied from the UI to the WN, this is specified in the job description and the files are initially copied to the RB. This set of files is called the Input Sandbox –Job status  SUBMITTED Finding the proper CE –The WMS interrogates the Information Supermarket (ISM) (an internal cache of information read from the BDII), to determine the status of computational and storage resources. –The WMS interogates the File Catalogue to find the location of any required input files – Job status  WAITING Job submission from the WMS to the selected CE –The RB prepares a wrapper script that will be passed together with other parameters, to the selected CE. – Job status  READY Job arrival to the CE –The CE receives the request and sends the job for execution to the local LRMS. –Job status  SCHEDULED

Enabling Grids for E-sciencE INFSO-RI Job flow (3)‏ Job submission to the Worker node –The LRMS handles the execution of jobs on the local Worker Nodes. –The Input Sandbox files are copied from the RB to an available WN where the job is executed. –While the job runs, Grid files can be directly accessed from an SE using either the RFIO or gsidcap protocol –Any new produced output files which can be uploaded to the Grid and made available for other Grid users to use. This can be achieved using the Data Management tools described later. Uploading a file to the Grid means copying it to a Storage Element and registering it in a file catalogue. –Job status  RUNNING Job finished –If the job ends without errors, the small output files specified by the user in the Output Sandbox are transferred back to the RB node. –Job status  DONE Output retrieval –The user can retrieve the output files to the UI –Job status  Cleared

Enabling Grids for E-sciencE INFSO-RI Job status SUBMITTED WAITING READY SCHEDULED RUNNING DONE(failed)‏ DONE (ok)‏ CLEARED ABORTED CANCELLED

Enabling Grids for E-sciencE INFSO-RI Job Description Language A high-level language based on the Classified Advertisement (ClassAd) language JDL describes jobs and aggregates of jobs with arbitrary dependency relations JDL specifies the desired job characteristics and constraints, which are taken into account by the WMS to select the best resource to execute the job A JDL file consists of lines having the format: attribute = expression; –Expressions can span several lines, but only the last one must be terminated by a semicolon –Literals are enclosed in double quotes –“ in strings must be escaped with a backslash ("\"Hallo“)‏ –The character “ ‘ ” cannot be used in the JDL –Comments of each line begin with # or // –Multi-line comments must be enclosed between “/*” and “*/” –No blank characters or tabs should follow the semicolon at the end of a line

Enabling Grids for E-sciencE INFSO-RI Cryptography components Encryption algorithm Decryption algorithm Malicious third party Plain text ciphertext channel Public key Private key

Enabling Grids for E-sciencE INFSO-RI Digital certificate Χ.509 Each entity (user, resource) must obtain a certificate The certificate includes information, such as the expiration date, the Certification Authority that signed it, the owner’s public key and a DN The DN defines uniquely the owner and has the following fields: C = Owner’s country O = Owner’s organization OU = Owner’s group CN = Owner’s name C = Owner’s country O = Owner’s organization OU = Owner’s group CN = Owner’s name

Enabling Grids for E-sciencE INFSO-RI Certificate authority Encryption algorithm Certificate Certification Authority’s public key Certification Authority(CA)‏ User

Enabling Grids for E-sciencE INFSO-RI Proxy certificates A new temporal certificate created taking into account the issued certificate by the corresponding CA  a new key pair is created to be used during the period that the proxy is valid The new private key is not secured by a password The use of a proxy is recommended because: the proxy has a short lifetime uses a different private key from the issued certificate

Enabling Grids for E-sciencE INFSO-RI VOMS Virtual Organisation Membership Service (VOMS)‏ –A system which allows a proxy to have extensions containing information:  About the VO  The groups the user belongs to in the VO  Any roles the user is entitled to have Group: subset of the VO containing members who share some responsibilities or privileges in the project –Hierarchically organised –A user can be a member of any number of groups –VOMS proxy contains the list of all groups the user belongs to –Group  privileges the user ALWAYS has Role: Attribute which typically allows a user to acquire special privileges to perform specific tasks –Role  privileges the user needs to have only from time to time

Enabling Grids for E-sciencE INFSO-RI MyProxy service(1) Enabling proxies with longer duration By default, the long-term proxy is valid for one week, while the proxies for jobs are valid for 12 hours

Enabling Grids for E-sciencE INFSO-RI Filenames(1)‏ Grid Unique Identifier (GUID)‏ –Identifies a file uniquely –Example: guid:ab993b98-8bc e c Logical File Name (LFN) (User Alias)‏ –Refers to a file instead of a GUID –lfn: –LFC catalogue: lfn:/grid/ / / –Example: lfn:/grid/hgdemo/test_egee01/test_file Storage URL (SURL) (Physical File Name-PFN)‏ –Identifies a replica in a SE – :// / –Example: sfn://se01.isabella.grnet.gr/storage/hgdemo/generated/ /filec dbaa e2-3c105fa0a3df Transport URL (TURL)‏ –A valid URI with the necessary information to access a file in a SE – :// –Example:gsiftp://se01.isabella.grnet.gr/storage/hgdemo/generated/ /file1a08d327-d7dc-4d89-bb01-2c86f59eae37

Enabling Grids for E-sciencE INFSO-RI Filenames(2)‏ Replicas LFN 1 GUID SURL 1 SURL2 SURL3 Grid Unique IDentifier User Metadata TURL 1 TURL2 User Metadata System Metadata LFN 2 LFN 3

Enabling Grids for E-sciencE INFSO-RI File Catalogue File Catalogue (gLite) or LCG File Catalogue –Maintains mappings between LFNs, GUID, SURLs –Local File Catalogue, holding only replicas stored at a given site –Global File Catalogue, containing information about all files in the Grid –Consists of a unique catalogue, where the LFN is the main key –System metadata are supported Grid file –Physically present in a SE –Registered in the file catalogue High Level Tools (lcg_util)  Consistency between files in the SEs and entries in the file catalogue Low level Data Management tools  Inconsistency between SEs physical files and catalogue entries

Enabling Grids for E-sciencE INFSO-RI Information System (IS)‏ It provides information about the Grid resources and their status  This information is essential for the operation of the whole Grid Location of availiable Computing Elements to run jobs Finding of SEs that holding replicas of Grid files and the catalogs keeping the information on these files The information is stored in databases The published information is used for monitoring purposes  for analyzing usage and performance of the Grid, detecting fault situations and any other interesting events accounting purposes  for creating statistics of the applications run by the users in the resources

Enabling Grids for E-sciencE INFSO-RI MDS Globus Moinitoring and Discovery service Resource Discovery and publishing the resource status OpenLDAP which is an open source implementation of the Lightweight Directory Access Protocol (LDAP), a specialised database optimised for reading, browsing and searching information Hierarchical architecture: –In every resource runs a Grid Resource Information Server (GRIS) providing relevant information about the resource –At each site runs a Site Grid Information Server (GIIS) that collects information from the local GRISes and republishes it. The GIIS uses a Berkeley Database Information Index (BDII) to store data –A BDII is used to read from a group of sites, depicting a view of the overall Grid resources (on top of the hierarchy)‏

Enabling Grids for E-sciencE INFSO-RI R-GMA

Enabling Grids for E-sciencE INFSO-RI gLite features Increased modularity –Services can be deployed independently XML based configuration Finer grained security (VOMS) Pull model for job management POSIX IO to grid files User friendly LFNs File transfer services (data management jobs)

Enabling Grids for E-sciencE INFSO-RI Q&A