Middleware Overview University of the Free State Albert van Eck

1 Middleware Overview
Albert van Eck, University of the Free State
Grid Application and Users Training School, Cape Town

2 Outline
General overview
Security System: VOMS server, LCAS, LCMAPS
Information Service: Berkeley DB Information Index (BDII)
Workload Management System: JDL, Computing Element, Logging and Bookkeeping
Questions

3 gLite Middleware overview

4 The Middleware structure
Applications have access both to Higher-Level Grid Services and to Foundation Grid Middleware.
Higher-Level Grid Services are meant to help users build their computing infrastructure, but should not be mandatory.
Foundation Grid Middleware is what is actually developed in EGEE. It:
  must be complete and robust
  should allow interoperation with other major grid infrastructures
  should not assume the use of Higher-Level Grid Services

5 gLite Services Decomposition
Access: CLI, API
Security Services: Authentication, Authorization, Auditing
Information & Monitoring Services: Information & Monitoring, Service Discovery, Network Monitoring
Data Services: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
Job Management Services: Job Provenance, Package Manager, Accounting, Computing Element, Workload Management

6 Workload Management System (WMS)
gLite infrastructure: Workload Management System (WMS), Data Management

7 Security System

8 gLite Security
Authentication is based on an X.509 public key infrastructure (PKI).
Certificate Authorities (CAs) issue long-lived certificates identifying individuals (much like a passport).
Trust between CAs and sites is established offline.
To reduce vulnerability, Grid users are identified by short-lived proxies of their certificates.
Proxies can:
  be delegated to a service so that it can act on the user’s behalf
  include additional attributes (such as VO information, via the VO Membership Service, VOMS)
  be stored in an external proxy store (MyProxy)
  be renewed (when they are about to expire)

9 X.509 Proxy Certificate
A proxy is a GSI extension to X.509 identity certificates, signed by the normal end-entity certificate (or by another proxy).
It enables single sign-on.
It supports some important features: delegation and mutual authentication.
It has a limited lifetime, which minimizes the risk of compromised credentials.
It is created by the voms-proxy-init command. Options for voms-proxy-init:
  -hours <lifetime of credential>
  -bits <length of key>
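
As a hedged illustration of the options listed on this slide, the following Python sketch assembles a voms-proxy-init command line. The helper name build_proxy_command is hypothetical and chosen for illustration; only the --voms, -hours and -bits options come from the slides. The sketch builds the argument list and does not run the command.

```python
from typing import List, Optional


def build_proxy_command(vo: str, hours: Optional[int] = None,
                        bits: Optional[int] = None) -> List[str]:
    """Assemble (but do not execute) a voms-proxy-init argument list.

    Hypothetical helper: it only uses the options shown in the slides.
    """
    if not vo:
        raise ValueError("a VO name is required")
    cmd = ["voms-proxy-init", "--voms", vo]
    if hours is not None:
        if hours <= 0:
            raise ValueError("hours must be positive")
        cmd += ["-hours", str(hours)]   # lifetime of the credential
    if bits is not None:
        if bits <= 0:
            raise ValueError("bits must be positive")
        cmd += ["-bits", str(bits)]     # length of the key
    return cmd


if __name__ == "__main__":
    print(" ".join(build_proxy_command("gilda", hours=12, bits=1024)))
```

The resulting list could be passed to subprocess.run on a machine where the VOMS client tools are installed; a short lifetime keeps the window for a compromised proxy small.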

10 GRID Security: Components
Users: large and dynamic population; different accounts at different sites; personal and confidential data; heterogeneous privileges (roles); desire for single sign-on
“Groups”: “group” data; access patterns; membership
Grid sites: heterogeneous resources; access patterns; local policies; membership

11 VOMS: concepts
Virtual Organization Membership Service (VOMS):
  Extends the proxy with information on VO membership, groups and roles
  Fully compatible with GSI
  Each VO has a database containing group membership, roles and capabilities information for each user
  The user contacts the VOMS server to request their authorization info
  The server sends the authorization info to the client, which includes it in a proxy certificate
[Diagram: the VOMS client sends a query with an authentication request to the VOMS server; the server checks its authorization DB and, if OK, returns a VOMS AC that is added to the proxy C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy]

12 FQAN and AC
VOMS uses the Fully Qualified Attribute Name (FQAN) to express membership and other authorization info.
Group membership, roles and capabilities are expressed in a format that binds them together:
  <group>/Role=[<role>][/Capability=<capability>]
FQANs are included in an Attribute Certificate (AC).
Attribute Certificates bind a set of attributes (such as membership, roles and authorization info) to an identity.
ACs are digitally signed.
VOMS uses an AC to include the attributes of a user in a proxy certificate.
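
The FQAN format above can be taken apart with a few lines of Python. This is a minimal sketch, not the VOMS implementation: the names FQAN and parse_fqan are hypothetical, and mapping the literal NULL to None is an assumption based on how the attributes are printed elsewhere in these slides.

```python
from typing import NamedTuple, Optional


class FQAN(NamedTuple):
    group: str
    role: Optional[str]
    capability: Optional[str]


def _clean(value: Optional[str]) -> Optional[str]:
    # Assumption: an empty value or the literal NULL means "not set".
    return None if value in (None, "", "NULL") else value


def parse_fqan(text: str) -> FQAN:
    """Split '<group>/Role=<role>/Capability=<capability>' into its parts."""
    if not text.startswith("/"):
        raise ValueError("an FQAN must start with '/'")
    group_parts = []
    role = capability = None
    for part in text.strip("/").split("/"):
        if part.startswith("Role="):
            role = part[len("Role="):]
        elif part.startswith("Capability="):
            capability = part[len("Capability="):]
        else:
            group_parts.append(part)   # group and nested sub-groups
    return FQAN("/" + "/".join(group_parts), _clean(role), _clean(capability))


if __name__ == "__main__":
    print(parse_fqan("/gilda/grelc/Role=NULL/Capability=NULL"))
```

Keeping the group path as a single string preserves nested groups such as /gilda/grelc, which a site could then compare against its local authorization policy.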

13 VOMS Certificate
The AC is included by the client in a well-defined, non-critical extension, ensuring compatibility with GT-based mechanisms.
voms-proxy-init --voms gilda
Your identity: /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco
Enter GRID pass phrase:
Creating temporary proxy Done
Contacting voms.ct.infn.it:15001 [/C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it] "gilda" Done
Creating proxy Done
Your proxy is valid until Tue Jun 26 03:16

14 VOMS Certificate Attributes
asli@levrek:~$ voms-proxy-info -all
subject : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta/CN=proxy
issuer : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta
identity : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta
type : proxy
strength : 512 bits
path : /tmp/x509up_u18948
timeleft : 11:57:20
=== VO gilda extension information ===
VO : gilda
subject : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta
issuer : /C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it
attribute : /gilda/Role=NULL/Capability=NULL
attribute : /gilda/grelc/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc.unile.it/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc.unile.it/sakila/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc02.unile.it/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc02.unile.it/sakila/Role=NULL/Capability=NULL
timeleft : 11:57:48
The attribute lines list the VOMS attributes (FQANs) carried by the proxy.
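
Output of this kind is easy to post-process. The sketch below assumes the one-field-per-line "key : value" layout that voms-proxy-info prints on a terminal; the function names parse_proxy_info and timeleft_seconds are hypothetical, and the sample text is abbreviated from the slide.

```python
from typing import Dict, List, Tuple

# Abbreviated sample, copied from the slide (one field per line is assumed).
SAMPLE = """\
subject : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta/CN=proxy
type : proxy
timeleft : 11:57:20
=== VO gilda extension information ===
VO : gilda
attribute : /gilda/Role=NULL/Capability=NULL
attribute : /gilda/grelc/Role=NULL/Capability=NULL
timeleft : 11:57:48
"""


def parse_proxy_info(output: str) -> Tuple[Dict[str, str], List[str]]:
    """Return (fields, attributes); the first occurrence of a key wins."""
    fields: Dict[str, str] = {}
    attributes: List[str] = []
    for line in output.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue                      # banner lines such as "=== ... ==="
        key, value = key.strip(), value.strip()
        if key == "attribute":
            attributes.append(value)      # one FQAN per attribute line
        else:
            fields.setdefault(key, value)
    return fields, attributes


def timeleft_seconds(timeleft: str) -> int:
    """Convert an 'HH:MM:SS' lifetime into seconds."""
    hours, minutes, seconds = (int(part) for part in timeleft.split(":"))
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    fields, attributes = parse_proxy_info(SAMPLE)
    print(fields["VO"], timeleft_seconds(fields["timeleft"]), attributes)
```

A script could use the remaining lifetime to decide whether the proxy needs to be renewed before submitting a long job.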

15 VOMS enabled Grid
A user can be in multiple VOs
  Aggregate rights
A VO can have groups
  Different rights for each
  Different groups of experimentalists
  Nested groups
A VO has roles
  Assigned to specific purposes, e.g. system admin
  Used when the user assumes the role
The proxy certificate carries the additional attributes

16 Information Service

17 Information Service
What?
  A system to collect information on the state of resources
Why?
  To discover the resources of the grid and their nature
  To check the health status of resources
  To provide data in order to manage the workload more efficiently
How?
  By monitoring and publishing fresh data on the state of resources
  By adopting a well-known data model
Who?
  Users searching for specific resources for their activity
  The Workload Management System
  Other monitoring systems

18 Information Service Systems
The gLite data model is based on the Grid Laboratory Uniform Environment (GLUE) Schema.
The IS architecture used in gLite is the Berkeley DB Information Index (BDII):
  It has been adopted in the LCG middleware as the Information System provider
  It is an evolution of the Globus Monitoring and Discovery Service (MDS)
  It is based on Lightweight Directory Access Protocol (LDAP) servers

19 GLUE Schema
Describes the Grid resource information stored in the IS
Independent of the underlying technology
  The current release is mapped onto LDAP, XML and ClassAd (the Condor matchmaking language)
The entities of the GLUE Schema are organised hierarchically
  They include the concepts of Site, Cluster, Computing Element and Storage Element, and an abstraction of a service

20 GRISs, local BDII and BDII
Abbreviations:
  BDII: Berkeley Database Information Index
  GIIS: Grid Index Information Server
  GRIS: Grid Resource Information Server
Each site can run a BDII. It collects the information given by the local BDIIs.
At each site, a local BDII collects the information given by the GRISes.
Local GRISes run on CEs and SEs at each site and report dynamic and static information.
Speaker notes: This slide shows the BDII information system in graphical form, as a pyramid. At the bottom are the local GRISes, which run on the computing elements or storage elements and report dynamic and static information. Above them is the Grid Index Information Server (GIIS), one for each site; a GIIS collects the information given by the GRISes. At the top are the BDIIs; a BDII collects the information given by the GIISes. Users and other grid services can query a BDII to get information about the Grid status.

21 BDII
Users and other Grid services (such as the WMS) can query BDIIs to get information about the Grid status.
Each BDII collects information from the site GIISes (or local BDIIs) defined in a configuration file, which it accesses through a web interface.
Every two minutes a cron job runs a script that collects information (pull model) from all the GIISes (or local BDIIs) listed in the configuration file.
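
The pull model described here can be sketched in a few lines of Python. This is a toy model under stated assumptions, not BDII code: the Aggregator class, the gris helper and the host names are all hypothetical, and a plain method call stands in for the cron job and the LDAP queries.

```python
from typing import Callable, Dict, List

Entry = Dict[str, str]
Source = Callable[[], List[Entry]]


class Aggregator:
    """A node that pulls entries from its sources and serves a cached copy."""

    def __init__(self, name: str, sources: List[Source]) -> None:
        self.name = name
        self.sources = sources
        self.cache: List[Entry] = []

    def refresh(self) -> None:
        """One pull cycle (the slides mention a cron job every two minutes)."""
        fresh: List[Entry] = []
        for source in self.sources:
            fresh.extend(source())
        self.cache = fresh

    def query(self) -> List[Entry]:
        """Answer from the cache; clients never reach the resources directly."""
        return list(self.cache)


def gris(entries: List[Entry]) -> Source:
    """Stand-in for a GRIS running on a CE or SE (hypothetical)."""
    return lambda: list(entries)


if __name__ == "__main__":
    site_a = Aggregator("site-a", [gris([{"id": "ce.site-a.example"}]),
                                   gris([{"id": "se.site-a.example"}])])
    site_b = Aggregator("site-b", [gris([{"id": "ce.site-b.example"}])])
    top = Aggregator("top", [site_a.query, site_b.query])
    for node in (site_a, site_b, top):   # lower levels refresh first
        node.refresh()
    print([entry["id"] for entry in top.query()])
```

Because every level answers from its cache, a query to the top node is cheap, at the cost of data that can be as old as one refresh interval per level.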

22 Summary
The security system of gLite is based on X.509 certificates
  Users are identified by certificates
  The VOMS server links users to VOs, groups and roles by adding attributes to the proxy certificate
  LCAS and LCMAPS control local access to resources by checking the user certificates
The Information System provided by gLite is the BDII
  The information is organised according to the GLUE Schema
  The current implementation uses only the BDII to check the state of resources
  The user can contact the top BDII in the hierarchy to get information on all resources

23 gLite Workload Management System

24 Outline
gLite Overview
Workload Management System: WMS architecture, job state machine, Job Description Language overview
Security overview

25 gLite services
[Diagram: the User Interface, Workload Management, Logging & Bookkeeping, the Information System, the File and Replica Catalogs, the Authorization Service and, at Site X, a Computing Element and a Storage Element. Arrow labels: submit, query, retrieve, discover services, update credential, publish state.]

26 WMS Objectives
The Workload Management System (WMS) comprises a set of Grid middleware components responsible for the distribution and management of tasks across Grid resources.
The purpose of the Workload Manager (WM) is to accept and satisfy requests for job management coming from its clients.
  A submission request passes the responsibility for the job to the WM.
  The WM passes the job to an appropriate CE for execution, taking into account the requirements and preferences expressed in the job description file.
The decision of which resource should be used is the outcome of a matchmaking process.
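
To make the matchmaking step concrete, here is a minimal Python sketch. It is not the WMS algorithm: the record layout, the CE names and the choice to rank by free CPUs are assumptions made for illustration. Only the idea of filtering on requirements and then picking one CE comes from the slides.

```python
from typing import Dict, List, Optional

# Hypothetical CE records, standing in for what the information system publishes.
CES: List[Dict] = [
    {"id": "ce01.example.org", "runtime_env": ["GLITE-3_1_0"], "free_cpus": 4},
    {"id": "ce02.example.org", "runtime_env": ["GLITE-3_0_0"], "free_cpus": 64},
    {"id": "ce03.example.org", "runtime_env": ["GLITE-3_1_0"], "free_cpus": 16},
]


def matchmake(ces: List[Dict], required_tag: str) -> Optional[str]:
    """Return the id of the best matching CE, or None if nothing matches.

    Step 1 (requirements): keep CEs that advertise the required tag.
    Step 2 (rank, assumed): prefer the CE with the most free CPUs.
    """
    candidates = [ce for ce in ces if required_tag in ce["runtime_env"]]
    if not candidates:
        return None
    return max(candidates, key=lambda ce: ce["free_cpus"])["id"]


if __name__ == "__main__":
    print(matchmake(CES, "GLITE-3_1_0"))
```

Note that ce02 has the most free CPUs but is never considered, because requirements filter before ranking.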

27 Job Description Language
In gLite, the Job Description Language (JDL) is used to describe jobs for execution on the Grid.
The JDL adopted within the gLite middleware is based on Condor’s Classified Advertisement language (ClassAd).
  A ClassAd is a record-like structure composed of a finite number of attributes separated by semicolons (;)
  A ClassAd is highly flexible and can be used to represent arbitrary services
The JDL is used in gLite to specify the job’s characteristics and constraints, which are used during the matchmaking process to select the best resources that satisfy the job’s requirements.
Speaker notes: The JDL is the high-level language that all grid users must use to submit their applications to the grid. With it, the user specifies the requirements and constraints of the application, which are then analyzed and processed by the WMS.

28 JDL: an example
Type = "Job";
JobType = "Normal";
Executable = "startGen4.sh";
Environment = {"CLASSPATH=./gfal.jar:./gint.jar","LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH","LCG_GFAL_VO=gilda","LCG_RFIO_TYPE=dpm"};
Arguments = " aliserv6.ct.infn.it lfn:/grid/gilda/valeria/2000pillar.dat /gilda/issgc07/";
StdOutput = "sample.out";
StdError = "sample.err";
InputSandbox = {"startGen4.sh","gint.jar","gfal.jar","libGFalFile.so"};
OutputSandbox = {"sample.err","sample.out","res.txt"};
Requirements = Member("GLITE-3_1_0",other.GlueHostApplicationSoftwareRunTimeEnvironment);
Executable indicates which file will be executed remotely
Environment specifies environment variables that will be set at run time
Arguments appends a string (to be used as arguments) to Executable
StdOutput is the remote file to which standard output will be redirected
StdError is the remote file to which standard error will be redirected
InputSandbox defines a set of local files that you want to be staged remotely for execution
OutputSandbox defines a set of remote files that you want to get back after execution
Requirements specifies a set of characteristics (hardware or software) that you require of the resource
Speaker notes: This slide shows an example of JDL with some of the most important attributes. JobType. Executable: a string representing the executable or command name. Environment: list of environment settings needed by the job to run properly. Arguments: a string containing all the job's command-line arguments. InputSandbox: list of files on the UI local disk needed by the job for running; the listed files are automatically uploaded to the remote resource. OutputSandbox: list of files, generated by the job, which have to be retrieved. Requirements: job requirements on computing resources.
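
A job description of this shape can also be generated programmatically. The following Python sketch renders a dictionary as "name = value;" lines; it is an illustration under assumptions, not a gLite API. The names Expr, render_value and render_jdl are hypothetical, and only strings, lists of strings and raw expressions are handled.

```python
from typing import Dict, Union


class Expr(str):
    """Marks a value to be emitted unquoted, e.g. a Requirements expression."""


Value = Union[Expr, str, list]


def render_value(value: Value) -> str:
    if isinstance(value, Expr):          # check before str: Expr subclasses str
        return str(value)
    if isinstance(value, str):
        if '"' in value:
            # Quote escaping is out of scope for this sketch.
            raise ValueError("embedded double quotes are not supported")
        return '"' + value + '"'
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(render_value(item) for item in value) + "}"
    raise TypeError("unsupported value type: " + type(value).__name__)


def render_jdl(attributes: Dict[str, Value]) -> str:
    """Render attributes as one 'name = value;' line each, in insertion order."""
    return "\n".join(name + " = " + render_value(value) + ";"
                     for name, value in attributes.items())


if __name__ == "__main__":
    print(render_jdl({
        "Type": "Job",
        "Executable": "startGen4.sh",
        "StdOutput": "sample.out",
        "StdError": "sample.err",
        "InputSandbox": ["startGen4.sh", "gint.jar"],
        "OutputSandbox": ["sample.err", "sample.out", "res.txt"],
        "Requirements": Expr('Member("GLITE-3_1_0",'
                             'other.GlueHostApplicationSoftwareRunTimeEnvironment)'),
    }))
```

Separating quoted strings from raw expressions matters because Requirements is evaluated during matchmaking, whereas the other values here are plain data.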

29 Workflows of jobs
With a single request, multiple jobs can be generated and executed.
A Directed Acyclic Graph (DAG) is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs.
[Diagram: a DAG with nodes nodeA, nodeB, nodeC, nodeD and nodeE]
A Collection is a group of jobs with no dependencies: basically a collection of JDLs.
A Parametric job is a job having one or more attributes in the JDL whose values vary according to parameters.
Using compound jobs, it is possible to submit a (possibly very large, up to thousands) group of jobs in one shot:
  Submission time reduction
    Single call to the WMProxy server
    Single authentication and authorization process
    Sharing of files between jobs
  Availability of both a single job ID to manage the group as a whole and a job ID for each single job in the group
Speaker notes: A job is not only the request for, and subsequent execution of, a single computation; there may be several computations that, for some reason known to the user, are submitted as a single instance. This is the case of so-called compound jobs.
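
The dependency rule for a DAG can be illustrated with Python's standard graphlib module (Python 3.9 or later). The slide names the nodes nodeA to nodeE but does not give the edges, so the dependencies below are hypothetical; the helper execution_waves is likewise an illustrative name, not part of gLite.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical edges: each job maps to the set of jobs it depends on.
DEPENDENCIES: Dict[str, Set[str]] = {
    "nodeB": {"nodeA"},
    "nodeC": {"nodeA"},
    "nodeD": {"nodeB", "nodeC"},
    "nodeE": {"nodeD"},
}


def execution_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group jobs into waves; every job in a wave has all its inputs ready.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # jobs that could run in parallel
        waves.append(ready)
        sorter.done(*ready)                  # mark them finished
    return waves


if __name__ == "__main__":
    print(execution_waves(DEPENDENCIES))
```

Jobs in the same wave have no dependencies on one another, which is also what distinguishes a Collection: it is a single wave containing every job.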

30 Job submission example
~]$ glite-wms-job-submit -d emidio -o jobid-file sfk-explorer.jdl
Connecting to the service
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
The job identifier has been saved in the following file:
/home/issgc59/jid
==========================================================================

31 Logging and Bookkeeping
Every step of the job life cycle is logged on a service called Logging and Bookkeeping (LB).
It is useful for users who want to know the status of their execution.
  When a job is submitted, the UI logs it on the LB
  As a result of submission, a job identifier is returned
  The WMS logs each step of scheduling
  The CE logs when it receives a job (scheduled), when the job is running and when it is done
Users can query the LB for the job status by providing the job ID.
Asynchronous updates...
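
The logging flow can be modelled as a small state machine. The Python sketch below is a simplification under stated assumptions: it uses only the states named on this slide (submitted, scheduled, running, done), whereas the real service tracks more, and the class name, method names and job ID are hypothetical.

```python
from typing import Dict, List, Tuple

# Simplified transitions, limited to the states mentioned on the slide.
ALLOWED: Dict[str, set] = {
    "SUBMITTED": {"SCHEDULED"},
    "SCHEDULED": {"RUNNING"},
    "RUNNING": {"DONE"},
    "DONE": set(),
}


class LoggingAndBookkeeping:
    """Toy event log keyed by job ID (not the real LB interface)."""

    def __init__(self) -> None:
        self._events: Dict[str, List[Tuple[str, str]]] = {}

    def log(self, job_id: str, source: str, state: str) -> None:
        """Record that `source` (UI, WMS or CE) observed `state`."""
        history = self._events.setdefault(job_id, [])
        if not history:
            if state != "SUBMITTED":
                raise ValueError("the first event must be SUBMITTED")
        elif state not in ALLOWED[history[-1][1]]:
            raise ValueError("illegal transition to " + state)
        history.append((source, state))

    def status(self, job_id: str) -> str:
        """What a user gets back when querying with the job ID."""
        return self._events[job_id][-1][1]


if __name__ == "__main__":
    lb = LoggingAndBookkeeping()
    lb.log("job-1", "UI", "SUBMITTED")
    lb.log("job-1", "CE", "SCHEDULED")
    lb.log("job-1", "CE", "RUNNING")
    print(lb.status("job-1"))
```

Keeping the full history rather than only the latest state is what lets a user reconstruct where a job spent its time.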

32 The Computing Element
The CE is the front-end machine (master node) to a local batch system.
  Supported batch systems are PBS (Torque/MAUI), LSF and Condor
The WMS “pushes” job execution requests to the CE using Condor-G.
  When a CE receives a job, the job is placed in a queue
  The job is then executed on the first available Worker Node (where the batch system clients run)
  When execution is complete, output files are copied to the CE using scp
  If the job executed successfully, output files are copied back to the WMS using globus-url-copy
By querying the LB, users know when a job is done and can retrieve the output.

33 Summary
The WMS accepts users’ requests for job execution
  Requests are expressed in JDL
  JDL allows users to specify the requirements that the selected resources must meet
The WMS processes a request and chooses (by matchmaking) a Computing Element for the actual execution
  The WMS learns the status of resources by querying the BDII
The CE tries to execute the job and copies the output files back to the WMS
  The status of execution is logged on the LB
Users query the LB, discover that their job is done and download the output files from the WMS

34 Outline
Grid Data Management Challenge
gLite DM Services:
  Storage Elements and SRM
  File Catalog and Data Management tools (lcg_utils)
  The AMGA Metadata Service

35 The Grid DM Challenge
Heterogeneity: data are stored on different storage systems using different access technologies
  Need a common interface to storage resources: Storage Resource Manager (SRM)
Distribution: storage systems are in different locations; in most cases there is no shared file system or common namespace, and data need to be moved between locations
  Need to keep track of where data is stored: File and Replica Catalog
  Need scheduled, reliable file transfer: File Transfer Service
Data description: data are stored as files, so there must be a way to “describe” files and locate them according to their content
  Need a way to associate descriptive information with files and to query it: Metadata Service

36 Storage Resource Management
Storage Resource Management needs to take into account:
  Transparent access to files (migration to/from disk pool)
  File pinning
  Space reservation
  File status notification
  Lifetime management
The SRM (Storage Resource Manager) takes care of all these details.
  It is a single interface that handles local storage interaction and provides a Grid interface to the outside world
In gLite, interactions with the SRM are hidden by higher-level services (the lcg_utils DM tools and APIs).

37 gLite Storage Element
[Diagram: storage back ends behind a Storage Element: disk pools, NAS, MSS]

38 File naming conventions
Logical File Name (LFN): an alias created by a user to refer to some item of data, e.g. “lfn:/grid/gilda/ /run2/track1”
Globally Unique Identifier (GUID): a non-human-readable unique identifier for an item of data, e.g. “guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6”
Site URL (SURL), or Physical File Name (PFN): the location of an actual piece of data on a storage system, e.g. “srm://grid009.ct.infn.it/dpm/ct.infn.it/gilda/output10_1” (SRM)
Transport URL (TURL): a temporary locator of a replica plus an access protocol, understood by an SE, e.g. “rfio://lxshare0209.cern.ch//data/alice/ntuples.dat”
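
The four kinds of name can be told apart by their prefix. The following Python sketch classifies a name and extracts the storage host from a SURL or TURL. The function names are hypothetical, and the set of transport schemes (rfio, gsiftp, dcap) is an assumption drawn from the protocols mentioned in these slides rather than an exhaustive list.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed transport schemes, taken from protocols named in the slides.
TRANSPORT_SCHEMES = {"rfio", "gsiftp", "dcap"}


def classify(name: str) -> str:
    """Return 'LFN', 'GUID', 'SURL' or 'TURL' based on the scheme prefix."""
    scheme, sep, _rest = name.partition(":")
    if not sep:
        raise ValueError("no scheme prefix in " + repr(name))
    scheme = scheme.lower()
    if scheme == "lfn":
        return "LFN"
    if scheme == "guid":
        return "GUID"
    if scheme == "srm":
        return "SURL"
    if scheme in TRANSPORT_SCHEMES:
        return "TURL"
    raise ValueError("unknown scheme: " + scheme)


def storage_host(url: str) -> Optional[str]:
    """Host part of a SURL or TURL; None for names without a host."""
    return urlsplit(url).hostname


if __name__ == "__main__":
    for name in ("guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
                 "srm://grid009.ct.infn.it/dpm/ct.infn.it/gilda/output10_1",
                 "rfio://lxshare0209.cern.ch//data/alice/ntuples.dat"):
        print(classify(name), storage_host(name))
```

Only SURLs and TURLs carry a host: LFNs and GUIDs are location-independent, which is why a catalog is needed to resolve them.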

39 What is a file catalog
[Diagram: a gLite UI, a File Catalog and three SEs]
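
One way to picture a file catalog is as two mappings: from a logical file name to a GUID, and from the GUID to the SURLs of its replicas. The Python sketch below is a toy in-memory model under that assumption, not the LFC interface; the class, its methods, the LFN and the second storage host are hypothetical.

```python
import uuid
from typing import Dict, List


class FileCatalog:
    """Toy catalog: LFN -> GUID -> list of replica SURLs."""

    def __init__(self) -> None:
        self._guid_by_lfn: Dict[str, str] = {}
        self._replicas: Dict[str, List[str]] = {}

    def register(self, lfn: str, surl: str) -> str:
        """Register a new file with its first replica; return its GUID."""
        if lfn in self._guid_by_lfn:
            raise ValueError("LFN already registered: " + lfn)
        guid = "guid:" + str(uuid.uuid4())
        self._guid_by_lfn[lfn] = guid
        self._replicas[guid] = [surl]
        return guid

    def add_replica(self, lfn: str, surl: str) -> None:
        """Record another physical copy of an existing file."""
        replicas = self._replicas[self._guid_by_lfn[lfn]]
        if surl not in replicas:
            replicas.append(surl)

    def replicas(self, lfn: str) -> List[str]:
        """Resolve an LFN to the SURLs of all its replicas."""
        return list(self._replicas[self._guid_by_lfn[lfn]])


if __name__ == "__main__":
    catalog = FileCatalog()
    lfn = "lfn:/grid/gilda/demo/output10_1"        # hypothetical LFN
    catalog.register(lfn, "srm://grid009.ct.infn.it/dpm/ct.infn.it/gilda/output10_1")
    catalog.add_replica(lfn, "srm://se.example.org/dpm/example.org/gilda/output10_1")
    print(catalog.replicas(lfn))
```

The indirection through the GUID means a file can gain or lose replicas on different SEs without its logical name changing.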

40 Metadata on the Grid
Information about files, but not only!
Metadata can describe any grid entity or object
  e.g. JobIDs: add logging information to your jobs
Monitoring of running applications
  e.g. ongoing results from running jobs can be published on the metadata server
Input set for a storm of parametric jobs
Information exchange among grid peers
  e.g. producer/consumer job collections: master jobs produce data to be analyzed; slave jobs query the metadata server to retrieve input to “consume”
Simplified DB access on the grid
  Grid applications that need structured data can model their data schemas as metadata

41 Data Management Services Summary
Storage Elements: store data and provide a common interface
  Storage Resource Manager (SRM): Castor, dCache, DPM, …
  Native access protocols: rfio, dcap, nfs, …
  Transfer protocols: gsiftp, ftp, …
Catalogs: keep track of where data are stored
  File Catalog and Replica Catalog: LCG File Catalog (LFC)
  Metadata Catalog: AMGA Metadata Catalogue
Data Movement: schedules reliable file transfer
  File Transfer Service: gLite FTS (manages physical transfers)

42 gLite Grid Storage Requirements
The Storage Element is the service that allows a user or an application to store data for future retrieval. It must:
  manage local storage (disks) and networked storage (via gpfs/nfs), and interface to Mass Storage Systems (tapes) such as HPSS, CASTOR, DiskeXtender (UNITREE), …
  provide an abstraction of the different storage back ends as a whole
gLite requirements:
  Be able to manage different storage back ends uniformly and transparently for the user (by providing an SRM interface)
  Support basic file transfer protocols
    Globus GridFTP is mandatory
    Others if available (https, ftp, etc.)
  Support a native I/O (remote file) access protocol
    POSIX-like I/O client library for direct access to data


44 Questions?

