EMI INFSO-RI
Grid Services for Computing and Data Access
Workshop DUCK – Bologna
Francesco Giacomini, INFN-CNAF
Outline
  The Grid Infrastructure
  The Grid Middleware
  The Grid Ecosystem
Grid Services - Workshop DUCK, Bologna
What is a Grid?
  In 2002, I. Foster wrote [*] that “... The essence of the [definition] can be captured in a simple checklist, ... a Grid is a system that:
  – coordinates resources that are not subject to centralized control …
  – … using standard, open, general-purpose protocols and interfaces ...
  – … to deliver nontrivial qualities of service.”
  Corollaries
  – It has to work in a large-scale production environment
  – It has to be secure
  [*] What is the Grid? A Three Point Checklist
THE GRID INFRASTRUCTURE
Some numbers
  Source: Accounting Portal
  – 90% LHC experiments
  – 10% all the others (including the rest of HEP)
Some numbers /2
  sites (ex-EGEE infrastructure)
  ~50 countries
  ~150 Virtual Organizations
  registered users
  150K CPU cores
  – min/avg/max: 0/640/
  PB of disk
  – min/avg/max: 0/270TB/18PB
  60+ PB of tape
  – min/avg/max: 0/290TB/31PB
  Everybody can join the Grid!
  Source: Gstat
THE GRID MIDDLEWARE
Basic Use Case
  Slightly adapted from the OGF PGI WG:
  Given a job description, a meta-scheduler interrogates the information system and locates an optimal execution resource, which in turn runs the job, fetching the necessary input data from remote storage. Upon completion, the newly created data is uploaded to a storage resource and registered in the necessary data indexing catalogs, and the job record is updated in the accounting and monitoring system.
  Other use cases are possible
  – e.g. pilot jobs, to create an overlay network of agents under the control of the application framework
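The sequence of steps in the basic use case can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `ToyGrid`, `run_job`, the dictionary-based services and the host name `se.example.org` are hypothetical stand-ins, not part of any Grid middleware, and "optimal" is reduced to "most free slots".

```python
from dataclasses import dataclass, field


@dataclass
class ToyGrid:
    """Hypothetical in-memory stand-in for the services named in the use case."""
    info_system: dict                                # CE name -> free slots
    storage: dict = field(default_factory=dict)      # physical name -> data
    catalog: dict = field(default_factory=dict)      # logical name -> physical name
    accounting: list = field(default_factory=list)   # one record per finished job


def run_job(grid, job):
    # 1. The meta-scheduler interrogates the information system and picks
    #    the resource with the most free slots (a deliberately naive "optimal").
    candidates = {ce: free for ce, free in grid.info_system.items() if free > 0}
    if not candidates:
        raise RuntimeError("no execution resource available")
    ce = max(candidates, key=candidates.get)

    # 2. The execution resource fetches the input data from remote storage.
    data = grid.storage[grid.catalog[job["input"]]]

    # 3. The job runs.
    result = job["payload"](data)

    # 4. The output is uploaded to storage and registered in the catalog.
    physical = "se.example.org/" + job["output"].lstrip("/")
    grid.storage[physical] = result
    grid.catalog[job["output"]] = physical

    # 5. The job record is updated in the accounting system.
    grid.accounting.append({"job": job["name"], "ce": ce})
    return ce
```

A pilot-job framework would replace step 1 with agents that pull work themselves; the remaining steps stay conceptually the same.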
1st Law of the Grid
  95% of the Grid is agreement
  Key terms
  – No centralized control
  – Coordination
  – Standards
  – Protocols
  – Interfaces
  Standards, protocols, interfaces, ... aim at providing common abstractions of different implementations of similar services
  The real challenge is finding the most appropriate abstractions that allow real work to be accomplished
The Big Picture
  (diagram) Layers: Application, High-level MW, Low-level MW, Site
  (diagram) Components: (Scientific) Applications, Job Management, Data Management, Indexes/Catalogs, Computing Resources, Storage Resources
Properties of Security
  The one and only nice property of security:
  – Security creates assurance
  – That’s why we need it!
  A pretty useless property of security:
  – Security doesn’t add functionality
  – That’s why we are tempted to ignore it!
  The many stupid properties of security:
  – Imposes limits, which often vary over time (often suddenly)
  – Creates dependencies on things beyond our control
  – Is utterly unimpressed by cool features, but forces us to think about very weird stuff happening in weird circumstances
  – In short: it’s a pain!
  – That’s why we hate it!
  (© Christoph Witzig, EGEE-III Security Architect)
Identity
  A Grid may count hundreds of sites. Do I need an account on each of them? No
  A Grid identity is managed with an X.509 certificate
    /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Francesco Giacomini
  A certificate is issued by a trusted Certification Authority
  A Grid identity is transparently mapped to a local identity/account, provided the authorization is granted
  Better integration with existing Authentication and Authorization Infrastructures (AAI) is planned
  – Shibboleth, Kerberos
  – Short-Lived Credential Service (SLCS)
  – Lowers an important entry barrier to using the Grid
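The mapping from a Grid identity to a local account can be pictured as a lookup keyed by the certificate subject. The sketch below is an assumption-laden illustration: the function name, the `banned` set and the account name `grid001` are hypothetical, and real sites use richer mechanisms than a plain dictionary.

```python
def map_to_local_account(subject_dn, grid_map, banned=frozenset()):
    """Return the local account for a certificate subject, or None if not authorized.

    grid_map: dict mapping certificate subjects to local account names (hypothetical).
    banned:   subjects whose authorization has been withdrawn (hypothetical).
    """
    if subject_dn in banned:
        return None
    return grid_map.get(subject_dn)


# Example data; the account name is made up for illustration.
EXAMPLE_MAP = {
    "/C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Francesco Giacomini": "grid001",
}
```

The user never sees the local account name: the site resolves it at the moment the authenticated request arrives.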
Membership Management
  A user typically belongs to a (scientific) community (e.g. a physics experiment) and to different groups within that community
  On the Grid a community is a Virtual Organization (VO)
  Identities can be associated with VOs and groups by the VO Membership Service (VOMS)
  Information on membership, roles within the VO and its groups, and other generic attributes is included in the user's credentials
  – Attribute Certificates (ACs), carried in proxy certificates
  ACs are used, for example, in the AuthZ phase when accessing a service
  It's difficult to use the Grid if you are not part of a VO
  – Won't change much in the near future
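VOMS expresses group and role membership as strings of the form `/vo/group/Role=role/Capability=NULL`. A minimal sketch of how a service might read such attributes during authorization follows; `parse_fqan` and `is_authorized` are hypothetical helpers written for this example, not VOMS API calls, and the VO and role names in the usage are made up.

```python
def parse_fqan(fqan):
    """Split a VOMS-style attribute string into VO, group path and role."""
    parts = [p for p in fqan.split("/") if p]
    groups = [p for p in parts if "=" not in p]          # e.g. ["cms", "higgs"]
    attrs = dict(p.split("=", 1) for p in parts if "=" in p)
    role = attrs.get("Role")
    return {
        "vo": groups[0],
        "group": "/" + "/".join(groups),
        "role": None if role in (None, "NULL") else role,
    }


def is_authorized(fqans, vo, role=None):
    """True if any attribute places the user in `vo` (and in `role`, if given)."""
    for fqan in fqans:
        parsed = parse_fqan(fqan)
        if parsed["vo"] == vo and (role is None or parsed["role"] == role):
            return True
    return False
```

For instance, `is_authorized(["/cms/higgs/Role=production/Capability=NULL"], "cms", role="production")` grants access, while the same call for a different VO does not.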
Core Services: Computing
  Provide the possibility to execute a computation
  But also:
  – Get the status of the computation
  – Cancel the computation
  Computing resources are typically provided by possibly large farms of computers (Worker Nodes)
  – Usually managed by a batch system (e.g. LSF, PBS, Condor)
  The corresponding Grid abstraction is called a Computing Element (CE)
  BES/JSDL standard available, but not adequate for production use
  Example of CE: CREAM
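The three operations listed above (execute, get status, cancel) suggest a small common interface in front of whatever batch system a site runs. The following sketch is an assumption, not the CREAM or BES interface: `ComputingElement`, `ToyBatchCE` and the state names are invented for illustration, and the "batch system" is just a FIFO list.

```python
import itertools
from abc import ABC, abstractmethod


class ComputingElement(ABC):
    """Hypothetical common abstraction over different batch systems."""

    @abstractmethod
    def submit(self, task): ...

    @abstractmethod
    def status(self, job_id): ...

    @abstractmethod
    def cancel(self, job_id): ...


class ToyBatchCE(ComputingElement):
    """In-memory FIFO queue standing in for a real batch system."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._queue = []      # pending (job_id, task) pairs
        self._state = {}      # job_id -> "QUEUED" | "DONE" | "CANCELLED"
        self.results = {}     # job_id -> return value of the task

    def submit(self, task):
        job_id = next(self._ids)
        self._queue.append((job_id, task))
        self._state[job_id] = "QUEUED"
        return job_id

    def status(self, job_id):
        return self._state[job_id]

    def cancel(self, job_id):
        # Only jobs still waiting in the queue can be cancelled in this toy model.
        if self._state[job_id] == "QUEUED":
            self._queue = [(j, t) for j, t in self._queue if j != job_id]
            self._state[job_id] = "CANCELLED"

    def run_next(self):
        # Simulates a worker node picking up the oldest queued job.
        if self._queue:
            job_id, task = self._queue.pop(0)
            self.results[job_id] = task()
            self._state[job_id] = "DONE"
```

A client written against `ComputingElement` does not need to know which batch system sits behind it, which is the point of the abstraction.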
Core Services: Storage
  Provide the possibility to manage the storage of data
  – Data are typically in the form of files
  Create, read, write, delete files/directories
  Reserve space, stage files to disk from tape and vice versa, pin files
  Storage may be provided using different technologies
  – Disk, tape, a combination of the two, ...
  – Different implementations
  The corresponding Grid abstraction is called a Storage Element (SE)
  – SRM standard for management
  – GridFTP for transfer; HTTP coming
  Example of SE: StoRM
Information System
  How do I know which resources or services are available?
  How do I know which ones I can use?
  Services publish their existence, characteristics and status in the Information System
  – The information is published according to an agreed-upon schema, called the GLUE schema
  – The most common implementation is based on LDAP and is called BDII
  Service Discovery
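The two questions on this slide amount to a filter over published records. The sketch below runs that filter over plain dictionaries instead of a real LDAP query; the attribute names are modeled on GLUE-style names and should be treated as illustrative, and the host names and the `usable_ces` function are hypothetical.

```python
def usable_ces(records, vo, min_free_cpus=0):
    """Return the IDs of published CEs that admit `vo` and have enough free CPUs."""
    rule = f"VO:{vo}"
    matches = [
        r for r in records
        if rule in r["GlueCEAccessControlBaseRule"]
        and r["GlueCEStateFreeCPUs"] >= min_free_cpus
    ]
    return sorted(r["GlueCEUniqueID"] for r in matches)


# Hypothetical records, as a service might publish them.
PUBLISHED = [
    {"GlueCEUniqueID": "ce01.example.org/cream-pbs-long",
     "GlueCEAccessControlBaseRule": ["VO:cms", "VO:alice"],
     "GlueCEStateFreeCPUs": 12},
    {"GlueCEUniqueID": "ce02.example.org/cream-lsf-short",
     "GlueCEAccessControlBaseRule": ["VO:atlas"],
     "GlueCEStateFreeCPUs": 40},
    {"GlueCEUniqueID": "ce03.example.org/cream-pbs-short",
     "GlueCEAccessControlBaseRule": ["VO:cms"],
     "GlueCEStateFreeCPUs": 0},
]
```

Because every service publishes against the same schema, one query works across sites regardless of the batch system or storage technology behind each record.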
File Catalog
  How do I name a file?
  – Logical File Name (LFN) – user-friendly: /cms/ /run2/track1
  – GUID – machine-friendly: f81d4fae-7dec-11d0-a765-00a0c91e6bf6
  Where is a (copy of a) certain file stored?
  – A data file can be replicated in multiple places to improve availability
  A File Catalog translates an LFN/GUID to physical locations (possibly more than one)
    srm://pcrd24.cern.ch/flatfiles/cms/output10_1
  Example of File Catalog: LFC
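The translation performed by a file catalog is a two-level lookup: LFN to GUID, then GUID to one or more physical locations. The class below is a toy sketch of that idea, not the LFC interface; `FileCatalog`, its method names, the LFN `/myvo/run2/track1` and the host `se.example.org` are hypothetical.

```python
class FileCatalog:
    """Toy catalog: LFN -> GUID -> list of physical locations."""

    def __init__(self):
        self._guid_by_lfn = {}
        self._replicas_by_guid = {}

    def register(self, lfn, guid, surl):
        # First registration of a file: bind the name and record one location.
        self._guid_by_lfn[lfn] = guid
        self._replicas_by_guid.setdefault(guid, []).append(surl)

    def add_replica(self, guid, surl):
        # A further copy of an already registered file.
        self._replicas_by_guid[guid].append(surl)

    def replicas(self, name):
        # Accept either an LFN or a GUID.
        guid = self._guid_by_lfn.get(name, name)
        return list(self._replicas_by_guid.get(guid, []))
```

A client can then pick whichever replica is closest or currently available, which is how replication improves availability.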
Higher-level Services
  Finding resources and performing operations on core services by hand is at best inconvenient
  Higher-level services are needed
  – Applications often have their own
  Other advantages
  – Optimizations are easier to achieve
  – Management of errors
  As a general rule, a high-level service accepts a logical description of what the user wants to achieve and transforms it into a concrete operation, following it until completion
  A high-level service operates on behalf of the user
  – Credential delegation/impersonation
  – Not completely satisfactory
2nd Law of the Grid
  Anything that can go wrong, will
  The Grid is a very complex environment, with many cooperating components
  Components are heterogeneous, provided by different parties, managed by different parties, etc.
  Some errors are preventable, some are manageable by the infrastructure, some can only be managed by the user
High-level Services: Job Management
  The Workload Management System (WMS) is responsible for the distribution and management of tasks across Grid resources, in particular Computing Elements, in such a way that applications are conveniently, efficiently and effectively executed
  Complemented by the Logging & Bookkeeping Service
  – Keeps track of a number of events generated by the different components involved in job management
  – Provides a query interface, including for the status of a job
  – Specifically designed and optimized for this purpose
Job Description Language
  [
    Executable = "my_exe";
    StdOutput = "out";
    StdError = "err";
    Arguments = "a b c";
    InputSandbox = {"/home/giaco/my_exe"};
    OutputSandbox = {"out", "err"};
    Requirements = Member(other.GlueHostApplicationSoftwareRunTimeEnvironment, "ALICE");
    Rank = -other.GlueCEStateEstimatedResponseTime;
    RetryCount = 3
  ]
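The `Requirements` and `Rank` expressions in the job description drive matchmaking: the first filters the candidate resources, the second orders the survivors. A minimal Python sketch of that two-step selection, mirroring the expressions above, is given here; the `match` function and the resource records are hypothetical, and this is not how the WMS is actually implemented.

```python
def match(requirements, rank, resources):
    """Return the eligible resource with the highest rank, or None if none qualifies."""
    eligible = [r for r in resources if requirements(r)]
    if not eligible:
        return None
    return max(eligible, key=rank)


# Python counterparts of the two JDL expressions ("other" is the candidate resource).
def requirements(other):
    return "ALICE" in other["GlueHostApplicationSoftwareRunTimeEnvironment"]


def rank(other):
    return -other["GlueCEStateEstimatedResponseTime"]


# Hypothetical candidate resources.
RESOURCES = [
    {"id": "ce-a", "GlueHostApplicationSoftwareRunTimeEnvironment": ["ALICE"],
     "GlueCEStateEstimatedResponseTime": 600},
    {"id": "ce-b", "GlueHostApplicationSoftwareRunTimeEnvironment": ["ALICE", "CMS"],
     "GlueCEStateEstimatedResponseTime": 120},
    {"id": "ce-c", "GlueHostApplicationSoftwareRunTimeEnvironment": ["CMS"],
     "GlueCEStateEstimatedResponseTime": 5},
]
```

Here `ce-c` has the shortest estimated response time but fails the requirement, so the job goes to `ce-b`; negating the response time makes "higher rank" mean "shorter wait".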
Job Scheduling (figure)
Accounting
  How many resources have I used?
  How many resources has a certain VO used?
  An accounting system provides support to give precise answers to such questions
  – Collects information at the resource level
  – Propagates the information to higher levels, where it can be aggregated according to different views
  Basis for grid-wide quotas
  – Not available yet
  Limited to computing resources at the moment
  – Storage accounting is planned
  Example of accounting system: DGAS
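Aggregating resource-level records "according to different views" is essentially a group-by over usage records. The sketch below shows the idea with made-up records; the field names (`user`, `vo`, `site`, `cpu_hours`) and the `aggregate` function are assumptions for illustration and do not reflect the DGAS record format.

```python
from collections import defaultdict


def aggregate(records, key):
    """Sum CPU hours over the usage records, grouped by the given field."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec["cpu_hours"]
    return dict(totals)


# Hypothetical usage records collected at resource level.
USAGE = [
    {"user": "alice", "vo": "cms",   "site": "site-1", "cpu_hours": 1.5},
    {"user": "alice", "vo": "cms",   "site": "site-2", "cpu_hours": 2.5},
    {"user": "bob",   "vo": "atlas", "site": "site-1", "cpu_hours": 4.0},
]
```

The same records answer both questions on the slide: `aggregate(USAGE, "user")` gives per-user totals and `aggregate(USAGE, "vo")` gives per-VO totals.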
Interoperability and Standardization
  Other (scientific) Grids exist
  – NorduGrid, DEISA, OSG, ...
  – Often serving the same community
  Interoperability through a five-step plan
  – Requirements collection: identify a clear need to interoperate with another Grid; involve the users
  – Analysis: understand the similarities and differences between the infrastructures
  – Development: find and implement a solution that just works (parallel deployment, adaptors and translators, gateways, ...)
  – Support: maintain a production-quality service
  – Standardization
THE GRID ECOSYSTEM
EGI
  A pan-European Grid Infrastructure
  Collaboration between:
  – National Grid Initiatives (IGI in Italy)
  – Research Communities
  – Coordination body: EGI.eu, a foundation under Dutch law
  Mission: guarantee the long-term availability of a generic e-infrastructure for all European research communities and their international collaborators
EGI-InSPIRE
  Integrated Sustainable Pan-European Infrastructure for Researchers in Europe
  EU-funded project to support the start of EGI
  – Project budget: 72 M€
  – EU contribution: 25 M€
  – Duration: 4 years
  – Effort: 9261 PM
  – Project partners:
EMI
  The European Middleware Initiative (EMI) project represents a close collaboration of the major European middleware providers (ARC, gLite, UNICORE and dCache) to establish a sustainable model to support, harmonise and evolve distributed computing middleware for deployment in EGI, PRACE and other distributed e-Infrastructures
EMI Primary Objectives
  Consolidate: consolidate the existing middleware distribution, simplifying services and components to make them more sustainable (including use of off-the-shelf and commercial components whenever possible)
  Evolve: evolve the middleware services/functionality following the requirements of infrastructures and users, mainly focusing on operational, standardization and interoperability aspects
  Support: reactively and proactively maintain the middleware distribution to keep it in line with the growing infrastructure usage
EMI Partners
  Project budget: ~23 M€
  EC contribution: 12 M€
  Duration: 3 years
  Effort: 2319 PM
Conclusions
  The Grid is a reality and everybody can join
  95% of the Grid is agreement
  Anything that can go wrong, will
  EGI, EMI and other projects are working on the long-term sustainability of the infrastructure