EMI INFSO-RI-261611. Grid Services for Computing and Data Access (original title: "Servizi Grid per il calcolo e l'accesso ai dati"). Workshop DUCK, Bologna, 10-11-2010. Francesco Giacomini, INFN-CNAF.


1 Grid Services for Computing and Data Access
EMI INFSO-RI-261611
Workshop DUCK, Bologna, 10-11-2010
Francesco Giacomini, INFN-CNAF

2 Outline
The Grid Infrastructure
The Grid Middleware
The Grid Ecosystem

3 What is a Grid?
In 2002, I. Foster wrote [*] that “... The essence of the [definition] can be captured in a simple checklist, ... a Grid is a system that:
– coordinates resources that are not subject to centralized control ...
– ... using standard, open, general-purpose protocols and interfaces ...
– ... to deliver nontrivial qualities of service.”
Corollaries
– It has to work in a large-scale production environment
– It has to be secure
[*] What is the Grid? A Three Point Checklist

4 THE GRID INFRASTRUCTURE

5 Some numbers
90% LHC Experiments
10% all the others (including the rest of HEP)
Source: Accounting Portal, http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.html

6 Some numbers /2
230 sites (ex-EGEE infrastructure)
~50 countries
~150 Virtual Organizations
10000+ registered users
150K CPU cores
– min/avg/max: 0/640/18000
60+ PB of disk
– min/avg/max: 0/270TB/18PB
60+ PB of tape
– min/avg/max: 0/290TB/31PB
Everybody can join the Grid!
Source: Gstat - https://gstat-prod.cern.ch/gstat/summary/

7 THE GRID MIDDLEWARE

8 Basic Use Case
Slightly adapted from the OGF PGI WG:
Given a job description, a meta-scheduler interrogates the information system and locates an optimal execution resource, which in turn runs the job, fetching the necessary input data from remote storage. Upon completion, the newly created data is uploaded to a storage resource and registered in the necessary data indexing catalogs, and the job record is updated in the accounting and monitoring system.
Other use cases are possible
– e.g. pilot jobs, to create an overlay network of agents under the control of the application framework

9 1st Law of the Grid
95% of the Grid is agreement
Key terms
– No centralized control
– Coordination
– Standards
– Protocols
– Interfaces
Standards, protocols, interfaces, ... aim at providing common abstractions of different implementations of similar services
The real challenge is finding the most appropriate abstractions that allow real work to be accomplished

10 The Big Picture
[Diagram. Layer labels: Site, Low-level MW, High-level MW, Application MW. Component labels: Computing Resources, Storage Resources, Indexes/Catalogs, Job Management, Data Management, (Scientific) Applications.]

11 Properties of Security
The one and only nice property of security:
– Security creates assurance
– That’s why we need it!
A pretty useless property of security:
– Security doesn’t add functionality
– That’s why we are tempted to ignore it!
The many stupid properties of security:
– Imposes limits, which often vary over time (often suddenly)
– Creates dependencies on things beyond our control
– Is utterly unimpressed by cool features, but forces us to think about very weird stuff happening in weird circumstances
– In short: it’s a pain!
– That’s why we hate it!
(© Christoph Witzig, EGEE-III Security Architect)

12 Identity
A Grid may count hundreds of sites. Do I need an account on each of them? No.
A Grid identity is managed with an X.509 certificate
/C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Francesco Giacomini
A certificate is issued by a trusted Certification Authority
A Grid identity is transparently mapped to a local identity/account, provided the authorization is granted
Better integration with existing Authentication and Authorization Infrastructures (AAI) is planned
– Shibboleth, Kerberos
– Short-Lived Credential Service (SLCS)
– Lowers an important entry barrier to using the Grid
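The identity mapping on this slide can be illustrated with a minimal sketch. It assumes the one-line, slash-separated subject format shown above and a purely hypothetical mapping table (`GRID_MAP`, the account name `grid001`, and both function names are invented for illustration; they are not part of any middleware API). A real site would rely on its own authorization and mapping services rather than a hard-coded dictionary.

```python
from typing import List, Optional, Tuple

def parse_dn(dn: str) -> List[Tuple[str, str]]:
    """Split a slash-separated certificate subject into (attribute, value) pairs.

    Simplification: values that themselves contain '/' are not handled.
    """
    pairs = []
    for part in dn.strip().split("/"):
        if not part:
            continue  # skip the empty string before the leading slash
        key, _, value = part.partition("=")
        pairs.append((key, value))
    return pairs

# Hypothetical site-local table: Grid identity -> local account.
GRID_MAP = {
    "/C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Francesco Giacomini": "grid001",
}

def map_to_local_account(dn: str, authorized: bool) -> Optional[str]:
    """Return the local account for a Grid identity, or None if not authorized or unknown."""
    if not authorized:
        return None
    return GRID_MAP.get(dn)

dn = "/C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Francesco Giacomini"
print(parse_dn(dn))
print(map_to_local_account(dn, authorized=True))
```

The user holds a single identity; each site decides independently whether and how to map it to a local account.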

13 Membership Management
A user typically belongs to a (scientific) community (e.g. a physics experiment) and to different groups within that community
On the Grid a community is a Virtual Organization (VO)
Identities can be associated with VOs and groups by the VO Membership Service (VOMS)
Information on membership, roles within VOs and groups, and other generic attributes is included in the user's credentials
– Attribute Certificate Proxies
ACs are used, for example, in the AuthZ phase when accessing a service
It's difficult to use the Grid if you are not part of a VO
– Won't change much in the near future

14 Core Services: Computing
Provide the possibility to execute a computation
But also:
– Get the status of the computation
– Cancel the computation
Computing resources are typically provided by possibly large farms of computers (Worker Nodes)
– Usually managed by a batch system (e.g. LSF, PBS, Condor)
The corresponding Grid abstraction is called a Computing Element (CE)
BES/JSDL standard available, but not adequate for production use
Example of CE: CREAM

15 Core Services: Storage
Provide the possibility to manage the storage of data
– Data are typically in the form of files
Create, read, write, delete files/directories
Reserve space, stage files to disk from tape and vice versa, pin files
Storage may be provided using different technologies
– Disk, tape, a combination of the two, ...
– Different implementations
The corresponding Grid abstraction is called a Storage Element (SE)
– SRM standard for management
– GridFTP for transfer
  HTTP coming
Example of SE: StoRM

16 Information System
How do I know which resources or services are available?
How do I know which ones I can use?
Services publish their existence, characteristics and status in the Information System
– The information is published according to an agreed-upon schema, called the GLUE schema
– The most common implementation is based on LDAP and is called BDII
Service Discovery

17 File Catalog
How do I name a file?
– Logical File Name (LFN), user-friendly: /cms/20030203/run2/track1
– GUID, machine-friendly: f81d4fae-7dec-11d0-a765-00a0c91e6bf6
Where is a (copy of a) certain file stored?
– A data file can be replicated in multiple places to improve availability
A File Catalog translates an LFN/GUID to physical locations (possibly more than one)
srm://pcrd24.cern.ch/flatfiles/cms/output10_1
Example of File Catalog: LFC
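The translation performed by a File Catalog can be sketched as two lookups: LFN to GUID, then GUID to replica locations. The class below is a toy in-memory model, not the LFC interface; `FileCatalog`, `register` and `replicas` are hypothetical names. It also assumes, for illustration only, that the LFN, GUID and SRM URL shown on the slide refer to the same file, and it adds an invented second replica on `se.example.org`.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FileCatalog:
    """Toy catalog: LFN -> GUID -> list of physical replica locations."""
    lfn_to_guid: Dict[str, str] = field(default_factory=dict)
    guid_to_replicas: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, lfn: str, guid: str, location: str) -> None:
        """Record that a copy of the file (lfn, guid) is stored at location."""
        self.lfn_to_guid[lfn] = guid
        self.guid_to_replicas.setdefault(guid, []).append(location)

    def replicas(self, name: str) -> List[str]:
        """Resolve an LFN or a GUID to all known physical locations."""
        guid = self.lfn_to_guid.get(name, name)  # fall back to treating name as a GUID
        return list(self.guid_to_replicas.get(guid, []))

catalog = FileCatalog()
LFN = "/cms/20030203/run2/track1"
GUID = "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
catalog.register(LFN, GUID, "srm://pcrd24.cern.ch/flatfiles/cms/output10_1")
catalog.register(LFN, GUID, "srm://se.example.org/cms/output10_1")  # invented replica

print(catalog.replicas(LFN))
print(catalog.replicas(GUID))
```

Keeping the GUID as the stable key means the human-readable name and the set of replicas can change independently of each other.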

18 Higher-level Services
Finding resources and performing operations on core services by hand is at best inconvenient
Higher-level services are needed
– Applications often have their own
Other advantages
– Optimizations are easier to achieve
– Management of errors
As a general rule, a high-level service accepts a logical description of what the user wants to achieve and transforms it into a concrete operation, following it until completion
A high-level service operates on behalf of the user
– Credential delegation/impersonation
– Not completely satisfactory

19 2nd Law of the Grid
Anything that can go wrong, will
The Grid is a very complex environment, with many cooperating components
Components are heterogeneous, provided by different parties, managed by different parties, etc.
Some errors are preventable, some are manageable by the infrastructure, some can only be managed by the user

20 High-level Services: Job Management
The Workload Management System (WMS) is responsible for the distribution and management of tasks across Grid resources, in particular Computing Elements, in such a way that applications are conveniently, efficiently and effectively executed
Complemented by the Logging&Bookkeeping Service
– Keep track of a number of events generated by different components involved in job management
– Provide query interface
  Including the status of a job
– Specifically designed and optimized for this purpose

21 Job Description Language
[
  Executable = "my_exe";
  StdOutput = "out";
  StdError = "err";
  Arguments = "a b c";
  InputSandbox = {"/home/giaco/my_exe"};
  OutputSandbox = {"out", "err"};
  Requirements = Member(other.GlueHostApplicationSoftwareRunTimeEnvironment, "ALICE3.07.01");
  Rank = -other.GlueCEStateEstimatedResponseTime;
  RetryCount = 3
]
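How the Requirements and Rank expressions in this job description could drive the choice of a Computing Element is sketched below. This is not the WMS implementation: it is a toy matchmaker over hard-coded records, where the two GLUE attribute names come from the slide and everything else (the CE identifiers, the response times, the tag `OTHER-SW`, the function names) is invented for illustration. Requirements acts as a filter, and among the survivors the CE with the highest Rank wins, which here means the lowest estimated response time.

```python
from typing import Dict, List, Optional

RUNTIME_ENV = "GlueHostApplicationSoftwareRunTimeEnvironment"
RESPONSE_TIME = "GlueCEStateEstimatedResponseTime"

# Invented information-system content; a real one would be queried, not hard-coded.
COMPUTING_ELEMENTS: List[Dict] = [
    {"id": "ce-a.example.org", RUNTIME_ENV: ["ALICE3.07.01"], RESPONSE_TIME: 1200},
    {"id": "ce-b.example.org", RUNTIME_ENV: ["ALICE3.07.01", "OTHER-SW"], RESPONSE_TIME: 300},
    {"id": "ce-c.example.org", RUNTIME_ENV: ["OTHER-SW"], RESPONSE_TIME: 10},
]

def satisfies_requirements(ce: Dict, required_tag: str) -> bool:
    """Requirements: the CE must advertise the requested runtime environment tag."""
    return required_tag in ce[RUNTIME_ENV]

def rank(ce: Dict) -> float:
    """Rank = -EstimatedResponseTime, so a shorter wait gives a higher rank."""
    return -ce[RESPONSE_TIME]

def select_ce(ces: List[Dict], required_tag: str) -> Optional[Dict]:
    """Filter by Requirements, then pick the candidate with the highest Rank."""
    candidates = [ce for ce in ces if satisfies_requirements(ce, required_tag)]
    if not candidates:
        return None  # no resource matches: the job cannot be scheduled
    return max(candidates, key=rank)

best = select_ce(COMPUTING_ELEMENTS, "ALICE3.07.01")
print(best["id"] if best else "no match")
```

Note that `ce-c.example.org` has the shortest response time but is never considered, because it fails the Requirements filter.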

22 Job Scheduling

23 Accounting
How many resources have I used?
How many resources has a certain VO used?
An accounting system provides support to give precise answers to such questions
– Collect information at resource level
– Propagate the info to higher levels, where it can be aggregated according to different views
Base for grid-wide quotas
– Not available yet
Limited to computing resources at the moment
– Storage accounting is planned
Example of accounting systems: DGAS
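The "collect at resource level, aggregate according to different views" idea can be sketched as a group-and-sum over usage records. The record layout, the field names and all values below are invented for illustration and do not reflect the DGAS data model.

```python
from collections import defaultdict
from typing import Dict, Iterable

# Invented per-job usage records, as they might be collected at resource level.
USAGE_RECORDS = [
    {"user": "user1", "vo": "voA", "site": "site1", "cpu_hours": 10},
    {"user": "user1", "vo": "voA", "site": "site2", "cpu_hours": 5},
    {"user": "user2", "vo": "voA", "site": "site1", "cpu_hours": 20},
    {"user": "user3", "vo": "voB", "site": "site2", "cpu_hours": 7},
]

def aggregate(records: Iterable[Dict], view: str) -> Dict[str, int]:
    """Sum CPU hours grouped by the given view ('user', 'vo' or 'site')."""
    totals: Dict[str, int] = defaultdict(int)
    for record in records:
        totals[record[view]] += record["cpu_hours"]
    return dict(totals)

print(aggregate(USAGE_RECORDS, "user"))  # how many resources have I used?
print(aggregate(USAGE_RECORDS, "vo"))    # how many resources has a VO used?
print(aggregate(USAGE_RECORDS, "site"))
```

The same records answer both questions on the slide; only the grouping key changes.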

24 Interoperability and Standardization
Other (scientific) Grids exist
– NorduGrid, DEISA, OSG, ...
– Often serving the same community
Interoperability through a five-step plan
– Requirements collection
  Identify a clear need to interoperate with another Grid
  Involve the users
– Analysis
  Understand the similarities and differences between the infrastructures
– Development
  Find and implement a solution that just works: parallel deployment, adaptors and translators, gateways, ...
– Support
  Maintain a production-quality service
– Standardization

25 THE GRID ECOSYSTEM

26 EGI
A pan-European Grid Infrastructure
Collaboration between:
– National Grid Initiatives (IGI in Italy)
– Research Communities
– Coordination body EGI.eu
  Foundation under Dutch law
Mission: guarantee the long-term availability of a generic e-infrastructure for all European research communities and their international collaborators

27 EGI-InSPIRE
Integrated Sustainable Pan-European Infrastructure for Researchers in Europe
EU-funded project to support the start of EGI
– Project budget: 72 M€
– EU contribution: 25 M€
– Duration: 4 years
– Effort: 9261 PM
– Project partners: 50

28 EMI
The European Middleware Initiative (EMI) project represents a close collaboration of the major European middleware providers (ARC, gLite, UNICORE and dCache) to establish a sustainable model to support, harmonise and evolve distributed computing middleware for deployment in EGI, PRACE and other distributed e-Infrastructures

29 EMI Primary Objectives
Consolidate: Consolidate the existing middleware distribution, simplifying services and components to make them more sustainable (including use of off-the-shelf and commercial components whenever possible)
Evolve: Evolve the middleware services/functionality following the requirements of infrastructures and users, mainly focusing on operational, standardization and interoperability aspects
Support: Reactively and proactively maintain the middleware distribution to keep it in line with the growing infrastructure usage

30 EMI Partners
Project budget: ~23 M€
EC contribution: 12 M€
Duration: 3 years
Effort: 2319 PM

31 Conclusions
The Grid is a reality and everybody can join
95% of the Grid is agreement
Anything that can go wrong, will
EGI, EMI and other projects are working on the long-term sustainability of the infrastructure

