II EGEE conference Den Haag November, ROC-CIC status in Italy
Summary: The INFN-GRID release; Resource Centres and supported VOs; Services; Site certification; Support: the ticketing system; Installation activities; Ongoing activities; Open issues; Some useful links
The INFN-GRID Release: The INFN-GRID release is based upon the official LCG release and is 100% compatible with it. Additions to LCG 2.2.0: added support for DAG jobs; added support for AFS on the WorkerNodes; added support for MPI jobs via home synchronisation with ssh; documented installation of WNs on a private network; added VOMS+LCAS/LCMAPS (the CDF, ZEUS and INFNGRID VOs are managed via a VOMS server). All resources are fully managed via LCFGng.
INFN-GRID: Resource Centres and supported VOs
INFN-GRID: The Production Grid Services. Grid services are open to all supported VOs. Scope: Italian Grid. NEW! Resource Broker/UI with DAG support: prod-rb-01.pd.infn.it
EGEE/LCG: The Production Grid Services. Grid services are open to all VOs supported by INFN-GRID and EGEE/LCG. egee-rb-01.cnaf.infn.it supports the BIOMED VO. An RB/UI with DAG support is available.
Site certification: Testing whether "the grid is working" is not easy. Certification activity in INFN-GRID can be classified into four levels: local tests by the local resource centre managers; certification tests by the CMT*; monitoring tests by the CMT; certification on demand, performed by both the CMT and the application teams. * CMT (Central Management Team): corresponds to the ROC+CIC deployment teams according to the EGEE execution plan.
Local site tests: These tests can be performed by the local resource centre manager just after an installation or upgrade, or later when problems are reported by users or found by our periodic test activity. All nodes: check that every node mounts the LCFGng RPM repository from the LCFGng server; CE/SE: verify the file access permissions and check the validity and the subject of the host certificate; CE: check that the local scheduler works locally; SE: the SE storage area should contain one directory for each supported VO, with the appropriate permissions and owners; WN: WNs should have pool accounts for each supported VO.
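The SE and WN checks above lend themselves to a small script. The following is only a minimal sketch under stated assumptions, not the tooling used by INFN-GRID: the function names are hypothetical, the storage root is whatever path the site uses, and pool accounts are assumed to be named as the VO name followed by digits (for example `cdf001`). It reports mode and ownership rather than judging them, since the slides do not say which permissions are expected.

```python
import os
import re
import stat


def vo_dir_report(storage_root, vos):
    """Map each VO to (permission bits, uid, gid) of its directory under
    storage_root, or to None when the directory is missing."""
    report = {}
    for vo in vos:
        path = os.path.join(storage_root, vo)
        if not os.path.isdir(path):
            report[vo] = None
            continue
        info = os.stat(path)
        report[vo] = (stat.S_IMODE(info.st_mode), info.st_uid, info.st_gid)
    return report


def pool_account_counts(account_names, vos):
    """Count pool accounts per VO.

    Assumption: a pool account is named <vo><digits>, e.g. "cdf001".
    """
    counts = {vo: 0 for vo in vos}
    for name in account_names:
        for vo in vos:
            if re.fullmatch(re.escape(vo) + r"\d+", name):
                counts[vo] += 1
    return counts
```

On a real node the account names could be taken from `pwd.getpwall()`; a VO whose entry in the report is `None`, or whose account count is zero, would be the thing to investigate.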
Certification tests: The Central Management Team is responsible for Resource Centre certification, i.e. checking the functionality of a site before it joins the production grid. Although all certification jobs are VO-independent, the INFNGRID VO is used to run them. In particular, the following are checked: consistency of the GIIS information; local job submission (LRMS); grid submission with Globus (globus-job-run); grid submission with the Resource Broker; ReplicaManager functionality; MPI functionality. To certify a site, the CMT uses dedicated grid services (TEST ZONE): RB & BDII: gridit-cert-rb.cnaf.infn.it. In this way we avoid having an uncertified site in the production grid service.
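The certification sequence can be pictured as an ordered list of named checks whose results decide whether a site may join the production grid. The sketch below is an illustration only: `run_certification` and the stub checks are hypothetical, the real tests submit jobs through the dedicated test-zone services, and running every check even after a failure is an assumption rather than documented CMT behaviour.

```python
# Check names taken from the certification slide; the callables that
# implement them are site-specific and are NOT provided here.
CHECK_NAMES = [
    "GIIS information consistency",
    "local job submission (LRMS)",
    "grid submission with Globus (globus-job-run)",
    "grid submission with the Resource Broker",
    "ReplicaManager functionality",
    "MPI functionality",
]


def run_certification(checks):
    """Run (name, callable) pairs in order.

    Returns (certified, results), where results is a list of (name, ok).
    A check that raises an exception counts as failed. A site is
    certified only if at least one check ran and all checks passed.
    """
    results = []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        results.append((name, ok))
    certified = bool(results) and all(ok for _, ok in results)
    return certified, results


if __name__ == "__main__":
    # Stub checks that always pass, standing in for real job submissions.
    stubs = [(name, lambda: True) for name in CHECK_NAMES]
    certified, results = run_certification(stubs)
    for name, ok in results:
        print(("PASS" if ok else "FAIL"), name)
    print("certified:", certified)
```

Keeping the full result list, rather than stopping at the first failure, gives the local site manager one complete report to act on.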
Periodic tests (monitoring and on-demand): The CMT and system managers can post notices about their resources on the web by entering a “Downtime advice”. The calendar shows a snapshot of the Production Service status. Certification jobs are periodically submitted to the sites in order to proactively find problems before users do.
The Ticketing System: The INFN-GRID ticketing system is used for user and operations support by: users, to ask for information or report problems; system managers, to communicate about common grid tasks (e.g. upgrading to a new grid release); the CMT, to notify local system managers of a problem. The interface between OneOrZero and GGUS is ready. Demo later today.
Installation: migration to SL (1/2). It is very hard to deploy LCFGng at the various non-HEP sites: their resources are already installed and managed; the requirement is “give me just the middleware”; often they do not like the “I-will-do-EVERYTHING” approach. Porting LCFGng to Scientific Linux is not worth the effort. To study the transition away from LCFGng, a working group has been set up; the starting points are: Quattor will replace LCFGng; Quattor will be used at CNAF-Tier1; at the moment Quattor appears to be too complex for small/medium sites; new tools for manual installation of the middleware (YAIM) are available and seem easy to use.
Installation: migration to SL (2/2). For small/medium sites the new YAIM tools seem to be a good solution: easy to use; supported; they address the “give me just the middleware” requirement (and without 300 pages to read!). The ROC-IT Installation WG is split into 2 sub-groups: Quattor and YAIM. Each group will: learn how the tool works; report suggestions/bugs to the developers; contribute to the development (in a joint effort with the LCG/EGEE WG?); customize it for easy deployment in the Italian region (e.g. adding/enabling features like VOMS, DAG jobs, MPI jobs, ...); study a mixed approach (e.g. use Quattor, but only to configure the middleware).
Ongoing activities: GridICE improvements (job monitoring, application monitoring, SLA monitoring, alarm notification); DGAS integration in the INFN-GRID release; porting of the INFN-GRID release to Scientific Linux; CIC-on-duty shifts; training for local site managers and grid service administrators, coming soon; adding new sites (Spaci, Enea, etc.) to the production grid; pre-production service in place (and waiting for the new release).
Open issues: the VO deployment procedures have to be defined; the organization of application support at both regional and global levels has to be defined; tools for allocating resources to a VO are still missing; remote management tools and procedures have to be developed; a realistic gLite deployment plan is needed; the trainees need training on the new middleware components, services and tools.
Some useful links: INFN Production Grid; INFN GridICE; INFN test and certification; INFN Support; Contact