EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks ROCs Top 5 Middleware Issues Daniele Cesini,

Slides:



Advertisements
Similar presentations
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite Release Process Maria Alandes Pradillo.
Advertisements

INFSO-RI Enabling Grids for E-sciencE SRMv2.2 experience Sophie Lemaitre WLCG Workshop.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Simply monitor a grid site with Nagios J.
INFSO-RI Enabling Grids for E-sciencE SA1: Cookbook (DSA1.7) Ian Bird CERN 18 January 2006.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks C. Loomis (CNRS/LAL) M.-E. Bégin (SixSq.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks General relationships with EGEE JRA1 SA3.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks C. Loomis (CNRS/LAL) M.-E. Bégin (SixSq.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Middleware Deployment and Support in EGEE.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMSMonitor: a tool to monitor gLite WMS/LB.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The Bazaar Vision Ideas of RC/VO coordination,
INFSO-RI Enabling Grids for E-sciencE SA1 and gLite: Test, Certification and Pre-production Nick Thackray SA1, CERN.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Next steps with EGEE EGEE training community.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Operations Automation Team James Casey EGEE’08.
INFSO-RI Enabling Grids for E-sciencE Integration and Testing, SA3 Markus Schulz CERN IT JRA1 All-Hands Meeting 22 nd - 24 nd March.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Performance Improvements to BDII - Grid Information.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE-EGI Grid Operations Transition Maite.
Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Usage of virtualization in gLite certification Andreas Unterkircher.
INFSO-RI Enabling Grids for E-sciencE Enabling Grids for E-sciencE Pre-GDB Storage Classes summary of discussions Flavia Donno Pre-GDB.
1 Andrea Sciabà CERN Critical Services and Monitoring - CMS Andrea Sciabà WLCG Service Reliability Workshop 26 – 30 November, 2007.
Information System Status and Evolution Maria Alandes Pradillo, CERN CERN IT Department, Grid Technology Group GDB 13 th June 2012.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Antonio Retico CERN, Geneva 19 Jan 2009 PPS in EGEEIII: Some Points.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE Site Architecture Resource Center Deployment Considerations MIMOS EGEE Tutorial.
Rutherford Appleton Lab, UK VOBox Considerations from GridPP. GridPP DTeam Meeting. Wed Sep 13 th 2005.
LCG workshop on Operational Issues CERN November, EGEE CIC activities (SA1) Accounting: current status
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Using GStat 2.0 for Information Validation.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The future of the gLite release process Oliver.
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Real Life Examples Tickets – Real life examples Mário David LIP - Lisbon.
European Middleware Initiative (EMI) The Software Engineering Model Alberto Di Meglio (CERN) Interim Project Director.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks GLite testing status and future Gianni Pucciani.
INFSO-RI Enabling Grids for E-sciencE /10/20054th EGEE Conference - Pisa1 gLite Configuration and Deployment Models JRA1 Integration.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Communication tools between Grid Virtual.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks SA3 partner collaboration tasks & process.
INFSO-RI Enabling Grids for E-sciencE SRMv2.2 in DPM Sophie Lemaitre Jean-Philippe.
INFSO-RI Enabling Grids for E-sciencE Policy management and fair share in gLite Andrea Guarise HPDC 2006 Paris June 19th, 2006.
1Maria Dimou- cern-it-gd LCG November 2007 GDB October 2007 VOM(R)S Workshop report Grid Deployment Board.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The LCG interface Stefano BAGNASCO INFN Torino.
INFSO-RI Enabling Grids for E-sciencE gLite Certification and Deployment Process Markus Schulz, SA1, CERN EGEE 1 st EU Review 9-11/02/2005.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Patch Preparation SA3 All Hands Meeting.
Enabling Grids for E-sciencE INFSO-RI Enabling Grids for E-sciencE Gavin McCance GDB – 6 June 2007 FTS 2.0 deployment and testing.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks MSA3.4.1 “The process document” Oliver Keeble.
INFSO-RI Enabling Grids for E-sciencE gLite Test and Certification Effort Nick Thackray CERN.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Data Management cluster summary David Smith JRA1 All Hands meeting, Catania, 7 March.
INFSO-RI Enabling Grids for E-sciencE Enabling Grids for E-sciencE Storage Element Model and Proposal for Glue 1.3 Flavia Donno,
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Implementing product teams Oliver Keeble.
EGEE-III INFSO-RI Enabling Grids for E-sciencE JRA1 and SA3 All Hands Meeting December 2009, CERN, Geneva Product Teams –
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid Configuration Data or “What should be.
Enabling Grids for E-sciencE EGEE-III-INFSO-RI EGEE and gLite are registered trademarks Francesco Giacomini JRA1 Activity Leader.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE Operations: Evolution of the Role of.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Study on Authorization Christoph Witzig,
Status of gLite-3.0 deployment and uptake Ian Bird CERN IT LCG-LHCC Referees Meeting 29 th January 2007.
INFSO-RI Enabling Grids for E-sciencE File Transfer Software and Service SC3 Gavin McCance – JRA1 Data Management Cluster Service.
EGEE is a project funded by the European Union under contract IST Issues from current Experience SA1 Feedback to JRA1 A. Pacheco PIC Barcelona.
II EGEE conference Den Haag November, ROC-CIC status in Italy
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid is a Bazaar of Resource Providers and.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Job Management Claudio Grandi.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The Dashboard for Operations Cyril L’Orphelin.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CYFRONET site report Marcin Radecki CYFRONET.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Towards an Information System Product Team.
INFSO-RI Enabling Grids for E-sciencE Padova site report Massimo Sgaravatto On behalf of the JRA1 IT-CZ Padova group.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Argus gLite Authorization Service Workplan.
Enabling Grids for E-sciencE EGEE-II INFSO-RI ROC managers meeting at EGEE 2007 conference, Budapest, October 1, 2007 Admin Matters Vera Hanser.
CREAM Status and plans Massimo Sgaravatto – INFN Padova
SRM2 Migration Strategy
TCG Discussion on CE Strategy & SL4 Move
Francesco Giacomini – INFN JRA1 All-Hands Nikhef, February 2008
Ian Bird EGEE’06 – JRA1/SA1/SA3 session Geneva, 29th September 2006
Presentation transcript:

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks ROCs Top 5 Middleware Issues Daniele Cesini, Alessandra Forti WLCG Operation workshop CERN, 25th January 2007

Enabling Grids for E-sciencE EGEE-II INFSO-RI Introduction Following the presentation of the ROC UKI Top 5 m/w issues at June 2006 Operation Workshop, each ROC produced its own list of Top 5 issues. All the lists from the ROCs were collected and discussed at the ROC managers meetings Each ROC voted 3 issues from that list and this resulted in a final list published in the sa1 CERN wiki: This list has been presented and discussed at the TCG by the sites representatives (Alessandra, Jeff, Daniele) on Oct. 18th and Nov. 29th 2006 Only voted issues were discussed This talk is the result of the follow up of that discussion The issues are not presented in voting order

Enabling Grids for E-sciencE EGEE-II INFSO-RI Issues Issue number 1: “Fine grained list of RPMs per service component (and not only per node/service. Those lists should include all dependencies both internal and external, and should be available in a official and stable url). Does the information at this URL: satisfy this requirement?” SA3: the link reported on the issue will be maintained. For a finer grain list SA3 will investigate the possibilities to use the ETICS facilities. Issue number 3: “Reduce dependency of OS: e.g. better support for the UI/WN tarball (it should be certified and released at the same time as the WN/UI RPMs are certified and released.)” For the specific examples of the UI/WN tarballs, the problem is now fixed and in the future they will be deployed with the standard releases. About the porting to OS different from SLC, this is a big and complex issue and it is already faced by an SA3 activity that will take all the time needed. ETICS could help also in this. JRA1 is already doing some work in release3.1 in order to have them more portable.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Issues Issue number 4: “Documentation: release notes should contain all information on: required configuration changes; detailed deployment instructions (for example, must service be re-started after upgrade); bugs fixed by the update; new functionality introduced by the update; etc.” SA3 says that information are published but probably not properly read by system administrators and site managers. Information about patches and upgrades are published since the upgrade to savannah. The link to this info is not very visible on the glite web portal (glite.org) - it is only in the news section - and will be put in greater evidence. (DONE:  Packages). The possibility of making a pdf with the release notes will be investigated. Issue number 5: “New services should come with clear and complete error messages. This should be part of the SA3 checklist (if it isn't already); new components should not be accepted if they don't bring this.” JRA1: the activity of improving error messages is already ongoing inside JRA1, but it doesn’t appear yet in the JRA1 work plan, however the TCG asked JRA1 to add some information.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Issues Issue number 10: “Implement VO quotas on shared disk pools provided by DPM or dcache. Publish the quota and also available space, per VO, based on this quota.” The publication of VO available space on a SE will be there with the new glue schema. There is some work on VO quotas on SE, but it has not been a priority, because the priority was the implementation of the srm2.2. At the moment can be used some mechanism such as space reservation, but this not a real VO quota implementation. For what concerns VO quotas we have to wait at least the srm2.2 implementation that will not be there before spring Issue number 12: “Provide better and clearer log system, with a standard logging format. This should be part of the SA3 checklist (if it isn't already); new components should not be accepted if they don't bring this. Also, the logging system must be able to store the logs on a remote host (as the "syslog" system does).” The issue of logging standardization has been recognized as a valid and urgent issue. Action raised on JRA1 to implement the syslog functionalities for the internal developed components. For what concern external component JRA1 cannot have any control. Using syslog automatically enables the remote logging possibility. The syslog implementation for logging is not in the workplan as it a general activity for every component and isn't listed for each one, but there was a discussion at the jra1 all-hands in order to understand if syslog is enough for the jra1 requirements.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Issues Issue number 14: “Provide service management interface for all middleware services (start/stop/status, etc.). Clarifications: We don't mean "start/stop/status" of linux services. We are actually talking about administration tools which are more or less node-specific. For example, add/remove/supend a VO from a node, add/remove/close a CE queue, close a site (on site BDII), close a storage area, etc. We don't want to launch a node configuration. A good example is the FTS, you can dynamically add or remove a transfer channel by the way of a command. The FTS node is easier to administrate.” The issue has been received. It is decided that in a first phase it will be implemented a system to monitor all the services and components status inside a node. The use of GridICE sensors will be investigated for the monitoring purposes. As site representatives we clarified that the request is not only for a monitoring tool, but for an administrative tool (i.e. enabling/disabling VO). The development of such tools is postponed to a second phase. Monitoring issues should converge into the monitoring-dashboard group Issue number 17: “Failover in clients and user tools. E.g. enable user data management tools to use redundant BDIIs to look up information. So if the primary BDII specified by LCG_GFAL_INFOSYS does not respond a backup BDII can be used instead. This will eliminate the need to setup HA BDII service which most sites do not have.” This is an issue and should be analyzed for any service/client. JRA1 is working on this already (clients of wmproxy are an example). In the end the most complicated thing is to have a failover mechanism for the site and top level IS. Having that all the other failover mechanisms will be easier to implement.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Issues Issue number 18: “Pass parameters to LRMS: In order to improve the efficiency of LRMS, some information from user's job description should be passed to the CE through the RB as, for instance, the required amount of memory, the required size of scratch space, the required CPU max. time.” This is an already ongoing activity of JRA1 and the functionality is present in blah. Issue number 20: “Test under real conditions. E.g. make it easier to tailor the MW to specific fabrics needs (e.g. Quattor, batch systems, etc.). CLARIFICATION - It means:This is not restricted to regional certification, moreover it is raised as a top issue for the middleware specifically design, development and rollout (SA3 and JRA1). E.g. if a middleware distribution would be designed for one single use case, we had a problem……” TCG asked for a clarification the first time we presented it, the clarification was added. The TCG answered that this is already in place and it is the Pre-Production-Service. The project cannot put more efforts on the topic of testing under real conditions.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Issues Issue number 21: “need clear concepts for maintenance and garbage collection for each persistent/core MW node (like RB/WMS and storage services)” This was recognized a real issue by everybody. For some items JRA1 will provide guidelines about garbage collection, i.e. LB database. For other items it should be agreed somewhere the period we want to store data i.e. sandbox of not collected jobs, logs etc. It is still not clear were this agreement should be done (SA1?) About garbage collection the only guidelines that come from the TCG are those regarding security, so 90 days for logs related to security (see the documents regarding this topic). The rest are policies at the operational level to be discussed inside SA1 and the ROC managers. Erwin proposed that this topic will be discussed inside SA1 to have a proposal. This proposal should than be discussed at the TCG to see if application are happy with it.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Conclusion The list is now in the TCG work plan A slot for sites issues will be allocated every couple of months in the TCG agenda Although it is very difficult to obtain even broad deadlines from JRA1 and SA3 because: –priorities change continuously depending on the number of problems and bugs found –the move from SL3 to SL4 and from 32 to 64 bit is more complicated than expected It is important that the presentation and voting of m/w issues from ROCs is a continuous process.