INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Report on pre-production status Joint OSG and EGEE Operations Workshop Culham Conference Centre.


1 INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Report on pre-production status Joint OSG and EGEE Operations Workshop Culham Conference Centre (RAL), UK September 26th-30th 2005 Antonio Retico, Nicholas Thackray

2 OSG/EGEE Operation Meeting - RAL, UK, 27-Sep-2005. Enabling Grids for E-sciencE INFSO-RI-508833
Contents
PPS, a snapshot
  – Sites and Resources
  – Sites and Services
  – Real VOs (beyond Star Trek)
Feedback from PPS
  – Installation tools
  – Configuration tools
Burning Topics
Short-term Plans and Discussions
[OT] The MIX activity
PPS “InterestMeter”

3 PPS, a snapshot: Resources
Sites: ASGC, Taipei; CERN, Geneva; PIC, Barcelona; IFIC, València; CESGA, S. de Compostela; LIP, Lisbon; CYFRONET, Krakow; NIKHEF, Amsterdam; CNAF, Bologna; UPATRAS, Patras; UOM, Thessaloniki; UOA, Athens; FZK, Karlsruhe; IN2P3, Lyon
#CPUs (map labels; the assignment of values to sites did not survive extraction): na 2 3 4 2 2 3 50 na
Resources: ~200 CPU (October)
CERN
  – PPS has access to the Production Cluster via an LSF queue
  – Currently self-limited to 50 running jobs
  – Extensible to 1400 CPU
CNAF
  – Access to the Production Cluster is in preparation, planned for the beginning of October
  – Then ~150 slots will be available
Among the major computing centres, CERN and CNAF are the only ones currently granting access to production facilities
#Jobs submitted (chart labels): 50194 423289 30863 546853
Jobs: ~1M (submitted to WMS)
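As a quick sanity check on the "~1M" figure, the four job counts shown on the slide can be added up. This is only a sketch: it assumes the four numbers are separate submission counters that together make up the total, and the slide does not say which counter belongs to which site or WMS instance.

```python
# Job-submission counters as they appear on the slide. What each one
# refers to (site or WMS instance) is not recoverable, so they stay unlabeled.
job_counts = [50194, 423289, 30863, 546853]

# Assumption: the counters are disjoint, so the total is a plain sum.
total_jobs = sum(job_counts)

print(f"total submitted jobs: {total_jobs:,}")
print(f"in millions: {total_jobs / 1_000_000:.2f}")
```

The sum is 1,051,199, which is consistent with the rounded "~1M" quoted for jobs submitted to the WMS.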

4 PPS, a snapshot: Core Services
Sites: ASGC, Taipei; CERN, Geneva; PIC, Barcelona; IFIC, València; CESGA, S. de Compostela; LIP, Lisbon; CYFRONET, Krakow; NIKHEF, Amsterdam; CNAF, Bologna; UPATRAS, Patras; UOM, Thessaloniki; UOA, Athens; FZK, Karlsruhe; IN2P3, Lyon
Services shown on the map, by category:
  – Work Flow Management: WMS + LB
  – VO Management: VOMS
  – Information System: R-GMA, BDII
  – Catalogues: Fireman(My)
  – Authentication: MyProxy
  – Data Management: IO(DPM), IO(castor), FTS
NIKHEF
  – Currently phasing out of the PPS activity
  – The only available VOMS service for a long time
  – They introduced the “Star Trek VO” concept
  – So … Thanks
ASGC
  – Joining the PPS
  – They are starting to run a WMS + LB service
  – Welcome!

5 Real VOs (beyond Star Trek ;-)
First analysis jobs accessing data from Castor run at CERN by CMS
LHCb and ATLAS are also already experimenting with the PPS
New VO DILIGENT started
  – The aim of the Diligent project is to build an infrastructure to merge Grid and Digital Library technologies (www.diligentproject.org)
So …
System interruptions can’t be that frequent anymore
  – A SW upgrade procedure or, more likely, a careful roll-out is needed
Avoid “virtuoso” hacks on (pre-)production systems
  – Indispensable SW upgrades should be requested (and certified) as “quick fixes” and not simply taken from the “nightly build”
Back-up needed for key services
  – At least two VOMSes already there
  – R-GMA registry still a single point of failure (the replication feature is not there yet)

6 Feedback on installation tools
APT Repository
  – At present a single repository serves Testing, Certification and Pre-production.
  – Quick fixes go straight into PPS without a certification stage.
  – Changes in some RPM names (notably the FTS clients) with no “obsolete” option make the upgrade very tricky and unsuitable for a real production environment (feedback from the SC group, #10931). Currently the advice is “install from scratch every time”. A change in the build system is under study for this.
  – Versions of some RPMs in the APT repository are not compatible with existing infrastructure services (e.g. MySQL version not compatible with SFT)
  – Mirrors of the repository have been set up at CERN and at CNAF to address some of these issues, for Certification and PPS respectively

7 Feedback on configuration tools
XML configuration files
  – Sometimes perceived as “difficult”, but at least a homogeneous configuration language: site admins seem to have assimilated it
  – Self documenting: as a standard, a significant part of the file is dedicated to description and examples.
  – Self documenting: the installation documentation tends to be just a copy of what is written in the template. Sometimes a better description of the meaning of some parameters would be appreciated (e.g. CESE bindings).
  – Schema changes: still very frequent in the last releases. Sometimes it is more effective to start from scratch than to re-use the old config files.
  – Two-day analysis needed at some (experienced) sites to study new parameters (results from a survey on the 1.3 configuration).
  – Very poor error messages.
VO Management
  – To activate a new VO (e.g. Diligent), several tedious changes are needed throughout the configuration files. A feature to support this use case is coming (though not before v1.5)
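One way a site might shorten the analysis of new parameters after a schema change is to diff the parameter names of its old configuration file against the new template. The sketch below is illustrative only: the XML layout (`<param name="..." value="..."/>` elements), the parameter names and the host names are all hypothetical and are not the actual gLite schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: every setting is a <param name="..." value="..."/> element.
# The parameter names below are invented for illustration.
OLD_CONFIG = """<config>
  <param name="wms.host" value="wms.example.org"/>
  <param name="voms.host" value="voms.example.org"/>
</config>"""

NEW_TEMPLATE = """<config>
  <param name="wms.host" value="changeme"/>
  <param name="voms.servers" value="changeme"/>
  <param name="rgma.registry" value="changeme"/>
</config>"""


def param_names(xml_text):
    """Return the set of parameter names found anywhere in the document."""
    root = ET.fromstring(xml_text)
    return {elem.get("name") for elem in root.iter("param") if elem.get("name")}


def schema_diff(old_xml, new_xml):
    """Return (added, removed) parameter names, each as a sorted list."""
    old, new = param_names(old_xml), param_names(new_xml)
    return sorted(new - old), sorted(old - new)


added, removed = schema_diff(OLD_CONFIG, NEW_TEMPLATE)
print("new parameters to study:", added)
print("parameters no longer in the template:", removed)
```

A diff like this only lists which parameters appeared or disappeared. It does not explain their meaning, so it complements the template descriptions rather than replacing them.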

8 Burning Topics
FPS
  – Still could not get it working in Certification
  – Not deployable yet
  – Status: Progress
FTS
  – It could not be certified using the standard configuration tool
  – One instance has been set up at IN2P3 using the LCG SC wiki
  – Definitely important to have it running, BUT also to have the configuration procedure documented.
  – We should try to avoid overlapping with the SC team.
  – The correct process is to submit documentation bugs and have them implemented as quick fixes.
  – Status: Progress
SFT (not just a permutation) monitor running on the PPS
  – It works for job submission
  – Service Discovery still missing (WMS to be configured in all CEs)
  – Still impossible to publish results (#10913)
  – Status: Progress

9 PPS next Plans and Discussions
Complete the upgrade to 1.3.
  – This has to be done in a coordinated way to avoid service disruption
Test the installations of the FTS at IN2P3.
Run the R-GMA tests on the PPS.
Make sure that values published in R-GMA are consistent between sites.
  – Very important for Service Discovery
  – For example, CE gateways are being given the type “CE Gatekeeper” by some sites and “org.glite.ce.gatekeeper” by other sites
  – Preferably keep this point from becoming a long, long thread
  – A proposal should be written, (briefly) discussed, and possibly agreed
Keep the PPS wiki up to date. Sites should edit their own pages.
Decide whether to maintain the access restriction on the PPS wiki
  – Eventually identify sensitive information (node lists, configuration files) and move it elsewhere (e.g. to the GOC DB)
Define a tentative orientation process to support “small” VOs wishing to work in PPS
  – E.g. a map of “VO” vs. “Supporting Sites” on the wiki page
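The consistency problem with published service types can be checked mechanically once the values are collected. The following is a minimal sketch under stated assumptions: the records are hard-coded rather than read from R-GMA, the pairing of sites with type strings is invented for illustration, and `org.glite.wms` is a hypothetical type name. Only the two CE type strings come from the slide.

```python
from collections import defaultdict

# Hypothetical records: (site, service kind, type string the site publishes).
# Site names are taken from the PPS site list; which site publishes which
# string is invented, and "org.glite.wms" is a made-up type name.
published = [
    ("CERN", "ce", "org.glite.ce.gatekeeper"),
    ("CNAF", "ce", "CE Gatekeeper"),
    ("PIC", "ce", "org.glite.ce.gatekeeper"),
    ("CERN", "wms", "org.glite.wms"),
    ("CNAF", "wms", "org.glite.wms"),
]


def inconsistent_types(records):
    """Map each service kind to {type string: [sites]} when sites disagree."""
    by_kind = defaultdict(lambda: defaultdict(list))
    for site, kind, type_name in records:
        by_kind[kind][type_name].append(site)
    # Keep only the service kinds published under more than one type string.
    return {
        kind: {t: sorted(sites) for t, sites in types.items()}
        for kind, types in by_kind.items()
        if len(types) > 1
    }


for kind, types in inconsistent_types(published).items():
    print(f"service '{kind}' is published under {len(types)} different types:")
    for type_name, sites in sorted(types.items()):
        print(f"  {type_name!r}: {', '.join(sites)}")
```

A report of this kind could feed the proposal mentioned above, since it shows which naming variants are in use and by whom before a single convention is agreed.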

10 [OT] The MIX (gLite-LCG)
The MIX testbed is another outstanding activity of SA1
  – To test how specific gLite services can be integrated in the current LCG infrastructure
  – A dedicated testbed is run at CERN for this
  – Tools and methods can be different from those used in PPS, but the results can be of interest for people in this meeting
Some scenarios we actually tested
  – gLite WMS can work with LCG-BDII
  – gLite WMS can work with an LCG farm
  – gLite WMS can’t work with LFC (problems with the DLI interface)
  – gLite CE can share the farm with LCG (thanks to the log parser daemon)
  – Publishing of VO software tags not supported yet

11 PPS “InterestMeter”
[Chart, 22-Sep-2005: number of e-mails, traffic on the RO list vs. traffic on the PPS list]
People seem to switch between LCG and PPS

12 Questions?

