Recent developments: is PPS now used?
Angela Poschlad (PPS-FZK), Antonio Retico (CERN_PPS)
EGEE 08 Istanbul - 24 Sep 2008
Enabling Grids for E-sciencE EGEE-III INFSO-RI
EGEE and gLite are registered trademarks

Contents
Last year’s developments in PPS (quick overview)
Bulletin of recent PPS usage
The glexec testing in the “classic” PPS
The CREAM Pilot (Angela)
EGEE08 - Istanbul - 22/28 Sept

Input for PPS from EGEE’07
Promote Usage
Optimise Work
Reduce Costs
( … and make it faster, please!!)
–Improve the elapsed time-to-production of the middleware updates (without losing quality)

How PPS changed with EGEE III
Changes to the Mandate
New Structure
New Services
New Procedures
+ a lot of the “old things”

Mandate in EGEEIII
The EGEE Pre-Production provides early, on-demand access to grid services to interested users, so that they can test, evaluate and give feedback on changes and new features of the middleware.
In addition, the pre-production extends the middleware certification activity, helping to evaluate deployment procedures, [inter]operability and basic functionality of the software against operational scenarios reflecting real production conditions.

A new structure
Three synergetic service areas within PPS:
Middleware Pilot Services (MPS)
–Provisioning and operation of grid pilots
Middleware Quality Services (MQS)
–Deployment Testing
–Release Testing
–Monitoring Utilities
Support Area (SUP)
–SW Release Coordination
–Activity Management

New Services
Pilot Services in Production (within MPS)
Production Release Testing (within MQS)

New Procedures
Enhanced Release Procedure to cover pilot services
Activity Management Tools (next talk) (within PPS-Support area)

And among the old things...
A number of PPS sites are still running as public infrastructure
Not easy to decommission without providing the users with alternatives for some features
–E.g. a space to test new versions of clients on WNs

Focus of this talk
Many things to talk about, but we have to focus
So, we focus this talk on our users
What changed in “PPS Users’ World” since last year?

News from PPS Users
The Diligent project is over and the VO stopped its PPS activity; good luck to its successor, D4Science
Three pilot services formally started with LHC Experiments between April and June 2008
–AMGA, with LHCb (in progress)
–WMS SL4, with CMS and ATLAS (Apr08 to May08)
–CREAM, with ALICE and CMS (in progress)
glexec mechanism tested on the “classic” public PPS infrastructure
–Test in progress with LHCb and ATLAS
–One of the reasons why we cannot just “shut it down”

glexec
The glexec testing at CERN_PPS and PPS-CESGA

glexec
A program used on the WN to implement traceability and intra-VO accounting for pilot jobs (or glide-ins)
Full functionality not delivered yet, but VOs can test the basic mechanism
More info about “pilot jobs” and the glexec mechanism
–
–

LHCb and glexec
LHCb is ready to run low-level testing on WNs at CERN_PPS
–Set-up done in July
–Tests not started yet, but about to
The VO is more interested in testing the full chain as soon as it is available

ATLAS and glexec
Two PPS CEs (CERN_PPS and PPS-CESGA) run glexec in setuid mode (enabling the credentials of the real submitter)
The configurations are special (e.g. the WN needs a host certificate), but that does not matter for testing
ATLAS tested glexec within their PanDA pilot job framework and uncovered various bugs (7 so far, 3 of them major for deployment)
ATLAS tests succeeded after LCMAPS was upgraded and its configuration was adapted
Working in PPS, the tester benefited from rapidly adapted conditions and special access arrangements for debugging

CREAM
The CREAM CE Pilot (by Angela Poschlad)
–PPS-FZK
–PPS-CNAF
–PPS-SCAI

The CREAM CE Pilot in a nutshell
Started in Jun 2008
General objectives:
–Collect feedback from the experiments
–Enable monitoring tools and procedures for production
Design and duration:
–2 phases initially planned: PPS and production
–Phase 1 extended until now (with ALICE using production queues)
–Phase 2 to start with the release of ICE WMS
More info:
–

A formal incorrectness
The model we envisage for deployment of pilots
[Diagram “Pilot”: stages dev + int, set-up pilot, user testing, inst test, other tests, dep test, production; activities JRA1, SA3, SA1; phases In certification, In PPS, In production]
We adopted the model of the Experimental Services instead
[Diagram “EXP”: stages dev + int, set-up preview, user testing, inst test, other tests, dep test, production; activities JRA1, SA3, SA1; phases In certification, In PPS, In production]
–Pilot started with software not yet released to certification
–Judgement of operability issued by PPS instead of certifiers
–Version in certification older than the one in pilot
–Not formally correct, but the conditions to start were tempting
–Glitch in transition between old and new release process

Pilot installations
[Diagram: CREAM CE, CREAM-CLI, WMS]
CREAM CE installed at CNAF and FZK
ICE-enabled WMS installed at SCAI and FZK
For direct job submission, the CREAM CLI was installed at CNAF and FZK
During the installation process the communication with the developers was excellent

Setup at GridKa
CREAM CE: integrated in PPS system (30 cores)
WMS: ICE-enabled WMS ready for testing by ALICE
VOBOX: VOBox with ALICE production setup
ALICE: Testing with the production setup!
Problem with proxy delegation -> Direct job submission

ALICE's tests
[Diagram: CREAM CE, VOBOX, CREAM-CLI, GridFTP]
Finally the job submission succeeded ;-)
Operated through a VOBOX in parallel to the already existing service at GridKa
Access to the CREAM CE
–Initially 30 CPUs (PPS) available for the testing
–Temporarily raised to 300 cores for more load
–Moved later to the production ALICE queue
2000 concurrent jobs managed by CREAM over several days

[Plots: ALICE production jobs via CREAM CE (ca. 2000); ALICE jobs via lcg-CE]
The two CEs used have the same hardware

Feedback on the installation
ALICE is interested in having the direct submission feature of the CREAM CE in production as soon as possible
The geclipse VO asks for access to test the job submission
Some sites in the ROC DECH region ask for the release date because they want to set up the CREAM CE very soon
Some advertisement made
–Installation in the “gLite administration workshop” at GridKa School 2008

For Further Reading
WLCG/EGEE Pre Production: Use Cases (EGEEIII)
WLCG/EGEE Pre Production: Service Description (EGEEIII)
The Release Procedures: From Certification to Production
All available on the PPS web site

Questions?