EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks glexec/SCAS pilot service Status and short-term.

Presentation transcript:

EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks glexec/SCAS pilot service Status and short-term plans Antonio Retico, Gianni Pucciani GDB 11-Mar-09 - CERN

Agenda
Description of the glexec/SCAS pilot
– Subject
– Objectives
– Partners
– Success conditions
– Access info
Recent history
– Software versions
– Deployment
– Integration works
– Stress testing
Next Steps
Good afternoon!

Agenda
Description of the glexec/SCAS pilot
– Subject
– Objectives
– Partners
– Success conditions
– Access info

Subject
Set up worker nodes enabling glexec/SCAS
– Production nodes
– lcg-CE
– PROD BDII
Versions
– Starting from the version in gLite PPS (PPS Update 43)
– Newer versions deployed if required (both on the pilot and in certification)

Objectives
Integration with job management frameworks PanDA (ATLAS), Dirac3 (LHCb), AliEn (Alice)
– Basic use case (identity switch) in real production conditions
– New error codes (“internal failure” vs. “user non-authorised”)
Operability and scalability
– Set of SA3 scalability tests of SCAS
– Resilience of glexec to short-duration failures in the backend
– Configuration of SCAS in load-balancing

Partners
Coordination: Antonio Retico, Gianni Pucciani (CERN)
JRA1: Oscar Koeroo (NIKHEF)
– Development, support
SA3: Gianni Pucciani (CERN)
– Stress testing
SA1: Angela Poschlad (FZK), Pierre Girard (IN2P3), Ronald Starink (NIKHEF)
– Site installations
Atlas: Jose Caballero, Maxim Potekhin (BNL)
– PanDA Integration
LHCb: Andrei Tsaregorodtsev, Stuart Paterson (CERN)
– Dirac3 Integration
Alice: Latchezar Betev

Success conditions
No major issues present in glexec and SCAS
Stable activity (multi-user pilot jobs) for ~2 weeks
Achieved integration with experiments’ frameworks
Positive feedback from site managers about operability

Access info
Home Page
Meetings
– 05-Feb-09: Kick off. Minutes at PPIslandKickOff2009x02x05
– 19-Feb-09: Follow-up. Minutes at PPIslandFollowUp2009x02x19
– 26-Feb-09: Follow-up. Minutes at PPIslandFollowUp2009x02x26
Contacts

Agenda
Recent history
– Software versions
– Deployment
– Integration works
– Stress testing

Software versions
– Glexec: patch #2770 -> #2829 (27th Feb).
– Consistent error codes plus improvements to SCAS client.
– SCAS service: patch #2767

New releases: SCAS client (#2829)
lcmaps-plugins-scas-client
– Improved:
  • Back-off tactics improved on TCP/IP and SSL layer
  • Treats SOAP-level failures properly
– New feature:
  • Fail-over and/or load-balancing options (different strategies):
    o Round-robin: follow the configured list of endpoints (top to bottom)
    o Round-robin random start (load balancing, default): like the previous, but the first endpoint to try is randomly selected, adequately balancing between the endpoints
    o Random (least effective, but load balancing): pseudo-random endpoint selection (not smart enough to not try a failed endpoint yet)
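The three endpoint-selection strategies named on this slide can be illustrated with a small Python sketch. This is not the lcmaps-plugins-scas-client code: the function name, the strategy labels, and the choice of one attempt per configured endpoint in the random case are assumptions made only for illustration.

```python
import random
from typing import List, Optional


def endpoint_order(endpoints: List[str], strategy: str,
                   rng: Optional[random.Random] = None) -> List[str]:
    """Return the order in which endpoints would be tried for one request.

    Hypothetical helper: strategy names are illustrative labels,
    not configuration keywords of the real SCAS client.
    """
    rng = rng or random.Random()
    eps = list(endpoints)
    if not eps:
        return []
    if strategy == "round-robin":
        # Follow the configured list top to bottom.
        return eps
    if strategy == "round-robin-random-start":
        # Same cyclic order, but start at a randomly chosen position.
        start = rng.randrange(len(eps))
        return eps[start:] + eps[:start]
    if strategy == "random":
        # Independent pseudo-random picks: an endpoint that already
        # failed may be picked again (assumed: one pick per endpoint).
        return [rng.choice(eps) for _ in eps]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    # Placeholder host names, not real SCAS endpoints.
    configured = ["scas1.example.org", "scas2.example.org", "scas3.example.org"]
    for name in ("round-robin", "round-robin-random-start", "random"):
        print(name, endpoint_order(configured, name))
```

With a random start, every endpoint is still tried at most once per request, which is why it balances load without repeating a failed endpoint; the purely random variant gives no such guarantee.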

New releases: SCAS client (#2829)
gLExec
– Improved stability
– Fixing open issues
  • File locking technique has more options (for CREAM CE)
  • Removed rudimentary safety checks
– New feature:
  • Consolidating exit codes of gLExec:
    o 201: User error. Input proxy wasn’t set up correctly, command not found, other items
    o 202: System error. glexec.conf has wrong permissions, or init problem in LCAS or LCMAPS
    o 203: AuthZ failed. Calling user wasn’t a pilot/prod user, proxy verification failed, SCAS didn’t return an account (blacklisted user or no account available)
    o 204: Child process executed and returned overlapping exit code. Child process returned with exit codes 201, 202, 203 or …
    o …: Shell returned that the executable can't be executed
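A pilot framework could use the consolidated exit codes to tell its own mistakes from site or authorization problems. The Python sketch below covers only the four codes whose numbers are legible on the slide (201 to 204). The function name, the category labels, and the treatment of 0 as success and of every other value as the payload's own exit code are assumptions.

```python
# Codes and meanings as listed on the slide; the labels are paraphrased.
GLEXEC_EXIT_CODES = {
    201: "user error",
    202: "system error",
    203: "authorization failed",
    204: "child exit code overlapped with gLExec codes",
}


def classify_exit(code: int) -> str:
    """Map an exit status observed after a gLExec call to a category.

    Assumption: 0 means success and any value not in the table
    comes from the payload itself.
    """
    if code == 0:
        return "success"
    return GLEXEC_EXIT_CODES.get(code, "payload exit code")


if __name__ == "__main__":
    for status in (0, 1, 201, 202, 203, 204):
        print(status, "->", classify_exit(status))
```

Keeping the table in one place makes it easy to extend once the remaining code on the slide is confirmed.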

Deployment
FZK (ready since 19th Feb)
– Dedicated CE test-mw-2-fzk.gridka.de accounted for FZK-LCG2
– CE-Status is set to SCASPilot
– Currently published in pre-production. SITE_NAME=FZK-PPS
– Can be used by:
  • lhcb:/lhcb/Role=pilot (queue lhcbXXL)
  • atlas:/atlas/Role=production + atlas:/atlas/usatlas/Role=pilot (queue atlasXXL)
  • cms:/cms/Role=production (queue cmsXL)
  • alice:/alice/Role=pilot (queue aliceXL)
Nikhef EL-Prod (ready since 27th Feb)
– All the WNs are gLExec-enabled and accessible by all prod CEs
– Multiple SCAS endpoints for fault tolerance. All production CEs are configured to use WNs with gLExec.
– Can be used by:
  • lhcb:/lhcb/Role=pilot
  • atlas:/atlas/Role=production
Both sites run (#2829)
IN2P3 ready to step in after the 15th Mar

Integration works: Atlas
Testing in progress at FZK since 26th Feb
Installation of gLExec/SCAS at gridka works fine
– If a myproxy server is used to pass the credentials, myproxy-logon has to be installed on the WN
– Old problems with proxies found in previous versions have been fixed.
More info in the summary of PanDA initial testing
Issues to be addressed by the experiment framework
– When gLExec is invoked, the environment (belonging to the pilot) vanishes
– When gLExec is invoked, the current directory is changed to the new user's HOME directory
– The new user has no permissions to execute programs in the pilot directories, nor to write there (i.e. the output and log files which the pilot will then be looking for...)
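One way a pilot framework might work around the first two issues (lost environment, changed working directory) is to save its environment to a file and hand gLExec a small wrapper script that restores it. The Python sketch below only illustrates that idea: it is not what PanDA does, the helper names are hypothetical, and it does not call gLExec or deal with the file-permission issue, which depends on how the site and the pilot set up shared directories.

```python
import os
import shlex
import tempfile
from typing import Dict, List


def write_env_file(env: Dict[str, str], path: str) -> None:
    """Write the environment as 'export NAME=value' lines a POSIX shell can source."""
    with open(path, "w") as fh:
        for name, value in sorted(env.items()):
            # Skip names that are not valid shell identifiers.
            if name.isidentifier():
                fh.write(f"export {name}={shlex.quote(value)}\n")
    # Readable by other accounts so the switched identity can source it.
    os.chmod(path, 0o644)


def build_wrapper(env_file: str, workdir: str, payload_cmd: List[str]) -> str:
    """Return the text of a wrapper that restores the environment and
    working directory, then replaces itself with the payload."""
    return "\n".join([
        "#!/bin/sh",
        f". {shlex.quote(env_file)}",
        f"cd {shlex.quote(workdir)} || exit 1",
        "exec " + " ".join(shlex.quote(arg) for arg in payload_cmd),
        "",
    ])


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    env_file = os.path.join(workdir, "pilot_env.sh")
    write_env_file(dict(os.environ), env_file)
    print(build_wrapper(env_file, workdir, ["python", "payload.py"]))
```

The wrapper text would still have to be written to a location the target account can read and execute.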

Integration works: LHCb
Testing started at NIKHEF on the 9th Mar
– No progress reported so far. Error messages from LCMAPS
– Waiting to be white-listed at FZK to cross-check

Integration works: Alice
Taking part in pilot check-points
Currently more focused on CREAM
Integration tests tentatively planned for the beginning of summer
– Pilot layout may evolve in the meantime

Stress testing
– Details at: SA3-SCAStestsresults
– Latest results summary: the memory leak in the server is still present, but the new SCAS client rpm nearly removes the errors due to the internal SCAS server refresh. From (6M requests, 14K errors) to (3M requests, 1 error)
Glexec response time on a WN:
Zone 1 [0,2): 98.16%
Zone 2 [2,10): 1.80%
Zone 3 [10, +inf): 0.04%
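The figures on this slide can be turned into error rates, and the zone boundaries into a simple binning rule, as in the Python sketch below. The request and error counts are the rounded values from the slide (6M, 14K, 3M, 1), so the rates are approximate. The slide does not state the unit of the zone boundaries, and the function names are illustrative.

```python
import math
from typing import Dict, List

# Rounded counts from the slide: (errors, requests).
OLD_CLIENT = (14_000, 6_000_000)
NEW_CLIENT = (1, 3_000_000)


def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that ended in an error."""
    return errors / requests


before = error_rate(*OLD_CLIENT)   # about 2.3e-3, i.e. roughly 0.23 %
after = error_rate(*NEW_CLIENT)    # about 3.3e-7

# Half-open response-time zones as given on the slide (unit not stated there).
ZONES = [(0.0, 2.0), (2.0, 10.0), (10.0, math.inf)]


def zone(response_time: float) -> int:
    """Return the 1-based zone index for a response time."""
    for index, (low, high) in enumerate(ZONES, start=1):
        if low <= response_time < high:
            return index
    raise ValueError("response time must be non-negative")


def zone_shares(times: List[float]) -> Dict[int, float]:
    """Percentage of measurements falling in each zone."""
    counts = {i: 0 for i in range(1, len(ZONES) + 1)}
    for t in times:
        counts[zone(t)] += 1
    return {i: 100.0 * n / len(times) for i, n in counts.items()}


if __name__ == "__main__":
    print(f"old client error rate: {before:.4%}")
    print(f"new client error rate: {after:.6%}")
    print(f"improvement factor:    {before / after:.0f}")
```

On these rounded counts the new client reduces the error rate by a factor of about 7000.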

Next steps
Finish Panda and Dirac integration test
– Start working on return codes
CE at FZK published in production
Deployment of SCAS/glexec at IN2P3
– Site available starting from 15th of March
– Focus on load balancing
Next check-point: 19th of March

Questions?