CREAM Status and Plans Massimo Sgaravatto – INFN Padova

CREAM Status and Plans Massimo Sgaravatto – INFN Padova On behalf of the gLite job management Product Team

Status
- CREAM: service for job management operations at the Computing Element (CE) level
  - Allows jobs to be submitted, cancelled, monitored, ...
- The number of CREAM CE installations keeps increasing
- Used by several VOs, using different clients
  - Direct submission (in particular via the CREAM CLI)
  - Through gLite WMS
  - Through CondorG
- No more showstoppers reported by VOs
  -> Deployment of LCG-CE is no longer mandatory in WLCG

CREAM-CE vs. LCG-CE
- For the past few days, the number of EGI-related CREAM installations has exceeded the number of LCG-CE installations
- Snapshot of May 6, 2011, 9:09 UTC:

            CREAM-CE   LCG-CE
  Total     289        280
  OK        253        247

ALICE
- Submissions of pilots to CREAM using the CREAM CLI
- L. Betev (ALICE Computing Operations Manager): “The service is very solid, and as before, we are happy with its performance and especially the support. All sites are using exclusively CREAM. I think there is still a bit of mixture of old/new versions. Given that both perform well, there is no issue with this.”

ATLAS
- Submissions of pilots to CREAM mainly using CondorG as client
- They recently (~2 weeks ago) reported some issues
  - Condor gridmanager crashes; job status does not seem to update quickly
  - Fixed with a newer version of Condor
- General problem
  - 2 software components under the responsibility of different teams (CREAM product team, Condor team)
  - Difficult for end users to understand whether a particular issue is due to Condor or to CREAM

ATLAS (cont.)
- To address the issue:
  - We (the CREAM team) will analyze how CREAM is used by Condor and, where needed, suggest improvements in the use of CREAM by Condor and/or address problems that are on the CREAM side
    - Some possible sources of problems in how CREAM is/was used by Condor have already been identified and reported to the Condor developers
    - E.g. the approach used to detect job status changes
  - Long-term infrastructure (provided by CERN IT) to test Condor submission to CREAM at some scale
    - A CREAM CE pointing to a batch system with O(1K) job slots
    - An ATLAS pilot factory running the version of Condor suggested by the Condor team
    - Stress tests run any time a new release of CREAM is certified, as well as any time the developer of the pilot factory believes it is time to upgrade the version of Condor

CMS
- Submissions to CREAM
  - Through gLite WMS
  - Through glidein WMS (Condor)
- Reporting problems for CREAM CEs using SGE as batch system
  - CESGA responsibility
  - Manpower issues being sorted out
  - CREAM CE 1.6.6 will address these problems
    - Just certified by CESGA
- Not aware of other major issues

LHCb
- Submitting pilots to CREAM through gLite WMS
  - But now going to (also) move to direct submission
- Operations
  - Not aware of major issues
  - Reported some instabilities some months ago
    - Mainly site-level problems
  - Receiving and following all their GGUS tickets created for CREAM-related issues
  - Things seem better now

Releases
- In production: CREAM CE 1.6.5 for gLite 3.2/sl5_x86_64
- CREAM CE 1.6.6 for gLite 3.2/sl5_x86_64 just certified by CESGA
  - Bug fixes for interactions with (S)GE as batch system
- Updates for gLite 3.1/sl4_ia32 finished at the end of March
- The next release is the one just released with EMI-1
  - No more updates in gLite apart from fixes for major problems

What’s new in EMI-1
- CREAM-CE can optionally be configured to use ARGUS as authorization system
- Support for gLite-cluster
  - Node type that can publish information about clusters and subclusters in a site, referenced by any number of compute elements
- Support for description and allocation of resources in a multi-core environment
  - How cores should be distributed over the cluster, whether whole nodes should be used, how many nodes should be used
- Experimental publishing of computing service information in Glue 2.0 format
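
As an illustration of the multi-core resource description mentioned above, here is a minimal sketch that renders a JDL job description as text. It is not taken from the slides: the attribute names `CPUNumber`, `SMPGranularity`, `HostNumber` and `WholeNodes` are assumed from the CREAM JDL documentation, and the helper `build_multicore_jdl` and the example executable are hypothetical. Check the JDL guide of the installed CREAM release before relying on any of these names or on which combinations are allowed.

```python
def build_multicore_jdl(executable, cpu_number, smp_granularity=None,
                        host_number=None, whole_nodes=False):
    """Render a minimal JDL description for a multi-core job (illustrative only).

    Attribute names are assumptions based on the CREAM JDL documentation;
    this helper does not validate them against a real CREAM CE.
    """
    # Total number of cores requested for the job.
    attributes = [
        ("Executable", '"{}"'.format(executable)),
        ("CPUNumber", str(cpu_number)),
    ]
    # How the cores should be distributed: minimum cores per node.
    if smp_granularity is not None:
        attributes.append(("SMPGranularity", str(smp_granularity)))
    # How many nodes should be used.
    if host_number is not None:
        attributes.append(("HostNumber", str(host_number)))
    # Whether whole nodes should be reserved for the job.
    if whole_nodes:
        attributes.append(("WholeNodes", "true"))

    body = "\n".join("  {} = {};".format(name, value) for name, value in attributes)
    return "[\n" + body + "\n]"


if __name__ == "__main__":
    # Hypothetical request: 8 cores, at least 4 per node, whole nodes only.
    print(build_multicore_jdl("/bin/hostname", 8, smp_granularity=4, whole_nodes=True))
```

The resulting text could be saved to a file and passed to a submission client; the sketch only builds the description and does not contact any service.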

What’s new in EMI-1 (cont.)
- Support of the OutputData JDL attribute in the CREAM CE
  - Allows automatic upload and registration to the Replica Catalog of datasets produced by the job
- Bug fixes
- Other general (non-CREAM-specific) EMI-1 highlights
  - Support for sl5_x86_64
  - Supported package formats: tar.gz, src.tar.gz, rpm, src.rpm
  - One single EMI repository for all components
    - No longer one repository per node type
    - Same version of a given component everywhere
  - Components follow common packaging guidelines and policies defined by Fedora, EPEL and Debian
  - Adherence to FHS (Filesystem Hierarchy Standard)

CREAM in Open Science Grid (OSG)
- A. Roy: “We’re really interested in CREAM. It shows a lot of promise, and we have users who want it “yesterday”. So we would really like to get it working.”
- They don’t use the gLite/EMI installation system, and they have problems integrating the CREAM installation into their pacman-based installation system
  - We are now helping them finalize the installation procedure
- Next steps
  - CREAM – GUMS integration (for authorization)
  - Improve knowledge of CREAM within the OSG/VDT team
    - Probably someone from their team will visit us

Some next steps
- Implementation of the EMI Execution Service interface
  - Specification already defined
  - To be implemented in CREAM-CE, ARC-CE and UNICORE CE
  - The CREAM legacy interface will not be dropped
- Support for high availability/scalability
  - Pool of CREAM CE machines seen as a single CREAM CE
  - CERN in particular is pushing for this
- Address the batch system support issue
  - Now provided by several different partners, but in most cases only partially and on a best-effort basis
  - Support to be provided by IGI?

Other info
- http://grid.pd.infn.it/cream is being moved to: http://wiki.italiangrid.org/CREAM

Thank you!