EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE IT cluster activity status (Status of WMS & CE) Francesco Prelz – IT.


EGEE-II INFSO-RI, Enabling Grids for E-sciencE
IT cluster activity status (Status of WMS & CE)
Francesco Prelz – IT Cluster Manager, INFN Milano
EGEE’06 Conference, Geneva, 28 September 2006

Outline
- Activity on gLite 3.0.*
- Activity on the gLite 3.1 branch
- Progress on TCG actions (or lack thereof)
- CE solutions and CREAM
- VOMS and DGAS status
- Summary and plans

Activity on the 3.0 branch
- Backwards-compatible release ➔ all components of the 'LCG' stack are still there.
- Since the beginning of EGEE-II, we delivered 6 bugfix tags for org.glite.ce and 13 bugfix tags for org.glite.wms.
- We were asked to back-port several features from the 3.1 branches:
  – Glue 1.2 VOViews used for matchmaking.
  – Forwarding of CE requirements to/via BLAH.
  – DAG node resubmits.
  – Creation of common-format accounting logs in BLAH and integration with DGAS.
  – Support for job 'prologues', which can check and prepare the environment. Failure of the prologue causes a shallow resubmission. Support for job 'epilogues' was also added.
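The prologue/epilogue behaviour can be illustrated with a minimal sketch. It is not the gLite job wrapper: the function and type names are hypothetical, and the handling of a failing epilogue is an assumption, since the slide only states that a failing prologue causes a shallow resubmission (a resubmission triggered before the user payload has started).

```python
import subprocess
from enum import Enum
from typing import Callable, Optional, Sequence

Command = Sequence[str]


class Outcome(Enum):
    DONE = "done"
    SHALLOW_RESUBMIT = "shallow-resubmit"  # payload never started
    FAILED = "failed"


def run_wrapped(
    payload: Command,
    prologue: Optional[Command] = None,
    epilogue: Optional[Command] = None,
    run: Callable[[Command], int] = subprocess.call,
) -> Outcome:
    """Run prologue, payload and epilogue in order (hypothetical wrapper)."""
    # The prologue checks and prepares the environment. If it fails, the
    # payload is never started, so the job can be shallowly resubmitted.
    if prologue is not None and run(prologue) != 0:
        return Outcome.SHALLOW_RESUBMIT
    if run(payload) != 0:
        return Outcome.FAILED
    # Assumption: a failing epilogue marks the job as failed.
    if epilogue is not None and run(epilogue) != 0:
        return Outcome.FAILED
    return Outcome.DONE
```

The `run` parameter defaults to `subprocess.call`, which returns the exit code of the command; it is injectable only so the logic can be exercised without starting real processes.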

The 3.1 branch
- The 3.1 maintenance branches were forked as requested on March 31st and have been continuously tested internally since then.
- Remaining "not-yet-backported" features:
  – g-pbox (was excluded from gLite 1.5 and 3.0)
  – Proxy renewal service with no WMS dependencies
  – ICE-CREAM (see later)
  – Streamlined logging of bulk jobs in WMProxy
  – WM code cleanup
  – Jobwrapper code cleanup
  – Status and statistics collection on the WMS node
- The 3.1 WMS requires at least '3.1' security.

3.1 branch complications
- The scenario for deploying the 3.1 version of gLite has been unclear for a long time, so the (in principle release-candidate) 3.1 branch was used as a testing ground for:
  – Merging changes to support 64-bit builds
  – Merging changes to complete a build against VDT 1.3 (the updated OpenSSL caused problems at runtime)
  – Obtaining a build from the new build system
- This caused some interference. We are now moving from internal test machines to the preview testbed.
- This should make it possible to upgrade only the WMS node components (WMS + dependencies).

TCG Actions (1)
- 301. "Configuration that defines a set of primary RB's to be used by the VO for load balancing and allows defining alternative sets to be used in case the primary set is not available."
  – The appropriate modifications were merged into the 3.1 branch at the end of July, supporting the requested fall-back to Service Discovery. This includes a callout to refuse jobs when an overload condition is detected by a local script.
- "Single RB end point with automatic load balancing."
  – This is the "real" high-availability solution. The TCG decided it is not a priority.
- "No loss of jobs due to temporary unavailability of an RB instance."
  – We are working on a "hot standby" configuration. Testing of the LCG HAGD and of 'filelist' code replacements has just resumed. Completion, commit, integration and test depend on 3.1 plans.
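The selection logic requested by the first action (a primary set of RBs, alternative sets, and a fall-back to Service Discovery) might look roughly like the sketch below. All names are hypothetical, and the availability check and the discovery hook are placeholders supplied by the caller; this is not the code merged into the 3.1 branch.

```python
import random
from typing import Callable, Iterable, Optional, Sequence


def pick_rb(
    rb_sets: Sequence[Sequence[str]],
    is_available: Callable[[str], bool],
    discover: Callable[[], Iterable[str]] = lambda: (),
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return one usable RB endpoint, or None if nothing is reachable.

    rb_sets[0] is the primary set. Later entries are alternative sets that
    are consulted only when every RB in the earlier sets is unavailable.
    """
    rng = rng or random.Random()
    for rb_set in rb_sets:
        candidates = [rb for rb in rb_set if is_available(rb)]
        if candidates:
            # A random choice spreads the load over the usable RBs of a set.
            return rng.choice(candidates)
    # Last resort: ask the (hypothetical) service-discovery hook.
    discovered = [rb for rb in discover() if is_available(rb)]
    return rng.choice(discovered) if discovered else None
```

Random choice is only one possible load-balancing policy; round-robin or load-aware selection would fit the same structure.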

TCG Actions (2)
- 304. "Handling of 10**6 jobs/day."
  – After massaging bulk submission in, we are now working on a bulk match-making feature, based on user-specified "significant" attributes. At the same time, we are decoupling bulk-submitted jobs from DAGs. Work has just started, in parallel with #303, but these features will be added in a tag after the tag for #
- "Better input sandbox management (caching of sandboxes)."
  – The few corrections needed to support Input Sandbox files via HTTP (and HTTP proxies) were committed into the 3.1 branch at the end of July. htcp (from gridsite), which provides proxy support, was used.
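The idea behind bulk match-making on "significant" attributes can be sketched as follows: jobs that agree on every significant attribute are grouped, and the match-making step runs once per group instead of once per job. The job representation (dictionaries with an `id` key) and the `match` callback are assumptions made for illustration, not the WMS data model.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Mapping, Sequence

Job = Mapping[str, Any]


def bulk_match(
    jobs: Sequence[Job],
    significant: Sequence[str],
    match: Callable[[Job], List[str]],
) -> Dict[Any, List[str]]:
    """Map each job id to its matching CEs, matching once per group."""
    groups: Dict[tuple, List[Job]] = defaultdict(list)
    for job in jobs:
        # Jobs with identical significant attributes share one key.
        # Attribute values are assumed to be hashable.
        key = tuple(job.get(attr) for attr in significant)
        groups[key].append(job)

    results: Dict[Any, List[str]] = {}
    for members in groups.values():
        ces = match(members[0])  # a single match-making call per group
        for job in members:
            results[job["id"]] = ces
    return results
```

With N jobs falling into G groups, the number of match-making calls drops from N to G, which is where the gain for large bulk submissions would come from.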

Highlights: Computing Element
- Three flavours available now:
  – LCG-CE (GT2 GRAM)
  – gLite-CE (GSI-enabled Condor-C)
  – CREAM (WS-I based interface)
- How to deal with them:
  – The LCG-CE is in production now but will be phased out by the end of the year.
  – The gLite-CE still needs thorough testing and tuning, which is being done now.
  – CREAM is being deployed on the JRA1 preview testbed now. After a first testing phase it will be certified and deployed together with the gLite-CE.
- BLAH is the interface to the local resource manager (via plug-ins):
  – Used by CREAM and the gLite-CE
  – Information pass-through: pass parameters to the LRMS to help job scheduling
[Diagram labels: WMS, Clients; LRMS; WN; BDII; R-GMA; CEMon; Computing Element; glexec + LCAS/LCMAPS; BLAH; Grid Site; Information System]

CREAM (1)
- Implemented functionality:
  – Job submission. Supported job types: simple, batch jobs, MPI jobs, interactive jobs (for submissions via WMS-ICE)
  – Proxy delegation. Delegation 2.0 integrated
  – Job cancellation
  – Job status
  – Job list
  – Job suspension and job resume
  – Proxy renewal
  – Disabling of new job submissions. This can be explicitly invoked by the CE admin, and it is also possible to define policies on waiting/pending/running jobs that disable new job submissions, e.g. disable new submissions if the number of active jobs is > 3000
- Interfaces: CREAM CLI and Java client available
  – Anyone can implement their own interface to interact with CREAM, since it is a Web service
- Comprehensive (at least we hope so) documentation is available on the CREAM web site
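A policy of the kind described for disabling new submissions could be expressed as in the sketch below. The class and field names are hypothetical and do not reflect CREAM's configuration syntax; counting "active" jobs as running plus pending is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubmissionPolicy:
    """Hypothetical admission policy for a CE (not CREAM's actual config)."""

    max_active: Optional[int] = 3000   # threshold taken from the slide's example
    max_pending: Optional[int] = None  # None means "no limit"
    admin_disabled: bool = False       # explicit switch for the CE admin

    def accepts_new_jobs(self, running: int, pending: int) -> bool:
        if self.admin_disabled:
            return False
        # Assumption: "active" means running plus pending jobs.
        active = running + pending
        if self.max_active is not None and active > self.max_active:
            return False
        if self.max_pending is not None and pending > self.max_pending:
            return False
        return True
```

For example, `SubmissionPolicy().accepts_new_jobs(running=2000, pending=1001)` is false, because 3001 active jobs exceed the limit of 3000.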

CREAM (2)
- Performing stress and performance tests
  – See the “Preview Testbed” presentation and also the CREAM web site under “Test Results”
- Some issues
  – Problems with VOMS-based AuthZ (bug #18244)
  – Thread-safety issues in the BLAH-glexec interaction
- Current activities
  – Fixing the problems found in the tests
  – Support for bulk jobs
  – Starting JSDL support (with the OMII hat)
- Future activities
  – BES support (again, with the OMII hat)

ICE (1)
- ICE (Interface to Cream Environment) is responsible for the WM-CREAM interaction
- It implements the functionality currently provided by JC+LM+Condor
  – Job submission, job cancellation, proxy renewal
  – Job status monitoring (and then taking the appropriate actions): receiving notifications about jobs from CEMon and doing an active polling if/when needed
- A single ICE instance can interact with multiple CREAM servers
  – ICE uses a thread pool to interact with CREAM servers
- Everything is user-configurable
  – ...with sensible defaults which should be OK for most installations
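The monitoring scheme described for ICE (notifications first, active polling only when needed, a thread pool for talking to the servers) can be illustrated with a small sketch. This is not ICE code: the class, the staleness threshold and the `poll_status` callback are all hypothetical, and treating "no recent notification" as the trigger for polling is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


class JobStatusTracker:
    """Keep job states fresh from notifications, polling only stale jobs."""

    def __init__(
        self,
        poll_status: Callable[[str], str],
        stale_after: float = 300.0,
        workers: int = 4,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._poll_status = poll_status
        self._stale_after = stale_after
        self._clock = clock
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self.status: Dict[str, str] = {}
        self._last_seen: Dict[str, float] = {}

    def on_notification(self, job_id: str, status: str) -> None:
        # Called when a notification arrives. This sketch is single-threaded
        # on the caller side; concurrent callers would need a lock here.
        self.status[job_id] = status
        self._last_seen[job_id] = self._clock()

    def poll_stale(self) -> List[str]:
        """Actively poll jobs with no recent notification; return their ids."""
        now = self._clock()
        stale = [j for j, t in self._last_seen.items()
                 if now - t >= self._stale_after]
        # The polls run in the thread pool; results are applied in this thread.
        for job_id, status in zip(stale, self._pool.map(self._poll_status, stale)):
            self.on_notification(job_id, status)
        return stale

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

The injectable `clock` keeps the staleness logic testable without sleeping; in a real service `poll_stale` would be called periodically.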

ICE (2)
- Tests so far were done in a small, controlled environment
- Starting now with stress and performance tests in the Preview testbed
- Some next steps
  – Debugging and fixing of problems found during testing
  – Better approach for proxy renewal? (to be discussed and agreed with Daniel K.)
  – Better integration with the WM? (to be discussed with Francesco G.) Currently ICE runs as a separate process, interacting with the WM via filelist.

VOMS Recent Updates
- Insertion of AA certificates into the AC
  – Requested by SA1 (easier system management).
- VO-specific directories in vomsdir
  – Better separation of duty among VOs.
- Generic attributes
  – (name, value) pairs can be added to VOMS credentials. Requested by many people. Very useful indeed.
- Updated Java APIs
  – Fixed several long-standing problems.
- These updates were applied to the '3.1' branch of VOMS.

VOMS Updates (continued)
- Support for PKCS12-formatted credentials
  – As you get them from CAs.
- Support for GT4
  – Verified in the context of the GIN activity.
- Globus-free C/C++ APIs
- The future
  – New version 2.0 of voms-admin: more stable, faster, and more maintainable.

Status of DGAS
- Site-level HLR available
  – Repository for detailed (with user, VO and VOMS attributes) and aggregate job information for jobs executed on a given site, or reported from multiple sites.
- DGAS sensors are available for BLAH and globus-jobmanager based CEs
  – PBS and LSF, with Condor and SGE being implemented.
- VO-level HLR also available
  – Mostly for evaluation purposes, not foreseen to be deployed.
- CLI available
  – Query for accounting records or aggregate reports.
- The site HLR, receiving records from the “unified” accounting log via DGAS sensors, is being deployed on the preview testbed.
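The difference between detailed records and aggregate reports can be shown with a short sketch. The record fields (`vo`, `cpu_seconds`) are assumptions chosen for illustration and are not the DGAS record schema or the format of the unified accounting log.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def aggregate_by_vo(records: Iterable[Mapping[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Collapse per-job usage records into per-VO totals."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"jobs": 0, "cpu_seconds": 0.0}
    )
    for rec in records:
        entry = totals[rec["vo"]]   # hypothetical field: the job's VO
        entry["jobs"] += 1
        entry["cpu_seconds"] += rec["cpu_seconds"]  # hypothetical usage field
    return dict(totals)
```

The same grouping could be keyed on user or on VOMS attributes, since the slide lists all three as part of the detailed records.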

Summary
- Software from the 3.1 maintenance branches is being deployed on the preview testbed.
- Commits for TCG actions and other progress on the 3.0 branch (DGAS addition and VDT transition) that were supposed to happen at the end of July were sidetracked by the “escalated” developer attention requested on rb102.cern.ch:
  – A few scale issues were identified and addressed.
  – Hardware and software configuration was performed.
- The program of work to address TCG items is now resuming.
- New planning is needed for 3.1 certification and deployment (which components, when?) and for the addition of these pending changes.