1
ICE-CREAM
Luigi Zangrando
On behalf of the JRA1 IT-CZ Padova group
2
Slide shown last time
Last time we showed these 2 slides:
Test: LSFConnector vs BLAHConnector
- Submitted 100 jobs sequentially to a CREAM-based CE
- No other load (e.g. no other jobs) on CREAM
- Measured LRMSSubmissionTime – SubmissionTime for all the jobs, in the two scenarios (LSFConnector and BLAHConnector)
  - SubmissionTime: when the job is received by CREAM (i.e. when CREAM inserts the job in its journal manager)
  - LRMSSubmissionTime: when the job is submitted to LSF (as reported by the LSF log)
- For the purpose of this test, jobs in the JournalManager are managed sequentially (i.e. a job is submitted to the LRMS only when the previous job has been submitted)
  - I.e. the sync mode was used, as far as BLAH is concerned
- Possible to do a better job for both connectors
EGEE JRA1-ITCZ cluster meeting. Torino, November 2005
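The measured quantity is simply a per-job difference of two timestamps. A minimal sketch of that arithmetic follows; the record layout, the field names and the sample timestamps are assumptions made up for illustration, not data from the test.

```python
from datetime import datetime
from statistics import mean

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def submission_delays(jobs):
    """Return LRMSSubmissionTime - SubmissionTime, in seconds, for each job."""
    delays = []
    for job in jobs:
        received = datetime.strptime(job["submission_time"], TIME_FORMAT)
        submitted = datetime.strptime(job["lrms_submission_time"], TIME_FORMAT)
        delays.append((submitted - received).total_seconds())
    return delays


# Made-up timestamps, for illustration only (not measurements from the test).
sample_jobs = [
    {"submission_time": "2005-11-01 10:00:00",
     "lrms_submission_time": "2005-11-01 10:00:03"},
    {"submission_time": "2005-11-01 10:00:01",
     "lrms_submission_time": "2005-11-01 10:00:06"},
]

delays = submission_delays(sample_jobs)
print("per-job delays (s):", delays)
print("mean delay (s):", mean(delays))
```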
3
Slide shown last time (second slide, image only)
4
CREAM - BLAH
- This triggered a discussion with BLAH developers
- Decided on a revision of the CREAM architecture
  - Decided to give up the LRMS-specific connectors and to use BLAH instead for every “interaction” with the underlying resource management system
- CREAM journal manager modified to allow parallel BLAH submissions
  - Since BLAH submission is I/O bound
  - Number of threads is configurable
- Test repeated with 10 threads
  - 9-10 s (constant) as LRMSSubmissionTime – SubmissionTime
  - 4 s measured on another CREAM installation
    - Not investigated further
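Because each submission spends most of its time waiting on I/O, a fixed pool of worker threads can overlap those waits. The sketch below illustrates the idea only: `blah_submit` is a hypothetical stand-in that sleeps instead of calling BLAH, and `drain_journal` is an illustrative name, not part of CREAM.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blah_submit(job_id):
    """Hypothetical stand-in for a blocking, I/O-bound BLAH submission."""
    time.sleep(0.05)  # simulated I/O wait
    return f"{job_id}:submitted"


def drain_journal(job_ids, num_threads):
    """Submit all pending jobs using a pool with a configurable thread count."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() preserves the input order of the results
        return list(pool.map(blah_submit, job_ids))


jobs = [f"job{i}" for i in range(20)]
start = time.monotonic()
results = drain_journal(jobs, num_threads=10)
elapsed = time.monotonic() - start
print(f"{len(results)} jobs submitted in {elapsed:.2f} s")
```

With one thread the 20 simulated submissions would take about 1 s in total; with 10 threads the waits overlap and the batch finishes in a fraction of that.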
5
CREAM - BLAH
- Changes negotiated with BLAH developers to get notifications about job status changes from the BLAH log parser
  - See:
  - Just provided by the BLAH developers
  - Starting integration with CREAM
- Changes negotiated with BLAH developers to have BLAH commands working on multiple jobs
  - Waiting to get these modifications
6
Credential mapping
- glexec (formerly known as su-exec) is not in gLite 1.5 and not released yet
  - Needed for credential mapping
- Talked in Pisa with JRA3 developers
  - Discussed the dirty details
  - Agreed on some needed modifications
  - They reported that about 10 days after Pisa they should be able to release something working
    - It should be now
- Started discussing with BLAH developers where to apply this integration
  - In CREAM calling BLAH, or in BLAH calling the LRMS commands?
  - Decision also depends on the overhead introduced by glexec
    - To be measured when glexec is usable
- In the meantime started applying some other needed changes
  - Deployment and integration of a gridftp server, LCMAPS enabled
  - Proper ownerships and protections of directories
7
CREAM: other accomplishments
- Porting to Axis 1.2.1
- Porting to gSOAP 2.7.6b
  - Several problems managing faults
- Applied modifications needed because of changes in delegation stuff
- Support for configuration file in CREAM CLI
- Several bug fixes (in both client and server)
- User documentation updated
- First draft of a “high level” document describing CREAM architecture and functionality available
- First unit tests committed
8
CREAM: other issues and other next steps
- Integration of VOMS based authorization
  - VomsPDP just released
- Integration with CEMon
  - To provide asynchronous notifications about CREAM jobs
- Support of DAGs and bulk jobs
  - We plan to implement parametric and collection jobs as DAG jobs, as done in the WMS
- CREAM CLI in the build system
  - Still the circular dependency problem to be addressed
9
Safe interactive access to jobs
- Tobia Conforto joined us for his “BSc” university internship
  - Resumed the work on interactive access to jobs
- General idea
  - One-way interactivity: job → user
  - Let a user monitor her job’s stdout, stderr and output files in real time
- In detail
  - Interactive read-only access to a running job’s environment
  - The CREAM JobId is the only parameter needed
  - Remote ps, top, ls, cat and tail-like functionality on the Worker Node
  - Intelligent browsing of remote files: the client-side hex viewer and view-like functionality transfer only the needed chunks of the remote file
  - GUI clients are possible, although not currently scheduled
- Why
  - Inspection of long-running jobs: the user is not blind to the job’s progress, she can make an informed decision on whether to stop it or let it run
  - Early sampling of a batch of jobs’ correct operation can save considerable amounts of possibly wasted resources
  - Faster turnaround of debug sessions, trial runs and other kinds of tests
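The "only the needed chunks" idea can be illustrated with a read that takes an offset and a size, so that a viewer fetches just the window it is about to display. This is a local-file sketch under assumed names (`read_chunk`, `hex_line`); the actual service and its transport are not shown in the slides.

```python
import os
import tempfile


def read_chunk(path, offset, size):
    """Return at most `size` bytes of the file starting at byte `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def hex_line(data, offset):
    """Format a chunk the way a simple hex viewer might: offset, then bytes."""
    hex_bytes = " ".join(f"{b:02x}" for b in data)
    return f"{offset:08x}  {hex_bytes}"


# Demo on a throwaway 64-byte file standing in for a job's output file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "job.out")
with open(demo_path, "wb") as f:
    f.write(bytes(range(64)))

chunk = read_chunk(demo_path, offset=16, size=4)
line = hex_line(chunk, 16)
print(line)
```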
10
Safe interactive access to jobs
- How
  - C++ Client → Specific CE Webservice: glite-authenticated SOAP messages (Internet)
  - Specific CE Webservice → Worker Node: ssh as the local user (CE LAN)
- Security considerations
  - Access to the service is subject to the same authentication as CREAM is
  - The user has only access to worker nodes where one of her jobs is running
  - She may only issue a fixed set of commands, none of which can alter files
  - User-supplied arguments are strictly parsed against shell escaping
- Privacy considerations
  - SOAP messages, including all traffic payload, are encrypted with SSL
  - The set of files / directories / devices the user has read access to on the worker node is restricted by the same OS file permissions as her job’s
  - Additional filters can restrict the commands to the job’s working directories
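The two security rules, a fixed command set and strict argument parsing, could look roughly like the sketch below. The allowlist mirrors the commands named in these slides, but the accepted-character pattern and the function name `build_argv` are assumptions for illustration, not the service's actual rules.

```python
import re

# Read-only commands named in the slides; the exact set is an assumption.
ALLOWED_COMMANDS = {"ps", "top", "ls", "cat", "tail"}

# Assumed policy: plain path-like tokens only, so no shell metacharacters.
SAFE_ARG = re.compile(r"[A-Za-z0-9_./-]+")


def build_argv(command, args):
    """Validate a request and return an argument vector, or raise ValueError."""
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {command!r}")
    for arg in args:
        if not SAFE_ARG.fullmatch(arg):
            raise ValueError(f"unsafe argument: {arg!r}")
        if arg.startswith("-"):
            raise ValueError(f"options are not accepted: {arg!r}")
        if ".." in arg.split("/"):
            raise ValueError(f"parent directory references rejected: {arg!r}")
    # An argument vector is meant to be executed directly, never via a shell.
    return [command, *args]


print(build_argv("tail", ["out.log"]))
```

Returning a list rather than a command string means the caller can execute it without any shell interpretation, which removes the escaping problem instead of trying to quote around it.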
11
CREAM: “external links”
- GRIDCC
  - GRIDCC is integrating CREAM submission in their portal
    - Based on Java clients that we provided, as they requested
    - This has been shown in the recent GridCC EU review
  - We maintain a CREAM installation, deployed on a small LSF farm in Legnaro
  - Support to Laura Del Cano (Elettra, Trieste), who is doing the work
- AVANADE
  - Software company with whom we had a meeting some time ago
  - Interested in evaluating our stuff and possibly collaborating with us
  - Trying to deploy CREAM in their .NET environment
    - They need a document/literal version of the services
      - Provided for CREAM
      - The problem is with the delegation stuff
        - Pinged the security group, but it looks like they are not going to do it in the short term
12
XYZ ICE
- Found a better (than XYZ) name for the WMS component dealing with submissions to CREAM CEs: ICE
  - ICE: Interface to Cream Environment
  - Isn’t “ICE-CREAM integration” nice?
- Contacted the GridICE team first
  - They didn’t see problems, even if the ICE name can make people think of GridICE (and vice versa)
    - But this would not be a bad thing
- ICE is in CVS (org.glite.wms.ice) but not yet linked to the build system
13
ICE (Interface to Cream Environment)
- ICE is the software component acting as an interface between the WMS and CREAM CEs
- Operations initially handled by ICE
  - Job submissions
  - Job removals
- ICE is being developed as a stand-alone process
  - Written in C++
  - Whether it can become a WM thread in the future will be investigated
- At the moment it is under heavy development; many features are missing
  - Right now jobs are polled to get status changes
  - In the future, there will be an additional ICE thread which will receive notifications from the CEMon coupled with CREAM CEs
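Polling for status changes amounts to querying each known job and comparing the answer with the last status seen. A minimal sketch of one polling pass follows; ICE itself is written in C++, and `poll_once`, `query_status` and the status names used here are illustrative assumptions rather than its real interface.

```python
def poll_once(job_ids, query_status, last_seen):
    """Query every job once and return the (job, old, new) status changes."""
    changes = []
    for job_id in job_ids:
        status = query_status(job_id)
        previous = last_seen.get(job_id)
        if status != previous:
            changes.append((job_id, previous, status))
            last_seen[job_id] = status
    return changes


# Fake status source standing in for a remote query to the CE.
scripted = {"job1": iter(["PENDING", "RUNNING", "RUNNING"])}


def fake_query(job_id):
    return next(scripted[job_id])


seen = {}
first = poll_once(["job1"], fake_query, seen)
second = poll_once(["job1"], fake_query, seen)
third = poll_once(["job1"], fake_query, seen)
print(first, second, third)
```

A notification-based listener would replace the repeated calls to `poll_once` with a callback invoked only when something changes, which is the motivation for the CEMon thread mentioned above.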
14
WMS-CREAM integration / 1
- ICE takes the job management requests from its filelist
- ICE manages the submission to CREAM (see next slides)
- ICE keeps the mapping between the GridjobId and the CREAMjobId
  - This mapping is critical: it is essential that ICE remembers which jobs it controls
  - For the moment the mapping is kept on disk, using a journal to record updates
  - To be investigated whether LBproxy can be used
- Failed submissions are reinserted into the WM’s filelist, as in the current implementation (JC+LM)
- ICE features:
  - Multithreaded (the submitter and the status poller are two separate threads)
  - Uses log4cpp for logging debug messages
  - Tries to be fault tolerant
[Diagram: NS, WMProxy, Helpers, FileList, MM, WM, JA, FileList, FileList, ICE (Submitter, Poller), JC+LM, Condor, CREAM]
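An on-disk mapping protected by a journal can be sketched as an append-only log that is replayed at start-up. The class below is an illustration in Python under assumed names and an assumed JSON-lines record format; it is not ICE's actual on-disk layout.

```python
import json
import os
import tempfile


class JobMap:
    """GridjobId -> CREAMjobId mapping rebuilt from an append-only journal."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.mapping = {}
        if os.path.exists(journal_path):
            with open(journal_path) as f:
                for line in f:
                    self._apply(json.loads(line))  # replay past updates

    def _apply(self, record):
        if record["op"] == "put":
            self.mapping[record["grid_id"]] = record["cream_id"]
        elif record["op"] == "del":
            self.mapping.pop(record["grid_id"], None)

    def _log(self, record):
        # Write and sync the journal entry before touching the in-memory map.
        with open(self.journal_path, "a") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(record)

    def put(self, grid_id, cream_id):
        self._log({"op": "put", "grid_id": grid_id, "cream_id": cream_id})

    def remove(self, grid_id):
        self._log({"op": "del", "grid_id": grid_id})


# Demo with made-up identifiers: the mapping survives a "restart".
journal = os.path.join(tempfile.mkdtemp(), "jobmap.journal")
jobs = JobMap(journal)
jobs.put("grid-1", "CREAM-A")
jobs.put("grid-2", "CREAM-B")
jobs.remove("grid-1")

restored = JobMap(journal)
print(restored.mapping)
```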
15
ICE: what we have so far
- Implemented
  - Submission to a CREAM CE is working
    - Support for multiple CREAM CEs done sequentially right now
  - The job status poller is working fine
  - Removal (cancel) of a job is coming soon
- To do
  - Job status change listener via CEMon
  - Extending ICE to handle submission to multiple CREAM CEs in parallel (being implemented)
  - LB logging
  - “Lease” submission protocol
  - Proxy renewal
- All tests are being done with a stand-alone ICE to easily identify where problems are located
  - Requests are inserted into the “WM” filelist via a testing tool which simulates the WM inserting requests in the filelist
  - “True” WM integration will be done when everything is tested enough