CREAM and ICE Test Results


CREAM and ICE Test Results Alvise Dorigo – INFN Padova Massimo Sgaravatto – INFN Padova EGEE-II JRA-1 All-Hands Meeting 7th-9th November 2006

Updates since the last report at the EGEE06 conference in Geneva
- Several bug fixes and enhancements have been made in CREAM:
  - Adoption of the new glexec (which fixes bug #20744) required some code changes in BLAH and CREAM.
  - Support for JSDL and BES (see next slide).
- Several bug fixes and enhancements also in ICE.
- Installation of UI and WMS 3.1 (ICE enabled) done on the Preview testbed (thanks to CNAF's system administrators); the tests reported in this presentation were performed on this layout.

Basic Execution Service (BES)
- Basic Execution Service (BES) is a new GGF specification that defines Web Services interfaces for creating, monitoring and controlling computational entities called activities.
  - Very primitive for now; it doesn't cover all CREAM functionality.
- Activities in BES are defined using the Job Submission Description Language (JSDL), a GGF specification for describing the requirements of computational jobs for submission to resources in Grid environments.
- A first implementation of BES support has been done in CREAM; it will be shown at SC'06 (Tampa, Florida) in an interoperability demo with other computational services.
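For illustration, a minimal JSDL document of the kind a BES endpoint consumes can be built as follows. This is a hedged sketch, not code from CREAM: the namespace URIs follow the GGF JSDL 1.0 specification, while the executable and argument are placeholders chosen here.

```python
# Build a bare-bones JSDL JobDefinition as an XML string.
# Namespaces follow JSDL 1.0; the job content is illustrative only.
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"

def make_jsdl(executable, arguments=()):
    """Return a minimal JSDL JobDefinition serialized to a string."""
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    return ET.tostring(job, encoding="unicode")

doc = make_jsdl("/bin/sleep", ["600"])
```

A real BES activity description would add resource requirements and data staging elements on top of this skeleton.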

Tests performed
- Direct submission to the CREAM CE using the command-line CREAM UI (CREAM CLI).
- Submission to the CREAM CE via the gLite WMS (ICE enabled), i.e. submission through WMS/ICE.
(The original slide also showed a diagram of the two submission paths.)

Quick overview of CREAM architecture (the original slide is an annotated diagram; its callouts are summarized here)
- The user sends a request for a job operation (submit/cancel/status/...).
- TrustManager extracts the user's DN, VO and other metadata from the user proxy certificate and forwards them to CREAM.
- The request passes through the Authorization Framework pipeline: authorization of the current VO to access CREAM (gridmap file), authorization of the current user to access CREAM (gridmap file), an "is admin?" check, and other pluggable authz policies.
- CREAM creates a Command object for the request and pushes it into a command cache (CMD cache); a queue of requests (submit, cancel, ...) feeds a thread pool whose number of threads is configurable.
- Before every modification/insertion to the CMD cache, the CREAM Journal Manager logs the change to a binary journal file; the journal is read to recover the cache status after a crash.
- The Journal Manager also has some internal logic; e.g. if a cancel immediately follows a submit, the submission isn't executed.
- A command has a status history (CREATED, QUEUED, PENDING, ERROR, SUCCESSFUL); this history is dumped into the journal file and is used by CREAM after a restart to know which phase of command execution had been successful before the restart.
- The CREAM abstract connector hands commands to the batch system: the BLAH connector communicates with the BLAH process via a pipe (other batch system implementations are possible); glexec maps the grid user to a local user, and the job reaches the batch system (LRMS: PBS, LSF, ...).
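The pattern described above (a configurable thread pool draining a command queue, with every cache change journaled before it is applied so the cache can be rebuilt after a crash) can be sketched as follows. This is an illustrative Python sketch, not CREAM's actual code; the class name, journal format and status strings are invented here.

```python
# Write-ahead journal in front of an in-memory command cache,
# drained by a fixed-size thread pool (illustrative sketch).
import json
import queue
import threading

class JournaledCommandCache:
    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.lock = threading.Lock()
        self.cache = {}

    def log_and_apply(self, cmd_id, status):
        # Journal the change *before* touching the cache, so the cache
        # can always be rebuilt from the journal after a crash.
        with self.lock:
            with open(self.journal_path, "a") as journal:
                journal.write(json.dumps({"id": cmd_id, "status": status}) + "\n")
            self.cache[cmd_id] = status

    def recover(self):
        # Replay the journal to rebuild the cache after a restart.
        self.cache = {}
        with open(self.journal_path) as journal:
            for line in journal:
                record = json.loads(line)
                self.cache[record["id"]] = record["status"]

def run_pool(commands, cache, n_threads=4):
    """Drain a queue of command ids with a fixed-size thread pool."""
    pending = queue.Queue()
    for cmd_id in commands:
        pending.put(cmd_id)

    def worker():
        while True:
            try:
                cmd_id = pending.get_nowait()
            except queue.Empty:
                return
            # A real worker would hand the command to the batch system
            # connector here; we just record the final status.
            cache.log_and_apply(cmd_id, "SUCCESSFUL")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The journal-first discipline is what makes crash recovery possible: the cache itself is never the source of truth.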

Direct submission tests Stress tests: Submission of an increasing number of jobs from UI @ CNAF (pre-ui-01.cnaf.infn.it) to CREAM CE @ Padova (CEId cream-01.pd.infn.it:8443/cream-pbs-long with 4 worker nodes) Submission of 100 jobs from 1, 2, 5, 10 parallel threads Submission of 250 jobs from 1, 2, 5, 10 parallel threads Submission of 500 jobs from 1, 2, 5, 10 parallel threads Submission of 1000 jobs from 1, 2, 5, 10 parallel threads Submission of 2000 jobs from 1, 2, 5, 10 parallel threads CREAM has been configured with 50 threads (see figure in previous slide) Tests have been made using a pre-delegated proxy Measured values: The number of failed jobs (taking into account the reported failure reasons) The time taken to submit each job to the CREAM CE (i.e. the time needed to get back the CREAM JobID) The time needed to submit the job to the LRMS via BLAH (i.e. the time needed to get the BLAH jobid) JRA-1 All Hands Meeting 7th -9th Nov. 2006

Submission results
Job JDL:
  Executable = "test.sh";
  StdOutput = "std.out";
  InputSandbox = {"gsiftp://grid005.pd.infn.it/Preview/test.sh"};
  OutputSandbox = {"out.out"};
  OutputSandboxDestURI = {"gsiftp://grid005.pd.infn.it/Preview/Output/std.out"};
test.sh:
  #!/bin/sh
  echo "I am running on `hostname`"
  echo "I am running as `whoami`"
  sleep 600
(The results themselves were shown as tables in the original slide.) More tests and details are on the CREAM web site (http://grid.pd.infn.it/cream/field.php) under "Test Results".

Direct submission tests: failures
- 96 jobs failed because of gridftp problems when transferring the input sandbox
- 29 jobs failed because of problems when using glexec

Some conclusions on the direct job submission tests
- Submission time to CREAM looks good.
- Submission time to the LRMS (schedule time) is not as good:
  - Some profiling is necessary to better understand where most of the time is spent.
  - Tests increasing/decreasing the number of CREAM threads (to see how performance is affected) are needed as well.
- Overall efficiency is good but can be improved:
  - The gridftp problem when transferring the input sandbox is not completely understood, though we suspect BLAH bug #20357.

Submission through WMS (ICE vs JC+Condor+LM)
Testbed configuration:
- WMS, BDII, UI at INFN-CNAF
- A single CREAM CE at INFN-PADOVA configured with 50 threads (cream-01.pd.infn.it:8443/blah-pbs-long)
- A single gLite 3.0 CE at INFN-PADOVA (cert-04.pd.infn.it:8443/blah-pbs-long)
What was measured:
- Efficiency (reporting the number of failed jobs along with the failure reasons)
- For both JC+Condor+LM and ICE: for each job, the time needed for submission to the LRMS and the corresponding JC+Condor+LM/ICE throughput
- Only for ICE: the time needed for submission to the CREAM CE and the corresponding ICE throughput (it is not straightforward to distinguish submission to the CE from submission to the LRMS in the JC+Condor+LM scenario)

Submission through WMS (ICE vs JC+Condor+LM)
How the tests were performed:
- With ICE/JC turned off, 1000 jobs are submitted to the WMS in order to fill the ICE/JC filelist.
- ICE/JC is then turned on, so it can start to satisfy the submission requests.
- Deep and shallow resubmission were disabled (RetryCount = 0; ShallowRetryCount = 0; added to the JDL used for the direct submission tests).
How the measurements were performed:
- Tstart = LB timestamp of the first ICE/JC dequeued event (i.e. request removed from the filelist)
- Tstop = LB timestamp of the last "Transferred OK to CE" event (when measuring the throughput of submission to the CE, ICE scenario) or the timestamp of the submission event in the BLAH accounting log file (when measuring the throughput of submission to the LRMS, both scenarios). The timestamp logged by BLAH to the accounting log for DGAS is not very accurate, but the overall results shouldn't be affected too much.
- Throughput = # jobs / (Tstop - Tstart)
(The original slide also showed a diagram of the two paths: WMS filelist -> ICE -> CREAM -> BLAH -> LRMS, and WMS filelist -> JC+Condor+LM -> gLite CE -> BLAH -> LRMS.)
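The throughput formula above, written out as code (a trivial sketch; timestamps are assumed to be in seconds):

```python
def throughput(n_jobs, t_start, t_stop):
    """Jobs per second between the first dequeue event (Tstart)
    and the last submission event (Tstop): # jobs / (Tstop - Tstart)."""
    if t_stop <= t_start:
        raise ValueError("Tstop must be later than Tstart")
    return n_jobs / (t_stop - t_start)
```

For example, 1000 jobs between t = 0 s and t = 500 s gives a throughput of 2 jobs/s.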

Submission through WMS/ICE
Test description: ICE throughput test, submission of 1000 jobs to the WMS, performed with different ICE configurations (5, 10, 15, 20, 25, 30 threads).
Each ICE thread performs the following operations:
- Parsing of the classad request
- Setting up the user credentials (gsoap-plugin)
- Connection to the endpoint (TrustManager authentication) and user proxy delegation (getProxyReq(), GRSTx509MakeProxyCert(), putProxy())
- Connection and dialogue with the CREAM service
- Event logging to LB (jobRegister, wms_dequeued, cream_transfer_start, cream_transfer_ok, etc.)
Measured times:
- Time_1: time between the first "dequeued by ICE" LB event and the last "Transferred to CE" LB event (see the previous slide)
- Time_2: time between the first "dequeued by ICE" LB event and the last timestamp logged by BLAH in the accounting log file
(The original slide also showed a timeline diagram: ICE input queue -> ICE -> CREAM -> BLAH -> LRMS, with Time_1 ending at the "Transfer to CE" LB event and Time_2 at the timestamp logged by BLAH.)
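Given a list of timestamped events, Time_1 and Time_2 as defined above can be computed as below. This is a sketch: the event kind strings are illustrative labels, not the actual LB event names.

```python
# Compute Time_1 and Time_2 from (timestamp_seconds, kind) pairs.
# Kind labels here are illustrative, not the real LB event names.
def ice_times(events):
    dequeued = [t for t, kind in events if kind == "ice_dequeued"]
    to_ce    = [t for t, kind in events if kind == "transferred_to_ce"]
    blah     = [t for t, kind in events if kind == "blah_logged"]
    time_1 = max(to_ce) - min(dequeued)  # first dequeue -> last transfer to CE
    time_2 = max(blah) - min(dequeued)   # first dequeue -> last BLAH log entry
    return time_1, time_2
```

Dividing the number of jobs by Time_1 or Time_2 then gives the throughput to the CE or to the LRMS respectively.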

Test results: 1000 jobs using the gLite UI (results table in the original slide).
(*) Some jobs remained in the "Waiting" state due to a problem in the WMProxy-LBProxy interaction; the relevant developers have already been informed.

Submission times to CE and to LRMS (ICE scenario)
(Three slides of plots in the original presentation, not reproduced here.)

Some conclusions on the submission via WMS-ICE tests
- Submission of a single job takes some time:
  - ICE performs a new proxy delegation for each new job; delegation consists of a getProxyRequest (I/O bound), signing with the user proxy, and a putProxy (I/O bound).
  - ICE calls the JobRegister SOAP function (I/O bound).
  - Each relevant operation is logged to LB (I/O bound).
  - Each operation involves authentication, encryption/decryption and an XML SOAP-based dialogue with several services; this takes a while.
- Some optimization can be performed, e.g. putProxy and JobRegister in the same call; code profiling is already ongoing to identify problems and optimize performance.
- Using multiple threads that perform several of these operations in parallel increases the overall throughput... but sooner or later CREAM saturates.
- Note that in our tests we considered a single CREAM CE, but a single ICE instance can of course interact with multiple CEs; a larger number of ICE threads can help when dealing with a larger number of different CREAM CEs.
- Efficiency is not too bad, but can be improved; basically the same failure reasons and considerations apply as for the direct job submission tests.

Submission through WMS (JC+Condor+LM scenario)
- Submitted 1000 jobs to the WMS with JC switched off; restarted JC when the JC input queue was full.
- Measured time: the time between the first "dequeued by JC" LB event and the last timestamp logged by BLAH in the accounting log file; from this the throughput was calculated.
- Performed 3 tests.
(The original slide also showed a diagram: JC input queue -> JC+Condor+LM -> gLite CE -> BLAH -> LRMS, with the measured time spanning from the "JC dequeued" LB event to the timestamp logged by BLAH.)

Submission through JC+Condor+LM
In the JC+Condor+LM tests we observed that at any given time only about 100 jobs were in the CE's schedd (and in PBS), so the resulting throughput is pretty low (see the next slides). We tried to play (thanks to FrancescoP) with the Condor config files, to see whether some limits were defined in the Condor configuration, and then contacted the Condor developers, who reported it is a bug in Condor:
"I have tracked the problem down to a bug in the gridmanager. It ends up ignoring GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE for condor-c jobs. Unfortunately, there is no workaround. We'll have it fixed for the next release." (reported by Jamie Frey, Condor Team)

JC+Condor+LM scenario test results
Test (1): 50 jobs aborted, 7 jobs in Waiting:
- 2: Submission to Condor failed
- 13: File not available. Cannot read JobWrapper output, both from Condor and from Maradona
- 5: Job got an error while in the CondorG queue
- 28: Standard output does not contain useful data. Cannot read JobWrapper output, both from Condor and from Maradona
- 2: Removal retries exceeded
Test (2): 61 jobs aborted:
- 21: Job got an error while in the CondorG queue
- 40: Standard output does not contain useful data. Cannot read JobWrapper output, both from Condor and from Maradona
Test (3): 40 jobs aborted:
- 1: Submission to Condor failed
- 25: Standard output does not contain useful data. Cannot read JobWrapper output, both from Condor and from Maradona
- 14: Job got an error while in the CondorG queue

Submission times to LRMS (JC+Condor+LM scenario)
(Two slides of plots in the original presentation, not reproduced here.)

Next steps
- Continue testing and debugging of ICE and CREAM:
  - Perform ICE tests with more than a single CREAM CE (another CREAM CE is being installed in Prague for the Preview testbed).
  - Continue code profiling to better understand where the bottlenecks are.
- 2 known critical bugs affecting CREAM:
  - #18244 (just fixed): a bug in the VomsServicePDP of gJAF; because of this bug we needed to specify all user DNs in the grid-mapfile. The developers committed the fix yesterday; it is still to be tested (hopefully today).
  - #20357: a race condition in BLAH; this causes problems in concurrent submissions (in particular when done by different users).
- When these 2 bugs are fixed, we should be ready to open the Preview testbed to users interested in testing ICE & CREAM.
More information: CREAM web site: http://grid.pd.infn.it/cream