Development of test suites for the certification of EGEE-II Grid middleware Task 2: The development of testing procedures focused on special details of.


Development of test suites for the certification of EGEE-II Grid middleware
Task 2: Development of testing procedures focused on special details of various software features
Task 4: Creating a specialized testbed for developing test suites
Task 5: Preparing intermediate and final reports
PNPI – Yu. Ryabov, N. Klopov

Plans for the second year
1. Development of stress and performance tests for the WMS and CE, in accordance with requests from developers and/or the certification team
2. Installation of the new gLite 3.1 middleware on the testbed

Requirements for the test
1. Submit a large number of jobs simultaneously.
2. Submit jobs from one or many users.
3. Monitor the load on the CE and WMS during testing.
4. Monitor job status (passage through the system's components) on the CE and WMS during testing.
5. Store status information for all submitted jobs.
6. Allow express visual analysis of the results.

Functional schema of the test
[Diagram labels: Monitoring CE; Monitoring WMS; Job submitter; UI; Parametric job ... Parametric job; Data collector; Jobs logging info; Monitoring data]

Job submission
The job submission program (several scripts) has the following input parameters:
-u  number of users
-x  path to the directory with the users' proxy certificates (x1 - path to the user proxy certificate)
-n  number of subjobs from each user
-s  time interval between job status requests
-t  maximum time of the test execution
-a  time each subjob will execute on the WN
-l  path to the logfile
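The parameter list above can be sketched as a command-line interface. This is only an illustration, not the PNPI scripts themselves: the option letters come from the slide, while the `dest` names, the default values, and the assumption that all times are in seconds are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface mirroring the option letters listed on the slide.

    All dest names, defaults, and time units (seconds) are assumptions.
    """
    p = argparse.ArgumentParser(description="Stress-test job submitter (illustrative sketch)")
    p.add_argument("-u", dest="users", type=int, required=True,
                   help="number of users")
    p.add_argument("-x", dest="proxy_dir", required=True,
                   help="directory with the users' proxy certificates")
    p.add_argument("-n", dest="subjobs", type=int, required=True,
                   help="number of subjobs from each user")
    p.add_argument("-s", dest="status_interval", type=int, default=60,
                   help="interval between job status requests (seconds, assumed)")
    p.add_argument("-t", dest="max_time", type=int, default=3600,
                   help="maximum time of the test execution (seconds, assumed)")
    p.add_argument("-a", dest="subjob_time", type=int, default=60,
                   help="time each subjob runs on the WN (seconds, assumed)")
    p.add_argument("-l", dest="logfile", default="test.log",
                   help="path to the logfile")
    return p


def total_subjobs(args: argparse.Namespace) -> int:
    """Total number of subjobs the test will submit: users times subjobs per user."""
    return args.users * args.subjobs
```

For the PNPI runs described later (1000 jobs from each of 3 users), `total_subjobs` would give 3000 subjobs in one test.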

Monitoring
These scripts run on the CE and WMS; they collect and save information about the load average and the names of system processes. The script runs with the following parameters:
-t  poll time
-l  request the load average
-p  request the process names
Load average ~ the number of active processes (from UNIX)
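A minimal sketch of such a monitoring loop, assuming a Unix host with a POSIX `ps`. The function names, the record format, and the `samples` and `sink` parameters are hypothetical; the original scripts may work quite differently.

```python
import os
import subprocess
import time
from typing import Callable, List, Optional


def read_load_average() -> float:
    """Return the 1-minute load average reported by the OS (Unix only)."""
    return os.getloadavg()[0]


def read_process_names() -> List[str]:
    """Return the command names of all running processes, as printed by `ps`."""
    out = subprocess.run(["ps", "-e", "-o", "comm="],
                         capture_output=True, text=True, check=True).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def format_record(timestamp: float, load: Optional[float],
                  names: Optional[List[str]]) -> str:
    """Build one log line: timestamp, optional load average, optional process names."""
    fields = [f"{timestamp:.0f}"]
    if load is not None:
        fields.append(f"load={load:.2f}")
    if names is not None:
        fields.append("procs=" + ",".join(names))
    return " ".join(fields)


def monitor(poll_time: float, want_load: bool, want_procs: bool,
            samples: int, sink: Callable[[str], None] = print) -> None:
    """Take `samples` measurements, one every `poll_time` seconds, and pass each to `sink`."""
    for _ in range(samples):
        load = read_load_average() if want_load else None
        names = read_process_names() if want_procs else None
        sink(format_record(time.time(), load, names))
        time.sleep(poll_time)
```

Here `poll_time`, `want_load`, and `want_procs` play the roles of the `-t`, `-l`, and `-p` parameters; passing a file's `write`-style callable as `sink` would save the records instead of printing them.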

Data collector
The Data collector script is executed after all jobs have finished and does the following:
- copies monitoring data from the WMS and CE;
- requests the event time information for each subjob, using the glite-wms-job-logging-info command;
- performs preliminary data processing (formatting).

Parametric job
Parametric job functionality was used to solve the problem of submitting a large number of jobs to the CE simultaneously. A parametric job is a set of jobs (subjobs) with identical descriptions apart from the values of the parametric attributes.
    JobType = "Parametric";
    Executable = "tst.sh";
    InputSandbox = {"tst.sh", "input_PARAM_.txt"};
    StdOutput = "out_PARAM_.txt";
    StdError = "err_PARAM_.txt";
    OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};
    Parameters = 1000;
    ParameterStart = 0;
    ParameterStep = 1;
The parametric attributes take values from 0 to 999. The WMS creates an individual subjob for each parameter value: N = (Parameters - ParameterStart) / ParameterStep subjobs are created. Both the main parametric job and its subjobs have unique IDs.

Testbed
gLite 3.1 middleware was installed on the testbed: CE, WMS+LB+BDII, UI, WN

Test usage
Measurement of "load average" as a function of time under the following condition: N jobs from each of K users.
Test usage at PNPI: 1000 jobs from each user (1 user, 3 users) for the "old" and "new" versions of gLite:
- Old: the version we used until January 2008
- New (with marshal patches): the version we have been using since January 2008
The new version with marshal patches was released to production on 10 April 2008 (gLite update 23).
The marshal patches were developed by A. Kiryanov (PNPI).

Marshal patches for LCG CE
The aim is to improve the behavior of LCG CEs under load by regulating requests from job managers (hence the term 'marshal'). The patches:
1. Eliminate the need to recompile heavy Perl code on every job manager invocation: memory-persistent daemons handle the requests.
2. Control the number of simultaneously running job manager queries, which decreases the load on the file system and batch system.
3. Prevent CE overload by WMSes.
4. Decrease the system's losses: jobs complete faster, which is especially visible with a large number of short jobs.
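The idea of limiting the number of simultaneously running job manager queries can be illustrated with a counting semaphore. This is a generic sketch of the throttling technique only, not the marshal patches themselves (which, per the slide, rely on memory-persistent daemons serving Perl job managers); the class and function names are hypothetical.

```python
import threading
import time
from typing import Callable


class QueryGate:
    """Allow at most `limit` queries to run at the same time."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0  # highest number of queries observed running together

    def run(self, query: Callable[[], object]) -> object:
        # Block here until one of the `limit` slots is free.
        with self._slots:
            with self._lock:
                self._running += 1
                self.peak = max(self.peak, self._running)
            try:
                return query()
            finally:
                with self._lock:
                    self._running -= 1


def demo(limit: int = 3, requests: int = 20) -> int:
    """Fire `requests` concurrent dummy queries through the gate; return the peak concurrency."""
    gate = QueryGate(limit)
    threads = [threading.Thread(target=gate.run, args=(lambda: time.sleep(0.01),))
               for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gate.peak
```

However many requests arrive at once, `demo` never reports a peak above `limit`, which is the property that keeps the load on the file system and batch system bounded.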

CE monitoring

WMS monitoring

Express visual analysis (WEB viewer)
Each job passes through the different WMS components; the corresponding events are generated and stored in the LB. Examples of these events: "RegJob,NetworkServer", "Match,WorkloadManager", ..., "Done,LogMonitor". This makes it possible to evaluate the performance of the WMS components. The WEB viewer was developed to provide a visual representation of the event timestamps for jobs running through the different components. The viewer provides the following functions:
- choose the event type, which will be sorted by timestamp value;
- choose the data file with the logging info data;
- get a graph of the event time since job registration in the WMS for each job;
- choose an additional event type (shown on the same graph);
- get and store the graph data as a text file for future analysis;
- get the IDs and logging info data of the subjobs that are missing the chosen events;
- view the monitoring data.
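The core computation behind such a graph, the event time since job registration for each job plus the list of subjobs missing the chosen event, can be sketched as follows. The record layout (job ID, event name, timestamp in seconds) is a hypothetical, already-parsed form of the logging info data; the event names are the examples given on the slide.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, float]  # (job ID, event name, timestamp in seconds) - assumed layout


def event_delays(records: Iterable[Record], event: str,
                 start_event: str = "RegJob,NetworkServer"
                 ) -> Tuple[List[Tuple[str, float]], List[str]]:
    """Return (delays sorted by value, IDs of jobs lacking the event or the start event)."""
    times: Dict[str, Dict[str, float]] = {}
    for job_id, name, ts in records:
        job = times.setdefault(job_id, {})
        # Keep the earliest timestamp if an event is logged more than once.
        job[name] = min(ts, job.get(name, ts))

    delays: List[Tuple[str, float]] = []
    missing: List[str] = []
    for job_id, events in times.items():
        if start_event in events and event in events:
            delays.append((job_id, events[event] - events[start_event]))
        else:
            missing.append(job_id)
    delays.sort(key=lambda item: item[1])
    return delays, sorted(missing)


records = [
    ("j1", "RegJob,NetworkServer", 0.0), ("j1", "Match,WorkloadManager", 5.0),
    ("j2", "RegJob,NetworkServer", 1.0), ("j2", "Match,WorkloadManager", 3.0),
    ("j3", "RegJob,NetworkServer", 2.0),
]
delays, missing = event_delays(records, "Match,WorkloadManager")
print(delays)   # prints: [('j2', 2.0), ('j1', 5.0)]
print(missing)  # prints: ['j3']
```

Plotting the sorted delays against the job index gives the kind of curve the viewer displays; calling the function again with a second event name yields the additional series for the same graph.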

Express visual analysis
[Graph of event times: Transfer (source: LogMonitor, destination: LRMS); Accepted (source: LogMonitor)]

Express visual analysis
The monitoring data can also be viewed.

Summary
- The testbed was created with gLite 3.1.
- A complex test was developed which provides the following:
  - submission of a large number of jobs from many users;
  - load average monitoring on the WMS and CE;
  - data acquisition of the test results.
- The developed test has been used on concrete sets of input parameters.
- An HTML viewer was developed for the presentation of test results.

Summary (First year of the grant)
- A set of WMS tests (control of functionality) was developed at the request of the gLite certification team for the following types of jobs: parametric, interactive, checkpointable, partitionable.
- Long and complex JDL stress test (for estimation of the critical file size).
- Some of the tests were included in the certification SAM framework.
- 5 bugs were found and submitted to Savannah.

Conclusion
Task 2 (PNPI): done
Task 4 (PNPI): done
Task 5: under preparation (together with collaborating teams)