Development of test suites for the certification of EGEE-II Grid middleware Task 2: The development of testing procedures focused on special details of.


Development of test suites for the certification of EGEE-II Grid middleware
- Task 2: Development of testing procedures focused on special details of various software features
- Task 4: Creation of a specialized testbed for developing test suites
- Task 5: Preparation of intermediate and final reports
PNPI: Yu. Ryabov, N. Klopov

Plans for the second year
1. Development of stress and performance tests for the WMS and CE in accordance with requests from the developers and/or the certification team
2. Installation of the new gLite 3.1 middleware on the testbed

Requirements for the test
- Submit a large number of jobs simultaneously.
- Submit jobs from one or many users.
- Monitor the load on the CE and WMS during testing.
- Monitor job status (passage through the system's components) on the CE and WMS during testing.
- Store status information for all submitted jobs.
- Allow express visual analysis of the results.

Functional schema of the test
[Diagram with components UI, Job submitter, Data collector, WMS, CE and Monitoring; flows labelled "Parametric job", "Monitoring data" and "Jobs logging info".]
The Job submitter launches several scripts in the background, each of which submits and monitors a parametric job on behalf of a specific user. Before testing, a monitor is started on the WMS and CE; it tracks the load average and the running processes. After the test finishes, the Data collector retrieves the monitoring data from the CE and WMS and uses the glite-wms-job-logging-info command to request status information for all subjobs of all parametric jobs. This information can be passed to the web site for express visual analysis.
P.S. The program run on the worker nodes is a simple bash script that sleeps for a given number of seconds (in case anyone asks).

Jobs submission
The job submission program (several scripts) has the following input parameters:
- u: the number of users
- x: path to the directory with the users' proxy certificates (x1: path to the user proxy certificate)
- n: the number of subjobs from each user
- s: time interval between job status requests
- t: maximum time of the test execution
- a: the time each subjob will execute on the WN
- l: path to the logfile
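The parameter list above maps naturally onto a small command-line front end. The following Python sketch is only an illustration, not the PNPI scripts themselves: the default values, the use of seconds as the time unit, the proxy file naming scheme (`user<i>.proxy`) and the `fake_worker` stand-in are all assumptions, and the `x1` single-proxy option is left out.

```python
import argparse
import os
from concurrent.futures import ThreadPoolExecutor


def parse_args(argv=None):
    """Parse options named after the letters in the parameter list."""
    p = argparse.ArgumentParser(description="Stress-test job submitter (sketch)")
    p.add_argument("-u", type=int, default=1, help="number of users")
    p.add_argument("-x", default=".", help="directory with user proxy certificates")
    p.add_argument("-n", type=int, default=1000, help="subjobs per user")
    p.add_argument("-s", type=float, default=60, help="interval between status requests (assumed seconds)")
    p.add_argument("-t", type=float, default=3600, help="maximum test duration (assumed seconds)")
    p.add_argument("-a", type=int, default=10, help="run time of each subjob on the WN (assumed seconds)")
    p.add_argument("-l", default="submit.log", help="path to the logfile")
    return p.parse_args(argv)


def proxy_paths(proxy_dir, users):
    """Hypothetical naming scheme: one file 'user<i>.proxy' per user."""
    return [os.path.join(proxy_dir, f"user{i}.proxy") for i in range(1, users + 1)]


def run_test(args, submit_and_monitor):
    """Run one worker per user concurrently and collect the results in user order."""
    proxies = proxy_paths(args.x, args.u)
    with ThreadPoolExecutor(max_workers=args.u) as pool:
        futures = [pool.submit(submit_and_monitor, proxy, args) for proxy in proxies]
        return [f.result() for f in futures]


def fake_worker(proxy, args):
    """Stand-in for the real per-user script; it submits nothing to the grid."""
    return (os.path.basename(proxy), args.n)


demo_args = parse_args(["-u", "3", "-x", "/tmp/proxies", "-n", "1000"])
demo_result = run_test(demo_args, fake_worker)
print(demo_result)
```

In a real run the worker would submit one parametric job with the given proxy and poll its status every `s` seconds until `t` expires; that part depends on the gLite command-line tools and is deliberately not shown.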

Monitoring
These scripts run on the CE and WMS; they receive and save information about the load average and the names of system processes. The script runs with the following parameters:
- t: polling interval
- l: request the load average
- p: request process names
Load average: approximately the number of active processes (from UNIX).
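A monitor of this kind can be sketched in a few lines of Python. This is an assumed reconstruction for illustration only: the original scripts are not shown in the slides, the function names are hypothetical, and reading process names from `/proc/<pid>/comm` works on Linux only.

```python
import os
import time


def process_names(proc_root="/proc"):
    """Read process names from /proc/<pid>/comm (Linux only)."""
    names = []
    if not os.path.isdir(proc_root):
        return names
    for entry in os.listdir(proc_root):
        if entry.isdigit():
            try:
                with open(os.path.join(proc_root, entry, "comm")) as f:
                    names.append(f.read().strip())
            except OSError:
                pass  # the process exited between listdir() and open()
    return names


def sample(want_load=True, want_procs=False):
    """Take one sample: timestamp, 1-minute load average, process names."""
    record = {"time": time.time()}
    if want_load:
        record["load1"] = os.getloadavg()[0]
    if want_procs:
        record["procs"] = process_names()
    return record


def monitor(interval, samples, want_load=True, want_procs=False):
    """Collect a fixed number of samples, sleeping `interval` seconds between them."""
    records = []
    for i in range(samples):
        records.append(sample(want_load, want_procs))
        if i + 1 < samples:
            time.sleep(interval)
    return records


demo = monitor(interval=0.01, samples=3, want_load=True, want_procs=True)
print(len(demo), "samples, first load average:", demo[0]["load1"])
```

The `interval`, `want_load` and `want_procs` arguments play the roles of the `t`, `l` and `p` parameters; a production monitor would write each record to a file instead of keeping it in memory.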

Data collector
The Data collector script is executed after all jobs have finished and does the following:
- copies the monitoring data from the WMS and CE;
- requests the event time information for each subjob, using the glite-wms-job-logging-info command;
- performs preliminary data processing (formatting).
Data processing: the data are formatted so that they are convenient to process in the CGI scripts on the web site.

Parametric job
Parametric job functionality was used to solve the problem of submitting a large number of jobs to the CE simultaneously. A parametric job is a set of jobs (subjobs) with identical descriptions apart from the values of the parametric attributes.

    JobType = "Parametric";
    Executable = "tst.sh";
    InputSandbox = {"tst.sh", "input_PARAM_.txt"};
    StdOutput = "out_PARAM_.txt";
    StdError = "err_PARAM_.txt";
    OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};
    Parameters = 1000;
    ParameterStart = 0;
    ParameterStep = 1;

With these settings the parametric attributes take values from 0 to 999. The WMS creates an individual subjob for each parameter value, so N = (Parameters - ParameterStart) / ParameterStep subjobs are created. Both the main parametric job and its subjobs have unique IDs.
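The expansion rule can be checked with a short Python sketch. It only mimics, on the client side, what the slide says the WMS does; the helper names `parameter_values`, `expand` and `build_jdl` are hypothetical and are not part of any gLite API.

```python
def parameter_values(parameters, start=0, step=1):
    """Values substituted for _PARAM_: start, start + step, ... below `parameters`.

    When (parameters - start) is divisible by step, the number of values
    equals N = (Parameters - ParameterStart) / ParameterStep from the slide.
    """
    return list(range(start, parameters, step))


def expand(template, value):
    """Replace the _PARAM_ placeholder with one concrete parameter value."""
    return template.replace("_PARAM_", str(value))


def build_jdl(executable, parameters, start=0, step=1):
    """Assemble the JDL text shown on the slide for the given settings."""
    lines = [
        'JobType = "Parametric";',
        f'Executable = "{executable}";',
        f'InputSandbox = {{"{executable}", "input_PARAM_.txt"}};',
        'StdOutput = "out_PARAM_.txt";',
        'StdError = "err_PARAM_.txt";',
        'OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};',
        f"Parameters = {parameters};",
        f"ParameterStart = {start};",
        f"ParameterStep = {step};",
    ]
    return "\n".join(lines)


values = parameter_values(1000, start=0, step=1)
jdl = build_jdl("tst.sh", 1000)
print(len(values), values[0], values[-1])
print(expand("out_PARAM_.txt", values[7]))
```

For the settings on the slide this yields 1000 values, from 0 to 999, and file names such as `out7.txt` for the subjob with parameter value 7.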

Testbed
gLite 3.1 middleware was installed on the testbed.
[Diagram: UI, WMS+LB+BDII, CE and four WNs.]

Test usage
Measurement of the load average as a function of time under the following condition: N jobs from each of K users.
Test usage at PNPI: 1000 jobs from each user (1 user, 3 users) for the "old" and "new" versions of gLite:
- Old: the version we used until January 2008.
- New (with marshal patches): the version we have used since January 2008.
The new version with marshal patches was released to production on 10 April 2008 (gLite update 23). The marshal patches were developed by A. Kiryanov (PNPI).

Marshal patches for LCG CE
The aim is to improve the behavior of LCG CEs under load by regulating requests from job managers (hence the term 'marshal'). This is achieved by:
- Eliminating the need to recompile heavy Perl code on every job manager invocation: memory-persistent daemons handle the requests.
- Controlling the number of simultaneously running job manager queries: this decreases the load on the file system and batch system and prevents CE overload by WMSes.
- Decreasing system losses: jobs complete faster, which is especially visible with a large number of short jobs.
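The idea of limiting the number of simultaneously running job manager queries can be illustrated with a counting semaphore. The Python sketch below shows only the general technique; it is not how the marshal patches are implemented (the slides mention Perl code and memory-persistent daemons), and the class name `QueryMarshal` and the `fake_query` stand-in are hypothetical.

```python
import threading
import time


class QueryMarshal:
    """Let at most `max_concurrent` queries run at the same time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of queries observed running together

    def run(self, query):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return query()
            finally:
                with self._lock:
                    self._active -= 1


def fake_query():
    """Stand-in for a batch system query; it only sleeps briefly."""
    time.sleep(0.01)
    return "ok"


gate = QueryMarshal(max_concurrent=2)
results = []


def worker():
    results.append(gate.run(fake_query))


threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("queries:", len(results), "peak concurrency:", gate.peak)
```

Ten callers issue queries, but no more than two are ever in flight, which is the kind of regulation that keeps a burst of requests from overloading the file system or batch system.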

CE monitoring
For 1 user there are no significant differences between the old and new versions.

CE monitoring
For 3 users (1000 subjobs each) there is a noticeable difference in the load average and in the execution time of all subjobs.

WMS monitoring
The WMS load is low in both cases.

Express visual analysis (WEB viewer)
Each job passes through the different WMS components; the corresponding events are generated and stored in LB. Examples of these events: "RegJob, NetworkServer", "Match, WorkloadManager", ..., "Done, LogMonitor". This makes it possible to evaluate the performance of the WMS components. The WEB viewer was developed to provide a visual representation of event timestamps for the jobs running through the different components. The viewer provides the following functions:
- choose the event type by whose timestamp the jobs will be sorted;
- choose the data file with logging info data;
- get the graph of the event time since job registration in the WMS for each job;
- choose an additional event type (it will be shown on the same graph);
- get and store the graph data as a text file for future analysis;
- get the IDs and logging info data for the subjobs that are missing the chosen events;
- view the monitoring data.
Data processing: the data are formatted so that they are convenient to process in the CGI scripts on the web site.
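The core computation behind such a graph, the event time since job registration for each job together with the list of jobs that never produced the chosen event, can be sketched as follows. The record layout (dictionaries with `job`, `event`, `source` and `time` keys) is an assumption made for this example; it is not the output format of glite-wms-job-logging-info, and the sample data are invented.

```python
def delays_since_registration(events, event, source):
    """Return (sorted delays, missing jobs) for one event type and source.

    Each delay is the time of the chosen event minus the time of the
    job's RegJob event. Jobs that were registered but never logged the
    chosen event are reported separately.
    """
    registered = {}
    reached = {}
    for e in events:
        job, t = e["job"], e["time"]
        if e["event"] == "RegJob":
            registered[job] = min(registered.get(job, t), t)
        if e["event"] == event and e["source"] == source:
            reached[job] = min(reached.get(job, t), t)
    delays = sorted((reached[j] - registered[j], j) for j in registered if j in reached)
    missing = sorted(j for j in registered if j not in reached)
    return delays, missing


# Invented sample records for three subjobs.
sample_events = [
    {"job": "sub1", "event": "RegJob", "source": "NetworkServer", "time": 0.0},
    {"job": "sub2", "event": "RegJob", "source": "NetworkServer", "time": 1.0},
    {"job": "sub3", "event": "RegJob", "source": "NetworkServer", "time": 2.0},
    {"job": "sub1", "event": "Match", "source": "WorkloadManager", "time": 12.0},
    {"job": "sub2", "event": "Match", "source": "WorkloadManager", "time": 8.0},
]

delays, missing = delays_since_registration(sample_events, "Match", "WorkloadManager")
print(delays)
print(missing)
```

Plotting the sorted delays against the job index gives the kind of curve the viewer displays, and the `missing` list corresponds to the subjobs for which the chosen event was lost.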

Express visual analysis
[Figure labels: "Transfer (source: LogMonitor, destination: LRMS)" and "Accepted (source: LogMonitor)".]
The site consists of HTML pages and CGI scripts. The picture shows how the Running event from LogMonitor lags behind the Running event. P.S. This slide has an animation.

Express visual analysis
We can view the CE and WMS monitoring data.
Link: http://biod.pnpi.spb.ru/totem/test/ctest.html
Note: a slide about the new version is needed after this slide.

Summary
- The testbed was created with gLite 3.1.
- A complex test was developed which provides the following:
  - submission of a large number of jobs from many users;
  - load average monitoring on the WMS and CE;
  - data acquisition of the test results.
- The developed test has been used on concrete sets of input parameters.
- An HTML viewer was developed for the presentation of test results.

Summary (first year of the grant)
- A set of WMS tests (control of functionality) was developed at the request of the gLite certification team for the following types of jobs: parametric, interactive, checkpointable, partitionable.
- A long and complex JDL stress test was developed (for estimating the critical file size).
- Some of the tests were included in the certification SAM framework.
- 5 bugs were found and submitted to Savannah.

Conclusion
- Task 2 (PNPI): done
- Task 4 (PNPI): done
- Task 5: under preparation (together with collaborating teams)