Testing the EGI-DRIHM TestBed

D. Cesini

Preliminary tests

Authentication
- CE: HelloWorld JDL submission (a minimal example is sketched below)
- SE: lcg-rep of the WRF input data file
    lcg-rep -v -d SE srm://darkstorm.cnaf.infn.it/drihm.eu/generated/2013-07-18/file05cf726f-1894-4f08-8531-c516d4144403
  (using LFC=lfc.ipb.ac.rs, lfn:/grid/drihm.eu/cesini/genova.tgz)
- Repeated twice, with proxy certificates released by the two replica VOMS servers

MPI && MPI-START published
- Requirements = (other.GlueCEStateStatus == "Production") &&
    Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment) &&
    Member("OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment)

Published Total CPUs
- GlueCEInfoTotalCPUs in the SubCluster

Published OS version
- GlueHostOperatingSystemRelease
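The CE authentication check can be reproduced with a minimal HelloWorld-style JDL submitted through the WMS from a gLite UI. The sketch below is only an illustration: the file names (hello.jdl, jobid.txt) are made up, and the WMS endpoint and delegation defaults depend on the UI configuration.

hello.jdl:

  Executable    = "/bin/hostname";
  StdOutput     = "std.out";
  StdError      = "std.err";
  OutputSandbox = {"std.out", "std.err"};

Submission from the UI, once per replica VOMS server:

  voms-proxy-init --voms drihm.eu          # proxy from one of the two VOMS servers
  glite-wms-job-submit -a -o jobid.txt hello.jdl
  glite-wms-job-status -i jobid.txt
  glite-wms-job-output --dir ./hello-out -i jobid.txt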

WRF test
- WRF (v3.4.1) compiled on SL6 using OpenMPI (v1.6.4) and NetCDF libs (v4.2.1.1)
- Input data prepared by Antonio Parodi for the Genoa flooding case of 4 Nov 2011
  - Data available for a run that starts on 4-11-2011 00:00 and ends on 5-11-2011 00:00
  - Two nested domains, one coarse and one fine integration grid
- Just one simulated hour run
- Just the coarse grid used (no nesting)
- Executable, input data, configuration files (namelist.input) and NetCDF libs uploaded to the Grid in a world-readable tgz file: lfn:/grid/drihm.eu/cesini/genova.tgz (an upload sketch follows this list)
- CPUNumber = 40, because we have the reference timings obtained at LRZ-LMU by Antonio for 40, 80 and 120 processors
- No SMPGranularity required
- Submitted only if the preliminary tests were OK
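For completeness, a tarball like genova.tgz can be uploaded and registered under the LFN above with the standard lcg-utils/LFC tools. This is only a sketch: the destination SE (darkstorm.cnaf.infn.it) is taken from the earlier slide but may not be the one actually used, and lfc-chmod is just one possible way to make the catalogue entry world-readable.

  export LFC_HOST=lfc.ipb.ac.rs
  # copy to the SE and register the LFN in the catalogue
  lcg-cr -v --vo drihm.eu -d darkstorm.cnaf.infn.it \
         -l lfn:/grid/drihm.eu/cesini/genova.tgz file:$PWD/genova.tgz
  # make the catalogue entry world-readable
  lfc-chmod 644 /grid/drihm.eu/cesini/genova.tgz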

WRF JDL

  CPUNumber = 40;
  #SMPGranularity = 8;
  Executable = "/usr/bin/mpi-start";
  Arguments = "-t openmpi -x LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./netcdf-lib/ -d MPI_START_TEMP_DIR=\"$HOME/\" -vvv ./wrf.exe";
  StdOutput = "std.out";
  StdError = "std.err";
  InputSandbox = {"wrf-prologue.sh"};
  OutputSandbox = {"std.err", "std.out", "prologue.log", "rsl.out.0000", "rsl.error.0000"};
  Prologue = "wrf-prologue.sh";
  Requirements = ( (other.GlueCEStateStatus == "Production") &&
                   Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment) &&
                   Member("OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment) &&
                   (other.GlueCEUniqueID == "cream-02.cnaf.infn.it:8443/cream-pbs-prod-sl6") );
  RetryCount = 0;
  ShallowRetryCount = -1;
  MyProxyServer = "myproxy.cnaf.infn.it";
  FuzzyRank = true;

$ cat wrf-prologue.sh
  export LFC_HOST=lfc.ipb.ac.rs
  lcg-cp -v srm://darkstorm.cnaf.infn.it/drihm.eu/generated/2013-07-18/file05cf726f-1894-4f08-8531-c516d4144403 file:genova.tgz
  tar -xvzf genova.tgz >> prologue.log 2>&1
  cp genova/* .
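Submission of this JDL follows the same pattern as the HelloWorld test; a sketch, with illustrative file names:

  glite-wms-job-submit -a -o wrf-jobid.txt wrf.jdl
  glite-wms-job-status -i wrf-jobid.txt
  # rsl.out.0000 / rsl.error.0000 come back in the output sandbox once the job is Done
  glite-wms-job-output --dir ./wrf-out -i wrf-jobid.txt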

Available Resources (CE)

[cesini2@igi-ui ~]$ lcg-infosites --vo drihm.eu ce
# CPU   Free  Total Jobs  Running  Waiting  ComputingElement
----------------------------------------------------------------
   408   360        0        0        0    ce.ceta-ciemat.es:8443/cream-sge-drihm
  1560  1520        1        1        0    ce.hpgcc.finki.ukim.mk:8443/cream-pbs-drihm
   624   302        1        1        0    ce64.ipb.ac.rs:8443/cream-pbs-drihm
   358    58        0        0        0    cream-02.cnaf.infn.it:8443/cream-pbs-prod-sl6
   180   103        0        0        0    cream-ce01.ariagni.hellasgrid.gr:8443/cream-pbs-drihm
   118    21        0        0        0    cream-ce01.marie.hellasgrid.gr:8443/cream-pbs-drihm
     4     4        0        0        0    cream-ce02.marie.hellasgrid.gr:8443/cream-pbs-drihm
   104    52        0        0        0    cream.afroditi.hellasgrid.gr:8443/cream-pbs-drihm
   624   339        1        0        1    cream.ipb.ac.rs:8443/cream-pbs-drihm
   196     8        0        0        0    cream01.athena.hellasgrid.gr:8443/cream-pbs-drihm
   398     0        2        0        2    cream01.grid.uoi.gr:8443/cream-pbs-drihm
   104    48        0        0        0    cream01.kallisto.hellasgrid.gr:8443/cream-pbs-drihm
   392     0        1        0        1    cream02.athena.hellasgrid.gr:8443/cream-pbs-drihm
   224   123        0        0   444444    cream1.grid.cesnet.cz:8443/cream-pbs-drihm
   224   123        0        0   444444    cream2.grid.cesnet.cz:8443/cream-pbs-drihm
  3880     0     1213      963      250    dissel.nikhef.nl:2119/jobmanager-pbs-medium
    64    64        0        0        0    emi-ce01.scope.unina.it:8443/cream-pbs-hpc
  3880     1        0        0        0    gazon.nikhef.nl:8443/cream-pbs-flex
  3880     1     1213      963      250    gazon.nikhef.nl:8443/cream-pbs-medium
  3880     2        0        0        0    juk.nikhef.nl:8443/cream-pbs-flex
  3880     2     1213      963      250    juk.nikhef.nl:8443/cream-pbs-medium
  3880     6        0        0        0    klomp.nikhef.nl:8443/cream-pbs-flex
  3880     6     1213      963      250    klomp.nikhef.nl:8443/cream-pbs-medium
    55     0        1        0        1    snf-10952.vm.okeanos.grnet.gr:8443/cream-pbs-drihm

Note: the GT5 CE at LRZ-LMU was not publishing its available resources for drihm.eu in the Information System on 18/07; there was no time to investigate this with the sites.
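Before submitting, one can also ask the WMS which of the CEs above satisfy the WRF JDL requirements (Production state plus the MPI-START and OPENMPI tags). A sketch, assuming the JDL is saved as wrf.jdl:

  glite-wms-job-list-match -a wrf.jdl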

Available Resources (SE)

[cesini2@igi-ui ~]$ lcg-infosites --vo drihm.eu se
Avail Space(kB)   Used Space(kB)   Type  SE
------------------------------------------
      149210299        150789700   SRM   darkstorm.cnaf.infn.it
     8759271843      21543304907   SRM   dpm.ipb.ac.rs
      175109859         96715642   SRM   se.hpgcc.finki.ukim.mk
     1004763730       6991387962   SRM   se01.afroditi.hellasgrid.gr
     2470142232        749367310   SRM   se01.ariagni.hellasgrid.gr
     8589864576            70016   SRM   se01.athena.hellasgrid.gr
      242994986        696998531   SRM   se01.grid.uoi.gr
     2307901070       2976565813   SRM   se01.kallisto.hellasgrid.gr
      404758881        239343606   SRM   se02.marie.hellasgrid.gr
     8794743316           812834   SRM   tbn18.nikhef.nl
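After the lcg-rep preliminary test, the replicas of the WRF input file can be listed to see on which of the SEs above it is stored; a sketch using the LFC host and LFN from the earlier slides:

  export LFC_HOST=lfc.ipb.ac.rs
  lcg-lr --vo drihm.eu lfn:/grid/drihm.eu/cesini/genova.tgz   # prints one SURL per replica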

Results

Authentication
- 4 sites failed using both VOMS proxies, on CE and SE
- 1 site failed for one of the VOMS proxies, OK with the other one
- 1 site OK on the CE but failing on the SE
- 9 sites OK
- 1 CE at NIKHEF is a GRAM5-based CE and authentication worked fine

MPI && MPI-START published
- 3 sites do not publish OPENMPI and MPI-START in GlueHostApplicationSoftwareRunTimeEnvironment in any of their CEs
- 1 site does not publish OPENMPI and MPI-START in all of its CEs
- 10 sites publish both tags in all their CEs

Published Total CPUs
- 1 site has one CE publishing just 4 CPUs

Published OS version
- 6 sites publish SL6.x, 8 sites publish SL5.x

The WRF test could be run at the 3 sites that passed all the preliminary tests: CESNET (prague_cesnet_lcg2), BOLOGNA (igi-bologna) and NAPLES (UNINA-EGEE). It seems, however, that at CESNET 40 cores cannot be allocated to a single job, so the job there was submitted using 16 cores.
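The tag, CPU and OS checks summarised above can be repeated from the UI with lcg-info. The sketch below uses the usual lcg-info aliases for the GLUE attributes (Tag, OSRelease, TotalCPUs); the exact attribute names and multi-condition query syntax may differ between releases:

  # CEs advertising both required tags, with their OS release and published CPU count
  lcg-info --vo drihm.eu --list-ce \
           --query 'Tag=MPI-START,Tag=OPENMPI' \
           --attrs 'CE,OSRelease,TotalCPUs'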

Performance

Time to simulate 1 second in Domain 1 (no nesting) during the first simulated hour; 40 processors used on every system.

System                                               AVG (s)   MIN (s)   MAX (s)
SUPERMIC@LRZ-LMU                                       0.68      0.66       7.3
IGI-BOLOGNA (MPI, 40 cores in 2 nodes, Ethernet)       1.90      1.72       7.4
UNINA-EGEE (MPI, 40 cores in 8 nodes, Infiniband)      1.98      1.84       8.4

Writing operations: 0.30 s using 80 processors, 0.20 s using 120 processors.