Outline Introduction Objectives Motivation Expected Output

Slides:



Advertisements
Similar presentations
1 EGI Federated Clouds Task Force HEPiX Spring 2012 Workshop Matteo Turilli
Advertisements

EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI Unified Middleware Distribution (UMD): SW provisioning to EGI Mario David.
EMI SA2: Quality Assurance (EMI-SA2 Work Package) Alberto Aimar (CERN) WP Leader.
Costantini Alessandro INFN - IGI MPI-VT within EGI 23/03/2012Costantini A. – MPI-Multicore1.
Enabling Grids for E-sciencE SGE J. Lopez, A. Simon, E. Freire, G. Borges, K. M. Sephton All Hands Meeting Dublin, Ireland 12 Dec 2007 Batch system support.
Towards a WBEM-based Implementation of the OGF GLUE Information Model Sergio Andreozzi, INFN-CNAF, Bologna (Italy) Third EGEE User Forum 13 Feb 2008, Clermont-Ferrand,
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGI Operations Tiziana Ferrari EGEE User.
EMI INFSO-RI Accounting John Gordon (STFC) APEL PT Leader.
Automatic Resource & Usage Monitoring Steve Traylen/Flavia Donno CERN/IT.
JRA1.4 Models for implementing Attribute Providers and Token Translation Services Andrea Biancini.
Accounting Update John Gordon and Stuart Pullinger January 2014 GDB.
State of Georgia Release Management Training
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks APEL CPU Accounting in the EGEE/WLCG infrastructure.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI Accounting Old and New Requirements John Gordon Revised 22/3/12.
LCG WLCG Accounting: Update, Issues, and Plans John Gordon RAL Management Board, 19 December 2006.
AEGIS Academic and Educational Grid Initiative of Serbia Antun Balaz (NGI_AEGIS Technical Manager) Dusan Vudragovic (NGI_AEGIS Deputy.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ian Bird All Activity Meeting, Sofia
CERN - IT Department CH-1211 Genève 23 Switzerland t IT-GD-OPS attendance to EGEE’09 IT/GD Group Meeting, 09 October 2009.
EMI INFSO-RI Testbed for project continuous Integration Danilo Dongiovanni (INFN-CNAF) -SA2.6 Task Leader Jozef Cernak(UPJŠ, Kosice, Slovakia)
Probes Requirement Review OTAG-08 03/05/ Requirements that can be directly passed to EMI ● Changes to the MPI test (NGI_IT)
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI Regionalisation summary Prague 1.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI GLUE 2: Deployment and Validation Stephen Burke egi.eu EGI OMB March 26 th.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI Accounting Requirements Stuart Pullinger STFC 09/04/2013 EGI CF – Accounting.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarksEGEE-III INFSO-RI MPI on the grid:
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI Services for Distributed e-Infrastructure Access Tiziana Ferrari on behalf.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI Regional tools use cases overview Peter Solagna – EGI.eu On behalf of the.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI Evaluation of Liferay modules EGI-InSPIRE mini-project Gergely Sipos EGI.eu.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI UMD Platform Documentation Álvaro Simón CESGA 02/12/2011 SA2 TF2011, Amsterdam1.
Accounting Update John Gordon. Outline Multicore CPU Accounting Developments Cloud Accounting Storage Accounting Miscellaneous.
Storage Accounting John Gordon STFC GDB, Lyon 6 th April2011 GDB January 2012.
Maria Alandes Pradillo, CERN Training on GLUE 2 information validation EGI Technical Forum September 2013.
© 2008 Open Grid Forum Production Grid Infrastructure (PGI) 101 Morris Riedel, Balazs Konya, Moreno Marzolla OGF PGI Working Group Co-Chairs.
Implementation of GLUE 2.0 support in the EMI Data Area Elisabetta Ronchieri on behalf of JRA1’s GLUE 2.0 Working Group INFN-CNAF 13 April 2011, EGI User.
WLCG Operations Coordination Andrea Sciabà IT/SDC GDB 11 th September 2013.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI APEL Regional Accounting Alison Packer (STFC) Iván Díaz Álvarez (CESGA) APEL.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI VT-GPGPU Working Group First Conference Call.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI IPv6 Report for HEPiX CERN October 5, 2012 CERN 1
Accounting Review Summary and action list from the (pre)GDB Julia Andreeva CERN-IT WLCG MB 19th April
CREAM Status and plans Massimo Sgaravatto – INFN Padova
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI MPI VT report OMB Meeting 28 th February 2012.
Grid Engine batch system integration in the EMI era
1 EGI Federated Cloud Architecture Matteo Turilli Senior Research Associate, OeRC, University of Oxford Chair – EGI Federated Clouds Task Force
SA2.6 Task: EMI Testbeds Danilo Dongiovanni INFN-CNAF.
EMI is partially funded by the European Commission under Grant Agreement RI EMI Status And Plans Laurence Field, CERN Towards an Integrated Information.
CESGA QR2 SA1-SWE Partner Coordination Meeting 2 CICA, Sevilla
Transition to EGI PSC-06 Istanbul Ioannis Liabotis Greece GRNET
Il Sistema di Supporto INFNGrid & GGUS (Global Grid User Support )
EGI Operations Management Board
John Gordon STFC OMB 26 July 2011
OGF PGI – EDGI Security Use Case and Requirements
TSA2.2 QC Change Management
Project Management Processes
CREAM Status and Plans Massimo Sgaravatto – INFN Padova
EMI Interoperability Activities
Global Banning List and Authorization Service
PRACE-EGI helpdesk integration
Towards an Integrated Information system: the EMI view
Proposal for obtaining installed capacity
Report on SLA progress Ioannis Liabotis <ilaboti at grnet.gr>
ServiceNow Implementation Knowledge Management
NA3: User Community Support Team
Maite Barroso, SA1 activity leader CERN 27th January 2009
MPI probes OMB Meeting 26th February 2013
Action U-E-5 Technical Coordination – User Technical Support
Infrastructure Area EMI All Hands Summary.
EMI: dal Produttore al Consumatore
Discussions on group meeting
Health Ingenuity Exchange - HingX
New Types of Accounting Beyond CPU
GLOBUS ACCOUNTING USING GRID-SAFE - DEMO
Presentation transcript:

VT MPI within EGI Community Forum 2012 Álvaro Simón (CESGA) Zdnek Sustr (CESNET)

Outline Introduction Objectives Motivation Expected Output Status Report Open tasks and progress Open issues Future work

Introduction MPI VT created in November 2011 Supposed to end on May 2012 Wiki page: https://wiki.egi.eu/wiki/VT_MPI_within_EGI Mailing list: vt-mpi@mailman.egi.eu Objectives Collaborate with MPI user communities Promote MPI deployment Enhance MPI experience and support for EGI users. Output Improve the communications between MPI users and developers within EGI SA3. Improve / create new MPI documentation for EGI users and administrators. Find and detect current issues and bottlenecks related with MPI jobs. Set up a VO on EGI with site committed to support MPI jobs.

Task 1: Documentation Motivation: MPI documentation was fragmented in different places. Too much site admin / infrastructure oriented. Progress: Documentation reviewed and extended by MPI VT members. MPI Administrator guide: https://wiki.egi.eu/wiki/MAN03 new sections to clarify MPI doubts Working on an MPI user guide Propose to concentrate documentation under a single endpoint

Task 2: IS Motivation: Few sites are publishing correctly MPI tags and flavours Some MPI info is omitted, missing or incorrect. Incoherent information published by queues supporting MPI Progress: Survey the information published by sites Inspect which GLUE 1.3/2.0 variables could be used in the MPI context and ask for its correct implementation GLUE 1.3 “MaxSlotsPerJobs” could be already used but only one site publishes a different value wrt the default “999999999” Detected problems in the CPU / WallClockTime relationship published by sites. May lead to the early killing of parallel jobs

Task 3: Nagios Probes Motivation: Current SAM nagios probes are quite simple. It does not detect if a MPI sites is really working fine or not. Progress: A new set of nagios probes specifications were created by MPI members: https://wiki.egi.eu/wiki/VT_MPI_within_EGI:Nag ios Probe to check information system Set of probes to check the MPI functionality (EGEE MPI WG recommendations) Request the creation of a new GOCDB Service Type GOCDB acting like an ITIL CMDB: all configuration items should be registered Release the dependency from the information system Allows accurate infrastructure information about infrastructure status

Task 4: Accounting Motivation: The only way to recognize a parallel jobs in the accounting system and looking for jobs with efficiency >1 APEL is not ready to registry MPI usage records. This work is on progress and tracked by JRA1.4 task and EMI. New APEL with MPI capabilities will be ready at the end of summer. Progress: MPI VT members are in contact with EMI developers to describe the parallel usage record for the different batch systems. Hopefully this will lead to some requirements to an accounting system able to correctly record parallel usage

Task 5: LRMS status Motivation: Detect LRMS MPI issues. Check MPI LRMS status support. Progress: CREAM GE EMI release fully supports MPI. Tested by MPI VT members. Issue with Torque / Maui MPI support (GGUS tickets #57828 #67870) Not supported by EMI but by a third party. EPEL includes torque server / client packages and an external EMI repo includes MAUI updates. Issue raised to EGI SA1/2 and to EMI

Task 6: MPI-Kickstart VO Motivation: MPI VO bring together sites and users interested in MPI. This VO is NOT intended for everyday use by all users wishing to use MPI. The main reason for its establishment is to collect experience that will be later adopted by regular VOs. Progress: MPI-Kicktart VO established and populated MPI VO configuration is available in the MPI VT wiki page. Resources currently provided by CESGA and CESNET More welcome Resolving issues as they arrive

Sites Feedback NGI Italy has presented a survey about MPI usage. This is allow the detection of problems from user perspective. Slides available in indico: https://www.egi.eu/indico/materialDisplay.py?contribId=1&materialId=slides&confId=828 NGI_SK has presented a report to check MPI JDL attributes required by different experiments: https://www.egi.eu/indico/getFile.py/access?contribId=4&resId=0&materialId=0&confId=828

Open Issues Batch system support: It was discussed with EMI. Torque batch system is not supported by EMI. Is a third-party piece of software. Torque server/client are maintained by Steve Traylen (not in EMI). These packages are included into EPEL repo. MAUI scheduler is also maintained by Steve Traylen (not in EMI) and updates an external CERN repo.

Future Work Continue to unify and simplify MPI documentation Inspect how the IS schema can adjust to specific LRMS policies, and enforce the correct values May involve interaction with LRMS IP developers Tested “new” probes functionality Should be developed by SA3 Include new sites to support VO MPI For testing purposes. Provide requirements to a new accounting system able to support MPI jobs

Questions Questions?