Report on the INFN-GRID Globus evaluation
Massimo Sgaravatto, INFN Padova
for the INFN Globus group



Why Globus?
- Some basic services (security, information services, resource management, …) must be deployed in order to implement and use a Grid for real applications
- Globus identified as a possible Grid framework providing these services
- Need to assess the Globus packages (effectiveness, completeness, robustness, ease of use, …)
  => WP “Installation and Evaluation of the Globus Toolkit” of the INFN-GRID Project
- Goal: evaluation of the Globus toolkit
  - Which services can be useful?
  - What needs to be integrated or modified?
  - What is missing?

Globus activities within INFN
- Activities driven by the following work plan
  - Evaluation of Globus security services
  - Evaluation of Grid Information Service
  - Evaluation of Globus services for resource management
  - Evaluation of Globus tools for data management
  - Evaluation of Globus HBM for fault monitoring
  - Evaluation of Globus GEM for execution environment management
  - Globus deployment and installation tools
- Not only a simple evaluation
  - Some existing shortcomings addressed
  - Specific configurations and customizations implemented
- INFN-GRID Globus evaluation activities performed between June 2000 and January 2001
- “Official” Globus (1.1.4 for MPICH-G2) release tested

Globus security services
- The Globus GSI security model seems to satisfy the INFN community's current security requirements
  - One-time login mechanism
  - Use of X.509 certificates
  - Possibility of extending relations of trust to multiple CAs without having to interfere with their X.500 naming scheme
- Some shortcomings
  - Need for limited (by scope or purpose) proxies
  - Memory leaks in the GAA library
  - Cryptic diagnostics
  - Interface between GSI and AFS
    - Hopefully addressed with gsiklog
  - No tools for group management
    - Hopefully addressed with CAS

INFN customizations on security
- INFN-CA CRL distribution
- Centralized management of the grid-mapfile
  - Goal: ease the sharing of the same access policies (represented by the grid-mapfiles) among groups of hosts with common purposes
  - Proposed system
    - Central repository (LDAP server) to store user certificates (subjects) and to define groups of users
    - Certificates published by the CA manager
    - Group manager responsible for editing group memberships (using an LDAP client)
    - Resource owners (Globus administrators) periodically (e.g. via a cron job) “connect” to this repository, “download” the subjects of the certificates that meet a specified criterion (e.g. all users of group X), and produce grid-mapfile entries
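
The last step of the proposed system (turning the subjects of one group into grid-mapfile entries) can be sketched as follows. This is a minimal illustration, not the INFN tool: the in-memory GROUPS dictionary stands in for the LDAP repository, and the group name, subjects and local account name are all made up. The only format assumed is the usual grid-mapfile line, a quoted certificate subject followed by a local account.

```python
from typing import Dict, List

# Hypothetical stand-in for the central LDAP repository: group name -> certificate subjects.
# A real deployment would query the LDAP server instead of reading this dictionary.
GROUPS: Dict[str, List[str]] = {
    "groupX": [
        "/C=IT/O=INFN/L=Padova/CN=Example User One",
        "/C=IT/O=INFN/L=Milano/CN=Example User Two",
    ],
}


def gridmap_entries(group: str, local_account: str,
                    directory: Dict[str, List[str]] = GROUPS) -> List[str]:
    """Return one grid-mapfile line per distinct certificate subject in `group`."""
    subjects = sorted(set(directory.get(group, [])))
    # Each line maps a quoted subject to the local account it is allowed to use.
    return [f'"{subject}" {local_account}' for subject in subjects]


if __name__ == "__main__":
    # A cron job could write this output to the grid-mapfile of every host in the group.
    print("\n".join(gridmap_entries("groupX", "gridusr")))
```

Because the entries are regenerated from the repository on every run, a change made by the group manager reaches all hosts at their next scheduled update.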

Globus Information Services
- INFN implemented a hierarchical structure of GIS based on geographical entities
  - Site GIISes
  - Local GRISes registered at the site GIIS
  - Root GIIS where local GIISes are registered

INFN GIS Topology
[Figure: top-level INFN GIIS (dc=infn,dc=it,o=grid) above the Milano GIIS (dc=mi,dc=infn,dc=it,o=grid) and the Padova GIIS (dc=pd,dc=infn,dc=it,o=grid), with GRISes below the site GIISes.]

[Figure: query flow through the GIS hierarchy (root GIIS, GIISes, GRISes). Labels: the root GIIS gives a global view for scheduling/resource discovery; a 1st level query focuses on a set of resources; 2nd and 3rd level queries get more updated info. Root GIIS high availability: ldbm backend (?), GIIS replication (?).]

Globus Information Services
- Problems
  - Performance
    - When querying the root GIIS server, in the worst case the whole namespace must be searched
    - The overall response time is limited by the slowest response of a descendant
    - Poor GRIS performance (shell backend)
    - Example (querying a site GIIS):
      - ~ 1 sec. when the cache is on
      - ~ 5-10 sec. when the cache has expired and GIIS and GRIS are not busy
      - > 1 min. when the cache has expired and the GRIS is busy
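
The effect of caching and of the slowest descendant on the response time can be illustrated with a small simulation. It is a sketch under stated assumptions, not a model of the MDS code: the Node class, the parallel fan-out to children (hence the max) and the 30-second cache lifetime are hypothetical; only the orders of magnitude (about 1 s on a cache hit, 5 s for an idle GRIS, 60 s for a busy one) follow the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A GIIS or GRIS in a simulated hierarchy (hypothetical model)."""
    name: str
    local_latency: float              # seconds this server needs to answer by itself
    cache_ttl: float = 0.0            # 0 disables caching
    children: List["Node"] = field(default_factory=list)
    _cached_at: Optional[float] = None

    def query(self, now: float) -> float:
        """Return the simulated response time of a query arriving at time `now`."""
        if self._cached_at is not None and now - self._cached_at < self.cache_ttl:
            return self.local_latency  # cache hit: no descendant is contacted
        # Cache miss: assume children are queried in parallel, so the slowest one dominates.
        slowest_child = max((c.query(now) for c in self.children), default=0.0)
        if self.cache_ttl > 0:
            self._cached_at = now
        return self.local_latency + slowest_child


if __name__ == "__main__":
    site = Node("site GIIS", 1.0, cache_ttl=30.0,
                children=[Node("idle GRIS", 5.0), Node("busy GRIS", 60.0)])
    for t in (0.0, 10.0, 40.0):
        print(f"t={t:>4}: {site.query(t)} s")
```

In this toy model a single busy GRIS is enough to push the site-level answer past one minute whenever the cache has expired, which is the behaviour the slide reports.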

Globus Information Services
- Problems
  - Pull model
    - A mixed push/pull model would be more suitable
  - Security and access controls
    - Any GRIS can register itself with a GIIS
    - No access control when searching the GIS
  - Fault tolerance
    - No automatic failover mechanisms

Globus Information Services
- Other INFN customisations
  - INFN-GIS browser
  - Tools (MRTG based) to monitor LDAP servers
    - Entries returned
    - Connections
- On-going MDS-2.1 alpha evaluation

INFN-GIS browser
[Screenshot of the INFN-GIS browser (cgi-bin/mdsbrowse1.pl)]

Resource Management
- Evaluation of Globus GRAM
  - Focus on possible use of GRAM as a uniform interface to different underlying local resource management systems (LRMS)
  - Tests with Condor, LSF and PBS as LRMS
    - INFN WAN Condor pool as Globus resource
  - The model is fine, but it lacks the “robustness” needed for real production environments
    - Memory leaks in the Globus job manager
      - Fixes provided by our group were fed back to Globus
    - Scalability (one job manager for each job)
    - Reliability (the job manager is not persistent)
    - Hopefully addressed with the new jobmanager (by the Condor team)
  - Globus GRAM integrated in the first workload management system prototype of the DataGrid project
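
The idea of a uniform interface in front of different LRMS can be sketched with a simple adapter layout. This is only an illustration of the pattern: LocalScheduler, FakeBatchSystem and Gatekeeper are hypothetical names, not Globus classes, and the backends are in-memory stand-ins rather than real Condor, LSF or PBS installations.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class LocalScheduler(ABC):
    """Hypothetical adapter for one local resource management system."""

    @abstractmethod
    def submit(self, executable: str, arguments: List[str]) -> str:
        """Queue a job and return its local job identifier."""


class FakeBatchSystem(LocalScheduler):
    """In-memory stand-in for a batch system such as Condor, LSF or PBS."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.jobs: List[List[str]] = []

    def submit(self, executable: str, arguments: List[str]) -> str:
        self.jobs.append([executable, *arguments])
        return f"{self.prefix}-{len(self.jobs)}"


class Gatekeeper:
    """Single entry point: the client uses the same call whatever the backend is."""

    def __init__(self, backends: Dict[str, LocalScheduler]) -> None:
        self.backends = backends

    def submit(self, lrms: str, executable: str, arguments: List[str]) -> str:
        if lrms not in self.backends:
            raise ValueError(f"unknown LRMS: {lrms}")
        return self.backends[lrms].submit(executable, arguments)


if __name__ == "__main__":
    gk = Gatekeeper({name: FakeBatchSystem(name) for name in ("condor", "lsf", "pbs")})
    print(gk.submit("pbs", "/bin/hostname", []))
    print(gk.submit("condor", "/bin/date", ["-u"]))
```

The client code never changes when a site swaps one batch system for another; only the adapter registered for that site does.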

INFN WAN Condor pool
- Single pool
  - To optimize CPU usage of all INFN hosts
  - > 200 machines
  - Mainly Linux and Digital Unix machines
  - Spread across the different INFN sites
- Sub-pools
  - To define policies/priorities on resource usage
- Multiple checkpoint servers
  - To guarantee the performance and the efficiency of the system
  - To reduce network traffic for checkpointing activity
- General purpose computing facility for all INFN users
  - Different kinds of applications
- Allocation time for Condor jobs
  - January 00 – December 00: > 45 years

Resource Management
- GRAM Reporter (information providers), in particular for farms
  - Many useless attributes (at least for our needs), attributes not calculated (always defined as 0), some attributes not properly calculated by Globus shell scripts
  - Some important information describing the farms and the submitted jobs (necessary, for example, for a resource broker) is missing
  => We are addressing this problem in the context of the DataGrid Project
- Submission of Condor jobs to Globus resources
  - Condor-G
    - Useful as a reliable job submission service
      - Persistent queue of jobs
      - Logging information
      - Exploitation of the new persistent Globus jobmanager (hopefully in the next release)
      - Reliable (two-phase commit) submission protocol (hopefully in the next release)
    - Exploited in the first workload management system prototype of the DataGrid project as job submission service
  - GlideIn
- Evaluation of MPICH-G2 vs. MPICH
  - Some shortcomings found (lack of support for shared memory, worse latency for small messages with respect to MPICH)
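
Why a two-phase commit makes submission reliable can be shown with a toy exchange. The sketch below is generic and does not reproduce the actual Globus or Condor-G protocol: JobManager, prepare, commit and the request identifiers are hypothetical. The point it illustrates is that a client which loses the reply to its request can safely retry without creating a duplicate job, and that nothing runs until the client confirms.

```python
from typing import Dict


class JobManager:
    """Toy server side: jobs start as 'pending' and run only after commit."""

    def __init__(self) -> None:
        self.jobs: Dict[str, str] = {}          # job id -> state
        self._by_request: Dict[str, str] = {}   # client request id -> job id

    def prepare(self, request_id: str) -> str:
        # Phase 1 is idempotent: a retried request returns the existing job id.
        if request_id not in self._by_request:
            job_id = f"job-{len(self.jobs) + 1}"
            self._by_request[request_id] = job_id
            self.jobs[job_id] = "pending"
        return self._by_request[request_id]

    def commit(self, job_id: str) -> None:
        # Phase 2: the client confirms it knows the job id, so the job may start.
        if self.jobs[job_id] == "pending":
            self.jobs[job_id] = "running"


def submit(server: JobManager, request_id: str, lose_first_reply: bool = False) -> str:
    """Client side: obtain a job id (retrying if the reply was lost), then commit."""
    job_id = server.prepare(request_id)
    if lose_first_reply:
        # The client cannot tell whether the request arrived, so it sends it again.
        job_id = server.prepare(request_id)
    server.commit(job_id)
    return job_id


if __name__ == "__main__":
    server = JobManager()
    print(submit(server, "req-1", lose_first_reply=True), server.jobs)
```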

Data management
- Tests with GASS
- Tests with GridFTP alpha release 2
  - Capability of resuming an interrupted file transfer successfully tested
  - Support for the GSI authentication mechanisms successfully tested
  - Throughput tests
    - Increasing number of parallel streams and fixed file size
    - Increasing file size and fixed number of streams
    - Increasing TCP buffer size
    - Increasing block size
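
One reason TCP buffer size and the number of parallel streams are natural parameters for such throughput tests is the bandwidth-delay product: a single TCP stream cannot move more than one window of data per round trip. The helper below is a back-of-the-envelope sketch; the link speed, round-trip time and window size in the example are illustrative values, not measurements from the INFN tests, and the bound ignores packet loss and the capacity of the link itself.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a path of this bandwidth and RTT full."""
    return bandwidth_bits_per_s * rtt_s / 8


def max_throughput_bits_per_s(window_bytes: float, rtt_s: float, streams: int = 1) -> float:
    """Upper bound on aggregate throughput: one window per round trip, per stream."""
    return streams * window_bytes * 8 / rtt_s


if __name__ == "__main__":
    rtt = 0.05  # 50 ms round-trip time (illustrative)
    # Buffer needed to fill a 100 Mbit/s path: 625000 bytes.
    print(bandwidth_delay_product_bytes(100e6, rtt))
    # A 64 KiB window limits one stream to about 10.5 Mbit/s on the same path ...
    print(max_throughput_bits_per_s(64 * 1024, rtt) / 1e6)
    # ... while four such streams can reach about 41.9 Mbit/s in aggregate.
    print(max_throughput_bits_per_s(64 * 1024, rtt, streams=4) / 1e6)
```

Under these assumptions, either a larger buffer or more parallel streams raises the ceiling, which is why both are varied independently in the tests listed above.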

Other services
- Fault Monitoring (HBM)
  - Evaluation of HBM for fault detection (for “system” and “user” processes)
  - … but the HBM package is not seeing active development
- Execution Environment Management (GEM)
  - Evaluation of GEM as a service for “code migration”
  - … but Globus now provides only limited capabilities (executable staging)

Globus installation tools
- Various problems installing and deploying Globus using the standard install procedures
  - Installation and configuration partially manual (error prone)
  - Very long compilation time
  - No hooks for local customizations
  - ...
=> INFN-GRID Globus installation toolkit
  - To shorten the installation time of the Globus toolkit
  - Support for specific customisations
  - Quick distribution of patches
  - Support for distribution of new tools and packages

INFN-GRID Installation toolkit
- Characteristics
  - Distribution of binary files
  - Distribution of the packages needed to install/use Globus
  - Distribution of various Globus flavoured compilations (kerberos, MPICH, AFS)
  - Support for the most used platforms in the HENP community (Linux RH, Solaris)
  - Binary file relocation supported
  - Latest patches included (e.g. fixes for Globus jobmanager memory leaks)
  - Support for local customisations (hook to support different CAs, support for different GIS configurations, support for different LRMS, …)
  - Support for distribution of new tools and packages (certretrieve, GDMP, …)
  - Upgrade and uninstall procedures
  - Documentation
- Proven to be successful
  - Used to set up an INFN GRID Testbed, and also used outside INFN (CERN, FNAL, …)
  - Used as installation tool for DataGrid Testbed 0

Conclusions
- The Globus toolkit can provide basic services useful to create and deploy usable Grids, but various shortcomings and issues must be addressed
- Other info
  - Report on the INFN-GRID Globus Evaluation: evaluation.pdf
  - Response from Globus team to “Report on the INFN-GRID Globus Evaluation”