EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Performance Improvements to BDII - Grid Information.

Presentation transcript:

EGEE-II INFSO-RI Enabling Grids for E-sciencE
EGEE and gLite are registered trademarks

Performance Improvements to BDII - Grid Information Service in EGEE
J. Astalos, Ł. Flis, M. Radecki, W. Ziajka
CYFRONET, IISAS
EGEE CE Regional Operations Centre
Cracow Grid Workshop '07

Outline
- Background: BDII and EGEE Information Service
- Performance issues in large-scale production Grid
- Problem analysis: Central European case
- Improvements: quick fix and long term
- Network load and stress test results
- Summary & future work

Top level BDII Service
- Information service based on the OpenLDAP catalogue service with LDBM as the database backend, delivered with gLite middleware
- Contains info about all EGEE services & resources
- Used to discover the state and attributes of these services/resources
- Indispensable for proper operation of
  - Resource Brokers, to find grid clusters and match jobs
  - File Transfer Service, to find file transfer endpoints
  - Replica Management Services, to find necessary services
  - ...
- "Freshness" of information: 2-3 minutes
- Around 30 MB of data in total, constantly growing

EGEE - Information System Architecture
[Diagram: Information system architecture. A Top-level BDII sits above the Site BDIIs (Site 1, Site 2), which sit above the Grid Resource Information Services (GRISes).]
- GRISes provide information about basic services
- Site BDII combines all services available at a site and exposes them to the "world"
- Top-level BDIIs are the single point of access to the entire infrastructure
- Information is pulled top-down, i.e. each Site BDII asks its GRISes, then the Top-level BDII scans the sites for information

Production Infrastructure Issues
- EGEE Production infrastructure size
  - Rapid growth from some tens to more than 250 sites
  - More CPUs added, more users coming, more jobs running (~20k jobs running at any time)
  - Virtual Organizations matured to run large-scale computations
- No way to provide the information service from one host & location
  - BDII updates its DB every 2-3 minutes
  - Scans all EGEE site BDIIs (~30 MB download)
  - Reloads entire LDAP database
  - Rebuilds DB indices
  - ~30 MB of data searched by clients (users, services, jobs)
- Need for regional Top Level BDIIs
  - Each of 11 EGEE regions deployed a dedicated machine for it
  - Still too much load on the service

Problem analysis: CE case
- Central European Region
  - 24 sites, ~1k jobs on average
- BDII Service Load
  - Operation
    - ~20 GB of daily data download from sites at 2 min. update rate
  - Client side
    - 65 MB streamed to clients per minute
    - 71 queries/minute
    - 1.1 MB/query: are all services using BDII efficiently?
- Network Load
  - Number of Top-level BDIIs in EGEE Grid
    - Officially registered in GOC DB: 72
    - Derived from Site BDII logs: 130
  - Each TL BDII downloads all Site BDII info
    - e.g. CYFRONET-LCG2 site BDII size: ~0.26 MB
    - At a 2 min. TL BDII update rate this gives ~1 GB/hour from the site
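The download and upload figures on this slide can be roughly reproduced from the other numbers in the talk. The sketch below is only a back-of-the-envelope check: it assumes a fixed 2-minute update period, the ~30 MB total quoted for the Top-level BDII database, and that all 130 TL BDIIs seen in the site logs poll the site on every cycle. The variable names are illustrative.

```python
# Back-of-the-envelope check of the load figures quoted on the slide.
# All inputs are the slide's rounded numbers; names are illustrative.

UPDATE_PERIOD_MIN = 2     # TL BDII update period in minutes
TOP_LEVEL_DB_MB = 30      # approximate size of all site data downloaded per cycle
SITE_BDII_MB = 0.26       # size of the CYFRONET-LCG2 site BDII
TL_BDII_COUNT = 130       # TL BDIIs derived from the site BDII logs

updates_per_hour = 60 // UPDATE_PERIOD_MIN
updates_per_day = 24 * updates_per_hour

# Data one TL BDII pulls from all sites in a day.
daily_download_gb = TOP_LEVEL_DB_MB * updates_per_day / 1000

# Data one site BDII serves to all TL BDIIs in an hour.
site_upload_gb_per_hour = SITE_BDII_MB * TL_BDII_COUNT * updates_per_hour / 1000

print(f"{updates_per_day} update cycles per day")
print(f"TL BDII download: ~{daily_download_gb:.1f} GB/day")
print(f"Site BDII upload: ~{site_upload_gb_per_hour:.2f} GB/hour")
```

Under these assumptions the estimates land close to the slide's "~20 GB" per day and "~1 GB/hour" figures.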

Quick Fixes in CE region
- Migrate to Scientific Linux 4
  - Allowed the use of BDB (Berkeley Database) as the OpenLDAP backend
- BDII DNS pool
  - The BDII service is stateless, which makes a DNS round-robin pool easy (service distributed to Croatia, Czech Republic and Poland under the common name bdii.cyf-kr.edu.pl)
  - Load balancing
  - Failover: clients just use another instance from the DNS pool
  - Special procedure for downtime or maintenance periods
- Database Indices
  - Speed up the most common queries
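The failover behaviour of a DNS round-robin pool can be sketched from the client side as follows. This is a minimal illustration, not the mechanism gLite clients actually use: the function names `resolve_pool` and `first_reachable` are hypothetical, and the `connect` parameter exists only so the logic can be exercised without a network.

```python
import socket
from typing import Callable, Iterable, List, Optional, Tuple

Endpoint = Tuple[str, int]


def resolve_pool(hostname: str, port: int) -> List[Endpoint]:
    """Return every distinct (address, port) the DNS name resolves to."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    endpoints: List[Endpoint] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        endpoint = (sockaddr[0], sockaddr[1])
        if endpoint not in endpoints:
            endpoints.append(endpoint)
    return endpoints


def first_reachable(
    endpoints: Iterable[Endpoint],
    connect: Callable = socket.create_connection,
    timeout: float = 5.0,
) -> Optional[Endpoint]:
    """Try pool members in order; return the first that accepts a connection."""
    for endpoint in endpoints:
        try:
            conn = connect(endpoint, timeout)
        except OSError:
            continue  # this instance is down, fall through to the next one
        conn.close()
        return endpoint
    return None
```

A client could then call `first_reachable(resolve_pool("bdii.cyf-kr.edu.pl", 2170))`, where the port number is an assumption rather than something stated on the slide. Because the service is stateless, any instance that answers is as good as any other.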

Current Architecture
- Each TL BDII asks for the entire site info, generating huge network load
- BDII could limit itself to changed entries only
- DB is reloaded after each update, so indices need to be rebuilt
- Hierarchy could be improved at the top level
[Diagram: Current BDII architecture. Information system clients query a Regional TL-BDII Pool, which scans Site 1, Site 2, Site 3, ... Site n; each site aggregates its GRISes.]

Our approach: Architecture Improved
- Use difference info instead of downloading the entire site info: easy with LDAP timestamps, and less bandwidth is used
- Improved hierarchy at the top level: only one host (Master) scans all sites and provides information to Slave BDIIs
- Slave BDIIs exchange only differences with the Master
- Modify the TL BDII DB on the fly with the ldapmodify tool, so there is no need to restart the DB and rebuild indices
- ldapmodify automatically rejects entries that do not comply with the schema
- First prototype implemented and running at IISAS and CYFRONET
[Diagram: Target BDII architecture. Information system clients query Slave TL-BDIIs, which are fed by a Master TL-BDII that scans Site 1, Site 2, Site 3, ... Site n.]
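The idea of shipping only differences and applying them with ldapmodify can be illustrated by generating LDIF change records from two snapshots. The sketch below is not the prototype's code: it compares two full in-memory snapshots rather than using LDAP timestamps, the `diff_to_ldif` name and the snapshot layout (DN mapped to attribute mapped to list of values) are assumptions, and it ignores the base64 encoding that LDIF requires for some values.

```python
from typing import Dict, List

Entry = Dict[str, List[str]]      # attribute name -> values
Snapshot = Dict[str, Entry]       # DN -> entry


def diff_to_ldif(old: Snapshot, new: Snapshot) -> str:
    """Build LDIF change records that turn `old` into `new`."""
    records: List[str] = []

    # Entries that appeared since the last cycle.
    for dn in sorted(new.keys() - old.keys()):
        lines = [f"dn: {dn}", "changetype: add"]
        for attr, values in sorted(new[dn].items()):
            lines += [f"{attr}: {value}" for value in values]
        records.append("\n".join(lines))

    # Entries that disappeared.
    for dn in sorted(old.keys() - new.keys()):
        records.append(f"dn: {dn}\nchangetype: delete")

    # Entries present in both: emit only the attributes that changed.
    for dn in sorted(old.keys() & new.keys()):
        mods: List[str] = []
        for attr in sorted(old[dn].keys() | new[dn].keys()):
            before, after = old[dn].get(attr), new[dn].get(attr)
            if before == after:
                continue
            if after is None:
                mods += [f"delete: {attr}", "-"]
            else:
                mods += [f"replace: {attr}"]
                mods += [f"{attr}: {value}" for value in after]
                mods += ["-"]
        if mods:
            records.append("\n".join([f"dn: {dn}", "changetype: modify"] + mods))

    return "\n\n".join(records) + ("\n" if records else "")
```

The resulting text is in the change-record form that ldapmodify reads, so a slave could apply it to a running database without a reload. Unchanged attributes never appear in the output, which is where the bandwidth saving comes from.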

Results - Network Load
- Measured the amount of data needed for one TL BDII update cycle
- Amount of data transferred scales linearly with the number of sites
- Using differences instead of the entire DB limits data exchange to ~40% of the classic approach
- Could be improved further if:
  - disk space were not reported in kBytes
  - ERT time were not reported in seconds (these two are the most frequently changing fields)
  - dynamic and static data were separated (do not download the entire record if only 5% changed)
- Additional compression can reduce the data transferred to 10%
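The remark about units can be made concrete: a value published in kBytes or seconds changes on almost every cycle, while the same value in coarser units usually does not. The sketch below illustrates this with made-up attribute names and values. `changed_attrs`, the attribute names and the chosen granularities are all hypothetical, and ERT is taken to mean the estimated response time of a cluster.

```python
from typing import Dict, List, Optional


def changed_attrs(
    old: Dict[str, int],
    new: Dict[str, int],
    quantum: Optional[Dict[str, int]] = None,
) -> List[str]:
    """Return attributes whose values differ after rounding down to `quantum`."""
    quantum = quantum or {}
    changed = []
    for attr in sorted(old.keys() | new.keys()):
        if attr not in old or attr not in new:
            changed.append(attr)  # attribute was added or removed
            continue
        step = quantum.get(attr, 1)  # default granularity: the raw unit
        if old[attr] // step != new[attr] // step:
            changed.append(attr)
    return changed


# Two consecutive snapshots of one (made-up) record.
before = {"free_space_kb": 1_048_600_000, "ert_seconds": 125, "total_cpus": 64}
after = {"free_space_kb": 1_048_590_000, "ert_seconds": 131, "total_cpus": 64}

# Raw units: both dynamic fields count as changed.
print(changed_attrs(before, after))

# Coarser units (GB and minutes): nothing needs to be sent this cycle.
coarse = {"free_space_kb": 1024 * 1024, "ert_seconds": 60}
print(changed_attrs(before, after, coarse))
```

Rounding down to a bucket can still flip when a value crosses a bucket boundary, so this reduces the number of changes rather than eliminating them.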

Results - Stress tests
- Two twin machines for tests
  - gLite 3.1 TL BDII service
  - Differential TL BDII
- Stress test scheme
  - Up to 60 ldapsearch processes "attacking" the TL BDII from different hosts
  - Measure the response time for one simple query issued from a probe host
  - A reply taking more than 30 s raises an error
- Differential TL BDII shows better performance
  - No DB reload: the DB is updated via the ldapmodify tool
  - Less network-intensive update
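The stress test scheme can be sketched as a small harness: background workers keep issuing queries while a single probe query is timed and flagged if it exceeds the threshold. This is a simplified, single-host stand-in for the setup described above, which used separate ldapsearch processes on different hosts. The `probe_under_load` name and its parameters are hypothetical, and `query` is any callable that performs one request.

```python
import threading
import time
from typing import Callable, Dict


def probe_under_load(
    query: Callable[[], None],
    workers: int,
    warmup_s: float,
    timeout_s: float = 30.0,
) -> Dict[str, object]:
    """Time one probe query while `workers` threads keep querying in a loop."""
    stop = threading.Event()

    def hammer() -> None:
        # Load generators ignore failures; only the probe is measured.
        while not stop.is_set():
            try:
                query()
            except Exception:
                pass

    threads = [threading.Thread(target=hammer, daemon=True) for _ in range(workers)]
    for thread in threads:
        thread.start()

    time.sleep(warmup_s)  # let the background load build up

    start = time.monotonic()
    try:
        query()
        failed = False
    except Exception:
        failed = True
    elapsed = time.monotonic() - start

    stop.set()
    for thread in threads:
        thread.join()

    # Mirror the slide's rule: a reply slower than the threshold is an error.
    return {"elapsed_s": elapsed, "error": failed or elapsed > timeout_s}
```

In a real run `query` would wrap an LDAP search against the server under test, and the harness would be repeated for an increasing number of workers up to 60.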

Summary and future work
- In real production grids, service scalability is an essential problem
- CE ROC runs a stable instance of the BDII service, distributed geographically across 3 locations
- There is room for improvement in the EGEE Top level BDII to reduce data transfers and improve service robustness
  - Improve the hierarchy at the top level by introducing Slave TL BDIIs
  - Make use of difference info instead of all info, and compress it before sending
  - Separate dynamic and static data
  - Efficiency of the WMS matchmaking process can be improved by VO-level BDIIs holding only information relevant to the VOs supported by the WMS. This is especially significant for small or regional VOs.
- A new differential BDII was deployed by CE ROC and the stress test results are promising. It still needs to be validated.