EGEE-II INFSO-RI
Enabling Grids for E-sciencE
EGEE and gLite are registered trademarks

Performance Improvements to BDII - Grid Information Service in EGEE
J. Astalos, Ł. Flis, M. Radecki, W. Ziajka
CYFRONET, IISAS
EGEE CE Regional Operations Centre
Cracow Grid Workshop '07

Outline
Background
BDII and EGEE Information Service
Performance issues in a large-scale production Grid
Problem analysis: Central European case
Improvements: quick fix and long term
Network load and stress test results
Summary & future work

Top level BDII Service
Information service based on the OpenLDAP catalogue service with LDBM as the database backend, delivered with the gLite middleware
Contains information about all EGEE services & resources
Used to discover these services, the state of resources and their attributes
Indispensable for proper operation of
  – Resource Brokers, to find grid clusters and match jobs
  – File Transfer Service, to find file transfer endpoints
  – Replica Management Services, to find necessary services
  – ...
"Freshness" of information: 2-3 minutes
Around 30 MB of data in total, constantly growing

EGEE - Information System Architecture
[Figure: Information system architecture, showing the Top-level BDII, Site BDIIs (Site 1, Site 2) and Grid Resource Information Services (GRISes)]
GRISes provide information about basic services
A Site BDII combines all services available at a site and exposes them to the "world"
Top-level BDIIs are the single point of access to the entire infrastructure
Information is pulled up the hierarchy: each Site BDII queries its GRISes, then the Top-level BDII scans the Site BDIIs

Production Infrastructure Issues
EGEE production infrastructure size
  – Rapid growth from some tens to more than 250 sites
  – More CPUs added, more users coming, more jobs running (~20k jobs running at any time)
  – Virtual Organizations matured to run large-scale computations
A single host at a single location cannot provide the information service
  – BDII updates its DB every 2-3 minutes
  – Scans all EGEE site BDIIs (~30 MB download)
  – Reloads the entire LDAP database
  – Rebuilds DB indices
  – ~30 MB of data searched by clients (users, services, jobs)
Need for regional Top-level BDIIs
  – Each of the 11 EGEE regions deployed a dedicated machine for it
  – Still too much load on the service

Problem analysis: CE case
Central European Region
  – 24 sites, ~1k jobs on average
BDII Service Load
  – Operation
      ~20 GB of data downloaded from sites daily at a 2 min. update rate
  – Client side
      65 MB streamed to clients per minute
      71 queries/minute
      1.1 MB/query: are all services using BDII efficiently?
Network Load
  – Number of Top-level BDIIs in the EGEE Grid
      Officially registered in GOC DB: 72
      Derived from Site BDII logs: 130
  – Each TL BDII downloads all Site BDII info
      e.g. CYFRONET-LCG2 site BDII size: ~0.26 MB
      At a 2 min. TL BDII update rate this gives ~1 GB/hour from the site (0.26 MB × 130 TL BDIIs × 30 updates/hour)
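
The load figures above can be cross-checked with a few lines of arithmetic. This is only a back-of-envelope sketch: it assumes that the ~20 GB daily download corresponds to the ~30 MB full snapshot quoted earlier being fetched on every 2-minute cycle, and that the ~1 GB/hour site figure uses the 130 TL BDIIs seen in the site logs. The variable names are illustrative.

```python
# Back-of-envelope check of the load figures quoted on the slide.
UPDATE_PERIOD_S = 120                         # 2 min. update rate (from the slide)
updates_per_hour = 3600 // UPDATE_PERIOD_S    # 30 cycles per hour
updates_per_day = 24 * updates_per_hour       # 720 cycles per day

# Download side: one TL BDII pulling a full snapshot every cycle.
# Assumption: the snapshot is the ~30 MB total quoted for the whole EGEE grid.
full_snapshot_mb = 30
daily_download_gb = full_snapshot_mb * updates_per_day / 1000

# Site side: every TL BDII pulls the whole site BDII every cycle.
site_bdii_mb = 0.26      # CYFRONET-LCG2 site BDII size
tl_bdii_count = 130      # TL BDIIs derived from Site BDII logs
site_outbound_mb_per_hour = site_bdii_mb * tl_bdii_count * updates_per_hour

print(f"TL BDII download: {daily_download_gb:.1f} GB/day")
print(f"Site BDII upload: {site_outbound_mb_per_hour:.0f} MB/hour")
```

Both results land close to the rounded values on the slide (about 20 GB per day and about 1 GB per hour). With only the 72 officially registered TL BDIIs the site figure would be roughly half, which suggests the slide's estimate is based on the log-derived count.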

Quick Fixes in CE region
Migrate to Scientific Linux 4
  – Made it possible to use BDB (Berkeley Database) as the OpenLDAP backend
BDII DNS pool
  – The BDII service is stateless, so it fits easily into a DNS round-robin pool (service distributed to Croatia, Czech Republic and Poland under the common name bdii.cyf-kr.edu.pl)
  – Load balancing
  – Failover: clients just use another instance from the DNS pool
  – Special procedure for downtime or maintenance periods
Database indices
  – Speed up the most common queries
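
Client-side failover against a DNS round-robin pool can be sketched as follows. This is a minimal illustration, not the procedure used in the CE region: the helper names `resolve_pool` and `first_reachable` are hypothetical, and port 2170 is assumed as the BDII LDAP port because the slide does not state one. The resolver and connect functions are parameters so that the logic can be exercised without network access.

```python
import socket

BDII_PORT = 2170  # assumption: the slide does not state the port


def resolve_pool(name, port=BDII_PORT, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses published under one DNS name."""
    addresses = []
    for info in resolver(name, port, proto=socket.IPPROTO_TCP):
        ip = info[4][0]          # sockaddr is the 5th field; its first item is the IP
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def first_reachable(addresses, port=BDII_PORT, timeout=5.0,
                    connect=socket.create_connection):
    """Try each pool member in turn and return the first one that accepts a connection."""
    for ip in addresses:
        try:
            conn = connect((ip, port), timeout)
        except OSError:
            continue             # instance down or unreachable: fall through to the next one
        conn.close()
        return ip
    return None                  # every instance in the pool failed
```

A client would call `first_reachable(resolve_pool("bdii.cyf-kr.edu.pl"))` and send its query to the returned address. Because the service is stateless, any instance can answer any query, which is what makes this simple retry loop sufficient.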

Current Architecture
[Figure: Current BDII architecture, showing GRISes, Site 1 ... Site n, the Regional TL-BDII Pool and Information system clients]
Each TL BDII asks for the entire site info, generating a huge network load
BDII could limit transfers to changed entries only
The DB is reloaded after each update, so indices need to be rebuilt
Hierarchy could be improved at the top level

Our approach: Architecture Improved
[Figure: Target BDII architecture, showing Site 1 ... Site n, the Master TL-BDII, Slave TL-BDIIs and Information system clients]
Use difference info instead of downloading the entire site info: easy with LDAP timestamps, less bandwidth used
Improved hierarchy at the top level: only one host (Master) scans all sites and provides information to Slave BDIIs
Slave BDIIs exchange only differences with the Master
Modify the TL BDII DB on the fly with the ldapmodify tool: no need to restart the DB and rebuild indices
ldapmodify automatically rejects entries that do not comply with the schema
First prototype implemented and running at IISAS and CYFRONET
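
The difference-based update can be illustrated with a small sketch that turns two snapshots of a directory into LDIF change records of the kind ldapmodify accepts. This is not the prototype's code: the slide mentions LDAP timestamps for detecting changes, whereas this sketch simply compares two in-memory snapshots, and the function name `diff_to_ldif` and the dict-based data layout are assumptions made for illustration. Values that would need base64 encoding in LDIF are not handled.

```python
def diff_to_ldif(old, new):
    """Build LDIF change records that turn snapshot `old` into snapshot `new`.

    Both snapshots map a DN to a dict of attribute name -> list of string values.
    """
    records = []

    # Entries that appeared since the last cycle.
    for dn in sorted(new.keys() - old.keys()):
        lines = [f"dn: {dn}", "changetype: add"]
        for attr, values in sorted(new[dn].items()):
            lines += [f"{attr}: {value}" for value in values]
        records.append("\n".join(lines))

    # Entries that disappeared.
    for dn in sorted(old.keys() - new.keys()):
        records.append(f"dn: {dn}\nchangetype: delete")

    # Entries present in both: emit only the attributes that differ.
    for dn in sorted(old.keys() & new.keys()):
        mods = []
        for attr in sorted(old[dn].keys() | new[dn].keys()):
            before, after = old[dn].get(attr), new[dn].get(attr)
            if before == after:
                continue
            if after is None:
                mods.append(f"delete: {attr}\n-")
            else:
                op = "add" if before is None else "replace"
                body = [f"{op}: {attr}"] + [f"{attr}: {value}" for value in after]
                mods.append("\n".join(body + ["-"]))
        if mods:
            records.append("\n".join([f"dn: {dn}", "changetype: modify"] + mods))

    # LDIF separates records with a blank line.
    return "\n\n".join(records) + ("\n" if records else "")
```

Feeding the resulting text to ldapmodify applies the changes to a running server, which is why no database reload or index rebuild is needed. An unchanged snapshot produces an empty string, so a quiet cycle costs nothing on the wire.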

Results: Network Load
Measured the amount of data needed for one TL BDII update cycle
Amount of data transferred scales linearly with the number of sites
Using differences instead of the entire DB limits data exchange to ~40% of the classic approach
Could be improved further if:
  – disk space were not reported in kBytes
  – ERT (estimated response time) were not reported in seconds; these are the most frequently changing fields
  – dynamic and static data were separated (do not download the entire record if only 5% changed)
Additional compression can reduce the data transferred to 10%
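
The point about units can be made concrete with a toy comparison. The sketch below counts which fields of a record differ between two update cycles, first with raw values and then with coarser units. The field names (`free_space_kb`, `ert_s`, `status`), the sample values and the chosen granularity (whole GB, whole minutes) are all hypothetical and are not taken from the measurements.

```python
KB_PER_GB = 1024 * 1024


def changed_fields(old, new):
    """Names of fields whose value differs between two versions of a record."""
    return sorted(name for name in new if old.get(name) != new[name])


def coarsen(record, kb_fields=("free_space_kb",), sec_fields=("ert_s",)):
    """Return a copy with kB values floored to whole GB and seconds to whole minutes."""
    out = dict(record)
    for name in kb_fields:
        out[name] = record[name] // KB_PER_GB
    for name in sec_fields:
        out[name] = record[name] // 60
    return out


# Two consecutive update cycles for one hypothetical record.
before = {"free_space_kb": 52_428_900, "ert_s": 130, "status": "Production"}
after = {"free_space_kb": 52_430_100, "ert_s": 145, "status": "Production"}

print(changed_fields(before, after))                    # raw units
print(changed_fields(coarsen(before), coarsen(after)))  # coarse units
```

With raw units both numeric fields count as changed, so the record would be part of the difference set; with coarse units nothing changes and nothing needs to be sent. Values that cross a unit boundary still show up as changes, so coarser units reduce the traffic but do not eliminate it.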

Results: Stress tests
Two twin machines for tests
  – gLite 3.1 TL BDII service
  – Differential TL BDII
Stress test scheme
  – Up to 60 ldapsearch processes "attacking" the TL BDII from different hosts
  – Measure the response time for one simple query issued from a probe host
      A reply taking longer than 30 s is counted as an error
Differential TL BDII shows better performance
  – No DB reload: DB updated via the ldapmodify tool
  – Less network-intensive update
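
The measurement scheme can be sketched in a few lines. This is a simplified stand-in, not the actual test harness: the slide describes ldapsearch processes running on different hosts, while the sketch uses threads on one machine and an arbitrary `query` callable (for example a hypothetical wrapper that runs one ldapsearch). Only the 30-second error threshold comes from the slide; the function names are illustrative.

```python
import threading
import time

ERROR_THRESHOLD_S = 30.0  # from the slide: a reply slower than 30 s is an error


def probe(query, threshold=ERROR_THRESHOLD_S, clock=time.monotonic):
    """Time one query and return (elapsed_seconds, ok)."""
    start = clock()
    try:
        query()
    except Exception:
        return clock() - start, False      # a failed query is an error regardless of timing
    elapsed = clock() - start
    return elapsed, elapsed <= threshold


def run_under_load(query, workers, probes):
    """Measure `probes` queries while `workers` background threads keep querying."""
    stop = threading.Event()

    def hammer():
        # Background load: repeat the query until told to stop, ignoring failures.
        while not stop.is_set():
            try:
                query()
            except Exception:
                pass

    threads = [threading.Thread(target=hammer, daemon=True) for _ in range(workers)]
    for thread in threads:
        thread.start()
    try:
        return [probe(query) for _ in range(probes)]
    finally:
        stop.set()
        for thread in threads:
            thread.join()
```

Running the same function against both test machines with an increasing `workers` value would give comparable response-time series for the two services.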

Summary and future work
Service scalability is an essential problem in real production grids
CE ROC runs a stable instance of the BDII service, geographically distributed across 3 locations
Room for improvement in the EGEE Top-level BDII to reduce data transfers and improve service robustness
  – Improve the hierarchy at the top level by introducing Slave TL BDIIs
  – Use difference info instead of all info, and compress it before sending
  – Separate dynamic and static data
  – Efficiency of the WMS matchmaking process can be improved by VO-level BDIIs holding only information relevant to the VOs supported by the WMS. Especially significant for small or regional VOs.
A new differential BDII was deployed by CE ROC and the stress test results are promising. It still needs to be validated.