BDII Performance Tests
Felix Ehm, CERN IT/GD

Content
- The BDII
  - Introduction
  - Architecture
- GLUE Schema
  - Purpose
  - Latest News
- BDII Performance Tests
  - Reasons
  - Test Setup
  - Relational vs. LDAP backend
  - Results
- Future

Felix Ehm, CERN 2008

The BDII

The BDII

What is it?
- Berkeley Database Information Index
- Main purpose: provide a way to discover services in a Grid infrastructure
- Evolved from Globus MDS
- Uses the OpenLDAP server (and Berkeley database) internally

What is it used for?
- Publishing resource/service status information
- Matchmaking of jobs/resources
- Monitoring
- Accounting

Who uses it?
- Nearly every gLite component (SE, CE, WMS, UI, ...)

The BDII: Architecture
- One core component (BDII)
- Site-, resource- and top-level BDIIs differ only in their configuration
- Information flow follows the 'pull' principle
- Uses OpenLDAP to pull, store and provide information

[Figure: example for a top-level BDII. Diagram labels: Information Flow, Incoming Requests, Port forwarder, BDII (serving old requests / serving new requests), Provider, Site-BDII (x3)]

GLUE Schema

GLUE Schema

What is it?
- Grid Laboratory Uniform Environment
- Defines a common conceptual data model to be used for Grid resource/service discovery
- The working group is part of the Open Grid Forum (OGF)
- Available as version 1.3 (http://forge.ogf.org/)

Latest news:
- GLUE 2.0 in progress:
  - Designed to address the problems found in 1.3
  - Not backward compatible with 1.3
  - Computing schema almost finished
  - Storage schema is now the hot topic
  - When will it be deployed?

BDII Performance Tests

BDII Performance Tests

Why?
- No existing performance characterization
- Users complain about request timeouts

What do we test?
- Request handling rate
- Effect of data size (currently 250 sites, ~30 MB)
- How well does it scale (when do timeouts occur)?
- In fact, we test the OpenLDAP server

BDII Performance Tests

Test setup:
- 9 dedicated worker nodes
- Each test issues a number of single or mixed queries in parallel against one top-level BDII instance for a period of x seconds
- 15 s timeout limit
- A set of bash scripts for
  - preparing the machines
  - executing the test

Tuning the test results:
- Ignore results at the beginning
- Observe the system in a 'stable' state
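
The measurement loop described in the test setup can be sketched in Python. This is an illustration only: the original tests used bash scripts and the native OpenLDAP client, and the names `run_load_test`, `summarize` and the `query` callable are hypothetical. The sketch classifies a request as timed out after the fact instead of aborting it, and it drops samples from a warm-up phase so that only the 'stable' state is measured.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(query, workers, duration_s, timeout_s=15.0, warmup_s=0.0):
    """Call `query` repeatedly from `workers` parallel loops for `duration_s` seconds.

    Returns (start_offset_s, elapsed_s, ok) tuples, excluding samples that
    started during the first `warmup_s` seconds.
    """
    t0 = time.monotonic()
    deadline = t0 + duration_s

    def worker():
        samples = []
        while time.monotonic() < deadline:
            start = time.monotonic()
            try:
                query()
                ok = True
            except Exception:
                ok = False
            elapsed = time.monotonic() - start
            # The request is not aborted; it is only marked as failed
            # if it took longer than the timeout limit.
            samples.append((start - t0, elapsed, ok and elapsed <= timeout_s))
        return samples

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
        results = [s for f in futures for s in f.result()]

    # Ignore results at the beginning (warm-up phase).
    return [s for s in results if s[0] >= warmup_s]


def summarize(samples, measured_s):
    """Aggregate request counts and the successful-request rate."""
    ok = [s for s in samples if s[2]]
    return {
        "requests": len(samples),
        "timeouts_or_errors": len(samples) - len(ok),
        "rate_per_s": len(ok) / measured_s if measured_s > 0 else 0.0,
    }


if __name__ == "__main__":
    # Stand-in for a real LDAP search: sleeps for 10 ms.
    fake_query = lambda: time.sleep(0.01)
    samples = run_load_test(fake_query, workers=4, duration_s=1.0, warmup_s=0.2)
    print(summarize(samples, measured_s=0.8))
```

In a real run, `query` would wrap one connect/search/disconnect cycle against the BDII under test.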

BDII Performance Tests: Relational vs. LDAP data model test setup
- LDAP2SQL conversion tool (https://twiki.cern.ch/twiki/bin/view/Main/BDIIRelationalDBBackend)
- 30K LDIF entries, ~120K rows
- MySQL 4.1, same hardware as the OpenLDAP server
- Oracle 10.2 RAC, 2-node database cluster
- Also tested for completeness
- Native OpenLDAP client connects, searches, disconnects
- Difficult to do the same for a relational database
- Not a normal scenario for a relational DB

BDII Performance Tests: Client Execution Time Test

Which client implementation should be used for the LDAP vs. relational model test?

Reason:
- Minimize client execution latency
- Find a common client

Comparison: [figure]

Result:
- No common (fast) implementation
- Perl for relational
- Native OpenLDAP client for LDAP

BDII Performance Tests: Results

Results: OpenLDAP server with indexed/non-indexed DB
- Indexed DB is nearly 100 times faster than non-indexed
- CPU load on the indexed DB is ~10 times lower
- More CPU capacity is left to handle other requests

Results: comparison of OpenLDAP 2.2 (SLC4), 2.1 (SLC3) and 2.2 on a 4-core machine
- Version 2.2 scales much better than 2.1 on the same hardware
  - At 90 parallel requests, ~20% faster than 2.1
- Version 2.2 on a 4-core machine
  - ~65% faster than on dual core (32% speedup/core)
  - ~117% faster than 2.1

Results: multiple queries issued against a running top-level BDII instance with 3 switching DBs

Results: MySQL, Oracle and LDAP multi-query results
- Each worker node continuously spawns one request

Result: Effect of Data Size
- Current data size: ~30 MB
- OpenLDAP serves data very well (close to the network interface limit)
- Clients retrieve the requested information within the given timeout (15 s):

Data size:          100 KB   1 MB    10 MB
Parallel requests:  ~2000    ~200    ~18
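
The "close to network interface limit" remark can be checked with a rough calculation. The sketch below assumes decimal units (100 K = 100,000 bytes), assumes every request returns the full data size, and ignores LDAP protocol overhead; it gives a lower bound on aggregate throughput because the requests only need to finish within the 15 s timeout. The function name `min_throughput_mbit_s` is hypothetical.

```python
TIMEOUT_S = 15  # timeout limit from the test setup

# (data size in bytes, approximate number of parallel requests served)
# Assumption: "100K" means 100 * 10**3 bytes and "MB" means 10**6 bytes.
MEASUREMENTS = [
    (100 * 10**3, 2000),
    (1 * 10**6, 200),
    (10 * 10**6, 18),
]


def min_throughput_mbit_s(size_bytes, parallel_requests, timeout_s=TIMEOUT_S):
    """Lower bound on aggregate throughput if all requests finish within the timeout."""
    total_bits = size_bytes * parallel_requests * 8
    return total_bits / timeout_s / 1e6


for size, n in MEASUREMENTS:
    mbit = min_throughput_mbit_s(size, n)
    print(f"{size / 1e6:>5.1f} MB x {n:>4} requests: at least {mbit:.0f} Mbit/s")
```

Under these assumptions the three measurements come out at roughly 107, 107 and 96 Mbit/s, so the product of data size and parallel requests stays nearly constant. The interface speed itself is not stated in the slides.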

Result: Relational Model vs. LDAP
- Returned data size differs although the information content is the same
  - The OpenLDAP server also sends the objectclass and attribute names
- Small dataset (169 entries)
  - MySQL ~70% faster
  - Oracle ~429% faster
- Big dataset (8185 entries)
  - MySQL ~411% faster
  - Oracle ~1500% faster

Conclusion

BDII:
- Indexes help a lot to improve performance
- Handles ~100 parallel requests with a small dataset very well (< 2 s)
- Clients are advised to use queries that return a small dataset
  - NO (objectClass=*) SEARCHES!

However:
- Adding the full content every refresh cycle loads the machine
- Implementations of a relational model showed better performance and should be considered for future developments
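
As an illustration of the advice on small result sets, the sketch below builds (but does not run) an `ldapsearch` command with a narrow filter and an explicit attribute list, and rejects the catch-all filter. The helper name `ldapsearch_cmd` and the host `bdii.example.org` are hypothetical. Port 2170, the base `o=grid` and the GLUE 1.3 names `GlueCE`, `GlueCEStateStatus` and `GlueCEUniqueID` are assumptions that do not come from the slides.

```python
import shlex


def ldapsearch_cmd(host, ldap_filter, attributes, port=2170, base="o=grid"):
    """Build (but do not run) an ldapsearch command line for a BDII query."""
    normalized = ldap_filter.replace(" ", "").lower()
    if normalized in ("(objectclass=*)", "objectclass=*"):
        raise ValueError("refusing a catch-all (objectClass=*) search")
    return [
        "ldapsearch",
        "-x",                 # simple authentication (anonymous bind)
        "-LLL",               # plain LDIF output without comments
        "-H", f"ldap://{host}:{port}",
        "-b", base,           # search base (assumed value)
        ldap_filter,
        *attributes,          # return only the attributes that are needed
    ]


if __name__ == "__main__":
    cmd = ldapsearch_cmd(
        "bdii.example.org",  # hypothetical host
        "(&(objectClass=GlueCE)(GlueCEStateStatus=Production))",
        ["GlueCEUniqueID"],
    )
    print(shlex.join(cmd))
```

Restricting both the filter and the returned attributes keeps the response small, which is what the results on data size suggest matters most.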

Future

Future

Compressed content exchange:
- Data is exchanged in compressed format
  - 30 MB LDIF is reduced to 1.4 MB
- Speeds up fetching data from site-level BDIIs
- Decreases information age
- Prototype ready

Splitting dynamic and static information:
- Reduces the amount of data being populated

More information on plans: https://twiki.cern.ch/twiki//bin/view/EGEE/InfoPlan

Support:
- http://twiki.cern.ch/twiki/bin/view/EGEE/InformationSystem
- http://twiki.cern.ch/twiki/bin/view/EGEE/BDII
- http://twiki.cern.ch/twiki/bin/view/EGEE/GIP
- http://twiki.cern.ch/twiki/bin/view/EGEE/GlueUse
- http://twiki.cern.ch/twiki/bin/view/EGEE/InfoTrouble
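
LDIF compresses well because attribute names and DN suffixes repeat in every entry. The sketch below shows this on synthetic data. The slides do not say which compression format the prototype uses, so `zlib` is only a stand-in, and the generated entries and the helper names `make_ldif`, `compress_ldif` and `decompress_ldif` are hypothetical. The ratio on synthetic data will not match the 30 MB to 1.4 MB figure.

```python
import zlib


def make_ldif(n_entries):
    """Generate synthetic LDIF-like text (illustrative, not real BDII content)."""
    entries = []
    for i in range(n_entries):
        entries.append(
            f"dn: GlueCEUniqueID=ce{i}.example.org,mds-vo-name=local,o=grid\n"
            "objectClass: GlueCE\n"
            f"GlueCEUniqueID: ce{i}.example.org\n"
            "GlueCEStateStatus: Production\n"
        )
    return "\n".join(entries)


def compress_ldif(text):
    """Compress LDIF text before sending it to the next BDII level."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_ldif(blob):
    """Restore the original LDIF text on the receiving side."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    ldif = make_ldif(1000)
    blob = compress_ldif(ldif)
    print(f"original: {len(ldif.encode('utf-8'))} bytes, compressed: {len(blob)} bytes")
```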

Questions?