BDII Performance Tests

BDII Performance Tests
Felix Ehm, CERN IT/GD

Content
  The BDII
    Introduction
    Architecture
  GLUE Schema
    Purpose
    Latest News
  BDII Performance Tests
    Reasons
    Test Setup
    Relational vs. LDAP backend
    Results
  Future
Felix Ehm, CERN 2008

The BDII

The BDII
  What is it?
    Berkeley Database Information Index
    Main purpose: provide a way to discover services in a Grid infrastructure
    Evolved from Globus MDS
    Uses the OpenLDAP server (and Berkeley DB) internally
  What is it used for?
    Publishing resource/service status information
    Matchmaking of jobs/resources
    Monitoring
    Accounting
  Who uses it?
    Nearly every gLite component (SE, CE, WMS, UI, ...)

The BDII: Architecture
  One core component (BDII)
  Site-, resource- and top-level BDIIs differ only in their configuration
  Information flow follows the ‘pull’ principle
  Uses OpenLDAP to pull, store and provide information
  Example for top-level: information flow
    [Figure: information flow in a top-level BDII. Labels: Incoming Requests, Port forwarder, BDII (serving old requests, serving new requests), Provider, three Site-BDIIs]
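The pull principle can be sketched roughly as follows. This is an illustration only, not the BDII implementation: `fetch_site`, `fake_fetch` and the site names are hypothetical stand-ins for LDAP queries against site-level BDIIs, entries are modelled as a dict keyed by DN, and skipping unreachable sites is an assumption made for the sketch.

```python
from typing import Callable, Dict, Iterable

Entry = Dict[str, str]        # attribute name -> value
Snapshot = Dict[str, Entry]   # DN -> entry


def pull_all(sites: Iterable[str],
             fetch_site: Callable[[str], Snapshot]) -> Snapshot:
    """Pull a snapshot from every site and merge the results into one view."""
    merged: Snapshot = {}
    for site in sites:
        try:
            merged.update(fetch_site(site))
        except OSError:
            # Assumption for this sketch: an unreachable site is skipped
            # so that it does not block the refresh of the others.
            continue
    return merged


# Hypothetical in-memory stand-in for two reachable site-level BDIIs.
FAKE_SITES: Dict[str, Snapshot] = {
    "site-a": {"GlueCEUniqueID=ce.a.example.org": {"GlueCEStateStatus": "Production"}},
    "site-b": {"GlueCEUniqueID=ce.b.example.org": {"GlueCEStateStatus": "Draining"}},
}


def fake_fetch(site: str) -> Snapshot:
    """Return the snapshot of a known site, or fail like a network error."""
    if site not in FAKE_SITES:
        raise OSError(f"cannot reach {site}")
    return FAKE_SITES[site]


view = pull_all(["site-a", "site-b", "site-down"], fake_fetch)
print(sorted(view))
```

The upper level decides when to ask, which is what distinguishes pull from a design in which sites push updates on their own schedule.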

GLUE Schema

GLUE Schema
  What is it?
    Grid Laboratory Uniform Environment
    Defines a common conceptual data model to be used for Grid resource/service discovery
    The working group is part of the Open Grid Forum (OGF)
    Available as version 1.3 ( http://forge.ogf.org/ )
  Latest news:
    GLUE 2.0 in progress:
      Designed to address the problems of 1.3
      Not backward compatible with 1.3
      Computing schema almost finished
      Storage schema is now the hot topic
      When will it be deployed?

BDII Performance Tests

BDII Performance Tests
  Why?
    No existing performance characterization
    Users complain about request timeouts
  What do we test?
    Request handling rate
    Effects of data size (currently 250 sites ~ 30 MB)
    How well do we scale (when do timeouts occur)?
    In fact, we test the OpenLDAP server

BDII Performance Tests
  Test setup:
    9 dedicated worker nodes
    Each issues a number of single or mixed queries in parallel against one top-level BDII instance for a period of x seconds
    15 s timeout limit
    A set of bash scripts for
      Preparing the machines
      Executing the test
  Tuning the test results:
    Ignore results at the beginning
    Observe the system in a ‘stable’ state
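The shape of such a test can be sketched as below. The original tests used bash scripts on separate worker nodes; this single-process Python version with threads is a swapped-in simplification, and `run_load_test`, its parameters and the stub query are hypothetical names chosen for illustration.

```python
import threading
import time
from typing import Callable, List, Tuple


def run_load_test(query: Callable[[], None],
                  parallel: int,
                  duration_s: float,
                  warmup_s: float = 0.0,
                  timeout_s: float = 15.0) -> Tuple[int, int]:
    """Issue `query` from `parallel` threads for `duration_s` seconds.

    Returns (completed, timed_out). Requests started during the warm-up
    period are ignored. A request counts as timed out when it took longer
    than `timeout_s`; the call itself is not interrupted.
    """
    start = time.monotonic()
    deadline = start + duration_s
    results: List[Tuple[float, float]] = []   # (start offset, latency)
    lock = threading.Lock()

    def worker() -> None:
        while True:
            t0 = time.monotonic()
            if t0 >= deadline:
                return
            query()
            latency = time.monotonic() - t0
            with lock:
                results.append((t0 - start, latency))

    threads = [threading.Thread(target=worker) for _ in range(parallel)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    measured = [lat for offset, lat in results if offset >= warmup_s]
    timed_out = sum(1 for lat in measured if lat > timeout_s)
    return len(measured) - timed_out, timed_out


def stub_query() -> None:
    """Stand-in for one LDAP search; a real test would query the BDII."""
    time.sleep(0.01)


completed, timed_out = run_load_test(stub_query, parallel=4,
                                     duration_s=0.3, warmup_s=0.1)
print(completed, timed_out)
```

Dropping the warm-up samples corresponds to ignoring results at the beginning, so that only the stable state is measured.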

BDII Performance Tests
  Relational vs. LDAP data model test setup
    LDAP2SQL conversion tool (https://twiki.cern.ch/twiki/bin/view/Main/BDIIRelationalDBBackend)
    30K LDIF entries ~ 120K rows
    MySQL 4.1, same hardware as the OpenLDAP server
    Oracle 10.2 RAC, 2-node database cluster
      Also tested for completeness
    Native OpenLDAP client connects, searches, disconnects
      Difficult to do the same for a relational database
      Not a normal scenario for a relational DB

BDII Performance Tests
  Client Execution Time Test
    Which client implementation for the LDAP vs. relational model test?
  Reason
    Minimize client execution latency
    Find a common client
  Comparison:
    [Chart not reproduced in the transcript]
  Result
    No common (fast) implementation
    Perl for relational
    Native OpenLDAP client for LDAP

BDII Performance Tests: Results

Results
  OpenLDAP server with indexed/nonindexed DB
    Indexed DB nearly 100 times faster than nonindexed
    CPU load on indexed DB ~10 times lower
    More CPU capacity left to handle other requests
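Why an index helps can be illustrated with a toy equality search. This is a generic sketch, not how OpenLDAP implements its indexes: entries are plain dicts, and `scan`, `build_index` and `lookup` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, List

Entry = Dict[str, str]


def scan(entries: List[Entry], attr: str, value: str) -> List[Entry]:
    """Unindexed equality search: every entry has to be inspected."""
    return [e for e in entries if e.get(attr) == value]


def build_index(entries: List[Entry], attr: str) -> Dict[str, List[Entry]]:
    """Build an equality index once: attribute value -> matching entries."""
    index: Dict[str, List[Entry]] = defaultdict(list)
    for e in entries:
        if attr in e:
            index[e[attr]].append(e)
    return index


def lookup(index: Dict[str, List[Entry]], value: str) -> List[Entry]:
    """Indexed equality search: one dictionary lookup per query."""
    return index.get(value, [])


# Synthetic data with made-up host names.
entries = [{"GlueCEUniqueID": f"ce{i}.example.org",
            "GlueCEStateStatus": "Production" if i % 10 else "Draining"}
           for i in range(1000)]

status_index = build_index(entries, "GlueCEStateStatus")
assert lookup(status_index, "Draining") == scan(entries, "GlueCEStateStatus", "Draining")
```

The scan does work proportional to the number of entries on every query, while the indexed lookup pays that cost once when the index is built.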

Results
  Comparison of OpenLDAP Software 2.2 (SLC4), 2.1 (SLC3) and 2.2 on a 4-core machine
    Version 2.2 scales much better than 2.1 on the same hardware
      At 90 parallel requests ~20% faster than 2.1
    Version 2.2 on a 4-core machine
      ~65% faster than on dual core (32% speedup/core)
      ~117% faster than 2.1
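The per-core figure appears to follow from dividing the overall gain by the number of added cores. The sketch below assumes that reading (going from 2 to 4 cores adds two cores); the function name is hypothetical.

```python
def speedup_per_added_core(total_speedup_pct: float,
                           cores_before: int,
                           cores_after: int) -> float:
    """Average speedup attributed to each additional core, in percent."""
    return total_speedup_pct / (cores_after - cores_before)


# ~65% faster on 4 cores than on 2 cores, spread over the 2 added cores.
per_core = speedup_per_added_core(65, cores_before=2, cores_after=4)
print(per_core)  # 32.5
```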

Results
  Multiple queries issued against a running top-level BDII instance with 3 switching DBs

MySQL, Oracle and LDAP multi-query results
  Each worker node spawns one request continuously

Result: Effect of Data Size
  Currently ~30 MB
  OpenLDAP serves data very well (close to the network interface limit):
    Clients retrieve the requested information within the given timeout (15 s)

    Data size:           100 KB   1 MB   10 MB
    Parallel requests:   ~2000    ~200   ~18
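A back-of-the-envelope check makes the "close to network interface limit" remark concrete. The sketch assumes decimal units (100K read as 100 kB) and that all parallel requests complete exactly within the 15 s timeout, so the figures are lower bounds; the function name is hypothetical.

```python
TIMEOUT_S = 15

# (payload size in bytes, approximate number of parallel requests served)
MEASUREMENTS = [
    (100 * 1000, 2000),
    (1000 * 1000, 200),
    (10 * 1000 * 1000, 18),
]


def implied_throughput_mbit_s(payload_bytes: int, parallel: int,
                              timeout_s: float = TIMEOUT_S) -> float:
    """Throughput needed to deliver all payloads within the timeout."""
    total_bits = payload_bytes * parallel * 8
    return total_bits / timeout_s / 1e6


for size, n in MEASUREMENTS:
    rate = implied_throughput_mbit_s(size, n)
    print(f"{size / 1e6:5.1f} MB x {n:4d} requests -> {rate:.0f} Mbit/s")
```

All three rows come out at roughly 100 Mbit/s, which would be consistent with a link of about that speed being the bottleneck. The slides do not state the actual interface speed.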

Result: Relational Model vs. LDAP
  Returned data size differs although the information content is the same
    The OpenLDAP server also sends the objectclass and attribute names
  Small dataset (169 entries)
    MySQL ~70% faster
    Oracle ~429% faster
  Big dataset (8185 entries)
    MySQL ~411% faster
    Oracle ~1500% faster
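The size difference can be illustrated by encoding the same values both ways. The entry below is made up (hypothetical host name, DN and values); only the attribute names follow GLUE 1.3 naming, and the tab-separated row is an assumed stand-in for what a relational client receives.

```python
from typing import Dict

# Hypothetical values for one computing element.
ATTRIBUTES: Dict[str, str] = {
    "GlueCEUniqueID": "ce.example.org:2119/jobmanager-pbs-long",
    "GlueCEStateStatus": "Production",
    "GlueCEStateFreeCPUs": "42",
}
DN = "GlueCEUniqueID=ce.example.org:2119/jobmanager-pbs-long,mds-vo-name=local,o=grid"


def ldif_size(dn: str, object_class: str, attrs: Dict[str, str]) -> int:
    """Bytes of an LDIF-style entry: DN, objectClass and 'name: value' lines."""
    lines = [f"dn: {dn}", f"objectClass: {object_class}"]
    lines += [f"{name}: {value}" for name, value in attrs.items()]
    return len("\n".join(lines).encode("utf-8"))


def row_size(attrs: Dict[str, str]) -> int:
    """Bytes of the values alone, as one tab-separated row."""
    return len("\t".join(attrs.values()).encode("utf-8"))


print(ldif_size(DN, "GlueCE", ATTRIBUTES), row_size(ATTRIBUTES))
```

The LDAP-style encoding repeats the attribute names in every entry, so the same information takes more bytes on the wire.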

Conclusion
  BDII:
    Indexes help a lot to improve performance
    Handles ~100 parallel requests with a small dataset very well (< 2 s)
    Clients are advised to use queries which result in a small dataset
      NO (objectClass=*) SEARCHES!
  However:
    Adding the full content every refresh cycle loads the machine
    Implementations of a relational model showed better performance and should be considered for future developments
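The difference between a broad and a narrow query can be sketched by building the corresponding `ldapsearch` command lines. Nothing is executed here. The host name is hypothetical, port 2170 and base `o=grid` are the conventional top-level BDII values and are assumptions as far as these slides are concerned, and the filter uses GLUE 1.3 attribute names.

```python
from typing import List, Sequence


def ldapsearch_argv(host: str,
                    ldap_filter: str,
                    attributes: Sequence[str] = (),
                    port: int = 2170,
                    base: str = "o=grid") -> List[str]:
    """Build an `ldapsearch` argument list (the command is not run)."""
    return ["ldapsearch", "-x", "-LLL",
            "-H", f"ldap://{host}:{port}",
            "-b", base,
            ldap_filter, *attributes]


# Discouraged: matches every entry and returns every attribute.
broad = ldapsearch_argv("bdii.example.org", "(objectClass=*)")

# Preferred: restrict both the matching entries and the returned attributes.
narrow = ldapsearch_argv(
    "bdii.example.org",
    "(&(objectClass=GlueCE)(GlueCEStateStatus=Production))",
    ["GlueCEUniqueID", "GlueCEStateFreeCPUs"],
)

print(" ".join(narrow))
```

Listing the wanted attributes after the filter keeps the returned dataset small, which is the behaviour recommended to clients above.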

Future

Future
  Compressed content exchange
    Data is exchanged in compressed format
      30 MB LDIF is reduced to 1.4 MB
    Speeds up fetching data from site-level BDIIs
    Decreases information age
    Prototype ready
  Splitting dynamic and static information
    Reduces the amount of data being populated
  More information on plans:
    https://twiki.cern.ch/twiki//bin/view/EGEE/InfoPlan
  Support:
    http://twiki.cern.ch/twiki/bin/view/EGEE/InformationSystem
    http://twiki.cern.ch/twiki/bin/view/EGEE/BDII
    http://twiki.cern.ch/twiki/bin/view/EGEE/GIP
    http://twiki.cern.ch/twiki/bin/view/EGEE/GlueUse
    http://twiki.cern.ch/twiki/bin/view/EGEE/InfoTrouble
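LDIF compresses well because attribute names and DN suffixes repeat in every entry. The sketch below shows the effect on synthetic data. The slides do not say which compression scheme the prototype uses, so gzip is an assumption, and the generated entries are made up.

```python
import gzip


def synthetic_ldif(n_entries: int) -> bytes:
    """Generate repetitive LDIF-like text with made-up host names."""
    entries = []
    for i in range(n_entries):
        entries.append(
            f"dn: GlueCEUniqueID=ce{i}.example.org,mds-vo-name=local,o=grid\n"
            "objectClass: GlueCE\n"
            f"GlueCEUniqueID: ce{i}.example.org\n"
            "GlueCEStateStatus: Production\n"
            f"GlueCEStateFreeCPUs: {i % 64}\n"
        )
    return "\n".join(entries).encode("utf-8")


raw = synthetic_ldif(1000)
packed = gzip.compress(raw)

# The receiver recovers exactly the same LDIF after decompression.
assert gzip.decompress(packed) == raw
print(f"{len(raw)} bytes -> {len(packed)} bytes")
```

The ratio on real BDII content depends on the data, so the synthetic figure printed here should not be read as a reproduction of the 30 MB to 1.4 MB result.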

Questions?