LCG Storage Management Workshop, CERN, 7th April 2005


LFC and FiReMan Performance Measurements
Caitriana Nicholson / IT-GD, CERN / University of Glasgow
Craig Munro / IT-GD, CERN / Brunel University
LCG Storage Management Workshop, CERN, 7th April 2005
EGEE is a project funded by the European Union under contract IST-2003-508833

Overview
- Aims of Testing
- Test Methodology & Setup
- LFC Performance Results
- FiReMan Performance Results
- Conclusions

Aims of Testing
- The Data Challenges of 2004 exposed limitations in the LCG Data Management tools
- The LCG File Catalog (LFC) was developed to address problems with the RLS
- A suite of tests was developed to check the functionality and performance of the LFC
- Performance comparison required between:
  - EDG RLS
  - Globus RLS
  - LFC
  - FiReMan

Test Methodology
- Multi-threaded C client program written to test each type of operation (insert, query, delete, etc.):
  ./create_files -d /grid/dteam/caitriana/insert/ -f $num_files -t $num_threads
- C programs wrapped by Perl scripts
- Typically, each operation performed several thousand times in the client program ($num_files = 3000) and the mean result returned
- Client program called several times from the Perl script and the mean taken
- Any entries removed before the next test run
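For illustration, a minimal sketch of such a timing loop is given below. It assumes the standard LFC client C API (lfc_api.h, lfc_creatg) with LFC_HOST pointing at the catalogue server; the paths, GUID format, counts and error handling are placeholders, not the exact harness used for these tests.

    /* Sketch of a multi-threaded insert-timing loop against the LFC.
     * Assumes the standard LFC client API (lfc_api.h) and LFC_HOST set
     * to the catalogue server. Paths, GUIDs and counts are illustrative;
     * error reporting is simplified (the real API reports via serrno). */
    #include <stdio.h>
    #include <pthread.h>
    #include <sys/time.h>
    #include "lfc_api.h"

    #define FILES_PER_THREAD 3000   /* $num_files in the Perl wrapper */
    #define NUM_THREADS 10          /* $num_threads */

    static double elapsed_ms(struct timeval *a, struct timeval *b)
    {
        return (b->tv_sec - a->tv_sec) * 1e3
             + (b->tv_usec - a->tv_usec) / 1e3;
    }

    static void *insert_worker(void *arg)
    {
        long tid = (long)arg;
        char path[256], guid[64];
        struct timeval t0, t1;
        double total = 0;
        int i;

        for (i = 0; i < FILES_PER_THREAD; i++) {
            snprintf(path, sizeof(path),
                     "/grid/dteam/test/insert/t%ld_f%d", tid, i);
            snprintf(guid, sizeof(guid), "test-guid-%ld-%d", tid, i);
            gettimeofday(&t0, NULL);
            if (lfc_creatg(path, guid, 0664) < 0)   /* register one LFN */
                perror("lfc_creatg");
            gettimeofday(&t1, NULL);
            total += elapsed_ms(&t0, &t1);
        }
        printf("thread %ld: mean insert %.2f ms\n",
               tid, total / FILES_PER_THREAD);
        return NULL;
    }

    int main(void)
    {
        pthread_t th[NUM_THREADS];
        long t;
        for (t = 0; t < NUM_THREADS; t++)
            pthread_create(&th[t], NULL, insert_worker, (void *)t);
        for (t = 0; t < NUM_THREADS; t++)
            pthread_join(th[t], NULL);
        return 0;
    }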

LFC Test Setup
- Oracle DB on Xeon 2.4 GHz
- PIII 1 GHz, 512 MB server running 20 threads
- PIII 853 MHz, 128 MB client with configurable number of threads (single-client tests)
- 10 x PIII 1 GHz, 512 MB clients with configurable number of threads (multiple-client tests)
- 100 Mb/s LAN
- Insecure LFC
  - Comparison against published results for insecure catalogues
  - Security overheads now being tested
- The quality of the machines used should be noted when comparing LFC and RLS results!
  - [Table: SPEC CINT2000 values of the server, single-client and multiple-client machines used in the EDG RLS, Globus RLS and LFC tests: 810, 420, 400 and 220]

LFC Performance (i) - Inserts
- Mean insert time as the number of entries increased up to 40M: remains below 30 ms
  - EDG mean insert time was ~40 ms with 500,000 entries
- Insert rate, with increasing number of client threads, for ~1M entries: increases up to ~200 adds/sec, up to the server thread limit
  - Globus RLS gave ~84 adds/sec when run with consistency

LFC Performance (ii) - Queries
- Rate of querying for a single LFN, with increasing number of client threads, ~1M entries
- Not directly comparable with the Globus results:
  - RLS does a 1-to-1 lookup
  - LFC stat() also returns system metadata, checks permissions, etc.
- EDG RLS rate was ~63 queries/sec with 1 thread; the LFC with 1 thread gives ~90 queries/sec
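As a sketch of what each query in these tests involves, the snippet below stats one LFN through the LFC client API; lfc_statg and struct lfc_filestatg are the standard API names, while the path is a placeholder.

    /* Sketch of a single-LFN query via the LFC API: unlike a plain
     * 1-to-1 RLS lookup, lfc_statg returns system metadata and makes
     * a permission check on the path. Path is illustrative. */
    #include <stdio.h>
    #include "lfc_api.h"

    int main(void)
    {
        struct lfc_filestatg st;

        if (lfc_statg("/grid/dteam/test/insert/t0_f0", NULL, &st) < 0) {
            perror("lfc_statg");
            return 1;
        }
        printf("guid=%s size=%llu\n", st.guid,
               (unsigned long long)st.filesize);
        return 0;
    }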

LFC Performance (ii) - Queries
- Time to list and stat all replicas of a file is proportional to the number of replicas
- Time to read a directory is directly proportional to the directory size
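Replica listing goes through lfc_getreplica, which returns one entry per replica, hence the linear scaling with replica count. A minimal sketch, assuming the standard signature and an illustrative path:

    /* Sketch: list all replicas of one LFN. lfc_getreplica allocates
     * an array with one lfc_filereplica per replica, so listing (and
     * statting) scales with the replica count. */
    #include <stdio.h>
    #include <stdlib.h>
    #include "lfc_api.h"

    int main(void)
    {
        struct lfc_filereplica *reps = NULL;
        int n = 0, i;

        if (lfc_getreplica("/grid/dteam/test/insert/t0_f0",
                           NULL, NULL, &n, &reps) < 0) {
            perror("lfc_getreplica");
            return 1;
        }
        for (i = 0; i < n; i++)
            printf("replica %d: %s\n", i, reps[i].sfn);
        free(reps);
        return 0;
    }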

LFC Performance (ii) - Queries
- Default buffer size in the LFC is small (4 KB)
- Tuning the buffer size improves readdir() performance
  - Time to read a directory of 100,000 entries measured with varying buffer sizes
- If bulk queries are implemented, they should show similar behaviour
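The readdir() in question is the POSIX-style directory scan offered by the LFC client API (lfc_opendir/lfc_readdir); entries come back from the server in buffer-sized batches, which is why the buffer size matters for large directories. A minimal sketch, with the directory path as a placeholder:

    /* Sketch: scan a catalogue directory with the POSIX-like LFC calls.
     * The client library fetches entries from the server in batches
     * limited by its internal transfer buffer (4 KB by default), so a
     * larger buffer means fewer round trips for big directories. */
    #include <stdio.h>
    #include <dirent.h>
    #include "lfc_api.h"

    int main(void)
    {
        lfc_DIR *dir;
        struct dirent *e;
        long count = 0;

        if ((dir = lfc_opendir("/grid/dteam/test/insert")) == NULL) {
            perror("lfc_opendir");
            return 1;
        }
        while ((e = lfc_readdir(dir)) != NULL)  /* one entry per call */
            count++;
        lfc_closedir(dir);
        printf("%ld entries\n", count);
        return 0;
    }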

LFC Performance (iii) - Delete Rates
- Delete rate, with increasing number of client threads, for ~1M entries
- With 1 thread, time per delete ~16 ms
  - EDG RLS with 1 thread gave ~30 ms per delete
  - No comparable results for Globus RLS

LFC Performance (iv) - chdir & transactions
- Performing a chdir() before many simultaneous operations in the same directory improves performance significantly
- Using transactions gave a loss of performance with a single client
  - Extra time being spent in the DB
  - Requires further investigation
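Both effects can be exercised through the client API: lfc_chdir fixes the directory against which relative names resolve, and lfc_starttrans/lfc_endtrans group calls into a single DB transaction. A sketch under these assumptions (paths and comment string illustrative; NULL server taken from LFC_HOST):

    /* Sketch: chdir once, then operate on relative paths; optionally
     * wrap a batch of inserts in one transaction. */
    #include <stdio.h>
    #include "lfc_api.h"

    int main(void)
    {
        int i;
        char name[64], guid[64];

        if (lfc_chdir("/grid/dteam/test/insert") < 0) { /* resolve once */
            perror("lfc_chdir");
            return 1;
        }

        lfc_starttrans(NULL, "batch insert");   /* one DB transaction */
        for (i = 0; i < 100; i++) {
            snprintf(name, sizeof(name), "f%d", i);    /* relative path */
            snprintf(guid, sizeof(guid), "test-guid-tx-%d", i);
            if (lfc_creatg(name, guid, 0664) < 0) {
                perror("lfc_creatg");
                lfc_aborttrans();                /* roll back the batch */
                return 1;
            }
        }
        lfc_endtrans();                          /* commit */
        return 0;
    }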

LFC Performance (v) - Multiple Client Tests
- Tests run simultaneously on a varying number of client machines
- Clients all running with 10 threads
- Insert and query rates measured with:
  - Single operations
  - Transactions (100 operations per transaction)
- No reduction in operation rate as the number of clients increases
- [Figures: insert and query rates vs. number of clients, one panel for single operations and one for transactions]

LFC Summary
- The LFC has been tested and shown to be scalable to at least:
  - 40 million entries
  - 100 client threads
- Performance improved in comparison with the RLSs
- Stable: continuous running at high load for extended periods of time with no crashes
- Based on code which has been in production for > 4 years
- Tuning required to improve bulk performance

FiReMan Test Setup
- LFC and FiReMan tests performed on identical hardware
- Catalogue server and Oracle DB running on the same machine
  - Dual Xeon 2.4 GHz with 2048 MB RAM
- PIII 800 MHz, 512 MB RAM client with configurable number of threads
- 100 Mb/s LAN
- Insecure FiReMan and LFC

Limitations of the Setup
- Limited time, therefore limited scope of tests
- Only one multi-threaded client used to test the server
  - Difficult to explore all limits of the server
- All tests performed over LAN; need to look at WAN
  - Influence of round-trip time

LFC Comparison
- LFC tests repeated on identical hardware to give a fair comparison
- LFC tests performed with relative paths (chdir) and without transactions
- [Figures: insert rate (inserts/second) and query rate (queries/second) vs. number of threads (1-100), comparing "LFC - New" and "LFC - Old" hardware]

FiReMan Performance - Inserts
- Inserted ~1M entries in bulk, with insert time ~5 ms
- [Figure: insert rate (inserts/second) vs. number of threads (1-50) for single inserts and bulk sizes 1, 10, 100, 500, 1000 and 5000]

FiReMan Performance - Insert
- Comparison with LFC:
- [Figure: insert rate (inserts/second) vs. number of threads (1-100) for FiReMan single entry, FiReMan bulk 100, and LFC]

FiReMan Performance - Queries
- Query rate for an LFN
- [Figure: entries returned/second vs. number of threads (5-50) for FiReMan single queries and bulk sizes 1, 10, 100, 500, 1000 and 5000]

FiReMan Performance - Queries
- Comparison with LFC:
- [Figure: entries returned/second vs. number of threads (1-100) for FiReMan single entry, FiReMan bulk 100, and LFC]

FiReMan Performance - Delete
- Rate at which LFNs can be deleted from the catalogue
- [Figure: deletes/second vs. number of threads (1-50) for single deletes and bulk sizes 1, 10, 100, 500, 1000 and 5000]

FiReMan Performance - Delete
- Comparison with LFC:
- [Figure: deletes/second vs. number of threads (1-50) for FiReMan single, FiReMan bulk 100, and LFC]

Conclusions
- Both the LFC and FiReMan offer large improvements over the RLS
- Still some issues remaining:
  - Scalability of FiReMan
  - Bulk entry for the LFC
- More work needed to understand performance and bottlenecks
- Need to test some real use cases

Questions?