Scalability to Hundreds of Clients in HEP Object Databases

Scalability to Hundreds of Clients in HEP Object Databases
Koen Holtman, CERN / Eindhoven University of Technology
Julian Bunn, CERN / Caltech
CHEP '98, September 3, 1998

Introduction

- CMS requires massive parallelism in reconstruction and high data rates in both reconstruction and DAQ.
- This means the OODB must scale to hundreds of clients and to high aggregate throughputs.
- We tested the scalability of reconstruction and DAQ workloads under Objectivity/DB v4.0.10.
- Tests were performed on an HP Exemplar supercomputer.
- Potential problems:
  - the lockserver
  - the small database page size (we used 32 KB)
  - clustering efficiency (we used sequential reading/writing based on ODMG containers)

The HP Exemplar at Caltech

- 256-processor SMP machine running a single OS: 16 nodes with 16 processors each.
- Each node has a 4-disk striped disk array (~22 MB/s) visible as a filesystem.
- Very fast node interconnect (GB/s range).
- May be a model of future PC farms.
- The tests put up to 240 Objectivity/DB clients on the machine, with the catalog and journals on one node filesystem and the data on 2, 4, or 8 node filesystems.

Reconstruction test

Reconstruction of one event:
- Reading: 1 MB of raw data, as 100 objects of 10 KB, divided over 3 containers (50%, 25%, 25%) in 3 databases.
- Writing: 100 KB of reconstructed data, as 10 objects of 10 KB, to one container.
- Computation: 2 * 10^4 MIPS-seconds (= 5 seconds on one Exemplar CPU).
- Reading, computation, and writing are all interleaved.

Test setup:
- Up to 240 reconstruction clients run in parallel.
- Each client has its own set of 3 ODMG databases; the databases are divided over 4 node filesystems.
- In all containers, the data is clustered in reading order.
- The input databases were created with a simulated DAQ run, using 8 parallel writers to one node.
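
To make the workload concrete, here is a minimal sketch of the per-event client loop described above. It is an illustration only: the Container and Object types and their read/append calls are placeholders standing in for the ODMG/Objectivity access layer, not the actual API used in the test.

    // Sketch of the per-event reconstruction client loop (illustrative only).
    #include <cstddef>
    #include <vector>

    constexpr std::size_t kObjectBytes         = 10 * 1024; // 10 KB per object
    constexpr int         kRecoObjectsPerEvent = 10;        // 100 KB reconstructed output

    struct Object { std::vector<char> payload; };

    // Placeholder container handle; the real test used ODMG containers in
    // Objectivity/DB, three for raw data (50% / 25% / 25%) plus one for output.
    struct Container {
        Object read_next() { return Object{std::vector<char>(kObjectBytes)}; }
        void append(const Object&) { /* write one object to the container */ }
    };

    // Stand-in for the ~5 s of reconstruction CPU work (2 * 10^4 MIPS-seconds).
    Object reconstruct(const std::vector<Object>&) {
        return Object{std::vector<char>(kObjectBytes)};
    }

    // One event: read 1 MB of raw data as 100 objects in clustering order,
    // then produce and write 10 reconstructed objects.  In the real test the
    // reading, computing and writing were interleaved rather than phased.
    void process_event(Container& c50, Container& c25a, Container& c25b, Container& out) {
        std::vector<Object> raw;
        raw.reserve(100);
        for (int i = 0; i < 50; ++i) raw.push_back(c50.read_next());
        for (int i = 0; i < 25; ++i) raw.push_back(c25a.read_next());
        for (int i = 0; i < 25; ++i) raw.push_back(c25b.read_next());
        for (int i = 0; i < kRecoObjectsPerEvent; ++i) out.append(reconstruct(raw));
    }

    int main() {
        Container c50, c25a, c25b, out;
        process_event(c50, c25a, c25b, out); // the real clients loop over many events
    }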

Reconstruction results

- The blue curve shows reconstruction throughput: CPU bound, with very good resource usage (91%-83% CPU usage for reconstruction).
- The red curve shows throughput for reconstruction with half the CPU time per event: disk bound for more than 160 clients, reaching up to 55 MB/s on filesystems rated at 88 MB/s.
- The tests used the read-ahead optimisation described on the next slide.
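
As a rough sanity check on these curves, the following back-of-envelope model (my own illustration, not part of the original talk) relates the client count, the per-event CPU time, and the per-event I/O volume to the aggregate bandwidth the clients demand.

    // Crude throughput model (illustration only): N CPU-bound clients, each
    // needing cpu_s seconds of CPU and moving mb_per_event megabytes per event,
    // demand about N * mb_per_event / cpu_s of aggregate I/O.  Once that demand
    // exceeds what the filesystems can sustain for this access pattern, the
    // test turns disk bound.
    #include <cstdio>

    double demanded_mb_per_s(int clients, double mb_per_event, double cpu_s) {
        return clients * mb_per_event / cpu_s;
    }

    int main() {
        const double mb_per_event = 1.1; // 1 MB read + 0.1 MB written per event
        // Full-CPU reconstruction (about 5 s/event) with 240 clients: ~53 MB/s,
        // below what the 88 MB/s-rated filesystems can deliver, hence CPU bound.
        std::printf("full CPU, 240 clients: ~%.0f MB/s demanded\n",
                    demanded_mb_per_s(240, mb_per_event, 5.0));
        // Half-CPU variant (about 2.5 s/event) with 160 clients: ~70 MB/s and
        // rising, more than the ~55 MB/s actually sustained, hence disk bound.
        std::printf("half CPU, 160 clients: ~%.0f MB/s demanded\n",
                    demanded_mb_per_s(160, mb_per_event, 2.5));
    }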

Read-ahead optimisation

- Each container was read through an iterator class which performs a read-ahead optimisation: it reads a 4 MB chunk of the container into the DB client cache at once.
- Without the read-ahead optimisation, the OS schedules the disk reads less efficiently, leading to more long seeks and a loss of I/O performance.
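
The following is a minimal sketch of the kind of read-ahead iterator the slide describes, assuming a generic Container with a bulk readRange() call; it is not the Objectivity/DB iterator interface used in the test.

    // Sketch of a read-ahead container iterator (illustrative only): it pulls a
    // 4 MB chunk of objects into a client-side cache with one large request, so
    // the OS sees a few big sequential reads instead of many small scattered ones.
    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct Object { std::vector<char> payload; };

    // Placeholder container; readRange() stands in for a bulk database read.
    struct Container {
        std::size_t size() const { return 100; }  // objects in the container
        void readRange(std::size_t first, std::size_t count, std::vector<Object>& out) {
            (void)first;
            out.assign(count, Object{});
        }
    };

    class ReadAheadIterator {
    public:
        ReadAheadIterator(Container& c, std::size_t objectBytes,
                          std::size_t chunkBytes = 4u << 20)   // 4 MB read-ahead
            : container_(c), chunkObjects_(chunkBytes / objectBytes) {}

        bool next(Object& obj) {
            if (pos_ == cache_.size()) {                       // cache exhausted
                if (nextIndex_ >= container_.size()) return false;
                std::size_t n = std::min(chunkObjects_, container_.size() - nextIndex_);
                cache_.clear();
                container_.readRange(nextIndex_, n, cache_);   // one big read
                nextIndex_ += n;
                pos_ = 0;
            }
            obj = cache_[pos_++];
            return true;
        }

    private:
        Container&          container_;
        std::size_t         chunkObjects_;
        std::size_t         nextIndex_ = 0;  // next object not yet read ahead
        std::size_t         pos_ = 0;        // next object to hand out from the cache
        std::vector<Object> cache_;          // client-side chunk cache
    };

A client would then simply loop "Object o; while (it.next(o)) process(o);" over each of its containers.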

DAQ test

- Each client writes 10 KB objects to a container in its own database; the databases are divided over 8 node filesystems.
- 1 event = 1 MB, costing 0.45 CPU s in user code, 0.20 CPU s in Objectivity, and 0.01 CPU s in the OS.
- The test goes up to 238 clients.
- Disk bound above 100 clients: up to 145 MB/s on filesystems rated at 176 MB/s.
- The DAQ filesystems become sluggish above 100 clients.
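
A quick back-of-envelope (again my own illustration, not part of the talk) shows why the crossover sits near 100 clients: with about 0.66 CPU s per 1 MB event, each client can offer roughly 1.5 MB/s, so the offered write load meets the ~145 MB/s the filesystems delivered at roughly 95-100 clients.

    // Back-of-envelope for the DAQ crossover point (illustration only).
    #include <cstdio>

    int main() {
        const double cpu_s_per_event = 0.45 + 0.20 + 0.01; // user + Objectivity + OS
        const double mb_per_event    = 1.0;                // 1 event = 1 MB written
        const double delivered_mb_s  = 145.0;              // observed plateau (rated 176 MB/s)

        const double per_client_mb_s = mb_per_event / cpu_s_per_event;   // ~1.5 MB/s
        const double crossover       = delivered_mb_s / per_client_mb_s; // ~96 clients
        std::printf("~%.1f MB/s per client, disk bound above ~%.0f clients\n",
                    per_client_mb_s, crossover);
    }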

Scaling of client startup

- Startup times for the reconstruction test, with new clients started in batches of 16; other tests show similar curves.
- Startup times can be much worse if the catalog/journal filesystem is saturated.
- Conclusion: leave the clients running all the time?

Conclusions

- Objectivity/DB shows good scalability, up to 240 clients, under CMS DAQ and reconstruction workloads, using a very fast network.
- Reconstruction (raw data divided over 3 containers):
  - good CPU utilisation (91%-83% in user code)
  - up to 55 MB/s (63% of the rated maximum)
  - needed a simple read-ahead optimisation
- DAQ (all clients write to their own container):
  - up to 145 MB/s (82% of the rated maximum)
  - do not overload the DAQ filesystems
- The lockserver is not a bottleneck (yet); we used a large container growth factor (20%).