Scalability to Hundreds of Clients in HEP Object Databases
Koen Holtman, CERN / Eindhoven University of Technology
Julian Bunn, CERN / Caltech
CHEP '98, September 3, 1998
Introduction

- CMS requires massive parallelism in reconstruction and high data rates in reconstruction and DAQ
- This means the OODB must scale to hundreds of clients and to high aggregate throughputs
- We tested the scalability of reconstruction and DAQ workloads under Objectivity/DB v4.0.10
- Tests were performed on the HP Exemplar supercomputer
- Potential problems:
  - the lockserver
  - small database page size (we used 32 KB)
  - clustering efficiency (we used sequential reading/writing based on ODMG containers)
The HP Exemplar at Caltech
- 256-processor SMP machine running a single OS
- 16 nodes of 16 processors each
- Each node has a 4-disk striped disk array (~22 MB/s) visible as a filesystem
- Very fast node interconnect (GB/s range)
- May be a model of future PC farms
- Tests put up to 240 Objectivity/DB clients on the machine, with the catalog and journals on one node filesystem and the data on 2, 4, or 8 node filesystems
Reconstruction test

Reconstruction for 1 event:
- Reading: 1 MB of raw data, as 100 objects of 10 KB, divided over 3 containers (50%, 25%, 25%) in 3 databases
- Writing: 100 KB of reconstructed data, as 10 objects of 10 KB, to one container
- Computation: 2 × 10^4 MIPS·s (= 5 seconds on 1 Exemplar CPU)
- All interleaved (sketched below)

Test setup:
- Up to 240 reconstruction clients in parallel
- Each client has its own set of 3 ODMG databases
- Databases divided over 4 node filesystems
- In all containers, data is clustered in reading order
- Databases created with a simulated DAQ (8 parallel writers to 1 node)
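A minimal sketch of one client's per-event cycle, assuming plain file streams stand in for the Objectivity/DB container handles used in the actual test; all names and the CPU-burn placeholder are illustrative, not the test code:

```cpp
// Illustrative stand-in for one reconstruction client's per-event work:
// read 100 raw-data objects of 10 KB split 50/25/25 over three input
// containers, compute, then write 10 reconstructed objects of 10 KB to
// one output container.  Plain fstreams replace the real DB handles.
#include <cstddef>
#include <fstream>
#include <vector>

const std::size_t kObjectSize = 10 * 1024;  // 10 KB per object

// Read `count` 10 KB objects from one container (modelled as a file).
static void readObjects(std::ifstream& in, int count, std::vector<char>& buf) {
    for (int i = 0; i < count; ++i)
        in.read(buf.data(), kObjectSize);
}

// One event: 1 MB read, ~5 s of CPU, 100 KB written.  The real test
// interleaved these phases; they are shown sequentially for clarity.
void reconstructEvent(std::ifstream& raw50, std::ifstream& raw25a,
                      std::ifstream& raw25b, std::ofstream& reco) {
    std::vector<char> buf(kObjectSize);

    readObjects(raw50,  50, buf);   // 50% of the raw data
    readObjects(raw25a, 25, buf);   // 25%
    readObjects(raw25b, 25, buf);   // 25%

    // Placeholder for the 2 x 10^4 MIPS.s of reconstruction computation.
    volatile double work = 0.0;
    for (long i = 0; i < 100000000L; ++i) work += 1.0;

    for (int i = 0; i < 10; ++i)    // 100 KB of reconstructed output
        reco.write(buf.data(), kObjectSize);
}
```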
Reconstruction results
- Blue curve shows reconstruction throughput: CPU bound
- Very good resource usage (91%-83% CPU usage for reconstruction)
- Red curve shows throughput for reconstruction with half the CPU: disk bound for >160 clients, up to 55 MB/s (filesystems rated 88 MB/s)
- Used a read-ahead optimisation (next slide)
Read-ahead optimisation
- Each container was read through an iterator class which performs a read-ahead optimisation
- It reads a 4 MB chunk of the container into the DB client cache at once
- Without the read-ahead optimisation, scheduling of disk reads by the OS is less efficient: more long seeks, and a loss of I/O performance
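A minimal sketch of such a read-ahead iterator, assuming a plain file stands in for an Objectivity/DB container; the class name and interface are invented for illustration:

```cpp
// Hypothetical read-ahead iterator: fetches the container in ~4 MB
// chunks so the OS sees large sequential reads instead of many small
// ones, then hands out 10 KB objects from the in-memory chunk.
#include <cstddef>
#include <cstring>
#include <fstream>
#include <vector>

class ReadAheadIterator {
public:
    ReadAheadIterator(const char* path,
                      std::size_t objectSize = 10 * 1024,
                      std::size_t chunkSize  = 4 * 1024 * 1024)
        : in_(path, std::ios::binary),
          objectSize_(objectSize),
          // Round the chunk down to a whole number of objects so no
          // object straddles a refill.
          chunk_((chunkSize / objectSize) * objectSize),
          valid_(0), pos_(0) {}

    // Copy the next object into `out`; returns false at end of container.
    bool next(char* out) {
        if (pos_ + objectSize_ > valid_ && !refill())
            return false;
        std::memcpy(out, chunk_.data() + pos_, objectSize_);
        pos_ += objectSize_;
        return true;
    }

private:
    // One large read replaces hundreds of small ones.
    bool refill() {
        in_.read(chunk_.data(), static_cast<std::streamsize>(chunk_.size()));
        valid_ = static_cast<std::size_t>(in_.gcount());
        pos_ = 0;
        return valid_ >= objectSize_;
    }

    std::ifstream in_;
    std::size_t objectSize_;
    std::vector<char> chunk_;
    std::size_t valid_, pos_;
};
```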
DAQ test

- Each client writes 10 KB objects to a container in its own database (see the sketch below)
- Databases divided over 8 node filesystems
- 1 event = 1 MB: 0.45 CPU sec in user code, 0.20 CPU sec by Objectivity, 0.01 CPU sec by the OS
- Test goes up to 238 clients
- Disk bound above 100 clients
- Up to 145 MB/s on filesystems rated 176 MB/s
- DAQ filesystems sluggish above 100 clients
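A minimal sketch of one DAQ writer client, again with a plain file standing in for the client's private Objectivity/DB container; the path layout and event count are invented for illustration:

```cpp
// Illustrative DAQ writer: appends 10 KB objects to this client's own
// container (modelled as a file; each real client wrote to a container
// in its own database).
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>

int main(int argc, char** argv) {
    int clientId = (argc > 1) ? std::atoi(argv[1]) : 0;

    // One private output file per client; this path is hypothetical.
    std::string path = "daq_client_" + std::to_string(clientId) + ".dat";
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::app);

    const std::size_t kObjectSize = 10 * 1024;  // 10 KB objects
    const int kObjectsPerEvent = 100;           // 1 MB event
    std::vector<char> object(kObjectSize, 0);

    // Write a fixed number of events; the real test also spent
    // 0.45 CPU s per event in user code, omitted here.
    for (int event = 0; event < 1000; ++event)
        for (int i = 0; i < kObjectsPerEvent; ++i)
            out.write(object.data(), object.size());
    return 0;
}
```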
Scaling of client startup
- Startup times for the reconstruction test, with new clients started in bunches of 16
- Other tests show similar curves
- Startup time can be much worse if the catalog/journal filesystem is saturated
- Conclusion: leave clients running all the time?
Conclusions

- Objectivity/DB shows good scalability, up to 240 clients, under CMS DAQ and reconstruction workloads, using a very fast network
- Reconstruction:
  - divided raw data over 3 containers
  - good CPU utilisation (91%-83% in user code)
  - up to 55 MB/s (63% of rated maximum)
  - needed a simple read-ahead optimisation
- DAQ:
  - all clients write to their own container
  - up to 145 MB/s (82% of rated maximum)
  - do not overload the DAQ filesystems
- The lockserver is not a bottleneck (yet); we used a large container growth factor (20%)