Benchmarking Storage Systems

Benchmarking Storage Systems
– How to characterize the system: storage, network, clients
– Specific benchmarks: iozone, mdtest, h5perf, HDF5 aggregation (tiff2nexus), Pilatus detector simulation
– Visualization of huge (>250 GB) HDF5 datasets
– Conclusion and limitations

System description
– UML-like description of the setup
– Simple annotation of redundancies/multiplicities
– Obvious bottlenecks become visible

System characterization
Network
– Basic tests: iperf, ntop, ping, traceroute, etc.
– MPI pingpong, for InfiniBand as well as TCP
  (mpirun --mca btl ^tcp -hostfile pingpong.hosts mpitests-IMB-MPI1 pingpong)
Storage
– omreport et al. to report the basics (omreport storage pdisk controller=0)
– iozone for basic I/O speed (iozone -i0 -i1 -r2m -s300g -f /storage_location/file -+n)
– mdtest to test basic metadata capabilities
– h5perf to test MPI-IO/pHDF5 capabilities
– Things which might have an influence:
  – Topology (of course)
  – Number of head nodes and controllers, number and speed of physical disks
  – Protocol (and implementation)
  – Underlying filesystem and its utilization
  – OS and kernel; kernel drivers and modules
  – IRQ/NUMA configuration
Clients
– Things which might have an influence:
  – Most of the storage-relevant items above
  – CPU and front-side bus speed
  – Governance model
  – BIOS and firmware
  – Boot options
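A minimal shell sketch of how these probes can be chained on one test client. The hostnames, the mount point /fhgfs/bench, and the pingpong.hosts file are placeholders, and the omreport call assumes a Dell RAID controller as in our setup:

    #!/bin/bash
    # Sketch: basic network and storage probes from one client.
    # Assumes a second client PEER running "iperf -s" and a test
    # filesystem mounted under STORAGE (both names invented here).
    STORAGE=/fhgfs/bench
    PEER=node02

    # Network: raw TCP throughput, then MPI pingpong with the TCP
    # byte-transport excluded so traffic goes over InfiniBand.
    iperf -c "$PEER" -t 30
    mpirun --mca btl ^tcp -hostfile pingpong.hosts mpitests-IMB-MPI1 pingpong

    # Storage: controller/disk inventory, then streaming write+read
    # with the iozone parameters from the slide.
    omreport storage pdisk controller=0
    iozone -i0 -i1 -r2m -s300g -f "$STORAGE/iozone.dat" -+n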

Storage characterization – FhGFS

[Figure: FhGFS characterization, 2011 vs. 2012 installation]

Storage tests
– Pilatus 6M: pump 2000 images (18 GByte in total) with up to 50 Hz and up to 16 concurrent streams
– tiff2nexus: convert GISAXS images into one HDF5 file
– h5perf: test POSIX, MPI-IO and pHDF5 (see the sketch below)
– iozone: basic I/O test
– mdtest: basic metadata operations (low I/O)
– Visualization: SCA3D, adaptive visualization of distributed 3D documents in open information spaces
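As an illustration, an h5perf run comparing the three I/O paths could look as follows; the rank counts, sizes, and target directory are invented, since the slides do not record the exact parameters used:

    # Target directory for the h5perf test files.
    export HDF5_PARAPREFIX=/fhgfs/bench
    # Sweep POSIX, MPI-IO and parallel-HDF5 throughput: 2..16 ranks,
    # 3 iterations, transfer sizes 256 KB..8 MB, 1 GB per rank.
    mpirun -x HDF5_PARAPREFIX -np 16 \
        h5perf -A posix,mpiio,phdf5 -i 3 -p 2 -P 16 -e 1G -x 256K -X 8M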

Benchmarks – mdtest Name: mdtest Version: Url: Description: mdtest is an MPI-coordinated metadata benchmark test that performs open/stat/close operations on files and OS: Scientific Linux 6.2 Kernel: el6.x86_64 Boot: root=UUID=24fa1ddf-e a612-cdef7b60abc0 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 CPU affinity: none IRQ binding: none Governance: unknown MPI: openmpi Platform: DESY-HPC 2011 Name: mdtest Version: Url: Description: mdtest is an MPI-coordinated metadata benchmark test that performs open/stat/close operations on files and OS: Scientific Linux 6.2 Kernel: el6.x86_64 Boot: root=UUID=24fa1ddf-e a612-cdef7b60abc0 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 CPU affinity: none IRQ binding: none Governance: unknown MPI: openmpi Platform: DESY-HPC 2011

Benchmarks – mdtest

Benchmarks – Pilatus 6M
– Pilatus 6M detector simulation
– Currently the most demanding detector running at synchrotrons and EuroFEL light sources
– Can operate at ~20 Hz
– Data format either raw (TIFF) or compressed (CBF)
– Data rate 1 Gb/s for CBF, twice as much for TIFF – not a challenge at all for a single stream
– Multiple beamlines are equipped with a Pilatus 6M, so up to 4 parallel/concurrent streams – most systems start to suffer
– Execution: pssh -t 0 -H "host1 host2" pilatus.sh
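The pilatus.sh script itself is not shown in the slides; a hypothetical sketch that replays pre-generated frames at a fixed rate (all paths and names invented) could look like this:

    #!/bin/bash
    # Sketch of pilatus.sh: emulate one Pilatus 6M stream by copying a
    # template frame into the storage system at a fixed rate.
    RATE=${1:-10}                        # frames per second (10/20/50)
    NFRAMES=2000                         # 2000 images, as in the test
    SRC=/dev/shm/template.cbf            # template frame staged in RAM
    DST=/fhgfs/bench/$(hostname -s)      # one target directory per stream
    mkdir -p "$DST"
    PERIOD=$(echo "scale=4; 1/$RATE" | bc)
    for i in $(seq -w 1 $NFRAMES); do
        cp "$SRC" "$DST/frame_$i.cbf" &  # copy must finish within one period
        sleep "$PERIOD"
    done
    wait   # a long wait here means the storage could not sustain the rate

Launched via pssh on several clients at once, this yields the concurrent-stream results on the following slides.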

Benchmarks – Pilatus 6M
Single stream:
– 10 Hz: no problem
– 20 Hz: no problem (except for dCache)
– 50 Hz: no problem for FhGFS

Benchmarks – Pilatus 6M
Single stream:
– 10 Hz: no problem
– 20 Hz: no major problem
– 50 Hz: might be a problem

Benchmarks – Pilatus 6M
Two concurrent streams:
– 10 Hz: no problem
– 20 Hz: no major problem
– 50 Hz: might be a problem

Benchmarks – Pilatus 6M
Four concurrent streams:
– 10 Hz: no problem
– 20 Hz: no major problem
– 50 Hz: becomes a problem

Problems
– Kernel updates
  – Spoil the NFS daemon for certain combinations of eth0 hardware, kernel and firmware: high load and memory consumption, no transfer of large files
  – Kernel bug: a CPU timer elevates CPU consumption by ksoftirqd and causes frequent kernel-cache thrashing; extremely high number of timer interrupts (/proc/interrupts)
– Re-configuration renders an extremely fast FhGFS into an extremely slow one
  – BIOS vs. system governance models have a substantial influence
  – irqbalance on/off
  – NUMA bindings of eth/ib
  – Autotuning (e.g. defragmentation runs)
– InfiniBand not homogeneous
  – Some host-to-host speeds at 70% of the design value, some at only 50%
  – Reason unknown; might correlate with cable length
  – Benchmarks depend on the choice of hosts
– Several more factors affect the results – too many parameters are not under control
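The timer-interrupt symptom is easy to check for with standard tools; a quick probe along these lines (assuming nothing beyond an ordinary Linux client) can help:

    # Compare local-timer interrupt counts one second apart; a huge
    # per-CPU delta on an otherwise idle node matches the symptom above.
    grep LOC /proc/interrupts; sleep 1; grep LOC /proc/interrupts
    # Check whether ksoftirqd is the process burning the CPU time.
    top -b -n 1 | grep ksoftirqd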

In brief
– ST1 (NFSv3) behaves OK
– ST2 (NFSv4.1) initially showed some instabilities
  – Metadata operations lag behind NFSv3
  – Greatly improved meanwhile
– dCache: not fast, but very stable and NFSv4.1 capable
– ST3
  – Not as fast as the advertisements suggest
  – Metadata operations not convincing
– GlusterFS: unusable in our configuration
– FhGFS: strongly depends on the configuration, but could easily outperform any of the others at a fraction of the cost

Conclusions
– mdtest + iozone + h5perf are sufficient to test basic capabilities
  – Poor performance in these tests -> poor performance in real applications
  – Good performance in these tests doesn't always imply good performance in real applications
– One needs to simulate applications on real hardware to get a realistic picture
  – We can offer access to platforms
– Qualitative ratings (stable, unstable, unusable) are helpful