Benchmarking of (CPU) servers
Dr. Ulrich Schwickerath, CERN IT Department, CH-1211 Genève 23, Switzerland



Outline
- Hardware used for HEPSPEC06 measurements
- Influence of SMT and Turbo mode on HEPSPEC06 results
- HEPSPEC06 scaling behavior
- HEPSPEC06 systematic errors
- First look at high-statistics power measurements
- Summary

Hardware used for HEPSPEC06 measurements
- Chassis: SuperMicro CSE-827T-R1200B Twin2, equipped with 4 identical servers
- Servers: 4 Supermicro X8DTT-F boards, each with:
  - Intel® 5520 "Tylersburg" chipset
  - 2 Intel® Xeon L5520 CPUs, 2.26 GHz (Nehalem)
  - 16 GB RAM
  - 2 x 500 GB disks

Can we gain from SMT or Turbo mode?
Idea of this test:
- Perform a "large" number of HEPSPEC06 measurements on identical hardware but with different BIOS settings:
  - Turbo mode off, SMT off (default)
  - Turbo mode on, SMT off
  - Turbo mode off, SMT on
  - Turbo mode on, SMT on
- Histogram the results by machine
- Measure the relative improvement with respect to the default setting, and derive errors if possible

SMT and Turbo mode
[Figure: histograms of the HEPSPEC06 results. RMS values of the four panels, in extraction order: 0.5, 0.8, 0.9, 1.2]

SMT and Turbo mode: relative improvements
[Figure: relative improvements with respect to the default setting. Values shown: 4%, 25%, (27±2)%]

SMT and Turbo mode test: observations
- The two options are correlated: the individual improvements do not add up
- Turbo mode alone gives a (4.1±1.0)% improvement
- SMT alone results in a (25.4±1.3)% performance improvement
- Both together result in (26.9±1.6)%
- Both options increase the uncertainty of the measurements
- Note: the benchmark was run with 1 process per logical CPU (8 or 16 processes)
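
A relative improvement of this kind can be computed from the mean HEPSPEC06 scores of two settings with standard error propagation. The following is a minimal sketch, assuming uncorrelated uncertainties on the two means. The function name and the input scores are hypothetical, since the slides do not give the underlying mean values.

```python
import math

def relative_improvement(new, new_err, ref, ref_err):
    """Return (gain, gain_err) in percent for `new` relative to `ref`.

    Assumes the two uncertainties are uncorrelated (an assumption made
    for this sketch, not something stated on the slides).
    """
    ratio = new / ref
    # Relative errors of a ratio add in quadrature.
    ratio_err = ratio * math.sqrt((new_err / new) ** 2 + (ref_err / ref) ** 2)
    return 100.0 * (ratio - 1.0), 100.0 * ratio_err

# Hypothetical mean scores, chosen only to illustrate the arithmetic.
gain, gain_err = relative_improvement(new=125.0, new_err=1.0, ref=100.0, ref_err=0.5)
print(f"gain = ({gain:.1f} ± {gain_err:.1f})%")
```

The quoted results show the non-additivity directly: 4.1% + 25.4% = 29.5%, which is above the measured (26.9±1.6)% for both options together.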

HEPSPEC06 benchmark: scaling behavior
Idea of this test:
- Test HEPSPEC06 performance as a function of the number of processes
- Use the same machines as in the previous test
- Scan up to 16 processes (= number of logical cores with SMT)
- Compare to theoretical linear scaling, based on the first two measurement points
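
The comparison with linear scaling can be sketched as a straight line through the first two measurement points, extrapolated to higher process counts. The helper names and the scan values below are invented for illustration and are not measurements from the slides.

```python
def linear_reference(n1, s1, n2, s2):
    """Return a function giving the score expected from the line
    through the first two (processes, score) measurement points."""
    slope = (s2 - s1) / (n2 - n1)
    intercept = s1 - slope * n1
    return lambda n: slope * n + intercept

def percent_below(measured, expected):
    """Deficit of the measured score relative to the expectation, in percent."""
    return 100.0 * (expected - measured) / expected

# Hypothetical scan: number of processes -> HEPSPEC06 score.
scan = {1: 12.0, 2: 24.0, 4: 48.0, 8: 86.4}

expected = linear_reference(1, scan[1], 2, scan[2])
deficit_at_8 = percent_below(scan[8], expected(8))
print(f"{deficit_at_8:.1f}% below linear scaling at 8 processes")
```

Anchoring the line on the first two points means that any measurement error on those two points propagates into the whole reference curve, so the quoted deficits inherit that uncertainty.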

HEPSPEC06 benchmark: scaling behavior
[Figure: HEPSPEC06 performance versus number of processes, compared with the theoretical linear scaling. The plot is annotated step by step over several slides:]
- Linear scaling up to 4 processes
- Non-linear above 4 processes, falling behind expectations
- 19% (10%) below theoretical performance at a working point of one process per physical core if both options are on (off)
- Below one process per physical core, SMT is less important than Turbo mode (green/black curves above red/blue)
- Moderate over-committing can be an advantage
- With SMT on, the performance rises up to 1 process per logical core

Scaling behavior: observations
- Linear scaling up to 4 processes
- Non-linear scaling behavior above 4 processes, below the theoretical curves
- 19% (10%) below theoretical performance at a working point of one process per physical core if both options are on (off)
- Below 1 process per physical core, SMT is less important than Turbo mode
- Even with SMT off, moderate over-committing can be an advantage
- With SMT on, the performance rises up to 1 process per logical core

Alternative application: CDB benchmarking
- CDB benchmark: central piece of the Quattor tool suite
- Metric: real time needed, in seconds
- Compilation of a large number of small files which include each other
- Resolving dependencies
- Used at CERN for node management
[Figure: CDB benchmark results for SMT off/on and Turbo off/on]
- Turbo mode does not help
- SMT is harmful!

Work in progress: systematic errors
Idea of this test: test for systematic variations
- Run the benchmark on all 4 machines with the same settings
- Any significant differences should be treated as systematic errors
[Figure: rear view of the test system enclosure, showing the individual test systems]

Systematic errors (cont.)
[Figure: HEPSPEC06 results for the individual machines. The values are incomplete in this transcript: 90.9± ± ± ±0.4]

Systematic errors (cont.)
- Left-right asymmetry seen for this hardware type
  - Left-hand side machines: 90.7±0.6
  - Right-hand side machines: 91.5±0.5
- Cross-checked with a different Twin2 system from a different vendor
  - Distributions are narrower
  - No left-right asymmetry seen!
- NOT really significant, still... possible reasons:
  - Statistics?
  - Real general hardware issue?
  - Bad cabling of the test system?
  - Just a bad sample box...?
→ needs verification with final machines and more statistics
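
The statement that the asymmetry is not really significant can be checked with a short calculation on the two quoted values. This sketch assumes that the quoted ± values are uncorrelated uncertainties on the two results, which the slides do not state explicitly.

```python
import math

# Values quoted on the slide.
left, left_err = 90.7, 0.6
right, right_err = 91.5, 0.5

diff = right - left
# Uncorrelated uncertainties add in quadrature (assumption of this sketch).
diff_err = math.hypot(left_err, right_err)
n_sigma = diff / diff_err

print(f"difference = {diff:.1f} ± {diff_err:.1f} ({n_sigma:.1f} sigma)")
```

Under this assumption the difference is about one standard deviation, which agrees with the slide's cautious reading.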

Implications for procurement procedures
- Both SMT and Turbo mode increase the spread of the HEPSPEC06 measurements
- Not all CPUs support SMT and Turbo mode
Recommendations:
- If the proposed CPUs support SMT and/or Turbo mode, these have to be switched off for the HEPSPEC performance evaluation for the bid
- For production purposes, and depending on the application to be run, Turbo mode should be switched on
- Switching on SMT on batch worker nodes can help for multi-threaded applications, but...
  - No additional increase of the number of job slots (limited by disk/memory requirements)
  - Possible implications on licensing for commercial batch system solutions
  - Implications for CPU count publishing on the Grid
- Switching on SMT also for dedicated machines needs a case-by-case study, as it can be harmful

Power measurements (preliminary)
- Hardware: "small disk server" test machine with CPU and memory layout similar to the CPU servers
  - Two (redundant) power supply units
  - Hardware RAID controller
- Power measurement unit: ZES/Zimmer Electronics LMG500, 8 channels
  - 2 channels used (one per power supply)
- Measurement conditions
  - Machine idle
  - Each measurement is an average over 20 min
- Main interest of this study: shape of the distributions and variations of the mean values, rather than the absolute mean values

Power factors: using one PSU only or both
[Figure: power factor distributions, shown over two slides. Panel labels: "Ratio of the sum of effective and apparent power" and "Ratio of effective and apparent power". Annotations:]
- 92.8%, non-Gaussian tail
- 80.6%, 2 peaks?!
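
The two quantities named in the panel labels can be written down as follows. This is a sketch under one reading of the labels: the power factor of a single channel is effective power divided by apparent power, and the combined value is the sum of effective powers divided by the sum of apparent powers. The function names and the readings are hypothetical and are not values from the measurement.

```python
def power_factor(effective_w, apparent_va):
    """Power factor of one channel: effective (real) power / apparent power."""
    return effective_w / apparent_va

def combined_power_factor(readings):
    """Ratio of summed effective power to summed apparent power.

    `readings` is a list of (effective_w, apparent_va) tuples,
    one per power supply channel.
    """
    total_effective = sum(effective for effective, _ in readings)
    total_apparent = sum(apparent for _, apparent in readings)
    return total_effective / total_apparent

# Hypothetical readings for the two PSU channels (W, VA).
readings = [(60.0, 75.0), (60.0, 80.0)]

per_psu = [power_factor(p, s) for p, s in readings]
combined = combined_power_factor(readings)
print([f"{pf:.3f}" for pf in per_psu], f"{combined:.3f}")
```

Note that the combined value is weighted by apparent power, so it is in general not the plain average of the two per-PSU power factors.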

Power factors of the two power supplies
[Figures: distributions of the power factor (pf, on the x-axis), built up over five slides. Annotations per slide, in extraction order:]
- Broad distribution, bad mean; narrow distribution, reasonable mean value; mean: 92.8%
- Broad distribution, bad mean; using the other power supply: same mean value, but a distribution twice as broad; bad power supply???
- Broad distribution, bad mean; both units connected, power factor PSU2; fairly narrow for PSU1; not really Gaussian shape; mean 83.7%; what's this?
- Broad distribution; mean 76.5%; both units connected, power factor PSU2; broad distribution; not Gaussian, with 2 peaks!; mean 83.7%; bad power supply?
- Means shown on the overview slide: 76.5%, 83.7%, 92.8%, 82.7%

Power measurements summary
Disclaimer: The results are not very conclusive. Further verification is needed!
- Using only one power supply instead of both seems to be an advantage for power saving
- The two power supplies, although the same model, showed very different performance
- The width of the distributions is non-negligible
- The shape of the distributions is non-Gaussian
Lessons learned:
- Variations of up to 4% seen in apparent and effective power measurements
- Error estimation is difficult due to non-Gaussian distributions
- Significant power factor variations between PSUs of the same model
- Power factor variations of several percentage points are not unlikely
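
When a distribution is non-Gaussian, one common option is to quote the median and a central 68% interval from percentiles instead of a mean and RMS. This is a generic technique and not something the slides propose. The function name and the sample values below are hypothetical.

```python
import statistics

def robust_summary(samples):
    """Return (median, 16th percentile, 84th percentile).

    The percentile interval covers the central 68% of the samples
    without assuming a Gaussian shape.
    """
    # 99 cut points; index 15 is the 16th and index 83 the 84th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return statistics.median(samples), cuts[15], cuts[83]

# Hypothetical two-peaked power factor samples, for illustration only.
samples = [0.74, 0.75, 0.75, 0.76, 0.82, 0.83, 0.83, 0.84] * 5

median, low, high = robust_summary(samples)
print(f"median = {median:.3f}, central 68% interval = [{low:.3f}, {high:.3f}]")
```

For a distribution with two peaks the interval can be wide and asymmetric around the median, which is itself a useful signal that a single mean ± RMS would be misleading.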

Summary
- Up to (27±2)% performance gain from SMT and Turbo mode, in terms of HEPSPEC06
- Scaling in terms of HEPSPEC06 remains below expectations
- Hints of systematic performance variations, depending on the hardware type
- High-statistics power measurements are not very conclusive, but interesting nevertheless
- All presented results may have implications for procurement procedures