Moving out of SI2K How INFN is moving out of SI2K as a benchmark for Worker Nodes performance evaluation Michele Michelotto at pd.infn.it.

Slides:

Advertisements

Similar presentations

Hepmark Valutazione della potenza dei nodi di calcolo nella HEP Michele Michelotto Padova Ferrara Bologna.

Advertisements

Computer Abstractions and Technology

Hepmark project Evaluation of HEP worker nodes Michele Michelotto at pd.infn.it.

SKELETON BASED PERFORMANCE PREDICTION ON SHARED NETWORKS Sukhdeep Sodhi Microsoft Corp Jaspal Subhlok University of Houston.

Lecture 7: 9/17/2002CS170 Fall CS170 Computer Organization and Architecture I Ayman Abdel-Hamid Department of Computer Science Old Dominion University.

Chapter 1 CSF 2009 Computer Performance. Defining Performance Which airplane has the best performance? Chapter 1 — Computer Abstractions and Technology.

CS2214 Recitation Presented By Veejay Sani. Benchmarking SPEC CPU2000 Integer Benchmark Floating Point Benchmark We will only deal with Integer Benchmark.

A comparison of HEP code with SPEC benchmark on multicore worker nodes HEPiX Benchmarking Group Michele Michelotto at pd.infn.it.

Performance benchmark of LHCb code on state-of-the-art x86 architectures Daniel Hugo Campora Perez, Niko Neufled, Rainer Schwemmer CHEP Okinawa.

HS06 on the last generation of CPU for HEP server farm Michele Michelotto 1.

Test results Test definition (1) Istituto Nazionale di Fisica Nucleare, Sezione di Roma; (2) Istituto Nazionale di Fisica Nucleare, Sezione di Bologna.

Transition to a new CPU benchmark on behalf of the “GDB benchmarking WG”: HEPIX: Manfred Alef, Helge Meinhard, Michelle Michelotto Experiments: Peter Hristov,

Venkatram Ramanathan 1. Motivation Evolution of Multi-Core Machines and the challenges Background: MapReduce and FREERIDE Co-clustering on FREERIDE Experimental.

University of Maryland Compiler-Assisted Binary Parsing Tugrul Ince PD Week – 27 March 2012.

Computer Performance Computer Engineering Department.

SYNAR Systems Networking and Architecture Group Scheduling on Heterogeneous Multicore Processors Using Architectural Signatures Daniel Shelepov and Alexandra.

3. April 2006Bernd Panzer-Steindel, CERN/IT1 HEPIX 2006 CPU technology session some ‘random walk’

F. Brasolin / A. De Salvo – The ATLAS benchmark suite – May, Benchmarking ATLAS applications Franco Brasolin - INFN Bologna - Alessandro.

C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.

RISC Architecture RISC vs CISC Sherwin Chan.

COP4020 Programming Languages Subroutines and Parameter Passing Prof. Xin Yuan.

HS06 on new CPU, KVM virtual machines and commercial cloud Michele Michelotto 1.

Fast Benchmark Michele Michelotto – INFN Padova Manfred Alef – GridKa Karlsruhe 1.

Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.

Nanco: a large HPC cluster for RBNI (Russell Berrie Nanotechnology Institute) Anne Weill – Zrahia Technion,Computer Center October 2008.

Benchmarking status Status of Benchmarking Helge Meinhard, CERN-IT WLCG Management Board 14-Jul Helge Meinhard (at) CERN.ch.

CERN IT Department CH-1211 Genève 23 Switzerland t IHEPCCC/HEPiX benchmarking WG Helge Meinhard / CERN-IT LCG Management Board 11 December.

A Preliminary Investigation on Optimizing Charm++ for Homogeneous Multi-core Machines Chao Mei 05/02/2008 The 6 th Charm++ Workshop.

Benchmarking Benchmarking in WLCG Helge Meinhard, CERN-IT HEPiX Fall 2015 at BNL 16-Oct Helge Meinhard (at) CERN.ch.

HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09 INFN - Padova michele.michelotto at pd.infn.it.

EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks John Gordon SA1 Face to Face CERN, June.

Processors with Hyper-Threading and AliRoot performance Jiří Chudoba FZÚ, Prague.

HS06 performance per watt and transition to SL6 Michele Michelotto – INFN Padova 1.

HEPMARK2 Consiglio di Sezione 9 Luglio 2012 Michele Michelotto - Padova.

From Westmere to Magny-cours: Hep-Spec06 Cornell U. - Hepix Fall‘10 INFN - Padova michele.michelotto at pd.infn.it.

CERN IT Department CH-1211 Genève 23 Switzerland t SL(C) 5 Migration at CERN CHEP 2009, Prague Ulrich SCHWICKERATH Ricardo SILVA CERN, IT-FIO-FS.

Michele Michelotto – CCR Marzo 2015 Past present future Fall 2012 Beijing IHEP Spring 2013 – INFN Cnaf Fall 2013 – Ann Arbour, Michigan Univ. Spring.

New CPU, new arch, KVM and commercial cloud Michele Michelotto 1.

Chapter 1 Performance & Technology Trends. Outline What is computer architecture? Performance What is performance: latency (response time), throughput.

The last generation of CPU processor for server farm. New challenges Michele Michelotto 1.

Moving out of SI2K How INFN is moving out of SI2K as a benchmark for Worker Nodes performance evaluation Michele Michelotto at pd.infn.it.

CERN IT Department CH-1211 Genève 23 Switzerland t IHEPCCC/HEPiX benchmarking WG Helge Meinhard / CERN-IT Grid Deployment Board 09 January.

SI2K and beyond Michele Michelotto – INFN Padova CCR – Frascati 2007, May 30th.

J.J. Keijser Nikhef Amsterdam Grid Group MyFirstMic experience Jan Just Keijser 26 November 2013.

Computer Architecture & Operations I

Benchmarking of CPU models for HEP application

Measuring Performance II and Logic Design

Brief introduction about “Grid at LNS”

Evaluation of HEP worker nodes Michele Michelotto at pd.infn.it

NFV Compute Acceleration APIs and Evaluation

CSCI206 - Computer Organization & Programming

CCR Autunno 2008 Gruppo Server

Gruppo Server CCR michele.michelotto at pd.infn.it

Computer Architecture & Operations I

CS161 – Design and Architecture of Computer Systems

How to benchmark an HEP worker node

Outline Benchmarking in ATLAS Performance scaling

Low Power processors in HEP

Gruppo Server CCR michele.michelotto at pd.infn.it

How INFN is moving out of SI2K has a benchmark for Worker Nodes

Passive benchmarking of ATLAS Tier-0 CPUs

Morgan Kaufmann Publishers

Transition to a new CPU benchmark

CERN Benchmarking Cluster

CSCI206 - Computer Organization & Programming

CMSC 611: Advanced Computer Architecture

Hybrid Programming with OpenMP and MPI

Multithreading Why & How.

CMSC 611: Advanced Computer Architecture

CS161 – Design and Architecture of Computer Systems

Presentation transcript:

Moving out of SI2K How INFN is moving out of SI2K as a benchmark for Worker Nodes performance evaluation Michele Michelotto at pd.infn.it

October 9th 2007 GDB michele michelotto - INFN2 SI2K frozen SI2K is the benchmark used up to now to measure the computing power of all the HEP experiments –Computing power requested by experiment –Computing power offered by a Tier-0/Tier1/Tier2 SI2K is actually the SPEC CPU Int 2000 benchmark –Came after Spec89, Spec Int 92 and Spec Int 95 –Declared obsolete by SPEC in 2006 –The new benchmark from SPEC is CPU Int 2006

October 9th 2007 GDB michele michelotto - INFN3 Problems with SI2K Impossible to find SPEC Int 2000 results for the new processors (e.g. the not so new Clovertwon 4core) Impossible to find SPEC Int 2006 for old processor (before 2006) It doesn’t match the HEP program behaviour like it did in the past

October 9th 2007 GDB michele michelotto - INFN4 Nominal SI vs real SI SI2K results for the last generation processor afftected by inflation So CERN (and FZK) started to use a new currency: SI2K measured with “gcc”, the gnu C compiler and using two flavour of optimization –High tuning: gcc –O3 –funroll-loops–march=$ARCH –Low tuning: gcc –O2 –fPIC –pthread CERN Proposal: Use as site rating the “Real SI” increased by 50% –Actually this make sense only for a short period of time and for the last generation of processor

October 9th 2007 GDB michele michelotto - INFN5 Nominal SI vs real SI CERN Proposal: Use as site rating the “Real SI” obatained by nominal SI increased by 50% –Actually this make sense only for a short period of time and for the last generation of processor Run n copies in parallel –Where n is the number of cores in the worker node

October 9th 2007 GDB michele michelotto - INFN6 Too many SI2K Take as an example a worker node with two Intel Woodcrest dual core 5160 at 3.06 GHz SI2K nominal: 2929 – 3089 (min – max) SI2K on 4 cores: SI2K gcc-low: 5523 SI2K gcc-high: 7034 SI2K gcc-low + 50%: 8284

October 9th 2007 GDB michele michelotto - INFN7 Even more Actually all the gcc results in the previous slide are on i386 (32bit) if you would like to know how your code is running on 64 bit machine, you can measure Specint INT 2000 with gcc on x86_64. So the worker node with two Intel Woodcrest dual core 5160 at 3.06 GHz SI2K nominal: 2929 – 3089 (min – max) SI2K on 4 cores: SI2K gcc-low: 6021 SI2K gcc-high: 6409 SI2K gcc-low + 50%: 9031

October 9th 2007 GDB michele michelotto - INFN8 A scale factor All these numbers would be only annoying in a world with a unique architecture, in which only clock improves in time You would be able to find a fixed ratio between all those number. But the ratio depends on CPU producer (intel vs AMD) and processor generation (old xeon vs new “core” Xeon

October 9th 2007 GDB michele michelotto - INFN9 The nominal SI2K Big Differences between Intel and AMD when SI2K/GHz are plotted

October 9th 2007 GDB michele michelotto - INFN10 The nominal SI2Krate Smaller differences between Intel and AMD using SI2K Rate /GHz / core SI2K RATE differentiate the dual core (TOP) from the 4 core (bottom)

October 9th 2007 GDB michele michelotto - INFN11 The CERN SI A TEMPORAY Solution: The intel-amd gap now is smaller Intel xeon scores -50% Amd opteron scores only -29% Big differences between 32 and 64 bit SI2K/core/Ghz taken measured by M.Alef and pubblished on the HEPIX site at Caspur

October 9th 2007 GDB michele michelotto - INFN12 SI2K6rate

October 9th 2007 GDB michele michelotto - INFN13 Babar TierA Results If you normalize by core and clock all new processors have the same performance Doubling the older generation cpu SI2006 matches this pattern (pubblished and gcc ratio constant) SI2000-cern better than SI2K nominal SI2000 clearly does’nt work

October 9th 2007 GDB michele michelotto - INFN14 CMS sw SIM and Pythia CMS Montecarlo simulation (32bit) and Pythia (64bit) show the same performance once normalized Both Specint 2006 pubblished and Specint 2006 with gcc show the same behaviour SI2K clearly wrong. SI2K cern better but not as good as SI2006

October 9th 2007 GDB michele michelotto - INFN15 What should be do? We should stop using SI2000 pubblished SI2k CERN a temporary solution but is still makes use of a 7 years old benchmark that is not mantained by SPEC (v1.3)

October 9th 2007 GDB michele michelotto - INFN16 Why SI2006 Best candidate is Spec INT 2006 rate measured with gcc –Spec INT 2006 because it’s new, you can find pubblished results, will have bug fixes, and will be portable to new platform compiler –RATE because you can take in account of scalability issues with the whole machine architecture –Measured with GCC: to keep the environment as close as possible as the experiments.

October 9th 2007 GDB michele michelotto - INFN17 What is missing All the official number are still expressed in terms of SI2K (nominal) So funding agencies are still bounded to use SI2K for all formal agreements A general agreements on the next benchmark is still MISSING

October 9th 2007 GDB michele michelotto - INFN18 Final remarks The working group is still collecting results (A. Crescente, L. Dell’Agnello, D. Salomoni, F. Rosso, A. Sartirana, R.Stroili, D.Salomoni, M.Morandin, A.De Salvo, T.Zani) NB: Results from SI2006 change with –Version of gcc (gcc3 vs gcc4) –32bit vs 64bit –Tuning gcc –O2 –fPIC –pthread vs gcc –O3 So it’s difficult to have an homogeneous set of results Will add Atlas and Babar results when available