SI2K and beyond Michele Michelotto – INFN Padova CCR – Frascati 2007, May 30th.



CCR07 - Rimini - M.Michelotto
CPU outlook: processors available
- Dual-core AMD Opteron 22xx, or the older AMD 2xx
- Dual-core Intel 51xx "Woodcrest"
- Quad-core Intel 53xx "Clovertown"
- Quad-core Intel QX6700 (single processor)
- Quad-core AMD "Barcelona"

AMD vs Intel
- I was interested only in processors that allow at least four cores per box
- I also considered older processors for comparison
- It is difficult to find SPEC CPU2006 results for not-so-old processors (e.g. 22xx)
- It is difficult to find CPU2000 results for very new processors

Specint 2000?
- Does it still make sense to use Specint 2000 as a benchmark?

Intel vs AMD
- SI2000: AMD 2220 vs Intel 5160 → 1749/3061 = 57%
- SI2006: AMD 2220 vs Intel 5160 → 12.2/17.5 = 70%
- SI2000rate: AMD 2220 vs Intel 5160 → 78.3/121 = 65%
- SI2006rate: AMD 2220 vs Intel 5160 → 46.1/52.2 = 88%
- Clock: AMD 2220 vs Intel 5160 → 2800/3000 = 93%
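The ratios on this slide can be recomputed directly from the quoted scores. The short sketch below does only that; the dictionary layout and function name are illustrative choices, not something taken from the talk.

```python
# Scores quoted on the slide, as (AMD 2220, Intel 5160) pairs.
SCORES = {
    "SI2000": (1749, 3061),
    "SI2006": (12.2, 17.5),
    "SI2000rate": (78.3, 121),
    "SI2006rate": (46.1, 52.2),
    "Clock (MHz)": (2800, 3000),
}


def amd_over_intel_percent(amd, intel):
    """Return the AMD/Intel ratio as a whole-number percentage."""
    return round(100 * amd / intel)


# Ratio for every metric, keyed by metric name.
RATIOS = {name: amd_over_intel_percent(a, i) for name, (a, i) in SCORES.items()}

if __name__ == "__main__":
    for name, pct in RATIOS.items():
        print(f"{name:12s} {pct}%")
```

Laid out this way, the spread is easy to see: the same pair of processors is 57% apart by SI2000 but only 88% apart by SI2006rate, while the clocks differ by just 7%.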

HEP code
- How do the processors behave on real HEP code?
- To make a comparison I started with the software used by Hans Wenzel (FNAL) in his CHEP 2006 paper “Benchmarking AMD64 and EMT64”:
  - ROOT “stress test”, 32 and 64 bit
  - Pythia, 32 and 64 bit
  - CMS Monte Carlo “Oscar”, 32 bit only
- Waiting for the new CMS software

ROOT “stress test”
- The QX6700 runs below what its clock or SI score would suggest
- Good improvement, up to 46%, when running 64/64 with respect to 32/64
- 32/32 is more or less the same as 32/64
- No difference between 2 GB and 8 GB

Pythia, 100K SUSY events
- Good improvement, up to 24%, when running 64/64 with respect to 32/64
- No difference between 2 GB and 8 GB

CMS_sw evt
- More than 1000 SI per GHz!!!
- Only 600 SI per GHz
- Evt/sec per clock very close
- AMD better at evt/sec per Specint

CMS_sw evt
- The Intel 5160 has the best performance per core
- The Intel 5345 has the best overall throughput
- The AMD had a slower clock; if you divide by clock, performance is very close
- The Intel 5160 has a very high published SI2000 score:
  - because of bigger caches
  - because of the different memory footprint of SI2K vs CMS
- Performance per Specint 2000 is better on the AMD
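The "SI per GHz" figures can be reproduced from the SI2000 scores and clocks quoted on the earlier "Intel vs AMD" slide. The sketch below assumes that the "more than 1000" figure refers to the Intel 5160 and the "only 600" figure to the AMD 2220; the slides do not label them explicitly, and the names used here are illustrative.

```python
# Published SI2000 scores and clocks from the "Intel vs AMD" slide.
# Mapping the two "SI per GHz" remarks to these CPUs is an assumption.
CPUS = {
    "Intel 5160": {"si2000": 3061, "clock_ghz": 3.0},
    "AMD 2220": {"si2000": 1749, "clock_ghz": 2.8},
}


def si_per_ghz(si2000, clock_ghz):
    """SI2000 score normalised by clock frequency."""
    return si2000 / clock_ghz


SI_PER_GHZ = {name: si_per_ghz(**cpu) for name, cpu in CPUS.items()}

if __name__ == "__main__":
    for name, value in SI_PER_GHZ.items():
        print(f"{name}: {value:.0f} SI2000 per GHz")
```

If the CMS event rate per clock is nearly equal on the two machines while SI2000 per GHz differs by this much, then SI2000 overstates the Intel advantage for this workload, which is the point the slide makes.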

Specint2000: good or bad?
- Specint 2000 is used in all the technical reports and agreements with funding agencies
- On the other hand:
  - It is being retired; I couldn't find the Intel Clovertown 5345 score
  - Its footprint is too small (designed for 200 MB per core)
  - Some processors, like the Intel 5160, have an “inflated” SI2000 number, probably because of their huge L2 caches
- Maybe other benchmarks correlate better with my results?
- If I get a good correlation (±10%) I'd consider myself satisfied

SPEC CPU2006
- Available since August 2006
- Latest evolution of the SPEC suite (SPEC89, 92, 95, 2000)
- Includes more C++ than CPU2000
- Designed to run in about 1 GB per core
  - I could not run more than 3 copies on some 4-core boxes because of excessive paging
- Less sensitive to cache size
- Difficult to find published results for processors more than 2 years old
- Most published results are on MS Windows, or on Linux with the Intel compiler
- More difficult to run than CPU2000, at least with gcc
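The paging problem follows from simple arithmetic: at roughly 1 GB per copy, the number of copies that fit in RAM can be smaller than the number of cores. The sketch below is a rough estimate only. The 0.5 GB reserved for the operating system and the 4 GB example box are assumptions made for illustration; the slides do not give the memory size of the box that paged.

```python
def max_concurrent_copies(ram_gb, cores, gb_per_copy=1.0, os_reserve_gb=0.5):
    """Estimate how many benchmark copies fit in RAM without paging.

    gb_per_copy reflects the "about 1 GB per core" design target;
    os_reserve_gb is a hypothetical allowance for the OS and services.
    """
    usable_gb = max(ram_gb - os_reserve_gb, 0.0)
    fits_in_memory = int(usable_gb // gb_per_copy)
    # Never run more copies than there are cores.
    return min(cores, fits_in_memory)


if __name__ == "__main__":
    # Hypothetical 4-core boxes with different amounts of RAM.
    for ram_gb in (2, 4, 8):
        print(ram_gb, "GB ->", max_concurrent_copies(ram_gb, cores=4), "copies")
```

Under these assumptions, a 4-core box with 4 GB of RAM holds only 3 copies, while 8 GB is enough for one copy per core.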

My conclusion
- CPU int 2000 is no longer usable as a HEP benchmark for processors newer than 2006
- Looking inside the CPU2006 suite we could find a solution, but much more collaborative work is needed
- The HEPiX working group had a very slow start
- WLCG proposal: use SI2K measured with the CERN tuning and increased by 50%
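The WLCG proposal amounts to a single scaling step. As a minimal sketch, assuming the input is an SI2K value already measured with the CERN tuning (the function name and the sample value of 1000 are hypothetical):

```python
WLCG_SCALING = 1.5  # "increased by 50%", as stated in the proposal


def wlcg_si2k(measured_si2k_cern_tuning):
    """Scale an SI2K value measured with the CERN tuning by +50%."""
    return measured_si2k_cern_tuning * WLCG_SCALING


if __name__ == "__main__":
    # 1000 is a placeholder value, not a measurement from the talk.
    print(wlcg_si2k(1000))
```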

ATLAS Athena Simulation – memory usage per core, 8 cores

          Virt          Res           Shr
32-bit    608m          498m          79m
64-bit    1221m (2.0)   719m (1.44)   85m
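The figures in parentheses appear to be the 64-bit to 32-bit ratios for each column. The sketch below recomputes them from the table and adds a naive upper bound for eight simultaneous jobs, obtained by multiplying the resident size by 8. That bound ignores any sharing of pages between processes and is an illustration, not a figure from the talk.

```python
# Per-process memory in MB, taken from the table.
MEM_MB = {
    "32-bit": {"virt": 608, "res": 498, "shr": 79},
    "64-bit": {"virt": 1221, "res": 719, "shr": 85},
}


def ratio_64_over_32(column):
    """Ratio of the 64-bit figure to the 32-bit figure for one column."""
    return MEM_MB["64-bit"][column] / MEM_MB["32-bit"][column]


def naive_resident_total_mb(build, jobs=8):
    """Upper bound on resident memory for `jobs` processes (no sharing assumed)."""
    return MEM_MB[build]["res"] * jobs


if __name__ == "__main__":
    for column in ("virt", "res", "shr"):
        print(f"{column}: x{ratio_64_over_32(column):.2f}")
    for build in MEM_MB:
        print(f"{build}: up to {naive_resident_total_mb(build)} MB for 8 jobs")
```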

SPEC vs measured
- The difference between the published SPEC values and the measured ones is even greater
- Of course, the compiler and the operating system are different
- Notice also the differences in clock and number of cores

Final comments
- We have already shown that SI2000 is not very meaningful with modern processors
- The same test should be done with SI2006:
  - in both 32-bit and 64-bit environments
  - with both gcc3 and gcc4 compilers
  - comparing HEP code against both measured and declared SPEC values

Example: 5160 vs 2218
- According to SI2K the ratio should be 54%
- With the CERN tuning it moves to a more reasonable 63%
- But according to my 32-bit applications it should be 69% to 71%

Example: 5160 vs 2218
- At 64 bit the ratio according to SI2K does not change: 54%
- With the CERN tuning it moves to 70%
- But according to my 64-bit applications it should be 87%
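These two examples can be checked against the ±10% correlation target stated earlier in the talk. The sketch below treats that target as a relative deviation from the application ratio, which is one possible reading. It also uses 70% as the midpoint of the 69% to 71% range for 32-bit applications. Both choices are assumptions, as are the names used.

```python
# AMD/Intel ratios (percent) from the two example slides.
# The 32-bit application value is the midpoint of the quoted range (assumption).
APP_RATIO = {"32-bit": 70.0, "64-bit": 87.0}
BENCH_RATIO = {
    "32-bit": {"SI2K": 54.0, "SI2K, CERN tuning": 63.0},
    "64-bit": {"SI2K": 54.0, "SI2K, CERN tuning": 70.0},
}


def relative_deviation(bench, app):
    """Signed deviation of the benchmark ratio from the application ratio."""
    return (bench - app) / app


def within_tolerance(bench, app, tol=0.10):
    """True if the benchmark ratio is within `tol` (relative) of the application ratio."""
    return abs(relative_deviation(bench, app)) <= tol


if __name__ == "__main__":
    for mode, benchmarks in BENCH_RATIO.items():
        for name, bench in benchmarks.items():
            dev = relative_deviation(bench, APP_RATIO[mode])
            print(f"{mode:7s} {name:18s} {dev:+.1%}")
```

Under this reading, plain SI2K is off by more than 20% in both modes. The CERN-tuned value sits right at the 10% boundary at 32 bit and misses by almost 20% at 64 bit.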

Example: 5160 vs 2218
- The published SI2006 and SI rate 2006 scores give ratios closer to those of the applications
- The CERN SI2006 figures are not official, but here AMD would even be favoured over Intel

Thank you for your attention