SI2K and beyond
Michele Michelotto – INFN Padova
CCR – Frascati 2007, May 30th
CCR07 - Rimini - M.Michelotto

CPU outlook
Processors available:
  Dual core Opteron AMD 22xx or older AMD 2xx
  Dual core Intel 51xx Woodcrest
  Quad core Intel 53xx Clovertown
  Quad core Intel QX6700 (single processor)
  Quad core AMD “Barcelona”

AMD vs Intel
I was interested only in processors that permit at least four cores per box
I also consider older processors for comparison
It is difficult to find CPU2006 results from SPEC for not-so-old processors (e.g. 22xx)
It is difficult to find CPU2000 results for very new processors

SPECint 2000?
Does it still make sense to use SPECint 2000 as a benchmark?

Intel vs AMD
AMD 2220 vs Intel 5160:
  SI2000:       1749/3061 = 57%
  SI2006:       12.2/17.5 = 70%
  SI2000 rate:  78.3/121  = 65%
  SI2006 rate:  46.1/52.2 = 88%
  Clock (MHz):  2800/3000 = 93%
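
The percentages on this slide are plain quotients of the AMD 2220 figure over the Intel 5160 figure. A minimal Python sketch that reproduces them from the quoted numbers; the dictionary layout and the function name are illustrative and not part of the talk.

```python
# Figures quoted on the slide, as (AMD 2220, Intel 5160) pairs.
SCORES = {
    "SI2000": (1749, 3061),
    "SI2006": (12.2, 17.5),
    "SI2000 rate": (78.3, 121),
    "SI2006 rate": (46.1, 52.2),
    "Clock (MHz)": (2800, 3000),
}


def amd_over_intel(scores):
    """Return {metric: AMD/Intel ratio in percent, rounded to an integer}."""
    return {name: round(100 * amd / intel) for name, (amd, intel) in scores.items()}


if __name__ == "__main__":
    for name, pct in amd_over_intel(SCORES).items():
        print(f"{name}: {pct}%")
```

The gap between the two processors shrinks from SI2000 to SI2006 rate, while the clock ratio stays fixed at 93%.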

HEP code
How do they behave on real HEP code?
To make a comparison I started with the software used by Hans Wenzel (FNAL) in his CHEP 2006 paper “Benchmarking AMD64 and EMT64”:
  ROOT “stress test”, 32 and 64 bit
  Pythia, 32 and 64 bit
  CMS Monte Carlo “Oscar”, 32 bit only
Waiting for the new CMS software

ROOT “stress test”
QX6700 runs below what its clock or SI would suggest
Good improvement, up to 46%, when running 64/64 with respect to 32/64
32/32 is more or less the same as 32/64
No difference between 2 GB and 8 GB

Pythia, 100K SUSY events
Good improvement, up to 24%, when running 64/64 with respect to 32/64
No difference between 2 GB and 8 GB

CMS_sw evt
More than 1000 SI per GHz!!!
Only 600 SI per GHz
Evt/sec per clock very close
AMD better at evt/sec per SPECint
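
The two SI-per-GHz figures are consistent with the published SI2000 scores and clocks quoted on the earlier "Intel vs AMD" slide. The sketch below assumes (this is not stated on the slide) that "more than 1000" refers to the Intel 5160 and "only 600" to the AMD 2220; the function name is illustrative.

```python
def si_per_ghz(si2000, clock_mhz):
    """SPECint 2000 score divided by the clock frequency in GHz."""
    return si2000 / (clock_mhz / 1000.0)


# Assumed mapping to the processors of the "Intel vs AMD" slide.
intel_5160 = si_per_ghz(3061, 3000)
amd_2220 = si_per_ghz(1749, 2800)

if __name__ == "__main__":
    print(f"Intel 5160: {intel_5160:.0f} SI per GHz")
    print(f"AMD 2220:   {amd_2220:.0f} SI per GHz")
```

Under that assumption the Intel part scores just over 1000 SI per GHz and the AMD part a little over 600, which is the imbalance the slide points out.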

CMS_sw evt
Intel 5160 has the best performance per core
Intel 5345 has the best overall throughput
AMD had a slower clock; if you divide by the clock, performance is very close
Intel 5160 has a very high published SI2000:
  because of bigger caches
  because of the different memory footprint of SI2K vs CMS
Performance per SPECint 2000 is better on the AMD

SPECint 2000: good or bad?
SPECint 2000 is used in all the technical reports and agreements with funding agencies
On the other hand:
  It is being retired; I couldn't find the Intel Clovertown 5345 score
  Its footprint is too small (designed for 200 MB per core)
  Some processors, like the Intel 5160, have an “inflated” SI2000 number, probably because of the huge L2 caches
Maybe other benchmarks correlate better with my results?
If I get a good correlation (+/- 10%) I'd consider myself satisfied

SPEC CPU2006
Available since August 2006
Latest evolution of the SPEC suite (SPEC89, 92, 95, 2000)
Includes more C++ than CPU2000
Designed to run in about 1 GB per core
  I could not run more than 3 copies on some 4-core boxes because of excessive paging
Less sensitive to cache size
Difficult to find published results for processors more than two years old
Most published results are on MS Windows or Linux + Intel compiler
More difficult to run than CPU2000, at least with gcc

My conclusion
CPU int 2000 is no longer usable as a HEP benchmark with processors newer than 2006
Looking inside the “CPU 2006 suite” we could find a solution, but much more collaborative work is needed
The HEPiX working group had a very slow start
WLCG proposal: use SI2K measured with CERN tuning and increased by 50%

ATLAS Athena Simulation – memory usage per core, 8 cores
(in parentheses: ratio to the 32-bit value)

          Virt          Res           Shr
32-bit    608m          498m          79m
64-bit    1221m (2.0)   719m (1.44)   85m
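
The figures in parentheses are the 64-bit value divided by the 32-bit value. A minimal Python sketch that recomputes them from the table, also for the shared column, where the slide gives no ratio; variable and function names are illustrative.

```python
# Per-core memory usage in MB, as quoted on the slide.
MEM_32BIT = {"Virt": 608, "Res": 498, "Shr": 79}
MEM_64BIT = {"Virt": 1221, "Res": 719, "Shr": 85}


def growth(mem32, mem64):
    """Return {column: 64-bit/32-bit ratio, rounded to two decimals}."""
    return {col: round(mem64[col] / mem32[col], 2) for col in mem32}


if __name__ == "__main__":
    for col, ratio in growth(MEM_32BIT, MEM_64BIT).items():
        print(f"{col}: x{ratio}")
```

Virtual size roughly doubles and resident size grows by about 44%, while the shared part is nearly unchanged.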

SPEC vs measured
The difference between published SPEC and measured values is even greater
Of course, compiler and operating system are different
Notice also the differences in clock and number of cores

Final comments
We have already demonstrated that SI2000 is not very meaningful with modern processors
The same test should be done with SI2006:
  in both 32 and 64 bit environments
  with both gcc3 and gcc4 compilers
  comparing HEP code against SPEC measured and SPEC declared

Example: 5160 vs 2218
According to SI2K the ratio should be 54%
Using the CERN tuning it shrinks to a more reasonable 63%
But according to my 32-bit applications it should be 69% to 71%

Example: 5160 vs 2218
At 64 bit the ratio according to SI2K does not change: 54%
Using the CERN tuning it becomes 70%
But according to my 64-bit applications it should be 87%

Example: 5160 vs 2218
Published SI2006 or SI2006 rate give ratios closer to those of the applications
The SI2006 figures with CERN tuning are not official, but here AMD would even be favoured over Intel
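
The 5160 vs 2218 example can be checked against the +/- 10% agreement the talk considers satisfactory. The sketch below is one possible reading of that criterion (relative deviation of the benchmark ratio from the application ratio); the function names and the data layout are illustrative, and the 32-bit application ratio is kept as the quoted 69% to 71% range.

```python
TOLERANCE = 0.10  # the +/- 10% agreement mentioned in the talk


def relative_error(predicted, measured):
    """Signed relative deviation of a benchmark ratio from the application ratio."""
    return (predicted - measured) / measured


def within_tolerance(predicted, measured, tol=TOLERANCE):
    """True if the benchmark ratio is within tol of the application ratio."""
    return abs(relative_error(predicted, measured)) <= tol


# Ratios quoted on the slides, as fractions.
CASES = {
    "32-bit": {"apps": (0.69, 0.71), "SI2K published": 0.54, "SI2K CERN tuning": 0.63},
    "64-bit": {"apps": (0.87,), "SI2K published": 0.54, "SI2K CERN tuning": 0.70},
}


def report(cases):
    """Return (mode, benchmark, ok) rows; ok means every application ratio is matched."""
    rows = []
    for mode, data in cases.items():
        for bench in ("SI2K published", "SI2K CERN tuning"):
            ok = all(within_tolerance(data[bench], m) for m in data["apps"])
            rows.append((mode, bench, ok))
    return rows


if __name__ == "__main__":
    for mode, bench, ok in report(CASES):
        print(f"{mode:7s} {bench:17s} within 10%: {ok}")
```

With these numbers neither SI2K variant stays inside the 10% band over the full application range: the CERN-tuned 32-bit value is close (about 9% below the 69% end, about 11% below the 71% end), and the 64-bit values are far off.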

Thank you for your attention