Comparing dual- and quad-core performance


Harald Vogt (DESY), Michael Volkmann (HUB), Peter Wegner (DESY), Stephan Wiesand (DESY)
Peter Wegner, Multi-core performance in High Energy Physics, HEPiX Workshop 2007

Outline
- Benchmark hardware and software
- Architectural aspects
- Results

Benchmark systems

System      Model      OS             CPU                             RAM
blade01     PE1955     SL4/64         Clovertown 5345, 2.33 GHz x 4   8 x 2 GB DDR2-667 FBD
blade01.32             SL3/64 32-app
blade02                SL3/32
cube6.ht    Prec. 560  SL3/32         Xeon, 2.4 GHz x 1, +HT          2 GB RD-800
cube6.noht             SL3/32
erinye1     PE1950     SL4/64         Woodcrest 5160, 3.0 GHz x 2     4 x 2 GB DDR2-667 FBD
galaxy30    X4100      SL4/64         Opteron 280, 2.4 GHz x 2        4 x 2 GB DDR-400
linfini     V20z       SL4/64         Opteron 250, 2.4 GHz x 1        4 x 1 GB DDR-400
victoria    PE1950     SL4/64         Dempsey, 3.0 GHz x 2, +HT       4 x 512 MB DDR2-667 FBD

Software

ROOT Stress Benchmark
- ROOT: 5.08.00
- n parallel stress benchmarks run in batch mode, 3000 events (stress -b 3000)
- 2 consecutive runs; results are from the first run (the 2nd run keeps the load at the same level during all measurements)
- Test directories were created in /dev/shm (no I/O to disk)
- Power management (Demand Based Switching and PowerNow, respectively) was enabled on blade01 and galaxy30

Athena Framework
- Versions: 12.0.3 (32-Bit), 12.3.0 (64-Bit, simulation only)
- Simulation, Reconstruction: 10 Higgs events, H->ZZ->llqq (DC3.005305.PythiaH185zzqqll.py)
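The ROOT stress procedure (n copies of the benchmark started together, each in its own directory under /dev/shm) could be driven by something like the sketch below. This is only an illustration and not the authors' script: the function names `run_parallel` and `_timed_run` and the per-job directory layout are hypothetical, and it covers a single run rather than the two consecutive runs used in the measurements.

```python
import subprocess
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def _timed_run(cmd, workdir):
    """Run cmd inside workdir and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=workdir, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_parallel(cmd, n_jobs, scratch_root="/dev/shm"):
    """Run n_jobs copies of cmd concurrently, one scratch directory per job.

    The scratch directories live under scratch_root (RAM-backed /dev/shm by
    default, so the jobs do no disk I/O) and are removed afterwards.
    Returns the wall-clock time of each job.
    """
    with tempfile.TemporaryDirectory(dir=scratch_root) as top:
        workdirs = []
        for i in range(n_jobs):
            workdir = Path(top) / f"job{i}"
            workdir.mkdir()
            workdirs.append(workdir)
        # One thread per job; each thread only waits on its own subprocess.
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            return list(pool.map(lambda d: _timed_run(cmd, d), workdirs))
```

Assuming the ROOT `stress` executable is on the PATH, a call such as `run_parallel(["stress", "-b", "3000"], n_jobs=8)` would correspond to one 8-job measurement point.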

Architectural aspects – Chipset

Architectural aspects – Chipset, cont.
- Bottlenecks?
- Role of the snoop filter in the 5000X chipset?
[Figure: chipset diagrams with cores 1-4 and cores 1-8; bandwidth labels 2 x 10.6 GB/s and 4 x 5.3 GB/s]
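The bottleneck question can be made concrete with a little arithmetic. The sketch below assumes that each bandwidth figure on the slide is an aggregate shared evenly by all cores of the system; the slide does not state this, and the helper `per_core_bandwidth` is a hypothetical name.

```python
def per_core_bandwidth(links, gb_per_s_each, n_cores):
    """Aggregate bandwidth of identical links, split evenly across cores."""
    return links * gb_per_s_each / n_cores


# Bandwidth labels from the slide, evaluated for 4 and 8 cores.
for label, links, each in (("2 x 10.6 GB/s", 2, 10.6), ("4 x 5.3 GB/s", 4, 5.3)):
    for cores in (4, 8):
        share = per_core_bandwidth(links, each, cores)
        print(f"{label}: {cores} cores -> {share:.2f} GB/s per core")
```

Under this even-sharing assumption, doubling the core count on the same chipset halves the bandwidth available to each core, which is why memory-bound jobs are the ones to watch when going from dual- to quad-core.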

ROOT stress test (ROOT 5.08.00)
[Figure panels: 64-Bit ROOT on quad-core Clovertown; 32-Bit ROOT on quad-core Clovertown; 64-Bit ROOT on dual-core Woodcrest; 64-Bit ROOT on dual-core Opteron]

ROOT stress test (ROOT 5.08.00)

ATLAS Athena Simulation – processing time
[Figure: processing time (sec) vs. number of jobs]

ATLAS Athena Simulation – scaling behavior
[Figure: Tbest/Tn vs. number of jobs]
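The scaling plots use the ratio Tbest/Tn. The slides do not define the two symbols; the sketch below assumes that Tn is the processing time measured with n parallel jobs and Tbest is the shortest of those times, so a value of 1.0 means no slowdown. The function name `scaling` and the timings are placeholders invented for illustration, not measured results.

```python
def scaling(times_by_jobs):
    """Map number of jobs n to Tbest / Tn, with Tbest the smallest time."""
    t_best = min(times_by_jobs.values())
    return {n: t_best / t for n, t in sorted(times_by_jobs.items())}


# Placeholder timings in seconds (NOT taken from the presentation).
example = {1: 100.0, 2: 100.0, 4: 105.0, 8: 125.0}
for n, ratio in scaling(example).items():
    print(f"{n} jobs: Tbest/Tn = {ratio:.2f}")
```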

ATLAS Athena Simulation – memory usage per core, 8 cores (top)

          Virt          Res           Shr
32-Bit    608m          498m          79m
64-Bit    1221m (2.0)   719m (1.44)   85m

Values in parentheses are the ratio of the 64-Bit to the 32-Bit figure.
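The factors quoted in parentheses can be reproduced from the raw top values. A small sketch, with the hypothetical helper `to_mb` handling only the "m" suffix that appears on this slide:

```python
def to_mb(value):
    """Convert a top-style size such as '608m' to megabytes."""
    if not value.endswith("m"):
        raise ValueError("only the 'm' suffix is handled in this sketch")
    return float(value[:-1])


# Per-core memory usage as reported on the slide.
usage = {
    "32-Bit": {"Virt": "608m", "Res": "498m", "Shr": "79m"},
    "64-Bit": {"Virt": "1221m", "Res": "719m", "Shr": "85m"},
}

# Ratio of the 64-Bit to the 32-Bit figure for each column.
ratios = {
    col: to_mb(usage["64-Bit"][col]) / to_mb(usage["32-Bit"][col])
    for col in ("Virt", "Res", "Shr")
}
for col, ratio in ratios.items():
    print(f"{col}: 64-Bit / 32-Bit = {ratio:.2f}")
```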

ATLAS Athena Reconstruction – processing time
[Figure: processing time (sec) vs. number of jobs]

ATLAS Athena Reconstruction – scaling behavior
[Figure: Tbest/Tn vs. number of jobs]

ATLAS Athena event generation – processing time
[Figure: processing time (sec) vs. number of jobs]

ATLAS Athena event generation – scaling behavior
[Figure: Tbest/Tn vs. number of jobs]

Performance per Watt

Conclusions
- With more than 4 jobs per system, quad-cores scale extremely well for simulation and well for reconstruction compared to dual-cores, even when quad-core CPUs with a lower clock frequency are used
- 64-Bit applications give an approx. 20% performance gain over 32-Bit
- Open issues: particle astrophysics and LQCD benchmarks

References
- http://root.cern.ch/root/Benchmark.html
- Atlas Athena framework: https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookAthenaFramework
- Christian Vilsbeck (http://www.tecchannel.de, 14.11.2006)