1
The High Performance Cluster for QCD Calculations: System Monitoring and Benchmarking
Lucas Fernandez Seivane, Summer Student 2002, IT Group, DESY Hamburg. Supervisor: Andreas Gellrich. Oviedo University (Spain).
2
Topics: Some ideas of QM; the QFT problem; lattice field theory;
What can we get?; approaches to the computing; lattice.desy.de: hardware, software; the stuff we made: Clumon; possible improvements.
3
Let’s do some physics… QM, “real behavior” of the world: ‘fuzzy world’
Relativity means causality (cause must precede consequence!). Any complete description of Nature must combine both ideas. The only consistent way of doing this is … QUANTUM FIELD THEORY.
4
The QFT Problem: impossible to solve exactly, hence the PERTURBATIVE APPROACH.
It requires a small coupling constant (like $\alpha_{em} = 1/137$). Example: QED (the strange theory of light and matter). Taylor expansion: $\alpha_{em} + \alpha_{em}^2/2 + \alpha_{em}^3/6 + \dots$
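To see why a small coupling matters, the toy series from this slide can be evaluated term by term. The sketch below is not a QED calculation. It only compares how fast the terms alpha^n / n! shrink for alpha = 1/137 and for a coupling of order one. The value 1.0 is an assumption chosen for contrast, and the function names are hypothetical.

```python
from math import factorial


def series_terms(alpha, n_terms=4):
    """Terms alpha**n / n! for n = 1..n_terms (the toy series from the slide)."""
    return [alpha ** n / factorial(n) for n in range(1, n_terms + 1)]


def ratio_second_to_first(alpha):
    """Size of the second term relative to the first one."""
    first, second = series_terms(alpha, 2)
    return second / first


if __name__ == "__main__":
    cases = [
        ("QED-like, alpha = 1/137", 1 / 137),
        ("order-one coupling (assumed value)", 1.0),
    ]
    for label, alpha in cases:
        terms = ", ".join("%.3e" % t for t in series_terms(alpha))
        print("%s: %s" % (label, terms))
```

For the QED-like value each term is suppressed by more than two orders of magnitude relative to the previous one, so truncating after a few terms is reasonable. For a coupling of order one the first terms are of comparable size.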
5
… but for QCD the coupling constant is not small (at least at low energies).
We cannot explain a proton (at least analytically)! We need something exact (the LATTICE is EXACT*).
6
Lattice field theory: a generic tool for approaching non-perturbative QFT.
It is most necessary in QCD (non-perturbative aspects), but it is also of purely theoretical interest (Wilson approach).
7
What can we get? We are interested in the spectra (bound states, masses of particles). We can get them by means of correlation functions: if we could calculate them exactly, we would have solved the theory. They are extracted from path integrals (foil 1). The problem is calculating path integrals. The lattice can calculate path integrals.
8
A Naïve Approach: Discretize space-time. Use Monte Carlo methods to choose field configurations (random generators). Evaluate path integrals and correlation functions numerically! (Typical lattice sizes: a = 0.05-0.1 fm, 1/a = 2 GeV, L = 32) but…
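The three steps above (discretize, sample configurations with Monte Carlo, measure) can be sketched on a toy problem. The code below is an illustration only, not the QCD code run on the cluster: it applies the Metropolis algorithm to the Euclidean path integral of a one-dimensional harmonic oscillator on a periodic time lattice and measures the average of x squared. All parameter values and names are assumptions chosen for the example.

```python
import math
import random


def metropolis_x2(n_sites=32, a=0.5, mass=1.0, omega=1.0,
                  n_sweeps=4000, n_thermal=500, step=1.0, seed=1):
    """Estimate <x^2> for a lattice harmonic oscillator with Metropolis updates."""
    rng = random.Random(seed)          # seeded so the run is reproducible
    x = [0.0] * n_sites                # one field value per time slice

    def local_action(i, xi):
        # Part of the discretized Euclidean action that depends on site i
        left = x[(i - 1) % n_sites]    # periodic boundary conditions
        right = x[(i + 1) % n_sites]
        kinetic = 0.5 * mass * ((xi - left) ** 2 + (right - xi) ** 2) / a
        potential = 0.5 * a * mass * omega ** 2 * xi ** 2
        return kinetic + potential

    total = 0.0
    n_meas = 0
    for sweep in range(n_sweeps):
        for i in range(n_sites):
            old = x[i]
            new = old + rng.uniform(-step, step)   # random proposal
            delta = local_action(i, new) - local_action(i, old)
            # Accept with probability min(1, exp(-delta))
            if delta <= 0 or rng.random() < math.exp(-delta):
                x[i] = new
        if sweep >= n_thermal:         # skip thermalization sweeps
            total += sum(v * v for v in x) / n_sites
            n_meas += 1
    return total / n_meas


if __name__ == "__main__":
    print("<x^2> estimate:", metropolis_x2())
```

With these parameters the estimate should land near the continuum ground-state value 1/(2 m omega) = 0.5, up to discretization and statistical errors. In lattice QCD the same idea is applied to gauge and quark fields on a four-dimensional lattice, which is where the cost comes from.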
9
…but: huge computer power is needed.
The integrals are highly dimensional. The calculation requires computing the inverse of an “infinite”-dimensional matrix, which takes a lot of CPU time and RAM. That’s why we need clusters, supercomputers or special machines (to divide the work). The amount of data transferred is not so important; the deciding factors are the LATENCY of the network and the scalability above 1 TFlops.
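A rough size estimate shows why the matrix is called “infinite”-dimensional in practice. The sketch below uses the typical sizes quoted earlier (a = 0.05-0.1 fm, L = 32) and makes three assumptions that are not in the slides: a hypercubic L^4 lattice, 12 complex components per site (3 colours times 4 spin components, as for Wilson-type fermions), and double-precision complex numbers. The function names are hypothetical.

```python
HBAR_C_GEV_FM = 0.1973269804  # hbar * c in GeV * fm


def inverse_spacing_gev(a_fm):
    """Convert a lattice spacing in fm into the momentum scale 1/a in GeV."""
    return HBAR_C_GEV_FM / a_fm


def fermion_matrix_dimension(L, components_per_site=12):
    """Number of rows of the fermion matrix on an L^4 lattice (assumed geometry)."""
    return components_per_site * L ** 4


def vector_memory_mib(L, components_per_site=12, bytes_per_complex=16):
    """Memory for ONE fermion vector in MiB (assumed double-precision complex)."""
    n = fermion_matrix_dimension(L, components_per_site)
    return n * bytes_per_complex / 2 ** 20


if __name__ == "__main__":
    for a_fm in (0.05, 0.1):
        print("a = %.2f fm -> 1/a = %.2f GeV" % (a_fm, inverse_spacing_gev(a_fm)))
    print("matrix dimension for L = 32:", fermion_matrix_dimension(32))
    print("one vector for L = 32: %.0f MiB" % vector_memory_mib(32))
```

Under these assumptions the matrix has about 12.6 million rows. A dense inverse would never fit in memory, so in practice one solves linear systems iteratively on vectors of this size, and each node needs data from its neighbours at every step. That is where network latency enters.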
10
How can we get it? General-purpose supercomputers:
very expensive; rigid (hardware upgrades are difficult). Fully custom parallel machines: completely optimized; usable only for this purpose (difficult to recycle); hardware and software must be designed, developed and built (or modified). Commodity clusters: “cheap PC” components; completely customizable; easy to upgrade / recycle.
11
Machines. Commercial supercomputers:
Cray T3E, Fujitsu VPP77, NEC SX-4, Hitachi SR8000… Parallel machines: APEmille/apeNEXT (INFN/DESY); QCDSP/QCDOC (CU/UKQCD/Riken); CP-PACS (Tsukuba/Hitachi). Commodity clusters + fast networking: low latency (fast networking); high speed; standard software and programming environments.
12
Lattice: cluster bought from a company (Megware), Beowulf type (1 master, 32 slaves). Before the upgrade (some weeks ago), 32 nodes: Intel Xeon P4 1.7 GHz, 256 KB cache; 1 GB Rambus RAM; 2 64-bit PCI slots; 18 GB SCSI hard disks. Fast Ethernet switch (normal networking, NFS disk mounting). Myrinet network (low latency). Upgrade (August 2002): 16 nodes with 2 Intel Xeon P4 1.7 GHz, 256 KB cache; 16 nodes with 2 Intel Xeon P4 2.0 GHz, 512 KB cache.
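The reason for having a second, low-latency network next to Fast Ethernet can be illustrated with the simplest transfer-time model: time = latency + size / bandwidth. The sketch below is not a measurement of this cluster. The latency and bandwidth figures are illustrative assumptions (only the 12.5 MB/s of a 100 Mbit/s link follows from arithmetic), and the names are hypothetical.

```python
def transfer_time_us(message_bytes, latency_us, bandwidth_mb_per_s):
    """Time in microseconds to send one message (1 MB/s equals 1 byte per microsecond)."""
    return latency_us + message_bytes / bandwidth_mb_per_s


def latency_fraction(message_bytes, latency_us, bandwidth_mb_per_s):
    """Fraction of the transfer time spent waiting on latency."""
    total = transfer_time_us(message_bytes, latency_us, bandwidth_mb_per_s)
    return latency_us / total


# Illustrative, assumed figures; not measured on lattice.desy.de
NETWORKS = {
    "Fast Ethernet (assumed)": {"latency_us": 100.0, "bandwidth_mb_per_s": 12.5},
    "Myrinet (assumed)": {"latency_us": 10.0, "bandwidth_mb_per_s": 250.0},
}

if __name__ == "__main__":
    for size in (64, 1024, 65536):
        for name, net in NETWORKS.items():
            t = transfer_time_us(size, **net)
            f = latency_fraction(size, **net)
            print("%6d bytes over %-24s %9.1f us (%.0f%% latency)"
                  % (size, name, t, 100 * f))
```

For the small messages exchanged between neighbouring nodes, almost all of the time is latency in this model, which is consistent with the earlier remark that latency matters more than the amount of data transferred.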
13
Lattice cluster@DESY(2)
Software: SuSE Linux (modified by Megware); MPICH-GM (implementation of MPICH, the MPI Chameleon library, for the Myrinet GM system); Megware Clustware (modified OpenSCE/SCMS): tool for monitoring and administration (but no logs).
14
Lattice cluster@DESY(3)
Andreas Gellrich's first version: provides logs and monitoring; written in Perl (customizable).
15
Lattice cluster@DESY(4)
New version (by me and Andreas Gellrich): also provides graphical data and another log measure; uses MRTG to graph the data.
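For readers unfamiliar with MRTG: besides SNMP polling, it can graph any quantity produced by an external program, as long as that program prints four lines (first value, second value, an uptime string, a target name). The sketch below shows that output format for a node's load average. It is not the Clumon code, which was written in Perl. The function names, the scaling factor and the node name `node01` are hypothetical.

```python
def parse_loadavg(text):
    """Return the 1- and 5-minute load averages from /proc/loadavg-style text."""
    fields = text.split()
    return float(fields[0]), float(fields[1])


def mrtg_report(value1, value2, uptime, name):
    """Format two values as the four lines MRTG expects from an external program."""
    lines = [str(int(round(value1))), str(int(round(value2))), uptime, name]
    return "\n".join(lines)


def loadavg_report(loadavg_text, uptime, name, scale=100):
    """Build an MRTG report from load averages, scaled to integers (assumed scale)."""
    one_min, five_min = parse_loadavg(loadavg_text)
    return mrtg_report(one_min * scale, five_min * scale, uptime, name)


if __name__ == "__main__":
    # Sample text in the format of /proc/loadavg; values are made up
    sample = "0.52 0.48 0.40 1/123 4567"
    print(loadavg_report(sample, "3 days", "node01"))
```

The values are scaled by 100 because MRTG works with integers. A script like this would be referenced from a backtick-quoted `Target[...]` entry in the MRTG configuration file.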
16
Clumon v2.0 (1)
17
Clumon v2.0 (2)
18
Work done (in progress)
Getting the flavor of a really high-performance cluster. Learning Perl (more or less) to understand Andreas's tool. Playing around with Andreas's tool. Searching for how to graph this kind of data. Learning how to use MRTG/RRDtool. Some tests and earlier versions. Only the final touches (polishing) remain: time info of the cluster; better documentation of the tools. Play around this last week with other stuff. Prepare the talk and the write-up.
19
Possible Improvements
The cluster is not connected to DESY AFS. Backups / archiving of the stored data are needed (dCache, theoc01). Maybe reinstall the cluster with DESY Linux (to fully know what’s in it). Play around with other cluster stuff: OpenSCE, OSCAR, ROCKS…