The High Performance Cluster for QCD Calculations: System Monitoring and Benchmarking. Lucas Fernandez Seivane, Summer Student 2002.

Presentation transcript:

The High Performance Cluster for QCD Calculations: System Monitoring and Benchmarking
Lucas Fernandez Seivane, Summer Student 2002
IT Group, DESY Hamburg
Supervisor: Andreas Gellrich
Oviedo University (Spain)

Topics
- Some ideas of QM
- The QFT problem
- Lattice field theory
- What can we get?
- Approaches to the computing
- lattice.desy.de:
  - Hardware
  - Software
- The stuff we made: Clumon
- Possible improvements

Let's do some physics…
- QM describes the "real behavior" of the world: a 'fuzzy world'
- Relativity means causality (a cause must precede its consequence!)
- Any complete description of Nature must combine both ideas
- The only consistent way of doing this is … QUANTUM FIELD THEORY

The QFT Problem
- Impossible to solve it exactly, hence the PERTURBATIVE APPROACH
- Needs a small coupling constant (like α_em ≈ 1/137)
- Example: QED (the strange theory of light and matter), expanded as a Taylor series: α_em + α_em²/2 + α_em³/6 + …

… but for QCD
- The coupling constant is not small (at least at low energies)
- We cannot explain (at least analytically) a proton!
- We need something exact (the LATTICE is EXACT*)

Lattice field theory
- A generic tool for approaching non-perturbative QFT
- But most needed in QCD (non-perturbative aspects)
- Also of purely theoretical interest (Wilson approach)

What can we get?
- We are interested in the spectra (bound states, masses of particles)
- We can get them from correlation functions: if we could calculate them exactly, we would have solved the theory
- They are extracted from path integrals (foil 1)
- The problem is calculating path integrals; the lattice can calculate path integrals

A Naïve Approach
- Discretize space-time
- Monte Carlo methods for choosing field configurations (random number generators; see the toy sketch below)
- Numerical evaluation of path integrals and correlation functions! (typical lattice parameters: a ≈ 0.1 fm, 1/a = 2 GeV, L = 32) but…
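
As an aside, here is a minimal sketch of the Metropolis accept/reject step that drives this kind of Monte Carlo, written in Perl (the language of the monitoring tool discussed later) and applied to a 1-D harmonic-oscillator path integral rather than to QCD. The production codes on the cluster are MPI programs in compiled languages; all parameters below are illustrative only.

#!/usr/bin/perl
# Toy Metropolis sampler for a 1-D (quantum-mechanical) path integral,
# illustrating the accept/reject step used in lattice simulations.
# Parameters are illustrative, not tuned for physics.
use strict;
use warnings;

my $N      = 32;     # number of time slices
my $a      = 0.5;    # lattice spacing
my $delta  = 1.0;    # width of the proposal step
my $sweeps = 5000;   # number of full lattice sweeps

my @x = (0) x $N;    # cold start: the path is zero everywhere

# Change in the Euclidean action when site $i is moved to $new
# (harmonic oscillator: S = sum_i [ (x_{i+1}-x_i)^2/(2a) + a*x_i^2/2 ]).
sub delta_S {
    my ($i, $new) = @_;
    my $old = $x[$i];
    my $xp  = $x[ ($i + 1) % $N ];
    my $xm  = $x[ ($i - 1) % $N ];
    my $S = sub {
        my ($xi) = @_;
        return (($xp - $xi)**2 + ($xi - $xm)**2) / (2 * $a) + $a * $xi**2 / 2;
    };
    return $S->($new) - $S->($old);
}

my $accepted = 0;
for my $sweep (1 .. $sweeps) {
    for my $i (0 .. $N - 1) {
        my $new = $x[$i] + $delta * (2 * rand() - 1);   # propose a local change
        my $dS  = delta_S($i, $new);
        if ($dS <= 0 || rand() < exp(-$dS)) {           # Metropolis accept/reject
            $x[$i] = $new;
            $accepted++;
        }
    }
}
printf "acceptance rate: %.2f\n", $accepted / ($sweeps * $N);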

…but
- Huge computer power is needed:
  i. highly dimensional integrals
  ii. the calculation requires computing the inverse of an "infinite"-dimensional matrix, which takes a lot of CPU time and RAM
- That is why we need clusters, supercomputers or special machines (to divide the work)
- The amount of data transferred is not so important; the deciding factors are the LATENCY of the network and the scalability above 1 TFlops (see the sketch below)
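
On this cluster latency is really the domain of MPI ping-pong benchmarks over Myrinet/GM. Purely as an illustration of the idea, the following Perl sketch measures the round-trip time of 1-byte messages over plain TCP (i.e. the Fast Ethernet side) between two nodes; the port number, node names and message count are assumptions, not settings from the talk.

#!/usr/bin/perl
# Rough TCP round-trip-time probe between two nodes.
# Run "pingpong.pl server" on one node, "pingpong.pl client <host>" on another.
# A real latency benchmark here would be an MPI ping-pong over Myrinet/GM;
# this is only an illustrative sketch.
use strict;
use warnings;
use IO::Socket::INET;
use Time::HiRes qw(time);

my $port = 5000;                 # arbitrary, assumed free
my ($mode, $host) = @ARGV;
$mode = '' unless defined $mode;

if ($mode eq 'server') {
    my $srv = IO::Socket::INET->new(
        LocalPort => $port, Listen => 1, Reuse => 1, Proto => 'tcp'
    ) or die "listen: $!";
    my $conn = $srv->accept;
    # Echo every byte straight back.
    while (sysread($conn, my $buf, 1)) {
        syswrite($conn, $buf, 1);
    }
} elsif ($mode eq 'client' and defined $host) {
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host, PeerPort => $port, Proto => 'tcp'
    ) or die "connect: $!";
    my $n  = 10_000;
    my $t0 = time;
    for (1 .. $n) {
        syswrite($sock, 'x', 1);
        sysread($sock, my $reply, 1);
    }
    printf "average round trip: %.1f microseconds\n", (time - $t0) / $n * 1e6;
} else {
    die "usage: $0 server | client <host>\n";
}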

How can we get it?
- General-purpose supercomputers:
  - Very expensive
  - Rigid (hardware upgrades are difficult)
- Fully custom parallel machines:
  - Completely optimized
  - Single-purpose only (difficult to recycle)
  - Need to design, develop and build (or modify) the hardware & software
- Commodity clusters:
  - "Cheap PC" components
  - Completely customizable
  - Easy to upgrade / recycle

Machines
- Commercial supercomputers: Cray T3E, Fujitsu VPP77, NEC SX4, Hitachi SR8000…
- Custom parallel machines: APEmille/apeNEXT (INFN/DESY), QCDSP/QCDOC (Columbia/UKQCD/RIKEN), CP-PACS (Tsukuba/Hitachi)
- Commodity clusters + fast networking:
  - Low latency (fast networking)
  - High speed
  - Standard software and programming environments

Lattice
- Cluster bought from a company (Megware), Beowulf type (1 master, 32 slaves)
- Before the upgrade (some weeks ago), 32 nodes with:
  - Intel XEON P4, 1.7 GHz, 256 KB cache
  - 1 GB Rambus RAM
  - 2 x 64-bit PCI slots
  - 18 GB SCSI hard disks
  - Fast Ethernet switch (normal networking, NFS disk mounting)
  - Myrinet network (low latency)
- Upgrade (August 2002):
  - 16 nodes: 2 x Intel XEON P4, 1.7 GHz, 256 KB cache
  - 16 nodes: 2 x Intel XEON P4, 2.0 GHz, 512 KB cache

Lattice
- Software: SuSE Linux (modified by Megware)
- MPICH-GM (implementation of MPICH, "MPI Chameleon", for the Myrinet GM system)
- Megware Clustware (modified OpenSCE/SCMS): tool for monitoring and administration (but no logs)

Lattice
- First version, by Andreas Gellrich:
  - Provides logs and monitoring
  - Written in Perl (customizable; a sketch of the idea follows below)
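
Clumon's own source is not reproduced here; the following is only a rough Perl sketch of what such a monitor does: poll each node for its load average and append a timestamped line to a log file. The node names, the remote command and the log path are assumptions, not Clumon's actual choices.

#!/usr/bin/perl
# Minimal node monitor in the spirit of Clumon: poll every node for its
# load average and append a timestamped record to a log file.
# Hostnames, remote command and log path are illustrative only.
use strict;
use warnings;

my @nodes   = map { sprintf "node%02d", $_ } 1 .. 32;
my $logfile = "/var/log/clumon-sketch.log";

open my $log, '>>', $logfile or die "cannot open $logfile: $!";

for my $node (@nodes) {
    # 'cat /proc/loadavg' returns e.g. "0.42 0.40 0.38 1/78 1234"
    my $out = `ssh $node cat /proc/loadavg 2>/dev/null`;
    my ($load1) = $out =~ /^(\S+)/;
    printf {$log} "%s %s load=%s\n",
        scalar localtime, $node, defined $load1 ? $load1 : "unreachable";
}
close $log;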

Lattice
- New version, by Andreas Gellrich and me:
  - Adds graphical data and a further logged quantity
  - Uses MRTG to graph the data (see the sketch below)
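
MRTG can take the two values for a graph from an external script named in a backticked Target line of mrtg.cfg; the script is expected to print four lines: the two values, an uptime string, and a target name. The sketch below is an illustrative Perl data source along those lines, not the actual Clumon code; the node names and the choice of quantities are assumptions.

#!/usr/bin/perl
# Sketch of an MRTG external data source: MRTG calls the script and
# expects four output lines (first value, second value, uptime, name).
# Here the two values are the 1-minute load (scaled by 100, since MRTG
# works with integers) and the number of running processes of one node.
# Everything except the MRTG output convention is illustrative.
use strict;
use warnings;

my $node = shift @ARGV or die "usage: $0 <node>\n";

my $loadavg = `ssh $node cat /proc/loadavg 2>/dev/null`;
my ($load1, $running) = ( $loadavg =~ m{^(\S+)\s+\S+\s+\S+\s+(\d+)/\d+} );

my $load_scaled = defined $load1 ? int($load1 * 100) : 0;
$running = 0 unless defined $running;

my $uptime = `ssh $node uptime 2>/dev/null`;
chomp $uptime;

print "$load_scaled\n";
print "$running\n";
print $uptime ? "$uptime\n" : "unknown\n";
print "$node\n";

A Target line in mrtg.cfg would then point at this script once per node, and MRTG takes care of the periodic polling and the graph pages.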

Clumon v2.0 (1)

Clumon v2.0 (2)

Work done (in progress)
- Getting the flavor of a really high-performance cluster
- Learning Perl (more or less) to understand Andreas' tool
- Playing around with Andreas' tool
- Searching for ways to graph this kind of data
- Learning how to use MRTG/RRDtool
- Some tests and previous versions
- Only the last touches (polishing) remain:
  - Time information of the cluster
  - Better documentation of the tools
- Playing around this last week with other stuff
- Preparing the talk and the written report

Possible Improvements
- The cluster is not connected to DESY AFS
- Need for backups / archiving of the stored data (dCache, theoc01)
- Maybe reinstall the cluster with DESY Linux (to know exactly what is in it)
- Play around with other cluster software: OpenSCE, OSCAR, ROCKS…