Designing a PC Farm to Simultaneously Process Separate Computations Through Different Network Topologies Patrick Dreher MIT.

Presentation transcript:

Designing a PC Farm to Simultaneously Process Separate Computations Through Different Network Topologies
Patrick Dreher, MIT

A Multi-Purpose PC Farm
- Goals of the Project
- Functionality and Constraints
- Hardware Selection
- Software Selection
- Operation

Goals of the Project
User requirements:
- Production machine for the experimentalists, for Monte Carlo simulations and physics analysis of experimental data
- Development and testing platform for the theorists, to examine the performance characteristics of the x86 chip design
- Design a way for both experimentalists and theorists to coexist peacefully, sharing the existing PC farm hardware at the same time

Existing PC Farm Hardware
- 20 machines, each configured with:
  - Dual Pentium II 400 MHz CPUs
  - 384 Mbytes of memory
  - 13 Gbytes of disk space
  - Fast Ethernet
- PCs interconnected by Kingston EtherRx 100BASE-TX Fast Ethernet stackable hubs
- A front-end PC connecting the farm nodes to the Internet

PC Farm Software
- Operating system is Red Hat Linux 5.2 for x86
- Linux kernel configured for SMP operation
- Production batch jobs managed through the Network Queuing System (http://www.gnqs.org)

[Figure: network diagram; surviving labels: LAN, HUB]

Constraints for the Project
- No new funds were available to purchase additional CPUs for the existing PC farm
- No new funds were available to purchase a separate PC farm for development and testing of theory codes
- No funds were available at the level needed to purchase high-performance network switches (such as Myrinet)
- Small amounts of funds were available for additional peripherals

Modified PC Farm - Functionality
- Original configuration had 20 machines (40 CPUs) available under a batch queuing system (NQS)
- Modified configuration set aside 4 of the machines (8 CPUs) for the theorists, leaving the other 32 CPUs for production work and analysis of experimental data
- Four 4-port Adaptec network cards were purchased, and one was installed in each of the four machines
- The four machines were networked together in a two-dimensional torus
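A minimal sketch of how the neighbors in such a torus can be computed. The 2 x 2 layout, the row-major node numbering, and the function name `torus_neighbors` are assumptions for illustration; the slides state only that four machines were networked in a two-dimensional torus.

```python
def torus_neighbors(node, rows=2, cols=2):
    """Return the north/south/west/east neighbors of a node on a 2-D torus.

    Nodes are numbered row-major starting at 0. The default 2 x 2 shape is
    an assumption; the slides do not give the layout or the cabling.
    """
    r, c = divmod(node, cols)
    return {
        "north": ((r - 1) % rows) * cols + c,  # wraps from the top row to the bottom
        "south": ((r + 1) % rows) * cols + c,  # wraps from the bottom row to the top
        "west": r * cols + (c - 1) % cols,     # wraps from the first column to the last
        "east": r * cols + (c + 1) % cols,     # wraps from the last column to the first
    }


if __name__ == "__main__":
    for node in range(4):
        print(node, torus_neighbors(node))
```

On a 2 x 2 torus the wraparound link in each dimension leads to the same machine as the direct link, so each node ends up with two links to each of two distinct neighbors. That would account for four ports per machine, though the actual wiring is not described in the slides.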

[Figure: network diagram; surviving labels: LAN, HUB]

Modes of Operation
- Production operation for the experimentalists involved configuring NQS so that it identified 40 CPUs available for production and analysis of data
- An alternate NQS configuration was built that identified only 32 CPUs available for production
- Only one of these two configurations could be installed and operational on the PC farm at a given time
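The CPU counts in the two configurations follow directly from the machine counts given earlier. The sketch below only reproduces that arithmetic; the mode names `"production"` and `"shared"` and the function `cpu_split` are hypothetical labels, not NQS settings.

```python
MACHINES = 20         # machines in the farm (from the slides)
CPUS_PER_MACHINE = 2  # dual-CPU nodes (from the slides)
THEORY_MACHINES = 4   # machines set aside for the theorists (from the slides)


def cpu_split(mode):
    """Return (production_cpus, theory_cpus) for a given mode.

    The mode names are hypothetical labels for the two NQS configurations.
    """
    total = MACHINES * CPUS_PER_MACHINE
    if mode == "production":
        # Full NQS configuration: every CPU is available to the batch queue.
        return total, 0
    if mode == "shared":
        # Alternate NQS configuration: the theory machines are left out.
        theory = THEORY_MACHINES * CPUS_PER_MACHINE
        return total - theory, theory
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("production", "shared"):
        print(mode, cpu_split(mode))
```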

Modes of Operation (cont'd)
When the alternate NQS configuration was loaded:
- Experimentalists would continue to use the 32 CPUs
- The theorists would first log onto the front end and then use ssh to log onto one of the 4 machines not grouped under the alternate NQS configuration
- From this point, theory codes could be started using one, two, or all four machines

Results
Theorists:
- Gathered data as part of a larger program to compare performance across x86, Alpha 21164, and Alpha 21264 chips
- Interprocessor communication using MPI
- Tests of memory bandwidth
- Tests of lattice size versus L2 cache for certain computation routines
Experimentalists:
- Continued Monte Carlo production and analysis of experimental data
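To illustrate the kind of question a lattice-size-versus-cache test addresses, the sketch below estimates the largest lattice whose data would fit in a cache of a given size. Every number in it is an assumption: the slides do not give the lattice dimensionality, the bytes stored per site, or the L2 cache size, and the function names are hypothetical.

```python
def lattice_bytes(side, dims=4, bytes_per_site=8):
    """Memory footprint in bytes of a hypercubic lattice with `side` sites per dimension."""
    return side ** dims * bytes_per_site


def largest_side_in_cache(cache_bytes, dims=4, bytes_per_site=8):
    """Largest lattice side whose full footprint still fits within `cache_bytes`."""
    side = 0
    while lattice_bytes(side + 1, dims, bytes_per_site) <= cache_bytes:
        side += 1
    return side


if __name__ == "__main__":
    assumed_cache = 512 * 1024  # assumed cache size in bytes, for illustration only
    print(largest_side_in_cache(assumed_cache))
```

A timing run that sweeps the lattice side across this threshold would be expected to show a change in performance once the working set no longer fits in cache. The slides do not report the measured values.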
