Slide 1: Medusa: a LIGO Scientific Collaboration Facility and GriPhyN Testbed
University of Wisconsin - Milwaukee
GriPhyN Meeting, October 16, 2001
LIGO-XXXX

Slide 2: Medusa Web Site:

Slide 3: Medusa Overview
Beowulf cluster:
» 296 Gflops peak
» 150 GBytes RAM
» 23 TBytes disk storage
» 30-tape AIT-2 robot
» Fully-meshed switch
» UPS power
296 nodes, each with:
» 1 GHz Pentium III
» 512 MBytes memory
» 100baseT Ethernet
» 80 GByte disk
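The aggregate numbers above follow directly from the per-node specification; a minimal sketch of the arithmetic, assuming one floating-point operation per clock cycle (which is what the quoted 296 Gflops peak implies):

    # Aggregate cluster figures derived from the per-node specs on this slide.
    nodes = 296
    clock_ghz = 1.0          # 1 GHz Pentium III
    ram_mb = 512             # per-node memory
    disk_gb = 80             # per-node disk

    peak_gflops = nodes * clock_ghz        # ~296 Gflops (1 flop/cycle assumed)
    total_ram_gb = nodes * ram_mb / 1000   # ~151 GB, quoted as "150 GBytes RAM"
    total_disk_tb = nodes * disk_gb / 1000 # ~23.7 TB, quoted as "23 TBytes disk"

    print(peak_gflops, total_ram_gb, total_disk_tb)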

Slide 4: Medusa Design Goals
Intended for fast, flexible data analysis prototyping, quick turn-around work, and dedicated analysis.
Data replaceable (from LIGO archive): use inexpensive distributed disks.
Store representative data on disk: use the internet or a small tape robot to transfer it from LIGO.
Analysis is unscheduled and flexible, since the data are on disk. Easy to repeat (parts of) analysis runs.
System crashes are annoying, but not catastrophic: analysis codes can be experimental.
Opportunity to try different software environments.
Hardware reliability target: 1 month uptime.

Slide 5: Some design details...
Choice of processors determined by performance on FFT benchmark code:
» AXP (expensive, slow FFTs)
» Pentium IV (expensive, slower than PIII on our benchmarks)
» Athlon Thunderbird (fast, but concerns about heat/reliability)
» Pentium III (fast, cheap, reliable)
Dual-CPU systems were slow.
Also concerned about the power budget, $$$ budget, and reliability.
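The benchmark code itself is not shown in the slides; the following is only a hypothetical sketch, in Python/NumPy, of the kind of single-node FFT timing loop one could use to compare candidate CPUs (the actual tests presumably used compiled analysis code):

    # Hypothetical FFT timing loop for comparing CPUs; not the original benchmark.
    import time
    import numpy as np

    n = 2**20                                    # transform length (assumption)
    x = np.random.rand(n).astype(np.complex64)   # single-precision complex input

    reps = 20
    t0 = time.perf_counter()
    for _ in range(reps):
        np.fft.fft(x)
    t1 = time.perf_counter()

    per_fft = (t1 - t0) / reps
    # Conventional operation count for a complex FFT: ~5 N log2(N) flops.
    mflops = 5 * n * np.log2(n) / per_fft / 1e6
    print(f"{per_fft * 1e3:.1f} ms per FFT, ~{mflops:.0f} MFLOPS")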

Slide 6: No Rackmounts
Saves about $250/box.
Entirely commodity components.
Space for extra disks and a networking upgrade.
Boxes swappable in a minute.

Slide 7: Some design details...
Motherboard is an Intel D815EFV, a low-cost, high-volume "consumer" grade board.
Real-time monitoring:
» CPU temperature
» motherboard temperature
» CPU fan speed
» case fan speed
» 6 supply voltages
Ethernet "Wake on LAN" for remote power-up of systems.
Micro-ATX form factor rather than ATX (3 PCI slots rather than 5) for smaller boxes.
Lots of fans!
Systems are well balanced:
» memory bus transfers data at 133 MHz x 8 bytes = 1.07 GB/sec
» disks: about 30 MB/sec in block mode
» Ethernet: about 10 MB/sec
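The "well balanced" claim is just a comparison of the three per-node bandwidth figures; a quick check of that arithmetic:

    # Per-node bandwidth balance, using the figures quoted on this slide.
    mem_bus_mb_s = 133e6 * 8 / 1e6   # 133 MHz x 8 bytes -> ~1064 MB/s (~1.07 GB/s)
    disk_mb_s = 30                   # block-mode disk throughput
    enet_mb_s = 10                   # 100baseT Ethernet

    print(f"memory bus: {mem_bus_mb_s:.0f} MB/s")
    print(f"disk:       {disk_mb_s} MB/s ({mem_bus_mb_s / disk_mb_s:.0f}x below memory)")
    print(f"ethernet:   {enet_mb_s} MB/s ({disk_mb_s / enet_mb_s:.0f}x below disk)")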

Slide 8: Some design details...
"Private" network switch: Foundry Networks FastIron III
» Fully meshed, 1800 W
» Accommodates up to 15 blades, each of which is either TX or TX ports
» Will also accommodate 10 Gb/s blades
» All cabling is CAT5e for a potential gigabit upgrade

Slide 9: Networking Topology (diagram)
FastIron III switch (256 Gb/s backplane) connecting the slave nodes S001-S296, the masters m001 (medusa.phys.uwm.edu) and m002 (hydra.phys.uwm.edu), the data server (dataserver.phys.uwm.edu), and the RAID file server (uwmlsc.phys.uwm.edu); links run at 100 Mb/sec (slave nodes) and Gb/sec (front-end machines and servers), with a connection to the Internet.

Slide 10: Cooling & Electrical
Dedicated 5-ton air conditioner.
A dedicated 40 kVA UPS would have cost about $30k; instead used commodity 2250 VA UPSs for $10k.
System uses about 50 Watts/node, 18 kW total.
Three-phase power, 150 amps.
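A back-of-envelope check of the electrical numbers; the gap between the node total and the quoted 18 kW presumably covers the switch, servers, and other equipment, and the power factor used to size the UPSs here is an assumption for illustration:

    # Rough power/UPS arithmetic for the figures on this slide (assumptions noted).
    nodes = 296
    watts_per_node = 50
    node_load_kw = nodes * watts_per_node / 1000   # ~14.8 kW from nodes alone
    total_kw = 18                                  # quoted total system load

    ups_va = 2250                                  # commodity UPS rating
    power_factor = 0.7                             # assumed VA -> W conversion
    usable_w_per_ups = ups_va * power_factor       # ~1575 W per UPS
    ups_count = total_kw * 1000 / usable_w_per_ups # rough count, no headroom

    print(f"nodes draw ~{node_load_kw:.1f} kW of {total_kw} kW total; "
          f"~{ups_count:.0f} x {ups_va} VA UPSs needed (pf {power_factor})")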

Slide 11: Software
Linux kernel, RH 6.2 file structure.
All software resides in a UWM CVS repository:
» Base OS
» Cloning from CD & over the network
» Nodes "interchangeable" - get identity from the dhcp server on the master
Installed tools include LDAS, Condor, Globus, MPICH, LAM.
Log into any machine from any other, for example: rsh s120
Disks of all nodes are automounted from all others, which simplifies data access and system maintenance:
» ls /net/s120/etc
» cp /netdata/s290/file1 /netdata/s290/file2
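As a concrete illustration of how the automounter simplifies data access, here is a hypothetical sketch that sweeps the per-node data disks through the /netdata paths shown above; the node naming follows the slide, but the *.gwf file layout is an assumption:

    # Hypothetical sweep of per-node data disks via the automounter paths
    # (/netdata/sNNN); the frame-file layout (*.gwf) is an assumption.
    import glob
    import os

    nodes = [f"s{i:03d}" for i in range(1, 297)]   # s001 .. s296

    frame_files = []
    for node in nodes:
        node_dir = os.path.join("/netdata", node)
        if os.path.isdir(node_dir):                # skip nodes that are down
            frame_files.extend(glob.glob(os.path.join(node_dir, "*.gwf")))

    print(f"found {len(frame_files)} data files across {len(nodes)} nodes")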

Slide 12: Memory Soft Error Rates
Cosmic rays produce random soft memory errors. Is ECC (Error Checking & Correction) memory needed?
System has 9500 memory chips, ~ transistors.
Modern SDRAM is less sensitive to cosmic-ray-induced errors, so only one inexpensive chipset (VIA 694) supports ECC, and the performance hit is significant (20%).
Soft errors arising from cosmic rays are well studied, and error rates have been measured:
» Stacked-capacitor SDRAM (95% of market): worst-case error rates ~2/day
» Trench Internal Charge (TIC) capacitor SDRAM (5% of market): worst-case error rates ~10/year, expected rates ~2/year
Purchased systems with TIC SDRAM, no ECC.
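To relate these rates to the 1-month hardware-uptime target from slide 4, here is a small sketch converting the quoted rates into mean time between soft errors (interpreting them as cluster-wide rates, which is how the slide appears to use them):

    # Mean time between soft errors implied by the rates quoted on this slide,
    # treated as cluster-wide rates (an interpretation, not stated explicitly).
    rates_per_year = {
        "stacked-capacitor SDRAM, worst case": 2 * 365,   # ~2 per day
        "TIC SDRAM, worst case": 10,
        "TIC SDRAM, expected": 2,
    }
    for memory, rate in rates_per_year.items():
        days_between_errors = 365 / rate
        print(f"{memory}: one soft error every ~{days_between_errors:.1f} days")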

Slide 13: Procurement
Used a 3-week sealed bid with a detailed written specification for all parts.
Systems delivered with the OS installed, "ready to go".
Nodes have a 3-year vendor warranty, with back-up manufacturers' warranties on disks, CPUs, motherboards and memory.
Spare-parts closet at UWM maintained by the vendor.
8 bids, ranging from $729/box to $1200/box.
The bid process was time-consuming, but has protected us.

Slide 14: Overall Hardware Budget
Nodes                              $222 k
Networking switch                  $ 60 k
Air conditioning                   $ 30 k
Tape library                       $ 15 k
RAID file server                   $ 15 k
UPSs                               $ 12 k
Test machines, samples             $ 10 k
Electrical work                    $ 10 k
Shelving, cabling, miscellaneous   $ 10 k
TOTAL                              $384 k
Remaining funds are contingency: networking upgrade, larger tape robot, more powerful front-end machines?