Lattice QCD Computing Project Review


FY06 Procurement
Don Holmgren
Lattice QCD Computing Project Review
Cambridge, MA, May 24-25, 2005

Standard Cluster Layout

Cost Basis – FY2005 Fermi Cluster

  Item            Components                                    Count     Per Node/Unit Cost   Total Cost
  Nodes           "Buckner" Pentium 640, 3.2 GHz, 1 GB memory   260       $1015                $263.9K
  Infiniband      144-Port Spine                                1                              $43.6K
                  24-Port Leaf                                  16        $3318                $53.1K
                  HCA                                                     $442                 $114.9K
                  Cables                                        400       $70                  $28.0K
                  Infiniband total                              260/272   $931/$909            $242K
  Ethernet        Switches, Cables                                        $21                  $5.5K
  Infrastructure  Serial ports, shelves, PDUs                   260       $51                  $13.1K
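As a quick arithmetic check on the table above, a minimal Python sketch that recomputes the line totals; it assumes, as the table suggests, that the HCA, Ethernet, and infrastructure items are priced per node for 260 nodes (small differences from the slide's figures reflect rounding on the slide):

```python
# Sketch: recompute the FY2005 cost-basis line items shown above.
# Assumption: HCA, Ethernet, and infrastructure entries are priced per
# node (260 nodes); unit prices are taken directly from the slide.

NODES = 260

line_items = [
    # (item, count, unit cost in dollars)
    ("Nodes ('Buckner' Pentium 640, 3.2 GHz, 1 GB)", NODES, 1015),
    ("Infiniband 144-port spine switch",             1,     43600),
    ("Infiniband 24-port leaf switches",             16,    3318),
    ("Infiniband HCAs (one per node)",               NODES, 442),
    ("Infiniband cables",                            400,   70),
    ("Ethernet switches and cables (per node)",      NODES, 21),
    ("Infrastructure: serial, shelves, PDUs",        NODES, 51),
]

total = 0
for item, count, unit in line_items:
    cost = count * unit
    total += cost
    print(f"{item:<46s} {count:>4d} x ${unit:>6,d} = ${cost / 1000:7.1f}K")

print(f"{'Equipment total':<46s}{'':>17s} = ${total / 1000:7.1f}K")
```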

JLab Cluster
Assumed configuration:
  Approximately 180 processors
  Either Xeon (Pentium 6xx) or dual Opteron
  Infiniband (or gigE mesh)
Pricing:
  Estimate $700 per node maximum for Infiniband, based on FY05 cost plus Mellanox estimates
  Estimate $1100 total for each Pentium 4 node, based on FY05 cost
  Higher Infiniband cost or lower Opteron cost may shift the best price/performance to AMD
Release to production: March 2006
450 Gflop/sec sustained (1:1 DWF:asqtad)

FNAL Cluster
Assumed configuration:
  800 processors
  Either 1066 MHz FSB Pentium 4 or dual Opteron
  Infiniband
Pricing:
  Estimate $700 per node maximum for Infiniband, based on FY05 cost plus Mellanox estimates
  Estimate $1100 total for each Pentium 4 node, based on FY05 cost
  Higher Infiniband cost or lower Opteron cost may shift the best price/performance to AMD
Release to production: September 2006
1.8 Tflop/sec sustained (1:1 DWF:asqtad)
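For context, a small worked example of what these assumptions imply; it uses only the processor counts, per-node cost estimates ($1100 node plus $700 Infiniband), and sustained-performance targets quoted on the two slides above, and treats each processor as one single-socket node:

```python
# Sketch: implied per-node sustained performance and price/performance
# for the assumed JLab and FNAL FY06 clusters. All inputs are the
# planning estimates quoted on the slides; the output is arithmetic only.

clusters = {
    # name: (nodes, sustained Gflop/s at 1:1 DWF:asqtad)
    "JLab": (180, 450.0),
    "FNAL": (800, 1800.0),
}

COST_PER_NODE = 1100.0 + 700.0   # node estimate + Infiniband estimate, dollars

for name, (nodes, gflops) in clusters.items():
    per_node = gflops / nodes                     # sustained Gflop/s per node
    total_cost = nodes * COST_PER_NODE            # equipment dollars
    dollars_per_mflops = total_cost / (gflops * 1000.0)
    print(f"{name}: {per_node:.2f} Gflop/s per node sustained, "
          f"~${total_cost / 1e6:.2f}M in hardware, ~${dollars_per_mflops:.2f} per Mflop/s")
```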

Schedule Details
Details of the FNAL procurement follow on the next slides.
The JLab procurement is essentially a ¼-scale version of FNAL, but starts earlier.

FNAL Details: Computer Room Reconstruction
Oct. 1, 2005 – July 1, 2006
Existing building has 1.5 MW power, 110 tons AC
  Sufficient for FY05 + FY06
  Not enough capacity for FY07
Proposed expansion: additional 2.0 MW power, 180 tons AC
  Sufficient for 3000+ processors
  Budgeted off-project (FNAL base)
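A rough capacity check makes the "3000+ processors" figure plausible. This is a sketch only: the ~200 W average draw per node is an assumption (typical of single-socket Pentium 4 nodes of that era), not a number from the slide; one ton of air conditioning is taken as the standard 3.517 kW of cooling:

```python
# Sketch: rough capacity check on the proposed computer room expansion.
# Assumptions: ~200 W average draw per node (not stated on the slide);
# 1 ton of air conditioning = 3.517 kW of heat removal.

WATTS_PER_NODE = 200.0
KW_PER_TON = 3.517

added_power_kw = 2000.0              # proposed additional 2.0 MW
added_cooling_kw = 180 * KW_PER_TON  # proposed additional 180 tons of AC

nodes_by_power = added_power_kw * 1000 / WATTS_PER_NODE
nodes_by_cooling = added_cooling_kw * 1000 / WATTS_PER_NODE

print(f"Added cooling capacity: {added_cooling_kw:.0f} kW")
print(f"Nodes supported by added power:   ~{nodes_by_power:,.0f}")
print(f"Nodes supported by added cooling: ~{nodes_by_cooling:,.0f}")
# Under these assumptions cooling, not power, is the binding constraint,
# and it lands near the 3000+ processor figure quoted on the slide.
```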

FNAL Details: Prototyping
Oct. 1, 2005 – Feb. 1, 2006
Evaluate:
  Single data rate vs. double data rate Infiniband
  1066 MHz FSB Intel ia32/x86_64 CPUs
  Opteron motherboards with PCI-Express
  PathScale Infinipath (Infiniband physical layer)
  Dual core (AMD, Intel)

FNAL Details: Network Procurement
Feb. 1, 2006 – May 1, 2006
Infiniband, based on the FY05 cluster and prototyping results
  Understand oversubscription (see the sketch after this slide)
Alternatives if Infiniband is rejected:
  Myrinet
  PathScale Infinipath
  Quadrics
Leaf and spine design based on FY05 results
Standard FNAL RFP process
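To make the "understand oversubscription" item concrete, a minimal sketch of how the oversubscription ratio of a leaf-and-spine fabric is computed; the 16-down/8-up split of a 24-port leaf switch is an illustrative assumption, not the actual FY05 or proposed FY06 design:

```python
# Sketch: oversubscription ratio of a leaf-and-spine Infiniband fabric.
# The port split below (16 node ports, 8 spine uplinks per 24-port leaf)
# is an illustrative assumption only.

def oversubscription(node_ports_per_leaf: int, uplinks_per_leaf: int) -> float:
    """Worst-case ratio of node-facing bandwidth to uplink bandwidth per leaf."""
    return node_ports_per_leaf / uplinks_per_leaf

leaves = 16
node_ports, uplinks = 16, 8          # assumed split of each 24-port leaf switch
print(f"Nodes supported: {leaves * node_ports}")
print(f"Oversubscription: {oversubscription(node_ports, uplinks):.1f}:1")
# Full bisection (1:1) would need as many uplinks as node ports,
# at the cost of more leaf and spine ports per node.
```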

FNAL Details: Computer Procurement
Feb. 1, 2006 – July 1, 2006
Node choice based on prototyping
Standard RFP process

FNAL Details: Infrastructure Design and Procurement
March 1, 2006 – June 15, 2006
Components:
  Ethernet (control and service network)
  Serial lines (consoles, possibly out-of-band IPMI)
  Layout of racks/shelves
  Cable design
  PDUs

FNAL Details: Integration and Testing
July 1, 2006 – Sept. 15, 2006
Consists of:
  Computer room preparation
  Rack/shelf assembly and cabling
  Node installation
  OS installation via network
  Network configuration
  Unit testing
  Application testing
  Release to production at the end

Build-to-Cost
Fixed budget for equipment
  Number of nodes determined by per-node cost
  Performance determined by node count, and also by component choices
Performance risk management:
  Use conservative performance estimates
  Use 18-month doubling times, even though we've seen faster for these applications
  Add float to vendor roadmaps
  Delay purchases to catch significant new components
    Performance improvement must offset the delay (illustrated in the sketch below)
Spending on user support versus equipment:
  Try to run lean on user support
  But must maximize science output
  Evaluate annually and shift funds between effort and equipment
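The "performance improvement must offset the delay" trade-off can be sketched as below; every input number is a hypothetical placeholder except the conservative 18-month doubling time taken from this slide:

```python
# Sketch: build-to-cost trade-off between buying now and delaying to catch
# faster nodes. All inputs are hypothetical placeholders; only the
# conservative 18-month doubling time comes from the slide.

BUDGET = 1_000_000.0          # hypothetical fixed equipment budget, dollars
COST_PER_NODE = 1800.0        # hypothetical node + network cost, dollars
GFLOPS_PER_NODE_NOW = 2.25    # hypothetical sustained Gflop/s per node today
DOUBLING_MONTHS = 18.0        # conservative doubling time from the slide

def deployed_gflops(delay_months: float) -> float:
    """Sustained Gflop/s of the cluster if purchased after a delay."""
    per_node = GFLOPS_PER_NODE_NOW * 2.0 ** (delay_months / DOUBLING_MONTHS)
    nodes = int(BUDGET // COST_PER_NODE)   # node count is set by the budget
    return nodes * per_node

def integrated_tflop_years(delay_months: float, horizon_months: float = 36.0) -> float:
    """Delivered Tflop/s-years over a fixed horizon, with the delay months lost."""
    running_years = (horizon_months - delay_months) / 12.0
    return deployed_gflops(delay_months) / 1000.0 * running_years

for delay in (0, 3, 6, 9):
    print(f"delay {delay:2d} mo: {deployed_gflops(delay):6.0f} Gflop/s deployed, "
          f"{integrated_tflop_years(delay):5.2f} Tflop/s-years over 36 months")
```

The comparison that matters is the integrated delivered Tflop/s-years over the project horizon, not just the size of the machine on the day it is released to production.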

Performance Milestones, FY06-FY09
Measured and estimated asqtad price/performance:
  Blue crosses derive from our "deploy" milestones
  Green line uses an 18-month halving time
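The green line referred to above corresponds to the usual exponential improvement model; with an 18-month halving time the assumed price/performance curve is, schematically,

$$
\frac{\text{price}}{\text{performance}}(t) \;=\; \frac{\text{price}}{\text{performance}}(t_0)\,\times\, 2^{-(t - t_0)/(18\ \text{months})}
$$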

Questions?