Computational Requirements


Computational Requirements
Don Holmgren
LQCD Project Progress Review
May 25-26, 2006, Fermilab

Lattice QCD Code Requirements
- Large sustained floating point delivery
  - O(100 Gflops) – O(1 Tflops) on O(24 hour) jobs
- Large memory bandwidth
  - 1.82 Bytes/Flop single precision
  - 3.63 Bytes/Flop double precision
  - Not cache friendly
- Large network bandwidth (asqtad example)
  - MB/sec required = 0.545 x (Mflops sustained) / L, where L = local lattice size (L^4); see the sketch below
  - O(100 MB/sec) for O(Gflops)
- Communications limit scaling for large node counts
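As a rough illustration of the bandwidth rule above, the Python sketch below evaluates the asqtad formula for a few local lattice sizes. The 1 Gflops sustained figure and the specific values of L are assumptions chosen for illustration, not numbers from the slide.

```python
# Minimal sketch of the asqtad network-bandwidth rule of thumb quoted above:
#   MB/sec required = 0.545 * (Mflops sustained) / L,  with an L^4 local lattice per node.

def required_bandwidth_mb_per_s(mflops_sustained: float, local_lattice_l: int) -> float:
    """Per-node network bandwidth (MB/s) needed to sustain the given Mflops."""
    return 0.545 * mflops_sustained / local_lattice_l

if __name__ == "__main__":
    # Assumed example: a node sustaining 1 Gflops (1000 Mflops) on various local lattices.
    for L in (4, 6, 8, 10):
        bw = required_bandwidth_mb_per_s(1000.0, L)
        print(f"L = {L:2d}: {bw:5.1f} MB/s per node")
    # At L = 6 this gives ~91 MB/s, i.e. O(100 MB/sec) for O(Gflops), as stated above.
```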

Types of LQCD Calculations: Vacuum Gauge Configuration Generation
- Long stochastic evolution that can only be done in a few streams, in which lattice spacing and quark mass are varied
- Computational needs increase as lattice spacing and quark masses decrease
- High end capability computing, O(Tflops) required
- Very low I/O requirements – store a new configuration of O(1 Gbyte) once an hour (see the sketch below)
- Precious outputs – investments of tens of Tflops-yrs
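To make the "very low I/O" point concrete, the short sketch below converts the quoted rate of one O(1 GB) configuration per hour into an average write bandwidth; only the 1 GB and once-per-hour figures come from the slide.

```python
# Average I/O rate implied by writing one O(1 GB) gauge configuration per hour.
config_size_mb = 1024.0            # O(1 Gbyte) per configuration (from the slide)
seconds_between_configs = 3600.0   # roughly one configuration per hour (from the slide)

avg_write_rate = config_size_mb / seconds_between_configs
# ~0.3 MB/s, tiny compared to the O(100 MB/sec) internode bandwidth discussed above.
print(f"Average write rate: {avg_write_rate:.2f} MB/s")
```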

Types of LQCD Calculations: Analysis Using Vacuum Gauge Configurations
- Small jobs (16-128 nodes on a cluster) run using archived gauge configurations, generating propagators
- Smaller jobs (4-8 nodes on a cluster for asqtad) run against stored propagators (correlation calculations)
- Jobs are independent (per gauge configuration), so hundreds can be run in parallel (see the sketch below)
- Capacity computing, O(100 Gflops) per job
- Higher I/O requirements – consume an O(1 GB) file, generate several propagators of size 2-17 GB, temporarily store propagators for later correlation calculations
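Because each configuration is analyzed independently, an analysis campaign maps naturally onto many small batch jobs. The loop below is a hypothetical sketch of such a submission: a PBS-style qsub is assumed, and the configuration file names, node count, and make_propagators.sh script are illustrative rather than the project's actual tooling.

```python
# Hypothetical sketch: one independent propagator job per archived gauge configuration.
import subprocess

# Illustrative configuration file names; a real campaign would enumerate the archive.
configurations = [f"config_{i:04d}.lat" for i in range(200)]

for cfg in configurations:
    # Each small job (16-128 nodes) reads one O(1 GB) configuration and writes
    # propagators of 2-17 GB for later correlation calculations.
    subprocess.run(
        ["qsub", "-l", "nodes=16", "-v", f"CONFIG={cfg}", "make_propagators.sh"],
        check=True,
    )
```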

Further Aspects of Analysis
Multiple groups analyze the same configurations for different physics projects. This requires the coordination of configuration generation across the entire U.S. community, and the coordination of analysis campaigns across multiple groups. For example, multiple groups may want to use the same propagators; coordination saves regeneration.
Performance requirements for analysis vary widely. Some projects require performance similar to configuration generation, while others require much lower performance and are ideal for clusters. Multiple computers are valuable for analysis.

Recent Allocations
The Scientific Program Committee allocates time according to scientific merit. Requests for time in the most recent allocation process (ended April 6/7, 2006; allocations run July 1 – June 30) exceeded what is available.
Normalized requests, in millions of QCDOC and JLab "4G" node hours (note: 1 QCDOC node = 0.35 Gflops; 1 "4G" node = 1.2 Gflops = 3.4 QCDOC nodes; see the conversion sketch below):

             QCDOC    Clusters
Available     87M      17.5M
Requested    160M      50M
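The node-hour normalization follows directly from the per-node sustained rates quoted in the slide. The sketch below applies that conversion and, under an assumed hours-per-year figure, restates the totals as sustained Tflops-years for comparison; only the per-node rates and the millions-of-node-hours figures come from the slide.

```python
# Normalizing allocation requests across machine types (per-node rates from the slide).
QCDOC_GFLOPS = 0.35   # sustained Gflops per QCDOC node
FOURG_GFLOPS = 1.2    # sustained Gflops per JLab "4G" node

print(f"1 '4G' node-hour = {FOURG_GFLOPS / QCDOC_GFLOPS:.1f} QCDOC node-hours")  # ~3.4

# Figures from the slide, in millions of node-hours.
available = {"QCDOC": 87.0, "4G clusters": 17.5}
requested = {"QCDOC": 160.0, "4G clusters": 50.0}

HOURS_PER_YEAR = 8760.0  # assumed 24x7 availability, for illustration only

def tflops_years(node_hours_m: dict) -> float:
    """Convert millions of node-hours to sustained Tflops-years."""
    gflop_hours = (node_hours_m["QCDOC"] * 1e6 * QCDOC_GFLOPS
                   + node_hours_m["4G clusters"] * 1e6 * FOURG_GFLOPS)
    return gflop_hours / 1e3 / HOURS_PER_YEAR

print(f"Available: ~{tflops_years(available):.1f} sustained Tflops-yr")
print(f"Requested: ~{tflops_years(requested):.1f} sustained Tflops-yr")
```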

I/O and Storage Requirements (FY2006)
- Gauge configurations
  - Approximately 10 Tbytes, archived on tape and with subsets resident on disk
  - Stored at multiple sites for protection
  - Individual files are O(100 MB) – O(2 GB)
  - Few are written per large job
- Propagators
  - Approximately 150 Tbytes in total, limited lifetime
  - Individual files are O(100 MB) – O(17 GB)
  - Several are read/written per small job

International Resources – FY2006
Current capabilities (estimates of non-US resources from Steve Gottlieb):

Country                   Sustained Performance
Germany                   10-15 TF
Italy                     5 TF
Japan                     14-18 TF
UK                        4-5 TF
United States
  LQCD Project            8.6 TF
  National Centers        1.0 TF
  Dedicated Machines      4.8 TF
  US Total                14.4 TF

Backup Slide

International Resources
Current capabilities (estimates from Steve Gottlieb's talk):

Location         Type           Size          Peak      Est. Perf.      Total
Paris-Sud        apeNEXT        1 rack        0.8 TF    0.4 TF
Bielefeld                       6 (3) racks   4.9 TF    2.5 TF
DESY (Zeuthen)                  3 racks                 1.2 TF
Julich           BlueGene/L     8 racks       45.8 TF   11.5 TF x ½?    10-15 TF
Munich           SGI Tollhouse  3328 nodes    70 TF     14 TF?? x?
Rome             apeNEXT        12 (8) racks  9.8 TF    5 TF
KEK                             10 racks      57.3 TF   14.3 TF
Tsukuba          PACS-CS        2560 nodes              3.3 TF          14-18 TF
Hitachi                                       2.1 TF    1 TF?
Edinburgh        QCDOC          12 racks                4.2 TF
                                1 rack        5.7 TF    1.4 TF x ?      4-5 TF