1 Maui High Performance Computing Center
Open System Support
An AFRL, MHPCC and UH Collaboration
December 18, 2007
Mike McCraney, MHPCC Operations Director

2 Agenda
- MHPCC Background and History
- Open System Description
- Scheduled and Unscheduled Maintenance
- Application Process
- Additional Information Required
- Summary and Q/A

3 An AFRL Center
- An Air Force Research Laboratory Center
- Operational since 1993
- Managed by the University of Hawaii
  - Subcontractor Partners: SAIC / Boeing
- A DoD High Performance Computing Modernization Program (HPCMP) Distributed Center
- Task Order Contract
  - Maximum Estimated Ordering Value: $181,000,000, performance dependent
  - 10 years: 4-year base period with two 3-year term awards

4 A DoD HPCMP Distributed Center
[Organization chart: High Performance Computing Modernization Program, reporting through the Director, Defense Research and Engineering and the DUSD (Science and Technology).]
Distributed Centers
- Allocated Distributed Centers: Army High Performance Computing Research Center (AHPCRC), Arctic Region Supercomputing Center (ARSC), Maui High Performance Computing Center (MHPCC), Space and Missile Defense Command (SMDC)
- Dedicated Distributed Centers: ATC, AFWA, AEDC, AFRL/IF, Eglin, FNMOC, JFCOM/J9, NAWC-AD, NAWC-CD, NUWC, RTTC, SIMAF, SSCSD, WSMR
Major Shared Resource Centers
- Aeronautical Systems Center (ASC)
- Army Research Laboratory (ARL)
- Engineer Research and Development Center (ERDC)
- Naval Oceanographic Office (NAVO)

5 MHPCC HPC History
- IBM P2SC Typhoon Installed
- IBM P2SC
- IBM P3 Tempest Installed
- IBM Netfinity Huinalu Installed
- IBM P2SC Typhoon Retired
- IBM P4 Tempest Installed
- LNXi Evolocity II Koa Installed
- Cray XD1 Hoku Installed
- IBM P3 Tempest Retired
- IBM P4 Tempest Reassigned
- Dell PowerEdge Jaws Installed

6 Hurricane Configuration Summary
Current Hurricane Configuration:
- Eight 32 processor/32GB "nodes" (IBM P690 Power4)
- Jobs may be scheduled across nodes for a total of 288p
- Shared memory jobs can span up to 32p and 32GB
- 10TB shared disk available to all nodes
- LoadLeveler scheduling (see the example job command file below)
- One job per node (32p chunks), so only 8 simultaneous jobs can be supported
Issues:
- Old technology, reaching end of life, upgradability issues
- Cost prohibitive: power consumption is constant, ~$400,000 annual power cost
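As an illustration of how a LoadLeveler job would claim one of these 32-processor chunks, here is a minimal sketch of a job command file; the job name, wall-clock limit, and executable are hypothetical placeholders rather than actual MHPCC settings.

    #!/bin/sh
    # Hypothetical LoadLeveler command file (a sketch, not MHPCC's real defaults).
    # It requests one full 32-processor P690 node, the scheduling chunk on Hurricane.
    #@ job_name         = hurricane_example
    #@ job_type         = parallel
    #@ node             = 1
    #@ tasks_per_node   = 32
    #@ wall_clock_limit = 01:00:00
    #@ output           = $(job_name).$(jobid).out
    #@ error            = $(job_name).$(jobid).err
    #@ queue
    # POE launches the 32 parallel tasks on the node LoadLeveler allocated.
    poe ./my_app

Such a file would be submitted with llsubmit and monitored with llq; since each job occupies a whole node, at most eight of them can run at the same time.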

7 Dell Configuration Summary
Proposed Shark Configuration:
- 40 4-processor/8GB "nodes" (Intel 3.0GHz Dual Core Woodcrest processors)
- Jobs may be scheduled across nodes for a total of 160p
- Shared memory jobs can span up to 8p and 16GB
- 10TB shared disk available to all nodes
- LSF scheduler (see the example submission script below)
- One job per node (8p chunks), supporting up to 40 simultaneous jobs
Features/Issues:
- Shared use as Open system and TDS (test and development system)
- Much lower power cost (Intel power management)
- System already maintained and in use
- System covered 24x7 (UPS, generator)
- Possible short-notice downtime
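For comparison, a rough sketch of an LSF submission script for one 4-core Shark node follows; the job name, wall-clock limit, executable, and the assumption that MVAPICH's mpirun launches the ranks are placeholders for illustration, not confirmed Shark settings.

    #!/bin/sh
    # Hypothetical LSF script: one job per node, as described on this slide.
    # Names, limits, and the launch command are placeholders.
    #BSUB -J shark_example
    #BSUB -n 4
    #BSUB -R "span[ptile=4]"
    #BSUB -W 0:30
    #BSUB -o shark_example.%J.out
    #BSUB -e shark_example.%J.err
    # Launch the MPI ranks on the cores LSF allocated (MVAPICH mpirun assumed).
    mpirun -np 4 ./my_app

It would be submitted with bsub < shark_example.lsf and monitored with bjobs.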

8 Jaws Architecture
- Head node for system administration
  - "Build" nodes running parallel tools (pdsh, pdcp, etc.; see the admin-command sketch below)
- SSH communications between nodes
  - Localized Infiniband network
  - Private Ethernet
- Dell Remote Access Controllers
  - Private Ethernet
  - Remote power on/off
  - Temperature reporting
  - Operability status alarms
- 10 blades per chassis
- CFS Lustre filesystem
  - Shared access
  - High performance using the Infiniband fabric
[Architecture diagram: user webtop; 3 interactive nodes (12 cores); head node; simulation engine with 1280 batch nodes (5120 cores); 24 Lustre I/O nodes and 1 MDS; DDN 200 TB storage; DREN networks; 10 Gig-E Ethernet, Fibre Channel, and Cisco Infiniband (copper) interconnects through a Cisco 6500 core; Gig-E nodes with 10 Gig-E uplinks, 40 nodes per uplink.]
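To make the administration tooling above concrete, the commands below sketch typical pdsh/pdcp usage plus a remote power query; the node range n[001-040], the controller hostname, the credentials, and the use of ipmitool are all hypothetical, since the slide does not give real host names or a specific DRAC client.

    # Hypothetical admin commands; host names and credentials are placeholders.

    # Run a command across a range of compute nodes in parallel over ssh:
    pdsh -w n[001-040] uptime

    # Copy a file to the same set of nodes in parallel:
    pdcp -w n[001-040] /etc/ntp.conf /etc/ntp.conf

    # Query one node's power state via its remote access controller
    # (assumes the DRAC exposes IPMI and ipmitool is installed):
    ipmitool -I lanplus -H drac-n001 -U admin -P secret chassis power status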

9 Shark Software
Systems software:
- Red Hat Enterprise Linux v4 kernel
- Infiniband Cisco software stack
- MVAPICH (MPICH over IB) library
- GNU C/C++/Fortran
- Intel 9.1 C/C++/Fortran
- Platform LSF HPC 6.2
- Platform Rocks
(See the example build commands below.)
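A few build-and-check commands one might run against this stack are sketched here; the source file names and flags are placeholders, and whether the MVAPICH mpicc wrapper points at the Intel or GNU compilers on Shark is an assumption.

    # Hypothetical build commands for the Shark software stack (sketch only).

    # Compile an MPI code with the MVAPICH compiler wrapper:
    mpicc -O2 -o mpi_app mpi_app.c

    # Compile serial code directly with the Intel or GNU compilers:
    icc -O2 -o serial_app serial_app.c
    gcc -O2 -o serial_app_gnu serial_app.c

    # Check which MPI library a binary is linked against:
    ldd ./mpi_app | grep -i mpi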

10 Maintenance Schedule
New proposed schedule:
- 8:00am – 5:00pm, 2nd and 4th Wednesdays (as necessary)
- Check website for maintenance notices
Current:
- 2:00pm – 4:00pm, 2nd and 4th Thursdays (as necessary)
- Check website (mhpcc.hpc.mil) for maintenance notices
- Only take maintenance on scheduled systems
- Check on Mondays before submitting jobs

11 Account Applications and Documentation
- Contact Helpdesk or website for application information
- Documentation needed:
  - Account names, systems, special requirements
  - Project title, nature of work, accessibility of code
  - Nationality of applicant
  - Collaborative relevance with AFRL
- New requirements: "Case File" information
  - For use in AFRL research collaboration
  - Future AFRL applicability
  - Intellectual property shared with AFRL
- Annual account renewals
  - September 30 is the final day of the fiscal year

12 Summary
- Anticipated migration to Shark
- Should be more productive and able to support a wide range of jobs
- Cutting-edge technology
- Cost savings from Hurricane (~$400,000 annual)
- Stay tuned for the timeline: likely end of January or early February

13 Mahalo