Plans for the National NERC HPC services: UM vn 6.1 installations and performance; UM vn 6.6 and NEMO(?) plans.

National HPC Facilities (timeline diagram): HPCx (NERC ~10% share); HECToR (NERC ~20% share), including Black Widow (vector); UKMO shared machine (NERC <10% share). Machine phases (Phase 1, Phase 2, Phase 4) are indicated on the timeline.

UM version 6.1 on HPCx phase 2a, IPCC-like STASH and with climate meaning (performance chart). UM atmosphere model resolutions run from low (N48) through N96 and N144 to high (N216); 1 node on HPCx = 16 processors.

When will the NCAS service on HECToR be available?
1. The HECToR service started on 16th October 2007.
2. NERC will provide the initial HECToR allocations at the NERC HPC steering panel to be held on 22nd November 2007.
3. The NCAS service, via the PUMA UMUI, will start with UM versions 4.5 and 6.1.
4. The NCAS service for UM version 6.6 may begin at Easter 2008, depending on Met Office delivery of new versions.

What is the HECToR phase 1 service? A Cray XT4 with 11,328 cores, each acting as a single CPU, of which NERC has a ~20% share of the allocation. The processors are 2.8 GHz AMD Opterons. HECToR has a total of 32 Tbytes of memory and a peak speed of 59 Tflops. The machine is run by Edinburgh (EPCC) and Daresbury and uses the same administration process as HPCx, SAFE (Service Administration From EPCC), so it has the same look and feel as HPCx. High-level support is provided by NAG, which will be a significant culture change for NCAS.

What is the HECToR service like compared to HPCx?
- It runs SUSE Linux (so we may need some script changes).
- It uses MPICH2 for the processor interconnect (so we need to look at UM scalability issues).
- It has a new file system (so we need to explore UM I/O issues).
- It doesn't (yet?) have an archive system (this is being discussed with NERC, HECToR and EPSRC).
- It has 3 different compilers, PGI, Pathscale and GNU (there are many UM issues to explore with all these options).
- System software is controlled by modules (so we need to make changes to the UM setvars).
- Job submission uses PBS (so we will make changes to the UM scripts and the UMUI).
- Parallel jobs are launched with aprun, not mpirun (so we have to change the UM scripts).
- There is no serial queue (yet!), so we may have to change the way we compile the UM, and what about the simple models?
A sketch of what a HECToR batch job might look like, pulling these points together, is given below.
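This sketch is illustrative only, combining the points above (modules, PBS and aprun); the job name, account code, core count, executable name and the exact PBS resource keywords are assumptions, not the actual NCAS setup on HECToR.

    #!/bin/bash --login
    # --login so that the module command is available in the batch
    # environment (may not be needed on all systems)
    #PBS -N um_n96                  # job name (hypothetical)
    #PBS -l mppwidth=64             # cores requested (Cray XT-style keyword; assumed)
    #PBS -l walltime=03:00:00       # wall-clock limit
    #PBS -A n02-ncas                # budget/account code (hypothetical)

    # Select the compiler environment via modules, e.g. swap PGI for Pathscale
    module swap PrgEnv-pgi PrgEnv-pathscale

    cd $PBS_O_WORKDIR

    # On the XT4, parallel executables are launched with aprun rather than mpirun
    aprun -n 64 ./um_main.exe       # executable name is illustrative only

The job would then be submitted with qsub rather than the mechanism used on HPCx.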

UM compiler issues: warnings produced when a variable is initialised more than once (results from a UM version 4.5 code sample; compiler survey from Polyhedron Software).
- Intel: "Initialisation of variable A more than once is an extension to standard Fortran"
- IBM xlf 8.1: "Variable a is initialized more than once"
- Pathscale: "Warning: Multiple DATA initialization of storage for A. Some initializations ignored"
- PGI (PGF90/x86-64 Linux): "PGF90-W-0164-Overlapping data initializations of a"
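As an illustration of the kind of code behind this table, here is a minimal snippet (not actual UM code) in which the same variable is initialised in more than one DATA statement; each compiler warns about, silently resolves, or rejects the duplicate in its own way.

    ! Illustrative only: duplicate initialisation of the same variable.
    ! Standard Fortran does not allow this, so compilers differ in whether
    ! they warn, ignore the extra initialisation, or reject the code.
    program dup_init
      implicit none
      real :: a
      data a /1.0/
      data a /2.0/          ! second initialisation of a
      print *, 'a =', a     ! which value wins is compiler-dependent
    end program dup_init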

Other NCAS UM issues
Compilers and environment: f77 / ftn, PGI compiler, Pathscale compiler, module switch. Basic UM PGI options have now been selected after rounding problems; we now need to look at portability/reproducibility and do some validation runs. We are currently testing both compilers.
UM vn 4.5:
1) Hadam3 + Hadam3P
2) Hadcm3 + preind, QUEST?
L64, Stochem, Moses 2.1, 2.2, Famous/QUEST, PRECIS, Hadrm3, Hadam4
UM vn 6.1:
1) Hadgem -> Hadgem1a
2) Higem -> Higem2
3) NUGAM ...
4) Weather jobs?
5) UKCA?
UM vn 6.3, 6.6 ...

NCAS Plans for Porting the UM to HECToR
- Set up the central UM userid, hum.
- Install and test UM vn 6.1 and 4.5.
- Focus on portability, performance and scalability issues; there are currently many different queues, so we need to provide advice to users running at different resolutions.
- Work out a disk space strategy: how are we going to manage users' personal archives? What do we need to do with ECMWF and UKMO data?
- Design the FCM build system for HECToR for UM vn 6.3, in line with the UK Met Office timetable and the timetables for UKCA, CASCADE, Higem, GSUM and QUESM.

Time spent (secs) on I/O for the UM atmosphere at N216 L38 on HECToR (chart comparing 3 Gbyte and 1.6 Gbyte files). I/O is an issue on different computers, hence GSUM will optimise I/O as well as provide a tuneable I/O strategy.

Current Issues
- Robustness of the system: hardware still not that reliable, but improving; the Lustre file system is still having teething problems; support is rather 'green'.
- No management committee in place to drive improvements.
- No long-term storage solution.
- UM installation (vn 4.5 and vn 6.1) is complete, but validation is still not complete.
- Higem run still running.
- UKCA: the chemistry solvers are taking 31 x HPCx!
- UMCET (ensemble framework) needs re-working.
- UM vn 6.6 using FCM should be installed by Easter 2008.