From Athena to Minerva: A Brief Overview
Ben Cash, Minerva Project Team
Minerva Workshop, GMU/COLA, September 16, 2013


Athena Background
• World Modeling Summit (WMS; May 2008)
  – The Summit called for a revolution in climate modeling to more rapidly advance improvements in climate model resolution, accuracy and reliability
  – Recommended petascale supercomputers dedicated to climate modeling
• Athena supercomputer
  – The U.S. National Science Foundation responded, offering to dedicate the retiring Athena supercomputer for a six-month period
  – An international collaboration was formed among groups in the U.S., Japan and the U.K. to use Athena to take up the challenge

Project Athena
• Dedicated supercomputer
  – Athena was a Cray XT4 with 18,048 computational cores
  – Replaced by a new Cray XT5, Kraken, with 99,072 cores (since increased)
  – #21 on the June 2009 Top 500 list
  – 6 months, 24/7, 99.3% utilization
  – Over 1 PB of data generated
• Large international collaboration
  – Over 30 people
  – 6 groups
  – 3 continents
• State-of-the-art global AGCMs
  – NICAM (JAMSTEC/U. Tokyo): Nonhydrostatic Icosahedral Atmospheric Model
  – IFS (ECMWF): Integrated Forecast System
  – Highest possible spatial resolution

Athena Science Goals
• Hypothesis: Increasing climate model resolution to accurately resolve mesoscale phenomena in the atmosphere (and ocean and land surface) can dramatically improve the fidelity of the models in simulating climate – mean, variances, covariances, and extreme events.
• Hypothesis: Simulating the effect of increasing greenhouse gases on regional aspects of climate, especially extremes, may, for some regions, depend critically on the spatial resolution of the climate model.
• Hypothesis: Explicitly resolving important processes, such as clouds in the atmosphere (and eddies in the ocean and landscape features on the continental surface), without parameterization, can improve the fidelity of the models, especially in describing the regional structure of weather and climate.

Qualitative Analysis: 2009 NICAM Precipitation and Cloudiness May 21-August 31

Athena Catalog

Athena Lessons Learned
• Dedicated usage of a relatively big supercomputer greatly enhances productivity.
• Dealing with only a few users and their requirements allows for more efficient utilization of resources.
• Challenge: Dedicated simulation projects like Project Athena can generate enormous amounts of data to be archived, analyzed and managed. NICS (and TeraGrid) did not have enough storage capacity at the time. Data management is a big challenge (a back-of-the-envelope rate estimate follows below).
• Preparation time: at least 2 to 3 weeks were needed before the beginning of dedicated runs to test and optimize the codes and to plan strategies for optimal use of the system.
• Communication throughout the project was essential (weekly telecons, lists, personal calls, …).
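
To put that data-management challenge in perspective, here is a back-of-the-envelope estimate of the sustained output rate implied by the figures quoted on the Project Athena slide above (over 1 PB of output in roughly six months of near-continuous dedicated use). The byte and day counts are rounded assumptions for illustration, not project accounting.

```python
# Back-of-the-envelope output rate for Project Athena, using only the
# figures quoted earlier in this deck: over 1 PB generated in roughly
# six months of 24/7 dedicated use. Exact values are rounded assumptions.

TiB = 1024**4                        # bytes per tebibyte
output_bytes = 1024 * TiB            # "over 1 PB" -> assume ~1 PiB
wall_seconds = 6 * 30 * 24 * 3600    # ~6 months, around the clock

avg_rate_mb_s = output_bytes / wall_seconds / 1024**2
print(f"average sustained output rate: ~{avg_rate_mb_s:.0f} MB/s")
# -> roughly 70 MB/s, continuously, for half a year -- all of which then
#    has to be archived, post-processed and analyzed
```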

Athena Limitations
Athena was a tremendous success, generating an enormous amount of data and a large number of papers for a six-month project. BUT…
• Limited number of realizations
  – Athena runs generally consisted of a single realization
  – No way to assess robustness of results
• Uncoupled models
• Multiple, dissimilar models
  – Resources were split between IFS and NICAM
  – Differences in performance meant very different experiments were performed – difficult to directly compare results
  – Storage limitations and post-processing demands limited what could be saved for each model

Minerva Background
• NCAR Yellowstone
  – In 2012, the NCAR-Wyoming Supercomputing Center (NWSC) debuted Yellowstone, the successor to Bluefire, their previous production platform
  – IBM iDataPlex, 72,280 cores, 1.5 petaflops peak performance (see the quick scale comparison below)
  – #17 on the June 2013 Top 500 list
  – 10.7 PB disk capacity – a vast increase over the capacity available during Athena
  – High-capacity HPSS data archive
  – Dedicated high-memory analysis clusters (Geyser and Caldera)
• Accelerated Scientific Discovery (ASD) program
  – Recognizing that many groups would not be ready to take advantage of the new architecture, NCAR accepted a small number of proposals for early access to Yellowstone
  – 3 months of near-dedicated access before the machine was opened to the general user community
  – Opportunity to continue the successful Athena collaboration between COLA and ECMWF, and to address limitations in the Athena experiments
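
A quick sanity check on the scale jump from Athena to Yellowstone, using only the core counts and peak performance quoted in this deck; nothing below goes beyond simple arithmetic on those figures.

```python
# Scale comparison from numbers quoted in this deck: Athena's core count
# (Project Athena slide) versus Yellowstone's core count and peak
# performance (this slide).

athena_cores = 18_048             # Cray XT4 (Project Athena)
yellowstone_cores = 72_280        # IBM iDataPlex (Yellowstone)
yellowstone_peak_flops = 1.5e15   # 1.5 petaflops peak

print(f"core-count ratio, Yellowstone / Athena: ~{yellowstone_cores / athena_cores:.1f}x")
print(f"implied peak per Yellowstone core: "
      f"~{yellowstone_peak_flops / yellowstone_cores / 1e9:.0f} GFLOP/s")
```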

Minerva Timeline
• March 2012 – Proposal finalized and submitted
  – 31 million core hours requested
• April 2012 – Proposal accepted
  – 21 million core hours approved
  – Anticipated date of production start: July 21
  – Code testing and benchmarking on Janus begins
• October 5, 2012
  – First login to Yellowstone – bcash reportedly user 1
• October – November 23, 2012
  – Jobs are plagued by massive system instabilities and a conflict between the code and the Intel compiler

Minerva Timeline continued
• November 24 – December 1, 2012
  – Code conflict resolved; low core-count jobs avoid the worst of the system instability
  – Minerva jobs occupy cores (!)
  – Peter Towers estimates Minerva easily sets the record for "Most IFS FLOPs in a 24 hour period"
  – Jobs rapidly overrun the initial 250 TB disk allocation, triggering a request for additional resources
  – This becomes a Minerva project theme
• Due to system instability, user accounts are not charged for jobs at this time
  – Roughly 7 million free core hours as a result: 28 million total (see the accounting sketch below)
  – 800+ TB generated
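
The resource bookkeeping implied by the two timeline slides, written out explicitly. The inputs are the numbers quoted above (21 million approved core hours, roughly 7 million uncharged hours, a 250 TB initial disk allocation, 800+ TB generated); the rest is plain arithmetic for illustration.

```python
# Core-hour and storage accounting from the Minerva timeline slides.
# All inputs are numbers quoted above; the script is illustrative only.

approved_core_hours = 21e6     # approved in April 2012
uncharged_core_hours = 7e6     # roughly 7 million "free" hours while accounts were not charged
total_core_hours = approved_core_hours + uncharged_core_hours
print(f"effective allocation: ~{total_core_hours / 1e6:.0f} million core hours")  # ~28 million

initial_disk_tb = 250          # initial disk allocation
generated_tb = 800             # "800+ TB generated"
print(f"output exceeded the initial disk allocation by ~{generated_tb / initial_disk_tb:.1f}x")
```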

Minerva Catalog: Base Experiments

Resolution | Start Dates  | Ensembles  | Length                       | Period of Integration
T319       | May          |            | months (total) **            |
T639       | May          |            | months (total)               |
T639       | May 1, Nov 1 | 51 (total) | 5 and 4 months, respectively |

Minerva Catalog: Extended Experiments

Resolution | Start Dates  | Ensembles | Length   | Period of Integration
T319       | May 1, Nov 1 | 51        | 7 months |
T639       | May 1, Nov 1 | 15        | 7 months |
T1279      | May 1        | 15        | 7 months |

** to be completed
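
A rough tally of the extended experiment set as reconstructed in the table above. The ensemble sizes and run lengths are my reading of the flattened table and should be treated as assumptions rather than the authoritative catalog; the grid-spacing conversion assumes a linear Gaussian grid and is only approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0

def approx_grid_spacing_km(truncation):
    """Rough equatorial grid spacing for triangular truncation T<N>,
    assuming a linear Gaussian grid with ~2(N+1) longitudes."""
    return math.pi * EARTH_RADIUS_KM / (truncation + 1)

# Extended experiments as reconstructed in the table above:
# (resolution, start dates, ensemble members, months per integration).
# Treat the member counts and lengths as assumptions, not the catalog itself.
extended = [
    ("T319",  ["May 1", "Nov 1"], 51, 7),
    ("T639",  ["May 1", "Nov 1"], 15, 7),
    ("T1279", ["May 1"],          15, 7),
]

total_months = 0
for res, starts, members, months_per_run in extended:
    months = len(starts) * members * months_per_run
    total_months += months
    dx = approx_grid_spacing_km(int(res[1:]))
    print(f"{res:>6} (~{dx:.0f} km): {len(starts)} start(s) x {members} members "
          f"x {months_per_run} months = {months} simulated months")

print(f"total simulated months (extended set only): {total_months}")
```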

Qualitative Analysis: 2010 T1279 Precipitation May – November

Minerva Lessons Learned
• Dedicated usage of a relatively big supercomputer greatly enhances productivity
  – Experience with the early usage period demonstrates that tremendous progress can be made with dedicated access
• Dealing with only a few users allows for more efficient utilization
  – Noticeable decrease in efficiency once scheduling of multiple jobs of multiple sizes was turned over to the batch scheduler
  – NCAR resources were initially overwhelmed by the challenges of the new machine and the individual problems that arose
• Focus on a single model allows for in-depth exploration
  – Data saved at much higher frequency
  – Multiple ensemble members, increased vertical levels, etc.

• Dedicated simulation projects like Athena and Minerva generate enormous amounts of data to be archived, analyzed and managed. Data management is a big challenge.
  – Other than machine instability, data management and post-processing were solely responsible for halts in production.
  – Even on a system designed with lessons from Athena in mind, production capabilities overwhelm storage and processing.
• Post-processing and storage must be incorporated into the production stream (a minimal workflow sketch follows below).
• "Rapid burn" projects such as Athena and Minerva are particularly prone to overwhelming storage resources.
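
A minimal sketch of the workflow lesson above – folding post-processing and archiving into the production stream so that raw output never accumulates unchecked on scratch. All function names, the quota value, and the control flow are hypothetical placeholders, not the actual Minerva scripts.

```python
# Hypothetical sketch, not the actual Minerva workflow: each model segment is
# post-processed and archived before the next segment starts, so scratch usage
# stays bounded instead of silently overrunning the allocation.

SCRATCH_QUOTA_TB = 250.0   # illustrative quota, echoing Minerva's initial allocation

def run_model_segment(segment):
    """Placeholder for submitting/running one chunk of the model integration."""
    print(f"running model segment {segment}")

def compress_and_split(segment):
    """Placeholder for reducing raw output (compression, variable subsetting)."""
    print(f"post-processing segment {segment}")

def archive_to_tape(segment):
    """Placeholder for migrating processed output to the tape archive."""
    print(f"archiving segment {segment}")

def scratch_usage_tb():
    """Placeholder: in practice, query the file system or quota tools."""
    return 0.0

def run_campaign(segments):
    for segment in segments:
        run_model_segment(segment)
        # Post-processing is part of the production stream, not an afterthought:
        # the next segment does not start until this one is reduced and archived.
        compress_and_split(segment)
        archive_to_tape(segment)
        if scratch_usage_tb() > 0.9 * SCRATCH_QUOTA_TB:
            raise RuntimeError("scratch nearly full -- pause production and clean up")

if __name__ == "__main__":
    run_campaign(["segment-01", "segment-02"])   # hypothetical segment labels
```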

Despite advances beyond Athena, more work to be done
• Focus of Tuesday discussion
• Fill in the matrix of experiments
• Further increases in ocean and atmospheric resolution
• Sensitivity tests (aerosols, greenhouse gases)
• ??

Beyond Minerva: A New Pantheon