1
Computational Environments and Analysis Methods Available on the NCI HPC & HPD Platform (IN53E-01)
Ben Evans (1), Lesley Wyborn (1), Adam Lewis (2), Clinton Foster (2), Stuart Minchin (2), Tim Pugh (3), Alf Uhlherr (4), Bradley Evans (5)
(1) ANU, (2) Geoscience Australia, (3) Bureau of Meteorology, (4) CSIRO, (5) Macquarie University
nci.org.au @NCInews
2
Overview
High Performance Data (HPD): data that is carefully prepared, standardised, and structured so that it can be used in data-intensive science on HPC (Evans et al., in press).
- HPC: turning compute-bound problems into IO-bound problems.
- HPD: turning IO-bound problems into ontology and semantics problems.
What are the HPC and HPD drivers?
Build re-usable, sustainable software for use in Virtual Laboratories: an integrated set of software for science, a mix of new and familiar.
What have we done? What's next?
3
Numerical Weather Prediction Roadmap (c/- Tim Pugh, BoM)
2013: 40 km global model (2x daily 10-day & 3-day forecast); 12 km regional model (4x daily 3-day forecast); 4 km city/state model (4x daily 36-hour forecast).
2020: 12 km global model (2x daily 10-day & 3-day forecast); 5 km regional model (8x daily 3-day forecast); 1.0 km city/state model (24x daily 18- or 36-hour forecast).
Increasing model resolution for improved local information; future model ensembles for the likelihood of significant weather. (Figure: model topography of Sydney, NSW, at research 1.5 km resolution.)
4
Capture, analysis & application of Earth Observations (c/- Adam Lewis, GA)
5
Combining Satellite and Climate (c/- Robert Ferraro, NASA/JPL, ESGF F2F, 2014)
- How do we bring as much observational scrutiny as possible to the CMIP/IPCC process?
- How do we best utilize the wealth of satellite observations for the CMIP/IPCC process?
6
Top 500 supercomputer list since 1990 (http://www.top500.org/statistics/perfdevel/), with current and next NCI systems marked.
- Fast and flexible access to structured data is required.
- There needs to be a balance between processing power and the ability to access data (data scaling).
- The focus is on on-demand direct access to large data sources, enabling high-performance analytics and analysis tools directly on that content.
7
Elephant Flows Place Great Demands on Networks
- A physical pipe that leaks water at a rate of 0.0046% by volume: 99.9954% of the water is transferred.
- A network 'pipe' that drops packets at a rate of 0.0046%: 100% of the data is transferred, but slowly, at <<5% of optimal speed (assumptions: 10 Gbps TCP flow, 80 ms RTT; the RTT is essentially fixed, determined by the speed of light).
- With proper engineering, we can minimize packet loss.
See Eli Dart, Lauren Rotman, Brian Tierney, Mary Hester, and Jason Zurawski, "The Science DMZ: A Network Design Pattern for Data-Intensive Science," in Proceedings of the IEEE/ACM Annual Supercomputing Conference (SC13), Denver, CO, 2013.
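To see why such a tiny loss rate is so damaging, here is a back-of-the-envelope sketch using the Mathis et al. (1997) TCP throughput approximation. This is not from the slide itself, and the MSS value is an assumed typical 1460 bytes:

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Upper bound on steady-state TCP throughput (Mathis et al., 1997):
    throughput <= (MSS / RTT) * (1 / sqrt(p))."""
    return (mss_bytes * 8 / rtt_s) / math.sqrt(loss_rate)

# The slide's assumptions: 10 Gbps flow, 80 ms RTT, 0.0046% packet loss.
tput = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.080, loss_rate=0.0046 / 100)
print(f"~{tput / 1e6:.0f} Mbps, i.e. {tput / 10e9:.2%} of a 10 Gbps link")
# -> roughly 22 Mbps, well under 5% of the link's capacity
```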
8
Computational and Cloud Platforms
Raijin: 57,472 cores (Intel Xeon Sandy Bridge, 2.6 GHz) in 3,592 compute nodes; approx. 160 TBytes of main memory; InfiniBand FDR interconnect; approx. 7 PBytes of usable fast filesystem (for short-term scratch space); 1.5 MW power; 100 tonnes of water in cooling.
Partner Cloud: the same generation of technology as Raijin (Intel Xeon Sandy Bridge, 2.6 GHz) but only 1,500 cores; InfiniBand FDR interconnect; a collaborative platform for services, and the platform for hosting non-batch services.
NCI Nectar Cloud: the same generation as the Partner Cloud; a non-managed environment; weak integration.
9
NCI Cloud (diagram): per-tenant public IP assignments (CIDR boundaries, typically /29); OpenStack private IPs on a flat, quota-managed network; FDR InfiniBand fabric connecting NFS, Lustre, and SSD storage.
10
NCI's integrated high-performance environment (diagram):
- Raijin HPC compute, login, and data-mover nodes on a 56 Gb FDR InfiniBand fabric.
- Raijin high-speed filesystem: /short (7.6 PB); /home, /system, /images, /apps.
- Persistent global parallel filesystem on its own 56 Gb FDR InfiniBand fabric: /g/data1 (~7.4 PB) and /g/data2 (~6.7 PB).
- Massdata tape archive: 1.0 PB cache, 12.3 PB tape, with a link to a second data centre.
- NCI data movers connect the fabrics, with 10 GigE to the Internet.
11
Building the Platform for Earth System Modelling & Analysis
- 10 PB+ of research data.
- Server-side analysis and visualization.
- Data services (THREDDS).
- VDI: cloud-scale user desktops on the data.
- Web-time analytics software.
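As an illustration of what server-side data services enable, a minimal sketch of remote access to a THREDDS/OPeNDAP endpoint with xarray; the URL, path, and variable name are hypothetical, not an actual NCI dataset:

```python
import xarray as xr

# Hypothetical OPeNDAP endpoint on a THREDDS server (illustrative only).
url = "https://dap.example.org/thredds/dodsC/climate/tas_monthly.nc"
ds = xr.open_dataset(url)  # lazy: only metadata is fetched at this point

# Subsetting before loading means only the requested slab crosses the
# network, not the whole archive -- the point of server-side access.
tas_sydney = ds["tas"].sel(lat=-33.87, lon=151.21, method="nearest")
print(float(tas_sydney.mean()))
```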
12
10 PB of Data for Interdisciplinary Science, mirrored from major science agencies (BoM, GA, CSIRO, ANU) and other national and international sources (chart):
CMIP5 3 PB; Atmosphere 2.4 PB; Earth Observation 2 PB; Water/Ocean 1.5 PB; Weather 340 TB; Geophysics 300 TB; Astronomy (optical) 200 TB; Bathymetry/DEM 100 TB; Marine videos 10 TB.
13
National Environment Research Data Collections (NERDC):
1. Climate/ESS model assets and data products
2. Earth and marine observations and data products
3. Geoscience collections
4. Terrestrial ecosystems collections
5. Water management and hydrology collections

Data collection: approx. capacity
- CMIP5, CORDEX: ~3 PBytes
- ACCESS products: 2.4 PBytes
- LANDSAT, MODIS, VIIRS, AVHRR, INSAR, MERIS: 1.5 PBytes
- Digital elevation, bathymetry, onshore geophysics: 700 TBytes
- Seasonal climate: 700 TBytes
- Bureau of Meteorology observations: 350 TBytes
- Bureau of Meteorology ocean-marine: 350 TBytes
- Terrestrial ecosystem: 290 TBytes
- Reanalysis products: 100 TBytes
14
Internationally sourced data:
- Satellite data (USGS, NASA, JAXA, ESA, …)
- Reanalysis (ECMWF, NCEP, NCAR, …)
- Climate data (CMIP5, AMIP, GeoMIP, CORDEX, …)
- Ocean modelling (Earth Simulator, NOAA, GFDL, …)
These will only increase as we depend on more data, and some will be replicated. How should we keep this in sync, versioned, and back-referenced for the supplier?
15
Some Data Challenges
- Allow multiple data types, but convert proprietary ones.
- Standardize record formats and conventions.
- Expose all attributes for search: not just collection-level search, not just datasets, but all data. What are the handles we need to access the data?
- Provide more programmatic interfaces, and link up data and compute resources: more server-side processing.
- Add semantic meaning to the data. Is it scientifically appropriate for a data service to aggregate/interpolate?
- CMIP5 was successful because we constrained the problem.
- What unique identifiers do we need? DOI is only part of the story; versioning is important.
16
Metadata Hierarchy for Discovery: recording the hierarchy in ISO 19139
1. Data collection, e.g. climate and weather modelling
2. Series, e.g. Landsat 7
3. Datasets: semantically the same
4. Attributes, including variables (versions, errata)
17
Metadata Hierarchy Implementation: NCI GeoNetwork architecture, basic catalogues (diagram)
Dataset-specific GeoNetworks (Dataset 1, 2, 3, … n) are fully harvested, via CSW harvesting and cross-walks (e.g. an RIF-CS adapter), into a full-search GeoNetwork; a collection (and series?) GeoNetwork sits above, and domain-specific or user deep-query catalogues sit alongside.
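For flavour, a minimal sketch of querying such a catalogue over CSW using OWSLib; the endpoint URL is a placeholder, not a real NCI service:

```python
from owslib.csw import CatalogueServiceWeb

# Placeholder endpoint for a full-search GeoNetwork instance.
csw = CatalogueServiceWeb("https://geonetwork.example.org/srv/eng/csw")

# Fetch brief summary views of the first ten harvested records.
csw.getrecords2(esn="summary", maxrecords=10)
for identifier, record in csw.records.items():
    print(identifier, "->", record.title)
```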
18
Finding Data and Services (diagram)
A GeoNetwork catalogue backed by a Lucene database (trialling Elasticsearch) sits over DAP, OGC, and other services on /g/data1 and /g/data2, accessed from the supercomputer and the virtual laboratories.
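A hedged sketch of the kind of full-text metadata query an Elasticsearch backend would enable (elasticsearch-py 8.x API); the instance URL, index, and field names are assumptions for illustration:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local trial instance

# Full-text search over dataset metadata, filtered by project.
hits = es.search(index="datasets", query={
    "bool": {
        "must": [{"match": {"title": "sea surface temperature"}}],
        "filter": [{"term": {"project": "CMIP5"}}],
    },
})
for hit in hits["hits"]["hits"]:
    print(hit["_source"]["title"])
```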
19
Recording the full product description … we now need to embed it contextually for programs.
20
Collaborating with Researchers/Developers
The selfish practical researcher:
- Not virtual organisations: interoperable tools in virtual laboratories. Make it seamless.
- Anti-collaboration: just apply standards.
- Micro-ambition: did I get my stuff done quicker/better?
- Data handling (and particularly movement!) is a complete waste of time.
- Sustainability: the system should capture my operations. Why am I a secretary? I can't remember what I did. The system did things that I didn't know about anyway!
- Cite me! People should recognise my genius! Do I have to be in PR? I've done my bit, and it's really clever. Here you go, I am going to do something else. (The same issue arises with sub-contracted work and multiparty agreements.)
What's worse? Perhaps the opposite of all these items. We need a strategy to properly address this.
23
Collaborating with Researchers/Developers (cont.)
Being project-driven means defining a use-case and an end-date for the work. The researchers and leading developers may be ahead of the curve; we want to make the best use of this time and energy, and to have a reasonable chance of converting the result for sustainability.
('The Nth Degree', ST:TNG.
Barclay: Computer, begin new program. Create as follows: workstation chair. Now, create a standard alphanumeric console, positioned for the left hand. Now an iconic display console, positioned for the right hand. Tie both consoles into the Enterprise main computer core, utilizing neural-scan interface.
Enterprise Computer: There is no such device on file.
Barclay: No problem. Here's how you build it.)
24
Prototype to Production, anti-Minecraft
Virtual Labs: separating researchers from software builders. Cloud is an enabler, but: don't make researchers become full system admins, and save developers from being operational.
(Chart: productivity vs. perspiration across the project lifecycle, from Proj1 start/end to Proj2-4 start/end; preparing for success.)
25
Prototype to Production, anti-Minecraft (cont.)
(Chart: 'headspace hours' for VL managers and developers across a project's development phase, comparing poorly, reasonably, and well executed projects; a well-executed project can change scope and be adopted broadly.)
27
Virtual Laboratory-driven software patterns (diagram)
Stacks are layered from basic OS functions, through common modules and bespoke services, to special config choices. Example stacks (the NCI environment stack, workflow and analytics stacks, a P2P/vis stack, GridFTP) take stacks from upstream and use them as bundles, modifying or duplicating them as needed.
28
Transition from developer, to prototype, to DevOps
Step 1: Development. Get a template for development; separate out what is special from what is common; reuse other software stacks where possible.
Step 2: Prototype. Deploy in an isolated tenant of a cloud; determine dependencies; use test cases to demonstrate correct functioning.
Step 3: Sustainability. Pull the repo into the operational tenant; prepare the bundle for integration with the rest of the framework; hand back the cleaned bundle; establish the DevOps process.
29
DevOps approach to building and operating environments (diagram)
NCI core bundles and community repos (Community1, Community2) feed each Virtual Laboratory's operational bundle: Git-controlled, pull model, with continuous-integration testing.
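A minimal sketch of what such a Git-controlled pull model could look like; the repo URLs, deployment path, and test entry point are illustrative assumptions, not NCI's actual tooling:

```python
import subprocess
from pathlib import Path

BUNDLES = {  # hypothetical operational bundles and their upstream repos
    "nci-core": "https://git.example.org/nci/core-bundle.git",
    "climate-vl": "https://git.example.org/community1/climate-bundle.git",
}
DEPLOY_ROOT = Path("/opt/vl-bundles")

def sync_bundle(name, url):
    """Clone or fast-forward a bundle, then run its integration tests.
    Returns True only if the tests pass (bundle is safe to activate)."""
    dest = DEPLOY_ROOT / name
    if dest.exists():
        subprocess.run(["git", "-C", str(dest), "pull", "--ff-only"], check=True)
    else:
        subprocess.run(["git", "clone", url, str(dest)], check=True)
    tests = subprocess.run(["./run-tests.sh"], cwd=dest)  # assumed test script
    return tests.returncode == 0

for name, url in BUNDLES.items():
    ok = sync_bundle(name, url)
    print(f"{name}: {'OK' if ok else 'FAILED - keep previous release'}")
```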
30
Advantages
- Separates roles and responsibilities: package specialists, VL managers, system admins.
- Anti-architecture: from 'architecture' to 'framework'; flexible with technology change; makes handover easier.
- Both test/dev/ops and patches/rollback become business as usual.
- Sharable bundles: releases of software stacks can be tagged, a precondition for trusted software stacks.
- Provenance: stands up to scientific and government policy scrutiny.
31
Advantages (cont.)
- Transforms the system admin's role from gatekeeper to DevOps management: new skills, a new way of thinking.
- Separates out root trust for global storage: dev teams are limited to test areas; root access for ops can be held by a limited group.
- Only the operating system is provided to boot from: removes old-style golden (fragile) images and makes security patching easier.
- Glue bundles together into different software stacks: addresses the bloated-node problem; scale-out is generally easier.
- Standard system configs go into the 'core' bundle (LDAP, logs, easter eggs); project-specific bundles are recast to common, or core.
- Performance issues can be addressed across the Virtual Labs, in the core.
32
A snapshot of layered bundles (figure)
33
Climate and Weather Science Lab: a collaboration between the Bureau of Meteorology, CSIRO, NCI, and ARCCSS.
34
VDI: Virtualised Desktop Infrastructure
Timetable:
- Early access started on 2 September; general release to CWSLab in late September.
- To be incorporated into all VLs (e.g. the current AGDC Datacube is to be upgraded).
35
VDI, cont. (screenshot)
36
VDI, cont. (screenshot)
37
Progress toward Major Milestones
- Trans-disciplinary science: publish, catalogue, and access self-documented data and software to enhance trans-disciplinary, big-data science within interoperable data services and protocols.
- Integrity of science: managed services capture a workflow's process as a comparable, traceable output; ease of access to data and software enables enhanced workflow development and repeatable science, conducted with less effort or an acceleration of outputs.
- Integrity of data: data repository services ensure data integrity, provenance records, universal identifiers, and repeatable data discovery and access for workflows and interactive users.
38
New Challenges
- Auth (authentication and authorisation): the path forward is an OAuth2-style model. How do we enable it at all service-provider points? Attributes, not virtual organisations.
- Trusted software: related to citation, but with the same issues as data.
- Provenance: needs well-thought-out complex graphs, not just pre-canned stacks.
- Effectively using new data technology: it's no longer just POSIX. Do we have to copy the same data into different forms? Libraries increasingly have a new role to play in hiding complexity.
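As a concrete anchor for the OAuth2-style model mentioned above, a hedged sketch of the client-credentials grant; the token endpoint, client credentials, and API URL are all hypothetical:

```python
import requests

# Hypothetical token endpoint and client credentials.
TOKEN_URL = "https://auth.example.org/oauth2/token"
resp = requests.post(
    TOKEN_URL,
    data={"grant_type": "client_credentials", "scope": "data.read"},
    auth=("my-client-id", "my-client-secret"),
)
resp.raise_for_status()
token = resp.json()["access_token"]

# The bearer token is then presented at each service-provider point,
# carrying attributes/scopes rather than virtual-organisation membership.
datasets = requests.get(
    "https://dap.example.org/api/datasets",
    headers={"Authorization": f"Bearer {token}"},
)
print(datasets.status_code)
```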
39
Contact: Ben.Evans@anu.edu.au, @BenJKEvans, nci.org.au