HPC system for Meteorological research at HUS: Meeting the challenges
Nguyen Trung Kien, Hanoi University of Science
Melbourne, December 11th, 2012
High-resolution Vietnam projections workshop
Contents
1. Computational/Storage Needs
2. Scale to meet the needs
3. Grid Computing for meteorological research
4. Q&A
Computational and Storage needs up to 2013
National projects:
– Development of a seasonal prediction system for the prediction of extreme climate events for natural disaster prevention in Vietnam
– Building an operational ensemble assimilation system for numerical weather prediction, and an ensemble system for regional climate models to forecast and project extreme weather and climate events
Computational and Storage needs up to 2013
Joint DANIDA project:
– Climate Change-Induced Water Disaster and Participatory Information System for Vulnerability Reduction in North Central Vietnam
Joint project with CSIRO – Australia:
– High-resolution downscaling for Vietnam: Facilitating effective climate adaptation
Computational and Storage needs up to 2013
Weather forecast: MM5, HRM, WRF
– 3-day forecast, 4 times daily
– 2 hours/run (on 1 compute node: 2x quad-core 2.5 GHz, 8 GB RAM)
Tropical cyclone detection: RegCM
– 12-month detection, once monthly
– 140 hours/run
– Stores 70 GB of data
Seasonal forecast: MM5, WRF, RegCM
– 7-month forecast, once weekly
– hours/run
– Stores 6-16 GB of data
Computational and Storage needs up to 2013
Climate simulation 1979–2010:
– Boundary/initial conditions: ERA40, NCEP, INTERIM
– Models: RegCM, MM5CL, REMO, clWRF
– 2-5 hours/month, output ~5 GB of data
Climate projection:
– A1B, A2 scenarios
– Models: MM5CL, CCAM, RegCM, clWRF, REMO
– 2-5 hours/month, output ~5 GB of data
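A back-of-envelope sketch of what one 1979–2010 simulation costs, using only the per-month figures from the slide (2-5 hours and ~5 GB per simulated month); these are the slide's own estimates, not measurements.

```python
# Rough cost of one 1979-2010 climate simulation for a single model,
# from the slide's per-month figures (2-5 h/month, ~5 GB/month).

months = (2010 - 1979 + 1) * 12          # 384 simulated months

cpu_hours_low = months * 2               # optimistic wall-clock total
cpu_hours_high = months * 5              # pessimistic wall-clock total
output_gb = months * 5                   # total output per model/scenario

print(months, cpu_hours_low, cpu_hours_high, output_gb)
# -> 384 months, 768-1920 hours, ~1.9 TB of output per model
```

With several models and scenarios listed on the slide, this multiplies quickly, which is why the deck argues the storage need exceeds 100 TB.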
Computational and Storage needs up to 2013
Large number of users:
– 10 staff
– 2-3 PhD students
– 5-6 Master students
– >15 Bachelor students
– Users from other organizations
Need to store data from previous projects
Total storage needs: >100 TB
Computational and Storage needs up to 2013
System as of 2011:
– 11 compute nodes, Rpeak = 880 GFlops
– Desktop HDDs (low cost, low MTBF) + 1 Gbps Ethernet + NFS => storage is slow and unreliable
– Low-bandwidth interconnect (1 Gbps) => max performance << peak performance
Computational and storage needs are "huge" => the system needs to be upgraded
Scale to meet the needs - Network
Use InfiniBand instead of Ethernet:
– Many versions: SDR, DDR, QDR, FDR, …
– Bandwidths from 10 to 56 Gbps
– The servers support only PCI Express x4 => choose InfiniBand SDR 4x (10 Gbps)
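A short sketch of what the "10 Gbps" SDR 4x figure actually delivers as payload. The 8b/10b line encoding (10 signalling bits per 8 data bits) is a well-known property of SDR/DDR/QDR InfiniBand, not something stated on the slide.

```python
# SDR 4x: four lanes at 2.5 Gbit/s raw signalling each.
lanes = 4
lane_rate_gbps = 2.5
raw_gbps = lanes * lane_rate_gbps      # 10 Gbps, the headline number

# 8b/10b encoding spends 10 signalling bits per 8 data bits,
# so the payload ceiling is 8 Gbps, i.e. ~1 GB/s per link.
data_gbps = raw_gbps * 8 / 10
data_gigabytes_per_s = data_gbps / 8

print(raw_gbps, data_gbps, data_gigabytes_per_s)
```

Even so, ~1 GB/s per link is an order of magnitude above the 1 Gbps Ethernet it replaces, consistent with the LustreFS throughput reported later in the deck.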
Scale to meet the needs - Storage
[Diagram: two RAID5 arrays, each with a hot spare, connected over InfiniBand (10 Gbps)]
Use only Enterprise SAS/SATA HDDs
LustreFS
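A minimal sketch of the usable capacity of one RAID5 array with a hot spare, as in the storage diagram. The 12-disk enclosure and 2 TB disk size are hypothetical values for illustration; only the RAID5 + hot spare layout comes from the slide.

```python
def raid5_usable_tb(disks: int, disk_tb: float, hot_spares: int = 1) -> float:
    """RAID5 reserves one disk's worth of capacity for parity;
    hot spares sit idle until a disk fails, so they hold no data."""
    active = disks - hot_spares
    return (active - 1) * disk_tb

# Hypothetical: a 12-slot enclosure of 2 TB disks with one hot spare
# leaves 11 active disks, 10 of data -> 20 TB usable.
print(raid5_usable_tb(12, 2.0))
```

Several such arrays striped together under LustreFS would add up to the 76 TB quoted on the next slide.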
Scale to meet the needs
METOCEAN Cluster:
– 14 nodes, 106 cores, 141 GB RAM, Rocks cluster 5.5
– Rpeak ~ 1 TFlops
– InfiniBand SDR 10 Gbps & 1 Gbps Ethernet interconnect networks
– 76 TB LustreFS using Enterprise HDDs
– LustreFS: ~700 MB/s from one client, aggregate throughput up to 1.8 GB/s
Scale to meet the needs - Hadoop
– 76 TB LustreFS is not enough
– Compute nodes have 6-36 drive slots
– Lots of used Desktop SATA disks, 0.3-2 TB
– Commodity hardware (except the LustreFS HDDs)
Can we create reliable storage from the available hardware on a constrained budget?
=> Hadoop Distributed File System (HDFS)
Scale to meet the needs - Hadoop
Two-way replication:
– A file is cut into 64 MB blocks
– Each block is written onto two different datanodes
[Diagram: client, namenode, datanodes 1-3]
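The two-way replication described above can be sketched as follows. The round-robin placement is an assumption for illustration; real HDFS placement also weighs racks and free space, but the invariant shown — every block on two distinct datanodes — is the one from the slide.

```python
BLOCK = 64 * 1024 * 1024  # 64 MB block size, as on the slide

def place_blocks(file_size: int, datanodes: list[str]) -> list[tuple[str, str]]:
    """Cut a file into 64 MB blocks and give each block two replicas
    on two different datanodes (simple round-robin for illustration)."""
    n_blocks = -(-file_size // BLOCK)            # ceiling division
    placements = []
    for i in range(n_blocks):
        a = datanodes[i % len(datanodes)]
        b = datanodes[(i + 1) % len(datanodes)]  # always a distinct node
        placements.append((a, b))
    return placements

# A 150 MB file becomes three blocks, each stored on two of three nodes.
print(place_blocks(150 * 1024 * 1024, ["dn1", "dn2", "dn3"]))
```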
Scale to meet the needs - Hadoop
– The client reads data blocks directly from the datanodes => high read speed
[Diagram: client, namenode, datanodes 1-3]
Scale to meet the needs - Hadoop
Fault tolerance:
– An under-replicated block is automatically copied to another datanode
[Diagram: namenode, datanodes 1-3]
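A minimal sketch of the fault-tolerance rule above: when a datanode dies, any block left with a single replica is copied to another live node. The data structures are simplified for illustration; in real HDFS the namenode tracks replica sets and schedules the copies.

```python
def rereplicate(block_map: dict[str, set[str]], dead: str, live: list[str]):
    """Drop `dead` from every block's replica set, then copy any
    under-replicated block to a live node to restore 2 replicas.
    Assumes at least one live node not already holding the block."""
    for block, nodes in block_map.items():
        nodes.discard(dead)
        if len(nodes) < 2:                                # under-replicated
            target = next(n for n in live if n not in nodes)
            nodes.add(target)
    return block_map

blocks = {"blk_1": {"dn1", "dn2"}, "blk_2": {"dn2", "dn3"}}
print(rereplicate(blocks, dead="dn2", live=["dn1", "dn3"]))
# every block ends up with two replicas, none on the dead node
```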
Scale to meet the needs - Hadoop
– 60 TB "Cloud Storage" HDFS using Desktop HDDs
– Stores large files (multiples of 64 MB)
– Replication factor = 2 (usable space: 30 TB)
– Mounted into the Linux FS with FUSE
– HDFS & LustreFS metadata (fsimage, edits, MDT image) uploaded automatically to Dropbox Cloud Storage for disaster recovery
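The disaster-recovery step on this slide could be as simple as copying the metadata files into a folder watched by the Dropbox client, which then uploads them off-site. This is a sketch of that idea; all paths are hypothetical, and a cron job would call it periodically.

```python
import shutil
from pathlib import Path

def backup_metadata(sources: list[str], dropbox_dir: str) -> list[str]:
    """Copy each existing metadata file into a Dropbox-synced folder;
    the Dropbox client handles the actual off-site upload."""
    dest = Path(dropbox_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sources:
        p = Path(src)
        if p.exists():                       # skip files not present yet
            shutil.copy2(p, dest / p.name)   # copy2 preserves timestamps
            copied.append(p.name)
    return copied

# Hypothetical locations of the HDFS namenode image/edit log
# and the LustreFS MDT image mentioned on the slide.
backup_metadata(
    ["/data/hdfs/name/current/fsimage",
     "/data/hdfs/name/current/edits",
     "/backup/lustre/mdt.img"],
    "/home/admin/Dropbox/metadata",
)
```

Only the metadata needs this treatment: losing it would make the 60 TB of block data unreadable, while the blocks themselves are already protected by replication.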
Grid Computing for meteorological research
Demand for computation and storage in meteorological research has no limit.
Need more computational power:
– Look further into the future, e.g. 5-day forecasts
– Better forecasts: an ensemble forecast needs tens of models (MPI jobs) running in parallel
– Operational real-time forecasting
More data to save => need more storage:
– 100s of TB to PB of data
Constrained budget
Grid Computing for meteorological research
Meteorological research organizations: HUS, SRHMC, IMHEN, HUNRE, NCHMF
Clusters of different sizes:
– 100s of GFlops to a few TFlops
– A few to tens of TB of storage
Resources should be connected and shared to tackle bigger problems than any of us can alone
Grid Computing for meteorological research
[Diagram: HUS, SRHMC, IMHEN, HUNRE, NCHMF interconnected via VINAREN at 155 Mbps]
– Grid/Cloud Storage to share data
– Computational Grid for ensemble forecasts
Grid Computing for meteorological research
[Diagram: a Workload Management System dispatches MPI/MapReduce jobs (ensemble forecasts) to HUS, SRHMC, IMHEN, HUNRE, NCHMF]
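A toy sketch of how such a workload management system might spread ensemble members over the five sites in proportion to their compute capacity. The Gflops figures are hypothetical; only the site names and the idea of a shared computational grid come from the slides.

```python
def distribute(members: int, capacity: dict[str, float]) -> dict[str, int]:
    """Assign ensemble members to sites proportionally to capacity,
    handing any rounding remainder to the largest sites first."""
    total = sum(capacity.values())
    share = {s: int(members * c / total) for s, c in capacity.items()}
    leftover = members - sum(share.values())
    for s in sorted(capacity, key=capacity.get, reverse=True)[:leftover]:
        share[s] += 1
    return share

# Hypothetical per-site capacities in Gflops, for illustration only.
sites = {"HUS": 1000, "NCHMF": 800, "IMHEN": 500, "SRHMC": 400, "HUNRE": 300}
print(distribute(20, sites))
```

A real deployment would also weigh queue length, the 155 Mbps VINAREN links for staging input data, and where the output is needed, but proportional splitting captures the basic idea of pooling the sites' heterogeneous clusters.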
Grid Computing for meteorological research
[Diagram: HUS, SRHMC, IMHEN, HUNRE, NCHMF linked via VINAREN to the European Earth Science Grid over TEIN]
TEIN = Trans-Eurasia Information Network, bandwidth = 622 Mbps
Questions? Thank you for your attention!