Southern California Earthquake Center SCEC Application Performance and Software Development
Thomas H. Jordan [1], Y. Cui [2], K. Olsen [3], R. Taborda [4], J. Bielak [5], P. Small [6], E. Poyraz [2], J. Zhou [2], P. Chen [14], E.-J. Lee [1], S. Callaghan [1], R. Graves [7], P. J. Maechling [1], D. Gill [1], K. Milner [1], F. Silva [1], S. Day [3], K. Withers [3], W. Savran [3], Z. Shi [3], M. Norman [8], H. Finkel [9], G. Juve [10], K. Vahi [10], E. Deelman [10], H. Karaoglu [5], Y. Isbiliroglu [11], D. Restrepo [12], L. Ramirez-Guzman [13]
[1] Southern California Earthquake Center, [2] San Diego Supercomputer Center, [3] San Diego State Univ., [4] Univ. Memphis, [5] Carnegie Mellon Univ., [6] Univ. Southern California, [7] U.S. Geological Survey, [8] Oak Ridge Leadership Computing Facility, [9] Argonne Leadership Computing Facility, [10] Information Sciences Institute, [11] Paul C. Rizzo Associates, Inc., [12] Universidad Eafit, [13] National Univ. Mexico, [14] Univ. Wyoming
OLCF Symposium, 22 July 2014

Southern California Earthquake Center SCEC Computational Pathways
Improvement of models through physics-based simulations along four computational pathways:
– Pathway 1: Standard seismic hazard analysis (RWG, AWP-ODC): earthquake rupture forecast combined with empirical attenuation relationships to produce intensity measures
– Pathway 2: Ground motion simulation (AWP-ODC, Hercules): "extended" earthquake rupture forecast with kinematic fault rupture (KFR) driving anelastic wave propagation (AWP) and nonlinear site response (NSR) to produce simulated ground motions
– Pathway 3: Dynamic rupture modeling (SORD, AWP-ODC): structural representation driving dynamic fault rupture (DFR) and anelastic wave propagation
– Pathway 4: Ground-motion inverse problem (AWP-ODC, SPECFEM3D): inversion of ground motions and other data (geology, geodesy) to improve the models
Abbreviations: AWP = Anelastic Wave Propagation, NSR = Nonlinear Site Response, KFR = Kinematic Fault Rupture, DFR = Dynamic Fault Rupture
Code leads: AWP-ODC: Yifeng Cui and Kim Olsen; Hercules: Jacobo Bielak and Ricardo Taborda; SORD: Steven Day; CyberShake: Scott Callaghan; OpenSHA/UCERF3: Kevin Milner; UCVM, CVM-H: David Gill; Broadband: Fabio Silva

Southern California Earthquake Center AWP-ODC
– Started as a personal research code (Olsen 1994)
– 3D velocity-stress wave equations solved by explicit staggered-grid 4th-order finite differences
– Memory-variable formulation of inelastic relaxation using a coarse-grained representation (Day 1998)
– Dynamic rupture by the staggered-grid split-node (SGSN) method (Dalguer and Day 2007)
– Absorbing boundary conditions by perfectly matched layers (PML) (Marcinkovich and Olsen 2003) and Cerjan et al. (1985)
(Figure: inelastic relaxation variables for the memory-variable ODEs in AWP-ODC)
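As an illustration of the numerical core described above, the following is a minimal sketch (not AWP-ODC source) of a 4th-order staggered-grid finite-difference update for one velocity component; the array wrapper, grid indexing, and loop bounds are assumptions for illustration only.

    #include <vector>
    #include <cstddef>

    // Hypothetical 3D array wrapper: data stored with x fastest.
    struct Grid3D {
        std::size_t nx, ny, nz;
        std::vector<float> a;
        Grid3D(std::size_t nx_, std::size_t ny_, std::size_t nz_)
            : nx(nx_), ny(ny_), nz(nz_), a(nx_ * ny_ * nz_, 0.0f) {}
        float& operator()(std::size_t i, std::size_t j, std::size_t k) {
            return a[i + nx * (j + ny * k)];
        }
    };

    // Update vx from the stress components using 4th-order staggered differences.
    // c1 = 9/8 and c2 = -1/24 are the standard 4th-order staggered-grid weights.
    void update_vx(Grid3D& vx, Grid3D& sxx, Grid3D& sxy, Grid3D& sxz,
                   Grid3D& rho, float dt, float dh) {
        const float c1 = 9.0f / 8.0f, c2 = -1.0f / 24.0f;
        for (std::size_t k = 2; k < vx.nz - 2; ++k)
            for (std::size_t j = 2; j < vx.ny - 2; ++j)
                for (std::size_t i = 2; i < vx.nx - 2; ++i) {
                    float dsxx_dx = c1 * (sxx(i + 1, j, k) - sxx(i, j, k))
                                  + c2 * (sxx(i + 2, j, k) - sxx(i - 1, j, k));
                    float dsxy_dy = c1 * (sxy(i, j, k) - sxy(i, j - 1, k))
                                  + c2 * (sxy(i, j + 1, k) - sxy(i, j - 2, k));
                    float dsxz_dz = c1 * (sxz(i, j, k) - sxz(i, j, k - 1))
                                  + c2 * (sxz(i, j, k + 1) - sxz(i, j, k - 2));
                    vx(i, j, k) += dt / (rho(i, j, k) * dh)
                                 * (dsxx_dx + dsxy_dy + dsxz_dz);
                }
    }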

Southern California Earthquake Center AWP-ODC Weak Scaling
(Figure: weak-scaling benchmark sustaining 2.3 Pflop/s, with parallel efficiencies of 93% to 100% across the tested configurations; Cui et al., 2013)

Southern California Earthquake Center Hercules – General Architecture
» Finite-element method with integrated meshing (unstructured hexahedral)
» Uses an octree-based library for meshing and to order elements and nodes in memory
» Explicit FE solver
» Plane-wave approximation to absorbing boundary conditions
» Natural free-surface condition
» Frequency-independent Q
(Figure: end-to-end simulation process, main solving loop, most demanding operations, and the octree-based FE mesh)
Hercules was developed by the Quake Group at Carnegie Mellon University with support from SCEC/CME projects. Its current development team includes collaborators at the National University of Mexico, the University of Memphis, and the SCEC/IT team, among others. Jacobo Bielak (CMU) and Ricardo Taborda (UM). See Taborda et al. (2010) and Tu et al. (2006).
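To make the octree-based element ordering concrete, here is a small sketch (not part of the Hercules or etree libraries) of a Morton (Z-order) key computation of the kind commonly used to linearize octree elements in memory; the function names and the 21-bit-per-axis packing are illustrative assumptions.

    #include <cstdint>

    // Interleave the low 21 bits of x, y, z into a 63-bit Morton (Z-order) key.
    // Sorting elements by this key keeps spatially nearby octants close in
    // memory, which is the idea behind octree-based element/node ordering.
    static std::uint64_t spread_bits(std::uint64_t v) {
        v &= 0x1fffff;                              // keep 21 bits
        v = (v | (v << 32)) & 0x1f00000000ffffULL;
        v = (v | (v << 16)) & 0x1f0000ff0000ffULL;
        v = (v | (v << 8))  & 0x100f00f00f00f00fULL;
        v = (v | (v << 4))  & 0x10c30c30c30c30c3ULL;
        v = (v | (v << 2))  & 0x1249249249249249ULL;
        return v;
    }

    std::uint64_t morton_key(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
        return spread_bits(x) | (spread_bits(y) << 1) | (spread_bits(z) << 2);
    }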

Southern California Earthquake Center Hercules on Titan – GPU Module Implementation
Modifications to the solver loop:
– Original (CPU) solver loop: I/O, source forces, stiffness and attenuation, nonlinear and buildings, communicate forces, communicate displacements, displacement update, reset
– GPU solver loop: the compute stages (source forces, stiffness and attenuation, nonlinear and buildings, displacement update) run on the GPU, while I/O, communication, and reset remain on the CPU, with GPU sync points where displacements and forces are exchanged between host and device
Benchmark: Chino Hills 2.8 Hz, BKT damping, 1.5 B elements, 2000 time steps (512 compute nodes)
(Patrick Small of USC and Ricardo Taborda of UM, 2014)

Southern California Earthquake Center Hercules on Titan – GPU Performance
– Recent Hercules developments include GPU capabilities using CUDA
– Performance tests for a benchmark 2.8 Hz Chino Hills simulation show near-perfect strong and weak scalability on multiple HPC systems, including Titan using GPUs
– The acceleration of the GPU code with respect to the CPU code is a factor of about 2.5x overall
(Figure: initial strong-scaling tests on Titan, shown in green, compared to other systems)
(Jacobo Bielak of CMU, Ricardo Taborda of UM, and Patrick Small of USC, 2014)

Southern California Earthquake Center Hercules on Titan: La Habra Validation Experiment
Combination of 3 SCEC software platforms (BBP, UCVM, High-F) across multiple HPC systems, including Titan at OLCF and other DOE (Mira) and NSF (Blue Waters) systems:
– Source models: source inversion and centroid moment tensor; BBP extended source generation producing Standard Rupture Format extended and point-source models
– Velocity models: CVM-S4 and the 3D tomographic inversion CVM-S4.26 (perturbations on a 3D grid), delivered as etrees through the UCVM platform
– Forward simulation: Hercules (GPU module) on Titan, Mira, Blue Waters, and USC HPC, producing synthetic seismograms
– Validation: recorded seismograms from the SCEC Data Center; desktop post-processing, validation (GOF scores), and plotting (Matlab + GMT)

Southern California Earthquake Center Algorithms and Hardware Attributes
– Massive I/O requirements map to the Lustre file system (limited OSTs, I/O throughput): addressed with effective buffering, ADIOS, and contiguous I/O access
– Stencil computations map to the node architecture (shared memory, vector capability, memory-intensive, regular computation pattern): addressed with array optimization, memory reduction, and cache utilization
– Nearest-neighbor communication maps to the 3-D torus interconnect (machine topology, communication latency): addressed with node mapping and latency hiding

Southern California Earthquake Center AWP-ODC Communication Approach on Jaguar
– Rank placement technique: node filling with X-Y-Z orders, maximizing intra-node and minimizing inter-node communication
– Asynchronous communication: significantly reduced latency through local communication; reduced system buffer requirement through pre-posted receives
– Computation/communication overlap: effectively hides computation time; effective when T_compute_hide > T_compute_overhead
– One-sided communications (on Ranger); shared memory (joint work with Zizhong Chen of CSM)
(Figures: round-trip latency and synchronization overhead; timing benefit of overlapped vs. sequential computation and communication)
(Joint work with the DK Panda team of OSU)
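The overlap strategy can be illustrated with a generic non-blocking MPI halo exchange; this is a sketch, not the AWP-ODC implementation, and the compute_boundary/compute_interior steps and buffer layout are assumptions.

    #include <mpi.h>
    #include <vector>

    // Sketch of computation/communication overlap for one exchange direction.
    // Boundary cells are computed first and sent with non-blocking calls, the
    // interior is updated while messages are in flight, then the halos are
    // consumed after MPI_Waitall.
    void halo_exchange_step(std::vector<float>& send_lo, std::vector<float>& send_hi,
                            std::vector<float>& recv_lo, std::vector<float>& recv_hi,
                            int lo_rank, int hi_rank, MPI_Comm comm) {
        MPI_Request reqs[4];

        // Pre-post receives to avoid unexpected-message buffering on the system side.
        MPI_Irecv(recv_lo.data(), (int)recv_lo.size(), MPI_FLOAT, lo_rank, 0, comm, &reqs[0]);
        MPI_Irecv(recv_hi.data(), (int)recv_hi.size(), MPI_FLOAT, hi_rank, 1, comm, &reqs[1]);

        // ... compute_boundary(): update the cells that fill send_lo / send_hi ...

        MPI_Isend(send_lo.data(), (int)send_lo.size(), MPI_FLOAT, lo_rank, 1, comm, &reqs[2]);
        MPI_Isend(send_hi.data(), (int)send_hi.size(), MPI_FLOAT, hi_rank, 0, comm, &reqs[3]);

        // ... compute_interior(): the bulk of the stencil work, overlapped with comm ...

        MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
        // ... apply the received halos to the ghost layers ...
    }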

Southern California Earthquake Center AWP-ODC Communication Approach on Titan (Cui et al., SC’13)

Southern California Earthquake Center nvvp (NVIDIA Visual Profiler) Profiling

Southern California Earthquake Center Topology Tuning on XE6/XK7
– Matching the virtual 3D Cartesian grid to an elongated physical subnet prism shape
– Maximizing use of the faster-connected Blue Waters XZ plane allocation
– Obtaining a tighter, more compact, cuboidal-shaped Blue Waters subnet allocation
– Reducing inter-node hops along the slowest Blue Waters torus Y direction
(Figures: default node ordering vs. tuned node ordering using Topaware)
With Topaware node ordering versus the default placement, parallel efficiency remained at 100% at the smallest node count and improved from 87.5% to 90% and from 52.6% to 81% at the two larger node counts.
Joint work with G. Bauer, O. Padron (NCSA), R. Fiedler (Cray) and L. Shih (UH)
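A generic starting point for this kind of topology mapping is MPI's Cartesian communicator with reordering enabled; the sketch below is not the Topaware tool itself, and the 3D decomposition and periodicity choices are placeholders.

    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int nprocs, world_rank;
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        // Let MPI factor the process count into a 3D virtual grid.
        int dims[3] = {0, 0, 0};
        MPI_Dims_create(nprocs, 3, dims);

        // reorder = 1 allows the MPI library to renumber ranks so the virtual
        // 3D Cartesian grid maps more closely onto the physical network;
        // tools such as Topaware go further by choosing an explicit node ordering.
        int periods[3] = {0, 0, 0};
        MPI_Comm cart;
        MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, /*reorder=*/1, &cart);

        int cart_rank, coords[3];
        MPI_Comm_rank(cart, &cart_rank);
        MPI_Cart_coords(cart, cart_rank, 3, coords);

        // Nearest neighbors along the X dimension, used for halo exchange.
        int x_lo, x_hi;
        MPI_Cart_shift(cart, /*direction=*/0, /*disp=*/1, &x_lo, &x_hi);

        std::printf("world %d -> cart %d at (%d,%d,%d)\n",
                    world_rank, cart_rank, coords[0], coords[1], coords[2]);

        MPI_Comm_free(&cart);
        MPI_Finalize();
        return 0;
    }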

Southern California Earthquake Center Two-layer I/O Model
Parallel input:
– Read and redistribute multiple terabytes of input (19 GB/s)
– Contiguous block reads from a shared file by a reduced number of reader cores
– High-bandwidth asynchronous point-to-point communication for redistribution
Parallel output:
– Temporal aggregation buffers: each temporal aggregator collects multiple time steps (time step 1, 2, ..., N) before writing
– Contiguous writes to the OSTs, aligned with the Lustre stripe size, through MPI-IO
– Aggregate and write at 10 GB/s; throughput is system-adaptive at run time
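A minimal sketch of the output side of such a two-layer model is shown below; it is not the AWP-ODC I/O code, and the aggregation factor, data sizes, and block-wise file layout are illustrative assumptions.

    #include <mpi.h>
    #include <vector>
    #include <cstddef>

    // Two-layer output sketch: each rank acts as its own temporal aggregator,
    // buffering nagg time steps locally and flushing them as one large,
    // contiguous collective write. Layout assumption (illustrative): the file
    // is organized in blocks of nagg steps; within a block, each rank's nagg
    // steps are stored contiguously.
    void buffer_and_flush(MPI_File fh, const std::vector<float>& local_step,
                          std::vector<float>& buffer, int step, int nagg,
                          int rank, int nranks) {
        buffer.insert(buffer.end(), local_step.begin(), local_step.end());
        if ((step + 1) % nagg != 0) return;          // keep buffering

        const std::size_t local_count = local_step.size();
        const long long block = step / nagg;         // which group of nagg steps
        MPI_Offset off = (MPI_Offset)(block * nranks + rank)
                       * (MPI_Offset)(nagg * local_count) * (MPI_Offset)sizeof(float);

        // One contiguous collective write per flush, sized nagg * local_count.
        MPI_File_write_at_all(fh, off, buffer.data(), (int)buffer.size(),
                              MPI_FLOAT, MPI_STATUS_IGNORE);
        buffer.clear();
    }

In practice the file handle would be opened collectively with MPI_File_open (MPI_MODE_CREATE | MPI_MODE_WRONLY), and stripe count and stripe size would be matched to the aggregated write size through the file system or MPI-IO striping hints.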

Southern California Earthquake Center Fast-X or Fast-T (Poyraz et al., ICCS'14)
– Fast-X: small-chunked and more interleaved file layout
– Fast-T: large-chunked and less interleaved file layout
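One plausible way to read these two layouts (an illustrative sketch, not the paper's exact definitions) is in terms of where the sample for time step t and global grid point g lands in the file:

    #include <cstdint>

    // Illustrative offset formulas for a file holding NT time steps of NG
    // grid-point samples, 4 bytes each. Each writer owns a contiguous range
    // of grid points [g0, g0 + ng).

    // Fast-X: the spatial index varies fastest. At every time step each writer
    // emits only its ng samples (a small chunk), and chunks from different
    // writers interleave throughout the file.
    std::uint64_t offset_fast_x(std::uint64_t t, std::uint64_t g, std::uint64_t NG) {
        return (t * NG + g) * sizeof(float);
    }

    // Fast-T: the time index varies fastest. Each grid point's full time
    // series is contiguous, so a writer can emit one large chunk of ng * NT
    // samples, with little interleaving between writers.
    std::uint64_t offset_fast_t(std::uint64_t t, std::uint64_t g, std::uint64_t NT) {
        return (g * NT + t) * sizeof(float);
    }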

Southern California Earthquake Center ADIOS Checkpointing
Problems at M8 scale on Jaguar: system instabilities, 32 TB of checkpoint data per time step.
A Chino Hills 5 Hz simulation validated the ADIOS implementation:
– Mesh size: 7000 x 5000 x 2500
– Number of cores: 87,500 on Jaguar
– Wallclock time: 3 hours; total time steps: 40K
– ADIOS saved checkpoints at the 20K-th time step, and the outputs were validated at the 40K-th time step
– Average I/O performance: 22.5 GB/s (compared to 10 GB/s writing achieved with manually tuned MPI-IO code)
Implementation supported by Norbert Podhorszki, Scott Klasky, and Qing Liu at ORNL. Future plan: add ADIOS checkpointing to the GPU code (AWP-ODC, ADIOS, Lustre).
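For context, a minimal checkpoint write using the current ADIOS2 C++ API might look like the sketch below; note that the work described on this slide used the earlier ADIOS 1.x API, and the variable name, sizes, and file name here are placeholders.

    #include <adios2.h>
    #include <mpi.h>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, nranks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        // Placeholder sizes: each rank owns a 1D slab of a global array.
        const std::size_t local_n = 1 << 20;
        std::vector<float> velocity(local_n, 0.0f);

        adios2::ADIOS adios(MPI_COMM_WORLD);
        adios2::IO io = adios.DeclareIO("checkpoint");

        // Global shape, this rank's offset, and this rank's count.
        auto var = io.DefineVariable<float>(
            "velocity", {local_n * (std::size_t)nranks},
            {local_n * (std::size_t)rank}, {local_n});

        adios2::Engine writer = io.Open("checkpoint.bp", adios2::Mode::Write);
        writer.BeginStep();
        writer.Put(var, velocity.data());
        writer.EndStep();      // data is flushed to the .bp output here
        writer.Close();

        MPI_Finalize();
        return 0;
    }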

Southern California Earthquake Center SEISM-IO: An I/O Library for Integrated Seismic Modeling (backends: ADIOS, HDF5, pNetCDF)

Southern California Earthquake Center CPUs/GPUs Co-scheduling
– CPUs run reciprocity-based seismogram and intensity computations while GPUs are used for strain Green tensor calculations
– Run multiple MPI jobs on compute nodes using Node Managers (MOM)
Outline of cybershake_coscheduling.py:

    aprun -n 50 &                      # launch the GPU job and get its PID
    build all the CyberShake input files
    divide up the nodes and work among a customizable number of jobs
    for each job:
        fork extract_sgt.py            # performs pre-processing and launches
                                       # "aprun -n -N 15 -r 1 &"; get PID of the CPU job
    while executable A jobs are running:
        check PIDs to see if a job has completed
        if completed: launch "aprun -n -N 15 -r 1 &"
    while executable B jobs are running:
        check for completion
    check for GPU job completion

Southern California Earthquake Center Post-processing on CPUs: API for Pthreads
The AWP-API lets individual pthreads make use of the CPUs for post-processing:
– Vmag, SGT, and seismogram outputs
– Statistics (real-time performance measurement)
– Adaptive/interactive control tools
– In-situ visualization
– Output writing is introduced as a pthread that uses the API
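The pattern of handing post-processing to a dedicated thread can be sketched as below; this is generic POSIX threads code, not the AWP-API itself, and the output_writer function and its argument struct are hypothetical.

    #include <pthread.h>
    #include <cstdio>
    #include <vector>

    // Hypothetical argument block handed to the post-processing thread.
    struct WriterArgs {
        const std::vector<float>* velocity;   // snapshot produced by the solver
        int time_step;
    };

    // Post-processing routine run on a CPU thread while the solver (e.g. the
    // GPU code) advances the next time steps.
    void* output_writer(void* p) {
        WriterArgs* args = static_cast<WriterArgs*>(p);
        std::printf("writing %zu values for step %d\n",
                    args->velocity->size(), args->time_step);
        // ... compute Vmag / statistics and write the output here ...
        return nullptr;
    }

    int main() {
        std::vector<float> velocity(1024, 0.0f);   // placeholder solver output
        WriterArgs args{&velocity, 0};

        pthread_t tid;
        pthread_create(&tid, nullptr, output_writer, &args);   // spawn writer
        // ... the main thread would launch the next solver step here ...
        pthread_join(tid, nullptr);                             // sync before reuse
        return 0;
    }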

Southern California Earthquake Center CyberShake Calculations
CyberShake contains two phases:
– Strain Green Tensor (SGT) calculation: large MPI jobs (AWP-ODC-SGT GPU), about 85% of CyberShake compute time
– Post-processing (reciprocal calculation): many (~400k) serial, high-throughput, loosely coupled jobs, managed with workflow tools
Both phases are required to determine seismic hazard at one site; for a hazard map, ~200 sites must be calculated.

Southern California Earthquake Center CyberShake Workflows
(Figure: workflow structure, consisting of a Tensor Workflow with mesh generation and tensor simulation jobs; a Post-Processing Workflow with tensor extraction and SeisPSA jobs grouped under PMC, roughly 85,000 jobs; and a Data Products Workflow with database insertion and hazard-curve generation.)

Southern California Earthquake Center CyberShake Workflows Using Pegasus-MPI-Cluster
High-throughput jobs wrapped in an MPI master-worker job (Rynge et al., XSEDE 2012)
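The master-worker idea behind Pegasus-MPI-Cluster can be illustrated with a generic MPI skeleton; this is not the PMC implementation, and the task list and the "run task" step are placeholders.

    #include <mpi.h>
    #include <cstdio>

    // Generic MPI master-worker skeleton: rank 0 hands out task IDs one at a
    // time; workers request work, "run" the task, and ask again until they
    // receive a stop signal (-1).
    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, nranks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        const int ntasks = 1000;   // placeholder for the high-throughput job list

        if (rank == 0) {
            int next = 0, active = nranks - 1;
            while (active > 0) {
                int dummy;
                MPI_Status st;
                MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
                int task = (next < ntasks) ? next++ : -1;   // -1 = no more work
                if (task < 0) --active;
                MPI_Send(&task, 1, MPI_INT, st.MPI_SOURCE, 1, MPI_COMM_WORLD);
            }
        } else {
            for (;;) {
                int request = 0, task;
                MPI_Send(&request, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
                MPI_Recv(&task, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                if (task < 0) break;
                std::printf("rank %d running task %d\n", rank, task);   // run the task
            }
        }

        MPI_Finalize();
        return 0;
    }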

Southern California Earthquake Center CyberShake Study 14.2 Metrics
– 1,144 hazard curves (4 maps) on NCSA Blue Waters
– 342 hours wallclock time (14.25 days)
– 46,720 CPUs and GPUs in use on average, with a peak of 295,040 CPUs and 1,100 GPUs
– GPU SGT code 6.5x more efficient than the CPU SGT code (XK7 vs. XE6 at the node level)
– 99.8 million jobs executed (81 jobs/second); 31,463 jobs automatically run in the Blue Waters queue
– On average, 26.2 workflows (curves) running concurrently

Southern California Earthquake Center CyberShake SGT Simulations on XK7 vs XE6

    CyberShake 1.0 Hz                   XE6      XK7      XK7 (CPU-GPU co-scheduling)
    Nodes                               400
    SGT hours per site
    Post-processing hours per site**                      2.00
    Total hours per site
    Total SUs (millions)*               723 M    299 M    179 M
    SU savings (millions)                        424 M    543 M

3.7x speedup
* Scaled to 5000 sites based on two strain Green tensor runs per site
** Based on the CyberShake 13.4 map

Southern California Earthquake Center Broadband Platform Workflow
Broadband Platform software distributions:
– Source codes and input config files: 2 GB (increases as the platform runs)
– Data files (Green's functions): 11 GB (static input files)
(Figure: Broadband Platform workflow, including Hercules)

Southern California Earthquake Center Earthquake Problems at Extreme Scale
Dynamic rupture simulations:
– Current 1D outer/inner scale ratio: 6x10^5
– Target: 1D 6x10^5 m / 0.001 m (ratio 6x10^8)
Wave propagation simulations:
– Current 4D scale ratio: 1x10^17
– Target 4D scale ratio: 3x10^23
Data-intensive simulations:
– Current tomography simulations: ~0.5 PB; the plan is to carry out 5 iterations, with 1.9 TB for each seismic source, for a total of at least 441 TB over the duration of the inversion
– Target tomography simulations: ~32 XBs

Southern California Earthquake Center SCEC Computational Plan on Titan
– Material heterogeneities / wave propagation: 2-Hz regional simulations for CVM with small-scale stochastic material perturbations (AWP-ODC-GPU; 8 runs, 13 M SUs)
– Attenuation and source / wave propagation: 10-Hz simulations integrating rupture dynamics results and the wave propagation simulator (AWP-ODC-GPU, SORD; 5 runs, 19 M SUs)
– Structural representation and wave propagation: 4-Hz scenario and validation simulations, integrating frequency-dependent Q, topography, and nonlinear wave propagation (Hercules-GPU; 5 runs, 20 M SUs)
– CyberShake PSHA: 1.0-Hz hazard map (AWP-SGT-GPU)
– CyberShake PSHA: 1.5-Hz hazard map (AWP-SGT-GPU)
Total: > 282 M SUs

Southern California Earthquake Center SCEC Software Development
Advanced algorithms:
– Development of discontinuous-mesh AWP
– New physics: near-surface heterogeneities, frequency-dependent attenuation, fault roughness, near-fault plasticity, soil non-linearities, topography
– High-F simulation of the ShakeOut scenario at 0-4 Hz or higher
Prepare SCEC HPC codes for next-generation systems:
– Programming model: three levels of parallelism to address accelerating technology; portability; data locality and communication avoidance
– Automation: improvement of SCEC workflows
– I/O and fault tolerance: cope with millions of simultaneous I/O requests; support multi-tiered I/O systems for scalable data handling; MPI/network and node-level fault tolerance
– Performance: hybrid heterogeneous computing; support for in-situ and post-hoc data processing; load balancing
– Benchmark SCEC mini-applications and tune on next-generation processors and interconnects

Southern California Earthquake Center Acknowledgements
Collaborators: Matthew Norman, Jack Wells, Judith Hill (ORNL); Carl Ponder, Cyril Zeller, Stanley Posey and Roy Kim (NVIDIA); Jeffrey Vetter, Mitch Horton, Graham Lopez, Richard Glassbrook (NICS); D.K. Panda and Sreeram Potluri (OSU); Gregory Bauer, Jay Alameda, Omar Padron (NCSA); Robert Fiedler (Cray); Liwen Shih (UH); Gideon Juve, Mats Rynge, Karan Vahi and Gaurang Mehta (USC)
Computing resources: NCSA Blue Waters, OLCF Titan, ALCF Mira, XSEDE Keeneland, USC HPCC, XSEDE Stampede/Kraken, and NVIDIA GPUs donated to HPGeoC. Computations on Titan are supported through the DOE INCITE program under DE-AC05-00OR22725.
NSF grants: SI2-SSI (OCI), Geoinformatics (EAR), XSEDE (OCI), NCSA NEIS-P2/PRAC (OCI).
This research was supported by SCEC, which is funded by NSF Cooperative Agreement EAR and USGS Cooperative Agreement 07HQAG0008.

Southern California Earthquake Center End