CyberShake Study 2.2: Computational Review
Scott Callaghan
Computational Goals
- 269 CyberShake sites on Kraken with existing SGTs
  - 47 complete, 221 remaining, 1 lost
- Produce seismograms, PSA values, hazard curves
- Establish the Kraken / Cray architecture as a platform for CyberShake
Study Goal Map
[Figure: map of study sites]
Inputs
- 221 sets of SGTs + MD5 sums (see the checksum sketch below)
  - 5 on HPC, 184 on Ranger disk, 31 on Ranch archive
  - Archived sets will need to be staged back to Ranger
  - About 8.5 TB
- Rupture geometries
  - 14,000 files, 1.5 GB
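Because the SGT sets carry MD5 sums, a verification pass after staging from Ranch guards against corruption in transit. A minimal sketch, assuming each SGT file has a sibling `.md5` file holding the expected hex digest (the actual naming convention isn't given in the deck):

```python
import hashlib
from pathlib import Path

def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_sgt_dir(sgt_dir):
    """Compare each *.sgt file against its .md5 sidecar; return mismatches."""
    failures = []
    for sgt in Path(sgt_dir).glob("*.sgt"):
        expected = Path(str(sgt) + ".md5").read_text().split()[0]
        if md5sum(sgt) != expected:
            failures.append(sgt.name)
    return failures
```

Any names returned by `verify_sgt_dir` would simply be re-staged from the archive.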
Outputs
- Files
  - 116 M seismograms, 116 M PSA files: 2.5 TB (2.1 TB new)
  - 350,000 workflow files: 1.3 TB (1.1 TB new)
  - Small number of curves, maps
- Database
  - 350 M entries (a 37% increase)
  - About 40 GB
- Access
  - Hazard curves, maps posted on web site
  - PSA values in DB (see the query sketch below)
  - Seismograms on disk
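Since PSA values are served out of the database, retrieval would look something like the sketch below. The table and column names are hypothetical placeholders, and pymysql is an assumed client; the study's actual schema and access stack aren't described in the deck.

```python
import pymysql  # assumed client library; the deck doesn't specify the stack

def fetch_psa(host, user, password, site, period):
    """Fetch PSA values for one site/period from a hypothetical schema."""
    conn = pymysql.connect(host=host, user=user, password=password,
                           database="cybershake")
    try:
        with conn.cursor() as cur:
            # Table and column names are illustrative, not the real schema.
            cur.execute(
                "SELECT Rupture_ID, Rup_Var_ID, PSA_Value "
                "FROM PeakAmplitudes WHERE Site = %s AND Period = %s",
                (site, period))
            return cur.fetchall()
    finally:
        conn.close()
```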
Computing Environment/Resources
- Kraken nodes
- Pegasus 4.2.0 + PMC
- SGT extraction code
- Memcached (see the caching sketch below)
- In-memory rupture variation generation
- Seismogram/PSA code
- Combined CyberShake codes tagged in SVN
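Memcached and in-memory rupture variation generation work together to keep regenerated data close to the post-processing tasks. A minimal sketch of the caching pattern, assuming the pymemcache client and a hypothetical key layout (neither is specified in the deck):

```python
from pymemcache.client.base import Client

# Hypothetical key layout; the study's actual cache schema isn't given here.
def rv_key(source_id, rupture_id, rv_id):
    return f"rv:{source_id}:{rupture_id}:{rv_id}"

def get_or_generate(client, source_id, rupture_id, rv_id, generate):
    """Return a cached rupture variation, generating and caching on a miss."""
    key = rv_key(source_id, rupture_id, rv_id)
    cached = client.get(key)
    if cached is not None:
        return cached
    data = generate(source_id, rupture_id, rv_id)  # expensive step; returns bytes
    client.set(key, data)
    return data

client = Client(("localhost", 11211))  # memcached running on the compute node
```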
Computing Resources
- Allocation: 1.2M Kraken SUs needed, ~5M SUs available
- Local disk space
  - 3.2 TB additional required
  - 4.8 TB available on scec-02
- Duration
  - Start 10/8 (with review approval)
  - ~2 months (dependent on Kraken queue, I/O)
- Personnel
  - Scott
  - Request help from Pegasus group when needed
Reproducibility
- Science code tagged in SVN
- Metadata captured in database
- SGTs long-term
  - Ranger decommissioned in Feb
  - Either archive the SGTs or discard and regenerate them
Metrics
- Calculate metrics previously highlighted in papers and posters, especially:
  - Average makespan
  - Parallel speedup
  - Utilization
  - Tasks/sec
  - Delay per job
- SI2 metrics
  - Number of hazard curves
- Compare metrics against prior studies to determine improvement (see the sketch below)
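For concreteness, the core workflow metrics reduce to simple functions of per-task records: makespan is the span from first task start to last task end, utilization is busy core-time over available core-time, and throughput is tasks over makespan. A minimal sketch, assuming task records with start/end times and core counts (the real numbers would come from workflow logs, not this layout):

```python
from dataclasses import dataclass

@dataclass
class TaskRecord:    # assumed layout; real records come from workflow logs
    start: float     # task start time (seconds since workflow start)
    end: float       # task end time
    cores: int       # cores the task occupied

def workflow_metrics(tasks, total_cores):
    """Compute makespan, utilization, and task throughput for one workflow."""
    makespan = max(t.end for t in tasks) - min(t.start for t in tasks)
    busy_core_seconds = sum((t.end - t.start) * t.cores for t in tasks)
    utilization = busy_core_seconds / (total_cores * makespan)
    tasks_per_sec = len(tasks) / makespan
    return makespan, utilization, tasks_per_sec
```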
Open Issues / Risk Analysis
- Kraken I/O
  - Depending on file system performance, runtime can vary by a factor of 3
- Kraken gridmanager
  - Will it support the load?
- SUs
  - Uncertain about usage by other SCEC users
- Statistics gathering
  - Have had issues with pegasus-monitord in the past
  - May have to populate the DB after the workflow completes (see the sketch below)
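For the pegasus-monitord fallback, Pegasus supports replaying a finished run's dagman.out log to repopulate the statistics database after the fact. A minimal sketch, assuming monitord's replay flag (--replay / -r) behaves in the installed 4.2.0 release as documented, with the run-directory layout as a further assumption:

```python
import subprocess
from pathlib import Path

def replay_monitord(run_dir):
    """Repopulate workflow statistics from a completed run's dagman.out log.

    Assumes pegasus-monitord's replay mode (--replay / -r); verify the flag
    against the installed Pegasus 4.2.0 release before relying on this.
    """
    dagman_out = next(Path(run_dir).glob("*.dag.dagman.out"))  # assumed layout
    subprocess.run(["pegasus-monitord", "--replay", str(dagman_out)], check=True)
```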