Published by Fredrick Lovin. Modified over 9 years ago.
Operated by Los Alamos National Security, LLC for NNSA. UNCLASSIFIED
Slide 1: Data Release
James Nunez (jnunez@lanl.gov)
High Performance Computing Division, Los Alamos National Lab
August 10, 2011
Slide 2: Measurement and Understanding
- ScalaTrace: http://moss.csc.ncsu.edu/~mueller/ScalaTrace
- Darshan (Petascale I/O Characterization Tool): http://www.mcs.anl.gov/darshan
Slide 3: Links to Existing Available LANL Data
- Machine Failure/Usage/Event/Location & Disk Failure Data Sets: http://institutes.lanl.gov/data/
- Traces of MPI-IO Based Synthetic: http://institutes.lanl.gov/data/tdata/
- MPI-IO based synthetic & MPI-File Tree Walk: http://institutes.lanl.gov/data/software/
- File Systems Statistics Survey (fsstats) Code: http://www.pdsi-scidac.org/fsstats/
- File Systems Statistics Survey (fsstats) Results: Los Alamos, Pacific Northwest, Oak Ridge National Lab and other production file system results at http://www.pdsi-scidac.org/cgi-bin/fsstats-list.cgi
- LANL Workstation Data: http://institutes.lanl.gov/data/workstation/
- USENIX Computer Failure Data Repository: http://cdfr.usenix.org
Slide 4: Planned Data Release - Archive Data
- Archive and file system listing data
- Covers HPSS, GPFS-archive and NFS space

An entry in the data file looks like:

    drwx------ 13059 14338 16384 733951 733951 16384 /1/1/3/2
    drwx------ 13059 14338 16384 733951 733951 16384 /1/1/3/2/5
    -rw-rw---- 1769 1491 5904072 734283 734304 524288 /1/1/165/1611/24/3212/2120

The format of each entry is:

    MODE USER_ID GROUP_ID FILE_SIZE MODIFICATION_TIME CREATION_TIME BLOCKSIZE PATH
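The entry format on this slide can be read with a few lines of Python. This is a minimal sketch, not the release's own tooling: the names `ArchiveEntry`, `parse_entry` and `is_directory` are hypothetical, and it assumes the fields are whitespace-separated with the path as the last field. The slide does not state the units of the two time fields, so they are kept as plain integers.

```python
from typing import NamedTuple


class ArchiveEntry(NamedTuple):
    """One listing entry; field names mirror the format line on the slide."""
    mode: str
    user_id: int
    group_id: int
    file_size: int
    modification_time: int  # units not stated on the slide; kept as an integer
    creation_time: int      # units not stated on the slide; kept as an integer
    blocksize: int
    path: str


def parse_entry(line: str) -> ArchiveEntry:
    """Parse one 'MODE USER_ID ... BLOCKSIZE PATH' line (assumed whitespace-separated)."""
    # Split at most 7 times so the path, as the last field, stays in one piece.
    fields = line.strip().split(None, 7)
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    mode, *numbers, path = fields
    return ArchiveEntry(mode, *map(int, numbers), path)


def is_directory(entry: ArchiveEntry) -> bool:
    """A leading 'd' in the mode string marks a directory, as in ls -l output."""
    return entry.mode.startswith("d")


if __name__ == "__main__":
    sample = "-rw-rw---- 1769 1491 5904072 734283 734304 524288 /1/1/165/1611/24/3212/2120"
    entry = parse_entry(sample)
    print(entry.path, entry.file_size, is_directory(entry))
```

A named tuple keeps the sketch close to the slide's field list; a real analysis over millions of entries would more likely stream the file line by line and aggregate sizes per user or per directory depth.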
Slide 5: Planned Data Release - Supercomputer
- Previous
  - Nine years of computer operational failure data, over 23,000 records for several thousand machines
  - Several million usage records (job size, processors/machines used, duration, time, etc.)
- Machine Layout Information (building, room, rack location in room, node location in rack, hot/cold rows, etc.)
- Refresh of Failure Data and Machine Layout from 2006 to present
  - Includes old machines and some new ones
Slide 6: Machine Information
Slide 7: Machine Layout Information
- Anonymous Information
- Machine "name"
- Location - building, room
- Rack Position
- Position in Rack, Direction Row facing
- Hot/Cold Row

    N#,1 to 26,28 to 35,"1 to 37, top to bottom",
    1,23,28,1,rear to N/Hot
    2,23,28,2,rear to N/Hot
    3,23,28,3,rear to N/Hot
    4,23,28,4,rear to N/Hot
    5,23,28,5,rear to N/Hot
    6,23,28,6,rear to N/Hot
    7,23,28,7,rear to N/Hot
    8,23,28,8,rear to N/Hot
    …
    159,23,34,19,rear to N/Hot
    160,23,34,20,rear to N/Hot
    161,23,34,21,rear to N/Hot
    162,23,34,22,rear to N/Hot
    163,23,34,23,rear to N/Hot
    164,23,34,24,rear to N/Hot
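The layout rows on this slide are comma-separated, so Python's standard `csv` module can read them. The sketch below is illustrative only. The slide does not map each column to a label unambiguously, so the names `node`, `field2`, `field3`, `position_in_rack`, `direction` and `row_type` are assumptions, as is the function name `parse_layout`. The fourth column is read as the position in the rack because the range row gives it as "1 to 37, top to bottom".

```python
import csv
import io

# A few rows copied from the slide, including the range row and the elision.
SAMPLE = '''N#,1 to 26,28 to 35,"1 to 37, top to bottom",
1,23,28,1,rear to N/Hot
2,23,28,2,rear to N/Hot
…
164,23,34,24,rear to N/Hot
'''


def parse_layout(text):
    """Return one dict per data row; column names are assumed, not from the source."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        # Keep only 5-field rows whose first field is a number; this skips
        # the range-description row and the elision line.
        if len(record) != 5 or not record[0].isdigit():
            continue
        node, field2, field3, position, facing = record
        # The last field looks like "rear to N/Hot": direction, then row type.
        direction, _, row_type = facing.partition("/")
        rows.append({
            "node": int(node),
            "field2": int(field2),
            "field3": int(field3),
            "position_in_rack": int(position),
            "direction": direction,
            "row_type": row_type,
        })
    return rows


if __name__ == "__main__":
    for row in parse_layout(SAMPLE):
        print(row)
```

Using `csv.reader` rather than a plain `split(",")` matters for the range row, whose quoted fourth field contains a comma.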