May 30-31, 2012 HDF5 Workshop at PSI
Metadata Journaling
  Dana Robinson, The HDF Group
  Efficient Use of HDF5 With High Data Rate X-Ray Detectors
  Paul Scherrer Institut
Journaling
  Suppose we have some interrelated metadata that we would like to write into a file.
Journaling
  If the write is interrupted (process killed, etc.), then we will have an invalid/corrupt file.
Journaling
  Journaling avoids the corrupt file problem by recording the set of writes (a transaction) in a journal.
Journaling
  When a transaction is interrupted, a recovery tool (h5recover) can repair the file.
HDF5 Implementation
  HDF feature.
  Prevents loss of an entire file due to a crashed writer.
  Covers file metadata only! We make no guarantees about data.
  Works with parallel.
  Currently uses an external journal file.
  Journaling slows performance.
Superblock Additions
  Journaling flag
  Internal/external flag
  Journal location (path or address)
  Journal version number
Journal File Format
  Binary file
  [Format grammar: the production rules on this slide did not survive extraction]
New API Call
    herr_t H5Pset_jnl_config(hid_t plist_id, const H5AC_jnl_config_t *config_ptr);
  Takes a struct (H5AC_jnl_config_t) which contains journaling parameters.
H5AC_jnl_config_t
    typedef struct H5AC_jnl_config_t {
        int     version;
        /* metadata journaling configuration fields: */
        hbool_t enable_journaling;
        char    journal_file_path[H5AC__MAX_JOURNAL_FILE_NAME_LEN + 1];
        hbool_t journal_recovered;
        size_t  jbrb_buf_size;
        int     jbrb_num_bufs;
        hbool_t jbrb_use_aio;
        hbool_t jbrb_human_readable;
    } H5AC_jnl_config_t;
  Defined and documented in H5ACpublic.h
  Will be documented in the reference manual in HDF
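To illustrate how such a configuration might be filled in before it is handed to H5Pset_jnl_config(), here is a minimal sketch. It does not use the HDF5 headers: sketch_jnl_config_t is a local mirror of the fields listed on the slide, and SKETCH_MAX_JOURNAL_FILE_NAME_LEN, the version value 1, and the helper sketch_jnl_config_init() are all assumptions made for illustration, not part of the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* ASSUMPTION: stand-in for H5AC__MAX_JOURNAL_FILE_NAME_LEN; the real value
 * is not given in the slides. */
#define SKETCH_MAX_JOURNAL_FILE_NAME_LEN 1024

/* Local mirror of the fields shown on the slide (hbool_t replaced by bool). */
typedef struct sketch_jnl_config_t {
    int    version;
    bool   enable_journaling;
    char   journal_file_path[SKETCH_MAX_JOURNAL_FILE_NAME_LEN + 1];
    bool   journal_recovered;
    size_t jbrb_buf_size;
    int    jbrb_num_bufs;
    bool   jbrb_use_aio;
    bool   jbrb_human_readable;
} sketch_jnl_config_t;

/* Hypothetical helper: fill a configuration that enables journaling to an
 * external journal file. Returns 0 on success, -1 if the path is too long
 * or the buffer parameters are not usable. */
int sketch_jnl_config_init(sketch_jnl_config_t *cfg, const char *path,
                           size_t buf_size, int num_bufs)
{
    assert(cfg != NULL && path != NULL);

    if (strlen(path) > SKETCH_MAX_JOURNAL_FILE_NAME_LEN)
        return -1;
    if (buf_size == 0 || num_bufs < 1)
        return -1;

    memset(cfg, 0, sizeof(*cfg));
    cfg->version             = 1;        /* ASSUMPTION: placeholder version */
    cfg->enable_journaling   = true;
    strcpy(cfg->journal_file_path, path); /* length checked above */
    cfg->journal_recovered   = false;
    cfg->jbrb_buf_size       = buf_size; /* size of each ring buffer entry */
    cfg->jbrb_num_bufs       = num_bufs; /* number of buffers in the ring */
    cfg->jbrb_use_aio        = false;
    cfg->jbrb_human_readable = false;
    return 0;
}
```

With the real library, the filled struct would presumably be passed along with a property list to H5Pset_jnl_config(plist_id, &config), as the signature on the previous slide suggests.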
Transaction Start/End
  Transaction start/end calls are added to the beginning and end of all API functions that modify metadata.
    H5Xdo_something()
    {
        start_transaction();
        /* Do things which modify metadata */
        end_transaction();
    }
Ring Buffer
  [Figure: Journal Buffers 0 through 4 arranged in a ring]
Journal Buffers
  Contain raw/binary journal entries to be streamed later to the journal location.
  These journal entries vary in size.
  [Figure: one buffer holding Entry 0 through Entry 3]
Start Transaction
  Assign the next transaction ID number.
  Insert a "begin transaction" message in the journal buffer with that transaction ID.
  Return the ID to the caller.
Insert a Journal Entry
  Check for space in the current journal buffer.
  If there is no space:
    Start an asynchronous write of the current journal buffer.
    Test whether the next buffer has an uncompleted write.
    If it does, stall until that write completes.
    Switch to the next journal buffer.
  Make an entry in the journal buffer.
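The insertion steps above can be sketched as a small ring of fixed-size buffers. This is only an illustration under stated assumptions: the names jring_t, jring_insert(), NUM_BUFS and BUF_SIZE are hypothetical, and the asynchronous write is simulated with a write_pending flag instead of real asynchronous I/O, so "stalling" simply clears the flag and counts the stall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_BUFS 4   /* hypothetical number of buffers in the ring */
#define BUF_SIZE 64  /* hypothetical size of each buffer in bytes */

typedef struct {
    unsigned char data[BUF_SIZE];
    size_t        used;           /* bytes of journal entries stored */
    bool          write_pending;  /* simulated outstanding async write */
} jbuf_t;

typedef struct {
    jbuf_t bufs[NUM_BUFS];
    int    current;         /* index of the buffer receiving entries */
    int    writes_started;  /* simulated async writes issued so far */
    int    stalls;          /* times we had to wait for a pending write */
} jring_t;

void jring_init(jring_t *r) { memset(r, 0, sizeof(*r)); }

/* Stand-in for issuing an asynchronous write of buffer idx. */
static void start_async_write(jring_t *r, int idx)
{
    r->bufs[idx].write_pending = true;
    r->writes_started++;
}

/* Stand-in for blocking until the write of buffer idx has completed. */
static void wait_for_write(jring_t *r, int idx)
{
    r->bufs[idx].write_pending = false;
    r->stalls++;
}

/* Insert one variable-size entry. Returns 0 on success, -1 if the entry
 * is empty or cannot fit in any single buffer. */
int jring_insert(jring_t *r, const void *entry, size_t len)
{
    assert(r != NULL && entry != NULL);
    if (len == 0 || len > BUF_SIZE)
        return -1;

    jbuf_t *cur = &r->bufs[r->current];
    if (cur->used + len > BUF_SIZE) {          /* no space in current buffer */
        start_async_write(r, r->current);
        int next = (r->current + 1) % NUM_BUFS;
        if (r->bufs[next].write_pending)       /* next buffer still in flight */
            wait_for_write(r, next);
        r->current = next;                     /* switch to the next buffer */
        cur = &r->bufs[next];
        cur->used = 0;                         /* its old contents are on disk */
    }
    memcpy(cur->data + cur->used, entry, len); /* make the entry */
    cur->used += len;
    return 0;
}
```

With 16-byte entries, the first 16 insertions fill all four buffers without stalling; the 17th wraps around to buffer 0, whose simulated write is still pending, and records one stall.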
End Transaction
  Insert an "end transaction" entry into the journal buffer.
  Increment the transaction ID number.
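As a rough sketch of the start and end steps, the code below keeps a transaction counter and appends begin/end messages to a flat in-memory log standing in for the journal buffer. The types jlog_t and jmsg_t, the choice to start IDs at 1, and the refusal to nest transactions are assumptions for illustration; the slides do not specify them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_CAPACITY 32  /* hypothetical capacity of the toy journal log */

typedef enum { MSG_BEGIN, MSG_END } msg_kind_t;

typedef struct {
    msg_kind_t kind;
    uint64_t   txn_id;
} jmsg_t;

typedef struct {
    jmsg_t   msgs[LOG_CAPACITY];
    size_t   count;
    uint64_t next_txn_id;  /* ID that the next transaction will receive */
    bool     in_txn;       /* true between start and end */
} jlog_t;

void jlog_init(jlog_t *log)
{
    log->count       = 0;
    log->next_txn_id = 1;  /* ASSUMPTION: IDs start at 1, 0 means failure */
    log->in_txn      = false;
}

static int jlog_append(jlog_t *log, msg_kind_t kind, uint64_t id)
{
    if (log->count == LOG_CAPACITY)
        return -1;
    log->msgs[log->count].kind   = kind;
    log->msgs[log->count].txn_id = id;
    log->count++;
    return 0;
}

/* Assign the next ID, log a "begin transaction" message, return the ID.
 * Returns 0 if a transaction is already open or the log is full. */
uint64_t start_transaction(jlog_t *log)
{
    assert(log != NULL);
    if (log->in_txn)
        return 0;
    uint64_t id = log->next_txn_id;
    if (jlog_append(log, MSG_BEGIN, id) != 0)
        return 0;
    log->in_txn = true;
    return id;
}

/* Log an "end transaction" message, then increment the transaction ID.
 * Returns 0 on success, -1 if id does not match the open transaction. */
int end_transaction(jlog_t *log, uint64_t id)
{
    assert(log != NULL);
    if (!log->in_txn || id != log->next_txn_id)
        return -1;
    if (jlog_append(log, MSG_END, id) != 0)
        return -1;
    log->in_txn = false;
    log->next_txn_id++;
    return 0;
}
```

An API function that modifies metadata would call start_transaction() on entry, journal its metadata writes under the returned ID, and call end_transaction() with that ID before returning.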
Flush and Close
  Flush
    Write the current journal buffer to disk.
    Flush journal entries.
    Truncate the journal.
  Close
    Flush (as above).
    Load the superblock and set the journaling flag to FALSE.
    Sync the superblock.
Parallel
  Requires relatively few changes.
  Transaction entries must be serialized at sync points and at the end of each transaction.
  Process 0 handles the transaction I/O.
  Journal I/O happens only at sync points for better I/O efficiency.
h5recover
    h5recover [OPTIONS] [OBJECTS] [HDF5_FILE]

    OBJECTS
      -j, --journal [JOURNAL_FILE]  journal file name
    OPTIONS
      -b, --backup [BACKUP_NAME]    Specify a name for the backup copy of the HDF5 file.
                                    default = '[HDF5_FILE].backup'
      -f, --force                   Recover without confirmation if the journal file is empty.
      -n, --nocopy                  Do not create a backup copy.
      -h, --help                    Print a usage message and exit
      -v, --verbose                 Generate more verbose output
                                    (repeat for increased verbosity)
      -V, --version                 Print version number and exit
      -x, --examine                 Examine the supplied file(s), report, and exit without action.
h5recover Algorithm
  Try to find the superblock in the target file.
  Check to see if the journaling flag is set.
  Try to find the journal.
  Open the journal and validate it.
  Apply all metadata writes specified in the journal up to the last transaction.
  Reset the journaling flag and flush the file to disk.
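The replay step can be illustrated with a toy model, which is not how h5recover itself is implemented. Here the "file" is a small byte array with a journaling flag, and the journal is an array of begin/write/end records; toy_file_t, rec_t and toy_recover() are hypothetical names. The sketch also assumes that "up to the last transaction" means the last transaction with a matching end record, so writes from a trailing, interrupted transaction are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILE_SIZE 16  /* hypothetical size of the toy metadata region */

typedef struct {
    bool    journaling_flag;  /* stands in for the superblock flag */
    uint8_t bytes[FILE_SIZE];
} toy_file_t;

typedef enum { REC_BEGIN, REC_WRITE, REC_END } rec_kind_t;

typedef struct {
    rec_kind_t kind;
    uint64_t   txn_id;
    size_t     offset;  /* used by REC_WRITE only */
    uint8_t    value;   /* used by REC_WRITE only */
} rec_t;

/* Replay a journal into the file. Returns the number of completed
 * transactions applied, 0 if the journaling flag is not set, or -1 if the
 * journal is malformed (the file is left untouched in that case). */
int toy_recover(toy_file_t *f, const rec_t *journal, size_t n)
{
    assert(f != NULL && (journal != NULL || n == 0));

    if (!f->journaling_flag)
        return 0;  /* file was closed cleanly, nothing to do */

    /* Pass 1: validate, and find the end of the last complete transaction. */
    size_t   last_end = 0;  /* index one past the last REC_END */
    bool     open     = false;
    uint64_t cur      = 0;
    int      complete = 0;
    for (size_t i = 0; i < n; i++) {
        const rec_t *r = &journal[i];
        switch (r->kind) {
        case REC_BEGIN:
            if (open) return -1;
            open = true;
            cur  = r->txn_id;
            break;
        case REC_WRITE:
            if (!open || r->txn_id != cur || r->offset >= FILE_SIZE) return -1;
            break;
        case REC_END:
            if (!open || r->txn_id != cur) return -1;
            open     = false;
            last_end = i + 1;
            complete++;
            break;
        default:
            return -1;
        }
    }

    /* Pass 2: apply writes belonging to completed transactions only. */
    for (size_t i = 0; i < last_end; i++)
        if (journal[i].kind == REC_WRITE)
            f->bytes[journal[i].offset] = journal[i].value;

    f->journaling_flag = false;  /* reset the flag once the file is repaired */
    return complete;
}
```

Validating the whole journal before applying anything keeps a malformed journal from leaving the file in a state that is neither the original nor the recovered one.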