S6 Data Production/LSC Data Grid Plans: RDS, h(t), LDG
March 2008 LSC/DASWG F2F Meetings
Gregory Mendell, LIGO Hanford Observatory
LIGO-Gxxxxxxxxx-00-W
S5 Reduced Data Sets (RDS)
RDS frames are subsets of the raw data, with some channels downsampled.
- Level 1 RDS (/archive/frames/S5/L1; type == RDS_R_L1): separate frames for LHO and LLO
  - LHO: 131 H1 channels and 143 H2 channels
  - LLO: 181 L1 channels
- Level 3 RDS (/archive/frames/S5/L3; type == RDS_R_L3): DARM_ERR, STATE_VECTOR, and SEGNUM; separate frames for LHO and LLO
- Level 4 RDS (/archive/frames/S5/L4; type == [H1|H2|L1]_RDS_R_L4): DARM_ERR downsampled by a factor of 4, STATE_VECTOR, and SEGNUM; separate frames for H1, H2, and L1 data
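For illustration, the sketch below enumerates the frame files expected to cover a GPS interval, assuming 64 s Level 1 RDS frames and the standard SITE-TYPE-GPSSTART-DURATION.gwf naming convention; the flat directory layout is a simplifying assumption, not something these slides specify.

# Minimal sketch: enumerate expected Level 1 RDS frame files for a GPS
# interval, assuming 64 s frames and SITE-TYPE-GPSSTART-DURATION.gwf
# naming. The flat directory layout is an assumption for illustration.
import os

def rds_frame_paths(site, frame_type, gps_start, gps_end, duration=64,
                    root="/archive/frames/S5/L1"):
    """Yield expected frame file paths covering [gps_start, gps_end)."""
    start = gps_start - (gps_start % duration)   # align to frame boundary
    for gps in range(start, gps_end, duration):
        name = "%s-%s-%d-%d.gwf" % (site, frame_type, gps, duration)
        yield os.path.join(root, name)

# Example: LHO Level 1 RDS frames for a ~5 minute stretch
for path in rds_frame_paths("H", "RDS_R_L1", 873739000, 873739300):
    print(path)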
LIGO S5 Data Products
Tape archiving assumes 186 GB/tape with a dual copy of all data.

Data product       Data rate    Disk (1 yr)  Tapes (1 yr)  Compression ratio
LHO raw            9.063 MB/s   272 TB       1800          1.7 on tape
LHO Level 1 RDS    1.391 MB/s   42 TB        460           1.2 in files
LHO Level 3 RDS    0.117 MB/s   3.5 TB       39            1.2 in files
LHO Level 4 RDS    0.029 MB/s   0.87 TB      10            1.2 in files
LHO Level 1 h(t)   0.656 MB/s   20.8 TB      230           --
LHO Level 2 h(t)   0.234 MB/s   7.4 TB       80            --
LHO SFTs           0.032 MB/s   0.96 TB      11            --
LLO raw            4.406 MB/s   133 TB       970           1.5 on tape
LLO Level 1 RDS    0.750 MB/s   23 TB        249           1.2 in files
LLO Level 3 RDS    0.059 MB/s   1.8 TB       20            1.2 in files
LLO Level 4 RDS    0.015 MB/s   0.45 TB      5             1.2 in files
LLO Level 1 h(t)   0.328 MB/s   10.4 TB      115           --
LLO Level 2 h(t)   0.117 MB/s   3.8 TB       40            --
LLO SFTs           0.016 MB/s   0.48 TB      5             --
Totals             17.3 MB/s    521 TB       4056          as above

* Not all of the Level 1 RDS data will necessarily fit on cluster node disks and/or inside the tape library at the sites. A copy of all data of all types is stored at the sites on tape, either in the tape library or off-line in tape cabinets.
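The disk and tape columns follow from the data rates. A minimal sketch of that arithmetic, assuming binary prefixes (1 TB = 2^20 MB), 186 GB/tape, and dual copies, reproduces most rows to within rounding:

# Minimal sketch of the table arithmetic: yearly volume and tape count
# from a sustained data rate. Binary prefixes, 186 GB/tape, and dual
# tape copies are assumptions that match the table to within rounding.
SECONDS_PER_YEAR = 365 * 24 * 3600

def yearly_tb(rate_mb_per_s):
    """Volume accumulated in one calendar year, in TB (1 TB = 2**20 MB)."""
    return rate_mb_per_s * SECONDS_PER_YEAR / 2.0**20

def tapes_needed(volume_tb, compression=1.0, copies=2, tape_gb=186.0):
    """Tape count for `copies` copies, compressed by `compression` on tape."""
    return copies * volume_tb * 1024.0 / compression / tape_gb

# Example: LHO raw data at 9.063 MB/s with 1.7x compression on tape
vol = yearly_tb(9.063)                       # ~272 TB
print("%.0f TB, %.0f tapes" % (vol, tapes_needed(vol, compression=1.7)))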
A5/S6 Change to RDS Compression
- The compression of the RDS frames has changed from gzip to zero_suppress_int_float_otherwise_gzip in the A5 Astrowatch data. This will continue with the S6 RDS data.
- The new compression method is already in the C and C++ frame libraries, but not in the fast frame library used by dtt. A request to add it has been sent to Daniel Sigg.
- Tests show that zero_suppress_int_float_otherwise_gzip reduces the size of the Level 1 RDS frames by 30% and speeds up output (and probably input) by around 20%.
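One minimal way to reproduce the 30% size figure is to compare the on-disk size of the same frames written with each method. The sketch below assumes two directories of otherwise identical .gwf files (the paths are hypothetical) and does not depend on any frame-library API:

# Minimal sketch: compare total on-disk size of the same frame files
# written with gzip vs. zero_suppress_int_float_otherwise_gzip.
# The directory paths are hypothetical; only file sizes are used.
import glob
import os

def total_size(directory):
    """Total size in bytes of all .gwf files under `directory`."""
    return sum(os.path.getsize(f)
               for f in glob.glob(os.path.join(directory, "*.gwf")))

gzip_bytes = total_size("/scratch/rds_gzip")
zsup_bytes = total_size("/scratch/rds_zero_suppress")
print("size reduction: %.1f%%" % (100.0 * (1.0 - float(zsup_bytes) / gzip_bytes)))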
Data Products for S6
Raw Full-Frame (L0) Data
- Compress raw data in the frames? (Tests by John Zweizig indicate this can reduce the size by a factor of 2.)
- Change the frame duration from 32 s?
Level 1 RDS (RDS_R_L1) Data
- Requires 65 TB of disk space per year. (LDAS plans to keep this on disk. All IFOs at both sites?)
- Need to finalize the S6 channel list with input from DASWG/GlitchGroup/DetChar.
- Remove channels in the Level 1 h(t) frames from the Level 1 RDS frames?
- Change the 64 s frame duration?
- Split LSC, ASC, PEM, SUS, FSR into separate frames?
- Make a Level 2 RDS of the most-used channels? (e.g., request by Soma)
Level 3 RDS (RDS_R_L3) Data
- Change the 256 s frame duration?
Level 4 RDS ([H1|H2|L1]_RDS_R_L4) Data
- Discontinued.
Timing RDS (requested by Szabi)
- 93 timing channels not in the Level 1 RDS data, at 31 MB per 64 s frame file, for 15 TB/yr of data.
- E.g.: H0:DAQ-FB0_UPTIME_SECONDS, H0:DAQ-FB1_UPTIME_SECONDS, H0:TIM-MSR_BYVAT_TIME_DIFF, H0:TIM-MSR_H2VAT_TIME_DIFF, H1:DAQ-GPS_RAMP_EX, H1:GDS-IRIGB_LVEA, H1:GDS-TIME_MON, H1:GPS-EX_ATOMIC_FAIL…
Online h(t) (HOFT) Data
- LDAS will generate it using Xavi's code. See the next slides.
S5/S6 RDS Generation
- Driver script createrds.tcl uses the Ligotools LDASjob package to submit jobs:
  - Get segments from the LDAS diskcacheAPI
  - Run LDAS createRDS jobs
- RDS frames are written to the archive filesystem
- Publishing/LDR: RDS frames transfer to CIT via LDR, and from there to MIT, UWM, PSU, and AEI
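For concreteness, a minimal sketch of this driver pattern follows. It is written in Python rather than Tcl, and query_diskcache() and submit_createrds_job() are hypothetical stand-ins for the diskcacheAPI query and Ligotools LDASjob submission that the real createrds.tcl performs:

# Minimal sketch of the createrds.tcl driver pattern, transliterated to
# Python. The two stubs are hypothetical stand-ins, not real APIs.
FRAME_DURATION = 64  # Level 1 RDS frame length in seconds

def query_diskcache(frame_type, gps_start, gps_end):
    """Hypothetical stand-in for the LDAS diskcacheAPI query: return the
    [(start, end), ...] segments for which raw frames are on disk."""
    return [(gps_start, gps_end)]  # pretend full coverage for the demo

def submit_createrds_job(gps_start, gps_end, channel_list):
    """Hypothetical stand-in for a Ligotools LDASjob createRDS submission."""
    print("createRDS job: %d-%d (%s)" % (gps_start, gps_end, channel_list))

def run_rds_pass(frame_type, gps_start, gps_end, channel_list):
    segments = query_diskcache(frame_type, gps_start, gps_end)
    covered_end = gps_start
    for seg_start, seg_end in segments:
        if seg_start > covered_end:
            # Gap in the raw data cache: log it so the gap checker can retry.
            print("gap: %d-%d" % (covered_end, seg_start))
        # Submit jobs in whole-frame strides so output frames stay aligned.
        for t in range(seg_start, seg_end, FRAME_DURATION):
            submit_createrds_job(t, min(t + FRAME_DURATION, seg_end), channel_list)
        covered_end = max(covered_end, seg_end)

run_rds_pass("R", 873739008, 873739264, "adcdecimate_H-RDS_R_L1-S5.txt")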
NEW S6 h(t) Generation
- NEW driver script run by LDAS: createhoft.tcl has been written and simple tests run
  - Get segments from the LDAS diskcacheAPI and LSCsegFind
  - Run LSCdataFind and Xavi's code
- h(t) frames are written to the archive filesystem
- Publishing/LDR: h(t) frames transfer to CIT via LDR, and from there to MIT, UWM, PSU, and AEI
- Advantages:
  - Sys admins run the scripts, as for RDS generation
  - Same monitoring/gap checking that makes RDS generation robust
  - Sys admins are aware of LDAS down-time and problems
  - Local sys admins are close to Calibration (Greg & Mike, Igor & Brian)
- Will start testing during Astrowatch!
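The "run LSCdataFind and Xavi's code" step could look like the sketch below. The LSCdataFind options reflect typical S5-era usage, and the executable name computeStrainDriver and its arguments are placeholders, since the slide does not give them:

# Minimal sketch of the per-segment work inside the createhoft driver:
# locate raw frames with LSCdataFind, then hand them to the h(t) code.
# "computeStrainDriver" and its arguments are placeholders.
import subprocess

def generate_hoft(gps_start, gps_end, observatory="H", frame_type="R",
                  cache_file="/tmp/hoft.cache"):
    # Locate the raw frame files covering this segment; save a LAL cache.
    with open(cache_file, "w") as cache:
        subprocess.check_call([
            "LSCdataFind",
            "--observatory", observatory,
            "--type", frame_type,
            "--gps-start-time", str(gps_start),
            "--gps-end-time", str(gps_end),
            "--url-type", "file",
            "--lal-cache",
        ], stdout=cache)
    # Run the h(t) generation code (placeholder executable and arguments).
    subprocess.check_call([
        "computeStrainDriver",              # placeholder for Xavi's code
        "--frame-cache", cache_file,
        "--gps-start-time", str(gps_start),
        "--gps-end-time", str(gps_end),
    ])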
RDS monitoring will include h(t) monitoring!
A directory listing of the scripts and the adc*.txt channel-list files is available; see the Channel Lists slide for the URLs.
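Gap checking of this sort reduces to comparing the GPS coverage implied by the archived frame file names against the expected span. A minimal sketch, assuming the standard SITE-TYPE-GPSSTART-DURATION.gwf naming convention:

# Minimal gap checker: infer GPS coverage from frame file names in the
# archive and report uncovered intervals. Assumes SITE-TYPE-GPS-DUR.gwf
# naming; the real monitoring has more bookkeeping than this.
import glob
import os
import re

FRAME_RE = re.compile(r"^[A-Z]+-\w+-(\d+)-(\d+)\.gwf$")

def find_gaps(directory, span_start, span_end):
    """Return [(start, end), ...] intervals in [span_start, span_end)
    not covered by any frame file in `directory`."""
    spans = []
    for path in glob.glob(os.path.join(directory, "*.gwf")):
        m = FRAME_RE.match(os.path.basename(path))
        if m:
            start, dur = int(m.group(1)), int(m.group(2))
            spans.append((start, start + dur))
    gaps, cursor = [], span_start
    for start, end in sorted(spans):
        if cursor >= span_end:
            break
        if start > cursor:
            gaps.append((cursor, min(start, span_end)))
        cursor = max(cursor, end)
    if cursor < span_end:
        gaps.append((cursor, span_end))
    return gaps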
LSC Data Grid (LDG)

LDG Client*:
package( 'Client-Environment' );
package( 'VDT_CACHE:Globus-Client' );
package( 'VDT_CACHE:CA-Certificates' );
package( 'VDT_CACHE:Condor' );
package( 'VDT_CACHE:GSIOpenSSH' );
package( 'VDT_CACHE:KX509' );
package( 'VDT_CACHE:UberFTP' );
package( 'VDT_CACHE:EDG-Make-Gridmap' );
package( 'VDT_CACHE:VOMS-Client' );
package( 'Client-FixSSH' );
package( 'Client-Cert-Util' );
package( 'Client-LSC-CA' );

LDG ClientPro: same as Client, plus…
package( 'VDT_CACHE:MyProxy' );
package( 'VDT_CACHE:PyGlobus' );
package( 'VDT_CACHE:PyGlobusURLCopy' );
package( 'VDT_CACHE:Globus-RLS-Client' );
package( 'VDT_CACHE:Pegasus' );
package( 'VDT_CACHE:VOMS-Client' );
package( 'VDT_CACHE:Globus-Base-WSGRAM-Client' );
package( 'VDT_CACHE:Globus-WS-Client' );
package( 'VDT_CACHE:TclGlobus-Client' );

LDG Server:
package( 'Server-Environment' )
package( 'VDT_CACHE:Globus' )
package( 'VDT_CACHE:CA-Certificates' )
package( 'VDT_CACHE:CA-Certificates-Updater' )
package( 'VDT_CACHE:Condor' )
package( 'VDT_CACHE:GSIOpenSSH' )
package( 'VDT_CACHE:KX509' )
package( 'VDT_CACHE:MyProxy' )
package( 'VDT_CACHE:UberFTP' )
package( 'VDT_CACHE:EDG-Make-Gridmap' )
package( 'VDT_CACHE:Globus-RLS' )
package( 'VDT_CACHE:Globus-Core' )
package( 'VDT_CACHE:Globus-Condor-Setup' )
package( 'VDT_CACHE:PyGlobus' )
package( 'VDT_CACHE:PyGlobusURLCopy' )
package( 'VDT_CACHE:Pegasus' )
package( 'VDT_CACHE:VOMS-Client' )
package( 'VDT_CACHE:Globus-WS' )
package( 'VDT_CACHE:Tomcat-5.5' )
package( 'VDT_CACHE:TclGlobus' )
package( 'Server-FixSSH' )
package( 'Server-RLS-Python-Client' )
package( 'Server-Cert-Util' )
package( 'Server-LSC-CA' )

*Note: a generic install that builds the Globus client from source is available for platforms not supported by VDT, e.g., Solaris and PowerPC.
LDG Plans
- Support for CentOS 5 (done) and Debian Etch (next release)
- Further support for Mac OS X on Intel (Leopard)
- Most frequent user request: "Make it work on my platform." Problems:
  - Missing dependencies, so we should work towards improved dependency checking
  - Minor bugs with workarounds (so we should document the fixes in an easy-to-find location)
- Lite version that just installs GSI-enabled SSH and certificate management tools
- Help getting/renewing certificates (see the Authentication Committee's work)
LDG Middleware Packaging
Email from Stuart:
"As suggested by Scott in the Feb 28 CompComm meeting, I would like to initiate the CompComm Middleware Packaging Group to address how LIGO should manage the distribution of grid middleware. The initial group is to consist of Greg and Scott, though they are encouraged to recruit other expertise and opinions as needed. The primary questions to address are:
1) Should the LDG server bundle be replaced with the OSG stack? If so, what about sites, such as the Observatories, that are not running OSG jobs?
2) If we stick with LDG as a rebundle of the VDT:
   a) Do we have the right split between client, clientpro, and server (more or fewer splits)?
   b) How should the Server bundle be enhanced to effectively support WS-GRAM services (aka GT4)? E.g., a VDT enhancement to support distinct LDG_SERVER_LOCATION and VDT_LOCATION, or modify LDG to make these equal?
   c) Should we actively push for the replacement of Pacman? If so, what should we request, e.g., RPM, Solaris packages, DEB, ...?
3) What LIGO applications should be modified to use standard LDG middleware installations? E.g., LDAS is currently investigating the feasibility of using TclGlobus and Globus from LDG Server rather than a parallel install in /ldcg; what about LDR, others?"
END
Channel Lists

Level 1:
http://ldas.ligo-wa.caltech.edu/ldas_outgoing/createrds/dsorun/contrib/createrds/S5_L1/adcdecimate_H-RDS_R_L1-S5.txt
http://ldas.ligo-la.caltech.edu/ldas_outgoing/createrds/dsorun/contrib/createrds/S5_L1/adcdecimate_L-RDS_R_L1-S5.txt
Level 3:
http://ldas.ligo-wa.caltech.edu/ldas_outgoing/createrds/dsorun/contrib/createrds/S5_L3/adcdecimate_H-RDS_R_L3-S5.txt
http://ldas.ligo-la.caltech.edu/ldas_outgoing/createrds/dsorun/contrib/createrds/S5_L3/adcdecimate_L-RDS_R_L3-S5.txt
Level 4:
http://ldas.ligo-wa.caltech.edu/ldas_outgoing/createrds/dsorun/contrib/createrds/S5_H1_L4/adcdecimate_H-H1_RDS_R_L4-S5.txt
http://ldas.ligo-wa.caltech.edu/ldas_outgoing/createrds/dsorun/contrib/createrds/S5_H2_L4/adcdecimate_H-H2_RDS_R_L4-S5.txt
http://ldas.ligo-la.caltech.edu/ldas_outgoing/createrds/dsorun/contrib/createrds/S5_L4/adcdecimate_L-RDS_R_L4-S5.txt

Example Level 1 entries (channel name followed by its decimation factor):
L1:LSC-AS_Q 1
L1:LSC-AS_I 2
L1:LSC-POB_Q 2
...
L1:LSC-DARM_CTRL 1
L1:LSC-DARM_ERR 1
L0:PEM-EY_BAYMIC 1
L0:PEM-EX_BAYMIC 1

Note that one fast channel for 1 yr of S5 data can take up ~2 TB of disk space.

Level 3:
H2:LSC-DARM_ERR 1
H2:IFO-SV_STATE_VECTOR 1
H2:IFO-SV_SEGNUM 1
H1:LSC-DARM_ERR 1
H1:IFO-SV_STATE_VECTOR 1
H1:IFO-SV_SEGNUM 1

Level 4:
H2:LSC-DARM_ERR 4
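Reading these adcdecimate lists is straightforward: each non-empty line holds a channel name and an integer decimation factor, as in the excerpts above. A minimal sketch:

# Minimal sketch: parse an adcdecimate channel list, where each
# non-empty line is "CHANNEL_NAME DECIMATION_FACTOR".
def read_adcdecimate(path):
    """Return {channel_name: decimation_factor} from an adcdecimate file."""
    channels = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) == 2:
                name, factor = fields
                channels[name] = int(factor)
    return channels

# Example (assuming a local copy of the Level 3 list):
# read_adcdecimate("adcdecimate_H-RDS_R_L3-S5.txt")
# -> {'H2:LSC-DARM_ERR': 1, 'H2:IFO-SV_STATE_VECTOR': 1, ...}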