Published by Maude Thomas. Modified over 9 years ago.
1
The DataTransfer status: Experience on VSR2
A. Bozzi, L. Salconi – 27 Oct 2009
2
The new software procedures 1/2
We implemented a simple, robust replica manager architecture: an automatic system that
- scans for new DAQ files and builds metadata for them (based on the FrDump output);
- keeps track of those files and orders them in multiple queues (one for each kind of file);
- prepares the data transfer sessions (built from static configuration parameters) and starts them, one session for each data flow;
- checks the output status of each session and acts on it (different actions are performed after a successful or a failed data transfer);
- schedules a retry for a failed transfer session;
- keeps track of all scheduled operations (successful or failed);
- builds a metadata structure for each file (a raw ffl entry).
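The queueing and retry steps above can be sketched as follows. This is only an illustration, not the actual procedure: the class name `TransferQueues`, the `max_retries` limit and the `send` callback are assumptions introduced here, and a real session would call the transfer tool instead of a Python function.

```python
from collections import defaultdict, deque


class TransferQueues:
    """One FIFO queue per kind of file (hypothetical names, e.g. 'raw')."""

    def __init__(self, max_retries=3):
        self.queues = defaultdict(deque)  # kind -> deque of (filename, attempts)
        self.max_retries = max_retries    # assumed limit, not from the slides
        self.history = []                 # every operation, successful or failed

    def add(self, kind, filename):
        """Queue a newly discovered file with zero attempts so far."""
        self.queues[kind].append((filename, 0))

    def run_session(self, kind, send):
        """Try each file queued for `kind` once; `send(filename)` returns True on success."""
        pending = self.queues[kind]
        for _ in range(len(pending)):
            filename, attempts = pending.popleft()
            ok = send(filename)
            self.history.append((kind, filename, ok))
            if not ok and attempts + 1 < self.max_retries:
                # Failed transfer: schedule a retry for the next session.
                pending.append((filename, attempts + 1))
```

Failed files go back to the end of their own queue, so a problem in one data flow does not block the others.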
3
The new software procedures 2/2
... and it also:
- has the same architecture and a similar topology for each data flow: only the sendFile class changes, so we have a few primitives that are wrappers around the bbftp and SRB commands (and, why not, around gridFTP in the future);
- has a network “star configuration” (from Cascina to the CCs, with 8 independent flows);
- collects information on closed sessions solely by parsing the local log and the output of the performed operations, in order to find the status of the transferred files;
- builds a remote ffl locally, combining the FrDump output for the local file with the static information on the remote destination directory;
- organizes the data path in the same way in all repositories, so that the same script can search for missing files or errors.
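The idea that only the sendFile class changes between data flows, and that the remote ffl is built locally, might look like the sketch below. Everything in it is an assumption made for illustration: the `SendFile` and `DryRunSender` classes, the directory names, and the three-field entry layout are hypothetical, and no real bbftp or SRB command line is reproduced.

```python
import posixpath
from abc import ABC, abstractmethod


class SendFile(ABC):
    """Common part of every data flow; subclasses wrap one transfer tool."""

    def __init__(self, flow, remote_dir):
        self.flow = flow              # flow label, e.g. "raw2bo"
        self.remote_dir = remote_dir  # static configuration parameter

    @abstractmethod
    def transfer(self, local_path, remote_path):
        """Run the tool-specific transfer; return True on success."""

    def send(self, local_path):
        """Derive the remote path from static config, then transfer."""
        name = posixpath.basename(local_path)
        remote_path = posixpath.join(self.remote_dir, name)
        return self.transfer(local_path, remote_path), remote_path


class DryRunSender(SendFile):
    """Stand-in for a bbftp or SRB wrapper: records calls, sends nothing."""

    def __init__(self, flow, remote_dir):
        super().__init__(flow, remote_dir)
        self.sent = []

    def transfer(self, local_path, remote_path):
        self.sent.append((local_path, remote_path))
        return True


def remote_ffl_entry(remote_path, gps_start, duration):
    """Hypothetical entry layout: '<remote path> <gps start> <duration>'."""
    return f"{remote_path} {gps_start} {duration}"
```

A new transport would then only need a new subclass with its own `transfer` method; queues, logging and ffl generation stay the same.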
4
The Cascina – Bologna – Lyon star architecture
[Diagram labels: Lyon, Bologna, datagw.virgo.infn.it, Procdata vols, Rawdata circular buffer, SRB, bbftp]
5
The LIGO data interface (using LDR)
[Diagram labels: LIGO, Lyon, Bologna, dataldr.virgo.infn.it, datagw.virgo.infn.it, LIGO vols (RW), Procdata vols (RO), SRB, bbftp]
6
The achieved performance
2009-07-20 16:45:27,108 INFO DtDBase: adding V-raw-932135940-180.gwf to rawdata queque
2009-07-20 16:45:28,563 INFO SRBEngine: [raw2ly] sending file V-raw-932135940-180.gwf
2009-07-20 16:45:31,314 INFO BBEngine: [raw2bo] sending file V-raw-932135940-180.gwf
2009-07-20 16:46:32,715 INFO BBEngine: [raw2bo] file V-raw-932135940-180.gwf successfully sent
2009-07-20 16:46:36,227 INFO BBEngine: [raw2bo] sent updated ffl./ffl/raw2bo.ffl
2009-07-20 16:46:57,363 INFO SRBEngine: [raw2ly] file V-raw-932135940-180.gwf successfully sent
2009-07-20 16:47:00,978 INFO SRBEngine: [raw2ly] sent updated ffl./ffl/raw2ly.ffl
fflGen.pl [Mon Jul 20 16:45:27 2009] -> file to insert V-raw-932135940-180.gwf on st4rear::v081
fflGen.pl [Mon Jul 20 16:45:27 2009] -> sending infos about V-raw-932135940-180.gwf to dataSend
fflGen.pl [Mon Jul 20 16:45:27 2009] -> sending infos about V-raw-932135940-180.gwf to dataBackup
fflGen.pl [Mon Jul 20 16:45:27 2009] -> generate a new ffl file...
fflGen.pl [Mon Jul 20 16:45:34 2009] ->...public ffl file updated with 87962 records
An example with a VSR2 rawdata file (V-raw-932135940-180.gwf):
→ available in Cascina to users (circular buffer) at 16:45:27
→ published in the local ffl in Cascina at 16:45:34
→ available in Bologna (published with ffl) at 16:46:36 (1'09")
→ available in Lyon (published with ffl) at 16:47:00 (1'33")
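The delays quoted in this example can be recomputed from the log itself. The sketch below assumes the log format seen on this slide and handles a single file only; the function name `publication_delays` and the regular expressions are introduced here for illustration.

```python
import re
from datetime import datetime

# Three lines copied from the log shown on the slide.
LOG = """\
2009-07-20 16:45:27,108 INFO DtDBase: adding V-raw-932135940-180.gwf to rawdata queque
2009-07-20 16:46:36,227 INFO BBEngine: [raw2bo] sent updated ffl./ffl/raw2bo.ffl
2009-07-20 16:47:00,978 INFO SRBEngine: [raw2ly] sent updated ffl./ffl/raw2ly.ffl
"""

STAMP = "%Y-%m-%d %H:%M:%S,%f"
LINE = re.compile(r"^(\S+ \S+) INFO (\w+): (.*)$")
FLOW = re.compile(r"\[(\w+)\] sent updated ffl")


def publication_delays(log_text):
    """Seconds from queueing in Cascina to remote ffl publication, per flow."""
    queued_at = None
    delays = {}
    for line in log_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip lines in another format (e.g. fflGen.pl output)
        when = datetime.strptime(m.group(1), STAMP)
        message = m.group(3)
        if message.startswith("adding "):
            queued_at = when
            continue
        flow = FLOW.match(message)
        if flow and queued_at is not None:
            delays[flow.group(1)] = (when - queued_at).total_seconds()
    return delays
```

On the three lines quoted in `LOG`, this gives about 69 s for raw2bo and about 93 s for raw2ly, matching the 1'09" and 1'33" in the example.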
7
The amount of data sent to CCs
Here is the amount of data sent to the remote CCs so far (27 Oct '09 at 10:00 am):
- from the logs we see that we are in a “just in time” situation for about 93% of the data transfer activity (meaning a delay of about 2 minutes between the publication of a file in Cascina and the availability of its replica at the remote CCs);
- at this moment, only 3 files (2 raw and 1 proc, out of a total of about 53000 files) are missing from the sent list, due to exceptions not handled by the procedure. These problems were fixed manually.
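A check for missing files of the kind mentioned above could be as simple as comparing the names that should have been sent with the names published in a remote ffl. This is a minimal sketch under assumptions: the function `missing_files` is hypothetical, and it only assumes that the first whitespace-separated field of each ffl line is the file path.

```python
import posixpath


def missing_files(expected_names, ffl_lines):
    """Return expected file names that do not appear in the given ffl lines."""
    published = {
        posixpath.basename(line.split()[0])
        for line in ffl_lines
        if line.strip()  # ignore blank lines
    }
    return sorted(name for name in set(expected_names) if name not in published)
```

Because the data path is organized in the same way in every repository, the same comparison can be run against each remote ffl without changes.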
8
Conclusions
We achieved a good level of performance for all 8 active independent data flows (rawdata, hreconline, ligo H1 and ligo L1, each from Cascina to Bologna and Lyon).
- No particular problems were detected in Bologna: only two files are missing from the list.
- Some problems were detected in Lyon: one missing file, and all the others related to the SRB interface:
  - the Sput command sometimes locks up (a manual procedure is needed to unlock it);
  - good FFL files were transferred to Lyon but had zero length at the destination;
  - sometimes we lose the synchronization between the SRB/xrootd layer and the HPSS layer (e.g. with the Smv command).
For these problems we received good support from the Lyon SRB service team.