Ideas and test setup for data transfer from CERN to Italian Ground Segment. M. Boschini, A. Favalli, M. Levtchenko CERN – March, 31, 2003.


1 Ideas and test setup for data transfer from CERN to Italian Ground Segment. M. Boschini, A. Favalli, M. Levtchenko CERN – March, 31, 2003

2 Outline Goals Ideas Setup Tests Conclusions M. Boschini – Ideas and Test Setup for Data Transfer from CERN to Italian Ground Segment CERN – March, 31, 2003

3 Goals of the system: Within the overall AMS-02 data handling, an Italian Ground Segment (IGS) will be set up to maintain a MASTER COPY of all AMS-02 data (raw and reconstructed). The system will have to efficiently transfer data from CERN to the IGS and book-keep the data transfer.

4 Ideas... Decouple the data transfer protocol from the book-keeping system. Use standard techniques for both. Test the solutions separately. Test the overall solution. ...results...

5 Test system... RH box at CERN on a 10 Mb/s line. RH box at MI-INFN (further tests on the local MI-INFN 100 Mb/s line). AMS-01 raw 1-min files and AMS-01 n-tuple.gz files.

6 Ideas...data transfer protocol: Test OpenSCP and bbftp. OpenSCP is too heavy (test and results presented in 2002). bbftp was already tested in Dec. 2001 with the AMS-SW group (Eline, Klimentov). ...thus, test bbftp in terms of network usage efficiency and reliability.

7 Ideas...bbftp parameters: bbftp has 2 main parameters: the number of parallel streams and the TCP window size. The goal is to find the best ratio WIN_SIZE/NUM_STREAMS, bearing in mind that keeping NUM_STREAMS low is preferable (RFC 1323).

8 Ideas...TCP parameters: Theory: for an RFC 1323 compliant network we can define CAPACITY = (BANDWIDTH x RTT) and OPT_WIN_SIZE = CAPACITY (the window size that maximizes throughput). Since NUM_STREAMS ~ (CAPACITY)/(WIN_SIZE), setting WIN_SIZE = OPT_WIN_SIZE gives NUM_STREAMS = 1 and maximum throughput. BUT in reality: LFNs (long fat networks) have a high (BANDWIDTH x RTT), while “standard” TCP implementations limit the window size to 64K.
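As a rough illustration of the relations on this slide, the sketch below computes CAPACITY = BANDWIDTH x RTT and the resulting number of streams for a 64K window. The two bandwidths are the ones quoted in the test slides; the 100 ms round-trip time and the function names are assumptions made only for this example, not measured values.

```python
import math

# Largest window a TCP header can advertise without RFC 1323 window scaling.
MAX_UNSCALED_WINDOW = 65535  # bytes


def capacity_bytes(bandwidth_bps, rtt_s):
    """CAPACITY = BANDWIDTH x RTT, converted from bits to bytes."""
    return bandwidth_bps * rtt_s / 8


def num_streams(capacity, win_size):
    """NUM_STREAMS ~ CAPACITY / WIN_SIZE, rounded up, never below 1."""
    return max(1, math.ceil(capacity / win_size))


RTT_S = 0.1  # hypothetical 100 ms round-trip time (assumption)

for label, bandwidth_bps in (("10 Mb/s", 10e6), ("34 Mb/s", 34e6)):
    cap = capacity_bytes(bandwidth_bps, RTT_S)
    print(label, "capacity (bytes):", round(cap),
          "streams at 64K window:", num_streams(cap, MAX_UNSCALED_WINDOW))
```

With WIN_SIZE equal to the capacity, `num_streams` returns 1, which is the ideal case described in the theory; with the 64K cap, more parallel streams are needed as BANDWIDTH x RTT grows.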

9 Tests...bbftp parameters 10 Mb/s

10 Tests...iperf parameters 34 Mb/s (FNAL-MI)

11 Tests...bbftp conclusions: The AMS-02 scenario will involve an LFN, thus 1) re-study TCP optimization in a more realistic scenario; 2) in general, we can assume that we will need a large WIN_SIZE and a high NUM_STREAMS.

12 Ideas...book-keeping system: We need to keep track of which file has to be transferred, how it has been transferred, and where it has been stored.

13 Ideas...book-keeping system: We decided to adopt a database: one at CERN and one in Milano. The CERN DB will contain file info and DT (data transfer) info for all files (even those unsuccessfully transferred). The Milano DB will contain file info and DT info for all files that arrived at MI. Automatic consistency checking has to be set up. The DBs have to be web-browsable.

14 Tests...book-keeping system: We decided to use MySQL. Stress-tested MySQL: 2500 concurrent writes/sec were measured. Rollback/commit available. Uptime: 18 months.

15 Ideas...The complete system: The complete system has to: find new files at CERN in a “spool” directory; insert them in the CERN-DB; move the files to Milano; update the CERN-DB with the transfer status; insert them in the MI-DB.
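The five steps listed on this slide can be sketched as one pass over the spool directory. This is only an illustration under stated assumptions: plain dictionaries stand in for the CERN-DB and MI-DB, the status strings ("NEW", "OK", "FAIL", "ARRIVED") and the `process_spool` and `transfer` names are hypothetical, and the real system used Perl5 with bbftp rather than Python.

```python
import os
import tempfile


def process_spool(spool_dir, cern_db, mi_db, transfer):
    """Run one pass over the spool directory (hypothetical sketch)."""
    for entry in sorted(os.scandir(spool_dir), key=lambda e: e.name):
        # 1. Find new files: skip non-files and files already transferred OK.
        if not entry.is_file() or cern_db.get(entry.name) == "OK":
            continue
        # 2. Insert in CERN-DB (keeps an earlier status if one exists).
        cern_db.setdefault(entry.name, "NEW")
        # 3. Move the file to Milano; transfer() returns True on success.
        ok = transfer(entry.path)
        # 4. Update CERN-DB with the transfer status.
        cern_db[entry.name] = "OK" if ok else "FAIL"
        # 5. Insert in MI-DB only for files that actually arrived.
        if ok:
            mi_db[entry.name] = "ARRIVED"


# Demo with a temporary spool directory and a fake transfer function.
with tempfile.TemporaryDirectory() as spool:
    for name in ("run001.raw", "run002.raw"):
        with open(os.path.join(spool, name), "wb") as fh:
            fh.write(b"fake data")
    cern_db, mi_db = {}, {}
    # Pretend the link drops while sending run002.raw.
    process_spool(spool, cern_db, mi_db,
                  lambda path: not path.endswith("run002.raw"))

print(cern_db)
print(mi_db)
```

Because failed files keep a "FAIL" entry in the CERN-side record, a later pass picks them up again, which matches the requirement that the CERN DB also tracks unsuccessful transfers.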

16 Ideas...The complete system: Hypothesis: “production” rate (4 + 8) Mb/s; “permitted downtime” 3 days; spool dir big enough (~400 GB for 3 days); peak bandwidth = 36 Mb/s.
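The spool size quoted here follows from the production rate and the permitted downtime. The sketch below redoes that arithmetic, assuming decimal units (1 Mb = 10^6 bits, 1 GB = 10^9 bytes). The catch-up estimate is an added illustration that assumes the full 36 Mb/s is available while production continues; the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def spool_size_gb(rate_mbps, downtime_days):
    """Data accumulated in the spool dir during a downtime, in GB."""
    bytes_per_second = rate_mbps * 1e6 / 8
    return bytes_per_second * downtime_days * SECONDS_PER_DAY / 1e9


def catch_up_days(rate_mbps, link_mbps, downtime_days):
    """Days needed to drain the backlog while production keeps running."""
    if link_mbps <= rate_mbps:
        raise ValueError("link must be faster than the production rate")
    return rate_mbps * downtime_days / (link_mbps - rate_mbps)


production_mbps = 4 + 8  # production rate from the slide
print("spool size (GB):", spool_size_gb(production_mbps, 3))
print("catch-up time (days):", catch_up_days(production_mbps, 36, 3))
```

Under these assumptions, 12 Mb/s over 3 days gives about 389 GB, consistent with the ~400 GB on the slide, and a 36 Mb/s link would clear the backlog in about 1.5 days.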

17 Test...The complete system: Test “reality”: “permitted downtime” 0.3 days; spool dir big enough (~40 GB for 0.3 days); peak bandwidth = 10 Mb/s. But we already showed that bbftp + adaptive win-size can maximize bandwidth usage... so our test is ~ hypothesis x scale-factor.

18 Test...The complete system: The system is made of, at CERN: “fake_production” (8 Mb/s); Main --> looks for new files in the spool dir; bbftp_forker --> forked by Main when files are found. It sends files to MI, inserts/updates in the CERN-DB, and “tells” MI to update its DB (SSL client/server). Perl5 (SSL+DBI) + C patch to bbftp.

19 Test...The complete system: The system is made of, at MILANO: a TCP/IP (SSL) server waiting for “requests for update” from CERN, which updates the DB. Perl5 scripts (SSL + DBI).

20 Test...The complete system: SSL has been adopted for connections to MI in order to encrypt the connection to the DB (MySQL uses standard sockets). We implemented the bbftp private_auth feature.

21 DB Consistency: DB Consistency is a mechanism which ensures that the DB contents at CERN and MI are consistent with the data that has actually been transferred. DB Consistency is performed in 2 ways: 1) A check between the DB entries at CERN corresponding to files which seem to have been transferred OK and the DB entries at the IGS is performed every hour. This check is based on SQL selections and comparisons. If a difference is found, an alert is sent and the files are re-transferred. 2) A redundancy approach, in which a third host, also located in Milano, keeps a “copy” of both the CERN and IGS DBs. For now the copy is performed as a DB dump/restore; on-line update is under development.
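The hourly check in point 1) can be illustrated with two SQL selections and a comparison. This is a minimal sketch, not the actual implementation: in-memory SQLite databases stand in for the two MySQL servers, and the `transfers` table, its columns, and the function names are hypothetical.

```python
import sqlite3

# Hypothetical schema; the real table layout is not given in the slides.
SCHEMA = "CREATE TABLE transfers (filename TEXT PRIMARY KEY, md5 TEXT, status TEXT)"


def make_db(rows):
    """Create an in-memory stand-in for one of the two databases."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    db.executemany("INSERT INTO transfers VALUES (?, ?, ?)", rows)
    db.commit()
    return db


def files_to_resend(cern_db, mi_db):
    """Files marked OK at CERN that are missing or different at MI."""
    sent_ok = dict(cern_db.execute(
        "SELECT filename, md5 FROM transfers WHERE status = 'OK'"))
    arrived = dict(mi_db.execute("SELECT filename, md5 FROM transfers"))
    return sorted(name for name, md5 in sent_ok.items()
                  if arrived.get(name) != md5)


cern = make_db([("a.raw", "111", "OK"), ("b.raw", "222", "OK"),
                ("c.raw", "333", "FAIL"), ("d.raw", "444", "OK")])
mi = make_db([("a.raw", "111", "OK"), ("b.raw", "999", "OK")])

mismatches = files_to_resend(cern, mi)
print(mismatches)
```

In this toy data set, `b.raw` differs between the two sides and `d.raw` never arrived, so both would trigger an alert and a re-transfer; `c.raw` is already marked as failed at CERN and is not part of this check.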

22 “data integrity”: DATA INTEGRITY is evaluated by means of an MD5 digest. The md5sum is calculated at CERN and sent to MI with the “request for update”; the md5sum is then calculated at MI. If md5sum(MI) != md5sum(CERN), INTEGRITY = ‘FAIL’ is set in the DB and the DB consistency mechanism re-sends the file.
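The digest comparison described on this slide can be sketched with Python's standard `hashlib` module. The function names, the "OK" status string, and the demo file names are assumptions for illustration; only the ‘FAIL’ value comes from the slide.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def integrity_status(md5_cern, md5_mi):
    """Compare the digest sent by CERN with the one computed at MI."""
    return "OK" if md5_mi == md5_cern else "FAIL"


# Demo: one intact copy and one corrupted copy of the same payload.
with tempfile.TemporaryDirectory() as tmp:
    sent = os.path.join(tmp, "sent.raw")
    received = os.path.join(tmp, "received.raw")
    corrupted = os.path.join(tmp, "corrupted.raw")
    for path, payload in ((sent, b"AMS"), (received, b"AMS"), (corrupted, b"AMx")):
        with open(path, "wb") as fh:
            fh.write(payload)
    good = integrity_status(md5_of_file(sent), md5_of_file(received))
    bad = integrity_status(md5_of_file(sent), md5_of_file(corrupted))

print(good, bad)
```

Reading in chunks keeps memory use flat regardless of file size, which matters when the same routine runs over every transferred file.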

23 “stress simulation”: After 6 months of “normal” running, we started a stress simulation, artificially and randomly stopping services. We stopped: Network (total downtime in 3 months: 3 weeks, scattered over 90 different downtimes ranging from 5 min to 8 hr); MySQL (total downtime in 3 months: 1 day, scattered over 48 different downtimes ranging from 5 min to 1 hr); bbftpd (total downtime in 3 months: 1 day, scattered over 48 different downtimes ranging from 5 min to 1 hr); components of DT (total downtime in 3 months: 3 weeks, scattered over 90 different downtimes ranging from 5 min to 8 hr).

24 Total running … 9 months of uptime; 380,000 files transferred; 7.6 TB transferred; 5% of files needed to be retransmitted (network); 0.03% of files needed to be retransmitted (DB consistency checks).
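For a sense of scale, the totals on this slide can be turned into an average file size and absolute retransmission counts. The sketch assumes decimal units (1 TB = 10^12 bytes) and treats the percentages as exact, which the slide does not state.

```python
files_transferred = 380_000
bytes_transferred = 7.6e12  # 7.6 TB, assuming decimal terabytes

# Average size of one transferred file, in MB.
avg_file_mb = bytes_transferred / files_transferred / 1e6

# Files retransmitted because of network problems (5%).
resent_network = round(files_transferred * 5 / 100)

# Files recovered by the DB consistency checks (0.03%).
resent_consistency = round(files_transferred * 0.03 / 100)

print(avg_file_mb, resent_network, resent_consistency)
```

Under these assumptions the average file is about 20 MB, roughly 19,000 files were resent because of the network, and about 114 were recovered by the consistency checks.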

25 GUIs … The system also has “graphical” tools: a DB browser at CERN and MI (Perl5, PHP); a DT system handling GUI at CERN and MI (GTK, Perl5).

26 To do: Implement tests on a 100 Mb/s CERN LAN in order to study and optimize the TCP window size and NUM_STREAMS in a scenario more similar to the AMS-02 one. Implement on-line redundancy with a third host. Integrate with the High Speed Network hosts at CERN (wacdr001d.cern.ch), as suggested and discussed with IT Communication Services.

27 Conclusions… More than 380,000 files have been transferred correctly; ~5% needed to be retransmitted because of network outages. TCP parameters have been studied and tweaked in order to optimize the bandwidth usage. The whole data set is organized in a DB, which acts as a data transfer book-keeping system. The DB proved to be robust and fast enough to suit our needs. DB Consistency mechanisms recovered a 0.03% data loss. The system can be handled through a GUI and monitored via web. Other work to do… (higher bandwidth link, on-line consistency).

