ALICE Multi-site Data Transfer Tests on a Wide Area Network
Giuseppe Lo Re, Roberto Barbera
Work in collaboration with: P. Cerello, D. Di Bari, G. Donvito (CMS), E. Fragiacomo, A. Fritz, M. Luvisetto, M. Masera, F. Minafra, D. Mura, S. Piano, M. Sitta, J. Švec, R. Turrisi
Contributions from GARR and INFN NetGroup: C. Allocchio, M. Campanella, L. Gaido, S. Lusso, M. Michelotto, S. Spanu, S. Zani, D. De Girolamo
Prague, CHEP 2004, 30 Sep 2004
Outline
- Objectives
- Preparation and benchmark
- Testbed layout and test results
- Conclusions
Objectives
- Check whether the current bandwidths can cope with the ALICE needs
- Spot possible bottlenecks in the point-to-point transfers (I/O, LAN, WAN, LAN, I/O)
- Check, with “real” numbers from “real” use cases, whether the bandwidth allocations foreseen for the near future are adequate
[Diagram labels: I/O server, front-end router, front-end router, WAN, LAN. Parameters: disk I/O (W/R block size), TCP windows, # streams, BDP = BW*RTT]
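The relation BDP = BW*RTT on this slide can be made concrete with a short calculation. The following is a minimal sketch; the function name `bdp_bytes` and the sample link (100 Mb/s, 50 ms) are illustrative assumptions, not values from the tests.

```python
def bdp_bytes(bw_mbps, rtt_ms):
    """Bandwidth-delay product in bytes: BDP = BW * RTT."""
    # Bits that can be in flight during one round trip.
    bits_in_flight = bw_mbps * 1_000_000 * rtt_ms / 1000
    return bits_in_flight / 8

# Hypothetical link (not from the slides): 100 Mb/s with a 50 ms round-trip time.
bdp = bdp_bytes(100, 50)
print(f"BDP = {bdp:.0f} bytes ({bdp / 1e6:.3f} MB)")
```

A TCP window smaller than this value cannot keep the link full with a single stream, which is why the window size and the number of streams appear among the parameters above.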
Preparation and Benchmark
- Standard configuration of both the TCP stack and disk I/O parameters in Linux
- SSH keys exchanged among all machines to “secure” file transfers without typing passwords
- Automatic procedure installed on all machines for both Flat and Multi-Tier configurations
Testbed layout and “numbers”
- BA: 3 servers (2 ALICE, 1 CMS)
- BO: 6 servers
- CA: 2 servers
- CNAF: 2 servers
- CT: 2 servers
- PD: 6 servers
- TO: 2 servers
- TS: 1 server
- Prague: 1 server
- Houston: 1 server
[Map labels: CNAF, Padova, Houston, Prague]
Disk access measurements (non reserved access, local disk)

Machine               Write (MBytes/s)  Read (MBytes/s)
boalice8.bo.infn.it   5                 5
server3.ca.infn.it    45                61
aliserv10.ct.infn.it  27                34
alifarm02.to.infn.it  40                59
alifarm.ts.infn.it    28                36

Machine               Write (MBytes/s)  Read (MBytes/s)
boalice8.bo.infn.it   5                 3
server3.ca.infn.it    43                32
aliserv10.ct.infn.it  57                25
pcalice19.pd.infn.it  5                 5
alifarm02.to.infn.it  31                53
alifarm.ts.infn.it    27                34

Benchmarks: Bonnie, IOzone
Bandwidth measurements
[Two tables, labelled Iperf and Netperf-2.1. Columns: Machine, BW1 (Mb/s), BW2 (Mb/s), BW4 (Mb/s), BW8 (Mb/s), BW16 (Mb/s), BW32 (Mb/s). Rows: boalice8.bo.infn.it, server3.ca.infn.it, aliserv10.ct.infn.it, pcalice19.pd.infn.it, alifarm02.to.infn.it, alifarm.ts.infn.it]
GARR network status at the beginning
- Bari: 28 Mb/s (BGA: 16 Mb/s)
- Bologna: 32 Mb/s
- Cagliari: 8 Mb/s
- Catania: 34 Mb/s
- CNAF: 1024 Mb/s
- Padova: 155 Mb/s
- Torino: 155 Mb/s (BGA: 70 Mb/s)
- Trieste: 16 Mb/s
Network bandwidths (after the tests)
- Bari: 28 Mb/s (BGA: 16 Mb/s)
- Bologna: 100 Mb/s (BGA: 32 Mb/s)
- Cagliari: 32 Mb/s
- Catania: 34 Mb/s (direct connection to GARR-G in 6 months, up to 2.5 Gb/s)
- CNAF: 1024 Mb/s
- Padova: 155 Mb/s
- Torino: 155 Mb/s (BGA: 70 Mb/s)
- Trieste: 24 Mb/s
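To see at a glance which sites were upgraded, the access bandwidths from the two status slides (at the beginning and after the tests) can be compared directly. This is only a small illustrative sketch; the dictionary names are ours and the BGA figures are left out.

```python
# Access bandwidths in Mb/s, copied from the two GARR status slides.
before = {"Bari": 28, "Bologna": 32, "Cagliari": 8, "Catania": 34,
          "CNAF": 1024, "Padova": 155, "Torino": 155, "Trieste": 16}
after = {"Bari": 28, "Bologna": 100, "Cagliari": 32, "Catania": 34,
         "CNAF": 1024, "Padova": 155, "Torino": 155, "Trieste": 24}

# Keep only the sites whose bandwidth changed: site -> (old, new).
upgrades = {site: (before[site], after[site])
            for site in before if after[site] != before[site]}

for site, (old, new) in upgrades.items():
    print(f"{site}: {old} -> {new} Mb/s (x{new / old:.2f})")
```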
Flat test
Each server transfers files from/to any other server. In each cycle it:
- waits a random time uniformly chosen between 0 and a customizable maximum (1 min and 5 min tried so far)
- chooses at random one of the other N-1 servers (with a weight proportional to the maximum bandwidth of the site that server belongs to)
- chooses at random one of three files with different sizes (1.6 GB, 0.8 GB, and 0.3 GB)
- sends the file back and forth using bbFTP with a customizable number of parallel streams (16 and 8 tried)
- checks whether any bits got lost and fills a detailed log file
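The random choices in one flat-test cycle can be sketched as follows. This is a minimal illustration, not the procedure actually installed on the servers: the function `plan_transfer`, the server names and the three-site bandwidth map are hypothetical, and the bbFTP transfer and integrity check are deliberately left out.

```python
import random

FILE_SIZES_GB = [1.6, 0.8, 0.3]  # the three file sizes quoted on the slide

def plan_transfer(rng, me, site_of, site_bw_mbps, max_wait_s):
    """Draw one flat-test step: (wait in s, destination server, file size in GB)."""
    wait_s = rng.uniform(0, max_wait_s)               # uniform random wait
    others = [s for s in site_of if s != me]          # the other N-1 servers
    # Weight each server by the maximum bandwidth of the site it belongs to.
    weights = [site_bw_mbps[site_of[s]] for s in others]
    dest = rng.choices(others, weights=weights, k=1)[0]
    size_gb = rng.choice(FILE_SIZES_GB)
    return wait_s, dest, size_gb

# Hypothetical three-server testbed; bandwidths are sample values in Mb/s.
site_of = {"srv-bo": "Bologna", "srv-ca": "Cagliari", "srv-cnaf": "CNAF"}
site_bw_mbps = {"Bologna": 32, "Cagliari": 8, "CNAF": 1024}

rng = random.Random(2004)  # fixed seed so the sketch is repeatable
wait_s, dest, size_gb = plan_transfer(rng, "srv-bo", site_of, site_bw_mbps, 60)
print(f"wait {wait_s:.1f} s, then send the {size_gb} GB file to {dest} and back")
```

With bandwidth-proportional weights, servers at well-connected sites are picked far more often, so most of the traffic is directed at the links that can carry it.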
Selected results (Bologna)
[Plot: official GARR NOC statistics; link saturated]

Selected results (Cagliari)
[Plot: official GARR NOC statistics; link saturated]

Selected results (Catania)
[Plot: official GARR NOC statistics; heavy traffic]
Multi-tier use-case (HBT prod., 5000 evts., 9 TB)
[Diagram labels: CNAF 60%, Tier-1; CT 20%, TO 20%, Tier, TB; 1 MB in, 50 MB out; Tier-3/4: BA, BO, CA, PD, TS]
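As a rough feel for the scale of this use case, the sketch below splits the 9 TB by the quoted percentages and computes an ideal lower bound on the transfer time over each site's access link. It rests on loud assumptions: that the percentages are shares of the total data volume, that 1 TB = 10^12 bytes, and that each link (post-test values for CNAF, Catania and Torino) is fully available with no protocol overhead. The helper `ideal_days` is ours.

```python
TOTAL_TB = 9  # data volume quoted for the HBT production
shares = {"CNAF": 0.60, "CT": 0.20, "TO": 0.20}   # percentages from the slide
link_mbps = {"CNAF": 1024, "CT": 34, "TO": 155}   # access bandwidths after the tests

def ideal_days(volume_tb, bw_mbps):
    """Lower bound on transfer time in days: volume / bandwidth, no overhead."""
    seconds = volume_tb * 1e12 * 8 / (bw_mbps * 1e6)
    return seconds / 86400

for site, share in shares.items():
    volume_tb = TOTAL_TB * share  # assumed share of the total volume
    days = ideal_days(volume_tb, link_mbps[site])
    print(f"{site}: {volume_tb:.1f} TB, at least {days:.1f} days")
```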
Results (Official GARR NOC stats.)
Latest developments
- TCP tuning to improve throughput
- The participation of non-Italian sites (especially those with large RTTs) such as Prague and Houston has been useful to verify the effect of TCP tuning
[Table: RTT (msec) from CNAF, BW (Mb/s) from CNAF, and BDP (MB) for Houston, Prague, and Catania]
bbFTP vs # streams and TCP windows: Catania-CNAF
Max bw measured (iperf) = 25 Mb/s
[Plots: 1 stream, 2 streams, 4 streams; saturated, also for small files]
bbFTP vs # streams and TCP windows: Houston-CNAF
Max bw measured (iperf) = 50 Mb/s
[Plots: 1 stream, 2 streams, 4 streams, 6 streams; saturated]
bbFTP vs # streams and TCP windows: Prague-CNAF
Max bw measured (iperf) = 250 Mb/s
[Plots: 1 stream, 2 streams]
With a very high maximum buffer size (130 KB -> 8 MB)
[Plots: 1 stream, 2 streams]
Roberto Barbera, Dipartimento di Fisica dell’Università di Catania and INFN Catania - Italy, ALICE Collaboration
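Why raising the maximum buffer from 130 KB to 8 MB matters can be seen from the usual window bound: a TCP stream can send at most one window per round trip. The sketch below applies that bound to the Prague-CNAF link; the round-trip time of 20 ms is a hypothetical value (not taken from the slides), 1 KB is taken as 1024 bytes, and `window_limited_mbps` is an illustrative helper, not a measurement.

```python
def window_limited_mbps(window_bytes, rtt_ms, n_streams, link_mbps):
    """Upper bound in Mb/s: each stream sends at most one window per round trip."""
    per_stream_mbps = window_bytes * 8 * 1000 / rtt_ms / 1e6
    # Parallel streams add up until the link itself becomes the limit.
    return min(link_mbps, n_streams * per_stream_mbps)

RTT_MS = 20       # hypothetical round-trip time, not taken from the slides
LINK_MBPS = 250   # maximum iperf bandwidth quoted for Prague-CNAF

for window in (130 * 1024, 8 * 1024 * 1024):  # 130 KB and 8 MB buffers
    for n in (1, 2):
        bound = window_limited_mbps(window, RTT_MS, n, LINK_MBPS)
        print(f"window {window} B, {n} stream(s): <= {bound:.1f} Mb/s")
```

Under these assumptions the small window caps each stream well below the link rate, while the 8 MB window removes the TCP limit entirely, so any remaining shortfall has to come from elsewhere in the chain.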
Prague-CNAF with 1, 2, 4 and 6 streams and a very large maximum buffer size (8 MB » BDP)
[Plots: 1, 2, 4 and 6 streams; bottleneck at I/O level]
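Once the TCP window is no longer the limit, the end-to-end rate is set by the slowest stage of the I/O, LAN, WAN, LAN, I/O chain. The sketch below picks that stage out; all rates are hypothetical sample values chosen for illustration (disk rates in MBytes/s are multiplied by 8 to compare them with network rates in Mb/s), and `bottleneck` is our own helper.

```python
def bottleneck(stages_mbps):
    """Return (name, rate) of the slowest stage in a transfer chain."""
    name = min(stages_mbps, key=stages_mbps.get)
    return name, stages_mbps[name]

# Hypothetical rates in Mb/s; disk figures are MBytes/s * 8.
chain = {
    "source disk read": 34 * 8,
    "source LAN": 1000,
    "WAN": 250,
    "destination LAN": 1000,
    "destination disk write": 27 * 8,
}

stage, rate = bottleneck(chain)
print(f"bottleneck: {stage} at {rate} Mb/s")
```

In this made-up example the destination disk, at 216 Mb/s, is slower than the 250 Mb/s WAN, which is the kind of situation the "bottleneck at I/O level" remark describes.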
Conclusions
- First “real” multi-site/multi-server stress test of the Italian GARR network
- Current bandwidths turned out to be strongly inadequate, especially if we consider all ALICE sites “as a whole” and the number of servers already available
- Useful information on the current farm architecture (limits of NFS in case of many parallel threads and big files)
- Big “perturbation” and interest inside both INFN NetGroup and GARR, with prompt and excellent feedback and support
- Strong and “incredibly” fast bandwidth upgrades at many sites made by the GARR NOC
- Mapping of the testbed onto a multi-tier topology does not seem to pose major problems for Tier-3s