Site report: Tokyo
Tomoaki Nakamura, ICEPP, The University of Tokyo
2014/12/10

1 Site report: Tokyo
Tomoaki Nakamura, ICEPP, The University of Tokyo

2 Update from the last year

No HW upgrade from the last year for Grid resources
- 2560 CPU cores (18.03 HS06/core)
- RAM: 2GB/core for 1280 cores, 4GB/core for 1280 cores
- No memory upgrade until the end of 2015 (an upgrade was considered last year)
- 2000TB of pledged disk (2014) and ~600TB for LocalGroupDisk

All service instances have been migrated to EMI3
- CREAM, DPM, BDII (site/top), Argus, gLExec-WN, APEL
- WMS, LB, MyProxy: can be decommissioned for ATLAS

Other service instances
- perfSONAR (latency 1G, bandwidth 1G, bandwidth 10G)
- Squid (CondDB x 2 + CVMFS x 2)

Services for ATLAS have been deployed
- DPM-WebDAV: used for Rucio renaming, will be used for central deletion
- DPM-XrootD and FAX setup: connected to the Asia redirector
- Multi-core queue: 512 cores, 20% of resources, 64 static 8-core slots

3 FAX remote access
4 TB/day = ~46 MB/s
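The conversion on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes decimal units (1 TB = 1,000,000 MB); the function name is illustrative and not from the slides.

```python
def tb_per_day_to_mb_per_s(tb_per_day: float) -> float:
    """Convert a daily volume in TB to an average rate in MB/s (decimal units)."""
    seconds_per_day = 24 * 60 * 60  # 86400 s
    return tb_per_day * 1_000_000 / seconds_per_day


rate = tb_per_day_to_mb_per_s(4.0)
print(f"{rate:.1f} MB/s")  # prints "46.3 MB/s"
```

With binary units the figure would come out slightly different, so the ~46 MB/s on the slide is consistent with decimal TB and MB.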

4 ASAP (all data)
ASAP (ATLAS Site Availability Performance): 99.77%

5 Pledge for the next year and beyond

For FY2015
- Increase the pledge by 400TB
- 528TB (8 servers) will be added to DPM by the end of March 2015
- Total DPM capacity: 3168TB (~750TB for LocalGroupDisk)

End of 2015
- End of the current system
- Procurement work will start next spring
- If 6TB HDDs are available, the total storage capacity can be doubled in the 4th system
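The capacity figures on this slide can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted above; the derived values (capacity per server, capacity before the addition, relative growth) are computed here and are not stated on the slide.

```python
added_tb = 528          # capacity to be added to DPM (from the slide)
added_servers = 8       # number of new servers (from the slide)
total_after_tb = 3168   # total DPM capacity after the addition (from the slide)

tb_per_server = added_tb / added_servers        # derived: capacity per new server
capacity_before_tb = total_after_tb - added_tb  # derived: capacity before the addition
growth = added_tb / capacity_before_tb          # derived: relative increase

print(f"{tb_per_server:.0f} TB per server")
print(f"{capacity_before_tb} TB before the addition")
print(f"{growth:.0%} increase in DPM capacity")
```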

6 International network for Tokyo
[Figure: network map. Nodes: TOKYO, OSAKA, LA, WIX, Amsterdam, Geneva, Frankfurt. Sites: ASGC, BNL, TRIUMF, NDGF, RAL, CC-IN2P3, CERN, CNAF, PIC, SARA, NIKHEF. Link labels: Pacific, Atlantic, 10 Gbps, 10x3 Gbps, 40 Gbps, dedicated line. New line (10 Gbps) since May 2013.]

7 Configuration for the LHCONE evaluation
[Figure: network configuration. Switches: MLXe32 (10G), Dell 8024 (10G) x 2, Dell 5448 (1G), Catalyst 6500 (10G), Catalyst 3750 (10G). Hosts: UI (GridFTP), perfSONAR (latency), perfSONAR (bandwidth), perfSONAR (latency/bandwidth). Networks: ICEPP (production) 157.82.112.0/21, ICEPP (LHCONE evaluation) 157.82.118.0/24, UTnet, SINET (IPv4/v6), LHCONE BGP peering, with NY, DC and LA. Link speeds: 10 Gbps and 1 Gbps.]
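The two address ranges in this configuration can be compared with Python's standard `ipaddress` module. This is only a sketch of the address arithmetic: the `classify` helper and the sample host addresses are hypothetical, and nothing here describes how the routers actually select a path.

```python
import ipaddress

# Prefixes quoted on the slide.
production = ipaddress.ip_network("157.82.112.0/21")
evaluation = ipaddress.ip_network("157.82.118.0/24")

# The evaluation /24 lies inside the production /21 address range.
print(evaluation.subnet_of(production))
print(production.num_addresses, evaluation.num_addresses)


def classify(addr: str) -> str:
    """Return which of the two ranges an address falls in (most specific first)."""
    ip = ipaddress.ip_address(addr)
    if ip in evaluation:
        return "LHCONE evaluation"
    if ip in production:
        return "production"
    return "outside ICEPP ranges"


# Hypothetical host addresses, for illustration only.
for host in ("157.82.118.10", "157.82.113.10", "192.0.2.1"):
    print(host, "->", classify(host))
```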

8 Stability on packet loss (CC-IN2P3)
Packet loss directly affects the transfer rate.

9 Fraction of packet loss (NY vs. DC)
Comparable to each other.

10 Minimum latency (CC-IN2P3)
Useful to know the typical latency and stability.

11 Minimum latency (CC-IN2P3)
Originating from another group at the Univ. of Tokyo.

12 Distribution of minimum latency (CC-IN2P3)

13 Distribution of minimum latency (CC-IN2P3)
Originating from another group. Mismeasurement.

14 Maximum latency (CC-IN2P3)
Useful to find problems.

15 Maximum latency (CC-IN2P3)
Also has spikes. Additional periodic noise.

16 Distribution of maximum latency (CC-IN2P3)

17 Distribution of maximum latency (CC-IN2P3)
Discrepancy due to the periodic noise.

18 Also for the other sites
Panels: (US), (FR)
One of the perfSONAR instances in Tokyo seems to fall into a busy state once a day. This is independent of the source site, but there are no significant errors in the system and service logs.

19 Maximum latency (masked by time)
Periodic noise can be cleaned up.
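One way to apply such a time mask is to drop every sample whose time of day falls inside a fixed window before taking the maximum. The sketch below assumes the noise recurs in the same daily window; the window (03:00 to 04:00), the sample values, and the function names are all hypothetical and are not taken from the slides.

```python
from datetime import datetime, time


def in_mask(ts: datetime, start: time, end: time) -> bool:
    """True if the time of day of ts falls inside the half-open window [start, end)."""
    return start <= ts.time() < end


def masked_max(samples, start: time, end: time):
    """Maximum latency over samples outside the daily mask window, or None if empty."""
    kept = [latency for ts, latency in samples if not in_mask(ts, start, end)]
    return max(kept) if kept else None


# Hypothetical (timestamp, latency in ms) samples with a daily busy period.
samples = [
    (datetime(2014, 11, 1, 2, 30), 190.0),
    (datetime(2014, 11, 1, 3, 15), 450.0),   # inside the mask window
    (datetime(2014, 11, 1, 12, 0), 205.0),
    (datetime(2014, 11, 2, 3, 40), 480.0),   # inside the mask window
    (datetime(2014, 11, 2, 18, 0), 198.0),
]

raw_max = max(latency for _, latency in samples)
clean_max = masked_max(samples, time(3, 0), time(4, 0))
print(raw_max, clean_max)
```

A fixed window is the simplest choice; if the busy period drifts from day to day, the window would need to be widened or detected from the data.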

20 Maximum latency by mask (CC-IN2P3)
Still remaining, but comparable.

21 Bandwidth measurement (CC-IN2P3 and CNAF)
- Asymmetric: ~38 MB/s (incoming), ~28 MB/s (outgoing)
- Symmetric, but unstable: ~34 MB/s (incoming), ~35 MB/s (outgoing)

22 Minimum latency (CC-IN2P3 in 2014)

23 Minimum latency (CC-IN2P3 in 2014)
Spikes are gone. The average value is split.

24 Latency in one day (CC-IN2P3)
Incoming and outgoing, both on the production line via NY.
Load balancing somewhere in NY or GEANT?

25 Maximum latency (CC-IN2P3, 2014)
Some improvement in FR-Geneva?

26 Bandwidth measurement (latest data)
- Still asymmetric: ~35 MB/s (incoming), ~24 MB/s (outgoing)
- Symmetric, and very stable: ~32 MB/s (incoming), ~30 MB/s (outgoing)

27 Configuration for the LHCONE evaluation
[Figure: network configuration, same components as on slide 7. Networks: ICEPP (production) 157.82.112.0/21, ICEPP (LHCONE evaluation) 157.82.118.0/24, UTnet, SINET (IPv4/v6), LHCONE BGP peering, with NY, DC and LA. Link speeds: 10 Gbps and 1 Gbps.]

28 LHCONE (EU sites) for all production servers
[Figure: network configuration, same components as on slide 7. Networks: ICEPP (production) 157.82.112.0/21, ICEPP (LHCONE evaluation) 157.82.118.0/24, UTnet, SINET (IPv4/v6), LHCONE BGP peering, with NY, DC and LA. Link speeds: 10 Gbps and 1 Gbps.]

29 Nov. 11, 2014 (latency for CC-IN2P3)

30 Nov. 11, 2014 (latency for CNAF)

31 Nov. 11 (throughput for CC-IN2P3)

32 Nov. 11 (throughput for CNAF)

33 Dec. 7, 2014 (incoming bandwidth is saturated)
- User subscription of AOD via DaTRI
- physics_Egamma, 8 TeV, all periods: ~150TB
- Still ongoing today (running continuously for several days)

34 Breakdown from GridFTP log
- Part of the LHCONE contribution
- Mainly FTS3 and direct transfers from multiple sites
- Panels: 10 min. bin, 1 min. bin
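The 10-minute and 1-minute views can be produced by summing transferred bytes into fixed-width time bins. The sketch below is not a GridFTP log parser: it assumes the log has already been reduced to hypothetical (timestamp, bytes) records, and it attributes all bytes of a transfer to a single timestamp, which is a simplification.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bin_throughput(records, bin_seconds: int) -> dict:
    """Sum bytes per fixed-width time bin and return {bin start (epoch s): MB/s}."""
    totals = defaultdict(int)
    for ts, nbytes in records:
        epoch = int(ts.timestamp())
        totals[epoch - epoch % bin_seconds] += nbytes
    # Average rate over the bin, in decimal MB/s.
    return {start: total / bin_seconds / 1e6 for start, total in sorted(totals.items())}


# Hypothetical records: (timestamp in UTC, bytes transferred).
records = [
    (datetime(2014, 12, 7, 0, 0, 30, tzinfo=timezone.utc), 600_000_000),
    (datetime(2014, 12, 7, 0, 5, 0, tzinfo=timezone.utc), 600_000_000),
    (datetime(2014, 12, 7, 0, 12, 0, tzinfo=timezone.utc), 1_200_000_000),
]

ten_min = bin_throughput(records, 600)
one_min = bin_throughput(records, 60)
print(list(ten_min.values()))
print(list(one_min.values()))
```

The same data looks smoother in the wider bins and burstier in the narrow ones, which is why showing both bin widths side by side is useful.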

35 Near future and concerns

LHCONE
- Next for US and Canada
- And then for Asia (ASGC, IHEP)

Network bandwidth
- 2015: additional 10G from ICEPP to SINET? UTokyo is offering, but it depends on them.
- JFY2016: SINET will be upgraded (SINET5)
  - 100G for US (LA)
  - 20G for EU (reverse around)

EMI3
- End of full support: April 30, 2014
- End of standard updates: October 31, 2014
- End of security updates: April 30, 2015

Batch job system
- Torque/Maui: no more support, no effective dynamic multi-core allocation
- HTCondor, SLURM, or another commercial product (Univa GE, LSF)

