October 28, 2013 at 14th CERN-Korea Committee, Geneva

1 Status of Tier-1 @ GSDC, KISTI
Status of GSDC, KISTI
Gungwon Kang, Hangjin Jang & Sang-Un Ahn, for the GSDC Tier-1 Team

2 KISTI GSDC Tier-1 Team (~ 9 people)

| Role | Name |
|---|---|
| Representative | Haeng-Jin Jang |
| System Management | Hee-Jun Yoon |
| System Administration | Seung-Hee Lee, Jeong-Heon Kim |
| Storage (Disk & Tape) | Sang-Oh Park |
| Network | Hyoung-Woo Park; KISTI support (Dr. Bu-Seung Cho) |
| Site Operation & Administration; KIAF Operation & User Support | Il-Yeon Yeo, Sang-Un Ahn |

3 Updates – 1/3
VOBOX upgrade (May 2013)
  From gLite-VOBOX to WLCG-VOBOX (EMI)
  Latest version of AliEn & SL6 (x64 architecture) supported
  ALICE packages deployed
Scientific Linux 6 kernel security patch (May 2013)
GLExec deployment (July 2013)
  New worker node probe supporting MUPJ (Multi-User Pilot Jobs) deployed: GLExec
  ARGUS server was configured at CREAM-CE at the same time
  MAUI-Torque scheduling optimization for new pilot jobs
SHA-2 support (July 2013)
  Preparation for new certificates using SHA-2 (256 bits), to be issued and in production by the end of this year
  CREAM-CE needed to be updated (update #10, v1.14.4)
CernVM-FS deployment (July 2013)
  ALICE decided to migrate from torrent-type package distribution to CVMFS
  Squid proxy servers must be set up for caching
  2 proxies were installed for high availability

4 Updates – 2/3
RAW replication:
  Transfer of the p-Pb & p-p collision data taken in 2013 was completed on 16th August
  Total size of 310 TB in 400k files
  Average transfer rate: 22MB/s (from 7th March to 16th August; peak ~400MB/s)
10 Gbps link upgrade:
  10 Gbps link upgrade plan presented to the WLCG MB on 17th September
  By the end of this year, the 2 Gbps link will be in production
  Work on the 10 Gbps configuration will start in the next fiscal year (usually starting 1st March)
  Becoming a full Tier1 is to be discussed in the coming spring (probably at the next WLCG OB in March 2014)
  (To be explained in more detail later)
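As a rough cross-check of the quoted average rate, the sketch below divides the total size by the elapsed time between the two dates on this slide. It assumes decimal units (1 TB = 10^6 MB) and that the transfer window spans whole days; the function name is illustrative only.

```python
from datetime import date


def average_rate_mb_per_s(total_tb: float, start: date, end: date) -> float:
    """Average transfer rate in MB/s, assuming decimal units (1 TB = 10**6 MB)."""
    elapsed_seconds = (end - start).days * 86_400
    return total_tb * 1_000_000 / elapsed_seconds


# Figures from the slide: 310 TB moved between 7 March and 16 August 2013.
rate = average_rate_mb_per_s(310, date(2013, 3, 7), date(2013, 8, 16))
print(f"average rate: {rate:.1f} MB/s")
```

Under these assumptions the result is close to the 22MB/s reported above.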

5 Computing Resource Status
2013 Pledges (CPU): HepSpec06 25,000
  Current HepSpec06: 15,840
  1,800 job slots available (4 reserved slots for pilot jobs) with H/T enabled
  New servers were delivered in the week before this CKC
  2,000 physical cores will be allocated to Tier1 to meet the pledges
2013 Pledges (Tape Storage): Tape 1,500 TB
  Current tape capacity: 1,000 TB
  Pledges will be met in November
2013 Pledges (Disk Storage): Disk 1,000 TB
  Current disk capacity: 966 TB (1,000 TB allocated, but usable space is slightly below)
  One additional XROOTD server will be added

| | Without Hyper-Threading | With Hyper-Threading |
|---|---|---|
| HepSpec06 score (per core) | 14.3 | 8.8 |
| # of cores required to meet pledges | 1750 | 2860 |

※ Benchmark environment: Intel Xeon 2.67 GHz; 6 cores * 2 CPUs; Scientific Linux 6 (x64); gcc 4.4.6, g++, gfortran 4.4.6
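The core counts in the table follow from dividing the CPU pledge by the per-core benchmark score. A minimal sketch of that arithmetic, using only the numbers on this slide (the function and variable names are hypothetical):

```python
import math


def cores_needed(pledge_hs06: float, hs06_per_core: float) -> int:
    """Smallest whole number of cores whose combined score reaches the pledge."""
    return math.ceil(pledge_hs06 / hs06_per_core)


PLEDGE_HS06 = 25_000  # 2013 CPU pledge from the slide

without_ht = cores_needed(PLEDGE_HS06, 14.3)  # physical cores, H/T off
with_ht = cores_needed(PLEDGE_HS06, 8.8)      # logical cores, H/T on

# Current capacity: 1,800 slots with H/T enabled at 8.8 HepSpec06 each.
current_hs06 = round(1800 * 8.8)

print(without_ht, with_ht, current_hs06)
```

The raw division gives values slightly below the 1750 and 2860 quoted in the table; the slide does not say how its figures were rounded. The current capacity of 15,840 HepSpec06 matches 1,800 slots at the Hyper-Threading score.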

6 Operation Status
Concurrent job capacity: 1796 (for ALICE only; 4 slots reserved for pilots)
Since RAW replication finished, reconstruction jobs are the majority
No critical system malfunction, but a few interventions:
  VOBOX upgrade, SL6 security update
  Trivial missing-library issues after kernel compilation
  Short network interventions: scheduled & unscheduled
  Mostly transparent: GLExec, CVMFS deployment & CREAM-CE update
[Figure: running jobs, January to October, with levels marked at ~1,800 and ~1,400; annotations: VOBOX upgrade, SL6 security update, trivial library issues; (un-)scheduled network intervention; scheduled downtime ~ 19th October]

7 2013 p-Pb RAW replication
p-Pb data transfer started on 7th March and finished on 16th October
Total data size: 310 TB (400k files), 177 transfer runs
No critical incident during the replication
Transfer speed: 22.74MB/s on average, 402.5MB/s at peak
  Lower than expected (~60MB/s) because the 1Gbps link is shared with other Tier1 services
First reconstruction jobs started on 11 June, and KISTI-GSDC performed well (~ 50% contribution to 3 RAW reconstruction cycles)
[Figure: transfer rate from 7th Mar 2013 to 16th Oct 2013, with levels marked at ~35MB/s and ~20MB/s; annotations: heavy traffic on WNs, incident on firewall, SL6 security update]
KISTI-Asian ALICE Tier-2 Seminar 5 August 2013

8 Site Availability/Reliability
Monthly WLCG report
Mostly met target number: 97%
Overall Reliability: 95.8%
  Few interventions in May
  Short network down in July
Definition:
  $\mathrm{Availability} = \dfrac{UP}{UP + DOWN + S\_DOWN}$
  $\mathrm{Reliability} = \dfrac{UP}{UP + DOWN}$
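The two definitions differ only in whether scheduled downtime (S_DOWN) counts against the site. A short sketch of both formulas follows; the hour values in the example are made up for illustration and are not KISTI measurements.

```python
def availability(up: float, down: float, s_down: float) -> float:
    """UP / (UP + DOWN + S_DOWN): scheduled downtime counts against the site."""
    return up / (up + down + s_down)


def reliability(up: float, down: float) -> float:
    """UP / (UP + DOWN): scheduled downtime is excluded."""
    return up / (up + down)


# Hypothetical 30-day month (720 h): 700 h up, 8 h unscheduled down,
# 12 h scheduled down. These numbers are illustrative assumptions.
a = availability(up=700, down=8, s_down=12)
r = reliability(up=700, down=8)
print(f"availability = {a:.1%}, reliability = {r:.1%}")
```

Because the reliability denominator omits S_DOWN, reliability is never lower than availability for the same period.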

9 Network – 10G Network to Join OPN
Establishing the 2Gbps link between CERN and KISTI is in the administrative process
  Contracts with NLR (US) and SURFnet (NL) will be started soon
Budget to upgrade the network to 10Gbps is secured, and the upgrade will be performed in Mar. 2014
  Including a 1Gbps backup link
Joining OPN has been discussed with CERN network experts (in early Oct)
The 10Gbps upgrade requires more than 6 months (connection/test is foreseen in Aug. 2014)
Plan revised (Sept. 17 on WLCG MB):

| Year | 2012 | 2013 | 2014 | 2015 |
|---|---|---|---|---|
| Bandwidth (proposed) | 1Gbps | 1Gbps → 2Gbps (Oct. 2013) | 2Gbps → 10Gbps (Aug. 2014) | 10Gbps |
| | (1Gbps) | (2Gbps) | (3Gbps) | - |
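To give a feel for what each bandwidth step means, the sketch below estimates the minimum time to move a dataset at full line rate. It is an idealized lower bound that assumes a dedicated link with no protocol overhead, and it uses the 310 TB RAW replication volume reported earlier in this talk as the example size; the function name is hypothetical.

```python
def days_at_line_rate(size_tb: float, link_gbps: float) -> float:
    """Ideal transfer time in days, assuming the whole link is usable (no overhead)."""
    size_bits = size_tb * 1e12 * 8          # decimal TB to bits
    seconds = size_bits / (link_gbps * 1e9)  # Gbps to bits per second
    return seconds / 86_400


# Bandwidth steps from the revised plan: 1, 2 and 10 Gbps.
for gbps in (1, 2, 10):
    print(f"{gbps:>2} Gbps: {days_at_line_rate(310, gbps):.1f} days for 310 TB")
```

Real transfers would take longer, since the link is shared and sustained rates stay below line rate.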

10 Conclusion
The full set of p-Pb collision data has been replicated to KISTI tape storage
  Heavy reconstruction activity on this data is ongoing
By the end of this year, dedicated 2Gbps bandwidth will be established between CERN and KISTI
  Joining OPN should start once the 2Gbps link is established
10Gbps network upgrade plan submitted to the WLCG MB
  Becoming a full Tier-1 could possibly be discussed in March, depending on the network status

11 Milestones (WLCG MB 2013-09-17)

| Objective | Target |
|---|---|
| Nominate KISTI/GSDC representatives in the WLCG Management Board and the GDB | Jun. 2012 |
| Establishment of a 1Gbps connectivity to CERN | Apr. 2012 |
| Installation of tape system | Dec. 2012 |
| High speed transfer of data from CERN to KISTI at the speed required to receive and archive 10% of the ALICE AA raw data foreseen for 2012 over a continuous period of 2 weeks | Apr. 2013 |
| Provide a precise plan for 3Gbps (or higher) connectivity to CERN | Sep. 2013 |
| Present a plan for providing on-call services/support according to the T1 specifications as laid out in the WLCG MoU | |
| 85% of the job capacity running for at least 2 months | |
| 90% Storage Element (DPM and/or XROOTD) availability (functional tests) for at least 2 months | |
| Running of the reliability tests (both OPS and ALICE-specific) and publishing those to the new SAM infrastructure | Feb. 2013 |
| Integration with the APEL accounting system and publishing accounting data | Jan. 2013 |
| 90% of the WLCG T1 service targets for at least 2 months | - |
| Integration in the WLCG OPN (with 2Gbps) | Discussed |
| Functional tests of the OPN (with 2Gbps) | |

Legend: ■ Done ■ To be done ■ Issue

Issue
  Slower performance than expected: perfSONAR test scheduled for when the 2Gbps link is established
To be done
  OPN integration process has been discussed, and the timeline should be fixed after the 2Gbps link is established
  OPN functional tests to follow
T1 service target
  90% Availability (98.5% Reliability) from April to September
  ALICE confirmed that KISTI Tier-1 has shown itself to be a reliable site

