Construction Experience and Application of the HEP DataGrid in Korea Bockjoo Kim On behalf of Korean HEP Data Grid Working Group CHEP2003, UC San Diego Monday 24 March 2003
Outline
- Korean Institutions and HEP Experiments
- Development of Korean HEP Data Grid
- Goals of Korean HEP Data Grid
- Hardware and Software Resources
  - Network
  - CPUs
  - Storage
  - Grid Software: EDG testbed, SAMGrid
- Achievements in Y2002
- Prospects in Y2003
Korean Institutions and HEP Experiments
[Map, Korean DataGrid-related experiments only: Korea CHEP and other Korean HEP institutions, linked to US FNAL (CDF), US BNL (PHENIX), Japan KEK (K2K/Belle), Europe CERN (CMS, 2007), and the Space Station (AMS, 2005)]
- 12 institutions are active HEP participants
- Current experiments: Belle/KEK, K2K/KEK, PHENIX/BNL, CDF/Fermilab
- Near-future experiments:
  - AMS/ISS (MIT, NASA, CERN): Y2005
  - CMS (CERN, Europe): Y2007
  - Linear Collider experiment(s)
Development of Korean HEP Data Grid
[Slide group labels: Networking, Industry, MOST and CHEP]
- Grid Forum Korea (GFK) was formed in 2001, and thus the KHEPDGWG started
- Korean HEP Data Grid approved by KISTI/MIC (GFK) on March 22
- NCA supports CHEP with two international networking utilization projects closely related to the Korean HEP Data Grid: Europe and Japan/USA
- KT/KOREN-NOC supports CHEP with PC clusters for networking
- Companies such as IBM-Korea and CIES agreed to support CHEP (50 TB tape library and 1 TB + servers)
- CHEP itself supports the HEP Data Grid with its own research fund from the Ministry of Science and Technology (MOST)
- Kyungpook Nat'l Univ. supports CHEP with space for the KHEPDGWG
- KOREN/APAN supports the Korean HEP DG with 1 Gbps bandwidth from CHEP to KOREN (2002)
- networking (one is CHEP and the other is in Seoul) is under discussion
- The Hyeonhae/Genkai APII (GbE) project for HEP (between Korea and Japan) is in progress
- 1st International HEP Data Grid Workshop in Nov 2002
Goals of Korean HEP Data Grid
- Implementation of the Tier-1 Regional Data Center for the LHC-CMS (CERN) experiment in Asia. The Regional Data Center can also be used as a regional data center for other experiments (Belle, CDF, AMS, etc.)
- Networking: multi-level (Tier) hierarchy of distributed servers (both for data and for computation) to provide transparent access to both raw and derived data
  - Tier-0 (CERN) to Tier-1 (CHEP): ~Gbps via TEIN
  - Tier-1 (CHEP) to Tier-1 (US and Japan): ~Gbps via APII/Hyeonhae
  - Tier-1 (CHEP) to Tier-2 or 3 (participating institutions): 45 Mbps ~ 1 Gbps via KOREN
- Computing: 1000-CPU Linux clusters
- Data storage capability
  - Storage: 1.1 PB RAID-type disk (Tier1 + Tier2)
  - Tape drive: ~3.2 PB
  - HPSS servers
- Software: contribute to grid application package development
Korean HEP Data Grid Network Configuration (2002)
- Network bandwidth between institutions
  - CHEP-KOREN: 1 Gbps (ready for users)
  - SNU-KOREN: 1 Gbps (ready for test)
  - CHEP-SNU: 1 Gbps (ready for test)
  - SKKU-KOREN: 155 Mbps (not yet available to users)
  - Yonsei-KOREN: 155 Mbps (not yet available to users)
- File transfer tests
  - KNU-SNU, KNU-SKKU: ~50 Mbps
  - KNU-KEK, KNU-Fermilab: 17 Mbps (155 Mbps, 45 Mbps)
  - KNU-CERN: 8 Mbps (10 Mbps)
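To put the measured file-transfer rates in perspective, the sketch below estimates how long a bulk transfer would take at each of them. The 1 TB dataset size is an assumption chosen for illustration, not a figure from the tests, and the function name is hypothetical.

```python
# Measured file-transfer rates from the 2002 tests, in Mbps.
MEASURED_MBPS = {
    "KNU-SNU, KNU-SKKU": 50.0,
    "KNU-KEK, KNU-Fermilab": 17.0,
    "KNU-CERN": 8.0,
}

# Hypothetical dataset size (assumption for illustration): 1 TB = 10**12 bytes.
DATASET_BYTES = 1e12


def transfer_hours(size_bytes, rate_mbps):
    """Hours needed to move size_bytes at a sustained rate (1 Mbps = 10**6 bits/s)."""
    if rate_mbps <= 0:
        raise ValueError("rate must be positive")
    seconds = size_bytes * 8 / (rate_mbps * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    for link, rate in MEASURED_MBPS.items():
        hours = transfer_hours(DATASET_BYTES, rate)
        print(f"{link}: {hours:.1f} h")
```

Under these assumptions, 1 TB over the KNU-CERN path at 8 Mbps would take roughly 278 hours, which illustrates why the ~Gbps international links in the project goals matter for a regional data center.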
Distributed PC-Linux Farm
- Distributed PC-Linux clusters (~206 CPUs so far)
- 10 sites for testbed setup and/or tests
  - Center for High Energy Physics (CHEP): 142 CPUs
  - SNU: 6 CPUs
  - KOREN/NOC: ~40 CPUs
  - CHEP to KOREN: 1 GbE test established
  - Yonsei U, SKKU, Konkuk U, Chonnam Nat'l Univ, Ewha WU, Dongshin U: 1 CPU each
- 4 sites outside of Korea: 18 CPUs (KEK, FNAL, CERN, and ETH)
PC-Linux Farm at KNU
CHEP/KNU 48 TB Storage and Network Equipment
Storage Specification
- IBM tape library system, 48 TB (13~18/Nov/2002): 3494-L TB, 3494-S10 16 TB, 3494-L TB, 3494-S10 16 TB
- RAID disks: FAStT200, 1 TB
- RAID disks: 1 TB
- Disks on nodes: 4.4 TB
- SW: TSM (HSM)
- HSM server: S7A, 262 MHz, 8-way, 4 GB memory
[Diagram labels: 48 TB, L12, S10]
Grid Software
- All software is Globus 2 based
- KNU and SNU each host one EDG testbed; the testbeds are running within Korea at full scale
- Application of the EDG testbed to currently running experiments is configured:
  - EDG testbed for CDF data analysis
  - EDG testbed for Belle data analysis (in progress)
- Worker nodes for the SAM Grid (Fermilab, USA) are also installed at KNU for CDF data analysis
- CHEP assigned a few CPUs for iVDGL testbed setup (Feb 2003)
EDG Testbeds EDG Test bed at SNU EDG Test bed at KNU
Configuration of EDG testbed in Korea
[Diagram: EDG testbed components at SNU, SKKU, and KNU/CHEP, with a legend distinguishing "In operation" from "In preparation". Labels shown: UI (real user), RB, LDAP, CE (VO user), SE (VO user), WN (VO user), Disk, Big Fat Disk, CDF CPU, K2K CPU, Web Services, NFS and GSIFTP links, GDMP server (with new VO), GDMP clients (with new VO), and the note "MAP on disk with maximum security (grid-security)".]
An Application of EDG testbed
- The EDG testbed functionality is extended to include Korean CDF as a VO
- The extension attaches existing CPUs with CDF software to the EDG testbed
- Add a VO following the EDG discussion list
- The CE in the EDG testbed is modified
  - Define a queue on a non-CE machine
  - grid-mapfile, grid-mapfile.que1_on_ce, grid-mapfile.que2_on_nonce (exclusive job submission)
  - ce-static.ldif.que1_on_ce, ce-static.ldif.que1_on_nonce
  - ce-globus.skel
  - globus-script-pbs-submit
  - globus-script-pbs-poll (for queues on non-CE)
- The experiment-specific machine (= queue on non-CE) is modified
  - Make a minimal WN configuration without greatly modifying the existing machine (PBS install/setup, pooled accounts, mounting security)
  - /etc/hosts.equiv for pooled-account users to submit jobs to the non-CE queue
- References [1] [2]
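The slides do not show the contents of the per-queue grid-mapfile variants. As a rough sketch only, assuming they follow the usual Globus grid-mapfile layout (a quoted certificate subject followed by a local account, with pooled accounts written with a leading dot), a per-queue file could be derived from the full map as follows. The subject names, the `.cdf` and `.cms` pool names, and the function names are all hypothetical.

```python
import shlex

# Hypothetical grid-mapfile content; none of these subjects are real.
SAMPLE = '''# full grid-mapfile (illustrative entries only)
"/O=Grid/O=Example/CN=Alice Example" .cdf
"/O=Grid/O=Example/CN=Bob Example" .cms
"/O=Grid/O=Example/CN=Carol Example" dguser
'''


def parse_grid_mapfile(text):
    """Return (subject, local_account) pairs, skipping blanks and comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # shlex handles the quoted subject
        if len(parts) != 2:
            raise ValueError(f"unexpected grid-mapfile line: {raw!r}")
        entries.append((parts[0], parts[1]))
    return entries


def select_for_queue(entries, accounts):
    """Keep only entries mapped to the accounts allowed on one queue."""
    return [(dn, acct) for dn, acct in entries if acct in accounts]


def render(entries):
    """Serialize entries back into grid-mapfile text."""
    return "".join(f'"{dn}" {acct}\n' for dn, acct in entries)


if __name__ == "__main__":
    everyone = parse_grid_mapfile(SAMPLE)
    # Assumed policy: the CDF queue on the non-CE machine accepts only the CDF pool.
    print(render(select_for_queue(everyone, {".cdf"})), end="")
```

Keeping one filtered map per queue is one way to get the exclusive job submission mentioned above; the actual mechanism used on the Korean testbed may differ.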
Overview of the EDG application
[Diagram of the extended testbed. Labels shown: UI (real user); RB; /home; Modified CE (VO user) with queues for the EDG VOs and a queue for the CDF VO; SE (VO user) with /flatfiles/SE00; WN (VO user) and EDG WN; NFS and GSIFTP links; GDMP server (with new VO) and GDMP client (with new VO); CDF Run2 software; Local LDAP Server with authorized users (dguser for RB, VO users for CDF); "LDAP and.fr" with VO users for CMS, LHCB, ATLAS, etc.; /etc/grid-security and grid-mapfile; PBS Server and PBS Client; CDF VO queue; another site, connected by NFS, with a new WN; CAF; Feynman Center; Fermilab.]
Working Sample Files for CDF Job

JDL:
    Executable = "run_cdf_tofsim.sh";
    StdOutput = "run_cdf_tofsim.out";
    StdError = "run_cdf_tofsim.err";
    InputSandbox = {"run_cdf_tofsim.sh"};
    OutputSandbox = {"run_cdf_tofsim.out","run_cdf_tofsim.err",".BrokerInfo"};

Input shell script:
    #!/bin/sh
    # Set up the CDF software environment
    source ~cdfsoft/cdf2.shrc
    setup cdfsoft int1
    # Create a test release and work inside it
    newrel -t 4.9.0int1 test1
    cd test1
    # Add the TofSim package and build it
    addpkg -h TofSim
    gmake TofSim.all
    gmake TofSim.tbin
    # Run the TofSim test executable with its Tcl configuration
    ./bin/Linux2-KCC_4_0/tofSim_test tofSim/test/tofsim_usr_pythia_bbar_dbfile.tcl
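When many similar jobs are needed, a JDL file of this shape can be generated rather than written by hand. The following is a minimal sketch that covers only the five attributes used in the sample; the function names are hypothetical, and values containing double quotes are not escaped.

```python
def jdl_value(value):
    """Format a str as a quoted JDL string, or a list of str as a JDL list."""
    if isinstance(value, str):
        return '"' + value + '"'
    return "{" + ",".join(jdl_value(item) for item in value) + "}"


def render_jdl(attributes):
    """Render a dict of attributes as 'Name = value;' lines, in insertion order."""
    return "\n".join(f"{name} = {jdl_value(val)};" for name, val in attributes.items()) + "\n"


def cdf_job(script):
    """Build the attribute set of the sample job for a given wrapper script."""
    stem = script.rsplit(".", 1)[0]  # "run_cdf_tofsim.sh" -> "run_cdf_tofsim"
    out, err = stem + ".out", stem + ".err"
    return {
        "Executable": script,
        "StdOutput": out,
        "StdError": err,
        "InputSandbox": [script],
        "OutputSandbox": [out, err, ".BrokerInfo"],
    }


if __name__ == "__main__":
    print(render_jdl(cdf_job("run_cdf_tofsim.sh")), end="")
```

Called with "run_cdf_tofsim.sh", the sketch reproduces the five JDL lines of the sample, so the same helper could be reused for other wrapper scripts that follow the same naming pattern.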
Web Service for EDG testbed
- Purpose: to facilitate access to the EDG testbeds in Korea
- The Mailman Python CGI wrapper is utilized
- EDG job-related Python commands are modified for the web service
- At the moment, login is possible through a proxy file
- A logged-in user can see the user's job IDs
- Retrieved job output remains on the web server machine
Web Service for EDG testbed
[Screenshots: login by loading a proxy; job submission by loading a JDL file; job submission result page, where (1) the job status can be checked and (2) a submitted job can be cancelled; list of job IDs for retrieving output]
SAM Grid
[Screenshots: monitoring home page; DCAF (DeCentralized Analysis Farm) at KNU for SAM Grid]
What KHEPDG achieved in Y2002
- Infrastructure: 206 CPUs / 6.5 TB disk / 48 TB tape library + networking infrastructure
- HSM system for the tape library
- KNU and SNU each host one EDG testbed, running within Korea at full scale and accessible via the web
- KNU installed SAMGrid (US Fermilab products) worker nodes (as demonstrated at SC2002)
- CHEP started discussing collaboration with iVDGL
- SNU/KNU implemented an application of the EDG testbed for CDF, and the implementation is working
- Network tests were performed between Korea-US, Korea-Japan, Korea-EU, and within Korea
- 1st International HEP DataGrid Workshop held at CHEP
Prospects of KHEPDGWG for Y2003
- More testbed setups (e.g., iVDGL's WorldGrid)
- Extend the application of the EDG testbed to other currently running experiments, e.g., Belle
- Cross-grid tests between EDG and iVDGL in Korea
- Investigate the possibility of Globus 3
- Full operation of HPSS (HSM) with grid software
- Increase the number of cluster CPUs to 400 or more
- Increase storage to 100 TB
- Participate in the CMS data challenge
- 2nd HEP DataGrid Workshop will be held in August
Summary
- The HEP Data Grid is being considered for most Korean HEP institutions
- So far the HEP Data Grid project has received excellent support from government, industry, and research institutions
- The EDG testbeds and their application are operational in Korea, and we will expand to other testbeds, e.g., iVDGL WorldGrid
- The 1st international workshop on HEP Data Grid was held successfully in November 2002
- CHEP will host the 2nd international workshop on HEP Data Grid in August 2003