1 Grid Computing for High Energy Physics in Japan. Hiroyuki Matsunaga, International Center for Elementary Particle Physics (ICEPP), The University of Tokyo. International Workshop on e-Science for Physics 2008.

2 Major High Energy Physics Programs in Japan
KEK-B (Tsukuba)
–Belle
J-PARC (Tokai)
–Japan Proton Accelerator Research Complex
–Operation will start within this year
–T2K (Tokai to Kamioka) long-baseline neutrino experiment
Kamioka
–Super-Kamiokande
–KamLAND
International collaboration
–CERN LHC (ATLAS, ALICE)
–Fermilab Tevatron (CDF)
–BNL RHIC (PHENIX)

3 Grid-Related Activities
ICEPP, University of Tokyo
–WLCG Tier2 site for ATLAS (Regional Center for the ATLAS-Japan group)
Hiroshima University
–WLCG Tier2 site for ALICE
KEK
–Two EGEE production sites (Belle experiment, J-PARC, ILC…)
–University support
–NAREGI
Grid deployment at universities
–Nagoya U. (Belle), Tsukuba U. (CDF)…
Network

4 Grid Deployment at the University of Tokyo
ICEPP, University of Tokyo
–Involved in international HEP experiments since 1974
–Has operated a pilot system since 2002
–The current computer system started working last year (TOKYO-LCG2, gLite 3 installed)
CC-IN2P3 (Lyon, France) is the associated Tier1 site within the ATLAS computing model
–Detector data from CERN go through CC-IN2P3
–Exceptionally long distance for a T1-T2 pair: RTT ~280 msec, ~10 hops; a challenge for efficient data transfer
–The data catalog for the files in Tokyo is located at Lyon
ASGC (Taiwan) could be an additional associated Tier1
–Geographically the nearest Tier1 (RTT ~32 msec)
–Operations have been supported by ASGC (neighboring timezone)

5 Hardware Resources
Tier-2 site plus (non-grid) regional center facility
–Supports local user analysis by the ATLAS-Japan group
Blade servers
–650 nodes (2600 cores)
Disk arrays
–140 boxes (~6 TB/box)
–4 Gb Fibre Channel
File servers
–Each attaches 5 disk arrays
–10 GbE NIC
Tape robot (LTO3)
–8000 tapes, 32 drives
Pledged and planned-to-be-pledged resources, 2007-2010:
–CPU (kSI2k): 1000 in 2007, rising to 3000 by 2010
–Disk (Tbytes): 200 in 2007, then 400, reaching 600 by 2010
–Nominal WAN (Mbits/sec): 2000 throughout

6 SINET3
SINET3 (Japanese NREN)
–Third generation of SINET, in operation since Apr. 2007
–Provided by NII (National Institute of Informatics)
Backbone: up to 40 Gbps
Major universities connect at 1-10 Gbps
–10 Gbps to the Tokyo Regional Center
International links
–2 x 10 Gbps to the US
–2 x 622 Mbps to Asia

7 International Link
10 Gbps between Tokyo and CC-IN2P3
–SINET3 + GEANT + RENATER (French NREN)
–Public network (shared with other traffic)
1 Gbps link to ASGC (to be upgraded to 2.4 Gbps)
(Network map: Tokyo → New York → Lyon over SINET3 (10 Gbps), GEANT (10 Gbps) and RENATER (10 Gbps); separate link to Taipei.)

8 Network Test with iperf
Memory-to-memory tests performed with the iperf program
Dedicated Linux boxes used for the iperf tests at both ends
–1 Gbps limit set by the NICs
–Linux kernel 2.6.9 (BIC TCP)
–Window size 8 Mbytes, 8 parallel streams
For Lyon-Tokyo: long recovery time after packet loss due to the long RTT
(Plots: throughput for Lyon-Tokyo (RTT: 280 ms) and Taipei-Tokyo (RTT: 32 ms).)
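
For concreteness, a minimal sketch of how such a memory-to-memory test could be driven from Python by wrapping the standard iperf (v2) client with the settings quoted above; the far-end host name is a placeholder, as the actual test machines are not named on the slide.

    # Sketch only: wraps the iperf (v2) client with the slide's settings.
    import subprocess

    REMOTE = "iperf.lyon.example.org"   # hypothetical far-end test host

    cmd = [
        "iperf",
        "-c", REMOTE,    # client mode, connect to the remote test box
        "-w", "8M",      # 8 Mbyte TCP window, as on the slide
        "-P", "8",       # 8 parallel streams
        "-t", "60",      # run for 60 seconds
        "-i", "5",       # report every 5 seconds to see the recovery behaviour
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)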

9 Data Transfer from the Lyon Tier1 Center
Data transferred from Lyon to Tokyo
–Used the Storage Elements in production
–ATLAS MC simulation data
Storage Elements
–Lyon: dCache (>30 gridFTP servers, Solaris, ZFS)
–Tokyo: DPM (6 gridFTP servers, Linux, XFS)
FTS (File Transfer Service)
–Main tool for bulk data transfer
–Executes multiple file transfers concurrently (using gridFTP); the number of gridFTP streams can be set
–Used in the ATLAS Distributed Data Management system
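
A minimal sketch of the transfer pattern described above, not of FTS itself: several files are copied concurrently and each gridFTP transfer uses multiple parallel streams, here via globus-url-copy. The Storage Element endpoints and file names are placeholders.

    # Sketch only: "many concurrent files x several streams per file".
    # In production FTS schedules this itself between the two SEs.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    SRC = "gsiftp://se.lyon.example.org/atlas/mc/"    # placeholder source SE
    DST = "gsiftp://se.tokyo.example.org/atlas/mc/"   # placeholder destination SE

    def transfer(filename, streams=10):
        # -p sets the number of parallel data streams for one gridFTP transfer
        cmd = ["globus-url-copy", "-p", str(streams), SRC + filename, DST + filename]
        return subprocess.run(cmd).returncode

    files = ["file%04d.root" % i for i in range(20)]   # 20 files in parallel
    with ThreadPoolExecutor(max_workers=20) as pool:
        return_codes = list(pool.map(transfer, files))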

10 Performance of the Data Transfer
>500 Mbytes/s observed in May 2008
–File size: 3.5 Gbytes
–20 files in parallel, 10 streams each
–~40 Mbytes/s for each file transfer
Low activity at CC-IN2P3 (other than ours) during the period
(Plots: aggregate throughput, peaking above 500 Mbytes/s, and throughput per file transfer.)

11 Data Transfer between ASGC and Tokyo
Transferred 1000 files per test (1 Gbyte file size)
Tried various numbers of concurrent files / streams
–From 4/1 up to 25/15
The 1 Gbps WAN bandwidth was saturated
(Plots: throughput for Tokyo -> ASGC and ASGC -> Tokyo with settings such as 4/1, 4/2, 4/4, 8/1, 8/2, 16/1, 20/10, 25/10 and 25/15 concurrent files/streams.)

12 CPU Usage in the Last Year (Sep 2007 – Aug 2008)
3,253,321 kSI2k*hours of CPU time delivered in the last year
–Most jobs are ATLAS MC simulation
–Job submission is coordinated by CC-IN2P3 (the associated Tier1); outputs are uploaded to the data storage at CC-IN2P3
–A large contribution to ATLAS MC production
(Charts: TOKYO-LCG2 CPU time per month, and CPU time at large Tier2 sites.)

13 ALICE Tier2 Center at Hiroshima University
WLCG/EGEE site
–“JP-HIROSHIMA-WLCG”
Possible Tier2 site for ALICE

14 Status at Hiroshima
Just became an EGEE production site (Aug. 2008)
The associated Tier1 site will likely be CC-IN2P3
–No ALICE Tier1 in the Asia-Pacific region
Resources
–568 CPU cores: Dual-Core Xeon (3 GHz) x 2 CPUs x 38 boxes, Quad-Core Xeon (2.6 GHz) x 2 CPUs x 32 boxes, Quad-Core Xeon (3 GHz) x 2 CPUs x 20 blades
–Storage: ~200 TB next year
Network: 1 Gbps
–On SINET3

15 KEK
The Belle experiment has been running
–Needs access to the existing petabytes of data
Site operations
–KEK does not support any LHC experiment
–Aims to gain experience by operating sites, in order to prepare for a future Tier1-level Grid center
University support
NAREGI
(Aerial photo: KEK Tsukuba campus, Mt. Tsukuba, KEKB, the linac and the Belle experiment.)

16 Grid Deployment at KEK
Two EGEE sites
–JP-KEK-CRC-1: rather experimental use and R&D
–JP-KEK-CRC-2: more stable services
NAREGI
–Beta version used for testing and evaluation
Supported VOs
–belle (main target at present), ilc, calice, …
–LCG VOs are not supported
VOMS operation
–belle (registered in CIC)
–ppj (accelerator science in Japan), naokek
–g4med, apdg, atlasj, ail

17 Belle VO
Federation established
–5 countries, 7 institutes, 10 sites: Nagoya Univ., Univ. of Melbourne, ASGC, NCU, CYFRONET, Korea Univ., KEK
VOMS is provided by KEK
Activities
–Submission of MC production jobs
–Functional and performance tests
–Interface to the existing petabytes of data
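
As an illustration of the VOMS side, a minimal sketch, with an assumed proxy lifetime, of how a Belle VO member would obtain a VOMS proxy with the standard voms-proxy-init client before submitting jobs to these sites:

    # Sketch only: obtain a Grid proxy carrying belle VO attributes from the
    # VOMS server operated at KEK (the client finds the server through its
    # local VOMS configuration).
    import subprocess

    subprocess.run(
        ["voms-proxy-init",
         "--voms", "belle",      # request attributes for the belle VO
         "--valid", "24:00"],    # assumed 24-hour proxy lifetime
        check=True,
    )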

18 Takashi Sasaki (KEK)

19 ppj VO
Federated among major universities and KEK
–Tohoku U. (ILC, KamLAND)
–U. Tsukuba (CDF)
–Nagoya U. (Belle, ATLAS)
–Kobe U. (ILC, ATLAS)
–Hiroshima IT (ATLAS, computing science)
Common VO for accelerator science in Japan
–Does NOT depend on specific projects; resources are shared
KEK acts as the GOC
–Remote installation
–Monitoring (based on Nagios and a Wiki)
–Software updates

20 KEK Grid CA
In operation since Jan. 2006
Accredited as an IGTF (International Grid Trust Federation) compliant CA
Numbers of issued certificates:
–Personal certificates: 68 in JFY 2006 (Apr 2006 - Mar 2007), 119 in JFY 2007 (Apr 2007 - Mar 2008)
–Host certificates: 139 in JFY 2006, 238 in JFY 2007
–Web server certificates: 4 in JFY 2006, 0 in JFY 2007

21 NAREGI
NAREGI: NAtional REsearch Grid Initiative
–Host institute: National Institute of Informatics (NII)
–R&D of Grid middleware for research and industrial applications
–Main targets are nanotechnology and biotechnology
More focused on the computing grid
–The data grid part was integrated later
Ver. 1.0 of the middleware was released in May 2008
–Software maintenance and user support services will be continued

22 NAREGI at KEK
NAREGI beta versions installed on the testbed
–1.0.1: Jun. 2006 - Nov. 2006 (manual installation for all the steps)
–1.0.2: Feb. 2007
–2.0.0: Oct. 2007 (apt-rpm installation)
–2.0.1: Dec. 2007
Site federation tests
–KEK-NAREGI/NII: Oct. 2007
–KEK-National Astronomy Observatory (NAO): Mar. 2008
Evaluation of the NAREGI application environment
–Job submission/retrieval, remote data stage-in/out

23 Takashi Sasaki (KEK)

24 Data Storage: Gfarm
Gfarm: a distributed file system
–The data grid part of NAREGI
–Data are stored on multiple disk servers
Tests performed:
–Stage-in and stage-out to the Gfarm storage
–GridFTP interface, between a gLite site and a NAREGI site
–File access from applications, through FUSE (Filesystem in Userspace)
Access through FUSE needs no change to the application program
–I/O speed is several times slower than a local disk
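
To illustrate what "no change to the application program" means in practice, a minimal sketch assuming the Gfarm file system is already mounted through FUSE at a hypothetical mount point; the file name is also a placeholder.

    # Sketch only: reads a file from an assumed Gfarm FUSE mount point with
    # ordinary Python file I/O; no Gfarm-specific API is needed.
    import hashlib

    GFARM_MOUNT = "/gfarm"                      # hypothetical FUSE mount point
    path = GFARM_MOUNT + "/belle/mc/sample.dat" # placeholder file name

    md5 = hashlib.md5()
    with open(path, "rb") as f:                 # plain open() works through FUSE
        for block in iter(lambda: f.read(1 << 20), b""):
            md5.update(block)
    print(path, md5.hexdigest())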

25 Future Plans for NAREGI at KEK
Migration to the production version
Tests of interoperability with gLite
Improvement of the middleware in the application domain
–Development of a new API for applications: virtualization of the middleware for script languages (to be used at the web portal as well)
–Monitoring of jobs, sites, …

26 Summary
WLCG
–ATLAS Tier2 at Tokyo: stable operation
–ALICE Tier2 at Hiroshima: just started production operation
Coordinated effort led by KEK
–Site operations with the gLite and NAREGI middleware
–Belle VO: SRB, to be replaced with iRODS
–ppj VO: deployment at universities, supported and monitored by KEK
–NAREGI: R&D, interoperability

