A data Grid test-bed environment in Gigabit WAN with HPSS in Japan

A data Grid test-bed environment in Gigabit WAN with HPSS in Japan A.Manabe, K.Ishikawa+, Y.Itoh+, S.Kawabata, T.Mashimo*, H.Matsumoto*, Y.Morita, H.Sakamoto*, T.Sasaki, H.Sato, J.Tanaka*, I.Ueda*, Y.Watase S.Yamamoto+, S.Yashiro KEK *ICEPP +IBM Japan

Outline ICEPP computer facility and its role in ATLAS Japan. ICEPP-KEK NorduGrid testbed. GridFTP and GSI-enabled pftp (GSI-pftp). Data transfer performance. 2003/3/27 CHEP03

ICEPP (International Center for Elementary Particle Physics) ICEPP, located at the Univ. of Tokyo, is the central computer facility of ATLAS Japan. ICEPP started installing a PC farm in 2002 and joined ATLAS DC1 last summer. ICEPP PC farm: 39 computing nodes with PentiumIII 1.4 GHz x 2 CPUs; 108 nodes with Xeon 2.8 GHz x 2 CPUs will be installed this month; 2.5 TB NFS server (2.2 GHz Xeon); LTO library. 2003/3/27 CHEP03

ICEPP-KEK Grid Testbed Objective: configure an R&D test bed for the ATLAS Japan regional center hierarchy. ICEPP (Tokyo): ATLAS Tier 1 regional center. KEK (Tsukuba): supposed to be an ATLAS Tier 2 local center. Short-term requirement: archive the ~3 TB of data produced at ATLAS DC1 to mass storage and open it to ATLAS members at KEK. 2003/3/27 CHEP03

ICEPP-KEK Grid Test Bed Hardware ICEPP: computing elements, 4 nodes with 8 CPUs. KEK: 50 nodes with 100 CPUs, plus HPSS servers with 2 disk movers (the HPSS system is shared with general users at KEK). 2003/3/27 CHEP03

Test bed – KEK side Fujitsu TS225, 50 nodes (PentiumIII 1 GHz x 2 CPUs). 120 TB HPSS storage, used for the basic storage service to general users at KEK: 5 tape movers with 20 3590 drives (shared use), and 2 disk movers with SSA RAID disks (dedicated to this test bed). 2003/3/27 CHEP03

ICEPP-KEK Grid Testbed Network 1 GbE connection over Super-SINET between the ICEPP PC farm, the KEK PC farm and the HPSS servers, all in a single subnet. RTT ~4 ms; link quality is quite good. 2003/3/27 CHEP03

Japan HEP network backbone on “Super-SINET” Super SINET is a project of advanced science and technologies: a 10 Gbps research backbone with GbE / 10 GbE bridges for peer connection, linking the major HEP universities and institutes in Japan, and operating Optical Cross Connects (OXC) for fiber / wavelength switching. Operational from 4 January 2002 until the end of March 2005; it will be merged into Photonic SINET after April 2005. 2003/3/27 CHEP03

[Map: KEK and ICEPP, ~60 km apart, connected over Super-SINET]

GRID testbed environment with HPSS through GbE-WAN [Diagram] Each site runs a NorduGrid grid-manager and gridftp-server, Globus MDS and a PBS server (plus a Globus replica catalog). ICEPP: CE with 6 CPUs (PBS clients), SE of 0.2 TB, user PCs attached at 100 Mbps. KEK: CE with 100 CPUs (PBS clients), SE = 120 TB HPSS on the HPSS servers. The sites are ~60 km apart, connected at 1 Gbps. 2003/3/27 CHEP03

ICEPP-KEK Grid Testbed software Globus 2.2.2, NorduGrid 0.3.12 + PBS 2.3.16, HPSS 4.3 + GSI-enabled pftp (GSI-pftp). 2003/3/27 CHEP03

NorduGrid NorduGrid (The Nordic Test bed for Wide Area Computing and Data Handling) http://www.nordugrid.org “The NorduGrid architecture and tools” presented by A.Waananen et al. @ CHEP03 2003/3/27 CHEP03

Why NorduGrid? A natural application of the Globus toolkit to PBS: PBS clients do NOT need a Globus/NorduGrid installation. We installed NorduGrid/Globus on just 3 nodes (ICEPP CE, KEK CE, KEK HPSS SE) but can use more than 60 nodes. Simple, but sufficient functionality; it was actually used for the ATLAS DC in the Nordic states. A good start for a basic regional center functionality test. 2003/3/27 CHEP03

HPSS as a NorduGrid Storage Element HPSS does not speak ‘Globus’, so we need some kind of GridFTP for HPSS. One is in the design phase at Argonne; others may also be under development (SDSC?). GSI-enabled pftp (GSI-pftp), developed at LBL, is not a GridFTP server. But…. 2003/3/27 CHEP03

GSI-pftp as NorduGrid SE Both GridFTP and GSI-pftp are a kind of FTP; only their extended protocol commands differ. GridFTP extensions: SPAS, SPOR, ERET, ESTO, SBUF, DCAU. GSI-pftp extensions: PBSZ, PCLO, POPN, PPOR, PROT, PRTR, PSTO. Common: AUTH, ADAT and the other RFC 959 commands. 2003/3/27 CHEP03

GSI-pftp as NorduGrid SE The protocols for parallel transfer and buffer management are different. DCAU (Data Channel Authentication) is unique to GridFTP, but it is optional for the user. GSI-pftpd and a GridFTP client can communicate successfully with each other, except for parallel transfer. 2003/3/27 CHEP03
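The overlap can be made concrete with a small set comparison. A minimal sketch in Python; the command lists are copied from the slide above, and the point is simply that a mixed GridFTP-client / GSI-pftpd pair can only rely on the shared subset:

    # FTP command extensions, as listed on the previous slide.
    gridftp_ext = {"SPAS", "SPOR", "ERET", "ESTO", "SBUF", "DCAU"}
    gsi_pftp_ext = {"PBSZ", "PCLO", "POPN", "PPOR", "PROT", "PRTR", "PSTO"}
    common = {"AUTH", "ADAT"}  # plus the plain RFC 959 command set

    # A GridFTP client talking to GSI-pftpd can only count on the common subset,
    # so parallel-stream (SPAS/SPOR) and buffer (SBUF) negotiation is unavailable.
    print("shared commands:", sorted(common))
    print("extensions lost when mixing implementations:", sorted(gridftp_ext | gsi_pftp_ext))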

Sample xRSL
    &(executable=gsim1)
     (arguments="-d")
     (inputfiles=("data.in" "gsiftp://dt05s.cc.kek.jp:2811/hpss/ce/chep/manabe/data2"))
     (stdout=datafiles.out)
     (join=true)
     (maxcputime="36000")
     (middleware="nordugrid")
     (jobname="HPSS access test")
     (stdlog="grid_debug")
     (ftpThreads=1)
2003/3/27 CHEP03
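With the NorduGrid user interface installed, a job description like this would typically be submitted and tracked with the ng* client commands; a sketch only (the file name is hypothetical and the exact options depend on the NorduGrid release in use):

    ngsub -f hpss_access_test.xrsl   # submit the xRSL description above
    ngstat -a                        # poll the status of submitted jobs
    ngget <jobid>                    # fetch stdout, stdlog and output files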

Players in HPSS [Diagram] HPSS disk movers / GSI-pftp servers with disk cache (2-CPU Power3 375 MHz, AIX 4.3, HPSS 4.3, Globus 2.0, x3); tape movers with 3590 drives (14 MB/s, 40 GB per cartridge, x3); CE (GridFTP client), the computing element at ICEPP/KEK (2-CPU PentiumIII 1 GHz, RedHat 7.2, Globus 2.2). The HPSS system is shared by many users. 2003/3/27 CHEP03
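For scale (a rough back-of-the-envelope figure, not from the slides): taking the 14 MB/s drive rating above and the 20 shared 3590 drives quoted on the KEK-side hardware slide, the tape layer could in principle stream about 20 x 14 ≈ 280 MB/s in aggregate, well above the ~125 MB/s carried by the 1 Gbps WAN link, so the tape drives themselves are unlikely to limit remote transfers when enough of them are free.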

Possible HPSS Configuration 1 [Diagram: HPSS server and disk mover at KEK, attached to the SP switch at 150 MB/s; computing element at ICEPP, 60 km away over Super-SINET 1 GbE] Put the disk mover (cache) near the HPSS server. The cache should be near the consumer, but here the disk mover is far from the CE. This gets the high performance of the SP switch. 2003/3/27 CHEP03

Possible HPSS Configuration 2 [Diagram: HPSS server at KEK, remote disk mover at ICEPP on the local 1 GbE LAN, the two sites joined by Super-SINET 1 GbE] Put a remote disk mover (cache) near the ICEPP CE. This gives fast access between the CE and cached files, but if the same file is accessed from a CE on the KEK side, a long detour occurs. 2003/3/27 CHEP03

Possible HPSS Configuration 3 [Diagram: CEs at ICEPP and KEK, HPSS server at KEK, disk layer split into HPSS hierarchies 1-3] To avoid a long access delay for the CE at KEK, the disk layer can be divided into two hierarchies. But this configuration is complicated. 2003/3/27 CHEP03

Current setup: Possible HPSS Configuration 1 [Diagram: HPSS server and disk mover on the KEK LAN (1 GbE); computing elements at KEK (LAN) and at ICEPP, 60 km away across the Super-SINET 1 GbE WAN] 2003/3/27 CHEP03

Performance Basic Network performance. HPSS Client API performance. pftp client - pftp server performance. Gridftp client - pftp server performance. 2003/3/27 CHEP03

Basic Network Performance RTT ~4 ms, packet-loss free, MTU = 1500. The CPU/NIC is the bottleneck. The maximum TCP buffer size (256 kB) on the HPSS servers cannot be changed (it is optimized for the IBM SP switch). [Plot: throughput over LAN vs WAN]

Basic network performance on Super-SINET With more than 4 TCP sessions the transfer reaches the maximum speed; with a large enough TCP buffer, a single session gets almost the maximum speed. [Plot: aggregate TX speed (Mbit/s) vs. number of TCP sessions, ICEPP client to KEK HPSS mover; curves for client buffer sizes of 1 MB and 100 kB; HPSS mover buffer fixed at 256 kB]
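The session counts above are consistent with a simple bandwidth-delay-product estimate. A minimal sketch in Python; the 1 Gbit/s link rate, ~4 ms RTT and buffer sizes are taken from the slides, the rest is textbook window/RTT arithmetic:

    # Single-stream TCP is bounded by window / RTT (ignoring slow start and loss).
    rtt_s = 0.004                 # ICEPP <-> KEK round-trip time
    link_mbit = 1000              # GbE over Super-SINET

    def stream_limit_mbit(buffer_bytes, rtt=rtt_s):
        return buffer_bytes * 8 / rtt / 1e6

    bdp_bytes = link_mbit * 1e6 / 8 * rtt_s   # window needed to fill the pipe
    print("bandwidth-delay product: %.0f kB" % (bdp_bytes / 1e3))                       # ~500 kB
    print("256 kB mover buffer -> %.0f Mbit/s per stream" % stream_limit_mbit(256 * 1024))
    print("100 kB client buffer -> %.0f Mbit/s per stream" % stream_limit_mbit(100 * 1024))

With the mover buffer pinned at 256 kB, a single stream tops out around 500 Mbit/s, and a 100 kB client buffer around 200 Mbit/s, which matches the need for several parallel sessions to aggregate the full GbE bandwidth, while a 1 MB buffer (larger than the ~500 kB BDP) lets one stream approach line rate.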

Disk mover disk performance HPSS SSA raw disk performance: read/write ~35/100 MB/s. PC farm disk performance: read/write ~30-40 MB/s.

HPSS Client API [Plot: transfer speed between HPSS disk and CE memory, LAN vs WAN]

HPSS Client API Network latency impacts the file transfer speed: the maximum raw TCP speed was almost the same, but the data transfer speed dropped to 1/2 over the RTT ~4 ms WAN. The reason is not yet clear, but there may be frequent communication between the HPSS core server and the HPSS client (every chunk of 4 MB?). The write overhead for a single-buffer transfer was larger than for read. A 64 MB buffer size was enough for an RTT ~4 ms network. 2003/3/27 CHEP03
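The chunk hypothesis above can be put into a toy model. A sketch only, assuming one synchronous control round trip per chunk; the 4 MB chunk size and ~4 ms RTT are from the slide, the 60 MB/s LAN-side rate is a placeholder:

    # Toy model: effect of per-chunk control traffic on WAN throughput.
    chunk_mb = 4.0        # HPSS client API chunk size
    rtt_s = 0.004         # WAN round-trip time
    lan_mb_s = 60.0       # placeholder LAN-side rate

    data_time = chunk_mb / lan_mb_s
    one_rtt_rate = chunk_mb / (data_time + rtt_s)
    rtts_to_halve = data_time / rtt_s   # extra delay equal to the data time halves the rate

    print("one control RTT per chunk: %.1f MB/s" % one_rtt_rate)                      # only a few % slower
    print("round trips per 4 MB chunk needed to halve throughput: ~%.0f" % rtts_to_halve)

With these numbers a single round trip per 4 MB chunk costs only a few percent, and halving the throughput would take on the order of 15-20 synchronous exchanges per chunk, which is consistent with the slide's conclusion that the cause of the factor-of-two loss is not yet understood.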

pftp-pftp: HPSS mover disk -> client /dev/null [Plot: transfer speed (MB/s) vs. number of files transferred in parallel, for the KEK client (LAN), the ICEPP client (WAN), and the ICEPP client using pwidth]

pftp-pftp ‘get’ performance We compared GSI pftp-pftp transfer with normal Kerberos pftp-pftp; both had equivalent transfer speed. As with the Client API transfer, even with a large enough buffer, the transfer speed in the WAN was 1/2 of that in the LAN. Simultaneous multiple file transfers (>4) increased the aggregate bandwidth; we had 2 disk movers with 2 disk paths each (2 x 2 = 4). Single file transfer with multiple TCP sessions (the pftp pwidth command) was not effective on the RTT = 4 ms network with a sufficient FTP buffer. 2003/3/27 CHEP03

pftp-pftp: HPSS mover disk -> client [Plot: aggregate transfer speed (MB/s) vs. number of files transferred in parallel, to /dev/null and to the client disk, for the KEK client (LAN) and the ICEPP client (WAN); FTP buffer = 64 MB; client disk speed 35-45 MB/s (~48 MB/s at KEK, ~33 MB/s at ICEPP)]

pftp-pftp ‘get’ performance (2) Even if each component (disk, network) has good performance, the total staging performance becomes poor when access is done serially: with a 100 MB/s source disk, an 80 MB/s (640 Mbps) network and a 40 MB/s destination disk, total speed = 1/(1/100 + 1/80 + 1/40) ≈ 21 MB/s. 2003/3/27 CHEP03
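The same arithmetic generalizes to any chain of stages. A minimal sketch; the stage rates are the ones on the slide, and the pipelined case (overlapping the stages, which is not shown on the slide) is added for comparison:

    # Staging a file through several stages: serial vs. overlapped access.
    stages_mb_s = {"source disk": 100.0, "network": 80.0, "destination disk": 40.0}

    serial = 1.0 / sum(1.0 / r for r in stages_mb_s.values())  # every byte waits for each stage in turn
    pipelined = min(stages_mb_s.values())                      # stages overlap; the slowest one wins

    print("serial staging:    %.0f MB/s" % serial)      # ~21 MB/s, as on the slide
    print("pipelined staging: %.0f MB/s" % pipelined)   # 40 MB/s, bounded by the slowest stage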

HPSS ‘get’ with tape library Thanks to HPSS multi-file transfer between the tape and disk hierarchies, and a sufficient number of tape drives, we gained speed with multiple parallel file transfers even when the data was on tape. [Plot: elapsed time (sec) vs. number of files transferred in parallel, for data on the HPSS mover disk cache, data on a mounted tape, and data on an unmounted tape]

GSI-pftp ‘put’ performance [Plot: aggregate transfer speed for a single file, N files in parallel, and a single file with pwidth]

GridFTP client and GSI-pftp server [Plot: transfer speed for pftp-pftpd with the disk mover on a different node from pftpd, gridftp-pftpd with the disk mover on the pftpd node, and gridftp-pftpd with the disk mover on a different node]

GSI-pftpd with a GridFTP client It works! But it is less secure than GridFTP client to GridFTP server (data-path authentication is omitted). In our environment, GridFTP parallel TCP transfer is not needed. With multiple disk movers, all data transfers go through a single pftpd server when a GridFTP client is used. 2003/3/27 CHEP03

Path difference [Diagram: with pftp - pftpd, data flows from the tape movers and disk movers (x3) directly to the CE (pftp client); with GridFTP - GSI-pftpd, all data from the disk movers is relayed through the single pftp server node before reaching the CE (GridFTP client)]
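The cost of the relayed path can be estimated from numbers already quoted. A rough sketch; the 2 movers x 2 SSA paths and the ~35 MB/s read rate come from earlier slides, while the 1 Gbit/s (~125 MB/s) interface on the pftpd node is an assumption based on the GbE testbed:

    # Aggregate 'get' bandwidth: direct mover access vs. relay through one pftpd node.
    movers, paths_per_mover = 2, 2     # 2 x 2 = 4 disk paths, as quoted earlier
    per_path_mb_s = 35.0               # roughly the measured SSA read rate
    relay_nic_mb_s = 125.0             # assumed 1 Gbit/s interface on the pftpd host

    direct = movers * paths_per_mover * per_path_mb_s
    relayed = min(direct, relay_nic_mb_s)

    print("pftp client, direct from the movers: up to %.0f MB/s" % direct)
    print("GridFTP client via the single pftpd: capped near %.0f MB/s" % relayed)

The gap matters mostly for LAN clients or for a future setup with more disk movers; for a remote client the 1 GbE WAN link imposes a similar cap either way.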

Summary ICEPP and KEK configured a NorduGrid test bed with an HPSS storage server over a high-speed GbE WAN. Network latency affected the HPSS data transfer speed. GSI-pftpd, developed by LBL, was successfully adopted as the interface between NorduGrid and HPSS, but it leaves room for performance improvement with multiple disk movers. 2003/3/27 CHEP03

Summary The HPSS parallel mechanism (multiple disk/tape servers) was effective for utilizing the bandwidth of a high-speed, medium-distance network. 2003/3/27 CHEP03