1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network
Hiroshi Sakamoto
International Center for Elementary Particle Physics (ICEPP), the University of Tokyo
For the Regional Center R&D Collaboration: Computing Research Center, KEK, and ICEPP, the University of Tokyo

2 KEK-ICEPP R&D Collaboration
ICEPP, Univ. of Tokyo: Regional Analysis Center for the LHC ATLAS Experiment
KEK: Central HEP Facility in Japan
Joint R&D Program of the Computing Research Center, KEK and ICEPP, Tokyo
Started in 2001
For a Distributed Analysis Framework
Introduces Computing Grid Technology

3 R&D Items
PC Farm: Blade Servers
Storage System: IDE File Server Prototype
Wide Area Network: KEK-ICEPP GbE Connection, CERN-ICEPP Connectivity Study
Integrated Study: HPSS Access via WAN

4 HP Blade Server
HP ProLiant BL20p
Xeon 2.8 GHz, 2 CPUs/node
1 GB memory
SCSI 36 GB x 2, hardware RAID1
3 GbE NICs
iLO remote administration tool
8 blades / enclosure (6U)
Total 108 blades (216 CPUs) in 3 racks
Rocks 2.3.2 (RedHat 7.3 based) for cluster management

5 HP Blade Server (2)

6 HP Blade Server Network Configuration
(Diagram: each blade exposes NIC1, NIC2, NIC3, and an iLO port. Through two switches these connect to a frontend server and to three private LANs: a PXE/control LAN, a data LAN, and an iLO LAN; the frontend server also connects to the campus LAN.)

7 File Server Prototype
2 x Xeon 2.8 GHz (FSB 533 MHz)
1 GB memory
2 PCI RAID interfaces, 3ware 7500-8
2 GbE (1000Base-T) LAN ports
4U rack-mount case, 16 hot-swap IDE HDD bays
16 x 250 GB HDD + 40 GB HDD
460 W N+1 redundant power supply
7 servers, 24.5 TB in total
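
As a sanity check, the quoted 24.5 TB total is consistent with each server exposing two 8-disk RAID5 arrays; that layout is an assumption (suggested by the RAID5 8-HDD benchmark on the next slide), not something the slide states. A minimal Python sketch:

    # Sketch: how the quoted 24.5 TB could follow from the parts list.
    # Assumed layout (not stated on the slide): two 8-disk RAID5 arrays per
    # server, matching the "RAID5 8 HDD" benchmark configuration.
    servers = 7
    disk_tb = 0.25                                    # 250 GB data disks
    disks_per_server = 16

    raw_tb = servers * disks_per_server * disk_tb             # 28.0 TB raw
    usable_per_array_tb = (8 - 1) * disk_tb                   # RAID5 loses one disk
    usable_tb = servers * 2 * usable_per_array_tb             # two arrays per server

    print(f"raw {raw_tb:.1f} TB, usable {usable_tb:.1f} TB")  # usable 24.5 TB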

8 File Server Prototype Benchmarks
RAID0, 16 HDDs: Read 350 MB/s, Write 170-250 MB/s
RAID5, 8 HDDs: Read 140-160 MB/s, Write 20-50 MB/s
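
The slide does not name the tool that produced these read/write figures. Purely as an illustration of the kind of sequential-throughput measurement involved (with a hypothetical mount point and file size), a rough Python sketch:

    # Rough sequential-throughput check, in the spirit of the figures above.
    # The actual benchmark tool is not named on the slide; the mount point
    # below is hypothetical and this is only an illustration.
    import os, time

    PATH = "/mnt/raid/testfile"     # hypothetical mount point of the array
    BLOCK = 1024 * 1024             # 1 MB blocks
    COUNT = 2048                    # 2 GB test file (larger than the 1 GB RAM)

    buf = os.urandom(BLOCK)
    t0 = time.time()
    with open(PATH, "wb") as f:
        for _ in range(COUNT):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())
    print(f"write ~ {COUNT / (time.time() - t0):.0f} MB/s")

    t0 = time.time()
    with open(PATH, "rb") as f:
        while f.read(BLOCK):
            pass
    print(f"read  ~ {COUNT / (time.time() - t0):.0f} MB/s")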

9 GbE between KEK and ICEPP
10 Gbps SuperSINET backbone, carrying both commodity and project-oriented traffic
Project-oriented 1 Gbps KEK-ICEPP line for the ATLAS experiment

10 WAN Performance Measurement
"A" setting (128 KB window):
  TCP 479 Mbps  (-P 1 -t 1200 -w 128KB)
  TCP 925 Mbps  (-P 2 -t 1200 -w 128KB)
  TCP 931 Mbps  (-P 4 -t 1200 -w 128KB)
  UDP 953 Mbps  (-b 1000MB -t 1200 -w 128KB)
"B" setting (4096 KB window):
  TCP 922 Mbps  (-P 1 -t 1200 -w 4096KB)
  UDP 954 Mbps  (-b 1000MB -t 1200 -w 4096KB)
"A" setting: 104.9 MB/s; "B" setting: 110.2 MB/s
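
The option strings above match iperf's -P (parallel streams), -t (duration), -w (window) and -b (UDP bandwidth) flags, and slide 12 shows iperf-style output, so iperf was presumably the tool. A small driver script for such runs is sketched below; the receiving host name is a placeholder, not taken from the slides.

    # Sketch of a driver for measurements like the "A" setting above.
    # The tool is presumably iperf; the server host name is hypothetical.
    import subprocess

    SERVER = "icepp-host.example"   # hypothetical iperf server at the far end
    runs = [
        ["iperf", "-c", SERVER, "-P", "1", "-t", "1200", "-w", "128K"],
        ["iperf", "-c", SERVER, "-P", "2", "-t", "1200", "-w", "128K"],
        ["iperf", "-c", SERVER, "-P", "4", "-t", "1200", "-w", "128K"],
        ["iperf", "-c", SERVER, "-u", "-b", "1000M", "-t", "1200", "-w", "128K"],
    ]
    for cmd in runs:
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=False)   # iperf prints the throughput itself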

11 Wide Area Network
International network: 2.5 Gbps x 2 (= 5 Gbps) to New York

12 CERN-Tokyo (Very preliminary)
[UDP] 17:00-17:30, options: -b 1000MB -w 256KB
  [ ID] Interval          Transfer     Bandwidth      Jitter    Lost/Total Datagrams
  [  3] 0.0-1095.5 sec    116 GBytes   906 Mbits/sec  0.015 ms  4723300/89157875 (5.3%)
  [  3] 0.0-1095.5 sec    134933 datagrams received out-of-order
[TCP] 11:00-16:30
  [ ID]  Interval          Transfer     Bandwidth       Options
  [SUM]  0.0-1202.7 sec    645 MBytes   4.50 Mbits/sec  -P 4   -w 256KB
  [SUM]  0.0-1203.3 sec    1.13 GBytes  8.04 Mbits/sec  -P 8   -w 256KB
  [SUM]  0.0-1215.0 sec    2.35 GBytes  16.6 Mbits/sec  -P 16  -w 256KB
  [SUM]  0.0-1219.9 sec    5.17 GBytes  36.4 Mbits/sec  -P 32  -w 256KB
  [SUM]  0.0-1230.9 sec    8.30 GBytes  57.9 Mbits/sec  -P 64  -w 256KB
  [SUM]  0.0-1860.0 sec    30.2 GBytes  140 Mbits/sec   -P 128 -w 256KB
  [SUM]  0.0-1860.2 sec    37.0 GBytes  171 Mbits/sec   -P 160 -w 256KB
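
One way to read the need for so many parallel streams: with a 256 KB window, each TCP stream is window-limited to roughly window/RTT on a long path. The CERN-Tokyo RTT is not given on the slide; the ~300 ms used below is an assumed, illustrative value. Note that the measured rates sit well below even these ceilings, which points to packet loss and congestion control, not window size alone.

    # Rough bandwidth-delay-product estimate for one TCP stream with -w 256KB.
    # The CERN-Tokyo RTT is not stated on the slide; 300 ms is an assumed value.
    rtt_s = 0.300
    window_bytes = 256 * 1024

    per_stream_bps = window_bytes * 8 / rtt_s        # steady-state ceiling
    print(f"per-stream ceiling ~ {per_stream_bps / 1e6:.1f} Mbit/s")   # ~7 Mbit/s

    for streams in (4, 8, 16, 32, 64, 128, 160):
        ceiling = streams * per_stream_bps / 1e6
        print(f"{streams:4d} streams -> window-limited ceiling ~ {ceiling:.0f} Mbit/s")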

13 GRID testbed environment with HPSS through GbE-WAN
(Diagram: ICEPP and KEK sites, ~60 km apart, linked over the GbE WAN with 1 Gbps and 100 Mbps segments. ICEPP side: ~100 CPUs, a 0.2 TB storage element, and user PCs. KEK side: 6 CPUs and HPSS servers with 120 TB. Both sites run NorduGrid grid-manager and gridftp-server, Globus MDS, and PBS; the testbed also includes a Globus replica catalog, PBS clients, and CE/SE services.)

14 Basic Network Performance
RTT ~3 ms, packet-loss free, MTU = 1500
CPU/NIC is the bottleneck
Max TCP buffer size (256 KB) in the HPSS servers cannot be changed (optimized for the IBM SP switch)
(Chart: throughput measured over the WAN and the LAN.)
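
A back-of-the-envelope window-limit estimate (an illustration, not a number from the slides) shows what the fixed 256 KB HPSS buffer alone would allow at RTT ~3 ms:

    # What the fixed 256 KB TCP buffer in the HPSS servers implies at RTT ~ 3 ms.
    # Illustrative window-limit estimate only; not a figure from the slides.
    rtt_s = 0.003
    window_bytes = 256 * 1024

    ceiling = window_bytes / rtt_s                   # bytes per second
    print(f"single-stream ceiling ~ {ceiling / 1e6:.0f} MB/s "
          f"(~{ceiling * 8 / 1e6:.0f} Mbit/s)")      # ~87 MB/s, ~700 Mbit/s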

15 Client disk speed @ KEK = 48 MB/s; client disk speed @ ICEPP = 33 MB/s
pftp-to-pftp transfers from HPSS mover disk to client disk, FTP buffer = 64 MB
Client disk speed 35-45 MB/s; transfers measured both to /dev/null and to the client disk
(Plot: aggregate transfer speed (MB/s, 0-80) vs. number of parallel file transfers (0-10), for a KEK client over the LAN and an ICEPP client over the WAN.)

16 Summary
R&D for a Distributed Analysis Framework
Study of Fabric: Blade Servers, Disk Array / File Server Prototype
Network Connectivity: KEK-ICEPP, CERN-ICEPP
Integrated Performance: HPSS Access via WAN

17 Prospects
Network Connectivity: CERN-KEK/Tokyo to 10 Gbps in 2004; Tokyo-Taipei connectivity study soon
File Server: upgrade to 100 TB disk capacity; hierarchical storage study soon
Prepare for Experiments: 2004 Data Challenges

