Presentation transcript:

First of all, a big apology for Kei's absence.

The hero of this year's LSR achievement: Takeshi, during his experiment.

What is Data Reservoir?
- Sharing scientific data over long distances: physics, astronomy, earth science, biology
- High-speed data transfer on Long Fat pipe Networks (LFNs)
- Easy to use: file-system transparent

Data Reservoir System [diagram: user programs and file servers at each site, disk servers behind IP switches, iSCSI bulk transfer across the global network]. It uses the iSCSI protocol, without any modification to applications.

History of Data Reservoir and the SC Bandwidth Challenge
- 1st generation (SC02): 26-to-26 servers, 1GbE interfaces, RTT 200ms, 90% usage of the bottleneck OC-12
- 2nd generation (SC03): aggregated 10Gbps over 24,000km, one and a half round trips between the U.S. and Tokyo, 32-to-32 servers (too many :-<)
- 3rd generation (SC04, SC05): round the world, 31,248km, 1-to-1 memory-to-memory transfer; single stream, longest path, standard MTU; TCP Throughput Award, Fastest IPv6
- 4th generation (SC06): a pair of machines, disk-to-disk transfer, single 7.2Gbps, dual 8.65Gbps

Once upon a time, an ambitious project was started to construct an L2 network between CERN and Tokyo via Amsterdam, Canada, and the U.S. Fortunately (!), our team got a chance to try it ♪

Network [map: Tokyo, Seattle, Vancouver, Calgary, Minneapolis, Chicago, Pittsburgh, Amsterdam, Geneva (CERN), linked by WIDE, APAN/JGN II, IEEAF/Tyco/WIDE, CANARIE, SURFnet, and Abilene]

The 3rd-generation Data Reservoir started. Background: WAN PHY circuits span the world, and programmable 10GbE NICs are available. Challenge: how much bandwidth can we use with a single stream?

Struggles during the 1st experiment: almost no information
- Ping plus loopback tests were the only source of information
- Different networks, different time zones
- The telephone was the most important piece of equipment
Over 7Gbps between Tokyo and CERN.

Through this experiment we made a lot of new friends! We really appreciate all the good advice. Submission to the Internet2 Land Speed Record; experiments during the Christmas vacation, the season with the smallest traffic!

Some results
- SC04 Bandwidth Challenge: U.S. – Tokyo – U.S. – CERN, 31,248km, RTT 433ms, 7.57Gbps
- Christmas experiment (the season with the smallest network traffic, and a very, very strict deadline for preparation): Tokyo – Chicago – Amsterdam – Seattle – Tokyo, 33,979km, RTT 498ms, 7.21Gbps
- Updated the LSR 8 times

Network [same route map as above]

Challenge in 2006: to attain 90% of 10Gbps.
The difficulty: WAN PHY (max 9.6Gbps) ⇔ LAN PHY. The gap is only 4% of 10Gbps, but at RTT = 500ms the difference amounts to 25MBytes per round trip (TCP controls its transmission rate only at RTT granularity).
Another difficulty: the PCI-X bottleneck → now cleared.
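A quick check of that 25 MByte figure (a sketch, assuming an RTT of 500 ms and the 9.6 Gbps WAN PHY ceiling quoted above):

    \Delta R = 10\ \mathrm{Gbit/s} - 9.6\ \mathrm{Gbit/s} = 0.4\ \mathrm{Gbit/s}
    \Delta R \times \mathrm{RTT} = 0.4\ \mathrm{Gbit/s} \times 0.5\ \mathrm{s} = 0.2\ \mathrm{Gbit} = 25\ \mathrm{MB}

That is roughly how much extra data piles up per round trip if the sender keeps pushing at the full LAN PHY rate into a 9.6 Gbps WAN PHY segment.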

LSR in : new players
- Circuit: NetIron 40G and NetIron RX-4 in Seattle
- GSO (Generic Segmentation Offload): deferring segmentation to reduce per-packet overhead such as checksum calculation
- Chelsio T with PCI-X 2.0 support; IPG tuning is available
- Iperf modification with sendfile()
- Hardware approach for the 10Gbit network: TAPEE, a network analyzer

2006 LSR challenge, again at Christmas
- Around Dec 10: Seattle line test
- Around Dec 20: round-the-world circuit up
- Dec 31: submission
- Jan 8, 2007: round-the-world circuit down

Host
- Xeon 5160 × 1 (Woodcrest core, dual core), DDR400 2GB
- Chelsio T310-SR on PCI-Express x8: there is no longer a bus-speed bottleneck
- Linux

Circuit: the round-the-world circuit
- 522ms RTT
- Trans-Pacific and trans-Atlantic segments
- WAN PHY and LAN PHY mixed
- Tokyo – [Los Angeles] – Chicago – Amsterdam
- Amsterdam – [Chicago] – Seattle – Tokyo

LSR Network Topology [diagram: sites Tokyo (T-LEX), Los Angeles, Chicago (StarLight), NYC (MANLAN), Amsterdam (NetherLight at SARA), Seattle (Pacific Northwest Gigapop); networks WIDE, JGN2, IEEAF, CANARIE CA*net 4, SURFnet, TransLight; equipment including Fujitsu XG800, Force10 E300 and E1200, Foundry NI40G and RX-4, Cisco 7609, HDXc and ONS optical switches, GS4000; end hosts Age-1 and Age-2 (Intel Xeon); links are a mix of WAN PHY and LAN PHY]

LSR distance (4-segment path)
- HND (35°33'08"N 139°46'47"E) → ORD (41°58'43"N 87°54'17"W): km
- ORD (41°58'43"N 87°54'17"W) → AMS (52°18'31"N 04°45'50"E): 6630 km
- AMS (52°18'31"N 04°45'50"E) → SEA (47°26'56"N 122°18'34"W): 7864 km
- SEA (47°26'56"N 122°18'34"W) → HND (35°33'08"N 139°46'47"E): 7730 km
- Total: km
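Each segment length here is essentially the great-circle distance between the listed coordinates. A minimal haversine sketch in C (illustrative only; the coordinates are the AMS and SEA points from the table, and the officially approved LSR distances were computed separately):

    /* Great-circle (haversine) distance between two points given in degrees.
       Illustrative sketch only; official LSR distances are approved separately. */
    #include <math.h>
    #include <stdio.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    static double deg2rad(double d) { return d * M_PI / 180.0; }

    static double haversine_km(double lat1, double lon1, double lat2, double lon2) {
        const double R = 6371.0;                      /* mean Earth radius, km */
        double dlat = deg2rad(lat2 - lat1);
        double dlon = deg2rad(lon2 - lon1);
        double a = sin(dlat / 2) * sin(dlat / 2) +
                   cos(deg2rad(lat1)) * cos(deg2rad(lat2)) *
                   sin(dlon / 2) * sin(dlon / 2);
        return 2.0 * R * atan2(sqrt(a), sqrt(1.0 - a));
    }

    int main(void) {
        /* AMS (52°18'31"N 4°45'50"E) to SEA (47°26'56"N 122°18'34"W), from the table above */
        double d = haversine_km(52.3086, 4.7639, 47.4489, -122.3094);
        printf("AMS-SEA great-circle distance: %.0f km\n", d);
        return 0;
    }

Compiled with -lm, this prints a figure close to the 7864 km AMS – SEA segment above (small differences come from the Earth model used).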

IPG Tuning
The Chelsio T310 has a special function for setting the IPG (Inter-Packet Gap):
- Enables control of the Ethernet NIC transmission rate
- Up to 2048 octets (the IEEE standard IPG is 12 octets)
Fine-grained tuning: for standard frames the rate can be controlled over 50~100%, for 8000B jumbo frames over 80~100%.
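A rough model of what IPG pacing does to the sending rate (a sketch assuming an 8000-byte MTU; it ignores the 9.6 Gbps WAN PHY ceiling and some framing details, so treat the numbers as approximate):

    /* Approximate paced rate of a 10GbE link as a function of the inter-packet
       gap (IPG), counting preamble/SFD and Ethernet header/FCS overhead.
       Rough model only. */
    #include <stdio.h>

    int main(void) {
        const double line_rate_gbps = 10.0;
        const double frame = 8000.0 + 18.0;   /* assumed 8000B MTU + Ethernet header/FCS */
        const double preamble = 8.0;          /* preamble + start-of-frame delimiter */
        const int ipgs[] = { 12, 136, 700, 720, 800 };
        const int n = sizeof ipgs / sizeof ipgs[0];

        for (int i = 0; i < n; i++) {
            double share = (frame + preamble) / (frame + preamble + ipgs[i]);
            printf("IPG %4d octets -> approx. %.2f Gbps\n", ipgs[i], line_rate_gbps * share);
        }
        return 0;
    }

With this crude model, an IPG of about 700~800 octets corresponds to roughly 9.1~9.2 Gbps on the wire, which is the same ballpark as the 9 Gbps class results reported later.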

Without pacing (IPG 136) 600MB RWIN

Pacing (IPG 800) 600MB RWIN

Pacing (IPG 700) 600MB RWIN

Pacing (IPG 720) 600MB RWIN
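For context, the 600 MB receive window used in these runs is of the same order as the path's bandwidth-delay product (a rough check, using the 522 ms RTT quoted for the round-the-world circuit):

    \mathrm{BDP} = 10\ \mathrm{Gbit/s} \times 0.522\ \mathrm{s} = 5.22\ \mathrm{Gbit} \approx 650\ \mathrm{MB}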

Iperf modification
We had been using Iperf. The Iperf transmission flow:
- Allocate a buffer of several kB
- Initialize the buffer with random data
- while() { write(sock, buffer) }
This invokes a copy between user and kernel space for every write.

Iperf modification (cont'd)
Advice from Chelsio: "Use netperf's sendfile mode to confirm receiver performance."
Modification: the Iperf-zerocopy transmission flow
- open(temporary file) → file descriptor fd
- buffer = mmap(fd)
- initialize the buffer with random data
- while() { sendfile(sock, fd) }
sendfile(2) sends the data directly from the kernel. After some discussion, we concluded that using this version of Iperf meets the LSR rules.
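A minimal sketch of such a sendfile-based send loop in C (assuming Linux sendfile(2), a pre-created file of random data, and an already-connected TCP socket sock; the names are illustrative and this is not the project's actual Iperf patch):

    /* Zero-copy style sender: the payload stays in the page cache and
       sendfile(2) hands it to the socket without a user-space copy.
       Sketch only, not the actual Iperf-zerocopy modification. */
    #include <fcntl.h>
    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Send the contents of 'path' over the connected socket 'sock', 'iterations' times. */
    static int sendfile_loop(int sock, const char *path, long iterations) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;

        struct stat st;
        if (fstat(fd, &st) < 0) {
            close(fd);
            return -1;
        }

        for (long i = 0; i < iterations; i++) {
            off_t offset = 0;                          /* restart from the beginning each pass */
            while (offset < st.st_size) {
                ssize_t sent = sendfile(sock, fd, &offset,
                                        (size_t)(st.st_size - offset));
                if (sent <= 0) {                       /* error: give up */
                    close(fd);
                    return -1;
                }
            }
        }
        close(fd);
        return 0;
    }

Compared with the write() loop above, the kernel reads the file pages directly into the socket path, removing the per-call user-to-kernel copy that limits the standard Iperf sender.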

GSO

GSO + zerocopy

New submission
- 7.67Gbps average with standard Iperf (peak 8.10Gbps, 20 minutes, no packet loss)
- 9.08Gbps average with Iperf-zerocopy (peak 9.11Gbps, 5 hours, no packet loss)

History of the single-stream IPv4 Land Speed Record [chart: distance-bandwidth product (Pbit·m/s) by year; Data Reservoir project / WIDE project entries include 2004/11/9 at 149 Pbit·m/s, with further entries around 2004/12 and 2006/2, plotted against a 10 Gbps × 30,000 km reference line]

History of the single-stream IPv6 Land Speed Record (distance-bandwidth product)
- 2004/10/29, Data Reservoir project / WIDE project: 167 Pbit·m/s
- /11/13, Data Reservoir project / WIDE project: 208 Pbit·m/s
- 2006/12/28, Data Reservoir project / WIDE project: 272 Pbit·m/s
(Reference line: 10 Gbps × 30,000 km.)
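For context, each record is essentially the average throughput multiplied by the approved terrestrial path length. As a rough back-calculation for the 2006/12/28 entry (assuming a path on the order of 30,000 km and the 9.08 Gbps average reported above):

    9.08\ \mathrm{Gbit/s} \times 3.0 \times 10^{7}\ \mathrm{m} \approx 272\ \mathrm{Pbit \cdot m/s}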