Samuel Wood, Manikandan Punniyakotti. Supervisors: Brad Smith, Katia Obraczka, JJ Garcia-Luna-Aceves

1 Samuel Wood, Manikandan Punniyakotti. Supervisors: Brad Smith, Katia Obraczka, JJ Garcia-Luna-Aceves. http://gnet.soe.ucsc.edu

2
- As genome sequencing becomes cheaper and more frequent, researchers need fast ways to transfer large datasets over long distances for collaboration
- 1000 Genomes Project
- Transfers of dozens of genomes between data centers on opposite sides of the United States happen daily
- Examine genomic data transfer between hosts on a high-speed network such as Internet2, with long round-trip times (RTTs)
- Such networks are called Long Fat Networks (LFNs) because of their large bandwidth-delay product
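To make the bandwidth-delay product concrete, here is a minimal sketch that computes how many bytes must be in flight to keep a path full. The function name and the 1 Gbps / 92 ms example values are illustrative assumptions, not measurements from this work.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Return the number of bytes in flight needed to keep the path full."""
    # bits per second * seconds = bits; divide by 8 to get bytes
    return bandwidth_bps * rtt_s / 8


# Illustrative values only: a 1 Gbps path with a 92 ms round-trip time.
bdp = bandwidth_delay_product_bytes(1e9, 0.092)
print(f"BDP = {bdp / 1e6:.1f} MB")  # prints "BDP = 11.5 MB"
```

A sender whose window is smaller than this value cannot fill the path, whatever the link capacity.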

3
- TCP is popular for unicast communication and ships with commodity operating systems and networking APIs.
- For long-haul, high-bandwidth networks (Long Fat Networks), however, commodity TCP has been found less suitable because:
  (i) TCP's conservative congestion control mechanisms sharply reduce throughput when there are errors
  (ii) Reliability depends on ACKs and retransmissions, so recovering a lost packet takes at least one RTT
  (iii) End hosts need huge buffers to fully utilize the capacity
Solutions:
- Tuning the TCP parameters at the end hosts
- Using better congestion control algorithms
- Using sophisticated data transfer tools
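The effect of points (i) and (iii) can be sketched with two standard back-of-the-envelope bounds. Neither formula appears in the slides: the first is the window/RTT limit, and the second is the widely used Mathis et al. approximation for loss-limited TCP throughput, with the constant sqrt(3/2). Function names and example numbers are assumptions chosen for illustration.

```python
import math


def window_limited_throughput_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound when at most one window of data is sent per round trip."""
    return window_bytes * 8 / rtt_s


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_prob: float) -> float:
    """Approximate loss-limited throughput: (MSS / RTT) * sqrt(3/2) / sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_prob)


# Illustrative values only: a 64 KiB window and a 1460-byte MSS at 92 ms RTT.
rtt = 0.092
print(f"window-limited: {window_limited_throughput_bps(64 * 1024, rtt) / 1e6:.2f} Mbps")
for p in (1e-6, 1e-4, 1e-2):
    print(f"loss {p:g}: {mathis_throughput_bps(1460, rtt, p) / 1e6:.2f} Mbps")
```

Both bounds shrink as RTT grows, which is why a configuration that works well on a LAN can perform poorly on an LFN.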

4
- Provide a set of guidelines for the end hosts of a large genomic data transfer that reduce the total transmission duration (assuming an immutable intermediate network)
- Secondary goals include providing secure encryption and network fairness

5
- TCP tuning refers to adjusting the TCP parameters in the kernel
- Most applications do not try to understand the network => TCP auto-tuning with preconfigured limits
- Some default values are not optimized for LFNs
- Fasterdata: changes to the TCP kernel settings in /etc/sysctl.conf to improve TCP auto-tuning

6
- Fasterdata: a knowledge base for network administrators transferring large datasets over LFNs; part of ESnet
- SpeedGuide: an online broadband Internet performance guide

7
Parameter                         Meaning
net.core.rmem_max                 Maximum OS receive buffer size for all types of connections
net.core.wmem_max                 Maximum OS send buffer size for all types of connections
net.ipv4.tcp_rmem                 Memory reserved for TCP receive buffers (per-connection default)
net.ipv4.tcp_wmem                 Memory reserved for TCP send buffers (per-connection default)
net.ipv4.tcp_congestion_control   Pluggable congestion control algorithms
net.ipv4.tcp_sack                 Selective acknowledgement
net.ipv4.tcp_window_scaling       Support for large TCP windows
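A minimal sketch of how the parameters above might be sized from the path's bandwidth-delay product. The function name `sysctl_lines`, the 1 Gbps / 92 ms example, the minimum and default buffer values, and the choice of `cubic` are illustrative assumptions rather than the settings used in this work. On Linux, `net.ipv4.tcp_rmem` and `net.ipv4.tcp_wmem` each take three numbers (minimum, default, maximum).

```python
def sysctl_lines(bandwidth_bps: float, rtt_s: float,
                 congestion_control: str = "cubic") -> list[str]:
    """Build /etc/sysctl.conf-style lines with buffer maxima set to the BDP."""
    # Assumption: cap buffers at one bandwidth-delay product, in bytes.
    max_buf = round(bandwidth_bps * rtt_s / 8)
    return [
        f"net.core.rmem_max = {max_buf}",
        f"net.core.wmem_max = {max_buf}",
        # min / default / max; the min and default values are placeholders.
        f"net.ipv4.tcp_rmem = 4096 87380 {max_buf}",
        f"net.ipv4.tcp_wmem = 4096 65536 {max_buf}",
        f"net.ipv4.tcp_congestion_control = {congestion_control}",
        "net.ipv4.tcp_sack = 1",
        "net.ipv4.tcp_window_scaling = 1",
    ]


# Illustrative path: 1 Gbps with a 92 ms round-trip time.
for line in sysctl_lines(1e9, 0.092):
    print(line)
```

Lines like these would typically be added to /etc/sysctl.conf and loaded with `sysctl -p`; appropriate values depend on the actual path and host memory.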

8
- perfSONAR: infrastructure for network performance monitoring
- Helps solve end-to-end performance problems on paths crossing several networks
- Includes several network monitoring tools:
  - BWCTL (Bandwidth Test Controller), which can use Iperf
  - OWAMP (One-Way Ping)
  - NDT (Network Diagnostic Tool)
  - ping
  - traceroute

10
- Dummynet
  - Link emulator tool (ipfw)
  - Runs experiments in user-configurable network environments
  - Simulates/enforces queue and bandwidth limitations, delays, packet losses, and multipath effects
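As a rough illustration of how an emulated link might be configured, the sketch below only builds `ipfw` command strings for a dummynet pipe; it does not execute them. The helper name, the pipe number, the catch-all `ip from any to any` rule, and the example values are assumptions, and the exact commands used in this work are not given in the slides.

```python
def dummynet_commands(pipe: int, bandwidth_mbps: int, delay_ms: int,
                      loss_rate: float = 0.0) -> list[str]:
    """Return ipfw commands that send all IP traffic through one shaped pipe."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate is a probability between 0 and 1")
    config = f"ipfw pipe {pipe} config bw {bandwidth_mbps}Mbit/s delay {delay_ms}ms"
    if loss_rate > 0:
        # plr is dummynet's packet loss rate, expressed as a probability.
        config += f" plr {loss_rate}"
    return [
        f"ipfw add pipe {pipe} ip from any to any",  # assumed catch-all rule
        config,
    ]


# Illustrative link: 100 Mbit/s, 40 ms of added delay, 1% packet loss.
for command in dummynet_commands(1, 100, 40, loss_rate=0.01):
    print(command)
```

The delay applies to each packet that traverses the pipe, so the resulting round-trip time depends on which directions of traffic the rule matches.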

11
- Tools like SCP and SFTP don't work well in LFNs
  - No parallel streams
  - Assume a LAN
- Faster Data Transfer (FDT)
- GridFTP
- paraFetch
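One common way to use parallel streams is to give each stream its own contiguous byte range of the file. The sketch below shows only that partitioning step; it is a generic illustration with a hypothetical helper name, not a description of how FDT, GridFTP, or paraFetch divide work internally.

```python
def split_into_ranges(total_bytes: int, streams: int) -> list[tuple[int, int]]:
    """Split [0, total_bytes) into `streams` contiguous half-open ranges."""
    if streams < 1:
        raise ValueError("need at least one stream")
    base, extra = divmod(total_bytes, streams)
    ranges = []
    start = 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


# Illustrative use: a 10-byte "file" across 3 streams.
print(split_into_ranges(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

Each range can then be sent over its own TCP connection, so a loss on one connection reduces only that connection's window.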

12
UCSC to.. (Or) full-factorial experiment*
(i) solution application (GridFTP, paraFetch, FDT)
(ii) number of parallel streams
(iii) host TCP settings
(iv) with or without encryption
(v) memory-to-memory versus disk-to-disk transfers
RTT, packet loss, bandwidth, and number of hops remain unchanged within each scenario.

Case  Genome data center site                         RTT      No. of hops
A     Baylor College of Medicine in Houston, Texas    ~42 ms   9
B     Broad Institute in Massachusetts                ~92 ms   16
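A full-factorial design runs every combination of factor levels. The sketch below enumerates such a design for the five factors listed above. The tool names come from the slide, while the stream counts and the TCP-setting labels are hypothetical levels chosen for illustration.

```python
from itertools import product

# Factor levels: tool names are from the slide; the rest are assumed examples.
factors = {
    "tool": ["GridFTP", "paraFetch", "FDT"],
    "streams": [1, 4, 8],                     # hypothetical stream counts
    "tcp_settings": ["default", "tuned"],     # hypothetical labels
    "encryption": [False, True],
    "mode": ["mem-to-mem", "disk-to-disk"],
}

# One dict per run, covering every combination of levels.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(len(runs))  # 3 * 3 * 2 * 2 * 2 = 72 with these example levels
print(runs[0])
```

Each scenario (A or B) would repeat the whole set of runs, since RTT, loss, bandwidth, and hop count are fixed within a scenario.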

13
Latency       10 ms      40 ms          60 ms          80 ms          100 ms
Packet loss   0 pkt/sec  1 pkt/sec      2 pkt/sec      3 pkt/sec      4 pkt/sec
Bandwidth     100 Mbps   250 Mbps       500 Mbps       750 Mbps       1000 Mbps
Jitter        0          (1/3) Latency  (1/2) Latency  (2/3) Latency  Latency

19 [Figure panels: Disk-Disk, Mem-Mem]

20 [Figure panels: Disk-Disk, Mem-Mem]

21 [Figure panels: Disk-Disk, Mem-Mem]

22
- TCP sequence graphs to better understand the performance differences
- Content-specific compression to reduce the transmission duration
- Other tools such as Aspera and UDP blasters
- Changes in network infrastructure
  - Virtual circuits
  - Proxies
  - Multipath

23
- ESnet (2011)
- SpeedGuide.net (2011)
- Ha, S., Rhee, I. & Xu, L. (2008), `CUBIC: a new TCP-friendly high-speed TCP variant', SIGOPS Oper. Syst. Rev.
- Mathis, M., Heffner, J. & Reddy, R. (2003), `Web100: extended TCP instrumentation for research, education and diagnosis', SIGCOMM Comput. Commun. Rev.
- Wang, C. & Zhang, D. (2011), `A novel compression tool for efficient storage of genome resequencing data', Nucleic Acids Research.

24 Questions?

