UT-BATTELLE · U.S. Department of Energy · Oak Ridge National Laboratory

Net100
PIs: Wendy Huntoon/PSC, Tom Dunigan/ORNL, Brian Tierney/LBNL

Impact and Connections
IMPACT:
  - increase throughput of bulk transfers over high delay/bandwidth networks (like DOE’s ESnet)
  - select optimal paths and transport parameters for distributed (Grid) applications (e.g., GridFTP)
  - provide network performance data base from active and passive monitoring
CONNECTIONS:
  - SciDAC: Astrophysics, Bandwidth Estimation, Data Grid, INCITE, Logistical Networking
  - Base: Network Monitoring, Data Grid, Transport Protocols

Milestones/Dates/Status
                                              Mon/Yr   DONE
Network probes and sensors
  - initial sensor and tool deployment        12/01    12/01
  - data base design                          4/02
  - initial data base implementation          9/02
  - final sensor/data base                    6/03
Transport protocol optimizations
  - protocol analysis                         11/02
  - initial tuning daemon                     3/02
  - bulk transfer tuning demos                8/02
  - final tuning daemon                       6/03
Multipath support
  - analytical analysis                       8/02
  - proof-of-principle routing daemons        12/02
  - grid applications demos                   4/03

Net100 Novel Ideas
  - Net100 will tune network-UNaware applications based on recent and current link characteristics
  - Net100 will tune more than just transport buffer sizes, such as TCP AIMD parameters, DUP threshold, and Delayed ACK
  - Net100 will determine optimal paths and whether to use multiple streams and/or multiple paths
  - Net100 kernel utilizes passive monitoring from the Web100 kernel

NET100: Developing network-aware operating systems
Tasks:
  - develop/deploy network probes/sensors
  - develop network metrics data base
  - develop transport protocol optimizations
  - develop network-tuning daemon

Date Prepared: 1/7/02
High-Performance Network Research - SciDAC/Base
MICS Program Manager: Thomas Ndousse

Net100 project
New DOE-funded (Office of Science) project ($1M/yr, 3 yrs)
Principal investigators
  – Wendy Huntoon and the NCAR/PSC/Web100 team (Matt Mathis)
  – Brian Tierney, LBNL
  – Tom Dunigan, ORNL
Objective: develop network-aware operating systems
  – optimize and understand end-to-end network and application performance
  – eliminate the “wizard gap”
Motivation
  – DOE has a large investment in high-speed networks (ESnet) and distributed applications
  – many network applications are not utilizing the available bandwidth

Net100 approach
Develop Network Tool Analysis Framework (NTAF)
  – collect data for network tuning
      - develop/evaluate/deploy network tools (Enable, NWS, iperf, pipechar, …)
      - aggregate and transform output from tools and Web100
      - store/query/archive performance data
  – evaluate network applications over DOE’s ESnet (OC12, OC48, 10GigE…)
      - bulk transfers over high bandwidth/delay network
      - distributed applications (grid)
Investigate TCP optimizations
  – simulate/emulate/deploy
  – Linux kernel mods
Autotune network applications
  – WAD (workaround daemon)

Web100 summary
NSF funded (NCAR/PSC), web100.org
Modified Linux kernel (2.4.9)
Instrumented kernel to read/set TCP variables for a specific flow
  – readable: RTT, counts (bytes, pkts, retransmits, dups), state (SACKs, window scale, cwnd, ssthresh) (115 variables!)
  – settable: buffer sizes
GUI to display/modify a flow’s TCP variables, real-time
API for network-aware applications
Early evaluators: ANL, SLAC, LBNL, ORNL, universities

Motivation
Bulk transfers are slow
  – faster links (OC12, OC48, 10GigE), but long delay
  – classic TCP tuning problem
  – also broken TCP stacks
  – under-provisioned routers/switches
  – TCP is lossy, slow to recover
      - tune it or replace it?
Compute/data grids
  – sense/probe link bandwidths/latencies
  – schedule/configure distributed application
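
The "classic TCP tuning problem" on fast, long-delay links comes down to the bandwidth-delay product: a sender can keep at most one window of data in flight per round trip, so the socket buffers must be at least bandwidth × RTT to fill the path. The sketch below works through that arithmetic. The 70 ms RTT and the function names are illustrative assumptions, not values or code from Net100; the OC12 line rate of 622.08 Mb/s is the standard figure.

```python
import socket


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes (bits/s * ms / 8000)."""
    return bandwidth_bps * rtt_ms // 8000


def window_limited_bps(window_bytes: int, rtt_ms: int) -> int:
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8 * 1000 // rtt_ms


def open_tuned_socket(bandwidth_bps: int, rtt_ms: int) -> socket.socket:
    """Create a TCP socket whose buffers are requested at the BDP.

    The kernel may clamp the request to its configured maximum, so the
    effective size should be read back with getsockopt() if it matters.
    """
    size = bdp_bytes(bandwidth_bps, rtt_ms)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock


OC12_BPS = 622_080_000   # OC12 line rate in bits per second
ASSUMED_RTT_MS = 70      # hypothetical cross-country round-trip time

# Buffer needed to fill the path, and the ceiling imposed by an
# untuned 64 KB window (no window scaling) on the same path.
needed = bdp_bytes(OC12_BPS, ASSUMED_RTT_MS)
untuned_ceiling = window_limited_bps(65_535, ASSUMED_RTT_MS)
```

Under these assumptions the path needs roughly 5.4 MB of buffer, while an untuned 64 KB window caps the transfer at about 7.5 Mb/s regardless of link speed.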

TCP losses
[Figure: throughput trace showing packet losses during startup followed by linear recovery. Curves: instantaneous, average. Annotations: packet loss, early packet drops. Scale label: 0.5 Mbs.]

TCP tuning (workarounds)
Avoid losses
  – retain/probe for “optimal” buffer sizes
  – ECN-capable routers/hosts
  – reduce bursts (TCP Vegas)
Faster recovery
  – bigger MSS (jumbo frames)
  – speculative recovery (D-SACK)
  – modified congestion avoidance?
Autotune (WAD variables)
  – Buffer size
  – Dupthresh
  – Del ACK, Nagle
  – AIMD
  – Virtual MSS
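
To see why AIMD is worth tuning, it helps to count how many round trips congestion avoidance needs to regain its window after a single loss. The sketch below is a toy model, not the Net100 or Web100 implementation: standard TCP adds one segment per RTT and halves the window on loss, and the alternative parameters shown are purely illustrative.

```python
def rtts_to_recover(cwnd_segments: int,
                    add_inc: float = 1.0,
                    mult_dec: float = 0.5) -> int:
    """Round trips needed to grow back to cwnd_segments after one loss.

    add_inc  -- segments added to the window per RTT (additive increase)
    mult_dec -- fraction of the window kept after a loss
                (multiplicative decrease; 0.5 means halving)
    """
    window = cwnd_segments * mult_dec   # window right after the loss
    rtts = 0
    while window < cwnd_segments:       # linear climb, one step per RTT
        window += add_inc
        rtts += 1
    return rtts


# Standard TCP behaviour on a 1000-segment window.
standard = rtts_to_recover(1000)

# A hypothetical, more aggressive setting: gentler back-off, faster climb.
tuned = rtts_to_recover(1000, add_inc=6.0, mult_dec=0.875)
```

With a 1000-segment window, the standard parameters take 500 round trips to recover, while the illustrative setting takes 21. Whether such a setting is fair to competing flows is a separate question.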

Tuning opportunities
Parallel streams (psockets)
  – how to choose number of streams, buffer sizes?
  – autotune?
Application routing daemons
  – indirect TCP
  – alternate path (Wolski, UCSB)
  – multipath (Rao, ORNL)
Other protocols (SCTP, DCP)
  – out-of-order delivery
  – rate-based
Are these fair?
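
One first-order way to frame the stream-count question: if each stream's window is capped by its socket buffer, the streams together must cover the path's bandwidth-delay product. The sketch below encodes only that estimate. It ignores packet loss, host limits, and fairness, and it is an illustrative assumption rather than the method used by psockets or Net100; the 70 ms RTT is hypothetical.

```python
def streams_needed(bandwidth_bps: int, rtt_ms: int,
                   per_stream_buffer: int) -> int:
    """Smallest number of streams whose combined buffers cover the BDP."""
    bdp = bandwidth_bps * rtt_ms // 8000          # path BDP in bytes
    # Integer ceiling division; always use at least one stream.
    return max(1, -(-bdp // per_stream_buffer))


# OC12 (622.08 Mb/s) with a hypothetical 70 ms RTT.
with_small_buffers = streams_needed(622_080_000, 70, 65_536)     # 64 KiB each
with_large_buffers = streams_needed(622_080_000, 70, 8_388_608)  # 8 MiB each
```

The estimate shows the trade-off on the slide: many streams with small buffers, or one stream with a buffer large enough to hold the whole window.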

Work-around Daemon (WAD)
Version 0
  – passively collect flow data
  – tune unknowing sender/receiver
  – config file with “tuning info”?
  – based on Web100/Linux 2.4
To be done
  – collecting tuning info
  – adding more knobs to kernel
Related work
  – Feng’s Dynamic Right Sizing
  – Linux 2.4 auto-tuning/caching
  – Mathis TCP buffer tuning
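
A config file of "tuning info" could be as simple as a table keyed by remote network, which the daemon consults when it sees a new flow. The sketch below is a hypothetical illustration: the file format, the key names (`sndbuf`, `rcvbuf`), and the addresses are assumptions, not the actual WAD configuration, and the step that applies the values through the Web100 kernel is omitted.

```python
import configparser
import ipaddress

# Hypothetical tuning file: one section per remote network (CIDR).
SAMPLE_CONFIG = """
[10.1.0.0/16]
sndbuf = 4194304
rcvbuf = 4194304

[10.1.5.7/32]
sndbuf = 1048576
rcvbuf = 2097152
"""


def load_tuning(text: str):
    """Parse the config into (network, params) pairs, most specific first."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    table = []
    for section in parser.sections():
        network = ipaddress.ip_network(section)
        params = {key: int(value) for key, value in parser[section].items()}
        table.append((network, params))
    # Longest prefix first, so a host entry overrides its enclosing network.
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def lookup(table, remote_ip: str):
    """Return tuning parameters for a flow's remote address, or None."""
    address = ipaddress.ip_address(remote_ip)
    for network, params in table:
        if address in network:
            return params
    return None


table = load_tuning(SAMPLE_CONFIG)
host_params = lookup(table, "10.1.5.7")     # matches the /32 entry
net_params = lookup(table, "10.1.2.3")      # falls back to the /16 entry
no_params = lookup(table, "192.0.2.1")      # no entry: leave flow untuned
```

Returning `None` for unknown destinations keeps the default kernel behaviour for flows the daemon has no information about.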

Network Tool Analysis Framework (NTAF)
Configure and launch network tools
  – measure bandwidth/latency (iperf, pchar, pipechar)
  – collect passive data (SNMP from routers, OS/Web100 counters)
  – forecast bandwidth/latency for grid resource scheduling
  – augment tools to report Web100 data
Collect and transform tool results into a common format
Save results for short-term auto-tuning and archive for later analysis
  – compare predicted to actual performance
  – measure effectiveness of tools and auto-tuning
Auto-tune network applications
  – WAD (WorkAround Daemon)
  – tunable TCP stack
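
The "common format" and "compare predicted to actual" steps can be pictured with a small record type and an error metric. This is a minimal sketch under assumed names: the `Measurement` fields, the host names, and the numbers are hypothetical and do not reflect the actual NTAF schema or any measured results.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Measurement:
    """One tool result normalized to a common shape (hypothetical schema)."""
    tool: str         # e.g. "iperf" or "pipechar"
    src: str          # measuring host
    dst: str          # remote host
    metric: str       # e.g. "bandwidth" or "latency"
    value: float
    unit: str         # e.g. "Mb/s" or "ms"
    timestamp: float  # seconds since the epoch


def to_record(measurement: Measurement) -> str:
    """Serialize a measurement as one JSON line for storage or archiving."""
    return json.dumps(asdict(measurement), sort_keys=True)


def relative_error(predicted: float, actual: float) -> float:
    """Fractional error of a forecast against the observed value."""
    return abs(predicted - actual) / actual


# Made-up example values, for illustration only.
sample = Measurement(tool="iperf", src="hostA.example.org",
                     dst="hostB.example.org", metric="bandwidth",
                     value=100.0, unit="Mb/s", timestamp=0.0)
line = to_record(sample)
error = relative_error(predicted=90.0, actual=sample.value)
```

Keeping every tool's output in one shape lets the same query and comparison code serve both the short-term tuning path and the long-term archive.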

Net100 interactions
Net100 is both a producer and consumer of network performance data
  – Active probes (Claffy Bandwidth Estimation, INCITE)
  – Passive sensors (LBL Network monitoring)
Auto-tuning
  – TCP optimizations (Feng/LANL, Linux 2.4)
  – smart transfer (IQecho, Logistical networking)
  – non-TCP protocols (DCP, STP, SCTP, rate-based, ?)
Net100 tuning could be applied to distributed applications
  – Climate/Probe, SuperNova, DataGrids
  – interact with Grid metaware (forecasting, scheduling, tuning)